OpenAI Asks Contractors to Upload Work from Previous Projects to Test Performance of AI Agents

OpenAI is asking third-party contractors to upload assignments and real work from their current or former workplaces so it can use the data to test the performance of next-generation AI models, according to records from OpenAI and training-data firm Handshake AI obtained by WIRED.
The project appears to be part of OpenAI’s effort to establish a human baseline on a variety of tasks against which its AI models can be compared. In September, the company launched a new evaluation process to measure the performance of its AI models against human experts across a range of industries. OpenAI says this is an important indicator of its progress toward AGI, or an AI system that outperforms humans at the most economically valuable tasks.
“We’ve hired people across professions to help collect real-world tasks similar to those you’ve done in your full-time jobs, so we can measure how well AI models perform on those tasks,” reads one confidential document from OpenAI. “Take existing chunks of long-term or complex (hours or days+) work you’ve done in your career and make each one a task.”
OpenAI is asking contractors to describe tasks they’ve completed in their current or past jobs and to upload real examples of that work, according to an OpenAI presentation on the project viewed by WIRED. Each example must be a “physical output (not a file summary, but the actual file), eg Word doc, PDF, Powerpoint, Excel, image, repo,” the presentation notes. OpenAI says contractors can also share fictional work examples created to show how they would realistically respond to certain situations.
OpenAI and Handshake AI declined to comment.
Real-world tasks have two parts, according to the OpenAI presentation: a job request (what someone’s boss or colleagues asked them to do) and a job deliverable (the actual work they produced in response to that request). The company stresses several times in its instructions that the examples contractors share must reflect the “actual, on-the-job work” they have done.
One example in OpenAI’s presentation features a job from a “Senior Lifestyle Manager at a luxury concierge company for high-net-worth individuals.” The request is to “prepare a short, 2-page PDF outline of a 7-day yacht trip to the Bahamas for a first-time family,” and it includes further details about the family’s interests and what the itinerary should cover. An “expected deliverable” section then indicates what the contractor in this case would submit: an actual Bahamas itinerary created for a client.
OpenAI instructs contractors to remove proprietary business information and personally identifiable information from the job files they upload. Under a section labeled “Important reminders,” OpenAI tells contractors to “remove or anonymize any: personal information, proprietary or confidential information, important non-public information (eg, internal strategy, undisclosed product information).”
One of the documents viewed by WIRED mentions a ChatGPT tool called “Superstar Scrubbing” that offers advice on how to remove private information.
Evan Brown, an intellectual property attorney at Neal & McDevitt, tells WIRED that AI labs acquiring confidential information from contractors at this scale could open themselves up to trade-secret misappropriation claims. Contractors who hand over documents from their former workplaces to an AI firm, even in redacted form, may risk violating nondisclosure agreements with previous employers or divulging trade secrets.
“The AI lab is putting a lot of trust in contractors to decide what’s confidential and what’s not,” Brown says. “If they let something through, do the AI labs take the time to decide what is and isn’t a trade secret?”