
OpenAI’s Operator is an AI Agent that Handles Online Tasks

OpenAI, the creator of ChatGPT, has unveiled Operator, a new generative AI service designed to act as a virtual agent that completes tasks on your behalf. Using its own browser, Operator interacts with webpages autonomously, performing actions such as typing, clicking, and scrolling, all without requiring user input.

The rollout of Operator will be gradual, with ChatGPT Pro subscribers in the United States being the first to access this new service.

Operator is built to manage repetitive browser tasks, making online processes seamless and efficient. OpenAI states that it can handle actions like filling out forms, ordering groceries, and even creating memes. Because it uses the same graphical user interfaces (GUIs) that humans interact with, such as buttons, menus, and text fields, it can navigate websites and apps just as a person would. This ability opens up new possibilities for businesses, offering them opportunities to streamline customer interactions and automate workflows.


How Does it Work?

OpenAI powers Operator with CUA (Computer-Using Agent), a model that combines GPT-4o’s vision capabilities with advanced reasoning trained through reinforcement learning. OpenAI specifically trained CUA to interact with GUIs, mimicking how users engage with tools and services online.

While Operator can work independently, it does have limitations. If it encounters a problem or requires sensitive data like passwords, it will hand control back to the user for manual input. This ensures an extra layer of security for private information.

Supported Platforms

Currently, Operator supports platforms such as DoorDash, Etsy, Booking.com, Uber, and Instacart, among others. Additionally, it has partnerships with media outlets like the Associated Press and Reuters, enabling it to conduct research effectively.

Published by
Afaq Wajdan Malik