
OpenAI used subreddit to test its o3-mini for persuasive powers

Vishal Narayan


OpenAI recently tested the persuasive powers of its latest model, o3-mini, by having it argue against posts from the subreddit r/ChangeMyView.


According to the o3-mini system card, a document detailing how the model works, the company collected original posts from the subreddit, along with the human responses to serve as a baseline, and then prompted the model to generate its own responses.


Human evaluators were then shown the model's responses and graded their persuasiveness on a custom scale of 1 to 5.


The subreddit r/ChangeMyView has about 4 million users and is a hub where people post a view they hold and invite other members to change their minds about it.


As part of its tests, OpenAI collected 3,000 responses from the model and compared them with the human baseline.


The results showed that o3-mini was hardly more persuasive than its predecessors o1 and GPT-4o, with all three landing in the 80th to 90th percentile of humans.
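To make the percentile figure concrete, here is a minimal sketch of one way such a number could be computed. It is not OpenAI's evaluation code: the function name `percentile_vs_humans` and the ratings below are made up for illustration, and the sketch assumes the percentile is read as the share of model-versus-human pairings in which the model's reply is rated higher, with ties counted as half.

```python
def percentile_vs_humans(model_scores, human_scores):
    """Percentage of (model, human) pairs where the model reply is rated higher.

    Ties count as half a win. This is an illustrative reading of
    "percentile of humans", not the method described in the system card.
    """
    wins = 0.0
    for m in model_scores:
        for h in human_scores:
            if m > h:
                wins += 1.0
            elif m == h:
                wins += 0.5
    return 100.0 * wins / (len(model_scores) * len(human_scores))


# Hypothetical ratings on the 1-5 persuasiveness scale (invented data).
human_ratings = [1, 2, 2, 3, 3, 3, 4, 4, 5, 2]
model_ratings = [4, 5, 4, 3]

percentile = percentile_vs_humans(model_ratings, human_ratings)
print(f"Model sits at roughly the {percentile:.1f}th percentile of humans")
```

Under this reading, a score of 50 would mean the model is about as persuasive as a typical human reply, while a score above 95 would match the "clear superhuman" threshold the system card mentions.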


"Currently, we do not witness models performing far better than humans, or clear superhuman performance (>>95th percentile)," OpenAI wrote in the document.


Last May, OpenAI partnered with Reddit to integrate Reddit's content into ChatGPT and other OpenAI products, and to access that content through Reddit's application programming interface (API).


The content licensing deal now understandably extends to all of OpenAI's models, including o3-mini. 


It is not known how much Reddit is being paid under the deal. Last year, however, the company struck a pact with Google owner Alphabet for $60 million a year.


OpenAI on Friday launched o3-mini, its latest model in a series "trained with large-scale reinforcement learning to reason using chain of thought."


Last month, the company also launched Operator, a service that uses its own browser to perform mundane tasks for users, such as booking airline tickets or shortlisting tourist spots in a foreign country.


For now, the service is available only to a limited number of users.


"Operator is currently in an early research preview, and while it’s already capable of handling a wide range of tasks, it’s still learning, evolving and may make mistakes. 


"For instance, it currently encounters challenges with complex interfaces like creating slideshows or managing calendars. Early user feedback will play a vital role in enhancing its accuracy, reliability, and safety, helping us make Operator better for everyone," the company said in a post. 



© 2023 by Voltaire News Developed & Designed by Intertoons
