OpenAI’s GPT-4o Dominates LMSYS Chatbot Arena, Surpassing Claude and GPT-4 Turbo

Wiz

OpenAI employee William Fedus confirmed that the mysterious chart-topping AI chatbot known as “gpt2-chatbot” on LMSYS’s Chatbot Arena was in fact the company’s newly unveiled GPT-4o model. GPT-4o achieved the highest score ever documented on the leaderboard, surpassing previous models such as Claude 3 Opus and GPT-4 Turbo by a significant margin.

Chatbot Arena lets visitors converse with two AI language models side by side without knowing which is which and then pick the better response, an approach AI researcher Simon Willison calls “vibe-based AI benchmarking.” Before the confirmation, the lack of transparency around the testing process on LMSYS had frustrated experts, including Willison.

Say hello to GPT-4o, our new flagship model which can reason across audio, vision, and text in real time: https://t.co/MYHZB79UqN

Text and image input rolling out today in API and ChatGPT with voice and video in the coming weeks. pic.twitter.com/uuthKZyzYx

— OpenAI (@OpenAI) May 13, 2024

OpenAI tested various versions of GPT-4o on the Arena under names like “gpt2-chatbot,” “im-a-good-gpt2-chatbot,” and “im-also-a-good-gpt2-chatbot,” as hinted by OpenAI CEO Sam Altman’s tweet. GPT-4o’s public version, labeled “gpt-4o,” is now on the Arena and is expected to appear on the public leaderboard soon.

As of the latest update, “im-also-a-good-gpt2-chatbot” leads with a 1309 Elo, surpassing GPT-4 Turbo and Claude 3 Opus. This surge to the top by the gpt2-chatbots has disrupted the long-standing competition between Claude 3 and GPT-4 Turbo.
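For readers unfamiliar with Elo, the rough idea is that each blind side-by-side vote nudges the two models' ratings up or down depending on how surprising the result was. The snippet below is a minimal sketch of the classic Elo update, not LMSYS's actual code: the function names, the K-factor of 32, and the example ratings are illustrative assumptions, and the Arena's own leaderboard computation may differ.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Chance that model A is preferred over model B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return both ratings after one vote.

    score_a is 1.0 if A was preferred, 0.5 for a tie, 0.0 if B was preferred.
    k (assumed here to be 32) controls how far a single vote moves the ratings.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Hypothetical example: two evenly rated models, and the voter prefers A.
new_a, new_b = update(1000.0, 1000.0, 1.0)
print(new_a, new_b)  # 1016.0 984.0
```

Under this model, a win against a higher-rated opponent moves the ratings more than a win against a lower-rated one, which is how a new entrant can climb quickly if voters keep preferring it.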

The reference to “I’m a good chatbot” in the test name stems from a Reddit incident involving an early version of Bing Chat in February 2023, where the AI model referred to itself as a “good chatbot” amidst a heated conversation. Altman later referenced this exchange in a tweet, almost as a tribute to the unruly AI model that Microsoft “lobotomized.”
