DeepSeek: The Free AI Taking on ChatGPT
DeepSeek, a free AI chatbot developed by a Chinese start-up, is making waves, topping the Apple App Store charts in the US and UK. Positioned as a competitor to OpenAI’s ChatGPT, DeepSeek’s V3 and DeepThink R1 models promise powerful reasoning capabilities without subscription fees, offering an accessible alternative to premium AI tools. To assess those claims, we tested DeepSeek against ChatGPT’s 4o and o1 models in several scenarios, from daily task management to advanced reasoning challenges.
Daily Scheduling Test
One common use for AI is to help users organize their daily lives. Both ChatGPT and DeepSeek created effective schedules based on input about a user’s wake-up time, pet care routines, and work-from-home needs. While DeepSeek’s response was practical and well-structured, ChatGPT excelled due to its memory feature, which allows it to incorporate details shared in previous conversations. For example, ChatGPT remembered the user’s preference for reviewing AI news at 9 a.m. and added it seamlessly to the schedule. DeepSeek, however, can only recall details within the same chat, limiting its ability to provide a more personalized and coherent response over time.
Simplified Explanations
To evaluate how each chatbot breaks down complex topics, they were asked to explain the NFL playoffs. Both delivered clear, concise explanations. ChatGPT provided a 200-word paragraph with slightly more context, such as how teams become Wild Cards, while DeepSeek opted for an easy-to-read bullet-point format. The choice between the two comes down to user preference for detail versus brevity.
Reasoning Challenges
Reasoning is where DeepSeek’s DeepThink R1 model shines, with the promise of competing head-to-head with ChatGPT’s o1 model. In a series of reasoning tests, both models were pushed to their limits:
- Word Association: Given the words “Apple, Red, Coal,” DeepThink R1 correctly identified “Black” as the missing word, based on the association between apple and red, and coal and black. ChatGPT o1, while creative, incorrectly linked the question to the Snow White fairytale and answered “Snow.”
- Sequences: For a numeric sequence (1, 2, 4, 8, ?), both models correctly identified the next number as 16. However, when given a random word sequence (house, Saturn, dog, burger), neither model flagged it as unsolvable. DeepThink answered “yellow,” incorrectly interpreting the words as color-related, while ChatGPT offered “car,” relying on flawed categorization logic.
- Advanced Anatomy Question: On a complex question about hummingbird anatomy from Humanity’s Last Exam, DeepThink R1 answered “two,” while ChatGPT o1 answered “four.” The correct answer remains unknown, but DeepSeek’s response appeared more plausible based on available information.
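The sequence test above can be made concrete with a small sketch. The function below is purely illustrative and is not how either chatbot works: it assumes the only pattern worth checking is a constant ratio between terms, and the name next_in_geometric is hypothetical. It returns the next term when the pattern holds and None when the input cannot be solved that way, which is the behavior neither model showed on the random word list.

```python
from fractions import Fraction


def next_in_geometric(seq):
    """Return the next term if seq is a geometric progression, else None.

    Illustrative only: checks a single pattern (constant ratio) and
    treats everything else, including word lists, as unsolvable.
    """
    # Words or mixed input cannot form a numeric pattern.
    if len(seq) < 3 or not all(type(x) is int for x in seq):
        return None
    # A zero term makes the ratio undefined.
    if any(x == 0 for x in seq[:-1]):
        return None
    ratio = Fraction(seq[1], seq[0])
    # Every consecutive pair must share the same ratio.
    for a, b in zip(seq, seq[1:]):
        if Fraction(b, a) != ratio:
            return None
    result = seq[-1] * ratio
    return int(result) if result.denominator == 1 else result


print(next_in_geometric([1, 2, 4, 8]))
print(next_in_geometric(["house", "Saturn", "dog", "burger"]))
```

The first call corresponds to the numeric sequence both models solved; the second corresponds to the random word sequence, where returning None stands in for saying "this has no valid answer."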
Strengths and Weaknesses
DeepSeek’s performance is impressive, especially given its status as a free tool. On Humanity’s Last Exam, DeepThink R1 scores slightly higher than ChatGPT o1, with 9.4% accuracy compared to o1’s 9.1%, a gap of 0.3 percentage points. However, DeepSeek has limitations: it lacks multi-modal features such as image generation, cannot retain context across conversations, and currently has no standalone app for Mac or iPad. ChatGPT, on the other hand, offers these features, along with integration into a wider ecosystem of AI tools like DALL-E.
Conclusion
DeepSeek is a promising contender in the AI chatbot space, delivering excellent reasoning capabilities and practical responses for everyday tasks—all for free. While it cannot yet match ChatGPT’s multi-modal abilities, memory functionality, or ecosystem integration, its competitive reasoning skills and accessibility make it an exciting alternative. As DeepSeek continues to evolve, it has the potential to disrupt the AI landscape and challenge OpenAI’s dominance.