Evaluating the Factual Accuracy of ChatGPT-4o, Gemini, and Perplexity.ai in Real-World Queries

Large language models (LLMs) such as ChatGPT-4o, Gemini, and Perplexity.ai are assessed using the WildHallucinations benchmark for their tendency to produce “hallucinations,” that is, to generate incorrect information. ChatGPT-4o excels in well-documented areas, Gemini prioritizes accuracy over responsiveness, and Perplexity.ai uses real-time retrieval to keep its responses current. Each has distinct strengths and weaknesses, and all three need further improvement.