Evaluating the Factual Accuracy of ChatGPT-4o, Gemini, and Perplexity.ai in Real-World Queries

By John Mecke, August 16, 2024

Large language models (LLMs) such as ChatGPT-4o, Gemini, and Perplexity.ai are assessed with the WildHallucinations benchmark to see how well they handle “hallucinations,” the generation of incorrect information. ChatGPT-4o excels in well-documented areas, Gemini prioritizes accuracy over responsiveness, and Perplexity.ai uses real-time retrieval to keep its responses current. Each has strengths and weaknesses, and all three need further improvement.