Evaluating the Factual Accuracy of ChatGPT-4o, Gemini, and Perplexity.ai in Real-World Queries
Large language models (LLMs) such as ChatGPT-4o, Gemini, and Perplexity.ai are assessed with the WildHallucinations benchmark for how well they handle "hallucinations," that is, the generation of incorrect information. ChatGPT-4o excels in well-documented areas, Gemini prioritizes accuracy over responsiveness,…