The integration of deep research functions into AI chatbots and search engines promises users comprehensive, quickly generated overviews of complex topics. In practice, however, the technology still struggles, particularly with the selection, weighting, and presentation of sources. Perplexity AI, an AI startup offering a deep research function, illustrates these challenges well.
Deep research combines search algorithms with reasoning workflows: a search query is broken down into individual keywords, and a comprehensive overview article is assembled over several search iterations. Perplexity AI promotes this function and, unlike some competitors, offers it free of charge, albeit with a limit of five queries per day for users without a subscription.
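The iterative loop described above can be sketched in a few lines. This is a deliberately simplified illustration, not Perplexity's actual pipeline: the `search` helper is a hypothetical stand-in, and the reasoning step that would derive new keywords between iterations is reduced to a comment.

```python
# Minimal sketch of a deep-research loop, assuming a hypothetical
# search() helper; this is NOT Perplexity's real implementation.

def search(keyword):
    # Placeholder: return (url, snippet) pairs for a keyword.
    return [(f"https://example.com/{keyword}", f"snippet about {keyword}")]

def deep_research(query, iterations=3):
    keywords = query.lower().split()      # break the query into keywords
    sources = []
    for _ in range(iterations):           # several search iterations
        for kw in keywords:
            sources.extend(search(kw))
        # A real system would feed results into a reasoning step that
        # refines or expands the keywords; here the loop stays trivial.
    report = "\n".join(snippet for _, snippet in sources)
    return report, sources
```

Even in this toy version, one weakness of the approach is visible: the same source can enter the `sources` list once per iteration, which foreshadows the duplicate-counting problem observed in the tests.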
To examine the performance of Perplexity AI's deep research, two test queries were submitted: one on the use of AI in public administration and one on the representation of the migration debate in Germany. The analysis of the results revealed several weaknesses.
Firstly, sources were counted multiple times, giving a misleading impression of how many sources were actually used: in the test on AI in administration, 38 sources were listed, but only 16 distinct sources were actually used.
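The inflation from 38 listed sources down to 16 distinct ones is essentially a missing deduplication step. A sketch of how a source list could be deduplicated by normalized URL (the URLs here are invented for illustration):

```python
# Sketch: count unique sources by normalized URL so that scheme,
# trailing slashes, or tracking parameters do not inflate the total.
from urllib.parse import urlparse

def unique_sources(urls):
    seen = set()
    result = []
    for url in urls:
        parts = urlparse(url)
        # Ignore scheme, query string, fragment, and trailing slash.
        key = (parts.netloc, parts.path.rstrip("/"))
        if key not in seen:
            seen.add(key)
            result.append(url)
    return result

listed = [
    "https://example.org/study",
    "http://example.org/study/",               # same page, other scheme/slash
    "https://example.org/study?utm_source=x",  # same page, tracking query
    "https://example.com/report",
]
print(len(listed), len(unique_sources(listed)))  # prints: 4 2
```

Counting only normalized unique entries, as above, would report 2 sources instead of 4, the kind of correction the tested tool evidently did not apply.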
Secondly, the weighting of sources lacked transparency. It remained unclear by which criteria sources were selected and whether, for example, scientific studies were rated higher than opinion pieces or content-marketing contributions. In the test, a Fraunhofer study, a summary from the Bundesdruckerei (Federal Printing Office), and corporate marketing materials were presented side by side with equal weight.
The generated overview articles themselves were also unconvincing. While they offered a superficial overview of the respective topics, they were neither in-depth nor particularly well researched. Sources were often quoted verbatim without being embedded in a larger context. In addition, sources were misattributed, and in the migration-debate test, figures were even invented or conflated from different reports.
Deep research functions in AI search engines have the potential to simplify research work and to provide a quick overview of complex topics. Current implementations, however, still show significant shortcomings, as the example of Perplexity AI demonstrates. The lack of transparency in source selection and weighting, the faulty citations, and the generation of false information severely limit the usability of this technology. Manual verification and critical questioning of the results therefore remain essential.
Source: t3n.de: KI im Praxistest: Perplexity Deep Research erfindet Zahlen und vergisst Quellen (AI in practical testing: Perplexity Deep Research invents numbers and forgets sources)