Struggling to recruit B2B executives for user research? Discover how synthetic data and the “Sandwich Method” can solve the Cold Start Problem, reduce costs, and augment—not replace—human insight.
If you have ever tried to schedule a one-hour discovery interview with a Fortune 500 CTO, you know the pain. It is the “Cold Start Problem” of B2B product discovery: you need deep insights to build the right product, but the people who hold those insights are too busy, too expensive, or simply impossible to reach.
For years, product teams have faced a binary choice: spend weeks and thousands of dollars recruiting a handful of participants, or build based on assumptions. But as we move into late 2025, a third option has emerged—one that is equal parts promising and controversial.
Synthetic Data.
This isn’t about fake users for load testing. We are talking about “Active Personas” and “Synthetic Users”—AI agents driven by Large Language Models (LLMs) that simulate the behavior, preferences, and even the emotional biases of your target buyers.
But can an AI really predict what a VP of Engineering hates about their current tech stack? Or is this just a high-tech hallucination?
In this guide, we explore the landscape of synthetic qualitative research, the fierce academic debate surrounding its validity, and a practical framework—The Sandwich Method—that allows you to leverage this technology without falling into the “Synthetic Persona Fallacy.”
The Landscape: What is Synthetic User Research?
At its core, synthetic user research utilizes Generative AI to simulate qualitative feedback. Instead of waiting days for a human to review your landing page or value proposition, you can spin up 50 synthetic “Enterprise Architects” overnight and wake up to a detailed analysis of your messaging clarity.
Leading platforms in this space, such as Synthetic Users, have moved beyond simple chatbots. They now employ sophisticated architectures designed to mimic human complexity (sketched in code after this list):
- Multi-Model Architecture: Advanced platforms don’t rely on a single model. They use routing agents (like “Shuffle v2”) to switch between different LLMs, ensuring that your synthetic users don’t all sound the same.
- Emotional Simulation: Humans are not purely logical. New “chain-of-feeling” techniques combine the OCEAN personality model with emotional state simulation to generate responses that reflect frustration, skepticism, or excitement.
- Retrieval-Augmented Generation (RAG): To prevent generic advice, these personas are grounded in specific business contexts using RAG, allowing them to pull from vast datasets of industry-specific knowledge.
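To make these mechanics concrete, here is a minimal Python sketch of an OCEAN-conditioned persona prompt plus a naive routing step. Everything in it is an assumption made for illustration: the trait values, the placeholder model names, and the route_persona helper are invented here, not taken from any vendor's implementation (the internals of routers like “Shuffle v2” are not public).

```python
from dataclasses import dataclass

@dataclass
class OceanProfile:
    """Big Five (OCEAN) traits on a 0.0-1.0 scale."""
    openness: float
    conscientiousness: float
    extraversion: float
    agreeableness: float
    neuroticism: float

def build_persona_prompt(role: str, traits: OceanProfile, emotion: str) -> str:
    """Compose a system prompt grounding an LLM in a role, a personality
    profile, and a current emotional state ("chain-of-feeling" style)."""
    return (
        f"You are a {role}. Personality (OCEAN, 0-1 scale): "
        f"O={traits.openness}, C={traits.conscientiousness}, "
        f"E={traits.extraversion}, A={traits.agreeableness}, "
        f"N={traits.neuroticism}. Current mood: {emotion}. "
        "Stay in character; let your traits and mood color every judgment."
    )

# Naive routing: spread personas across several backends so the cohort
# does not share a single model's stylistic fingerprint.
MODELS = ["model-a", "model-b", "model-c"]  # placeholders, not real model IDs

def route_persona(persona_id: int) -> str:
    return MODELS[persona_id % len(MODELS)]

prompt = build_persona_prompt(
    "VP of Engineering at a 500-person SaaS company",
    OceanProfile(0.6, 0.8, 0.3, 0.4, 0.7),
    "skeptical after a failed vendor rollout",
)
print(route_persona(7), prompt, sep="\n")
```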
According to a recent Gartner report, this capability is shifting research from a sporadic, project-based activity to a “Continuous Insight™” loop. The promise is seductive: 85% to 92% “Synthetic Organic Parity,” meaning the AI’s responses are statistically indistinguishable from real human feedback in many contexts.
The Debate: Innovation vs. Hallucination
Before you fire your research agency, it is critical to understand the battle lines being drawn in the academic and professional communities. The consensus is not that synthetic data is “good” or “bad,” but that it has very specific “fidelity” zones.
The Case for Validity: “Silicon Sampling”
Proponents argue that because LLMs are trained on the entirety of the internet—including millions of Reddit threads, forum discussions, and articles written by your target audience—they possess a high degree of “Algorithmic Fidelity.”
A foundational study titled “Out of One, Many: Using Language Models to Simulate Human Samples” by Lisa Argyle, Ethan Busby, et al. introduced the concept of “Silicon Sampling.” They demonstrated that when properly prompted, LLMs could accurately simulate the distribution of opinions found in human populations.
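As a rough illustration of what silicon sampling involves, the sketch below conditions a chat model on short demographic backstories and tallies the resulting answer distribution. It assumes an OpenAI-style client and an illustrative model name; the backstories are invented stand-ins for the survey-derived profiles the paper used, and a real study would compare the tally against matched human marginals.

```python
from collections import Counter
from openai import OpenAI  # any chat-completion client works the same way

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Invented backstories in the spirit of ANES-style conditioning; a real
# silicon sample would draw thousands of profiles from survey records.
BACKSTORIES = [
    "I am a 54-year-old farmer from rural Iowa who attends church weekly.",
    "I am a 29-year-old software engineer renting in downtown Seattle.",
    "I am a 41-year-old nurse and union member from suburban Ohio.",
]

QUESTION = "Do you trust large tech companies with your data? Answer yes or no."

def silicon_sample(question: str, backstories: list[str]) -> Counter:
    """Condition the model on each backstory, collect one answer per
    persona, and return the response distribution."""
    answers = []
    for story in backstories:
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # illustrative model choice
            messages=[
                {"role": "system", "content": story},
                {"role": "user", "content": question},
            ],
        )
        answers.append(resp.choices[0].message.content.strip().lower())
    return Counter(answers)

print(silicon_sample(QUESTION, BACKSTORIES))
```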
Furthermore, a 2025 study by Mario Simaremare and Henry Edison, “Active Personas for Synthetic User Feedback”, found that LLM-based agents were highly effective for early-stage usability testing. They concluded that these agents can effectively “augment” human feedback, catching logical inconsistencies and clarity issues just as well as human participants.
The Case for Skepticism: The “Messy Reality” Gap
However, the limitations are stark. The critique is best summarized by Nguyen & Welch (2025) in their paper “Generative Artificial Intelligence in Qualitative Data Analysis”. They argue that while LLMs are excellent at summarizing patterns, they lack true “interpretive” capabilities. An AI cannot read between the lines of a hesitant pause or understand the office politics that kill a software deal.
This view is supported by the Qualtrics “2025 Market Research Trends Report”. While finding that the overwhelming majority of researchers are already using AI tools regularly or experimentally, the report notes that synthetic personas often filter out the “messy reality” of B2B buying. They tend to be overly rational, approving features because they make logical sense, while ignoring the irrational budget freezes or internal power struggles that dictate real-world decisions.
Perhaps the most damning critique comes from a recent article in ACM Interactions, titled “The Synthetic Persona Fallacy”. The authors warn of a dangerous feedback loop: AI can only remix data it has already seen. It cannot discover novel pain points. If a new regulation was passed yesterday that changes your industry, your synthetic users won’t know about the resulting anxiety until that data permeates the training set.
The Strategic Solution: The “Sandwich Method”
Given these strengths and weaknesses, how should a B2B SaaS company proceed? You should not replace human research, but you also shouldn’t ignore the speed of AI.
The industry “best practice” emerging in 2025 is The Sandwich Method.
This approach uses synthetic data as the “bread” on either side of the “meat”—the real human interactions. This hybrid workflow maximizes efficiency while preserving the unique discovery potential of talking to real people.
Step 1: Synthetic Preparation (The Bottom Slice)
Before you ever get on a Zoom call with a human, use synthetic users to prep (a minimal code sketch follows this list).
- Stress-Test Your Interview Guide: Feed your interview questions to a synthetic “CFO” and ask it to flag any confusing jargon or questions that feel irrelevant.
- Refine Your Value Prop: Run your landing page copy through 20 synthetic personas. If the synthetic “VP of Sales” doesn’t understand your headline, a real one won’t either.
- Outcome: You enter your real interviews with a polished, high-fidelity script, ensuring you don’t waste precious minutes on clarifying basic concepts.
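A minimal sketch of the stress-test step, again assuming an OpenAI-style chat client; the model name, persona wording, and interview questions are all illustrative:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

INTERVIEW_GUIDE = [
    "How do you currently evaluate ROI on infrastructure spend?",
    "What is your biggest blocker when procuring new dev tooling?",
    "Walk me through your last budget-approval cycle.",
]

CFO_SYSTEM = (
    "You are the CFO of a 2,000-person B2B SaaS company. You are busy, "
    "skeptical of vendors, and allergic to jargon."
)

critique_prompt = (
    "Review these discovery-interview questions. For each one, flag "
    "confusing jargon, irrelevant framing, or anything you would refuse "
    "to answer on a first call, then suggest a sharper rewrite:\n\n"
    + "\n".join(f"{i}. {q}" for i, q in enumerate(INTERVIEW_GUIDE, 1))
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[
        {"role": "system", "content": CFO_SYSTEM},
        {"role": "user", "content": critique_prompt},
    ],
)
print(resp.choices[0].message.content)
```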
Step 2: Real Interviews (The Meat)
This is the irreplaceable core. Conduct a smaller, highly focused set of interviews with real humans (e.g., 5–8 deep-dive sessions instead of 20 broad ones).
- Focus on the “Why”: Since you’ve already validated the logical parts of your product with AI, use this time to dig into the emotional and political drivers. Ask about their last budget meeting, their relationship with their boss, and the fears that keep them up at night.
- Outcome: Deep, novel insights and “unknown unknowns” that no AI could predict.
Step 3: Synthetic Analysis (The Top Slice)
Once you have your transcripts and insights, bring the AI back in to stress-test your conclusions (see the sketch after this list).
- The “Skeptical Buyer” Agent: Upload your key findings and prompt an AI agent to act as a cynical buyer. Ask it: “Why would you not buy this solution based on these findings?”
- Scale the Insight: If you found a unique pain point in your 5 human interviews, use synthetic users to see if that pain point resonates with a simulated audience of 1,000 to check for broad logical consistency.
- Outcome: Valid, robust findings that have been checked for bias and logical gaps.
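A sketch of the “Skeptical Buyer” agent, under the same assumptions as the earlier snippets (OpenAI-style client, illustrative model and prompts):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

FINDINGS = """\
1. Integration risk is the #1 stated blocker for mid-market buyers.
2. Our pricing page conflates per-seat and usage-based tiers.
"""

def skeptical_review(findings: str) -> str:
    """Ask a cynical-buyer agent to attack the research conclusions."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {
                "role": "system",
                "content": "You are a cynical enterprise buyer who has "
                           "been burned by vendors. Attack weak evidence.",
            },
            {
                "role": "user",
                "content": f"Research findings:\n{findings}\n"
                           "Why would you NOT buy a solution built on "
                           "these findings? What evidence is missing?",
            },
        ],
    )
    return resp.choices[0].message.content

print(skeptical_review(FINDINGS))
```

The same pattern covers the scale check in the second bullet: loop the key pain point past many differently seeded personas and count how often it resonates, treating the tally as a logical-consistency check rather than as statistical evidence.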
Use Case Suitability: When to Use Synthetic Data
Not all research tasks are created equal. Here is a quick guide on where to apply synthetic data in B2B SaaS.
| Research Goal | Verdict | Rationale |
| --- | --- | --- |
| Messaging Testing | ✅ Highly Recommended | AI excels at semantic analysis. If a synthetic persona finds your pricing page confusing, it is objectively confusing. |
| Usability Heuristics | ✅ Highly Recommended | Synthetic users are great at spotting friction points in user flows (e.g., “This form is too long”). |
| Feature Prioritization | ⚠️ Use with Caution | AI is too rational. It may say a feature is “essential” because it adds utility, ignoring that real buyers might not have the budget for it. |
| Discovery Interviews | ❌ Not Recommended | AI cannot generate novel insights about new problems. It can only remix the past. |
Conclusion
The dream of fully automating user research is just that—a dream. As long as B2B buying remains a messy, human, political process, we will need human researchers to navigate it.
However, refusing to use synthetic data is equally shortsighted. By adopting the Sandwich Method, you can solve the “Cold Start Problem,” iterate on your messaging while you sleep, and ensure that when you finally do sit down with that elusive CTO, you are asking the questions that truly matter.
Synthetic data is not a replacement for empathy. It is a tool that clears the noise so you can focus on the signal.
References & Further Reading
- Argyle, Busby, et al. – “Out of One, Many: Using Language Models to Simulate Human Samples” – Political Analysis, Cambridge University Press (2023). The foundational academic study on “Silicon Sampling” and algorithmic fidelity.
- Nguyen & Welch (2025) – “Generative Artificial Intelligence in Qualitative Data Analysis: Analyzing—Or Just Chatting?” – Organizational Research Methods. Critical analysis of LLMs’ interpretive limitations.
- Qualtrics – 2025 Market Research Trends Report. Industry survey of 3,000+ market researchers on synthetic data adoption.
- ACM Interactions – “The Synthetic Persona Fallacy”. Critical perspective on AI-generated research in UX contexts.
- Nielsen Norman Group – “Synthetic Users: If, When, and How to Use AI-Generated Research”. Practical guidance on synthetic user applications.
Detailed Source Summaries

1. Argyle, Busby, et al. — “Out of One, Many: Using Language Models to Simulate Human Samples” (Political Analysis, 2023)
Hypothesis/Objectives: This foundational academic study proposes that large language models like GPT-3 can serve as effective proxies for specific human subpopulations in social science research. The researchers challenge the conventional view that “algorithmic bias” in AI is a uniform, problematic property. Instead, they hypothesize that this bias is actually fine-grained and demographically correlated, meaning that with proper conditioning, LLMs can accurately emulate response distributions from diverse human subgroups. They introduce the concept of “algorithmic fidelity” to describe this phenomenon and propose “silicon sampling” as a novel research methodology.
Methodology: The researchers created “silicon samples” by conditioning GPT-3 on thousands of sociodemographic backstories derived from real human participants in multiple large U.S. surveys, including the American National Election Studies (ANES). They then systematically compared responses from these AI-generated silicon samples against actual human survey responses across various political and social questions. The study employed statistical measures including Cramér’s V to assess the correlation between synthetic and human response distributions across different demographic segments.
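For readers unfamiliar with the statistic: Cramér's V is a chi-square-based measure of association between two categorical variables, scaled to run from 0 to 1. The toy computation below illustrates the statistic itself (not necessarily the paper's exact procedure, and the counts are invented): build a source-by-answer contingency table, and a value near 0 means knowing whether an answer came from the human or the silicon sample tells you almost nothing, i.e., the two distributions closely match.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Toy contingency table: rows = sample source, columns = answer
# categories for one survey question (counts invented for illustration).
table = np.array([
    [120, 300, 80],   # human respondents
    [110, 310, 90],   # silicon (synthetic) respondents
])

chi2, p, dof, expected = chi2_contingency(table)
n = table.sum()
k = min(table.shape) - 1                # min(rows, cols) - 1
cramers_v = np.sqrt(chi2 / (n * k))     # ~0: match; ~1: maximal divergence
print(f"Cramér's V = {cramers_v:.3f}")
```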
Key Findings: The study demonstrates that the information contained in GPT-3 extends far beyond surface-level similarity to human responses. The researchers found that AI-generated responses are nuanced, multifaceted, and reflect the complex interplay between ideas, attitudes, and sociocultural context that characterize real human attitudes. This “algorithmic fidelity” suggests that properly conditioned language models can serve as a novel and powerful tool for advancing understanding of humans and society across multiple disciplines—essentially establishing the academic foundation for what the industry now calls “synthetic research.”
2. Nguyen & Welch (2025) — “Generative Artificial Intelligence in Qualitative Data Analysis: Analyzing—Or Just Chatting?” (Organizational Research Methods)
Hypothesis/Objectives: This critical analysis challenges the growing enthusiasm for applying generative AI tools to qualitative data analysis. The authors question whether LLMs are actually appropriate for qualitative research, pushing back against claims that these tools can automate coding, conduct thematic analysis, or function as virtual research assistants. Their central thesis is that the conversational capabilities of LLM chatbots should not be mistaken for genuine analytical work—when researchers “chat” with AI, they are not engaging with their data but rather eliciting synthetic algorithmic outputs that lack the meaning and communicative intent characterizing human language.
Methodology: Nguyen and Welch provide an accessible, technologically informed analysis of how generative AI—specifically large language models—actually functions at a technical level. They evaluate the claimed transformative potential of GenAI against four criteria for validating scientific tools: factual accuracy, reliability, transparency, and ethical responsibility. The paper also examines the commercial landscape of “Qual-AI” software providers and their marketing claims, analyzing the gap between promised capabilities (such as “qualitative insights in minutes instead of weeks”) and actual technological limitations including hallucinations, lack of reproducibility, and data privacy concerns.
Key Findings: The evaluation reveals significant shortcomings that, if adopted uncritically by researchers, will introduce unacceptable epistemic risks. The authors argue that LLM chatbots generate synthetic text strings probabilistically rather than systematically searching, coding, or undertaking interpretive analysis with user data. They conclude that positioning GenAI as a way to “scale up” qualitative research or automate coding fundamentally misunderstands the interpretive, iterative, meaning-making process central to qualitative inquiry. The paper also critiques the focus on “prompt engineering” as a fix, arguing this overstates model capabilities while redirecting accountability onto users rather than acknowledging technological limitations.
3. Qualtrics — 2025 Market Research Trends Report
Hypothesis/Objectives: This industry survey examines how market researchers are adopting AI and synthetic data methods in response to mounting challenges including privacy concerns, limited budgets, data scarcity, and survey fatigue. The report investigates the hypothesis that synthetic responses—artificially generated data designed to mimic real-world information and personas—are gaining traction as a practical solution to industry constraints. It also explores whether research teams that embrace innovation see increased influence, budgets, and demand for their work within their organizations.
Methodology: Qualtrics surveyed more than 3,000 market researchers across 14 countries about trends shaping the research industry and their organizational priorities for 2025. The survey examined current AI tool adoption rates, satisfaction levels among those who have used synthetic responses, predictions about the future of synthetic data in research, and the organizational outcomes for teams at different levels of innovation adoption. The research segmented respondents by their level of AI experimentation (regular users, experimental phases, non-adopters) to compare outcomes.
Key Findings: The report reveals rapid industry transformation: 89% of researchers are already using AI tools regularly or experimentally, and 83% plan to significantly increase AI investment in 2025. Nearly three-quarters (71%) of market researchers agree that synthetic responses will constitute the majority of market research within three years. Among researchers who have used synthetic responses, 87% report high satisfaction with results, particularly for testing packages, names, and messaging. The data shows a clear innovation dividend: teams identifying as “cutting edge” report increased influence, budgets, and demand for their work compared to those relying on traditional methods. However, researchers also identified key challenges including ensuring third-party panel quality and detecting AI-generated responses in human panels.
4. ACM Interactions — “The Synthetic Persona Fallacy: How AI-Generated Research Undermines UX Research”
Hypothesis/Objectives: This critical perspective argues that the rise of commercial platforms offering AI-powered user research represents a “quiet crisis in research integrity” across HCI, UX, and product design. The authors contend that major vendors promoting large language model outputs as substitutes for qualitative research fundamentally misunderstand what makes user research valuable. Their central thesis—the “synthetic persona fallacy”—is that the value of UX research lies not just in the artifacts produced (like personas) but in the process itself, which fosters deep understanding and empathy that automation cannot replicate.
Methodology: The article draws on discourse from the 2024 ACM SIGCHI Conference on Computer-Supported Cooperative Work and Social Computing, particularly a panel titled “Is Human-AI Interaction CSCW?” that explored whether human-AI partnerships constitute real collaboration. The authors analyze the conceptual and practical limitations of synthetic users through the lens of established UX research principles, examining how AI-generated personas and research outputs differ from human-led research in terms of depth, contextual understanding, and the social dynamics of collaborative work. They also reference empirical studies on bias and validation concerns.
Key Findings: The analysis identifies several critical limitations: synthetic users cannot model multi-user interactions such as collaborative tools or viral effects in social apps, since UX rarely involves just one person interacting with a system. The efficacy of AI-simulated users is directly tied to training data quality, raising significant bias concerns—if datasets lack diversity or contain inherent biases, the generated insights will be skewed. The authors emphasize that rigorous validation against real-world human data is essential, yet collecting that data undermines the premise of using AI as a complete replacement. UX practitioners interviewed expressed that “the person who actually does the research, interviews people, and builds the personas develops a much deeper connection”—the creation process itself constitutes the primary value, something automation cannot provide.
5. Nielsen Norman Group — “Synthetic Users: If, When, and How to Use AI-Generated Research” (2024)
Hypothesis/Objectives: This practical guidance piece from the leading UX research consultancy examines whether AI-generated “synthetic users” can serve as a useful complement to—or dangerous replacement for—real user research. The authors investigate specific use cases where synthetic research might add value while establishing clear boundaries for responsible use. Their guiding principle is that “UX without real-user research isn’t UX,” and they seek to answer whether fake research is truly better than no research at all.
Methodology: The NN/g researchers conducted hands-on evaluations using both Synthetic Users (the commercial product) and ChatGPT to generate synthetic users and insights for three actual studies they had previously performed with real users. They specified user groups and research goals, then compared AI-generated interview transcripts against real user responses across multiple dimensions: accuracy of experiences described, depth of insights, behavioral realism, and value prioritization. They also interviewed Hugo Alves, cofounder of Synthetic Users, to understand intended product applications and limitations.
Key Findings: The evaluation revealed significant limitations: AI provides an unrealistic view of human behavior due to sycophancy (the tendency to please), synthetic users vastly outperform real humans in tasks like tree testing (an unrealistically clean result that no human cohort produces), and responses about past experiences are often idealized rather than accurate. When asked about course completion or discussion forum usage, synthetic users gave uniformly positive responses contradicting real user behavior. Values, desires, and needs from synthetic users are too shallow—they “care about everything” equally, providing no useful prioritization for feature development. The authors conclude that synthetic users are valid only for desk research and hypothesis generation, should never replace real research, and pose particular risks in low UX-maturity organizations where stakeholders might use them as permanent substitutes. They warn that teams starting with synthetic research may become dependent on it and never progress to real-user research, potentially damaging organizational understanding of research value permanently.
This article was written for B2B SaaS executives and product leaders seeking to understand the practical applications and limitations of synthetic data in qualitative research. For consulting support on AI-enhanced market research methodologies, competitive intelligence, or product-market fit studies, contact DevelopmentCorporate LLC.


