When I hopped on a Zoom call with a rep from a major survey panel provider, he had one thing on his mind: synthetic respondents.
After sitting through a long pitch while my head silently shook side to side, he finally asked, “So is this something your team might be interested in?”
I said, “No.”
And added: “Weren’t we supposed to be discussing a survey/Slack integration?”
What's a synthetic respondent?
The rep had taken me deep into the rabbit hole of synthetic respondents and their cousins, synthetic data and synthetic responses.
These terms refer to the use of AI to generate fake survey responses that look like human answers. And companies like Qualtrics and Cint are betting big that these artificial stand-ins are the future of survey research.
In fact, according to a Qualtrics-run survey of 3,200 market researchers (so, you know, grains of salt everywhere), 71% believe most research will be done using synthetic responses within three years.
Yes, Qualtrics has a vested interest. But even their own write-up admits that “some researchers are hesitant,” blaming a “lack of familiarity” for the skepticism. (Not, say, a surplus of common sense.)
So, why is the industry so enchanted with synthetic data and its algorithmic origin story?
Synthetic respondents are cheap. And fast. 🤑
Synthetic respondents are far cheaper and faster than real people, who tend to want compensation for their time, an appalling fact that may startle some of you.
Synthetic respondents never sleep. They never complain. And they can “answer” a 10-minute survey in 0.0001 seconds.
But while synthetic data promises scale and efficiency, it also raises a few red flags.
Below are three of my main beefs with synthetic respondents.
Beef 1: Weren’t we trying to avoid artificial survey responses? 🤖
I’m old enough to remember when researchers were desperate to eliminate 'bots' from their data. You know, back in... every single day of 2025.
Entire QA teams exist just to detect and remove bot responses.
Journal articles like this one and this one are still being written on how to prevent or eliminate the threat of these artificial answers.
Now suddenly, if you slap “AI” on the bot and call it a “synthetic respondent,” it’s innovation?
When I asked the rep how a synthetic respondent differed from a bot, he stammered, “They’re very different! These use AI.”
<long sigh>
In truth, I view the rise of 'synthetic respondents' as a clever rebranding of 'survey bots', one that flips the script on bot responses while cutting costs and boosting margins for respondent providers.
Artificial responses produced by survey bots are a problem that researchers must scrub from their data, while artificial responses generated by "AI" are a sexy new survey innovation.
Beef 2: We already know how to simulate responses 💻
The rep told me, with great excitement, that we could pay for 600 real human responses and get an additional 600 synthetic ones modeled on the first group.
'Double your sample, without doubling the cost.'
But this isn’t new. It’s just predictive modeling with some sleek marketing gloss.
Researchers have used imputation, simulations, and forecasts for decades. And frankly, all inferential statistics are just educated guesses about how a broader population would answer.
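To see how unremarkable the "double your sample" trick is, here's a minimal sketch of one decades-old way to do it: bootstrap resampling from the real responses. The data below are entirely made up (a hypothetical 5-point Likert item), and this assumes nothing about any vendor's actual, proprietary method:

```python
import random

random.seed(42)

# Hypothetical: 600 real responses to a 5-point Likert item
# (1 = strongly disagree ... 5 = strongly agree).
real_responses = [
    random.choices([1, 2, 3, 4, 5], weights=[5, 10, 25, 40, 20])[0]
    for _ in range(600)
]

# "Synthetic" responses via plain bootstrap resampling:
# draw 600 new answers from the empirical distribution of the real ones.
synthetic_responses = random.choices(real_responses, k=600)

def mean(xs):
    return sum(xs) / len(xs)

print(f"real mean:      {mean(real_responses):.2f}")
print(f"synthetic mean: {mean(synthetic_responses):.2f}")
```

The two means land close together, because the synthetic batch can only ever echo the distribution of the real one. A fancier model (or an LLM) changes the machinery, not the underlying idea.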
Yes, AI might make that process easier, but that’s hardly a reason to replace real respondents.
Besides, there’s a practical problem looming…
Beef 3: You can’t report results with a straight face 🙋
After the call, I recapped the pitch to my user research team. We immediately started joking:
'According to our survey of 600 humans and 600 synthetic constructs, the latter group is far more enthusiastic about our new products!...'
It's absurd, like something out of a Philip K. Dick novel.
But seriously, what happens when researchers have to address methods questions?
Because someone will ask:
'How exactly were those synthetic responses generated?'
And you’ll say: 'Well, the provider used a proprietary AI model that we’re not allowed to audit or even fully understand, but don’t worry, it’s all good.'
That’s not a satisfying answer. Because the real answer is that it’s a black box. And researchers can’t build trust with black boxes.
Why real people still matter
The purpose of surveys is to understand how people think, feel, and behave. Emphasis on people.
Synthetic respondents are fast and cheap, but they aren’t people. They don’t carry with them opinions and beliefs forged through lived experience. They don’t misread questions, or get frustrated halfway through, or give you creative open-ends that make you pause and rethink your assumptions.
Real people are messy. They hold inconsistent beliefs. They are motivated by habits and prejudices and ideologies. They enact bizarre behaviors for reasons they struggle to articulate. And all of that is exactly what makes their responses valuable.
So I understand the appeal of synthetic data. But I’m still betting on humans.