It IS Going to Write Itself. That's a problem.
AI can be your co-pilot in survey design. But if you hand it the controls, don’t be surprised when you crash into a wall of mediocrity.
Somewhere in the world, someone is writing a client feedback survey. Only… they’re not.
They type some details into a survey platform or their favorite generative AI, click a button, and the questions write themselves.
Boom. Done.
They read it through.
It sounds great.
It’s exactly the kind of thing they’ve seen before.
And that’s the problem.
The Questions Are Still Bad—You Just Didn’t Write Them
Last week, I told you most people have no idea their surveys are bad. This week, let’s tackle an even bigger blind spot.
Because people aren’t just writing flawed surveys—they’re handing the whole process to AI and assuming it’ll get it right.
And if you didn’t already know your survey had problems…
What happens when an AI writes one that looks legit in 30 seconds flat?
You’ll use it.
And worse?
You’ll trust it.
Because it looked polished.
Because it used language that felt familiar.
Because it came from a platform you respect.
Flawed surveys are carrying the credibility of tools that haven’t earned it.
This doesn’t just happen with ChatGPT, where confidence can pass for competence. Now it’s built into survey platforms—tools people already see as experts: “If this came from SurveyMonkey, it must be good.”
Let me be clear: It’s not.
Stick with me. I brought receipts.
Real Talk: AI Isn’t the Problem. Generic Thinking Is.
AI isn’t bad at survey design because it’s AI. It’s bad because it’s trained on bad surveys.
Let’s break that down.
These tools are trained on massive piles of public data—whatever’s out there, whatever gets repeated, whatever looks “typical.” They find patterns in what people post, publish, or push out into the world.
And when it comes to surveys? A lot of what’s floating around is vague, redundant, overly general, or just… not that useful.
So that’s what the AI learns to mimic.
And it does mimic it—beautifully.
It serves up all the classics:
Neutral responses that don’t mean anything.
Answer choices that overlap.
Double-barreled questions.
And of course, the ever-present “overall satisfaction” item that tells you nothing about what needs to change.
None of this should be surprising.
AI is designed to be a generalist.
It’s not a survey expert.
Here’s where it gets messy:
A platform with name recognition rolls out an “AI-powered survey tool,” and people assume it’s credible.
They assume it’s built by experts.
They assume it’s good at survey design.
That’s the trap.
You end up with a mediocre-at-best survey, stamped with the authority of a brand people trust.
And because it looks like a survey—
And it acts like a survey—
And it’s being recommended by a platform built for surveys—
People don’t question it.
I’ve been critiquing platforms’ survey templates for years.
AI hasn’t fixed the problem.
It’s just scaled it.
And honestly? That’s what gets me.
Where are the survey experts behind these tools?
How are you going to offer people “smarter surveys” and then suggest questions that break the most basic principles of good design?
You want to talk about audacity?
That’s audacity.
So, Does AI Have a Role in Survey Design? Absolutely.
But only if you’re using it intentionally. AI can’t replace your thinking—but it can help you explore, refine, and pressure-test the thinking you’re already doing.
Stuck on what to ask? AI can help you map the landscape—whether you’re running a workshop, evaluating a program, or measuring client relationships.
Need variety? It’s great at throwing 20 weird ways to frame a question at you when your brain is mush.
Feeling alone? AI can be a solid thought partner. The other day I spent two hours riffing with ChatGPT about how to design a frequency scale that didn’t overlap or overwhelm. Spoiler alert: I figured it out. But AI helped me get there.
Answer choice chaos? Trying to decide between a dropdown, a Likert scale, or something else entirely? AI’s a good sounding board when you need options.
Bottom line?
✅ Use it to brainstorm what to ask.
✅ Use it to talk through your answer choices.
✅ Use it to iterate when you feel stuck.
Use it as a thought partner—not a design lead.
Don’t outsource the thinking.
Don’t let AI smooth-talk you into neutral options and unlabeled 0–10 scales.
Don’t assume “looks good” means is good.
At the end of the day, AI doesn’t know good survey design (yet).
It’s just remixing the same tired patterns you’ve seen a thousand times.
They’ll sound legit.
They’ll feel familiar.
And they’ll lead you nowhere.
When You Let AI Write the Survey: A Side-by-Side Test
Need to see it to believe it? Let’s go. Like I said, I brought receipts.
I gave the same client feedback prompt to three tools:
✅ SurveyMonkey’s built-in AI
✅ ChatGPT (in incognito mode)
✅ My fine-tuned model
Let’s just say… I’m only endorsing one of those tools, and even then, not yet, because it’s still in training.
Here’s the TL;DR
SurveyMonkey AI – On topic, but both overly general and redundant. Answer choices are bland and outdated, scales are default, and the structure is exactly what I’d expect from a template…not AI.
➡️ Feels familiar, but gives you very little to work with.
🛑 Rating: Do not use.
SurveyMonkey AI Take 2 – The prompt was more detailed—and so are the questions. But the core issues persist: overlapping answer choices, vague satisfaction language, and another round of “neutral” with no clear meaning.
➡️ More aligned—but still not actionable.
🛑 Rating: Still no.
ChatGPT – Polished layout, more questions, and better variety in answer scales. Looks professional and includes structure most users would feel good about sharing. But it still makes key mistakes: double-barreled phrasing, overly general questions, outdated answer choices, and an unlabeled NPS scale.
➡️ Strong draft—but still needs a human editor.
🟡 Rating: Use with caution.
My Fine-Tuned Model – Leans toward more specific, more actionable questions. Clarifies intent, asks follow-ups, and reflects the kinds of choices I aim for in strong survey design. Still rough in presentation—and clearly still a little confused about what makes a great survey.
➡️ Still early—but already thinking like a real survey designer.
🦄 Rating: Not ready yet but headed for unicorn status.
👉 Keep reading for the full walkthrough.
Here’s the thing: these survey mistakes aren’t cosmetic—they’re foundational.
When your questions are too vague, too combined, too padded with meaningless middle points, the data you collect isn’t just imprecise.
It’s wrong.
And that has consequences.
Years ago, I read an impact report where workers had been asked if they had enough lighting and ventilation in their factory—in the same question.
It still haunts me. What an absolute waste of a question. You can’t possibly know if you need to fix the lights, the air system, or both.
Does the factory spend extra money trying to fix everything? Or nothing, because the feedback was too muddy to act on?
That’s what a bad survey does.
It doesn’t just miss the point. It buries it.
And if you don’t catch it early…
you’ll still make decisions—just not the right ones.
👇 Supplement: Play-by-play of what each tool did—and why most of it falls flat.
Yes, it’s a bit of a long read.
But if you’ve ever wanted to see what AI-generated surveys actually look like?
This is the part worth slowing down for.
SurveyMonkey AI: Take 1
With SurveyMonkey, you don’t even have to write a prompt. It has pre-made ones ready to go. I used SurveyMonkey’s prompt for Customer Satisfaction:
I work for a consulting firm and we’re gathering feedback from clients. We want to know about the quality of our services, communication, responsiveness, overall satisfaction, and likelihood to recommend us. Also, include industry and business size.
Pretty detailed, right? It seems like just about what people would type into a prompt. I mean heck, I imagine some people just type in “I want a client feedback survey.”
The Survey it Generated
How would you rate the overall quality of our consulting services?
Excellent, Good, Average, Poor, Very poor
How satisfied are you with the communication from our team?
Very satisfied, Satisfied, Neutral, Dissatisfied, Very dissatisfied
How would you rate our responsiveness to issues?
Excellent, Good, Average, Poor, Very poor
How satisfied are you with our consulting firm overall?
Very satisfied, Satisfied, Neutral, Dissatisfied, Very dissatisfied
How likely is it that you would recommend our consulting services to a friend or colleague?
[Rating with values 0-10]
What is the primary industry of your business?
(Text box)
What is the size of your business?
1-10, 11-50, 51-200, 201-500, 501-1000, More than 1000
My thoughts:
At first glance, this looks like a typical customer satisfaction survey. And that’s part of the problem. It leans heavily on generalized satisfaction language without offering the specificity needed to make sense of what’s actually working—or what isn’t.
Across the board, the answer choices reflect outdated conventions. Scales like Very Poor, Poor, Average, Good, Excellent have been used for decades—and haven’t aged well. They lack dimensionality, they may feel stigmatizing, and they don’t offer any guidance on what specific improvements are needed.
The inclusion of “Neutral” is especially problematic. It’s an ambiguous middle ground that can mean anything—from “I’m unsure” to “I’d rather not answer.” That makes it hard to analyze, interpret, or use in a meaningful way.
The question about industry is left as an open text field, which creates unnecessary barriers for both respondents and analysts. A structured list—ideally a “select all that apply” option—would reduce effort, standardize responses, and make the data more usable.
The business size categories are imprecise. Grouping a one-person business with a 10-person team overlooks meaningful differences in scale and complexity. It reflects a broader lack of intentionality in how categories are defined.
The survey reads as repetitive. The questions “How would you rate the overall quality of our consulting services?” and “How satisfied are you with our consulting firm overall?” don’t serve clearly distinct purposes. And, really, how many times do you need to say satisfaction in one survey?
In short, this survey feels generic and overly familiar. You might find out whether clients were satisfied or dissatisfied with something like communication, but you won’t know why. Was it the method? The frequency? The tone? You don’t get enough detail to take meaningful action.
If you’re not learning something specific from your data, why are you sending the survey in the first place?
🛑 Rating: Do not use. Generic, outdated, and lacking in actionable value.
SurveyMonkey AI: Take 2
I tried a second, more specific prompt to see how things changed:
I work for a consulting firm and we're looking to gather feedback from our clients about their experience working with us. I'd like to get specific feedback on the deliverables and the process of working with us--thinking in particular about how responsive we are, if they feel we met their needs, what are areas for improvement, and perhaps something about timeliness of our work? Of course, let's include the NPS.
SurveyMonkey’s revised questions?
How satisfied are you with the overall experience?
Very satisfied, Satisfied, Neutral, Dissatisfied, Very dissatisfied
How would you rate the quality of deliverables?
Excellent, Good, Average, Below Average, Poor
Do you feel we met your needs and expectations?
Exceeded Expectations, Met Expectations, Somewhat Met Expectations, Did Not Meet Expectations
How would you rate the timeliness of our work?
Always on Time, Usually on Time, Sometimes Late, Often Late, Always Late
What areas can we improve?
(text box)
How likely is it that you would recommend our consulting firm to a friend or colleague?
[Rating with values 0-10]
My thoughts:
This time, the prompt was more detailed—and to its credit, the survey reflects that. The questions are more aligned with what I asked for: feedback on deliverables, timeliness, and responsiveness. That’s a step in the right direction.
But it still runs afoul of survey design best practices.
The answer scales still feel outdated. One question uses a variation of the typical “excellent to poor” format, swapping in below average between average and poor. But that only introduces more confusion: where exactly is the line between below average and poor? And do those terms mean the same thing to everyone?
We also see overlapping response choices in other questions—for example, “usually on time” vs. “sometimes late.” In many cases, those could describe the same experience, just with different framing.
The “Neutral” response option shows up again, bringing the same interpretive challenges: indifference, uncertainty, and opt-outs are all bundled into one category.
There’s also a double-barreled question that asks if the firm met the client’s needs and expectations. Those aren’t the same thing—you can meet someone’s needs without exceeding their expectations, or vice versa. Lumping them together reduces clarity and masks nuance.
The open-ended question “What areas can we improve?” is a missed opportunity. A structured list of common improvement areas—paired with an “Other (please describe)” option—would have been more helpful for respondents and far easier to analyze. I recommend this change in almost every client project I work on.
And of course, we still see overall satisfaction appear—fine for a headline stat, but not useful if your goal is actionable insight.
So yes, this version looks more specific. But the deeper design choices still fall short.
I’m still not impressed. At this point, I’m curious: Is SurveyMonkey leveraging ChatGPT—or could ChatGPT actually do it better?
🛑 Rating: Do not use. An improvement over the first iteration? Yes. Good enough to make business decisions with? No.
ChatGPT
I ran the exact same prompt in ChatGPT using an incognito browser—no login, no chat history, and no memory of my usual survey preferences. The results?
Client Experience Survey
Thank you for taking a few minutes to share your thoughts about your recent experience working with [Your Consulting Firm's Name]. Your feedback helps us improve our services and better meet your needs.
Section 1: Overall Experience
Net Promoter Score (NPS):
On a scale of 0 to 10, how likely are you to recommend [Firm Name] to a colleague or peer?
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
Please briefly explain your rating: (Open text)
Section 2: Communication & Responsiveness
How would you rate our responsiveness throughout the project?
Excellent, Good, Fair, Poor
Did you feel you had consistent and clear communication with our team?
Always, Often, Sometimes, Rarely
Section 3: Understanding & Meeting Needs
To what extent did you feel we understood your goals and needs?
Completely, Mostly, Somewhat, Not at all
How well did our deliverables align with your expectations and objectives?
Exceeded expectations, Met expectations, Fell short of expectations
Please elaborate on your answer:
Section 4: Timeliness & Process
Was the work delivered on time, according to the agreed schedule?
Yes, Mostly, Occasionally late, Frequently late
How would you rate the overall process of working with us (e.g., structure, collaboration, project management)?
Excellent, Good, Fair, Poor
Section 5: Opportunities for Improvement
What could we have done better or differently to improve your experience? (Open text)
Are there any services or capabilities you wish we had offered as part of this engagement? (Open text)
Section 6: Final Thoughts
Would you consider working with us again in the future?
Yes, No, Unsure
Any additional comments you'd like to share? (Open text)
My thoughts:
Right away, this version stood out.
It included an introductory paragraph, clear section headers, piped-in firm names, and well-formatted spacing. Visually? It’s clean, professional, and easy to read. It looks like something a lot of people would use—and feel confident sharing.
It also asked more questions than either of the SurveyMonkey versions, and some of those questions were more specific. There’s good variation in how answer scales are constructed, which adds dimension and avoids the templated feel of the earlier examples.
In short: this looks better. It is better.
If I had to choose between this and SurveyMonkey’s AI, I’d choose this.
But it still violates several core survey design principles.
The most surprising miss? The Net Promoter Score question includes a 0–10 scale…with no labels for either end. That’s a serious oversight. How do you know what a “0” means if you’re not told? (I mean okay, people might know the NPS, but it’s the principle!).
We also see multiple double-barreled questions:
Clear and consistent
Expectations and objectives
Goals and needs
These are not interchangeable concepts. Grouping them together makes it impossible to know what people are actually reacting to—and that makes the results hard to interpret. (As an aside: double-barreled questions are one of the most common issues I see when ChatGPT generates survey content.)
Some of the open-ended questions could easily be closed or structured to reduce cognitive load and improve analysis. And overall, the survey still leans on general, summary-style questions that don’t give you much insight into specific experiences or actionable improvements.
This is definitely the glow-up version: clean, structured, and confident. But that confidence still hides design issues that would trip up your analysis later.
It’s a strong draft—but not a finished survey.
🟡 Rating: Use with caution. Stronger starting point, but needs expert review and revision.
MY Model
Yep—I started by feeding SurveyMonkey’s survey (Take 2) into my fine-tuned AI model. Here’s what it thought of it:
What are you really trying to learn about deliverables? Quality, usefulness, clarity. Consider a more specific question if overall satisfaction isn't enough.
Reword example: To what extent do you agree with each of the following? Our deliverables are high quality. Consider the delivery time, delivery method, timeline, and format.
What was missing? Hard to understand or off.
How would you rate your responsiveness? What are you really trying to measure? Clarify the rating.
Did consulting meet your needs and expectations? Good question, but watch out for overlap between expectations over satisfaction. Make sure they're not redundant.
How would you rate the timeliness of our work? Offer a clear scale.
Look—I know it’s messy.
The phrasing’s rough. The structure’s clunky.
It’s not ready for prime time.
But honestly? This gave me hope.
Because it’s not just parroting a format—it’s pushing for specificity.
It’s pushing for specificity.
It’s asking what we’re really trying to learn.
It’s spotting overlaps.
It’s recommending rewording.
And to me, that’s a much better starting point.
I decided to push further. I gave my model the exact same prompt I used with SurveyMonkey and ChatGPT—and here’s what it generated:
[Keep the NPS as is]
Which deliverables did we provide for your project? (Checkbox)
For each deliverable, did it meet your needs? Completely, Mostly, Somewhat, Not at all.
How responsive was our team throughout your project? Too slow, Just right, Too much
Which statement best describes your experience working with us on deliverables?
The drafts were clear and aligned to what I needed.
I had to give feedback multiple times to get a draft that was aligned.
I never received a draft that met my needs.
Other than timeliness, what is one area we could improve? (Timeliness)
My thoughts:
It’s still learning. It’s a little messy.
But this is the direction I want to see.
Asking which deliverables were provided—and then following up on each one?
That’s textbook KLH survey technique. (Those are my initials, by the way.)
The response options offer real variety.
The questions are focused and specific.
And unlike the other tools, this version isn’t just trying to sound professional—it’s trying to get real answers.
It’s not ready yet. But it’s thinking the right way.
Which, honestly, is more than I can say for most.
🦄 Rating: Not ready—but it’s going to be a unicorn.

