How Often Does AI Recommend You? Here Is How to Find Out
Most coaches think they're invisible to AI search. The real answer is messier and more useful: you're cited on some phrasings and missed on others. Here's how to count.

Built BakingSubs to 162,500 Copilot citations and accelerating. Now teaching the system behind it.
"Am I cited by AI?" is the wrong question. The right one is "how often, on which phrasings, and by which engine?" A coach who shows up in ChatGPT 1 time out of 8 buyer questions is in a very different spot than one who shows up 0 out of 8, and the fix is different too.
Key takeaways
- AI recommendation is not binary. The same buyer intent gets phrased 5 to 10 different ways, and most experts get cited on one or two of those phrasings while missing the rest.
- A useful audit counts hits across multiple phrasings per engine. The AI Visibility Check uses 8 buyer-intent questions per engine for this exact reason.
- "3 of 8" means you have a foothold and need to broaden phrasings. "0 of 8" means a structural problem on your site, not a phrasing problem.
- Each engine answers differently. ChatGPT, Claude, Perplexity, and Microsoft Copilot weight different signals, so you have to test all four.
- The fix changes based on the score. A foothold needs more content on the missed phrasings. A zero needs author signals, schema, and clear topic pages first.
Why "am I cited" is the wrong question
Ask ChatGPT "best life coach for new moms in Austin" and you might get a clean answer with three names. Ask it "life coach for postpartum identity loss Austin" and you might get a different three. Ask it "who helps new moms figure out what they want after baby" and you might get a list with zero coaches and three Substack writers.
Same buyer, same city, same problem. Three different answers. If you only tested the first phrasing and saw your name, you'd think you'd won. If you only tested the third and saw nothing, you'd think you were invisible. Both conclusions would be wrong.
This is the part most "check if AI mentions you" advice gets wrong. It treats AI citation as a yes/no. It isn't. It's a frequency, and the frequency tells you what to do next.
How to actually count: the 8-question method
Pick one buyer outcome you want to be hired for. Then write that same outcome 8 different ways, the way real buyers would type it. Three are how an insider phrases it. Three are how a confused beginner phrases it. Two are how a frustrated buyer phrases it after one bad experience.
Then ask all 8 to ChatGPT, Claude, Perplexity, and Microsoft Copilot. Note whether your name (or your site) appears in the answer. Tally per engine.
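If you paste the answers into a file instead of eyeballing them, the tally can be sketched in a few lines of Python. This is a minimal sketch, not a tool: the names, the domain, and the sample answers are made up for illustration, and it assumes you collected the answer text by hand. It only checks whether your name or site appears in each answer, ignoring case.

```python
def mentions_you(answer: str, markers: list[str]) -> bool:
    """Return True if any marker (your name or domain) appears in the answer."""
    text = answer.casefold()
    return any(marker.casefold() in text for marker in markers)


def tally(answers: dict[str, list[str]], markers: list[str]) -> dict[str, int]:
    """Count, per engine, how many pasted answers mention you."""
    return {
        engine: sum(mentions_you(answer, markers) for answer in texts)
        for engine, texts in answers.items()
    }


# Hypothetical data: two answers per engine to keep the example short.
# A real run would have 8 answers per engine, one per phrasing.
answers = {
    "ChatGPT": ["Try Naomi Example, a mediator in Sacramento.", "Options: A, B, C."],
    "Perplexity": ["See naomiexample.com for details.", "naomi example is often cited."],
}
markers = ["Naomi Example", "naomiexample.com"]

scores = tally(answers, markers)
print(scores)  # {'ChatGPT': 1, 'Perplexity': 2}
```

A plain substring match will miss misspellings and nicknames, so skim the answers it marks as misses before you trust the count.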
Here is what the score tells you:
- 0 of 8 across all engines: structural problem. The engines can't tell who you are or what you do. The usual causes are missing author schema, no clear "who I help" page, or a homepage that reads like a feelings poem instead of making a clear claim. Fix the site before publishing more.
- 1 to 2 of 8 on one engine, 0 on others: you have a single piece of content that happened to make it into one engine's training data. Lucky, not durable. Treat it as zero and build properly.
- 3 to 5 of 8 on at least one engine: real foothold. The engines understand who you are. You're missing on the phrasings you haven't written for. Add content that uses the missed phrasings as H2s.
- 6 to 8 of 8 on one or more engines: you're in the recommendation set. Now you defend it by widening the topical footprint so a newer competitor can't displace you.
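The bands above can be sketched as a small lookup. One simplifying assumption, which is mine and not part of the method: the function reads only your best single-engine score, so it does not separately check that the other engines are at zero for the 1 to 2 band. The band labels are shorthand for the four bullets.

```python
def classify(scores: dict[str, int], questions: int = 8) -> str:
    """Map per-engine hit counts to one of the four bands described above.

    Assumption: the band is decided by the best single-engine score.
    """
    if not scores:
        raise ValueError("need at least one engine score")
    if any(not 0 <= s <= questions for s in scores.values()):
        raise ValueError(f"scores must be between 0 and {questions}")

    best = max(scores.values())
    if best == 0:
        return "structural problem"
    if best <= 2:
        return "lucky, treat as zero"
    if best <= 5:
        return "foothold"
    return "recommendation set"


# Example scores: a mix of results across the four engines.
example = {"ChatGPT": 4, "Perplexity": 3, "Claude": 2, "Microsoft Copilot": 1}
print(classify(example))  # foothold
```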
This is the same logic behind the AI Visibility Check. It runs 8 discovery-intent questions per engine and sorts you into four branch outcomes (Invisible, Mixed, Winning, Empty-niche), because a single yes/no result hides the information you actually need.
A composite that shows why frequency matters
Take Naomi, a divorce mediator in Sacramento who works with couples splitting up small family businesses. She'd read three "is your business invisible to AI" posts and assumed she was invisible. She wasn't, exactly.
When she tested 8 phrasings on ChatGPT, she got 1 hit: "mediator for business owners going through divorce California." That was the exact phrase she'd used in her homepage H1. On the other 7 phrasings ("divorce mediator small business owner", "splitting a family business in divorce", "what to do when you and your spouse own a company and are divorcing", and so on), she was nowhere.
Her conclusion before the test: "AI doesn't know I exist."
Her conclusion after the test: "AI knows I exist on exactly one phrasing, the one I wrote my homepage around. I have a foothold. I need to write the other 7 phrasings."
She published three new pages, each built around a missed phrasing as the H1, with the actual buyer question wording in H2s. Eight weeks later she retested. ChatGPT: 4 of 8. Perplexity: 3 of 8. Claude: 2 of 8. Microsoft Copilot: 1 of 8. The phrasings she'd published for were now hitting. The fix took weeks, not months, because she wasn't starting from zero. She was widening a foothold.
If she'd accepted the "I'm invisible" conclusion from a single test, she would have rebuilt her whole site. Counting frequency saved her three months of misdirected work.
What changes by engine, and why you have to test all four
Each engine answers the same question differently, so a 4 of 8 on ChatGPT can sit next to a 0 of 8 on Claude. Some patterns worth knowing:
- ChatGPT tends to surface names it has seen across multiple sources. If you're only cited on your own site, ChatGPT often won't pull you. If a podcast guest spot or a guest post mentions you alongside a topic, ChatGPT picks that up.
- Claude weights author identity hard. A clear "this human, with this credential, in this niche" signal moves the needle. Sites without a real Person schema or an obvious bio often get treated as faceless brands.
- Perplexity surfaces sources visibly in its answer, so it favors pages that are clearly the canonical answer to a specific question. Thin "5 tips" posts lose; one deep page on one specific buyer question wins.
- Microsoft Copilot is the one most coaches under-test, which is a mistake. BakingSubs has earned 162,500 Microsoft Copilot citations to date, with 112,500 of those landing in just the last three months. Copilot rewards topical clusters that cover a niche end to end.
If you only test ChatGPT, you're seeing a quarter of the picture. The buyers using Claude, Perplexity, and Copilot are still real buyers, and on some phrasings they get a different answer than the ChatGPT user sitting next to them.
What to do with your score
Run the test. Write down the number per engine. Then pick the action that matches what you found.
If you scored 0 across the board, the problem is upstream of content. Your About page probably doesn't make it obvious who you help. Your homepage probably doesn't name the specific person you serve. There's no clean schema telling engines you're a real expert. Fix that first or every piece of content you publish after will be misread the same way.
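For the schema piece, one common form is a schema.org Person object embedded as JSON-LD on your About page. The sketch below builds one in Python and wraps it in the script tag it would sit in. Every value is a placeholder to replace with your own, and which properties any given engine actually reads is an assumption here, not something this example proves.

```python
import json

# Placeholder values: swap in your real name, role, URLs, and topics.
person = {
    "@context": "https://schema.org",
    "@type": "Person",
    "name": "Naomi Example",
    "jobTitle": "Divorce mediator",
    "description": "Mediator for couples dividing a small family business.",
    "url": "https://example.com/about",
    "knowsAbout": ["divorce mediation", "family business division"],
    "sameAs": ["https://www.linkedin.com/in/example"],
}

# JSON-LD is usually embedded in a script tag in the page's HTML.
snippet = (
    '<script type="application/ld+json">\n'
    + json.dumps(person, indent=2)
    + "\n</script>"
)
print(snippet)
```

The schema only helps if it matches what the page visibly says, so write the plain-language "who I help" statement first and mirror it in the description.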
If you scored 1 to 5 of 8 on at least one engine, you have a foothold. Look at which phrasings hit and which missed. The missed ones are your editorial calendar for the next quarter. Each missed phrasing becomes one new page, with that phrasing in the H1 and the surrounding buyer language in H2s. This is the part of the Citation Cluster Method that does most of the lifting: you don't need new tactics, you need topical coverage on the phrasings you missed.
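Turning misses into a page list is mechanical enough to sketch. The example below uses the four phrasings quoted in Naomi's ChatGPT test earlier (the other four weren't listed, so they're left out), and the `h1` and `status` field names are just illustrative.

```python
def page_backlog(hits: dict[str, bool]) -> list[dict[str, str]]:
    """Return one planned page per missed phrasing, using the phrasing as the H1."""
    return [
        {"h1": phrasing, "status": "to write"}
        for phrasing, hit in hits.items()
        if not hit
    ]


# True means the engine's answer named you for that phrasing.
chatgpt_hits = {
    "mediator for business owners going through divorce California": True,
    "divorce mediator small business owner": False,
    "splitting a family business in divorce": False,
    "what to do when you and your spouse own a company and are divorcing": False,
}

backlog = page_backlog(chatgpt_hits)
for page in backlog:
    print(page["h1"])
```

Run it once per engine and compare the lists: a phrasing missed everywhere is a stronger candidate for the next page than one missed on a single engine.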
If you scored 6 to 8 on at least one engine, you're winning. Now you widen. Add adjacent topics that buyers ask about right before or right after the one you own, so a competitor can't slide in on the next-door query. Smaller, more focused competitors are the ones who displace you, not bigger ones with more brand awareness.
What the score does not tell you
Frequency tells you how visible you are. It doesn't tell you whether the buyers seeing the recommendation are the right ones for your offer. You can be cited 7 of 8 times for "affordable life coach" and still get the wrong calls, because the buyers who type "affordable" are not the buyers who pay $4,000 for a six-month engagement.
So after you count, look at the phrasings you're winning on and ask: do these match the buyer I actually want? If not, the fix is to shift the phrasings you publish around, not to publish more of the same. Google's AI Overviews can quietly send you to the wrong audience the same way. Frequency without intent match is busy work.
Frequently asked questions
How many phrasings do I actually need to test?
8 per engine is the floor for a useful read. Fewer than that and a lucky hit or unlucky miss will skew your interpretation. The AI Visibility Check uses 8 because below that the false-positive rate is too high, and above 8 the marginal information per question drops off.
Does AI recommend the same business consistently or does the answer change every time?
It changes more than people realize. The same prompt, run at different times of day or on different accounts, sometimes returns different recommendation sets. This is another reason to test multiple phrasings rather than one prompt repeated, and another reason to interpret your score as a frequency band, not a precise number.
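If you do repeat the full 8-question test a few times on one engine, reporting the score as a band takes very little code. A rough sketch with invented numbers, assuming each run is the hit count out of 8:

```python
from statistics import median


def frequency_band(runs: list[int]) -> tuple[int, int, float]:
    """Summarize repeated test runs as (lowest, highest, median) hits."""
    if not runs:
        raise ValueError("need at least one run")
    return min(runs), max(runs), median(runs)


# Hypothetical: four runs of the same 8 phrasings on one engine.
low, high, mid = frequency_band([3, 4, 2, 4])
print(f"{low} to {high} of 8 (median {mid})")  # 2 to 4 of 8 (median 3.5)
```

All three numbers here land in the foothold range, so the run-to-run wobble doesn't change what you'd do next. That's the practical point of reading the score as a band.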
What if I show up on ChatGPT but not Perplexity?
That's common and tells you something specific. ChatGPT visibility without Perplexity usually means your site has enough mentions to be recognized but lacks the clean canonical-answer pages Perplexity prefers. Build one focused page per specific buyer question, with the question itself in the H1, and Perplexity citations tend to follow. There's more on how Perplexity decides who to surface in this breakdown.
Should I test brand questions or buyer questions?
Buyer questions. "Best executive coach for first-time CTOs in Boston" is a buyer question. "What does Naomi at NaomiCoaching do" is a brand question. Brand questions almost always return your site because the engine matches the brand string. Buyer questions are the test, because that's how real buyers actually search.
How often should I re-test?
Quarterly is enough for most expert-led businesses. Re-testing weekly creates noise without signal. When you publish a meaningful new piece of cornerstone content, retest the phrasings it targeted 6 to 8 weeks later, because that's roughly when most engines have indexed and weighted it.
Where to start
Pick one buyer outcome. Write 8 phrasings for it. Ask all four engines. Write down what you find. That single exercise will tell you more about your AI visibility than any guide can, because it's measuring you, not a category average. Once you have the number, the next move is obvious: fix the foundation if you scored zero, widen the topical coverage if you scored a foothold, defend the perimeter if you scored high. If you'd rather not do the manual work, the free check at /visibility-check runs the same logic across all four engines and hands you the score and the next action.