3 GEO experiments you should try this year

Last month, I asked ChatGPT, Perplexity, and Gemini the same question about three of my clients: “Who is [Brand Name] and what do they do?”
Two out of three got it wrong. Wrong services. Outdated office locations. One even suggested a competitor as a better alternative.
Here’s what makes that more than a curious mistake.
What today’s AI errors reveal about brand visibility
AI-sourced traffic jumped 527% year-over-year from early 2024 to early 2025.
While that growth is real, it’s growing from a very small base. Most sites still see AI referrals representing less than 1% of total traffic.
But when half the AI-generated descriptions of your brand are inaccurate, that’s not just a future problem. That’s shaping perceptions right now.
The challenge isn’t whether to optimize for AI systems – it’s how to do so effectively.
It’s figuring out what actually works versus what’s just repackaged fundamentals being sold as something revolutionary.
And unlike traditional SEO, where we can forecast traffic and revenue with reasonable confidence, AI search doesn’t work that way.
You can’t sell certainty here. You can only sell controlled learning.
Most effective GEO tactics turn out to be SEO fundamentals applied to a new visibility layer.
Structure, clarity, and consistent information have always mattered.
What’s changed is that these principles now impact how AI systems summarize and cite your content, not just how users find and interact with it.
The only way to separate truth from fiction is to run small, reversible experiments that produce decision-quality data.
The cost of not knowing what works is higher than the cost of finding out.
Below are three GEO experiments you can run to understand how AI systems read, summarize, and reuse your content.
These are practical tests most teams can complete in 60–90 days, and each one produces clear insights about whether these tactics actually move the needle for your business.
Think of these as controlled learning opportunities, not traffic promises.
Experiment 1: Build an LLM-ready topic cluster
Marketers have been building topic clusters for years. But GEO changes the rules.
Generative systems don’t read content the way humans do.
They chunk it, looking for clean entities, clear answers, consistent language, and predictable structure.
When your content is organized in this way across an entire cluster, it becomes easier for AI systems to understand and cite you as a preferred source.
This first experiment tests exactly that.
Pick a cluster with business value
Choose a topic where you already have strong content or where you desperately need to grow visibility.
Use internal site search, Google Search Console queries, and customer support calls to find the natural-language questions your audience is already asking.
These often mirror the queries and prompts potential customers use in LLM platforms.
Tip: If your support team hears the same question three times in a week, that’s your signal.
Build (or rebuild) the cluster for machine readability
Here’s what I’ve seen work across tests.
- Structure your pillar page around natural-language questions
- Your H2s should mirror the way real humans phrase queries:
  - “What is [topic]?”
  - “How much does [topic] cost?”
  - “What’s the best option for beginners?”
  - “What should I avoid?”
- AI tools favor pages that answer questions the way users actually ask them, not the way we think they should ask them.
- Lead with a summary-first design
- Make the first 100–150 words a fast, clear overview.
- No slow intros. No storytelling wind-up. No “In today’s fast-paced digital landscape…” fluff.
- Use consistent Q&A formatting
- Break down every page with predictable formatting:
  - Question
  - Short answer (1–2 sentences)
  - Supporting detail (2–3 paragraphs)
  - Optional table or list
- This format is great for LLMs. It tells them exactly where to look and what to extract.
- Don’t skip schema and internal links
- Use FAQPage, HowTo, Product, Organization, or LocalBusiness schema – whatever’s relevant to your content (a minimal example follows this list).
- Use internal links to establish the cluster hierarchy so models don’t have to guess which page answers which question. Make the relationship between the pillar and supporting pages clear.
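To make the schema piece concrete, here’s a minimal sketch of FAQPage markup generated with Python’s standard json module. Every question and answer string below is a placeholder, not copy from a real page; whatever you emit should mirror the content users actually see on the page.

```python
import json

# Placeholder FAQPage markup for one cluster page.
# The question and answer strings are illustrative only – swap in the exact
# copy that already appears on the page, since schema should match visible content.
faq_schema = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [
        {
            "@type": "Question",
            "name": "What is [topic]?",
            "acceptedAnswer": {
                "@type": "Answer",
                "text": "One- to two-sentence answer, copied from the page.",
            },
        },
        {
            "@type": "Question",
            "name": "How much does [topic] cost?",
            "acceptedAnswer": {
                "@type": "Answer",
                "text": "Short, direct answer using the same figures shown on the page.",
            },
        },
    ],
}

# Emit the block you would paste into the page template.
print('<script type="application/ld+json">')
print(json.dumps(faq_schema, indent=2))
print("</script>")
```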
Measure the right things
Here’s what to track over the next 60 days (a simple logging sketch follows this list):
- AI Overview appearances for your target cluster queries (use incognito mode and check manually twice per week, or use tools like Semrush if you have access)
- LLM citation patterns: Run the same queries through ChatGPT, Gemini, and Perplexity. Do they reference your site? How accurately?
- Organic traffic and conversions within the cluster
- Consistency of descriptions: Are LLMs describing your content the same way, or are they confused?
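If you’re doing these checks manually, even a tiny script keeps the observations consistent enough to compare later. Here’s a rough sketch, assuming you append each check to a local CSV; the file name and column names are my own, not a standard.

```python
import csv
import os
from datetime import date

# Assumed columns for a simple manual-check log – rename to fit what you track.
# "platform" can be Google (AI Overviews), ChatGPT, Gemini, or Perplexity.
FIELDS = ["date", "platform", "query", "brand_included",
          "description_accurate", "notes"]

def log_check(path, platform, query, brand_included,
              description_accurate, notes=""):
    """Append one manual observation to the tracking CSV."""
    is_new_file = not os.path.exists(path)
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new_file:  # write the header row only once
            writer.writeheader()
        writer.writerow({
            "date": date.today().isoformat(),
            "platform": platform,
            "query": query,
            "brand_included": brand_included,
            "description_accurate": description_accurate,
            "notes": notes,
        })

# Example: one Perplexity check for a target cluster query.
log_check("geo_cluster_checks.csv", "Perplexity", "what is [topic]",
          brand_included=True, description_accurate=True,
          notes="cited the pillar page")
```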
Here’s the key distinction.
In traditional SEO, we focus on traffic and revenue metrics.
With GEO experiments, you’re tracking leading indicators, signals that tell you whether AI systems understand and trust your content, even before those signals translate into measurable traffic.
Think of it like this: Citation accuracy and entity recognition are the new “rankings.”
They indicate whether you’re positioned to benefit as AI search volume grows.
Compare against a control
This is critical: test this cluster against another one you didn’t optimize.
If the LLM-ready cluster shows more AI Overview inclusion, more accurate answers, and steadier organic performance, you’ve found a lever worth scaling.
Example:
- I rebuilt a topic cluster for a dental practice around “teeth whitening options.”
- Within 75 days, they appeared in AI Overviews for nine out of 13 target queries, up from two.
- Traditional organic traffic held steady, but the brand’s visibility in AI-generated answers increased.
Why this works (beyond just AI)
Here’s what makes this experiment particularly valuable: the same structural improvements that help AI systems understand your content also tend to improve traditional search performance.
Clear headings, direct answers, and logical content organization help Google parse your content more effectively.
Users appreciate the clarity, too. Shorter time to finding answers typically correlates with better engagement metrics.
So even if AI traffic remains a small percentage of your total traffic, you’re building content that performs better across all channels.
That’s the kind of optimization worth investing in.
Dig deeper: Chunk, cite, clarify, build: A content framework for AI search
Experiment 2: Run a brand entity and sentiment sprint
AI is terrible at nuance.
If your brand story isn’t consistent across platforms, LLMs will sometimes make something up, or worse, they’ll confidently tell users something completely wrong about you.
Models pull brand information from:
- Reviews (Google, Yelp, Trustpilot, niche directories).
- Business directories.
- Editorial content and news mentions.
- Reddit and industry forums.
- Social profiles.
- Schema markup.
- Knowledge graph sources (Wikidata, Crunchbase, etc.).
They mix all of that into “the brand story” they present to users. If that story is inconsistent, models fill in the gaps with outdated or incorrect information.
That’s where this experiment comes in.
Audit what AI already thinks about you
Ask ChatGPT, Gemini, and Perplexity questions like:
- “Who is [Brand Name]?”
- “What does [Brand] offer?”
- “Is [Brand] good for [specific use case]?”
- “What are alternatives to [Brand]?”
Log everything:
- Accuracy of the description.
- Sentiment (positive, neutral, negative).
- Sources referenced.
- Competitors mentioned.
- Any stale, incorrect, or misleading details.
This becomes your “before” snapshot. Save screenshots. You’ll need them.
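A lightweight way to keep that snapshot comparable over time is a dated JSON file alongside the screenshots. This is just a sketch under my own assumed field names; paste in the answers you collected by hand.

```python
import json
from datetime import date

# Assumed structure for the "before" snapshot – rename fields as you see fit.
snapshot = {
    "captured": date.today().isoformat(),
    "brand": "[Brand Name]",
    "entries": [
        {
            "platform": "ChatGPT",
            "prompt": "Who is [Brand Name]?",
            "answer": "Paste the full response here.",
            "accurate": None,         # fill in after review: True / False
            "sentiment": None,        # "positive" / "neutral" / "negative"
            "sources_cited": [],
            "competitors_mentioned": [],
            "issues": [],             # stale, incorrect, or misleading details
        },
    ],
}

filename = f"brand_snapshot_{snapshot['captured']}.json"
with open(filename, "w") as f:
    json.dump(snapshot, f, indent=2)

print(f"Saved baseline to {filename} – rerun the same prompts in 60–90 days.")
```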
Clean up entity signals everywhere
You want consistency across all major touchpoints.
Think of it this way: if your brand info is scattered, AI will Frankenstein together whatever it finds first.
Here’s where the biggest wins come from:
On-site cleanup
- Update your Home and About page with clear signals: what you do, where you operate, who you serve, recognizable brand names, and key differentiators.
- Implement Organization and LocalBusiness schema (a minimal example follows this list).
- Consolidate or redirect duplicate pages that confuse models.
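For the schema bullet above, here’s a minimal Organization example in the same sketch style. Every value is a placeholder; the sameAs URLs should point at the same profiles and listings you’re cleaning up off-site so on-site and off-site signals match.

```python
import json

# Placeholder Organization markup – every value below is illustrative.
# Point sameAs at the real profiles and listings you maintain off-site.
org_schema = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "[Brand Name]",
    "url": "https://www.example.com",
    "description": "One clear sentence: what you do, where you operate, and who you serve.",
    "areaServed": "[Service area]",
    "sameAs": [
        "https://www.linkedin.com/company/example",
        "https://www.crunchbase.com/organization/example",
    ],
}

# Embed the output in a <script type="application/ld+json"> tag on the homepage.
print(json.dumps(org_schema, indent=2))
```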
Off-site consistency
- Refresh business listings to ensure your name, descriptions, and categories match how you want the brand represented.
- Encourage detailed customer reviews. Details matter: models weigh specificity, not just star ratings.
- Strengthen editorial coverage on reputable, niche-relevant sites.
Community and social presence
- Participate authentically in platforms like Reddit and industry forums.
- Many models pull from these sources when evaluating brand trust and sentiment.
Retest and compare
After 60–90 days, ask the same baseline questions again (a small comparison sketch follows this list). Look for changes in:
- Description accuracy.
- Tone and sentiment.
- Placement in list-style answers.
- Mention frequency.
- Correct understanding of your services, product lines, or locations.
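If you saved the baseline as a dated JSON file, the comparison can be as simple as loading both snapshots and printing what changed. The file names below are placeholders, and the fields match the earlier assumed structure.

```python
import json

# Rough sketch: diff the "before" and "after" snapshots saved earlier.
def load(path):
    with open(path) as f:
        return json.load(f)

before = load("brand_snapshot_2025-01-15.json")  # placeholder file names
after = load("brand_snapshot_2025-04-15.json")

for b, a in zip(before["entries"], after["entries"]):
    if b["prompt"] != a["prompt"]:
        continue  # only compare like-for-like prompts
    print(f"Prompt: {b['prompt']}")
    print(f"  accuracy:  {b['accurate']} -> {a['accurate']}")
    print(f"  sentiment: {b['sentiment']} -> {a['sentiment']}")
    new_mentions = set(a["competitors_mentioned"]) - set(b["competitors_mentioned"])
    if new_mentions:
        print(f"  newly mentioned competitors: {sorted(new_mentions)}")
```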
Identify what moved the needle
Sometimes, listing cleanup has the biggest impact.
Other times, review detail makes the difference, and in some cases, editorial placements on authoritative sites carry more weight.
This experiment helps you understand which signals matter most so you can build a playbook you can replicate across your brands or locations.
Example
- A regional HVAC company I worked with was consistently described as “mainly serving residential customers” by AI systems, even though 60% of their revenue came from commercial work.
- After updating their Google Business Profile, homepage, and key directory listings with commercial-focused language and case studies, LLMs began accurately describing them as “residential and commercial” within 70 days.
The fundamentals
If this experiment feels familiar, that’s because it should.
Cleaning up business listings, encouraging detailed reviews, and maintaining consistent NAP (name, address, phone) information have been local SEO best practices for years.
What’s evolved is the impact: AI systems now aggregate this information to form “brand stories” that show up when people ask questions about businesses in your category.
The tactics aren’t new. The reach and influence of getting them right have expanded significantly.
This is actually good news. It means you don’t need to learn an entirely new discipline.
You need to apply what you already know, just with renewed attention to consistency and accuracy across all the touchpoints AI systems reference.
Dig deeper: Your brand in the age of generative search: How to show up and be cited
Experiment 3: Test summary formats for machine readability
The more generative systems accelerate, the more they depend on quick, easy-to-parse summaries.
LLMs lean hard on the first 150 words of your content.
If that opening is unclear, fluffy, or buried in narrative, they’ll either skip your page entirely or misinterpret what you’re trying to say.
This experiment helps you test which summary format increases your AI visibility and improves accuracy when AI systems cite you.
The three formats to test
Short bullet summaries: These work well for:
- Definitions.
- Processes.
- Pricing breakdowns.
- Pros and cons.
- Comparisons.
Here’s an example:
Quick summary:
- Cost range: $1,500–$5,000
- Works best for: Small businesses with 10–50 employees
- Timeline: 2–4 weeks for full implementation
- Alternatives: In-house tools, freelance consultants
Tight paragraph summaries: A two-to-three sentence version of the above. Clear, simple, and focused.
Example:
- “[Service] typically costs between $1,500 and $5,000 depending on business size and customization needs. Most small businesses with 10–50 employees see full implementation within 2–4 weeks. Common alternatives include in-house tools and freelance consultants, though these often require more ongoing management.”
Narrative intros: The traditional SEO approach, the “let me tell you a story” opener.
Generative systems often skip this style entirely, which is why it’s worth testing whether removing narrative intros increases AI Overview inclusion.
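One practical way to keep a format test clean is to render every variant from the same set of facts, so only the presentation changes between test pages. A minimal sketch, reusing the placeholder figures from the examples above:

```python
# Render the same facts as a bullet summary and a tight paragraph so test
# variants stay factually identical. All values are the placeholder figures
# from the examples above.
facts = {
    "cost_range": "$1,500–$5,000",
    "best_for": "small businesses with 10–50 employees",
    "timeline": "2–4 weeks for full implementation",
    "alternatives": "in-house tools and freelance consultants",
}

bullet_summary = "\n".join([
    "Quick summary:",
    f"- Cost range: {facts['cost_range']}",
    f"- Works best for: {facts['best_for'].capitalize()}",
    f"- Timeline: {facts['timeline'].capitalize()}",
    f"- Alternatives: {facts['alternatives'].capitalize()}",
])

paragraph_summary = (
    f"[Service] typically costs {facts['cost_range']} depending on business size "
    f"and customization needs. Most {facts['best_for']} see {facts['timeline']}. "
    f"Common alternatives include {facts['alternatives']}."
)

print(bullet_summary, paragraph_summary, sep="\n\n")
```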
Where to test them
- How-to guides
- “Best of” lists
- Service pages
- Pricing pages
- FAQ-rich content
Anywhere clarity matters and AI systems are likely to pull answers.
What to measure
Over 60 days, track:
- AI Overview appearances for pages with each format
- Paraphrasing accuracy: Are LLMs using your summary correctly, or are they mangling it?
- User engagement patterns: Scroll depth, time on page, bounce rate
- Conversions: Do users appreciate clarity as much as machines do?
What success looks like
You’ll discover which summary format gives you:
- Higher inclusion in generative answers
- Better accuracy in how AI tools describe your content
- Stronger engagement from users who prefer clear takeaways
Once you identify the winning format, scale it across your content library.
Example
- An ecommerce client tested bullet-style summaries against traditional narrative intros on 20 product category pages.
- The bullet-format pages appeared in AI Overviews three times more often and had 22% higher click-through rates from organic search.
Turns out humans appreciate clarity too.
Dig deeper: Organizing content for AI search: A 3-level framework
How to operate GEO testing like a mini program
Most marketers find the 60–90 day model works best.
This timeframe keeps experiments small and reversible while still producing meaningful data.
Think of each experiment as a pilot project, a contained bet that delivers learning, not a major strategic shift requiring massive resources.
Here’s the rhythm I like to use.
Weeks 1–2: Baseline
- Document AI Overview presence for target queries.
- Log current LLM answers and entity accuracy.
- Note sentiment and competitor mentions.
- Record current organic metrics (traffic, conversions, engagement).
Weeks 3–6: Execute
- Rebuild the cluster with LLM-friendly structure.
- Clean up entity signals and business listings.
- Implement new summary formats.
- Update schema and internal linking.
Weeks 7–12: Measure
- Compare AI visibility before and after.
- Look for citation, mention, or inclusion changes.
- Evaluate user metrics to validate impact.
- Document what worked and what didn’t.
This model is easy to replicate and provides clarity instead of guesswork.
Each completed experiment either validates that a tactic works for your business (scale it) or shows it doesn’t move the needle (stop investing time there).
What to avoid: Lessons from testing
After running these experiments with multiple clients, I’ve seen a few patterns emerge around what doesn’t work or what creates more problems than it solves.
Don’t manipulate content specifically for AI extraction
Some marketers are experimenting with invisible text or content cloaking targeting AI bots.
Even if these tactics work short term, AI platforms are rapidly developing spam detection systems.
We’ve seen this pattern before with traditional search engines. Early manipulation tactics work until they don’t.
Don’t test multiple changes simultaneously
When you rebuild a topic cluster, update business listings, and change summary formats all at once, you won’t know which change actually drove results.
Test one thing, measure it properly, then move to the next.
Don’t assume AI systems automatically understand your brand
They aggregate whatever information they find across the web.
Your job is ensuring the right information is consistently available and clearly presented across all the sources they reference.
Keep investment proportional to actual impact
AI search is growing, but for most businesses, it still represents a small fraction of total traffic.
Test these tactics, monitor the results, and invest based on what the data shows, not what the hype suggests.
If these experiments drive meaningful business results for your specific situation, scale them.
If they don’t, you’ve learned something valuable without over-investing in an emerging channel.
What these GEO tests actually buy you
The best part about these GEO experiments is that they’re designed as controlled learning opportunities, not traffic commitments.
Even if AI search stays minimal for your business, the improvements you make – clearer content structure, consistent brand information, better-formatted summaries – typically improve traditional search performance too.
That’s the beauty of focusing on fundamentals.
When you build content that’s genuinely clear, well-structured, and helpful, it tends to perform well regardless of how search technology evolves.
What you’re really buying with these experiments isn’t guaranteed AI traffic.
You’re buying answers to questions that matter for your business:
- Do AI systems understand our brand correctly?
- Does structured content improve our visibility across multiple channels?
- Are there quick wins in entity cleanup that compound over time?
- Which summary formats resonate with both machines and humans?
These three tests provide a starting point that’s manageable for most teams while producing actionable insights you can use to make informed decisions.
They’re small enough to be reversible, focused enough to measure clearly, and valuable enough that the learning compounds regardless of how quickly AI search adoption grows.
The goal isn’t to predict the future of AI search.
It’s to position your brand to benefit from it as it grows while ensuring that if it doesn’t grow as fast as predicted, you’ve still made improvements that matter today.


