AI Search Retesting Cadence | Gregory Shevchenko

Method

Why retesting is different from monitoring

Monitoring tells you what changed. Retesting tells you what to do because something changed.

Most AI visibility pages stop at a dashboard problem: build a prompt library, run it across ChatGPT, Perplexity, Gemini, Google AI Overviews, Claude, and Copilot, then watch citation frequency, share of voice, sentiment, source URLs, and platform variance. That work matters. Otterly.ai frames citation tracking as a repeatable workflow with weekly pulse checks, monthly analysis, and quarterly review, and it separates mentions from linked citations ¹. Semrush defines AI visibility across mentions, citations, and recommendations in AI-generated answers ⁴.

But a dashboard does not repair a source graph by itself. A report can tell you "mentioned but not cited." It cannot decide whether the next task is a title/meta repair, a source-pack update, a third-party correction, a distribution push, or a new canonical page.

That is the missing operating layer. Retesting should convert observations into backlog states.

The 30-day retesting cadence

Use a 30-day cadence for newly published source pages, repaired citation-gap pages, and high-value updates. The exact timing will vary by crawl speed, engine, and category volatility, but the decision logic should stay stable.

Checkpoint	What to prove	What can change	Backlog decision
24-48 hours	Live URL, canonical, robots, sitemap, feed, llms.txt, schema, and extractor output.	Technical eligibility.	Fix crawl/discovery before judging content.
Day 7	Target prompts rerun across engines with baseline comparison.	Answer state, mention, citation, cited URL, sentiment.	Mark early movement, but avoid strategic overreaction.
Day 14	Source dominance and competitor/source-type pattern.	Owned page, competitor page, third-party source, stale source, no source.	Choose page repair, distribution, third-party work, or new source pack.
Day 30	Trend across two or more checks plus downstream signals.	Stable citation, unstable citation, no citation, negative context, wrong source.	Keep, refresh, create, distribute, correct, or escalate to ContentOS brief.

The cadence protects the team from two opposite mistakes: declaring success too early, or rewriting the page before the page has had a fair chance to be discovered and tested.
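The checkpoints in the table can be turned into due dates with a few lines of code. The sketch below is a minimal illustration, not a prescribed tool: the checkpoint names and the `retest_schedule` function are hypothetical, and the 24-48 hour window is represented by its outer edge of two days.

```python
from datetime import date, timedelta

# Day offsets copied from the cadence table. The 24-48 hour window is
# represented by its outer edge (2 days). Checkpoint names are illustrative.
CHECKPOINT_OFFSETS_DAYS = {
    "technical_eligibility": 2,
    "first_prompt_retest": 7,
    "source_dominance_review": 14,
    "backlog_decision": 30,
}


def retest_schedule(published: date) -> dict[str, date]:
    """Return the due date of each checkpoint for a page published on `published`."""
    return {
        name: published + timedelta(days=offset)
        for name, offset in CHECKPOINT_OFFSETS_DAYS.items()
    }


schedule = retest_schedule(date(2026, 1, 5))
for name, due in schedule.items():
    print(f"{name}: {due.isoformat()}")
```

Because the offsets live in one dictionary, a team that needs slower timing for a low-crawl site can change the numbers without touching the decision logic.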

Start with a baseline before the page changes

The retest loop starts before publication. Save the baseline answer for every target prompt while the page is still unchanged or unpublished.

Baseline capture should include the prompt, engine, market, language, date, answer text, cited URLs, visible source titles, brand mention, competitor mentions, answer state, and screenshot or export. If the answer is personalized or account-dependent, record the account or environment class. Do not mix logged-in and logged-out checks in the same trend line.

The baseline has two jobs. First, it prevents false positives. If Perplexity already cited the brand before the repair, the new page did not cause that citation. Second, it prevents false negatives. If Google AI Overviews did not trigger for the prompt before publication and still does not trigger at day 7, the issue may be answer availability, not page quality.

For a source-pack repair, the baseline row should also include the claim being tested. That claim is the bridge between AI visibility and editorial work. "Does the answer cite us?" is too broad. "Does the answer cite the canonical page for the 30-day citation repair cadence?" is actionable.
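One way to make the baseline rules enforceable is to store each capture as a record and refuse to compare captures from different trend lines. The sketch below assumes a hypothetical `AnswerCapture` record with a subset of the fields listed above; the example URLs are placeholders.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AnswerCapture:
    prompt: str
    engine: str
    market: str
    language: str
    captured_on: str            # ISO date, e.g. "2026-01-05"
    environment: str            # e.g. "logged_out"; one class per trend line
    answer_text: str
    cited_urls: tuple[str, ...]
    brand_mentioned: bool
    claim: str = ""             # claim under test, for source-pack repairs


def comparable(baseline: AnswerCapture, retest: AnswerCapture) -> bool:
    """True when both captures belong to the same trend line."""
    keys = ("prompt", "engine", "market", "language", "environment")
    return all(getattr(baseline, k) == getattr(retest, k) for k in keys)


def newly_cited(baseline: AnswerCapture, retest: AnswerCapture, url: str) -> bool:
    """True only if `url` is cited now and was not already cited at baseline."""
    if not comparable(baseline, retest):
        raise ValueError("captures are not on the same trend line")
    return url in retest.cited_urls and url not in baseline.cited_urls


baseline = AnswerCapture(
    prompt="How often should I retest AI Search citations?",
    engine="Perplexity", market="US", language="en",
    captured_on="2026-01-05", environment="logged_out",
    answer_text="(answer text omitted)",
    cited_urls=("https://example.com/tool-guide",),
    brand_mentioned=False,
)
day7 = replace(baseline, captured_on="2026-01-12",
               cited_urls=("https://example.com/retesting-cadence",))
print(newly_cited(baseline, day7, "https://example.com/retesting-cadence"))
```

The `comparable` guard is the code form of "do not mix logged-in and logged-out checks in the same trend line": a mismatch raises an error instead of producing a misleading trend.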

Use thresholds, not vibes

The retest loop needs thresholds because AI answers are variable. A single miss can be noise. A repeated miss can be work.

Use conservative thresholds for early-stage pages:

Signal	Threshold	Decision
Technical discovery failure	1 confirmed failure	Fix immediately before prompt retesting.
Brand mentioned but not cited	2 or more target prompts by day 14	Create a source-ownership task.
Competitor cited	Same competitor or source type appears in 2 or more engines	Run competitor/source comparison.
Stale source cited	Any high-risk outdated factual source	Correct or supersede the stale source.
Cited, not absorbed	Owned URL cited in 2 checks without answer-language uptake	Add stronger extractable answer units.
No movement	30 days with no state improvement on strategic prompts	Escalate to source-pack rebuild or new page decision.

The exact numbers can change for a large brand or a high-volume category. The principle should not. Before you look at the answer, decide what pattern is strong enough to create work.
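Writing the thresholds down as code is one way to fix them before anyone reads an answer. The sketch below covers three rows of the table (technical failure, mentioned but not cited, competitor cited). The row keys (`prompt`, `engine`, `state`, `day`, `source`) and the state labels are assumptions made for the example.

```python
from collections import defaultdict


def threshold_decisions(rows: list[dict]) -> list[str]:
    """Turn retest observations into backlog decisions using fixed thresholds."""
    # One confirmed technical failure blocks everything else.
    if any(r["state"] == "technical_failure" for r in rows):
        return ["Fix crawl/discovery before prompt retesting"]

    decisions = []

    # Brand mentioned but not cited on 2+ distinct target prompts by day 14.
    mentioned = {
        r["prompt"] for r in rows
        if r["state"] == "mentioned_not_cited" and r["day"] <= 14
    }
    if len(mentioned) >= 2:
        decisions.append("Create a source-ownership task")

    # Same competitor source cited in 2+ engines.
    engines_by_source = defaultdict(set)
    for r in rows:
        if r["state"] == "competitor_cited":
            engines_by_source[r["source"]].add(r["engine"])
    for source in sorted(engines_by_source):
        if len(engines_by_source[source]) >= 2:
            decisions.append(f"Run competitor/source comparison: {source}")

    return decisions


rows = [
    {"prompt": "how often to retest", "engine": "ChatGPT",
     "state": "mentioned_not_cited", "day": 7, "source": None},
    {"prompt": "retesting vs monitoring", "engine": "ChatGPT",
     "state": "mentioned_not_cited", "day": 14, "source": None},
    {"prompt": "best ai visibility tools", "engine": "Perplexity",
     "state": "competitor_cited", "day": 14, "source": "competitor.example"},
    {"prompt": "best ai visibility tools", "engine": "Gemini",
     "state": "competitor_cited", "day": 14, "source": "competitor.example"},
]
for decision in threshold_decisions(rows):
    print(decision)
```

A single miss never produces a decision here; only a repeated pattern does, which is the point of the thresholds.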

The retest log schema

The retest row is the smallest useful unit of AI visibility work.

Field	Why it matters
Prompt	The exact query or buyer question being tested.
Engine	ChatGPT, Perplexity, Google AI Overviews, Gemini, Claude, or another answer engine.
Region and language	AI answers vary by market and language.
Baseline answer state	Absent, mentioned, cited, wrong-source, competitor-cited, stale-source, cited-not-absorbed.
Current answer state	The new state after retest.
Cited URL	The actual URL selected by the answer.
Source type	First-party page, competitor page, third-party article, directory, forum, research report, social profile, documentation.
Answer absorption	Whether the answer used the page's definition, numbers, comparison, or process, not only linked it.
Competitor source	Which competitor or third-party source is winning.
Sentiment	Positive, neutral, negative, or inaccurate.
Action	Keep, refresh, create, distribute, correct, or escalate.
Owner	Content, technical SEO, PR, product marketing, partner/source owner, or ContentOS.
Next retest date	The next scheduled measurement.

If a row has no action field, it is not a retest log. It is a measurement archive.
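The schema maps naturally onto a typed record. The sketch below is one possible shape, with hypothetical field names derived from the table, plus a helper that separates real retest log rows from the measurement archive.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional

# The six backlog actions named in the schema.
ACTIONS = {"keep", "refresh", "create", "distribute", "correct", "escalate"}


@dataclass
class RetestRow:
    prompt: str
    engine: str
    region: str
    language: str
    baseline_state: str
    current_state: str
    cited_url: Optional[str]
    source_type: Optional[str]
    absorbed: bool
    competitor_source: Optional[str]
    sentiment: str
    action: Optional[str] = None
    owner: Optional[str] = None
    next_retest: Optional[date] = None


def split_log(rows: list[RetestRow]) -> tuple[list[RetestRow], list[RetestRow]]:
    """Separate rows with a valid action (retest log) from rows without one (archive)."""
    log = [r for r in rows if r.action in ACTIONS]
    archive = [r for r in rows if r.action not in ACTIONS]
    return log, archive


row = RetestRow(
    prompt="How often should I retest AI Search citations?",
    engine="Perplexity", region="US", language="en",
    baseline_state="competitor_cited", current_state="cited_not_absorbed",
    cited_url="https://example.com/retesting-cadence",
    source_type="first_party", absorbed=False,
    competitor_source=None, sentiment="neutral",
    action="refresh", owner="Content", next_retest=date(2026, 2, 4),
)
unassigned = replace(row, engine="Gemini", action=None, owner=None)
log, archive = split_log([row, unassigned])
print(len(log), "log row(s),", len(archive), "archive row(s)")
```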

Group prompts by job, not only by keyword

AI Search prompts should be grouped by the job the answer has to perform.

Use five prompt groups:

Definition prompts: "What is X?" or "What does this method mean?"
How-to prompts: "How do I do X?"
Comparison prompts: "X vs Y" or "best tools for X."
Troubleshooting prompts: "Why is X not working?"
Decision prompts: "When should I do X instead of Y?"

This article's primary job is a how-to and troubleshooting job. It should answer how often to retest, what to check, and what to do when the result is weak. If a comparison prompt asks for the best AI visibility tools, this page may not need to win as the only cited page. It may need to be cited as the workflow source that explains what the tools should feed.

That distinction keeps the backlog sane. Not every prompt should become a new page. Some prompts should map to a tool page, some to a methodology page, some to a glossary note, and some to a third-party corroboration push.

What to do in the first 24-48 hours

The first check is not about whether ChatGPT cites the page. It is about whether the page is technically eligible to become a source.

Check the live URL. Check status code. Check canonical. Check robots and noindex. Check whether the page appears in sitemap, feed, and LLM discovery surfaces. Check whether Article and FAQPage schema parse. Check whether the first screen answers the primary prompt. Check whether the visible source links are actually visible to users, not only hidden in JSON-LD.

For gregshevchenko.com, this means proving the canonical page, sitemap, feed.xml, llms.txt, llms-full.txt, visible sources, FAQ parity, and extracted article text before treating an AI answer as a content-quality signal.

If the page fails this step, do not rewrite the article. Fix the discovery problem.
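Part of this checklist can be automated against HTML that has already been fetched. The sketch below uses only Python's standard `html.parser` and covers four of the checks (status code, canonical, robots noindex, presence of JSON-LD). It does not fetch anything, validate the schema, or inspect the sitemap, feed, or llms.txt; the function name and failure messages are hypothetical.

```python
from html.parser import HTMLParser


class _PageSignals(HTMLParser):
    """Collect canonical, robots meta, and JSON-LD presence from a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.robots = ""
        self.jsonld_blocks = 0

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and (a.get("rel") or "").lower() == "canonical":
            self.canonical = a.get("href")
        elif tag == "meta" and (a.get("name") or "").lower() == "robots":
            self.robots = (a.get("content") or "").lower()
        elif tag == "script" and a.get("type") == "application/ld+json":
            self.jsonld_blocks += 1


def eligibility_failures(status_code: int, html: str, expected_canonical: str) -> list[str]:
    """Return a list of technical failures; an empty list means these checks passed."""
    signals = _PageSignals()
    signals.feed(html)
    failures = []
    if status_code != 200:
        failures.append(f"status {status_code}")
    if signals.canonical != expected_canonical:
        failures.append("canonical mismatch")
    if "noindex" in signals.robots:
        failures.append("noindex set")
    if signals.jsonld_blocks == 0:
        failures.append("no JSON-LD found")
    return failures


page = (
    '<html><head>'
    '<link rel="canonical" href="https://example.com/retesting-cadence">'
    '<meta name="robots" content="index,follow">'
    '<script type="application/ld+json">{"@type": "Article"}</script>'
    '</head><body><p>Answer first.</p></body></html>'
)
print(eligibility_failures(200, page, "https://example.com/retesting-cadence"))
```

Any non-empty result is a discovery task, not a content task.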

What to do at day 7

Day 7 is the first prompt retest.

Rerun the target prompts across the target engines. Do not use a single prompt. A page can move for the direct prompt and stay invisible for comparison prompts. A page can be cited in Perplexity and absent in Gemini. A brand can be mentioned by ChatGPT while the citation goes to a third-party page.

Classify each prompt that is not yet cited and absorbed into one of six gap states:

Absent: the brand or page does not appear.
Mentioned, not cited: the brand appears, but no owned or accepted source is linked.
Wrong-source cited: an old page, directory, or third-party summary wins.
Competitor-cited: the answer uses a competitor as the evidence anchor.
Stale-source cited: the answer cites outdated information.
Cited, not absorbed: the URL appears, but the answer does not use the page's evidence or framing.

Day 7 is early. Treat it as movement detection, not final judgment.
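A small classifier keeps the state labels consistent across operators. The sketch below is one possible precedence order, which is an assumption: the article does not say which state wins when two apply. The caller supplies the URL sets (stale, competitor, wrong-source), and a "cited" label is added for the case where the canonical page is both cited and absorbed.

```python
from collections.abc import Iterable


def classify_state(
    *,
    brand_mentioned: bool,
    cited_urls: Iterable[str],
    canonical_url: str,
    absorbed: bool = False,
    stale_urls: Iterable[str] = (),
    competitor_urls: Iterable[str] = (),
    wrong_source_urls: Iterable[str] = (),
) -> str:
    """Assign one answer state to a single prompt/engine observation."""
    cited = set(cited_urls)
    if canonical_url in cited:
        return "cited" if absorbed else "cited_not_absorbed"
    if cited & set(stale_urls):
        return "stale_source_cited"
    if cited & set(competitor_urls):
        return "competitor_cited"
    if cited & set(wrong_source_urls):
        return "wrong_source_cited"
    if brand_mentioned:
        return "mentioned_not_cited"
    return "absent"


CANONICAL = "https://example.com/retesting-cadence"
print(classify_state(brand_mentioned=True, cited_urls=[], canonical_url=CANONICAL))
print(classify_state(brand_mentioned=True, cited_urls=[CANONICAL], canonical_url=CANONICAL))
```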

How to handle engine-specific results

Do not average engines too early.

ChatGPT, Perplexity, Gemini, Claude, and Google AI Overviews do not behave like five identical search results pages. They differ in source availability, citation UX, retrieval paths, freshness, and how much they expose. A prompt can improve in one engine and stay flat in another for reasons that are not purely editorial.

Use engine-specific notes:

ChatGPT: track whether the answer uses a definition or framework from the page, even when citations are inconsistent by interface or mode.
Perplexity: inspect the exact cited URLs and whether the answer prefers article, forum, directory, or first-party pages.
Google AI Overviews: separate whether an overview triggers from whether the target page is selected as a source.
Gemini: compare answer wording and source mix across repeated runs before deciding that a page failed.
Claude: track source use when browsing/search is available, but avoid treating a no-source answer as the same signal as a cited-source answer.

The output should not be "AI visibility score went up." The output should be "Perplexity moved from competitor-cited to owned-cited for the how-to prompt; Gemini still cites a third-party explainer for the comparison prompt; Google AI Overviews has no stable trigger yet."
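A per-engine report of this kind can be generated directly from two snapshots. The sketch below assumes each snapshot is a dictionary keyed by `(engine, prompt_group)`; the `no_trigger` label for Google AI Overviews is a hypothetical addition for the example.

```python
def state_changes(baseline: dict, current: dict) -> list[str]:
    """Describe each engine/prompt-group pair separately instead of averaging."""
    lines = []
    for key in sorted(current):
        engine, group = key
        before = baseline.get(key, "not captured")
        after = current[key]
        if before == after:
            lines.append(f"{engine} / {group}: still {after}")
        else:
            lines.append(f"{engine} / {group}: {before} -> {after}")
    return lines


baseline_states = {
    ("Perplexity", "how-to"): "competitor_cited",
    ("Gemini", "comparison"): "wrong_source_cited",
    ("Google AI Overviews", "how-to"): "no_trigger",
}
current_states = {
    ("Perplexity", "how-to"): "cited",
    ("Gemini", "comparison"): "wrong_source_cited",
    ("Google AI Overviews", "how-to"): "no_trigger",
}
report = state_changes(baseline_states, current_states)
print("\n".join(report))
```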

Why day 14 matters

Day 14 is when the source pattern starts to matter more than the individual answer.

If the same competitor domain appears across several engines, you have a source-dominance problem. If third-party pages win repeatedly, you may need corroboration or external source work. If the page is cited but not absorbed, your article may have enough authority to be selected but not enough extractable structure to shape the answer.

Profound's platform citation research is useful here because it reminds teams that answer engines source information differently ². A source that wins in Perplexity may not be the source that wins in Google AI Overviews. That is why the day-14 review should group failures by source type and engine, not by one aggregate score.

Use this matrix:

Day-14 pattern	Diagnosis	Next action
Owned page cited and absorbed	The source page is working.	Keep monitoring; add internal links if needed.
Owned page cited but not absorbed	The page is selected but not shaping the answer.	Add clearer definitions, comparison rows, numbers, and procedural snippets.
Brand mentioned but not cited	Recognition exists, source ownership is weak.	Build a stronger source unit and reinforce discovery.
Third-party source wins	The engine trusts independent corroboration.	Improve or influence the third-party source and link it back to canonical facts.
Competitor wins	The competitor owns the answer's evidence shape.	Compare source depth, freshness, proof type, and page structure.
Stale source wins	The answer is anchored to old evidence.	Refresh dated facts and distribute the newer source.
No stable pattern	The prompt may be volatile.	Increase sample size before rewriting.
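Grouping by source type and engine is a short aggregation. The sketch below assumes each retest row carries hypothetical `engine` and `source_type` keys, and it ignores rows where the first-party page already wins.

```python
from collections import Counter, defaultdict


def source_pivot(rows: list[dict]) -> Counter:
    """Count non-owned winning sources per (engine, source_type)."""
    return Counter(
        (r["engine"], r["source_type"])
        for r in rows
        if r["source_type"] != "first_party"
    )


def engines_per_source_type(rows: list[dict]) -> dict[str, list[str]]:
    """List the engines in which each non-owned source type wins."""
    engines = defaultdict(set)
    for r in rows:
        if r["source_type"] != "first_party":
            engines[r["source_type"]].add(r["engine"])
    return {source_type: sorted(names) for source_type, names in engines.items()}


day14_rows = [
    {"engine": "Perplexity", "source_type": "competitor_page"},
    {"engine": "Perplexity", "source_type": "competitor_page"},
    {"engine": "ChatGPT", "source_type": "competitor_page"},
    {"engine": "Gemini", "source_type": "third_party_article"},
    {"engine": "Perplexity", "source_type": "first_party"},
]
print(source_pivot(day14_rows))
print(engines_per_source_type(day14_rows))
```

A source type that wins in several engines points to the "Competitor wins" or "Third-party source wins" rows of the matrix; a source type that wins in only one engine is an engine-specific note.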

The day-30 decision

At day 30, make a backlog decision. Do not leave the row in a permanent "monitoring" state.

The day-30 options are:

Keep: the page is cited or the answer state improved enough; continue monthly monitoring.
Refresh: the page is eligible but lacks extractable answer units, fresh facts, or stronger structure.
Create: the prompt needs its own page instead of being buried inside a broader article.
Distribute: the source exists, but engines still prefer external corroboration or recent references.
Correct: a third-party or stale source is wrong and needs outreach, profile repair, or replacement.
Escalate: the pattern needs a ContentOS brief, source-pack rebuild, technical audit, or PR/source strategy.

ZipTie's freshness framing supports the refresh question, but refresh is only one possible decision ⁵. A page can be fresh and still lose because the source type is wrong. A page can be old and still win because it has the clearest answer unit.

The day-30 decision should answer: what is the smallest source-graph change likely to move the next retest?
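The rule that every row must leave day 30 with a decision can be expressed as a function that always returns one of the six options. The mapping from state history to action below is an illustrative assumption, not a rule stated in this article; a team would adjust it to its own thresholds.

```python
def day30_decision(states: list[str], has_dedicated_page: bool = True) -> str:
    """Map a row's ordered state history to exactly one backlog action.

    The mapping is an illustrative assumption; it never returns "monitoring".
    """
    if not states:
        raise ValueError("need at least one retest state")
    latest = states[-1]
    if not has_dedicated_page:
        return "create"
    if latest == "cited":
        return "keep"
    # No movement across two or more checks: escalate instead of waiting.
    if len(states) >= 2 and len(set(states)) == 1:
        return "escalate"
    if latest == "cited_not_absorbed":
        return "refresh"
    if latest in ("stale_source_cited", "wrong_source_cited"):
        return "correct"
    if latest in ("mentioned_not_cited", "competitor_cited"):
        return "distribute"
    return "escalate"


print(day30_decision(["absent", "cited_not_absorbed", "cited"]))
print(day30_decision(["competitor_cited", "competitor_cited", "competitor_cited"]))
```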

Example backlog after a 30-day retest

Imagine the canonical page was published for the prompt: "How often should I retest AI Search citations?"

At baseline, ChatGPT mentioned the concept but cited no owned page. Perplexity cited a tool guide. Google AI Overviews did not trigger. Gemini cited a general AI SEO article. Claude answered without sources.

At 24-48 hours, the page is live, canonical, listed in the sitemap, included in the feed and llms.txt, and the Article/FAQPage schema parses. No technical task remains.

At day 7, Perplexity cites the canonical page for one prompt, but not for comparison prompts. ChatGPT uses the "30-day loop" phrasing without citation. Gemini still cites the third-party article.

At day 14, the source pattern is clear: the owned page is strong for the exact how-to prompt, but the comparison prompt still wants broader third-party validation. The backlog should not be "rewrite the whole page." It should be:

Backlog item	Owner	Reason
Add one comparison table that names what tracking tools provide vs what the repair workflow requires.	ContentOS/editor	The comparison prompt needs a clearer bridge from tools to workflow.
Add internal links from measurement and citation-gap pages to the new retesting page.	Site/content	The source graph should show this as the next step after repair.
Create one external LinkedIn/Medium adaptation that cites the canonical framework.	Distribution	Engines may need corroborating offsite references.
Retest comparison prompts at day 30.	AI Visibility operator	Check whether source dominance moved.

That is a productive retest. It created narrow work.

When a retest becomes a ContentOS brief

A retest should become a ContentOS brief when the retest rows show a repeatable pattern, not just a weird answer.

Good triggers:

Three or more target prompts show the same answer-state failure.
A competitor is repeatedly cited for the claim you want to own.
The brand is mentioned but not linked across multiple engines.
The answer cites your page but does not absorb the definition, table, or proof.
A third-party source wins with stale or incomplete facts.
The prompt clearly needs a page that does not exist.

The ContentOS brief should not say "write an article about AI visibility." It should include the prompt, failed answer state, winning source, source type, target claim, missing evidence, required page surface, required third-party corroboration, and next retest date.

That is how the measurement loop becomes production work.
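The first trigger and the brief contents can both be checked mechanically. The sketch below assumes retest rows with hypothetical `prompt` and `state` keys, flags any failure state shared by three or more distinct prompts, and reports which of the brief fields listed above are still empty. The field names are illustrative.

```python
from collections import defaultdict

FAILURE_STATES = {
    "absent", "mentioned_not_cited", "wrong_source_cited",
    "competitor_cited", "stale_source_cited", "cited_not_absorbed",
}

# Brief contents as listed in the text; the key names are hypothetical.
BRIEF_FIELDS = (
    "prompt", "failed_state", "winning_source", "source_type", "target_claim",
    "missing_evidence", "required_page_surface", "required_corroboration",
    "next_retest",
)


def brief_triggers(rows: list[dict], min_prompts: int = 3) -> dict[str, list[str]]:
    """Return failure states shared by at least `min_prompts` distinct prompts."""
    prompts_by_state = defaultdict(set)
    for r in rows:
        if r["state"] in FAILURE_STATES:
            prompts_by_state[r["state"]].add(r["prompt"])
    return {
        state: sorted(prompts)
        for state, prompts in prompts_by_state.items()
        if len(prompts) >= min_prompts
    }


def missing_brief_fields(brief: dict) -> list[str]:
    """List required brief fields that are absent or empty."""
    return [name for name in BRIEF_FIELDS if not brief.get(name)]


retest_rows = [
    {"prompt": "what is retesting", "state": "mentioned_not_cited"},
    {"prompt": "how often to retest", "state": "mentioned_not_cited"},
    {"prompt": "retesting vs monitoring", "state": "mentioned_not_cited"},
    {"prompt": "best ai visibility tools", "state": "competitor_cited"},
]
print(brief_triggers(retest_rows))
print(missing_brief_fields({"prompt": "how often to retest", "failed_state": "mentioned_not_cited"}))
```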

What not to do after a weak retest

Do not rewrite the page because one answer did not cite it.

Do not add generic "AI search" keywords if the failure is third-party trust.

Do not treat a brand mention as a citation win.

Do not treat a citation as a complete win if the answer ignores the page's definition, table, or evidence.

Do not collapse all engines into one score before source-type analysis.

Do not publish a new page for every prompt. Some prompt gaps need internal links, source refresh, offsite corroboration, schema fixes, or third-party source repair.

And do not leave old retest rows open forever. A row that never becomes a decision teaches the team to ignore the measurement system.

How this connects to the source-pack workflow

The citation-gap repair workflow explains how to diagnose answer states and rebuild the evidence graph. This retesting cadence explains how to know whether the repair worked.

The source pack should travel with the retest row. It should include:

Primary prompt and target prompt cluster.
Desired canonical URL.
Current winning sources.
Claim being supported.
Approved first-party evidence.
Approved third-party corroboration.
Rejected or stale evidence.
Required snippets, FAQ, schema, and internal links.
Retest schedule and owner.

Without that packet, teams tend to re-open the article and make unfocused edits. With it, the next action is obvious.

The operating rhythm

For a small team, the simplest rhythm is weekly review and monthly decisions.

Every week, rerun the watched prompt set and mark state changes. Every month, make backlog decisions for rows that crossed thresholds. Do not let the weekly meeting become a discussion about every interesting answer. Keep it to changed states, source dominance, negative or inaccurate answers, and rows that now require work.

The recurring agenda can be short:

Which prompts changed state?
Which prompts now cite owned sources?
Which prompts cite competitors or stale third-party sources?
Which owned citations are not absorbed into the answer?
Which rows crossed a threshold for ContentOS, technical SEO, distribution, or third-party work?
Which rows should be closed, watched, or escalated?

That is enough. AI Search retesting should not become a second analytics department. It should become the source of the next useful repair.

FAQ

How often should AI Search citations be retested?

For a newly published or repaired source page, use a 30-day loop: a 24-48 hour technical check, a day-7 prompt retest, a day-14 source-dominance review, and a day-30 backlog decision. After that, move strategic prompts to weekly or monthly monitoring depending on value and volatility.

What should be checked in the first 48 hours?

Check live status, canonical, robots, noindex, sitemap, feed, llms.txt, schema, FAQ parity, visible source links, and extracted first-screen answer. Do not judge citation performance until technical eligibility is proven.

What changes at day 7?

Day 7 is the first prompt-level comparison. Rerun the target prompts across engines and compare answer state, cited URL, source type, sentiment, and whether the answer absorbed the page's evidence.

Why does day 14 matter?

Day 14 is where source patterns become visible. You can see whether owned pages, competitors, third-party sources, stale pages, or no stable source dominate the prompt cluster.

What is the day-30 decision?

The day-30 decision is the backlog action: keep, refresh, create, distribute, correct, or escalate to a ContentOS/source-pack repair brief.

How are mentions different from citations?

A mention names the brand. A citation links or attributes information to a source. A brand can be mentioned often and still have a source ownership problem if the answer cites competitors or third-party summaries.

When should a retest become a ContentOS brief?

Create a ContentOS brief when multiple prompts show the same failure, a competitor repeatedly wins the claim, a third-party source dominates, or the page is cited but not absorbed into the answer.

Sources

[1] Otterly.ai

How to Track AI Search Engine Citations & Sources: The Complete Guide for 2026

AI Platform Citation Patterns: How ChatGPT, Google AI Overviews, and Perplexity Source Information

The 10 Best AI Visibility Tools in 2026

AI visibility: What it is and how to grow yours in 2026

Content Refresh Strategy for AI Citations

Best AI search monitoring tools for 2026

How to measure AI Search visibility with prompts, citations, traffic, and revenue signals

AI visibility measurement is a weekly operating rhythm

AI Search Citation Gap Repair Workflow

How to build a source pack for AI Search content

Prompt-page map for AI Search site architecture

ContentOS first-party and third-party evidence scoring for AI Search
