A useful audit of your current AI search visibility on Shopify works across six zones: crawler access, structured data, on-page content, measurement surfaces, off-site presence, and platform infrastructure. The 50 checks below are ordered so that findings in the earlier zones explain most of the problems you would otherwise spend time debugging in the later ones. The goal is not to tick boxes; it is to produce a prioritised fix list a Shopify operator can act on within the same quarter.
Short answer
Work the zones in order. Check crawler access for Googlebot, OAI-SearchBot, ChatGPT-User, GPTBot, PerplexityBot, Perplexity-User, Claude-SearchBot, Claude-User, ClaudeBot, and Google-Extended. Validate server-rendered Product, Article, FAQPage, and BreadcrumbList schema against visible content. Score a fixed prompt set across five AI engines. Review the GA4 AI Search channel, Search Console coverage, server logs, and Merchant Center health. Cross-check off-site coverage and brand-entity consistency. The audit tells you where the gaps are; the fix list tells you where to start.
What you need to know
- Audit outcomes, not audit activity. Every check should map to a decision. If a check cannot change anything, it does not belong in the list.
- Crawler access is usually the bottleneck. A silent AI-bot block has outsized consequences because nothing downstream can compensate for missing retrieval.
- Schema parity matters more than schema presence. JSON-LD that disagrees with visible content is worse than no JSON-LD at all.
- Measurement is its own zone. If you do not measure, every later decision is a guess.
- Off-site and infrastructure matter. The first audit usually finds wins here that on-site teams miss because they were not looking.
- Prioritise severity and reversibility. High-severity, reversible issues (a robots.txt block, broken schema) outrank low-severity structural changes.
Zone one: crawler access (10 checks)
Access governs everything that follows. Shopify's robots.txt is editable through the robots.txt.liquid template per Shopify's robots.txt customisation documentation, and each AI provider documents the user agents relevant here. A quick way to verify the live rules is sketched after the list below.
- Googlebot is not blocked by robots.txt, theme code, or a staging-era noindex that was never removed.
- OAI-SearchBot is explicitly allowed per OpenAI's bot documentation.
- ChatGPT-User is allowed for live user-triggered fetches during ChatGPT sessions.
- GPTBot allow or disallow is set deliberately based on the brand's stance on training use, not accidentally.
- PerplexityBot is allowed so the index pipe can reach your pages per Perplexity's bots documentation.
- Perplexity-User is allowed so live fetches during user sessions succeed.
- ClaudeBot, Claude-SearchBot, and Claude-User are each handled with an explicit rule per Anthropic's public crawler documentation.
- Google-Extended is handled as a separate decision from Googlebot, with the reasoning documented.
- Any AI-bot blocking app or theme customisation is audited so the current robots.txt matches the intended policy, not a forgotten default.
- Server logs confirm recent crawl activity from the allowed bots over the past 30 days (zero volume is a signal, not an absence of data).
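A minimal sketch of the access check, using Python's standard-library robots.txt parser. The storefront domain and product path are hypothetical placeholders; point them at your own store. A parser-level "allowed" result only confirms the published rules, so pair it with the log check above to verify the bots actually crawl.

```python
# A minimal sketch: test the live robots.txt rules against each AI user
# agent in the checklist. Domain and product path below are hypothetical.
from urllib import robotparser

STORE = "https://example-store.com"            # replace with your storefront
TEST_URL = f"{STORE}/products/sample-product"  # replace with a real product URL

AI_BOTS = [
    "Googlebot", "OAI-SearchBot", "ChatGPT-User", "GPTBot",
    "PerplexityBot", "Perplexity-User", "Claude-SearchBot",
    "Claude-User", "ClaudeBot", "Google-Extended",
]

parser = robotparser.RobotFileParser(f"{STORE}/robots.txt")
parser.read()  # fetches and parses the live robots.txt

for bot in AI_BOTS:
    verdict = "allowed" if parser.can_fetch(bot, TEST_URL) else "BLOCKED"
    print(f"{bot:<18} {verdict}")
```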
Zone two: structured data and schema parity (10 checks)
AI engines cross-check structured data against visible content. The field-level requirements and recommendations are documented by Google in the Product structured data reference, which serves as the baseline. A parity-check sketch follows the list below.
- Product JSON-LD is present and server-rendered on every product page.
- Product price, availability, and currency in JSON-LD match the visible page values exactly.
- Product brand, name, and identifier (GTIN or MPN where applicable) are populated.
- Review and AggregateRating schema are present only where real review data exists.
- Article schema is in place on blog and editorial content.
- FAQPage schema is present where there is a visible FAQ section, and the rendered questions and answers match the schema exactly.
- BreadcrumbList schema is present and matches the site's visible navigation hierarchy.
- Organization schema is present on the home page with logo, social profiles, and contact details where appropriate.
- No schema is injected only by JavaScript that runs after the initial HTML response (so Googlebot and retrieval crawlers see the schema without executing the page).
- Rich Results Test and Search Console's URL Inspection confirm schema is valid and indexable on a representative sample of pages.
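As a sketch of the parity check, the script below pulls server-rendered Product JSON-LD from the initial HTML response (no JavaScript executed, which mirrors what retrieval crawlers see) and prints the fields you then compare against the visible page. It assumes the third-party requests and beautifulsoup4 packages; the URL is a hypothetical placeholder.

```python
# A minimal sketch: extract server-rendered Product JSON-LD so price,
# currency, and availability can be compared against the visible page.
import json

import requests
from bs4 import BeautifulSoup

url = "https://example-store.com/products/sample-product"  # hypothetical
html = requests.get(url, timeout=10).text  # initial HTML only, no JS executed
soup = BeautifulSoup(html, "html.parser")

for tag in soup.find_all("script", type="application/ld+json"):
    try:
        data = json.loads(tag.string or "")
    except json.JSONDecodeError:
        continue  # a malformed block is itself an audit finding
    for item in data if isinstance(data, list) else [data]:
        if item.get("@type") == "Product":
            offers = item.get("offers") or {}
            if isinstance(offers, list):
                offers = offers[0] if offers else {}
            print("name:        ", item.get("name"))
            print("brand:       ", item.get("brand"))
            print("price:       ", offers.get("price"), offers.get("priceCurrency"))
            print("availability:", offers.get("availability"))
```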
Zone three: on-page content and information architecture (10 checks)
The content layer is what AI engines quote. Clarity here compounds with schema and access to drive citation outcomes. This zone also benefits from alignment with Google's helpful, reliable, people-first content guidance.
- Product pages open with an answer-first paragraph of two to three sentences that states what the product is, who it is for, and a primary specification.
- Product descriptions include real specifications, materials, dimensions, or compatibility facts, not only brand narrative.
- Collection pages have descriptive intro copy that explains what the collection contains and who it is for.
- Blog and editorial pages answer a specific operator or customer question in the opening, with H2s phrased as real questions.
- FAQ sections on key pages reflect real support tickets and customer questions rather than invented queries.
- Page titles and meta descriptions are specific, match intent, and are not auto-generated.
- Headings form a clean hierarchy (H1, H2, H3) without skipped levels or multiple H1s on a page (a quick outline check is sketched after this list).
- Internal links use descriptive anchor text that names the destination topic.
- Images use meaningful alt text where relevant to comprehension, not just keywords.
- Outdated content (discontinued products, old pricing, archived offers) is either updated or canonicalised so it does not appear in the retrieval set.
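For the heading check above, a short sketch that prints a page's outline makes skipped levels and duplicate H1s easy to spot. Same assumptions as the schema sketch: requests and beautifulsoup4 installed, hypothetical URL.

```python
# A minimal sketch: print a page's heading outline so skipped levels and
# duplicate H1s stand out. URL below is a hypothetical placeholder.
import requests
from bs4 import BeautifulSoup

url = "https://example-store.com/products/sample-product"  # hypothetical
soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")

headings = soup.find_all(["h1", "h2", "h3", "h4"])
h1_count = sum(1 for h in headings if h.name == "h1")
if h1_count != 1:
    print(f"warning: {h1_count} H1 tags on this page")

for h in headings:
    level = int(h.name[1])  # "h2" -> 2
    print("  " * (level - 1) + f"{h.name.upper()}: {h.get_text(strip=True)[:60]}")
```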
Zone four: measurement and reporting (10 checks)
Without measurement, every fix is a belief rather than a tested decision. GA4 and Search Console provide the baseline; a prompt-set routine adds the citation-level view.
- GA4 has a custom channel group with an AI Search channel defined by known assistant referrer hosts per Google's custom channel group documentation.
- The AI Search channel sits above Referral and Organic Search in the rules order so sessions route correctly.
- Google Search Console is connected, verified, and reporting data for all relevant domain and subdomain properties.
- A fixed prompt set of 20 to 40 queries exists and is run monthly across ChatGPT, Perplexity, Claude, Gemini, and Google AI Mode.
- Scoring captures presence, position, accuracy, and competing brands for each engine and query.
- A competitor set of three to five brands is scored on the same prompt set so share-of-citation is tracked.
- Server or CDN log reviews are performed at least monthly to verify AI crawler activity and cross-check GA4 referrer coverage (a parsing sketch follows this list).
- A dashboard (Sheets, Looker Studio, or equivalent) exists and is reviewed weekly for traffic and monthly for citation data.
- Uncertainty is named in reporting (the limits of direct attribution, non-deterministic AI answers, gaps in prompt-set coverage).
- Decisions driven by the dashboard are logged so quarterly reviews can audit whether the dashboard is producing action.
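A minimal sketch of the monthly log review, assuming a plain-text access log exported from your server or CDN (the path is a hypothetical placeholder). It counts requests per AI crawler by user-agent substring, which is enough to back the "zero volume is a signal" check from zone one; production log pipelines vary, so treat it as a starting point.

```python
# A minimal sketch: count requests per AI crawler in an exported access log.
# Log path is hypothetical; matching is by user-agent substring.
from collections import Counter

LOG_PATH = "access.log"  # replace with your exported server or CDN log

AI_BOTS = [
    "OAI-SearchBot", "ChatGPT-User", "GPTBot", "PerplexityBot",
    "Perplexity-User", "Claude-SearchBot", "Claude-User", "ClaudeBot",
    "Googlebot",
]

hits = Counter()
with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
    for line in log:
        for bot in AI_BOTS:
            if bot in line:
                hits[bot] += 1
                break  # count each request once, under the first match

for bot in AI_BOTS:
    print(f"{bot:<18} {hits[bot]:>7} requests")
```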
Zone five: off-site and brand-entity signals (5 checks)
Off-site coverage is what turns a ranking brand into a recommended one. It is often under-invested in by Shopify teams that focus entirely on the storefront.
- There is at least a small set of credible independent reviews of the brand on sites the AI engines retrieve from (specialist publications, reputable review sites, major forums).
- The brand appears in at least one editorial round-up or comparison piece alongside recognisable competitors.
- Wikipedia, Wikidata, or a relevant knowledge base entry exists for the brand where merit supports it; if not, the Organization schema and About page are consistent with the brand's public positioning.
- The brand's social profiles, contact details, and naming are consistent across the storefront, structured data, and external listings.
- There is no obvious paid-masquerading-as-editorial content that an AI engine could penalise retroactively.
Zone six: platform and infrastructure (5 checks)
Shopify-specific infrastructure checks close out the audit. These are the issues that are easy to miss without a platform lens.
- Shopify's Google & YouTube sales channel is connected, products are syncing, and Merchant Center is approving the feed per Shopify's Google channel documentation.
- Core Web Vitals meet passing thresholds on the theme and templates that drive the most traffic.
- Multi-locale routing is clean: hreflang tags are correct, canonicals point to the intended market version, and duplicate content is not spreading.
- Sitemaps are submitted, reachable, and complete; Shopify's default sitemap is supplemented where necessary for journal and long-form content (a reachability sketch follows this list).
- Expansion stores (where present) are audited with the same checklist, because AI engines treat each expansion store as a distinct entity.
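A minimal sketch of the sitemap reachability check. It assumes the default Shopify sitemap index at /sitemap.xml (standard on Shopify stores) and the requests package; the domain is a hypothetical placeholder.

```python
# A minimal sketch: confirm the sitemap index is reachable and that each
# child sitemap it lists responds. Domain below is hypothetical.
import xml.etree.ElementTree as ET

import requests

STORE = "https://example-store.com"  # replace with your storefront

resp = requests.get(f"{STORE}/sitemap.xml", timeout=10)
resp.raise_for_status()

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
root = ET.fromstring(resp.content)
for loc in root.findall("sm:sitemap/sm:loc", NS):
    status = requests.head(loc.text, timeout=10).status_code
    print(status, loc.text)
```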
Where does the checklist fall short?
It is worth being direct about what a 50-point checklist cannot do.
It does not rank fixes automatically. Two brands with the same findings may need very different priority orders based on their traffic mix, product range, and measurement maturity. The checklist finds problems; prioritisation is still a judgement call.
It does not guarantee AI citations. Passing every point tilts the odds toward inclusion, but AI answers remain non-deterministic and competitive. The audit maps the terrain; it does not control the weather.
It is a snapshot, not a live state. AI providers update crawler policy, schema expectations shift, and Shopify defaults change quarterly. The checklist needs maintenance.
It under-serves niche categories. Highly regulated categories (health, finance) or very narrow product segments sometimes need additional checks specific to their domain, which a generic checklist cannot cover.
It cannot replace a specialist read where the stakes are high. For brands at Plus scale or with significant revenue at stake, a specialist audit complements the self-audit by pressure-testing interpretations and catching pattern-level issues that a checklist alone misses.
Frequently asked questions
How long does a proper AI search visibility audit take?
For a mid-sized Shopify brand, the initial audit typically takes two to four working days spread across a week. Crawler access and schema checks can be done inside a day; prompt-set scoring adds another day because it must be done manually across five or six engines; off-site and infrastructure checks round out the rest. After the first audit, quarterly re-runs usually take half that time because the checklist and the baseline data are already in place.
Can I run this audit myself, or do I need an agency?
Most of the 50 checks can be completed by someone with a solid understanding of Shopify, Google Search Console, GA4, and basic technical SEO. The steps that usually benefit from specialist help are server log analysis, cross-engine citation interpretation, and the schema-to-content consistency review. A growth manager or in-house SEO can complete an honest self-audit; a specialist tends to produce a sharper prioritisation of fixes.
What should I do first if the audit reveals many problems?
Start with crawler access and schema parity before anything else. If Googlebot, OAI-SearchBot, Claude-SearchBot, and Perplexity's crawlers cannot reach the pages cleanly, no amount of content work will help. Once access is verified, fix schema mismatches where JSON-LD disagrees with visible content. Those two sets of fixes tend to unblock more downstream gains than any single content project.
Does the checklist apply to Shopify Basic stores, or is it only useful at Plus scale?
It applies to both, with different emphasis. Plus brands tend to need the multi-locale and expansion-store sections more than Basic stores. Basic stores often find the schema, content, and measurement sections more decision-changing because they have fewer automatic safeguards from agency partners. The core checklist is the same; the weighting of findings differs by scale.
How often should this audit be repeated?
Quarterly for the full 50 points, with a light monthly check on the highest-value items (crawler access, schema validation, prompt-set scoring). Anything more frequent than quarterly for the full audit usually produces noise rather than signal. AI assistants, Google's generative surfaces, and Shopify's own defaults shift at a quarterly rhythm, which is also the right cadence for structural review.
Key takeaways
- Work the zones in order: crawler access, schema, content, measurement, off-site, infrastructure. Earlier findings explain most of the problems in later zones.
- Crawler access and schema parity deliver the most high-leverage fixes from the first audit for most Shopify brands.
- Build measurement before investing heavily in content. Without a prompt set and a GA4 AI Search channel, every content decision is a guess.
- Under-invest in the checklist and you will repeat the same audit on every cycle; over-invest and the audit eats the budget that should be funding fixes. Quarterly full, monthly light is usually the right balance.
- Prioritise severity and reversibility. Fix silent blocks and schema mismatches first; save the structural content work for after the access and accuracy baseline is clean.
This article is intended for informational purposes. AI provider crawler policies, Shopify platform features, structured data requirements, and analytics surfaces can change over time. Verify current details with the relevant vendor documentation, Shopify Help Center, and a direct conversation with nivk.com before making a strategic or technical decision.