Why most websites are invisible to ChatGPT — and how to fix it in an afternoon
If ChatGPT can't see your site, it can't cite you, and customers asking about your category never learn you exist. Here's why most sites are invisible, the five fixes that matter, and how to ship them today.
ChatGPT now has more than 800 million weekly users. Many of them ask questions that, three years ago, would have been Google searches. "What's the best CRM for small business?" "Who makes the most durable hiking boots?" "Which tax software is right for freelancers?"
When they ask, ChatGPT picks 1-3 sources and quotes from them. Those sources get cited. Their brands get mentioned. Their websites get a sliver of awareness from a person who never visited.
The vast majority of websites are invisible to this entire conversation. Not because they're badly designed or because their content is bad — because they failed to do five specific things that determine whether ChatGPT can read them at all.
This post covers all five. Each is fixable in 15-30 minutes. Together they take an afternoon, and they meaningfully change whether AI engines can see your site.
The five reasons sites are invisible
1. ChatGPT can't read your content because it's locked behind JavaScript
Many modern websites render most of their content via JavaScript. The HTML your browser receives is essentially empty: a <div id="root"></div> and a script tag that builds the page client-side.
ChatGPT's crawler (GPTBot) and its on-demand fetcher (ChatGPT-User) don't always execute JavaScript. When they don't, they see the empty HTML and assume your page has no content. They move on.
How to know if you have this problem:
In your browser, right-click your homepage and choose "View Page Source" (not "Inspect Element", which shows the page after JavaScript has already run). Read what's there. If you see a clean, structured page with all your content, you're fine. If you see little more than <div id="root"></div> or <div id="__next"></div> followed by a script tag, with no actual content, you have a rendering problem.
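If you'd rather script this check than eyeball the source, here is a minimal Python sketch. It counts the words that sit outside script, style, and similar tags in the raw HTML, which roughly approximates what a fetcher that skips JavaScript would see. The function names and the 50-word threshold are assumptions made for illustration, not anything the crawlers document.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects text that appears outside tags a reader never sees."""

    SKIPPED_TAGS = {"script", "style", "noscript", "template", "title"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside the skipped tags.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_word_count(raw_html: str) -> int:
    """Count words in the HTML as served, before any JavaScript runs."""
    parser = VisibleTextExtractor()
    parser.feed(raw_html)
    parser.close()
    return sum(len(chunk.split()) for chunk in parser.chunks)


def looks_client_rendered(raw_html: str, min_words: int = 50) -> bool:
    """Heuristic: very little visible text suggests a JavaScript-built page.

    min_words is an arbitrary cutoff chosen for this sketch.
    """
    return visible_word_count(raw_html) < min_words
```

Feed it the output of curl -s https://yoursite.com/ rather than anything copied from the browser inspector. A count near zero on a page that looks full in the browser suggests the content is being built client-side.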
The fix:
Server-side rendering or static generation. If you're on Next.js, Astro, Remix, Eleventy, or similar — you probably already render most pages server-side. Just verify the important ones (homepage, key product pages) ship full HTML.
If you're on a single-page React app with no SSR, this is a bigger lift. Consider pre-rendering at least your high-value pages, or adding a service like Prerender.io that serves bot-readable HTML to crawlers.
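Pre-rendering setups generally decide who gets the static HTML by looking at the User-Agent request header. A rough sketch of that decision, assuming plain substring matching on the crawler names mentioned in this post (the function name is hypothetical, and so are the sample strings you test it with). Google-Extended is left out on purpose: it is a robots.txt control token, not a user agent that shows up in requests.

```python
from typing import Optional

# Assumed list: crawler names from this post that appear in User-Agent headers.
AI_CRAWLER_TOKENS = (
    "GPTBot",
    "ChatGPT-User",
    "OAI-SearchBot",
    "ClaudeBot",
    "PerplexityBot",
)


def matched_ai_crawler(user_agent: str) -> Optional[str]:
    """Return the first known crawler token found in the header, else None."""
    lowered = user_agent.lower()
    for token in AI_CRAWLER_TOKENS:
        if token.lower() in lowered:
            return token
    return None
```

User-Agent headers are trivially spoofed, so treat a match as a hint for serving pre-rendered HTML, not as proof of identity.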
2. Your robots.txt doesn't explicitly allow AI crawlers
Most robots.txt files were last seriously updated when only Googlebot and Bingbot mattered. They don't say anything about GPTBot, ChatGPT-User, OAI-SearchBot, ClaudeBot, PerplexityBot, or Google-Extended.
The defaults vary by crawler. Some assume "allowed unless explicitly blocked." Some are more conservative. Some sites block AI crawlers accidentally via overly aggressive bot-protection systems (Cloudflare's default rules, for example, sometimes block AI bots).
How to know if you have this problem:
Fetch your robots.txt with curl https://yoursite.com/robots.txt. Look for explicit entries for GPTBot, ClaudeBot, PerplexityBot, GrokBot. If you don't see them by name, your robots.txt doesn't address them directly, and each bot falls back to whatever your User-agent: * group says.
Also check your Cloudflare/Vercel/CDN bot rules — some platforms have "block AI bots" toggles that you may have enabled without realizing what they cover.
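The robots.txt half of this check can be scripted with Python's standard library. The sketch below reports, for each bot, whether the file names it and whether it may fetch your homepage. The bot list is an assumption drawn from the crawlers named in this post, and the helper names are made up for the example. It says nothing about CDN-level blocking, which happens before robots.txt is ever consulted.

```python
from urllib.robotparser import RobotFileParser

# Assumed list, taken from the crawlers named in this post; extend as needed.
AI_BOTS = [
    "GPTBot",
    "ChatGPT-User",
    "OAI-SearchBot",
    "ClaudeBot",
    "PerplexityBot",
    "Google-Extended",
]


def named_agents(robots_txt: str) -> set:
    """Return the lowercased names found on every User-agent line."""
    agents = set()
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "user-agent":
            agents.add(value.strip().lower())
    return agents


def audit_robots(robots_txt: str, url: str = "https://yoursite.com/") -> dict:
    """Map each bot to whether it is named and whether it may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    named = named_agents(robots_txt)
    return {
        bot: {"named": bot.lower() in named, "allowed": parser.can_fetch(bot, url)}
        for bot in AI_BOTS
    }
```

A bot that comes back with "named": False and "allowed": True is only allowed by default, which is the situation this section is about.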
The fix:
Add explicit User-agent entries for major AI crawlers. Our full guide is at Robots.txt for AI crawlers, but the short version: name each major AI bot and explicitly allow them.
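To make "name each bot and allow it" concrete, here is a small Python sketch that writes the entries and then parses them back to confirm they do what you expect. The bot names you pass in are your choice; the function names are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def render_allow_rules(bots) -> str:
    """Build one 'User-agent / Allow' group per bot, separated by blank lines."""
    groups = [f"User-agent: {bot}\nAllow: /" for bot in bots]
    return "\n\n".join(groups) + "\n"


def all_allowed(robots_txt: str, bots, url: str = "https://yoursite.com/") -> bool:
    """Parse the text back and check that every bot may fetch the given URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return all(parser.can_fetch(bot, url) for bot in bots)


if __name__ == "__main__":
    print(render_allow_rules(["GPTBot", "ClaudeBot", "PerplexityBot"]))
```

Append the output to your existing robots.txt. One thing to keep in mind: a crawler that finds a group naming it follows that group instead of the User-agent: * group, so repeat any Disallow lines you still want those bots to respect.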
3. You don't have an /llms.txt file
This is a newer convention — a single Markdown file at the root of your domain that tells AI engines what your site is about and which pages are worth reading. It takes about 15 minutes to write. Most websites don't have one.
The sites that DO have one see meaningful improvements in citation rate. It's the single highest-leverage 15 minutes of AEO (answer engine optimization) work you can do.
How to know if you have this problem:
Fetch https://yoursite.com/llms.txt. If you get a 404, you don't have one. If you get HTML back (a 200 status with a styled 404 page), you don't have one, and your server is misconfigured: it is returning a "soft 404" instead of a real 404 status.
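The three outcomes above can be told apart from the status code, the Content-Type header, and the first few characters of the body. A minimal sketch, assuming you have already fetched those values (for example with curl -i); the function name and the labels it returns are made up for this example.

```python
def classify_llms_txt(status: int, content_type: str, body: str) -> str:
    """Label the response to a request for /llms.txt."""
    if status == 404:
        return "missing"
    if status != 200:
        return "error"
    head = body.lstrip().lower()
    served_as_html = "html" in content_type.lower()
    looks_like_html = head.startswith("<!doctype html") or head.startswith("<html")
    if served_as_html or looks_like_html:
        # A 200 response carrying an HTML page is a styled "not found" page.
        return "soft-404"
    if not body.strip():
        return "empty"
    return "present"
```

Anything other than "present" means AI engines asking for the file get nothing useful back.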
The fix:
Write one. We have a complete guide at llms.txt explained with a working example you can adapt in 15 minutes. Or let AISEOLab generate one from your sitemap.
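For a sense of what goes in the file, here is a sketch that builds one from a short list of pages. It follows the layout described in the public llms.txt proposal as we understand it: a title line, a one-line summary as a blockquote, then sections of Markdown links. The company, URLs, and function name are placeholders.

```python
def render_llms_txt(name: str, summary: str, sections: dict) -> str:
    """Build llms.txt content.

    sections maps a section heading to a list of (title, url, note) tuples.
    """
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    # Placeholder data; replace with your own pages.
    print(render_llms_txt(
        "Example Co",
        "Example Co sells durable hiking boots.",
        {"Products": [("Trail boot", "https://example.com/trail-boot", "Flagship boot.")]},
    ))
```

Save the result as llms.txt at the root of your domain and make sure it is served as plain text or Markdown, not wrapped in your site's HTML template.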
4. Your content has no clear structure
AI engines read documents the way humans skim — by heading, then by paragraph. They look for H1s as title signals, H2s as section signals, and paragraphs as quotable units.
Many websites have terrible structure for this. Multiple H1s on the same page. Headings used for styling instead of hierarchy. Long paragraphs that try to do too much. Content buried inside accordions and tabs.
How to know if you have this problem:
Open your homepage. Open the browser inspector. Look at the structure: how many <h1> tags? How many <h2>? Are headings used to mark logical sections, or are they styling decisions?
A well-structured page has exactly one H1, multiple H2s marking logical sections, H3s for sub-sections, and paragraphs that each express a single claim.
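Those rules are easy to check mechanically. The sketch below collects heading levels in document order and flags two problems: an H1 count other than one, and a jump that skips a level (an H1 followed directly by an H4, say). The names are illustrative, and it cannot judge whether a heading marks a logical section; that part still needs a human.

```python
import re
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Records the level of every h1-h6 tag in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if re.fullmatch(r"h[1-6]", tag):
            self.levels.append(int(tag[1]))


def audit_headings(html: str) -> list:
    """Return a list of human-readable problems; empty means none found."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    levels = collector.levels

    problems = []
    h1_count = levels.count(1)
    if h1_count != 1:
        problems.append(f"expected exactly one <h1>, found {h1_count}")
    for previous, current in zip(levels, levels[1:]):
        if current > previous + 1:
            problems.append(
                f"<h{previous}> is followed by <h{current}>, skipping a level"
            )
    return problems
```

Run it on the raw HTML from curl so you are auditing what a crawler receives, not what the browser built afterwards.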
The fix:
Audit your most important pages — homepage, primary product pages, top blog posts. Restructure them with clear H1/H2/H3 hierarchy. Break up long paragraphs into 2-4 sentence chunks, each expressing one idea.
This work has the side effect of making your pages more readable to humans too. AI engines and humans both reward clarity.
5. You have no Schema.org markup
Schema.org structured data, usually embedded as a JSON-LD script tag, tells AI engines (and search engines) what your page is about. Organization schema tells them you're a company. Product schema tells them you sell specific things. FAQPage schema lists question-answer pairs in a format both Google and AI engines parse directly.
Most websites have nothing. Even sites with Google rich snippets often have only the bare minimum.
How to know if you have this problem:
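Visit Google's Rich Results Test: https://search.google.com/test/rich-results. Enter your homepage URL. See what Schema is detected. If it detects no items, you have this problem. The Rich Results Test only reports the types Google supports for rich results; the Schema Markup Validator at https://validator.schema.org shows every Schema.org type on the page.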
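You can also list what a page declares without leaving the terminal. This sketch pulls every application/ld+json script block out of the raw HTML and collects the @type values, including those nested under @graph. It only reads JSON-LD (not Microdata or RDFa), silently skips blocks that fail to parse, and uses made-up names throughout.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Gathers the text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").strip().lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def _collect_types(node, found):
    """Walk lists, dicts, and @graph entries, appending each @type string."""
    if isinstance(node, list):
        for item in node:
            _collect_types(item, found)
    elif isinstance(node, dict):
        declared = node.get("@type")
        if isinstance(declared, str):
            found.append(declared)
        elif isinstance(declared, list):
            found.extend(t for t in declared if isinstance(t, str))
        _collect_types(node.get("@graph", []), found)


def schema_types(html: str) -> list:
    """Return the Schema.org types declared in the page's JSON-LD blocks."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    found = []
    for block in collector.blocks:
        try:
            _collect_types(json.loads(block), found)
        except json.JSONDecodeError:
            continue  # malformed block: nothing usable to report
    return found
```

An empty list for your homepage means there is no JSON-LD for an engine to read.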
The fix:
Add Schema.org JSON-LD to your key pages. Common types to start with:
- Organization on the homepage (your company)
- WebSite on the homepage (your site's name and URL)
- SoftwareApplication if you're a SaaS (describes your product, including pricing)
- Article on blog posts (with author, datePublished, etc.)
- FAQPage wherever you have FAQs
- Product on e-commerce product pages
- BreadcrumbList for navigation breadcrumbs
This is one of the highest-impact AEO fixes. AI engines lean on Schema as ground truth when working out what your page is about.
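As a starting point for the two homepage types, here is a sketch that builds Organization and WebSite markup and wraps it in the script tag that goes in your page's head. The company name, URLs, and function names are placeholders, and only a few common properties are shown; real markup usually carries more.

```python
import json


def homepage_schema(name: str, url: str, logo_url: str) -> dict:
    """Describe the company and the site in one JSON-LD graph."""
    return {
        "@context": "https://schema.org",
        "@graph": [
            {"@type": "Organization", "name": name, "url": url, "logo": logo_url},
            {"@type": "WebSite", "name": name, "url": url},
        ],
    }


def jsonld_script(data: dict) -> str:
    """Serialize data and wrap it in a JSON-LD script tag."""
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so no string value can close the script element early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


if __name__ == "__main__":
    # Placeholder values; replace with your own.
    print(jsonld_script(homepage_schema(
        "Example Co",
        "https://example.com/",
        "https://example.com/logo.png",
    )))
```

Paste the printed tag into the homepage's head, then run the page through a validator to confirm both types are detected.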
How to ship all five in an afternoon
Here's a 4-hour plan:
Hour 1: Audit
Run AISEOLab's free scan on your site. It'll check all five things above (and 12 more) automatically. You'll get a list of what's there, what's broken, and what's missing.
If you'd rather DIY: open your robots.txt, your llms.txt (if it exists), and Google's Rich Results Test on your homepage. Note what's missing.
Hour 2: Quick wins
- Generate or write a /llms.txt file. Upload it to the root of your server.
- Update your robots.txt with explicit entries for GPTBot, ClaudeBot, PerplexityBot, GrokBot, Google-Extended, Applebot-Extended.
- Add Organization and WebSite Schema.org JSON-LD to your homepage.
Hour 3: Content structure
Pick your top 3 pages by traffic or business importance. Audit them for proper H1/H2 hierarchy. Fix any pages with multiple H1s or weird heading nesting. Break up the longest paragraphs into shorter ones.
Hour 4: Verify
Re-run the scan. Confirm what you fixed is now passing. Address any remaining items.
After this afternoon, your site goes from "invisible to most AI engines" to "explicitly addressing them, structured for them, and described to them." Within 2-6 weeks, AI engines will re-crawl and start incorporating your content. Citation rate improvements typically appear 4-12 weeks later.
A note on what NOT to do
While you're doing this, avoid common mistakes:
Don't try to game AI engines with keyword stuffing. The new generation of AI is significantly better at detecting low-quality content than Google was in 2010. Write for humans first, then add structure.
Don't block AI crawlers to protect your content. Many sites do this to prevent "stealing" of content. The cost of blocking is becoming invisible. Unless you have specific legal or competitive reasons, allow AI crawlers and accept that your content will be read and quoted.
Don't trust any service that promises "guaranteed first-position citations." AEO is real but it's not magic. Anyone making citation guarantees is either misunderstanding the technology or being dishonest about it.
Don't ignore the work because "AI traffic isn't that big yet." It is. Even if it weren't, the work compounds. The companies that figure out AEO in 2026 will have a multi-year advantage.
The five fixes, in one sentence
Make your content server-rendered, explicitly allow AI crawlers in robots.txt, ship an llms.txt file, fix your heading hierarchy, and add Schema.org JSON-LD.
That's the afternoon. The compounding starts immediately. The full visibility shift takes 2-3 months.
Scan your site free to see exactly where you stand today. Free for one site, forever, no signup required.
Questions about your specific site? Email hello@aiseolab.ai. We're happy to look at concrete examples.