Top 10 mistakes to avoid in your llms.txt file

January 16, 2024
10 min read
llms.txt Team

llms.txt is simple, but small mistakes can make it less useful to AI crawlers. Here are the most common pitfalls, with practical fixes and better patterns you can copy.

1) Listing every URL instead of the essentials

Mistake: Dumping dozens or hundreds of URLs with no prioritization. Fix: Include only key pages and category hubs, and cover everything else with a short overview per category.

2) Writing marketing fluff instead of helpful summaries

Mistake: Vague superlatives such as “cutting‑edge” or “world‑class.” Fix: Use clear, descriptive language that tells users (and AI) what they will find on the page.
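
A simple word-list check can catch this kind of filler before you publish. The sketch below is illustrative only: the `BUZZWORDS` list and the `find_buzzwords` helper are assumptions made for this example, not part of any llms.txt tooling.

```python
import re

# Hypothetical starter list; extend it with the filler terms your own copy tends to use.
BUZZWORDS = ["cutting-edge", "world-class", "innovative", "redefine excellence"]


def find_buzzwords(text: str) -> list[str]:
    """Return the entries of BUZZWORDS that appear in text, in list order."""
    # Normalize typographic hyphens so "cutting‑edge" also matches "cutting-edge".
    normalized = re.sub(r"[\u2010\u2011\u2012\u2013]", "-", text.lower())
    return [word for word in BUZZWORDS if word in normalized]


bad = "We provide world-class solutions that redefine excellence with cutting-edge technology."
better = "Overview of inventory analytics, demand forecasting, and integrations."

print(find_buzzwords(bad))     # ['cutting-edge', 'world-class', 'redefine excellence']
print(find_buzzwords(better))  # []
```

A non-empty result is only a prompt to reread the blurb; a word list cannot tell whether a description is actually informative.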

3) Forgetting contact details

Mistake: No contact section. Fix: Add a “## Contact” section with Email and Website entries.

4) No structure or headings

Fix: Use clear sections (Contact, Pages, Crawling Rules) and short paragraphs.
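
A structural check like this is easy to automate. The following is a minimal sketch, assuming the section names used in this article (Contact, Pages, Crawling Rules); `REQUIRED_SECTIONS`, `missing_sections`, and `has_title` are hypothetical names chosen for the example.

```python
# Section names as used in this article's examples; adjust to your own layout.
REQUIRED_SECTIONS = ["Contact", "Pages", "Crawling Rules"]


def missing_sections(llms_txt: str) -> list[str]:
    """Return the required "## " sections that the file does not contain."""
    found = {
        line[3:].strip()
        for line in llms_txt.splitlines()
        if line.startswith("## ")  # "### " page headings do not match this prefix
    }
    return [name for name in REQUIRED_SECTIONS if name not in found]


def has_title(llms_txt: str) -> bool:
    """True if the first non-empty line is a "# " title."""
    for line in llms_txt.splitlines():
        if line.strip():
            return line.startswith("# ")
    return False


sample = "# MegaCo\n> Innovative solutions for the modern world.\n\n## Pages\n### Services\n"
print(has_title(sample))         # True
print(missing_sections(sample))  # ['Contact', 'Crawling Rules']
```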

5) Including private/admin paths

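Fix: Do not describe private or admin pages in the file. Add Disallow lines for those paths, and enforce access on the server or WAF, because a Disallow line is only a request to crawlers.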

6) Ignoring sitemaps and canonical URLs

Fix: List only canonical URLs (no redirecting or duplicate variants), and make sure robots.txt references sitemap.xml with a Sitemap: line.
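
Both checks can be roughed out in a few lines. This is a sketch under stated assumptions: `sitemap_urls` and `looks_canonical` are hypothetical helpers, and the "canonical" test is only a heuristic (HTTPS, no query string, no fragment). The authoritative canonical URL is whatever your pages declare, which this code does not inspect.

```python
from urllib.parse import urlsplit


def sitemap_urls(robots_txt: str) -> list[str]:
    """Return the URLs declared on "Sitemap:" lines of a robots.txt body."""
    urls = []
    for line in robots_txt.splitlines():
        # Split at the first colon only, so the "https:" in the value stays intact.
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


def looks_canonical(url: str) -> bool:
    """Rough heuristic: HTTPS, no query string, no fragment."""
    parts = urlsplit(url)
    return parts.scheme == "https" and not parts.query and not parts.fragment


robots = "User-agent: *\nDisallow: /admin\nSitemap: https://megaco.com/sitemap.xml\n"
print(sitemap_urls(robots))                                   # ['https://megaco.com/sitemap.xml']
print(looks_canonical("https://megaco.com/services?utm=ad"))  # False
```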

7) Never updating the file

Fix: Review monthly or after major site changes.

8) Over‑detailed product pages

Fix: Use 1–2 concise sentences per page; link to categories for breadth.
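
To spot blurbs that are missing or too long, you can count sentences per page entry. The sketch below assumes the entry layout used in this article's examples (a "### " heading, a "URL:" line, then the description); `page_blurbs` and `sentence_count` are hypothetical helpers, and the sentence splitter is deliberately naive (abbreviations such as "e.g." will be miscounted).

```python
import re


def page_blurbs(llms_txt: str) -> dict[str, str]:
    """Map each "### " page heading to the description text that follows its URL line."""
    blurbs: dict[str, str] = {}
    current = None
    for line in llms_txt.splitlines():
        if line.startswith("### "):
            current = line[4:].strip()
            blurbs[current] = ""
        elif line.startswith("## "):
            current = None  # a new top-level section ends the current page entry
        elif current and line.strip() and not line.startswith("URL:"):
            blurbs[current] = (blurbs[current] + " " + line.strip()).strip()
    return blurbs


def sentence_count(text: str) -> int:
    """Count sentences by splitting on ., ! or ? followed by whitespace or end of text."""
    return len([s for s in re.split(r"[.!?]+(?:\s+|$)", text) if s.strip()])


sample = """## Pages
### Services
URL: https://megaco.com/services
Overview of inventory analytics, demand forecasting, and integrations.

### Blog
URL: https://megaco.com/blog
"""

for name, blurb in page_blurbs(sample).items():
    n = sentence_count(blurb)
    status = "ok" if 1 <= n <= 2 else "needs attention"
    print(f"{name}: {n} sentence(s), {status}")
```

Here the Blog entry is flagged because it has a URL but no description, which is the same gap shown in the "Bad" example in this article.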

9) No Disallow section for AI crawlers

Fix: Add “## Crawling Rules” with Disallow lines for low‑value or sensitive paths.
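
Once Disallow lines exist, it is worth checking that the file does not also list a page under a disallowed path. This is a minimal sketch, assuming the "URL:" and "Disallow:" line formats used in this article; the helper names are hypothetical, and the matching is by whole path segment, which is an assumption of this example rather than a rule any crawler is known to follow.

```python
from urllib.parse import urlsplit


def disallowed_paths(llms_txt: str) -> list[str]:
    """Collect the path prefixes from "Disallow:" lines."""
    paths = []
    for line in llms_txt.splitlines():
        if line.strip().lower().startswith("disallow:"):
            path = line.split(":", 1)[1].strip()
            if path:
                paths.append(path)
    return paths


def listed_urls(llms_txt: str) -> list[str]:
    """Collect the URLs from "URL:" lines."""
    return [
        line.split(":", 1)[1].strip()
        for line in llms_txt.splitlines()
        if line.startswith("URL:")
    ]


def conflicts(llms_txt: str) -> list[str]:
    """Return listed URLs whose path equals or sits under a disallowed prefix."""
    blocked = disallowed_paths(llms_txt)
    result = []
    for url in listed_urls(llms_txt):
        path = urlsplit(url).path or "/"
        if any(path == p or path.startswith(p.rstrip("/") + "/") for p in blocked):
            result.append(url)
    return result


doc = (
    "## Pages\n### Admin\nURL: https://megaco.com/admin/login\n"
    "### Blog\nURL: https://megaco.com/blog\n\n"
    "## Crawling Rules\nDisallow: /admin\nDisallow: /internal\n"
)
print(conflicts(doc))  # ['https://megaco.com/admin/login']
```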

10) Not testing delivery and headers

Fix: Serve the file as text/plain, and test that /llms.txt returns 200 (or a single clean 301 that resolves to 200).
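
The delivery check can be scripted with the Python standard library. This is a sketch rather than a complete monitor: `evaluate` and `check_site` are hypothetical names, and the rule that the final URL should still end in /llms.txt is an assumption about what counts as a "clean" redirect.

```python
import urllib.request
from urllib.parse import urljoin


def evaluate(status: int, content_type: str, final_url: str) -> list[str]:
    """Return a list of problems for a fetched /llms.txt response (empty means OK)."""
    problems = []
    if status != 200:
        problems.append(f"expected final status 200, got {status}")
    # Ignore parameters such as "; charset=utf-8" when comparing the media type.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != "text/plain":
        problems.append(f"expected text/plain, got {media_type or 'no Content-Type'}")
    if not final_url.endswith("/llms.txt"):
        problems.append(f"redirected away from /llms.txt to {final_url}")
    return problems


def check_site(base_url: str, timeout: float = 10.0) -> list[str]:
    """Fetch <base_url>/llms.txt, following redirects, and evaluate the final response."""
    url = urljoin(base_url, "/llms.txt")
    # urlopen follows redirects automatically; 4xx/5xx responses raise HTTPError.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return evaluate(
            response.status,
            response.headers.get("Content-Type", ""),
            response.geturl(),
        )


# Offline demonstration of the evaluation step:
print(evaluate(200, "text/plain; charset=utf-8", "https://megaco.com/llms.txt"))  # []
print(evaluate(200, "text/html", "https://megaco.com/llms.txt"))

# To test a live site, call: check_site("https://your-domain.example")
```

Keeping the evaluation separate from the network call makes the rules easy to unit test and to rerun after a site migration.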

Bad vs better example

Bad

# MegaCo
> Innovative solutions for the modern world.

## Pages
### Services
URL: https://megaco.com/services
We provide world-class solutions that redefine excellence with cutting-edge technology.

### Blog
URL: https://megaco.com/blog

Better

# MegaCo
> B2B SaaS platform for inventory analytics and demand forecasting.

## Contact
- Email: team@megaco.com
- Website: https://megaco.com

## Pages
### Services
URL: https://megaco.com/services
Overview of inventory analytics, demand forecasting, and integrations.

### Blog
URL: https://megaco.com/blog
Articles on supply chain analytics, case studies, and product updates.

## Crawling Rules
Disallow: /admin
Disallow: /internal

Maintenance checklist

  • Confirm canonical URLs
  • Keep page blurbs short and specific
  • Update Disallow rules consistently
  • Pair with robots.txt, and enforce restrictions on the server/WAF
  • Re‑test after site migrations