How to Add LLMS.txt to Drupal Sites

January 13, 2024
10 min read
LLMS.txt Team

Introduction

Drupal's routing, service, and configuration systems make it straightforward to serve an LLMS.txt file. This guide shows two ways to add LLMS.txt support to a Drupal site: uploading a static file to the document root, or building a custom module that generates the file from your published content and exposes an admin settings form.

Method 1: Manual File Upload (Quick Start)

Step 1: Generate Your LLMS.txt File

Start by creating your LLMS.txt file using our LLMS.txt Generator:

  1. Enter your Drupal site URL
  2. Add relevant disallow rules (e.g., /admin, /user, /node/add)
  3. Include contact information
  4. Generate and download the file

Step 2: Upload to Document Root

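Upload the llms.txt file to your Drupal installation's document root. In a Composer-based project this is the web/ directory, which also contains core/, modules/, and the other Drupal directories:

your-drupal-site/
├── vendor/
├── web/
│   ├── core/
│   ├── modules/
│   ├── profiles/
│   ├── sites/
│   ├── themes/
│   ├── index.php
│   ├── .htaccess
│   └── llms.txt  ← Upload here
├── composer.json
└── composer.lock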

Step 3: Verify Access

Test that your file is accessible at https://yoursite.com/llms.txt.

Method 2: Custom Drupal Module (Recommended)

For a more integrated and maintainable solution, create a custom Drupal module. This method provides better integration with Drupal's content system and allows for dynamic generation.

Step 1: Create Module Structure

Create the following directory structure in modules/custom/llms_txt/:

modules/custom/llms_txt/
├── llms_txt.info.yml
├── llms_txt.module
├── llms_txt.routing.yml
├── llms_txt.services.yml
├── config/
│   └── install/
│       └── llms_txt.settings.yml
└── src/
    ├── Controller/
    │   └── LlmsTxtController.php
    ├── Form/
    │   └── LlmsTxtConfigForm.php
    └── Service/
        └── LlmsTxtGenerator.php

Step 2: Module Info File

Create llms_txt.info.yml:

name: 'LLMS.txt Generator'
type: module
description: 'Generates LLMS.txt files for AI-friendly website documentation'
core_version_requirement: ^9 || ^10
package: Custom
dependencies:
  - drupal:system
  - drupal:node
  - drupal:user

Step 3: Routing Configuration

Create llms_txt.routing.yml:

llms_txt.output:
  path: '/llms.txt'
  defaults:
    _controller: '\Drupal\llms_txt\Controller\LlmsTxtController::output'
  requirements:
    _access: 'TRUE'
  methods: [GET]

llms_txt.admin:
  path: '/admin/config/search/llms-txt'
  defaults:
    _form: '\Drupal\llms_txt\Form\LlmsTxtConfigForm'
    _title: 'LLMS.txt Settings'
  requirements:
    _permission: 'administer site configuration'

Step 4: Services Configuration

Create llms_txt.services.yml:

services:
  llms_txt.generator:
    class: Drupal\llms_txt\Service\LlmsTxtGenerator
    arguments: ['@entity_type.manager', '@config.factory', '@url_generator']

Step 5: Controller Implementation

Create src/Controller/LlmsTxtController.php:

<?php

namespace Drupal\llms_txt\Controller;

use Drupal\Core\Controller\ControllerBase;
use Drupal\llms_txt\Service\LlmsTxtGenerator;
use Symfony\Component\DependencyInjection\ContainerInterface;
use Symfony\Component\HttpFoundation\Response;

/**
 * Controller for LLMS.txt output.
 */
class LlmsTxtController extends ControllerBase {

  /**
   * The LLMS.txt generator service.
   *
   * @var \Drupal\llms_txt\Service\LlmsTxtGenerator
   */
  protected $generator;

  /**
   * Constructs a new LlmsTxtController object.
   *
   * @param \Drupal\llms_txt\Service\LlmsTxtGenerator $generator
   *   The LLMS.txt generator service.
   */
  public function __construct(LlmsTxtGenerator $generator) {
    $this->generator = $generator;
  }

  /**
   * {@inheritdoc}
   */
  public static function create(ContainerInterface $container) {
    return new static(
      $container->get('llms_txt.generator')
    );
  }

  /**
   * Outputs the LLMS.txt file content.
   *
   * @return \Symfony\Component\HttpFoundation\Response
   *   A response object containing the LLMS.txt content.
   */
  public function output() {
    $content = $this->generator->generate();
    
    $response = new Response($content);
    $response->headers->set('Content-Type', 'text/plain; charset=UTF-8');
    $response->headers->set('Cache-Control', 'public, max-age=3600');
    
    return $response;
  }

}

Step 6: Generator Service

Create src/Service/LlmsTxtGenerator.php:

<?php

namespace Drupal\llms_txt\Service;

use Drupal\Core\Config\ConfigFactoryInterface;
use Drupal\Core\Entity\EntityTypeManagerInterface;
use Drupal\Core\Url;
use Drupal\Core\Routing\UrlGeneratorInterface;

/**
 * Service for generating LLMS.txt content.
 */
class LlmsTxtGenerator {

  /**
   * The entity type manager.
   *
   * @var \Drupal\Core\Entity\EntityTypeManagerInterface
   */
  protected $entityTypeManager;

  /**
   * The config factory.
   *
   * @var \Drupal\Core\Config\ConfigFactoryInterface
   */
  protected $configFactory;

  /**
   * The URL generator.
   *
   * @var \Drupal\Core\Routing\UrlGeneratorInterface
   */
  protected $urlGenerator;

  /**
   * Constructs a new LlmsTxtGenerator object.
   *
   * @param \Drupal\Core\Entity\EntityTypeManagerInterface $entity_type_manager
   *   The entity type manager.
   * @param \Drupal\Core\Config\ConfigFactoryInterface $config_factory
   *   The config factory.
   * @param \Drupal\Core\Routing\UrlGeneratorInterface $url_generator
   *   The URL generator.
   */
  public function __construct(EntityTypeManagerInterface $entity_type_manager, ConfigFactoryInterface $config_factory, UrlGeneratorInterface $url_generator) {
    $this->entityTypeManager = $entity_type_manager;
    $this->configFactory = $config_factory;
    $this->urlGenerator = $url_generator;
  }

  /**
   * Generates the LLMS.txt content.
   *
   * @return string
   *   The generated LLMS.txt content.
   */
  public function generate() {
    $config = $this->configFactory->get('llms_txt.settings');
    $site_config = $this->configFactory->get('system.site');
    
    $output = [];
    
    // Site header
    $site_name = $site_config->get('name') ?: 'Drupal Site';
    $output[] = "# " . $site_name;
    $output[] = "";
    
    // Site description
    $slogan = $site_config->get('slogan');
    if ($slogan) {
      $output[] = "> " . $slogan;
      $output[] = "";
    }
    
    // Contact information
    $output[] = "## Contact";
    $mail = $site_config->get('mail');
    if ($mail) {
      $output[] = "- Email: " . $mail;
    }
    
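    // Build the absolute front-page URL with the injected URL generator
    // instead of reading the $base_url global.
    $base_url = $this->urlGenerator->generateFromRoute('<front>', [], ['absolute' => TRUE]);
    $output[] = "- Website: " . $base_url;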
    
    $additional_contact = $config->get('additional_contact');
    if ($additional_contact) {
      $lines = explode("\n", $additional_contact);
      foreach ($lines as $line) {
        $line = trim($line);
        if (!empty($line)) {
          $output[] = "- " . $line;
        }
      }
    }
    $output[] = "";
    
    // Pages section
    $output[] = "## Pages";
    $output[] = "";
    
    // Get published nodes
    $node_storage = $this->entityTypeManager->getStorage('node');
    $included_types = $config->get('included_content_types') ?: ['page', 'article'];
    
    $query = $node_storage->getQuery()
      ->condition('status', 1)
      ->condition('type', $included_types, 'IN')
      ->sort('created', 'DESC')
      ->range(0, $config->get('max_pages') ?: 20)
      ->accessCheck(FALSE);
    
    $nids = $query->execute();
    $nodes = $node_storage->loadMultiple($nids);
    
    foreach ($nodes as $node) {
      $output[] = "### " . $node->getTitle();
      
      $url = Url::fromRoute('entity.node.canonical', ['node' => $node->id()], ['absolute' => TRUE]);
      $output[] = "URL: " . $url->toString();
      
      // Get node summary/description
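      if ($node->hasField('body') && !$node->get('body')->isEmpty()) {
        $body = $node->get('body')->first();
        if ($body && $body->summary) {
          $summary = trim(strip_tags($body->summary));
        }
        else {
          // Fall back to the first 200 characters of the body. The mb_*
          // functions count characters rather than bytes, so UTF-8 text is
          // never cut mid-character, and the ellipsis is appended only when
          // the text was actually truncated.
          $full_text = trim(strip_tags((string) $body->value));
          $summary = mb_strlen($full_text) > 200
            ? mb_substr($full_text, 0, 200) . '...'
            : $full_text;
        }
        $output[] = $summary;
      }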
      
      $output[] = "";
    }
    
    // Crawling rules
    $disallow_rules = $config->get('disallow_rules') ?: [
      '/admin',
      '/user',
      '/node/add',
      '/edit',
      '/delete',
      '?q=admin',
      '?q=user',
    ];
    
    if (!empty($disallow_rules)) {
      $output[] = "## Crawling Rules";
      foreach ($disallow_rules as $rule) {
        $output[] = "Disallow: " . trim($rule);
      }
    }
    
    return implode("\n", $output);
  }

}

Step 7: Configuration Form

Create src/Form/LlmsTxtConfigForm.php, the admin form that the llms_txt.admin route from Step 3 points to:

<?php

namespace Drupal\llms_txt\Form;

use Drupal\Core\Form\ConfigFormBase;
use Drupal\Core\Form\FormStateInterface;
use Drupal\Core\Entity\EntityTypeManagerInterface;
use Symfony\Component\DependencyInjection\ContainerInterface;

/**
 * Configuration form for LLMS.txt settings.
 */
class LlmsTxtConfigForm extends ConfigFormBase {

  /**
   * The entity type manager.
   *
   * @var \Drupal\Core\Entity\EntityTypeManagerInterface
   */
  protected $entityTypeManager;

  /**
   * {@inheritdoc}
   */
  public static function create(ContainerInterface $container) {
    // Let ConfigFormBase build the form with its own dependencies, then add
    // the entity type manager. Overriding the constructor without calling
    // parent::__construct() would skip the parent's setup.
    $instance = parent::create($container);
    $instance->entityTypeManager = $container->get('entity_type.manager');
    return $instance;
  }

  /**
   * {@inheritdoc}
   */
  protected function getEditableConfigNames() {
    return ['llms_txt.settings'];
  }

  /**
   * {@inheritdoc}
   */
  public function getFormId() {
    return 'llms_txt_config_form';
  }

  /**
   * {@inheritdoc}
   */
  public function buildForm(array $form, FormStateInterface $form_state) {
    $config = $this->config('llms_txt.settings');
    
    $form['additional_contact'] = [
      '#type' => 'textarea',
      '#title' => $this->t('Additional Contact Information'),
      '#default_value' => $config->get('additional_contact'),
      '#description' => $this->t('Additional contact information to include (one per line).'),
    ];
    
    $form['max_pages'] = [
      '#type' => 'number',
      '#title' => $this->t('Maximum Pages'),
      '#default_value' => $config->get('max_pages') ?: 20,
      '#description' => $this->t('Maximum number of pages to include in LLMS.txt.'),
      '#min' => 1,
      '#max' => 100,
    ];
    
    // Content types selection
    $content_types = $this->entityTypeManager->getStorage('node_type')->loadMultiple();
    $options = [];
    foreach ($content_types as $type) {
      $options[$type->id()] = $type->label();
    }
    
    $form['included_content_types'] = [
      '#type' => 'checkboxes',
      '#title' => $this->t('Included Content Types'),
      '#options' => $options,
      '#default_value' => $config->get('included_content_types') ?: ['page', 'article'],
      '#description' => $this->t('Select which content types to include in LLMS.txt.'),
    ];
    
    $form['disallow_rules'] = [
      '#type' => 'textarea',
      '#title' => $this->t('Disallow Rules'),
      '#default_value' => implode("\n", $config->get('disallow_rules') ?: []),
      '#description' => $this->t('Paths to disallow (one per line, without "Disallow: " prefix).'),
    ];
    
    return parent::buildForm($form, $form_state);
  }

  /**
   * {@inheritdoc}
   */
  public function submitForm(array &$form, FormStateInterface $form_state) {
    $config = $this->config('llms_txt.settings');
    
    // array_values() re-indexes the list after array_filter() drops blank lines.
    $disallow_rules = array_values(array_filter(
      array_map('trim', explode("\n", $form_state->getValue('disallow_rules')))
    ));
    
    $included_types = array_filter($form_state->getValue('included_content_types'));
    
    $config
      ->set('additional_contact', $form_state->getValue('additional_contact'))
      ->set('max_pages', (int) $form_state->getValue('max_pages'))
      ->set('included_content_types', array_values($included_types))
      ->set('disallow_rules', $disallow_rules)
      ->save();
    
    parent::submitForm($form, $form_state);
  }

}

Installation and Configuration

Step 1: Enable the Module

After creating all the files, enable your custom module:

# Via Drush
drush en llms_txt

# Via Drupal Admin UI
Admin → Extend → Enable "LLMS.txt Generator"

Step 2: Configure Settings

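Go to /admin/config/search/llms-txt (LLMS.txt Settings) to configure the options below. The module as written defines a route but no menu link, so the page will not be listed under Configuration → Search and metadata unless you also add an llms_txt.links.menu.yml file.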

  • Additional contact information
  • Maximum number of pages to include
  • Content types to include
  • Disallow rules

Step 3: Clear Cache

drush cr

Step 4: Test Your Implementation

Visit https://yoursite.com/llms.txt to see your generated LLMS.txt file.

Drupal-Specific Best Practices

Recommended Disallow Rules

  • /admin - Admin interface
  • /user - User pages
  • /node/add - Content creation forms
  • /edit - Edit forms
  • /delete - Delete confirmation pages
  • ?q=admin - Legacy admin URLs
  • ?q=user - Legacy user URLs
  • /search - Search result pages
  • /taxonomy/term - Taxonomy pages (if not needed)

Performance Optimization

  • Caching: Implement Drupal's cache API for LLMS.txt output
  • Content limits: Use reasonable limits for included content
  • Query optimization: Use entity queries efficiently
  • Cache invalidation: Clear LLMS.txt cache when content changes
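
The caching and invalidation points above can be combined through Drupal's cache tags. The following is a minimal sketch, not part of the module as written: it assumes a hypothetical helper, llms_txt_cached_output(), placed in llms_txt.module, and an illustrative cache ID of llms_txt:output. It stores the generated text in a cache bin and tags it with node_list and config:llms_txt.settings, the tags Drupal core invalidates when a node or the module's settings are saved.

```php
<?php

use Drupal\Core\Cache\CacheBackendInterface;

/**
 * Returns the LLMS.txt body from cache, rebuilding it on a cache miss.
 *
 * Hypothetical helper for llms_txt.module; the function name and cache ID
 * are illustrative assumptions.
 *
 * @param \Drupal\Core\Cache\CacheBackendInterface $cache
 *   The cache bin to use, e.g. the result of \Drupal::cache().
 * @param callable $build
 *   A callable that returns the freshly generated LLMS.txt string.
 *
 * @return string
 *   The cached or freshly generated content.
 */
function llms_txt_cached_output(CacheBackendInterface $cache, callable $build): string {
  $cid = 'llms_txt:output';

  // get() returns FALSE on a miss, otherwise an object whose ->data property
  // holds the stored value.
  $cached = $cache->get($cid);
  if ($cached !== FALSE) {
    return $cached->data;
  }

  $content = $build();

  // Keep the entry until one of its tags is invalidated: 'node_list' when any
  // node is saved or deleted, 'config:llms_txt.settings' when the settings
  // form is submitted.
  $cache->set($cid, $content, CacheBackendInterface::CACHE_PERMANENT, [
    'node_list',
    'config:llms_txt.settings',
  ]);

  return $content;
}
```

In the controller this could be called as llms_txt_cached_output(\Drupal::cache(), [$this->generator, 'generate']). Injecting the cache.default service into the controller instead of calling \Drupal::cache() would be the more testable choice. Note that the Cache-Control header set in the controller still lets browsers and proxies hold a copy for up to an hour.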

Multilingual Support

For multilingual Drupal sites, you can create language-specific versions:

  • Modify the routing to accept language parameters
  • Use Drupal's translation APIs
  • Generate separate LLMS.txt files per language
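
One way to approach the per-language output is to filter the loaded nodes down to those that have a translation in the requested language. The sketch below assumes a hypothetical helper, llms_txt_entries_for_language(), in llms_txt.module; it relies only on the entity translation methods hasTranslation() and getTranslation(). Whether the generated URLs carry a language prefix depends on the site's language detection settings.

```php
<?php

/**
 * Collects title/URL pairs for nodes translated into the given language.
 *
 * Hypothetical helper for llms_txt.module; not part of the module above.
 *
 * @param \Drupal\node\NodeInterface[] $nodes
 *   Loaded nodes, e.g. the result of loadMultiple() in the generator.
 * @param string $langcode
 *   A language code such as 'de'.
 *
 * @return array[]
 *   One ['title' => ..., 'url' => ...] entry per translated node.
 */
function llms_txt_entries_for_language(array $nodes, string $langcode): array {
  $entries = [];
  foreach ($nodes as $node) {
    // Skip nodes that have no translation in the requested language.
    if (!$node->hasTranslation($langcode)) {
      continue;
    }
    $translation = $node->getTranslation($langcode);
    $entries[] = [
      'title' => $translation->label(),
      // Build the URL from the translation object rather than the original.
      'url' => $translation->toUrl('canonical', ['absolute' => TRUE])->toString(),
    ];
  }
  return $entries;
}
```

The generator could then loop over these entries instead of the untranslated nodes, with the language code coming from a route parameter or from the language manager's current language.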

Advanced Features

Integration with Drupal Commerce

For e-commerce sites, consider including:

  • Product catalog pages
  • Store policies
  • Shipping and return information
  • Customer support details

Custom Field Integration

Enhance your LLMS.txt with custom field data:

  • SEO descriptions from meta tag modules
  • Custom summaries from fields
  • Priority indicators from custom fields
  • Category information
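
For the custom-summary case, the generator could try a list of fields in order and fall back to the next one when a field is missing or empty. This is a minimal sketch under stated assumptions: llms_txt_pick_summary() is a hypothetical helper for llms_txt.module, and field_llms_summary in the usage note is an invented field name. It only handles fields whose main property is a plain or HTML string; fields from meta tag modules may store structured data and would need their own handling.

```php
<?php

use Drupal\Core\Entity\FieldableEntityInterface;

/**
 * Returns the first non-empty text value among the given fields.
 *
 * Hypothetical helper for llms_txt.module; not part of the module above.
 *
 * @param \Drupal\Core\Entity\FieldableEntityInterface $entity
 *   The entity (typically a node) to read from.
 * @param string[] $field_names
 *   Field machine names, in order of preference.
 *
 * @return string
 *   The tag-stripped text, or an empty string if no field has a value.
 */
function llms_txt_pick_summary(FieldableEntityInterface $entity, array $field_names): string {
  foreach ($field_names as $field_name) {
    // Ignore fields that this bundle does not have or that hold no value.
    if (!$entity->hasField($field_name) || $entity->get($field_name)->isEmpty()) {
      continue;
    }
    $text = trim(strip_tags((string) $entity->get($field_name)->value));
    if ($text !== '') {
      return $text;
    }
  }
  return '';
}
```

A call such as llms_txt_pick_summary($node, ['field_llms_summary', 'body']) would prefer the dedicated field and use the body text only when that field is empty.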

Views Integration

Use Drupal Views to create custom content selections:

  • Most popular content
  • Recently updated pages
  • Featured content
  • Content by category or tag
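
If the selection lives in a view, the generator needs the entities behind the view's result rows. The sketch below assumes a hypothetical helper, llms_txt_entities_from_rows(), and a hypothetical view named llms_txt_pages; it also assumes each result row exposes its entity as $row->_entity, which is how Views attaches the base entity to a row.

```php
<?php

/**
 * Extracts the entities from Views result rows, keyed by entity ID.
 *
 * Hypothetical helper for llms_txt.module; not part of the module above.
 *
 * @param object[] $rows
 *   Result rows, each expected to carry its entity in the _entity property.
 *
 * @return object[]
 *   The entities found, keyed by ID, in row order and without duplicates.
 */
function llms_txt_entities_from_rows(array $rows): array {
  $entities = [];
  foreach ($rows as $row) {
    // Rows without a base entity (e.g. aggregated results) are skipped.
    if (isset($row->_entity)) {
      $entities[$row->_entity->id()] = $row->_entity;
    }
  }
  return $entities;
}
```

With the Views module enabled, the rows could come from views_get_view_result('llms_txt_pages', 'default'), and the returned entities would replace the entity query result in the generator.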

Troubleshooting

Module Not Installing

  • Check file permissions
  • Verify YAML syntax in configuration files
  • Ensure Drupal version compatibility
  • Check for missing dependencies

Route Not Working

  • Clear all caches
  • Check routing configuration
  • Verify controller namespace and class names
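  • Check .htaccess configuration, and remove any static llms.txt left in the document root from Method 1: the web server serves an existing file before the request ever reaches Drupal's router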

Empty or Incorrect Content

  • Verify content type selection
  • Check content publication status
  • Review entity access permissions
  • Debug with \Drupal::messenger()->addStatus() or a logger channel (drupal_set_message() was removed in Drupal 9)
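
When the output is empty, logging what the entity query actually returned is often the quickest check. A minimal sketch, assuming a hypothetical helper named llms_txt_log_query_result() in llms_txt.module and a logger channel called llms_txt:

```php
<?php

use Psr\Log\LoggerInterface;

/**
 * Logs which content types were queried and which node IDs came back.
 *
 * Hypothetical helper for llms_txt.module; not part of the module above.
 *
 * @param \Psr\Log\LoggerInterface $logger
 *   A logger channel, e.g. \Drupal::logger('llms_txt').
 * @param string[] $types
 *   The content type machine names used in the query.
 * @param array $nids
 *   The node IDs returned by $query->execute().
 */
function llms_txt_log_query_result(LoggerInterface $logger, array $types, array $nids): void {
  // Drupal's logger substitutes the @placeholders from the context array.
  $logger->debug('LLMS.txt query for types @types returned @count node(s): @nids', [
    '@types' => implode(', ', $types),
    '@count' => count($nids),
    '@nids' => implode(', ', $nids),
  ]);
}
```

Calling llms_txt_log_query_result(\Drupal::logger('llms_txt'), $included_types, $nids) right after $query->execute() in the generator would record the result; with the Database Logging module enabled, the entry appears under Reports → Recent log messages.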

Conclusion

Implementing LLMS.txt in Drupal provides excellent flexibility and integration with Drupal's content management capabilities. The custom module approach offers the most control and maintainability, while the manual method provides a quick start option.

Choose the implementation method that best fits your technical requirements and maintenance capabilities. The custom module approach is recommended for production sites as it provides better integration, configuration options, and automatic updates when content changes.