A Docusaurus plugin for generating LLM-friendly documentation following the llmstxt standard.
- ⚡️ Easy integration with Docusaurus
- ✅ Zero config required, works out of the box
- ⚙️ Highly customizable with multiple options
- 📝 Creates
llms.txt
with section links - 📖 Produces
llms-full.txt
with all content in one file - 📋 Document ordering control for custom sequence
- 🔄 Path transformation to customize URL construction
- 📚 Option to include blog posts
- 🧩 Custom LLM files for specific documentation sections
- 🧹 Cleans HTML and normalizes content for optimal LLM consumption
- 🚫 Optional import statement removal for cleaner MDX content
- 🔄 Optional duplicate heading removal for concise output
- 📊 Provides statistics about generated documentation
- Installation
- Configuration Options
- Available Options
- Path Transformation Examples
- Document Ordering Examples
- Custom LLM Files
- Content Cleaning Options
- Best Practices
- How It Works
- Implementation Details
- Testing
- Future Enhancements
- License
npm install docusaurus-plugin-llms --save-dev
Then add to your Docusaurus configuration:
module.exports = {
// ... your existing Docusaurus config
plugins: [
'docusaurus-plugin-llms',
// ... your other plugins
],
};
You can configure the plugin by passing options:
module.exports = {
// ... your existing Docusaurus config
plugins: [
[
'docusaurus-plugin-llms',
{
// Options here
generateLLMsTxt: true,
generateLLMsFullTxt: true,
docsDir: 'docs',
ignoreFiles: ['advanced/*', 'private/*'],
title: 'My Project Documentation',
description: 'Complete reference documentation for My Project',
includeBlog: true,
// Content cleaning options
excludeImports: true,
removeDuplicateHeadings: true,
// Generate individual markdown files following llmstxt.org specification
generateMarkdownFiles: true,
// Control documentation order
includeOrder: [
'getting-started/*',
'guides/*',
'api/*',
],
includeUnmatchedLast: true,
// Path transformation options
pathTransformation: {
// Paths to ignore when constructing URLs (will be removed if found)
ignorePaths: ['docs'],
// Paths to add when constructing URLs (will be prepended if not already present)
addPaths: ['api'],
},
// Custom LLM files for specific documentation sections
customLLMFiles: [
{
filename: 'llms-python.txt',
includePatterns: ['api/python/**/*.md', 'guides/python/*.md'],
fullContent: true,
title: 'Python API Documentation',
description: 'Complete reference for Python API'
},
{
filename: 'llms-tutorials.txt',
includePatterns: ['tutorials/**/*.md'],
fullContent: false,
title: 'Tutorial Documentation',
description: 'All tutorials in a single file'
}
],
},
],
// ... your other plugins
],
};
Option | Type | Default | Description |
---|---|---|---|
description |
string | Site tagline | Custom description to use in generated files |
docsDir |
string | 'docs' |
Base directory for documentation files |
excludeImports |
boolean | false |
Remove import statements from generated content |
generateLLMsFullTxt |
boolean | true |
Whether to generate the full content file |
generateLLMsTxt |
boolean | true |
Whether to generate the links file |
ignoreFiles |
string[] | [] |
Array of glob patterns for files to ignore |
includeBlog |
boolean | false |
Whether to include blog content |
includeOrder |
string[] | [] |
Array of glob patterns for files to process in specific order |
includeUnmatchedLast |
boolean | true |
Whether to include unmatched files at the end |
llmsFullTxtFilename |
string | 'llms-full.txt' |
Custom filename for the full content file |
llmsTxtFilename |
string | 'llms.txt' |
Custom filename for the links file |
pathTransformation.addPaths |
string[] | [] |
Path segments to add when constructing URLs |
pathTransformation.ignorePaths |
string[] | [] |
Path segments to ignore when constructing URLs |
pathTransformation |
object | undefined |
Path transformation options for URL construction |
removeDuplicateHeadings |
boolean | false |
Remove redundant content that duplicates heading text |
title |
string | Site title | Custom title to use in generated files |
version |
string | undefined |
Global version to include in all generated files |
customLLMFiles |
array | [] |
Array of custom LLM file configurations |
generateMarkdownFiles |
boolean | false |
Generate individual markdown files and link to them from llms.txt |
keepFrontMatter |
string[] | [] | Preserve selected front matter items when generating individual markdown files |
rootContent |
string | (see below) | Custom content to include at the root level of llms.txt |
fullRootContent |
string | (see below) | Custom content to include at the root level of llms-full.txt |
The rootContent
and fullRootContent
options allow you to customize the introductory content that appears in your generated files, following the llmstxt.org standard which allows "zero or more markdown sections (e.g. paragraphs, lists, etc) of any type except headings" after the title and description.
If not specified, the plugin uses these defaults:
- llms.txt: "This file contains links to documentation sections following the llmstxt.org standard."
- llms-full.txt: "This file contains all documentation content in a single document following the llmstxt.org standard."
Example 1: Add project-specific context
rootContent: `Welcome to the MyProject documentation.
This documentation covers:
- Installation and setup
- API reference
- Advanced usage guides
- Troubleshooting
For the latest updates, visit https://myproject.dev/changelog`
Example 2: Add technical specifications
fullRootContent: `Complete offline documentation bundle for MyProject v2.0.
**Format**: Markdown with code examples
**Languages**: JavaScript, TypeScript, Python
**Last Generated**: ${new Date().toISOString()}
> Note: Some features require authentication tokens.
> See the Authentication section for details.`
Example 3: Add navigation hints for AI assistants
rootContent: `This documentation is optimized for AI assistants and LLMs.
Quick navigation:
- For API endpoints, search for "API:"
- For code examples, search for "Example:"
- For configuration, search for "Config:"
All code examples are MIT licensed unless otherwise noted.`
You can also specify root content for each custom LLM file:
customLLMFiles: [
{
filename: 'llms-api.txt',
includePatterns: ['api/**/*.md'],
fullContent: true,
title: 'API Documentation',
rootContent: `Complete API reference for all REST endpoints.
Authentication required for all endpoints except /health.
Base URL: https://api.example.com/v2`
}
]
The path transformation feature allows you to manipulate how URLs are constructed from file paths:
Example 1: Remove 'docs' from the URL path
pathTransformation: {
ignorePaths: ['docs'],
}
File path: /content/docs/manual/decorators.md
→ URL: https://example.com/manual/decorators
Example 2: Add 'api' to the URL path
pathTransformation: {
addPaths: ['api'],
}
File path: /content/manual/decorators.md
→ URL: https://example.com/api/manual/decorators
Example 3: Combine both transformations
pathTransformation: {
ignorePaths: ['docs'],
addPaths: ['api'],
}
File path: /content/docs/manual/decorators.md
→ URL: https://example.com/api/manual/decorators
The configuration supports multiple path segments in both arrays.
The document ordering feature allows you to control the sequence in which files appear in the generated output:
Example 1: Basic Section Ordering
includeOrder: [
'getting-started/*',
'guides/*',
'api/*',
'advanced/*'
]
Result: Files will appear in the generated output following this section order.
Example 2: Strict Inclusion List
includeOrder: [
'public-docs/**/*.md'
],
includeUnmatchedLast: false
Result: Only files matching 'public-docs/**/*.md' are included, all others are excluded.
Example 3: Detailed Ordering with Specific Files First
includeOrder: [
'getting-started/installation.md',
'getting-started/quick-start.md',
'getting-started/*.md',
'api/core/*.md',
'api/plugins/*.md',
'api/**/*.md'
]
Result: Installation and quick-start guides appear first, followed by other getting-started files, then API documentation in a specific order.
The plugin fully supports Docusaurus partials - reusable MDX content files that can be imported into other documents.
- Partial files (MDX files starting with underscore, e.g.,
_shared-config.mdx
) are automatically excluded from the generatedllms*.txt
files - Import statements for partials are resolved and the content is inlined when processing documents
Given a partial file _api-config.mdx
:
## API Configuration
Set your API endpoint:
```javascript
const API_URL = 'https://api.example.com';
And a document that imports it:
```mdx
---
title: Getting Started
---
# Getting Started Guide
import ApiConfig from './_api-config.mdx';
<ApiConfig />
Now you can make API calls...
The plugin will:
- Exclude
_api-config.mdx
fromllms.txt
- Replace the import and
<ApiConfig />
with the actual content in the processed document
In addition to the standard llms.txt
and llms-full.txt
files, you can generate custom LLM-friendly files for different sections of your documentation with the customLLMFiles
option:
customLLMFiles: [
{
filename: 'llms-python.txt',
includePatterns: ['api/python/**/*.md', 'guides/python/*.md'],
fullContent: true,
title: 'Python API Documentation',
description: 'Complete reference for Python API'
},
{
filename: 'llms-tutorials.txt',
includePatterns: ['tutorials/**/*.md'],
fullContent: false,
title: 'Tutorial Documentation',
description: 'All tutorials in a single file'
}
]
Each custom LLM file is defined by an object with the following properties:
Option | Type | Required | Description |
---|---|---|---|
filename |
string | Yes | Name of the output file (e.g., 'llms-python.txt') |
includePatterns |
string[] | Yes | Glob patterns for files to include |
fullContent |
boolean | Yes | true for full content like llms-full.txt, false for links only like llms.txt |
title |
string | No | Custom title for this file (defaults to site title) |
description |
string | No | Custom description for this file (defaults to site description) |
ignorePatterns |
string[] | No | Additional patterns to exclude (combined with global ignoreFiles) |
orderPatterns |
string[] | No | Order patterns for controlling file ordering (similar to includeOrder) |
includeUnmatchedLast |
boolean | No | Whether to include unmatched files last (default: false) |
version |
string | No | Version information for this LLM file (overrides global version) |
Create separate files for different programming languages:
customLLMFiles: [
{
filename: 'llms-python.txt',
includePatterns: ['api/python/**/*.md', 'guides/python/*.md'],
fullContent: true,
title: 'Python API Documentation'
},
{
filename: 'llms-javascript.txt',
includePatterns: ['api/javascript/**/*.md', 'guides/javascript/*.md'],
fullContent: true,
title: 'JavaScript API Documentation'
}
]
Separate tutorials from API reference:
customLLMFiles: [
{
filename: 'llms-tutorials.txt',
includePatterns: ['tutorials/**/*.md', 'guides/**/*.md'],
fullContent: true,
title: 'Tutorials and Guides'
},
{
filename: 'llms-api.txt',
includePatterns: ['api/**/*.md', 'reference/**/*.md'],
fullContent: true,
title: 'API Reference'
}
]
Create a beginner-focused file with carefully ordered content:
customLLMFiles: [
{
filename: 'llms-getting-started.txt',
includePatterns: ['**/*.md'],
ignorePatterns: ['advanced/**/*.md', 'internal/**/*.md'],
orderPatterns: [
'introduction.md',
'getting-started/*.md',
'tutorials/basic/*.md',
'examples/simple/*.md'
],
fullContent: true,
title: 'Getting Started Guide',
description: 'Beginner-friendly documentation with essential concepts'
}
]
Include version information in your documentation files:
plugins: [
[
'docusaurus-plugin-llms',
{
// Global version applies to all files
version: '2.0.0',
// Custom LLM files with specific versions
customLLMFiles: [
{
filename: 'api-reference.txt',
title: 'API Reference Documentation',
description: 'Complete API reference for developers',
includePatterns: ['**/api/**/*.md', '**/reference/**/*.md'],
fullContent: true,
version: '1.0.0' // Overrides global version
},
{
filename: 'tutorials.txt',
title: 'Tutorials and Guides',
description: 'Step-by-step tutorials and guides',
includePatterns: ['**/tutorials/**/*.md', '**/guides/**/*.md'],
fullContent: true,
version: '0.9.5-beta' // Overrides global version
}
]
}
],
]
The generated files will include the version information under the description:
# API Reference Documentation
> Complete API reference for developers
Version: 1.0.0
This file contains all documentation content in a single document following the llmstxt.org standard.
The plugin provides advanced content cleaning options to optimize your documentation for LLM consumption by removing unnecessary elements that can clutter the output.
The excludeImports
option removes JavaScript/TypeScript import statements from your MDX files, which are typically not useful for LLMs and can create noise in the generated documentation.
- Your documentation uses MDX files with React components
- You have many import statements for UI components
- You want cleaner, more readable output for LLMs
Before (with excludeImports: false
):
import ApiTabs from "@theme/ApiTabs";
import DiscriminatorTabs from "@theme/DiscriminatorTabs";
import MethodEndpoint from "@theme/ApiExplorer/MethodEndpoint";
import SecuritySchemes from "@theme/ApiExplorer/SecuritySchemes";
import MimeTabs from "@theme/MimeTabs";
import ParamsItem from "@theme/ParamsItem";
# Create User Account
This endpoint creates a new user account...
After (with excludeImports: true
):
# Create User Account
This endpoint creates a new user account...
{
excludeImports: true, // Remove all import statements
}
The removeDuplicateHeadings
option removes redundant content that simply repeats the heading text immediately after the heading, which is common in auto-generated API documentation.
- Your documentation has redundant content that repeats heading text
- You have auto-generated API docs with minimal content
- You want to eliminate repetitive patterns for cleaner LLM consumption
Before (with removeDuplicateHeadings: false
):
# Create Deliverable
Create Deliverable
---
# Update User Profile
Update User Profile
---
After (with removeDuplicateHeadings: true
):
# Create Deliverable
---
# Update User Profile
---
{
removeDuplicateHeadings: true, // Remove redundant heading text
}
For optimal LLM-friendly output, you can combine both options:
module.exports = {
plugins: [
[
'docusaurus-plugin-llms',
{
// Enable both content cleaning options for optimal LLM output
excludeImports: true,
removeDuplicateHeadings: true,
// Other configuration options...
generateLLMsTxt: true,
generateLLMsFullTxt: true,
docsDir: 'docs',
},
],
],
};
{
excludeImports: false,
removeDuplicateHeadings: false
}
- Preserves all original content
- Suitable when you want to keep import statements for reference
- Good for documentation that doesn't have redundant patterns
{
excludeImports: true,
removeDuplicateHeadings: false
}
- Removes import statements but keeps all content
- Good for MDX-heavy documentation sites
- Maintains content structure while removing technical imports
{
excludeImports: true,
removeDuplicateHeadings: true
}
- Maximum cleanup for LLM consumption
- Removes both imports and redundant content
- Recommended for API documentation and auto-generated content
- Produces the cleanest, most concise output
If you have auto-generated API documentation (like OpenAPI docs), enable both cleaning options:
{
excludeImports: true, // Remove React component imports
removeDuplicateHeadings: true, // Remove redundant API endpoint descriptions
generateLLMsFullTxt: true, // Create comprehensive single file
}
For hand-written tutorials and guides, you might want selective cleaning:
{
excludeImports: true, // Remove any MDX imports
removeDuplicateHeadings: false, // Keep all content as written
includeOrder: [ // Organize content logically
'getting-started/*',
'tutorials/*',
'advanced/*'
]
}
Create separate clean files for different programming languages:
{
excludeImports: true,
removeDuplicateHeadings: true,
customLLMFiles: [
{
filename: 'llms-python.txt',
includePatterns: ['**/python/**/*.md'],
fullContent: true,
title: 'Python Documentation'
},
{
filename: 'llms-javascript.txt',
includePatterns: ['**/javascript/**/*.md'],
fullContent: true,
title: 'JavaScript Documentation'
}
]
}
- Content cleaning adds minimal processing overhead
- Both options work on the content after HTML tag removal
- No impact on your site's build performance
- Cleaning happens only during the LLM file generation phase
Both options default to false
, ensuring existing configurations continue to work without changes. Only users who explicitly enable these features will see the cleaned output.
The generateMarkdownFiles
option enables the plugin to generate individual markdown files for each documentation page, following the llmstxt.org specification more closely. When enabled, this creates separate .md
files for LLM consumption instead of linking to your original documentation pages.
Default Behavior (generateMarkdownFiles: false):
- Generates
llms.txt
with links to your original documentation pages - Example:
[Getting Started](https://yoursite.com/docs/getting-started)
With generateMarkdownFiles: true:
- Generates individual markdown files (e.g.,
getting-started.md
,api-reference.md
) - Generates
llms.txt
with links to these generated markdown files - Example:
[Getting Started](https://yoursite.com/getting-started.md)
- Standards Compliance: Follows the llmstxt.org specification by providing individual markdown files rather than linking to HTML pages
- LLM Optimization: Generated files contain clean, processed markdown optimized for LLM consumption
- Self-Contained: All necessary content is available in markdown format without requiring HTML parsing
- Flexible Naming: Automatically generates readable filenames based on document titles
module.exports = {
plugins: [
[
'docusaurus-plugin-llms',
{
generateMarkdownFiles: true, // Enable individual markdown file generation
generateLLMsTxt: true, // Generate index file linking to markdown files
excludeImports: true, // Clean up import statements
removeDuplicateHeadings: true, // Remove redundant content
// Other options work normally
includeOrder: ['getting-started/*', 'guides/*', 'api/*'],
pathTransformation: {
ignorePaths: ['docs']
}
}
]
]
}
With generateMarkdownFiles: true
, your output directory will contain:
build/
├── llms.txt # Index file with links to generated markdown files
├── llms-full.txt # Full content file (if enabled)
├── getting-started.md # Generated from your getting started docs
├── api-reference.md # Generated from your API documentation
├── user-guide.md # Generated from your user guides
└── ... # Other generated markdown files
The plugin generates readable filenames using this priority:
- Document Title: Converted to kebab-case (e.g., "Getting Started" →
getting-started.md
) - URL Path: If title is unavailable, uses the document's URL path
- Uniqueness: Automatically appends numbers for duplicate names (e.g.,
getting-started-1.md
)
Generated markdown files include:
- Document title as H1 heading
- Document description as blockquote (following llmstxt.org format)
- Processed content with optional cleaning (import removal, duplicate heading removal)
- Proper markdown formatting optimized for LLM consumption
Input documentation about "API Authentication" would generate api-authentication.md
:
# API Authentication
> Learn how to authenticate with our API using various methods
## Overview
This guide covers all authentication methods supported by our API...
## API Key Authentication
Use your API key to authenticate requests:
```javascript
const client = new Client({ apiKey: 'your-key' });
### Use Cases
#### Standards-Compliant Documentation
Perfect for projects that want to follow the llmstxt.org specification exactly:
```js
{
generateMarkdownFiles: true,
generateLLMsTxt: true,
generateLLMsFullTxt: false // Optional: disable if only individual files are needed
}
Generate clean markdown files for LLM training or fine-tuning:
{
generateMarkdownFiles: true,
excludeImports: true,
removeDuplicateHeadings: true,
customLLMFiles: [
{
filename: 'training-data.txt',
includePatterns: ['**/*.md'],
fullContent: true
}
]
}
Generate both original links and markdown files for different use cases:
{
generateLLMsTxt: true, // Links to original pages
generateMarkdownFiles: true, // Also generate individual markdown files
llmsTxtFilename: 'llms-original.txt', // Original links file
// The markdown-linked version will be in llms.txt
}
- Fully backward compatible: Defaults to
false
, existing configurations unchanged - Works with all existing options: Path transformations, custom LLM files, content cleaning
- Respects ordering: Generated files maintain the same order as configured with
includeOrder
- Custom LLM files: Also support markdown file generation when the global option is enabled
This plugin automatically generates the following files during the build process:
- llms.txt: Contains links to all sections of your documentation
- llms-full.txt: Contains all documentation content in a single file
- Custom LLM files: Additional files based on your custom configurations
These files follow the llmstxt standard, making your documentation optimized for use with Large Language Models (LLMs).
The plugin:
- Scans your
docs
directory recursively for all Markdown files - Optionally includes blog content
- Orders documents according to specified glob patterns (if provided)
- Extracts metadata, titles, and content from each file
- Creates proper URL links to each document section
- Applies path transformations according to configuration (removing or adding path segments)
- Generates a table of contents in
llms.txt
- Combines all documentation content in
llms-full.txt
- Creates custom LLM files based on specified configurations
- Provides statistics about the generated documentation
The plugin includes comprehensive tests in the tests
directory:
- Unit tests: Test the path transformation functionality in isolation
- Integration tests: Simulate a Docusaurus build with various configurations
To run the tests:
# Run all tests
npm test
# Run just the unit tests
npm run test:unit
# Run just the integration tests
npm run test:integration
For more detailed testing instructions, see tests/TESTING.md.
Planned features for future versions:
- Advanced glob pattern matching for file filtering
- Support for i18n content
- Specific content tags for LLM-only sections
This project is licensed under the MIT License.