# Robots Txt Plugin

## Purpose
Generates a robots.txt file in the dist/ directory during builds, configuring web crawler access rules and linking to the sitemap.
## Location

- Package: `@shevky/plugin-robots-txt`
- Main: `plugin-robots-txt/main.js`
- Runtime name: `shevky-robots-txt`
## Discovery and Registration

Listed in `site.json` -> `plugins` as `"@shevky/plugin-robots-txt"`. Loaded by `PluginRegistry.load()`. The plugin defines no `load()` initializer.
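For reference, the registration entry described above would sit in `site.json` roughly like this (the surrounding structure of `site.json` is an assumption; only the `plugins` key and package name come from this document):

```json
{
  "plugins": [
    "@shevky/plugin-robots-txt"
  ]
}
```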
## Lifecycle Hooks

| Hook | Implemented |
|---|---|
| `dist:clean` | ✓ |
| `assets:copy` | - |
| `content:load` | - |
| `content:ready` | - |
| `page:meta` | - |
## How It Works

During `dist:clean`, the handler:

- Reads `ctx.config.identity.url` and strips trailing slashes to get the base URL.
- Reads the `ctx.config.robots.allow` and `ctx.config.robots.disallow` arrays.
- Builds a text file with `User-agent: *`, `Allow:`, and `Disallow:` directives.
- Appends `Sitemap: {baseUrl}/sitemap.xml`.
- Writes the result to `dist/robots.txt` via `ctx.file.write()`.
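The steps above can be sketched as a pure function. This is a minimal illustration, not the plugin's actual source: the config field names (`identity.url`, `robots.allow`, `robots.disallow`) follow this document, but the function name and exact string handling are assumptions.

```javascript
// Sketch of the dist:clean handler's text-building step, extracted as a
// pure function so the ctx.file.write() side effect stays out of scope.
function buildRobotsTxt(config) {
  // Strip trailing slashes from the configured site URL.
  const baseUrl = config.identity.url.replace(/\/+$/, "");
  const { allow = [], disallow = [] } = config.robots || {};

  const lines = ["User-agent: *"];
  for (const path of allow) lines.push(`Allow: ${path}`);
  for (const path of disallow) lines.push(`Disallow: ${path}`);
  lines.push(`Sitemap: ${baseUrl}/sitemap.xml`);

  return lines.join("\n") + "\n";
}

// Example: the text the handler would pass to ctx.file.write().
const text = buildRobotsTxt({
  identity: { url: "https://example.com/" },
  robots: { allow: ["/"], disallow: ["/draft/"] },
});
```

In the real plugin, the returned string would then be written to `dist/robots.txt` via `ctx.file.write()`.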
## Configuration

Uses the `robots` section from `site.json`:

```json
"robots": {
  "allow": ["/"],
  "disallow": ["/draft/"]
}
```

No plugin-specific config in `pluginConfigs`; the plugin reads directly from the global `robots` config section.
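Given the configuration above, and assuming `identity.url` is set to `https://example.com`, the generated `dist/robots.txt` would look like:

```
User-agent: *
Allow: /
Disallow: /draft/
Sitemap: https://example.com/sitemap.xml
```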
## Dependencies

`@shevky/base` only. No external dependencies.
## Risks and Limitations

- Hardcoded sitemap path: always references `sitemap.xml`. If the sitemap plugin writes a different filename, the reference will be wrong.
- No validation: does not check that `allow`/`disallow` paths are well-formed.
- Single user agent: only generates rules for `User-agent: *`.