AI-Powered Documentation Generation
Docsio generates a complete documentation site from your existing website. Paste a URL, and Docsio handles the rest -- scraping your content, extracting your branding, and producing structured docs ready to publish.
How It Works
The generation process has four stages:
1. Site Mapping
When you provide a URL, Docsio uses Firecrawl to map your website. It discovers the accessible pages (up to 100 per URL) and builds a list of URLs to scrape.
You can provide multiple URLs if your content spans different subdomains or paths (for example, a marketing site and an existing docs site).
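As a rough illustration of how per-URL discoveries could be merged into one scrape list, here is a minimal Python sketch. The function name, the input shape, and the de-duplication step are assumptions made for this example; only the 100-page cap comes from the description above, and the sketch does not call Firecrawl itself.

```python
from urllib.parse import urldefrag

MAX_PAGES_PER_URL = 100  # cap stated in the Site Mapping description


def build_scrape_list(discovered: dict[str, list[str]]) -> list[str]:
    """Merge per-seed discoveries into one de-duplicated scrape list.

    `discovered` maps each provided URL to the page URLs found under it.
    This shape is hypothetical; the real mapper output may differ.
    """
    seen: set[str] = set()
    scrape_list: list[str] = []
    for _seed, pages in discovered.items():
        # Apply the per-URL cap before merging.
        for page in pages[:MAX_PAGES_PER_URL]:
            url, _fragment = urldefrag(page)  # drop "#section" anchors
            if url not in seen:  # assumed: each page is scraped once
                seen.add(url)
                scrape_list.append(url)
    return scrape_list
```

With two provided URLs, each one contributes at most 100 pages, so a marketing site and a docs subdomain are capped independently in this sketch.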
2. Content Scraping
Each discovered page is scraped for its main content, which is extracted as clean Markdown. Metadata is also collected:
| Extracted Data | Description |
|---|---|
| Page text | Main content converted to Markdown |
| Page title | From the HTML <title> or Open Graph metadata |
| Content type | Classified by URL path (see below) |
| Description | From meta description or OG description |
| Site name | From OG site name or title tag |
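The table lists two possible sources for several fields but does not say which one wins. The sketch below assumes the first source named in each row is preferred and the second is the fallback. The input dictionary keys (`title`, `description`, and the Open Graph property names) and the function names are illustrative, not Docsio's actual data model.

```python
from typing import Optional


def pick_first(*candidates: Optional[str]) -> Optional[str]:
    """Return the first non-empty candidate, stripped of whitespace."""
    for value in candidates:
        if value and value.strip():
            return value.strip()
    return None


def resolve_metadata(raw: dict[str, str]) -> dict[str, Optional[str]]:
    """Resolve page metadata with an assumed source precedence."""
    return {
        # HTML <title> first, then Open Graph title (assumed order).
        "title": pick_first(raw.get("title"), raw.get("og:title")),
        # Meta description first, then OG description (assumed order).
        "description": pick_first(raw.get("description"), raw.get("og:description")),
        # OG site name first, then the title tag (assumed order).
        "site_name": pick_first(raw.get("og:site_name"), raw.get("title")),
    }
```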
Pages are automatically classified based on their URL path:
- landing -- the homepage or root path
- features -- URL path contains "feature" or "product"
- docs -- URL path contains "doc", "guide", "tutorial", or "api"
- pricing -- URL path contains "pricing" or "plans"
- blog -- URL path contains "blog", "post", or "article"
- support -- URL path contains "faq", "help", or "support"
- about -- URL path contains "about", "team", or "company"
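The classification rules above can be sketched as a small keyword lookup. This is a minimal sketch, not Docsio's implementation: it assumes case-insensitive substring matching, checks the rules in the order listed, and returns a hypothetical `"other"` type when nothing matches (the source does not say what happens in that case).

```python
from urllib.parse import urlparse

# Keyword rules copied from the list above, checked in the order listed
# (the precedence between rules is an assumption).
RULES = [
    ("features", ("feature", "product")),
    ("docs", ("doc", "guide", "tutorial", "api")),
    ("pricing", ("pricing", "plans")),
    ("blog", ("blog", "post", "article")),
    ("support", ("faq", "help", "support")),
    ("about", ("about", "team", "company")),
]


def classify_page(url: str, fallback: str = "other") -> str:
    """Classify a page by keywords found in its URL path."""
    path = urlparse(url).path.lower().strip("/")
    if not path:
        return "landing"  # homepage or root path
    for content_type, keywords in RULES:
        if any(keyword in path for keyword in keywords):
            return content_type
    return fallback  # hypothetical catch-all, not named in the docs
```

One consequence of plain substring matching is that unrelated paths can match by accident. For example, a path such as `/rapid` contains "api" and would be classified as docs in this sketch.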
3. Brand Extraction
In parallel with content scraping, Docsio extracts your brand identity from the primary URL. This includes colors, logo, fonts, and more. See Brand Extraction for full details.
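To illustrate what running the two stages in parallel can look like, here is a minimal `asyncio` sketch. Both coroutines are placeholder stubs with hypothetical names and return shapes, and treating the first URL as the primary URL is an assumption made for this example.

```python
import asyncio


async def scrape_content(urls: list[str]) -> dict[str, str]:
    """Stub for stage 2: pretend to scrape each URL to Markdown."""
    await asyncio.sleep(0)  # stands in for network I/O
    return {url: f"# Placeholder Markdown for {url}" for url in urls}


async def extract_brand(primary_url: str) -> dict[str, object]:
    """Stub for stage 3: pretend to extract brand data."""
    await asyncio.sleep(0)  # stands in for network I/O
    return {"source": primary_url, "colors": [], "logo": None, "fonts": []}


async def run_scrape_and_brand(urls: list[str]):
    # Both stages start together; gather waits for both to finish.
    content, brand = await asyncio.gather(
        scrape_content(urls),
        extract_brand(urls[0]),  # assumed: first URL is the primary one
    )
    return content, brand
```

Because brand extraction depends only on the primary URL and not on the scraped content, neither stage has to wait for the other.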
4. AI Document Generation
Using the scraped content as context, Docsio generates structured documentation pages. The AI follows the Diátaxis documentation framework to produce the right type of content for each section:
- Tutorials -- learning-oriented, guided experiences
- How-to Guides -- task-oriented, goal-focused instructions
- Reference -- structured lookup information
- Explanations -- conceptual, understanding-oriented content
Generated Categories
Each generated page is assigned to a category that determines its position in the sidebar:
| Category | What It Contains | Enabled by Default |
|---|---|---|
| getting-started | Quick Start, Installation, Configuration | Yes |
| features | Individual feature pages or a features overview | Yes |
| api-reference | API Overview, Authentication, Endpoints | No |
| guides | How-to guides, FAQ, Troubleshooting, Changelog | No |
| integrations | Third-party integrations and connections | No |
| concepts | Core concepts and mental models | No |
Categories marked "No" are enabled automatically only when the AI detects relevant content on your website. For example, API reference pages appear only if your site mentions an API.
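The category table can be read as a simple rule: a category is enabled if it is on by default or if relevant content was detected. A minimal sketch follows, in which the `detected` set is a hypothetical stand-in for whatever signal the AI actually produces, and the table order is assumed to be the output order.

```python
# Category name -> enabled by default, copied from the table above.
CATEGORIES = {
    "getting-started": True,
    "features": True,
    "api-reference": False,
    "guides": False,
    "integrations": False,
    "concepts": False,
}


def enabled_categories(detected: set[str]) -> list[str]:
    """Return enabled categories in table order.

    `detected` holds the optional categories for which relevant content
    was found (hypothetical input for this sketch).
    """
    return [
        name
        for name, on_by_default in CATEGORIES.items()
        if on_by_default or name in detected
    ]
```

For a site that mentions an API but nothing else optional, this yields getting-started, features, and api-reference.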
Page Structure
Every generated page includes:
- Title -- a clear, descriptive name
- Content -- accurate Markdown based only on scraped data
- Category -- one of the categories listed above
- Position -- ordering within its category for sidebar navigation
The AI uses only information it found on your website. If it lacks details for a section, it inserts [TODO] placeholders rather than fabricating content. A short, accurate page is always preferred over a long, invented one.
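The four page fields map naturally onto a small record type. The sketch below is one possible representation, not Docsio's internal model: the `GeneratedPage` class, the `build_sidebar` helper, and the `needs_review` check are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass
class GeneratedPage:
    title: str
    content: str   # Markdown body
    category: str  # one of the generated categories
    position: int  # ordering within the category


def build_sidebar(pages: list[GeneratedPage], category_order: list[str]) -> dict[str, list[str]]:
    """Group page titles by category, ordered by position."""
    rank = {name: i for i, name in enumerate(category_order)}
    ordered = sorted(
        pages,
        # Unknown categories sort last; the name keeps each group contiguous.
        key=lambda p: (rank.get(p.category, len(rank)), p.category, p.position),
    )
    return {
        category: [page.title for page in group]
        for category, group in groupby(ordered, key=lambda p: p.category)
    }


def needs_review(page: GeneratedPage) -> bool:
    """True if the page still contains a [TODO] placeholder."""
    return "[TODO]" in page.content
```

A check like `needs_review` is one way to find the pages where the AI lacked source material, so they can be filled in before publishing.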
Feature-Aware Generation
Docsio adapts its output based on the number of features it detects:
- 3 or fewer features -- combined into a single "Features Overview" page
- 4 or more features -- each feature gets its own dedicated page (up to 10)
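The two rules above amount to a threshold check plus a cap. Here is a minimal sketch with hypothetical names. The source does not say what happens to features beyond the tenth, so this sketch simply leaves them out, which is an assumption.

```python
OVERVIEW_THRESHOLD = 3   # 3 or fewer features share one page
MAX_FEATURE_PAGES = 10   # upper bound on dedicated feature pages


def plan_feature_pages(features: list[str]) -> list[str]:
    """Return the titles of the feature pages to generate."""
    if len(features) <= OVERVIEW_THRESHOLD:
        return ["Features Overview"]
    # 4 or more: one page per feature, capped at the maximum.
    # Assumed: features past the cap get no dedicated page.
    return features[:MAX_FEATURE_PAGES]
```

For example, a site with two detected features gets a single "Features Overview" page, while a site with twelve gets ten dedicated pages under these assumptions.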
The Result
The output is a fully structured Docusaurus documentation site with:
- Proper sidebar navigation organized by category
- A generated homepage linking to your documentation sections
- Branding applied from your website (colors, logo, fonts)
- A search index for client-side search
- All content ready to edit with the AI Agent Editor
You can also upload files (PDF, DOCX, TXT, Markdown) during project creation. The AI extracts their text and uses it as additional context when generating your docs.
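If you prepare uploads in bulk, a quick local check can separate supported files from unsupported ones before project creation. This sketch is not part of Docsio. The extension list is an assumption based on the formats named above (in particular, treating both `.md` and `.markdown` as Markdown), and the function name is hypothetical.

```python
from pathlib import Path

# Assumed extensions for the formats named above: PDF, DOCX, TXT, Markdown.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".txt", ".md", ".markdown"}


def split_uploads(filenames: list[str]) -> tuple[list[str], list[str]]:
    """Split filenames into (accepted, rejected) by extension."""
    accepted: list[str] = []
    rejected: list[str] = []
    for name in filenames:
        # Compare case-insensitively so "spec.PDF" is accepted.
        if Path(name).suffix.lower() in SUPPORTED_EXTENSIONS:
            accepted.append(name)
        else:
            rejected.append(name)
    return accepted, rejected
```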