AI-Powered Documentation Generation

Docsio generates a complete documentation site from your existing website. Paste a URL, and Docsio handles the rest -- scraping your content, extracting your branding, and producing structured docs ready to publish.

How It Works

The generation process has four stages:

1. Site Mapping

When you provide a URL, Docsio uses Firecrawl to map your website. It discovers up to 100 accessible pages per URL and builds the list of pages to scrape.

You can provide multiple URLs if your content spans different subdomains or paths (for example, a marketing site and an existing docs site).
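As a rough illustration of how several starting URLs could be merged into one scrape list, here is a minimal sketch. The `discover` callable is a hypothetical stand-in for the site-mapping call (it is not Firecrawl's actual API), and dropping duplicates across URLs is an assumption; only the cap of 100 pages per URL comes from the description above.

```python
from typing import Callable, Iterable

MAX_PAGES_PER_URL = 100  # cap stated in the text above


def build_scrape_list(
    seed_urls: Iterable[str],
    discover: Callable[[str], list[str]],
    limit: int = MAX_PAGES_PER_URL,
) -> list[str]:
    """Merge the pages discovered from each seed URL into one ordered list.

    `discover` is a hypothetical stand-in for the site-mapping call: it
    takes a seed URL and returns the page URLs found under it.
    """
    seen: set[str] = set()
    to_scrape: list[str] = []
    for seed in seed_urls:
        # Apply the per-URL cap before merging, so each seed contributes
        # at most `limit` pages.
        for url in discover(seed)[:limit]:
            if url not in seen:  # assumption: duplicates across seeds are dropped
                seen.add(url)
                to_scrape.append(url)
    return to_scrape
```

With a marketing site and a docs site as the two seeds, the result would hold at most 200 unique URLs, in discovery order.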

2. Content Scraping

Each discovered page is scraped for its main content, which is extracted as clean Markdown. Metadata is also collected:

| Extracted Data | Description |
| --- | --- |
| Page text | Main content converted to Markdown |
| Page title | From the HTML <title> or Open Graph metadata |
| Content type | Classified by URL path (see below) |
| Description | From meta description or OG description |
| Site name | From OG site name or title tag |
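The fallback behaviour described for titles, descriptions, and site names could be modelled roughly as follows. This is a sketch only: the `ScrapedPage` record, the `meta` dictionary keys, and the precedence order (first source listed wins) are assumptions for illustration, not Docsio's actual data model. Content type is left out here because it is derived from the URL path.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScrapedPage:
    url: str
    text: str          # main content as Markdown
    title: str
    description: str
    site_name: str


def first_present(*candidates: Optional[str]) -> str:
    """Return the first non-empty candidate, or an empty string."""
    for value in candidates:
        if value and value.strip():
            return value.strip()
    return ""


def build_page(url: str, markdown: str, meta: dict) -> ScrapedPage:
    # The `meta` keys are hypothetical names for values read from the HTML head.
    # Assumed precedence: the first source named for each field wins.
    return ScrapedPage(
        url=url,
        text=markdown,
        title=first_present(meta.get("title"), meta.get("og:title")),
        description=first_present(meta.get("description"), meta.get("og:description")),
        site_name=first_present(meta.get("og:site_name"), meta.get("title")),
    )
```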

Each page is automatically assigned a content type based on keywords in its URL path:

  • landing -- homepage or root path
  • features -- pages containing "feature" or "product"
  • docs -- pages containing "doc", "guide", "tutorial", or "api"
  • pricing -- pages containing "pricing" or "plans"
  • blog -- pages containing "blog", "post", or "article"
  • support -- pages containing "faq", "help", or "support"
  • about -- pages containing "about", "team", or "company"
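The keyword rules above can be sketched as a small lookup. The keyword groups are taken from the list; two details are assumptions: rules are checked in the listed order with the first match winning, and unmatched paths fall back to a hypothetical `other` type, since no fallback is named here.

```python
from urllib.parse import urlparse

# Keyword groups copied from the list above, checked in the listed order.
RULES = [
    ("features", ("feature", "product")),
    ("docs", ("doc", "guide", "tutorial", "api")),
    ("pricing", ("pricing", "plans")),
    ("blog", ("blog", "post", "article")),
    ("support", ("faq", "help", "support")),
    ("about", ("about", "team", "company")),
]


def classify(url: str) -> str:
    """Return the content type for a URL based on substrings of its path."""
    path = urlparse(url).path.lower().strip("/")
    if not path:
        return "landing"  # homepage or root path
    for content_type, keywords in RULES:
        if any(keyword in path for keyword in keywords):
            return content_type
    return "other"  # assumption: the source does not name a fallback type
```

Because matching is by substring, a path that contains keywords from two groups (for example `/product/docs`) resolves to whichever group is checked first under this assumed ordering.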

3. Brand Extraction

In parallel with content scraping, Docsio extracts your brand identity from the primary URL. This includes colors, logo, fonts, and more. See Brand Extraction for full details.
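To illustrate what running the two stages in parallel means, here is a minimal sketch using Python's `asyncio`. Both coroutines are placeholders with hypothetical names, and treating the first URL as the primary one is an assumption; nothing here reflects Docsio's internal implementation.

```python
import asyncio


async def scrape_content(urls: list[str]) -> dict[str, str]:
    # Placeholder: a real implementation would fetch and convert each page.
    await asyncio.sleep(0)
    return {url: "" for url in urls}


async def extract_brand(primary_url: str) -> dict[str, str]:
    # Placeholder: a real implementation would inspect styles and assets.
    await asyncio.sleep(0)
    return {"source": primary_url}


async def run_stages(urls: list[str]) -> tuple[dict[str, str], dict[str, str]]:
    # Assumption: the first URL in the list is the primary one.
    # gather() runs both coroutines concurrently and returns their
    # results in the same order as the arguments.
    pages, brand = await asyncio.gather(
        scrape_content(urls),
        extract_brand(urls[0]),
    )
    return pages, brand
```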

4. AI Document Generation

Using the scraped content as context, Docsio generates structured documentation pages. The AI follows the Diataxis documentation framework to produce the right type of content for each section:

  • Tutorials -- learning-oriented, guided experiences
  • How-to Guides -- task-oriented, goal-focused instructions
  • Reference -- structured lookup information
  • Explanations -- conceptual, understanding-oriented content

Generated Categories

Each generated page is assigned to a category that determines its position in the sidebar:

| Category | What It Contains | Default |
| --- | --- | --- |
| getting-started | Quick Start, Installation, Configuration | Yes |
| features | Individual feature pages or a features overview | Yes |
| api-reference | API Overview, Authentication, Endpoints | No |
| guides | How-to guides, FAQ, Troubleshooting, Changelog | No |
| integrations | Third-party integrations and connections | No |
| concepts | Core concepts and mental models | No |

Categories marked "No" under Default are enabled automatically when the AI detects relevant content on your website (for example, API reference pages appear only if your site mentions an API).
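The default and conditional categories can be pictured with a short sketch. The category names and their default status come from the table; the function name, the `detected` input, and the output order (table order) are assumptions made for illustration.

```python
DEFAULT_CATEGORIES = ["getting-started", "features"]
OPTIONAL_CATEGORIES = ["api-reference", "guides", "integrations", "concepts"]


def enabled_categories(detected: set[str]) -> list[str]:
    """Defaults are always on; optional categories only when detected."""
    unknown = detected - set(DEFAULT_CATEGORIES) - set(OPTIONAL_CATEGORIES)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    # Assumption: categories keep the order used in the table.
    return DEFAULT_CATEGORIES + [c for c in OPTIONAL_CATEGORIES if c in detected]
```

For a site that mentions an API and explains core concepts but has no integrations, `enabled_categories({"api-reference", "concepts"})` would yield the two defaults followed by `api-reference` and `concepts`.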

Page Structure

Every generated page includes:

  • Title -- a clear, descriptive name
  • Content -- accurate Markdown based only on scraped data
  • Category -- one of the categories listed above
  • Position -- ordering within its category for sidebar navigation

Accuracy over completeness

The AI only uses information it found on your website. If it lacks details for a section, it inserts [TODO] placeholders rather than fabricating content. A short, accurate page is always preferred over a long, invented one.
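Since gaps are marked with [TODO], one practical follow-up is to list the pages that still need manual input. The sketch below assumes a hypothetical `GeneratedPage` record mirroring the four fields listed above; it is not an export format that Docsio is known to provide.

```python
from dataclasses import dataclass


@dataclass
class GeneratedPage:
    title: str
    content: str   # Markdown body
    category: str
    position: int  # order within the category


TODO_MARKER = "[TODO]"


def pages_needing_review(pages: list[GeneratedPage]) -> list[tuple[str, int]]:
    """Return (title, placeholder count) for each page containing a TODO marker."""
    counted = [(page.title, page.content.count(TODO_MARKER)) for page in pages]
    return [(title, count) for title, count in counted if count > 0]
```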

Feature-Aware Generation

Docsio adapts its output based on the number of features it detects:

  • 3 or fewer features -- combined into a single "Features Overview" page
  • 4 or more features -- each feature gets its own dedicated page (up to 10)
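The threshold rule above can be expressed as a small function. The thresholds (3 and 10) are taken from the text; the function name and the choice to keep the first 10 features when more are detected are assumptions, since the selection rule is not described here.

```python
SINGLE_PAGE_THRESHOLD = 3  # 3 or fewer features share one page
MAX_FEATURE_PAGES = 10     # upper bound on dedicated feature pages


def plan_feature_pages(features: list[str]) -> list[str]:
    """Return the titles of the feature pages to generate."""
    if len(features) <= SINGLE_PAGE_THRESHOLD:
        return ["Features Overview"]
    # Assumption: when more than 10 features are found, the first 10 are kept.
    return features[:MAX_FEATURE_PAGES]
```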

The Result

The output is a fully structured Docusaurus documentation site with:

  • Proper sidebar navigation organized by category
  • A generated homepage linking to your documentation sections
  • Branding applied from your website (colors, logo, fonts)
  • A search index for client-side search
  • All content ready to edit with the AI Agent Editor

Uploaded documents

You can also upload files (PDF, DOCX, TXT, Markdown) during project creation. The AI extracts their text and uses it as additional context when generating your docs.