Configuration Reference

This document provides a comprehensive reference for all configuration options available in the DeepWiki-to-mdBook Converter system. It covers environment variables, their default values, validation logic, auto-detection features, and how configuration flows through the system components.

For information about running the system with these configurations, see Quick Start. For details on how auto-detection works internally, see Auto-Detection Features.

Configuration System Overview

The DeepWiki-to-mdBook Converter uses environment variables as its sole configuration mechanism. The build-docs.sh orchestrator script processes all configuration at runtime; no configuration files are required. Defaults and Git-based auto-detection of REPO keep the required configuration to a minimum.

Configuration Flow Diagram

flowchart TD
    User["User/CI System"]
    Docker["docker run -e VAR=value"]

    subgraph "build-docs.sh Configuration Processing"
        AutoDetect["Git Auto-Detection\n[build-docs.sh:8-19]"]
        ParseEnv["Environment Variable Parsing\n[build-docs.sh:21-26]"]
        Defaults["Default Value Assignment\n[build-docs.sh:43-45]"]
        Validate["Validation\n[build-docs.sh:32-37]"]
    end

    subgraph "Configuration Consumers"
        Scraper["deepwiki-scraper.py\nREPO parameter"]
        BookToml["book.toml Generation\n[build-docs.sh:85-103]"]
        SummaryGen["SUMMARY.md Generation\n[build-docs.sh:113-159]"]
    end

    User -->|Set environment variables| Docker
    Docker -->|Container startup| AutoDetect
    AutoDetect -->|REPO detection| ParseEnv
    ParseEnv -->|Parse all vars| Defaults
    Defaults -->|Apply defaults| Validate
    Validate -->|REPO validated| Scraper
    Validate -->|BOOK_TITLE, BOOK_AUTHORS, GIT_REPO_URL| BookToml
    Validate -->|No direct config needed| SummaryGen

Sources: build-docs.sh:1-206 README.md:41-51

Environment Variables Reference

The following table lists all environment variables supported by the system:

| Variable | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| REPO | String | Conditional | Auto-detected from Git remote | GitHub repository in owner/repo format. Required if not running in a Git repository with a GitHub remote. |
| BOOK_TITLE | String | No | "Documentation" | Title displayed in the generated mdBook documentation. Used in book.toml title field. |
| BOOK_AUTHORS | String | No | Repository owner (from REPO) | Author name(s) displayed in the documentation. Used in book.toml authors array. |
| GIT_REPO_URL | String | No | https://github.com/{REPO} | Full GitHub repository URL. Used for "Edit this page" links in mdBook output. |
| MARKDOWN_ONLY | Boolean | No | "false" | When "true", skips Phase 3 (mdBook build) and outputs only extracted Markdown files. Useful for debugging. |

Sources: build-docs.sh:21-26 README.md:44-51

Variable Details and Usage

REPO

Format: owner/repo (e.g., "facebook/react" or "microsoft/vscode")

Purpose: Identifies the GitHub repository to scrape from DeepWiki.com. This is the primary configuration variable that drives the entire system.

Auto-Detection Logic:

flowchart TD
    Start["build-docs.sh Startup"]
    CheckEnv{"REPO environment\nvariable set?"}
    UseEnv["Use provided REPO value\n[build-docs.sh:22]"]
    CheckGit{"Git repository\ndetected?"}
    GetRemote["Execute: git config --get\nremote.origin.url\n[build-docs.sh:12]"]
    ParseURL["Extract owner/repo using regex:\n.*github\\.com[:/]([^/]+/[^/\\.]+)\n[build-docs.sh:16]"]
    SetRepo["Set REPO variable\n[build-docs.sh:16]"]
    ValidateRepo{"REPO is set?"}
    Error["Exit with error\n[build-docs.sh:33-37]"]
    Continue["Continue with\nREPO=$REPO_OWNER/$REPO_NAME"]

    Start --> CheckEnv
    CheckEnv -->|Yes| UseEnv
    CheckEnv -->|No| CheckGit
    CheckGit -->|Yes| GetRemote
    CheckGit -->|No| ValidateRepo
    GetRemote --> ParseURL
    ParseURL --> SetRepo
    UseEnv --> ValidateRepo
    SetRepo --> ValidateRepo
    ValidateRepo -->|No| Error
    ValidateRepo -->|Yes| Continue

Sources: build-docs.sh:8-37
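
The URL-parsing step in the flowchart can be sketched as a small shell function. This is a minimal sketch, not the script's actual code: the function name extract_repo is hypothetical and the use of sed -E is an assumption; only the regular expression is taken from the diagram.

```shell
# Hypothetical helper: pull "owner/repo" out of a GitHub remote URL.
# The pattern mirrors the one shown in the flowchart; sed -E is an assumption.
extract_repo() {
    printf '%s\n' "$1" |
        sed -nE 's#.*github\.com[:/]([^/]+/[^/.]+).*#\1#p'
}

# Typical use inside a Git checkout (prints nothing for non-GitHub remotes):
#   REPO=$(extract_repo "$(git config --get remote.origin.url)")
extract_repo "git@github.com:facebook/react.git"      # facebook/react
extract_repo "https://github.com/microsoft/vscode"    # microsoft/vscode
```

With this pattern the repository name stops at the first dot, which strips a trailing .git but would also truncate a name such as owner/my.repo to owner/my. Setting REPO explicitly sidesteps that case.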

Validation: The system exits with an error if REPO is not set and cannot be auto-detected:

ERROR: REPO must be set or run from within a Git repository with a GitHub remote
Usage: REPO=owner/repo $0
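
A check of this kind might look like the following. It is a sketch under assumptions: the function name require_repo is hypothetical, and it returns a non-zero status instead of exiting so that it can be tried in an interactive shell. Only the two message lines come from the documented error output.

```shell
# Hypothetical sketch of the non-empty check on REPO.
# The real script exits; this version returns 1 so it is safe to source.
require_repo() {
    if [ -z "${REPO:-}" ]; then
        echo "ERROR: REPO must be set or run from within a Git repository with a GitHub remote" >&2
        echo "Usage: REPO=owner/repo $0" >&2
        return 1
    fi
}

REPO="facebook/react"
require_repo && echo "REPO accepted: $REPO"
```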

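Usage in System:

  • Passed to deepwiki-scraper.py as the REPO parameter
  • Split into REPO_OWNER and REPO_NAME
  • Basis for the BOOK_AUTHORS and GIT_REPO_URL defaults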

BOOK_TITLE

Default: "Documentation"

Purpose: Sets the title of the generated mdBook documentation. This appears in the browser tab, navigation header, and book metadata.

Usage: Injected into the title field of book.toml (build-docs.sh:87).

Examples:

  • BOOK_TITLE="React Documentation"
  • BOOK_TITLE="VS Code Internals"
  • BOOK_TITLE="Apache Arrow DataFusion Developer Guide"

Sources: build-docs.sh:23 build-docs.sh:87

BOOK_AUTHORS

Default: Repository owner extracted from REPO

Purpose: Sets the author name(s) in the mdBook documentation metadata.

Default Assignment Logic (build-docs.sh:44): Shell parameter expansion sets BOOK_AUTHORS to REPO_OWNER only if BOOK_AUTHORS is unset or empty.
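
The fallback can be illustrated with the ${var:-default} form of parameter expansion. This is a minimal sketch, assuming the owner is split off with ${REPO%%/*}; the actual script may derive REPO_OWNER and assign the default differently.

```shell
# Hypothetical sketch: derive the owner from REPO and use it as the
# fallback author. ${var:-default} substitutes when var is unset OR empty.
REPO="facebook/react"
REPO_OWNER="${REPO%%/*}"    # text before the first "/"  -> facebook
REPO_NAME="${REPO#*/}"      # text after the first "/"   -> react

unset BOOK_AUTHORS
BOOK_AUTHORS="${BOOK_AUTHORS:-$REPO_OWNER}"
echo "$BOOK_AUTHORS"        # facebook

BOOK_AUTHORS="Meta Open Source"
BOOK_AUTHORS="${BOOK_AUTHORS:-$REPO_OWNER}"
echo "$BOOK_AUTHORS"        # Meta Open Source (explicit value kept)
```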

Usage: Injected into book.toml as a single-element authors array (build-docs.sh:88).

Examples:

  • If REPO="facebook/react" and BOOK_AUTHORS not set → BOOK_AUTHORS="facebook"
  • Explicitly set: BOOK_AUTHORS="Meta Open Source"
  • Multiple authors: BOOK_AUTHORS="John Doe, Jane Smith" (rendered as single string in array)

Sources: build-docs.sh:24 build-docs.sh:44 build-docs.sh:88

GIT_REPO_URL

Default: https://github.com/{REPO}

Purpose: Provides the full GitHub repository URL used for "Edit this page" links in the generated mdBook documentation. Each page includes a link back to the source repository.

Default Assignment Logic (build-docs.sh:45): Falls back to https://github.com/{REPO} when GIT_REPO_URL is not provided.

Usage: Injected into the [output.html] section of book.toml (build-docs.sh:95).

Notes:

  • mdBook automatically appends /edit/main/ or similar paths based on its heuristics
  • The URL must be a valid Git repository URL for the edit links to work correctly
  • Can be overridden for non-standard Git hosting scenarios

Sources: build-docs.sh:25 build-docs.sh:45 build-docs.sh:95

MARKDOWN_ONLY

Default: "false"

Type: Boolean string ("true" or "false")

Purpose: Controls whether the system executes the full three-phase pipeline or stops after Phase 2 (Markdown extraction with diagram enhancement). When set to "true", Phase 3 (mdBook build) is skipped.

Execution Flow with MARKDOWN_ONLY:

flowchart TD
    Start["build-docs.sh Execution"]
    Phase1["Phase 1: Scrape & Extract\n[build-docs.sh:56-58]"]
    Phase2["Phase 2: Enhance Diagrams\n(within deepwiki-scraper.py)"]
    CheckMode{"MARKDOWN_ONLY\n== 'true'?\n[build-docs.sh:61]"}
    CopyMD["Copy markdown to /output/markdown\n[build-docs.sh:64-65]"]
    ExitEarly["Exit (skipping mdBook build)\n[build-docs.sh:75]"]
    Phase3Init["Phase 3: Initialize mdBook\n[build-docs.sh:79-106]"]
    BuildBook["Build HTML documentation\n[build-docs.sh:176]"]
    CopyAll["Copy all outputs\n[build-docs.sh:179-191]"]

    Start --> Phase1
    Phase1 --> Phase2
    Phase2 --> CheckMode
    CheckMode -->|Yes| CopyMD
    CopyMD --> ExitEarly
    CheckMode -->|No| Phase3Init
    Phase3Init --> BuildBook
    BuildBook --> CopyAll

    style ExitEarly fill:#ffebee
    style CopyAll fill:#e8f5e9

Sources: build-docs.sh:26 build-docs.sh:61-76

Use Cases:

  • Debugging diagram placement: Quickly iterate on diagram matching without waiting for mdBook build
  • Markdown-only extraction: When you only need the Markdown source files
  • Faster feedback loops: mdBook build adds significant time; skipping it speeds up testing
  • Custom processing: Extract Markdown for processing with different documentation tools

Output Differences:

| Mode | Output Directory Structure |
| --- | --- |
| MARKDOWN_ONLY="false" (default) | /output/book/ (HTML site); /output/markdown/ (source); /output/book.toml (config) |
| MARKDOWN_ONLY="true" | /output/markdown/ (source only) |

Performance Impact: Markdown-only mode is approximately 3-5x faster, as it skips Phase 3 entirely: the mdBook project initialization and the HTML build.

Sources: build-docs.sh:61-76 README.md:55-76

Internal Configuration Variables

These variables are derived or used internally and are not meant to be configured by users:

| Variable | Source | Purpose |
| --- | --- | --- |
| WORK_DIR | Hard-coded: /workspace (build-docs.sh:27) | Temporary working directory inside container |
| WIKI_DIR | Derived: $WORK_DIR/wiki (build-docs.sh:28) | Directory where deepwiki-scraper.py outputs Markdown |
| OUTPUT_DIR | Hard-coded: /output (build-docs.sh:29) | Container output directory (mounted as volume) |
| BOOK_DIR | Derived: $WORK_DIR/book (build-docs.sh:30) | mdBook project directory |
| REPO_OWNER | Extracted from REPO (build-docs.sh:40) | First component of owner/repo |
| REPO_NAME | Extracted from REPO (build-docs.sh:41) | Second component of owner/repo |

Sources: build-docs.sh:27-30 build-docs.sh:40-41

Configuration Precedence and Inheritance

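An explicit environment variable always takes precedence. When a variable is not provided, REPO is auto-detected from the Git remote, BOOK_AUTHORS and GIT_REPO_URL are derived from REPO, and BOOK_TITLE and MARKDOWN_ONLY fall back to fixed defaults.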

Sources: build-docs.sh:8-45

Example Scenarios:

  1. User provides all values:

     All explicit values used; no auto-detection occurs.

  2. User provides only REPO:

    • REPO: "facebook/react" (explicit)
    • BOOK_TITLE: "Documentation" (default)
    • BOOK_AUTHORS: "facebook" (derived from REPO)
    • GIT_REPO_URL: "https://github.com/facebook/react" (derived)
    • MARKDOWN_ONLY: "false" (default)
  3. User provides no values in Git repo:

    • REPO: Auto-detected from git config --get remote.origin.url
    • All other values derived or defaulted as above
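
The resolution order in these scenarios can be sketched as one function. This is an illustration under assumptions rather than the script's code: resolve_config is a hypothetical name, and the defaults are the ones listed in the environment variable reference.

```shell
# Hypothetical sketch of the resolution order: explicit value first,
# then a value derived from REPO, then a fixed default.
resolve_config() {
    REPO_OWNER="${REPO%%/*}"
    BOOK_TITLE="${BOOK_TITLE:-Documentation}"
    BOOK_AUTHORS="${BOOK_AUTHORS:-$REPO_OWNER}"
    GIT_REPO_URL="${GIT_REPO_URL:-https://github.com/$REPO}"
    MARKDOWN_ONLY="${MARKDOWN_ONLY:-false}"
}

# Case where only REPO is provided.
unset BOOK_TITLE BOOK_AUTHORS GIT_REPO_URL MARKDOWN_ONLY
REPO="facebook/react"
resolve_config
echo "$BOOK_TITLE | $BOOK_AUTHORS | $GIT_REPO_URL | $MARKDOWN_ONLY"
# Documentation | facebook | https://github.com/facebook/react | false
```

Calling resolve_config again after setting a variable explicitly leaves that value untouched, which matches the case where the user provides all values.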

Generated Configuration Files

The system generates configuration files dynamically based on environment variables:

book.toml

Location: Created at $BOOK_DIR/book.toml (build-docs.sh:85), copied to /output/book.toml (build-docs.sh:191)

Template Structure:

Sources: build-docs.sh:85-103

Variable Substitution Mapping:

| Template Variable | Environment Variable | Section |
| --- | --- | --- |
| ${BOOK_TITLE} | $BOOK_TITLE | [book] |
| ${BOOK_AUTHORS} | $BOOK_AUTHORS | [book] |
| ${GIT_REPO_URL} | $GIT_REPO_URL | [output.html] |
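
The substitution mapping can be sketched with a here-document. This is a minimal sketch under assumptions: write_book_toml is a hypothetical name, the key names title, authors, and git-repository-url are mdBook's documented options for the sections listed in the table, and the real template may contain further hard-coded settings not shown here.

```shell
# Hypothetical sketch of writing book.toml from the three variables.
# Key names are assumed from mdBook's [book] and [output.html] options.
write_book_toml() {
    # $1 = destination directory
    cat > "$1/book.toml" <<EOF
[book]
title = "${BOOK_TITLE}"
authors = ["${BOOK_AUTHORS}"]

[output.html]
git-repository-url = "${GIT_REPO_URL}"
EOF
}

BOOK_TITLE="React Documentation"
BOOK_AUTHORS="John Doe, Jane Smith"   # becomes ONE array element
GIT_REPO_URL="https://github.com/facebook/react"

tmpdir=$(mktemp -d)
write_book_toml "$tmpdir"
cat "$tmpdir/book.toml"
```

Because the values are interpolated as-is, a title or author containing a double quote would produce invalid TOML in this sketch.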

Hard-Coded Values:

SUMMARY.md

Location: Created at $BOOK_DIR/src/SUMMARY.md (build-docs.sh:159)

Generation: Generated automatically from the file structure in $WIKI_DIR; no environment variable feeds into it directly. See SUMMARY.md Generation for details.

Sources: build-docs.sh:109-159

Configuration Examples

Minimal Configuration

Results:

  • REPO: "owner/repo"
  • BOOK_TITLE: "Documentation"
  • BOOK_AUTHORS: "owner"
  • GIT_REPO_URL: "https://github.com/owner/repo"
  • MARKDOWN_ONLY: "false"

Full Custom Configuration

Auto-Detected Configuration

Note: This only works if the current directory is a Git repository with a GitHub remote URL configured.

Debugging Configuration

Outputs only Markdown files to /output/markdown/, skipping the mdBook build phase.

Sources: README.md:28-88
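
The invocations for these scenarios can be sketched as docker commands. The image tag deepwiki-to-mdbook and the host directory ./output are hypothetical; the -e flags follow the configuration flow diagram and the /output mount follows the internal variables table. The helper only prints each command so it can be reviewed and adapted before running.

```shell
# Hypothetical image tag; replace with the image you built or pulled.
IMAGE="deepwiki-to-mdbook"

# Print a docker command that passes each VAR=value argument with -e and
# mounts ./output at /output. Nothing is executed.
docker_cmd() {
    printf 'docker run --rm'
    for pair in "$@"; do
        printf " -e '%s'" "$pair"
    done
    printf ' -v "$PWD/output:/output" %s\n' "$IMAGE"
}

# Minimal: only REPO.
docker_cmd REPO=owner/repo

# Full custom configuration.
docker_cmd REPO=facebook/react \
    "BOOK_TITLE=React Documentation" \
    "BOOK_AUTHORS=Meta Open Source" \
    GIT_REPO_URL=https://github.com/facebook/react

# Debugging: Markdown only.
docker_cmd REPO=owner/repo MARKDOWN_ONLY=true
```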

Configuration Validation

The system validates only the REPO variable (build-docs.sh:32-37).

Validation Rules:

  • REPO must be non-empty after auto-detection
  • No format validation is performed on REPO value (e.g., owner/repo pattern)
  • Invalid REPO values will cause failures during scraping phase, not during validation

Other Variables:

  • No validation performed on BOOK_TITLE, BOOK_AUTHORS, or GIT_REPO_URL
  • MARKDOWN_ONLY is not validated; any value other than "true" is treated as false
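
The effect of the unvalidated MARKDOWN_ONLY value can be sketched as an exact string comparison. The function name markdown_only_enabled is hypothetical, and case sensitivity is an assumption that follows from comparing against the literal "true".

```shell
# Hypothetical sketch: an exact string comparison against "true".
markdown_only_enabled() {
    [ "${MARKDOWN_ONLY:-false}" = "true" ]
}

# Only the exact value "true" enables Markdown-only mode.
for value in true TRUE 1 yes false ""; do
    MARKDOWN_ONLY="$value"
    if markdown_only_enabled; then
        echo "'$value' -> skip mdBook build"
    else
        echo "'$value' -> full build"
    fi
done
```

Under this comparison, values such as "TRUE", "1", or "yes" silently select the full build, so it is worth checking the startup output when the mode does not behave as expected.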

Sources: build-docs.sh:32-37

Configuration Debugging

To debug configuration values, check the console output at startup (build-docs.sh:47-53):

Configuration:
  Repository:    facebook/react
  Book Title:    React Documentation
  Authors:       Meta Open Source
  Git Repo URL:  https://github.com/facebook/react
  Markdown Only: false

This output shows the final resolved configuration values after auto-detection, derivation, and defaults are applied.
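
A summary in this format could be produced with printf column padding. This is a sketch only: print_config is a hypothetical name, and the script may build the block differently.

```shell
# Hypothetical sketch of a startup summary in the format shown above.
# %-14s left-aligns each label in a 14-character column.
print_config() {
    echo "Configuration:"
    printf '  %-14s %s\n' "Repository:"    "$REPO"
    printf '  %-14s %s\n' "Book Title:"    "$BOOK_TITLE"
    printf '  %-14s %s\n' "Authors:"       "$BOOK_AUTHORS"
    printf '  %-14s %s\n' "Git Repo URL:"  "$GIT_REPO_URL"
    printf '  %-14s %s\n' "Markdown Only:" "$MARKDOWN_ONLY"
}

REPO="facebook/react"
BOOK_TITLE="React Documentation"
BOOK_AUTHORS="Meta Open Source"
GIT_REPO_URL="https://github.com/facebook/react"
MARKDOWN_ONLY="false"
print_config
```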

Sources: build-docs.sh:47-53