Configuration Reference
This document provides a comprehensive reference for all configuration options available in the DeepWiki-to-mdBook Converter system. It covers environment variables, their default values, validation logic, auto-detection features, and how configuration flows through the system components.
For information about running the system with these configurations, see Quick Start. For details on how auto-detection works internally, see Auto-Detection Features.
Configuration System Overview
The DeepWiki-to-mdBook Converter uses environment variables as its sole configuration mechanism. All configuration is processed by the build-docs.sh orchestrator script at runtime, with no configuration files required. Every variable except REPO has a fixed or derived default, and REPO is auto-detected from the Git remote when possible, so many runs need no explicit configuration.
Configuration Flow Diagram
flowchart TD
User["User/CI System"]
Docker["docker run -e VAR=value"]
subgraph "build-docs.sh Configuration Processing"
AutoDetect["Git Auto-Detection\n[build-docs.sh:8-19]"]
ParseEnv["Environment Variable Parsing\n[build-docs.sh:21-26]"]
Defaults["Default Value Assignment\n[build-docs.sh:43-45]"]
Validate["Validation\n[build-docs.sh:32-37]"]
end
subgraph "Configuration Consumers"
Scraper["deepwiki-scraper.py\nREPO parameter"]
BookToml["book.toml Generation\n[build-docs.sh:85-103]"]
SummaryGen["SUMMARY.md Generation\n[build-docs.sh:113-159]"]
end
User -->|Set environment variables| Docker
Docker -->|Container startup| AutoDetect
AutoDetect -->|REPO detection| ParseEnv
ParseEnv -->|Parse all vars| Defaults
Defaults -->|Apply defaults| Validate
Validate -->|REPO validated| Scraper
Validate -->|BOOK_TITLE, BOOK_AUTHORS, GIT_REPO_URL| BookToml
Validate -->|No direct config needed| SummaryGen
Sources: build-docs.sh:1-206 README.md:41-51
Environment Variables Reference
The following table lists all environment variables supported by the system:
| Variable | Type | Required | Default | Description |
|---|---|---|---|---|
| REPO | String | Conditional | Auto-detected from Git remote | GitHub repository in owner/repo format. Required if not running in a Git repository with a GitHub remote. |
| BOOK_TITLE | String | No | "Documentation" | Title displayed in the generated mdBook documentation. Used in book.toml title field. |
| BOOK_AUTHORS | String | No | Repository owner (from REPO) | Author name(s) displayed in the documentation. Used in book.toml authors array. |
| GIT_REPO_URL | String | No | https://github.com/{REPO} | Full GitHub repository URL. Used for "Edit this page" links in mdBook output. |
| MARKDOWN_ONLY | Boolean | No | "false" | When "true", skips Phase 3 (mdBook build) and outputs only extracted Markdown files. Useful for debugging. |
Sources: build-docs.sh:21-26 README.md:44-51
Variable Details and Usage
REPO
Format: owner/repo (e.g., "facebook/react" or "microsoft/vscode")
Purpose: Identifies the GitHub repository to scrape from DeepWiki.com. This is the primary configuration variable that drives the entire system.
flowchart TD
Start["build-docs.sh Startup"]
CheckEnv{"REPO environment\nvariable set?"}
UseEnv["Use provided REPO value\n[build-docs.sh:22]"]
CheckGit{"Git repository\ndetected?"}
GetRemote["Execute: git config --get\nremote.origin.url\n[build-docs.sh:12]"]
ParseURL["Extract owner/repo using regex:\n.*github\\.com[:/]([^/]+/[^/\\.]+)\n[build-docs.sh:16]"]
SetRepo["Set REPO variable\n[build-docs.sh:16]"]
ValidateRepo{"REPO is set?"}
Error["Exit with error\n[build-docs.sh:33-37]"]
Continue["Continue with\nREPO=$REPO_OWNER/$REPO_NAME"]
Start --> CheckEnv
CheckEnv -->|Yes| UseEnv
CheckEnv -->|No| CheckGit
CheckGit -->|Yes| GetRemote
CheckGit -->|No| ValidateRepo
GetRemote --> ParseURL
ParseURL --> SetRepo
UseEnv --> ValidateRepo
SetRepo --> ValidateRepo
ValidateRepo -->|No| Error
ValidateRepo -->|Yes| Continue
Auto-Detection Logic:
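The detection path in the diagram can be sketched roughly as follows. This is an illustration, not the script's actual code: the function name extract_github_repo is hypothetical, and sed stands in for whatever matching build-docs.sh performs. Only the regular expression and the git config call are taken from the diagram.

```shell
# Hypothetical helper: print owner/repo for a GitHub remote URL,
# or print nothing if the URL does not point at github.com.
extract_github_repo() {
  printf '%s\n' "$1" |
    sed -nE 's#.*github\.com[:/]([^/]+/[^/.]+).*#\1#p'
}

# Only attempt detection when REPO was not supplied explicitly.
if [ -z "${REPO:-}" ]; then
  remote_url=$(git config --get remote.origin.url 2>/dev/null || true)
  REPO=$(extract_github_repo "$remote_url")
fi
```

Because the character class excludes dots, a repository name that itself contains a dot would be cut at the first dot under this pattern.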
Sources: build-docs.sh:8-37
Validation: The system exits with an error if REPO is not set and cannot be auto-detected:
ERROR: REPO must be set or run from within a Git repository with a GitHub remote
Usage: REPO=owner/repo $0
Usage in System:
- Passed as first argument to deepwiki-scraper.py build-docs.sh:58
- Used to derive REPO_OWNER and REPO_NAME build-docs.sh:40-41
- Used to construct the GIT_REPO_URL default build-docs.sh:45
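How REPO is split into its two components is not shown on this page. One common approach, assuming REPO contains exactly one slash, is POSIX parameter expansion. The function name split_repo and the expansions below are assumptions, not a quotation from build-docs.sh.

```shell
# Hypothetical helper: split an owner/repo string into two variables.
split_repo() {
  REPO_OWNER="${1%%/*}"   # drop the first "/" and everything after it
  REPO_NAME="${1#*/}"     # drop everything up to and including the first "/"
}

split_repo "facebook/react"
echo "owner=$REPO_OWNER name=$REPO_NAME"
```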
BOOK_TITLE
Default: "Documentation"
Purpose: Sets the title of the generated mdBook documentation. This appears in the browser tab, navigation header, and book metadata.
Usage: Injected into the title field of the generated book.toml build-docs.sh:87.
Examples:
- BOOK_TITLE="React Documentation"
- BOOK_TITLE="VS Code Internals"
- BOOK_TITLE="Apache Arrow DataFusion Developer Guide"
Sources: build-docs.sh:23 build-docs.sh:87
BOOK_AUTHORS
Default: Repository owner extracted from REPO
Purpose: Sets the author name(s) in the mdBook documentation metadata.
Default Assignment Logic: build-docs.sh:44
Shell parameter expansion sets BOOK_AUTHORS to REPO_OWNER only if BOOK_AUTHORS is unset or empty.
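A minimal sketch of such an expansion, assuming the ${VAR:-fallback} form, where the colon is what makes an empty value count as missing. The resolve_authors wrapper is hypothetical and exists only to make the two cases easy to compare.

```shell
# Hypothetical wrapper around the default-assignment expansion.
# $1 = value supplied for BOOK_AUTHORS (may be empty), $2 = REPO_OWNER
resolve_authors() {
  printf '%s\n' "${1:-$2}"
}

resolve_authors ""                 "facebook"   # falls back to the owner
resolve_authors "Meta Open Source" "facebook"   # explicit value wins
```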
Usage: Injected into book.toml as the single element of the authors array build-docs.sh:88.
Examples:
- If REPO="facebook/react" and BOOK_AUTHORS is not set → BOOK_AUTHORS="facebook"
- Explicitly set: BOOK_AUTHORS="Meta Open Source"
- Multiple authors: BOOK_AUTHORS="John Doe, Jane Smith" (rendered as a single string in the array)
Sources: build-docs.sh:24 build-docs.sh:44 build-docs.sh:88
GIT_REPO_URL
Default: https://github.com/{REPO}
Purpose: Provides the full GitHub repository URL used for "Edit this page" links in the generated mdBook documentation. Each page includes a link back to the source repository.
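Default Assignment Logic: When GIT_REPO_URL is not provided, it is set to https://github.com/ followed by the value of REPO build-docs.sh:45.
Usage: Injected into the [output.html] section of book.toml build-docs.sh:95.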
Notes:
- mdBook automatically appends /edit/main/ or similar paths based on its heuristics
- The URL must be a valid Git repository URL for the edit links to work correctly
- Can be overridden for non-standard Git hosting scenarios
Sources: build-docs.sh:25 build-docs.sh:45 build-docs.sh:95
MARKDOWN_ONLY
Default: "false"
Type: Boolean string ("true" or "false")
Purpose: Controls whether the system executes the full three-phase pipeline or stops after Phase 2 (Markdown extraction with diagram enhancement). When set to "true", Phase 3 (mdBook build) is skipped.
flowchart TD
Start["build-docs.sh Execution"]
Phase1["Phase 1: Scrape & Extract\n[build-docs.sh:56-58]"]
Phase2["Phase 2: Enhance Diagrams\n(within deepwiki-scraper.py)"]
CheckMode{"MARKDOWN_ONLY\n== 'true'?\n[build-docs.sh:61]"}
CopyMD["Copy markdown to /output/markdown\n[build-docs.sh:64-65]"]
ExitEarly["Exit (skipping mdBook build)\n[build-docs.sh:75]"]
Phase3Init["Phase 3: Initialize mdBook\n[build-docs.sh:79-106]"]
BuildBook["Build HTML documentation\n[build-docs.sh:176]"]
CopyAll["Copy all outputs\n[build-docs.sh:179-191]"]
Start --> Phase1
Phase1 --> Phase2
Phase2 --> CheckMode
CheckMode -->|Yes| CopyMD
CopyMD --> ExitEarly
CheckMode -->|No| Phase3Init
Phase3Init --> BuildBook
BuildBook --> CopyAll
style ExitEarly fill:#ffebee
style CopyAll fill:#e8f5e9
Execution Flow with MARKDOWN_ONLY:
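The branch in the diagram might look like the following sketch. The function name handle_markdown_only and its argument order are hypothetical; in the container the copy would go from $WIKI_DIR to /output/markdown.

```shell
# Hypothetical helper. Returns 0 after copying when Markdown-only mode is
# on; returns 1 when the caller should continue to the mdBook build.
# $1 = MARKDOWN_ONLY value, $2 = source directory, $3 = output directory
handle_markdown_only() {
  [ "$1" = "true" ] || return 1   # only the exact string "true" enables it
  mkdir -p "$3/markdown"
  cp -R "$2"/. "$3/markdown/"
}
```

A caller could then stop early with a line such as: handle_markdown_only "${MARKDOWN_ONLY:-false}" "$WIKI_DIR" "$OUTPUT_DIR" && exit 0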
Sources: build-docs.sh:26 build-docs.sh:61-76
Use Cases:
- Debugging diagram placement: Quickly iterate on diagram matching without waiting for mdBook build
- Markdown-only extraction: When you only need the Markdown source files
- Faster feedback loops: mdBook build adds significant time; skipping it speeds up testing
- Custom processing: Extract Markdown for processing with different documentation tools
Output Differences:
| Mode | Output Directory Structure |
|---|---|
| MARKDOWN_ONLY="false" (default) | /output/book/ (HTML site), /output/markdown/ (source), /output/book.toml (config) |
| MARKDOWN_ONLY="true" | /output/markdown/ (source only) |
Performance Impact: Markdown-only mode is approximately 3-5x faster, as it skips:
- mdBook initialization build-docs.sh:79-106
- SUMMARY.md generation build-docs.sh:109-159
- File copying to book/src build-docs.sh:164-166
- mdbook-mermaid asset installation build-docs.sh:169-171
- mdBook HTML build build-docs.sh:174-176
Sources: build-docs.sh:61-76 README.md:55-76
Internal Configuration Variables
These variables are derived or used internally and are not meant to be configured by users:
| Variable | Source | Purpose |
|---|---|---|
| WORK_DIR | Hard-coded: /workspace build-docs.sh:27 | Temporary working directory inside container |
| WIKI_DIR | Derived: $WORK_DIR/wiki build-docs.sh:28 | Directory where deepwiki-scraper.py outputs Markdown |
| OUTPUT_DIR | Hard-coded: /output build-docs.sh:29 | Container output directory (mounted as volume) |
| BOOK_DIR | Derived: $WORK_DIR/book build-docs.sh:30 | mdBook project directory |
| REPO_OWNER | Extracted from REPO build-docs.sh:40 | First component of owner/repo |
| REPO_NAME | Extracted from REPO build-docs.sh:41 | Second component of owner/repo |
Sources: build-docs.sh:27-30 build-docs.sh:40-41
Configuration Precedence and Inheritance
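The system follows this precedence order for configuration values, from highest to lowest:
- An explicitly set environment variable
- Git auto-detection (REPO only)
- A value derived from REPO (BOOK_AUTHORS, GIT_REPO_URL)
- A fixed default (BOOK_TITLE, MARKDOWN_ONLY)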
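Put together, the resolution order could be sketched as a single function. The name resolve_config and its argument are hypothetical, and the expansions are an assumption about style rather than a copy of build-docs.sh; the default values are the ones documented in the variables table.

```shell
# Hypothetical function; $1 is the owner/repo found by Git auto-detection
# (empty if none). Each line keeps an existing value and otherwise falls back.
resolve_config() {
  REPO="${REPO:-$1}"
  REPO_OWNER="${REPO%%/*}"
  BOOK_TITLE="${BOOK_TITLE:-Documentation}"
  BOOK_AUTHORS="${BOOK_AUTHORS:-$REPO_OWNER}"
  GIT_REPO_URL="${GIT_REPO_URL:-https://github.com/$REPO}"
  MARKDOWN_ONLY="${MARKDOWN_ONLY:-false}"
}

# Only REPO is supplied; everything else is derived or defaulted.
unset BOOK_TITLE BOOK_AUTHORS GIT_REPO_URL MARKDOWN_ONLY
REPO="facebook/react"
resolve_config ""
echo "$BOOK_TITLE | $BOOK_AUTHORS | $GIT_REPO_URL | $MARKDOWN_ONLY"
```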
Sources: build-docs.sh:8-45
Example Scenarios:
- User provides all values: All explicit values are used; no auto-detection occurs.
- User provides only REPO:
  - REPO: "facebook/react" (explicit)
  - BOOK_TITLE: "Documentation" (default)
  - BOOK_AUTHORS: "facebook" (derived from REPO)
  - GIT_REPO_URL: "https://github.com/facebook/react" (derived)
  - MARKDOWN_ONLY: "false" (default)
- User provides no values in Git repo:
  - REPO: Auto-detected from git config --get remote.origin.url
  - All other values derived or defaulted as above
Generated Configuration Files
The system generates configuration files dynamically based on environment variables:
book.toml
Location: Created at $BOOK_DIR/book.toml build-docs.sh:85 and copied to /output/book.toml build-docs.sh:191
Template Structure:
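The sketch below reconstructs a plausible template from the substitution mapping and hard-coded values listed in this section. The function name write_book_toml, the exact set of keys, and their order are assumptions; the key names themselves (git-repository-url, default-theme, the fold table, and the mdbook-mermaid command) are standard mdBook and mdbook-mermaid settings.

```shell
# Hypothetical function: write a book.toml into directory $1 using the
# current BOOK_TITLE, BOOK_AUTHORS and GIT_REPO_URL values.
write_book_toml() {
  cat > "$1/book.toml" <<EOF
[book]
title = "$BOOK_TITLE"
authors = ["$BOOK_AUTHORS"]
language = "en"

[output.html]
default-theme = "rust"
git-repository-url = "$GIT_REPO_URL"

[preprocessor.mermaid]
command = "mdbook-mermaid"

[output.html.fold]
enable = true
level = 1
EOF
}
```

With plain substitution like this, a title or author value containing a double quote would produce invalid TOML, so such values would need escaping first.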
Sources: build-docs.sh:85-103
Variable Substitution Mapping:
| Template Variable | Environment Variable | Section |
|---|---|---|
| ${BOOK_TITLE} | $BOOK_TITLE | [book] |
| ${BOOK_AUTHORS} | $BOOK_AUTHORS | [book] |
| ${GIT_REPO_URL} | $GIT_REPO_URL | [output.html] |
Hard-Coded Values:
- language = "en" build-docs.sh:89
- default-theme = "rust" build-docs.sh:94
- [preprocessor.mermaid] configuration build-docs.sh:97-98
- Sidebar folding enabled at level 1 build-docs.sh:100-102
SUMMARY.md
Location: Created at $BOOK_DIR/src/SUMMARY.md build-docs.sh:159
Generation: Generated automatically from the file structure in $WIKI_DIR; no environment variable feeds into it directly. See SUMMARY.md Generation for details.
Sources: build-docs.sh:109-159
Configuration Examples
Minimal Configuration
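A minimal invocation might look like the sketch below. The image name deepwiki-to-mdbook and the host directory ./output are placeholders, not values confirmed by this page; only the REPO variable and the /output mount point come from the reference above. The command is assembled and printed first so it can be checked before running.

```shell
IMAGE="deepwiki-to-mdbook"   # placeholder image name (assumption)

# Only REPO is set, so every other value falls back to its default
# or is derived from REPO.
set -- docker run --rm \
  -e REPO="owner/repo" \
  -v "$PWD/output:/output" \
  "$IMAGE"

preview="$*"
echo "$preview"   # dry run: show the command
# "$@"            # uncomment to execute it
```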
Results:
- REPO: "owner/repo"
- BOOK_TITLE: "Documentation"
- BOOK_AUTHORS: "owner"
- GIT_REPO_URL: "https://github.com/owner/repo"
- MARKDOWN_ONLY: "false"
Full Custom Configuration
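Setting every variable explicitly could look like this sketch. As before, the image name and host output directory are placeholders (assumptions); the variable names are the five documented above, and the values reuse examples from this page.

```shell
IMAGE="deepwiki-to-mdbook"   # placeholder image name (assumption)

# Every documented variable is passed explicitly; quoting keeps values
# that contain spaces as a single argument.
set -- docker run --rm \
  -e REPO="facebook/react" \
  -e BOOK_TITLE="React Documentation" \
  -e BOOK_AUTHORS="Meta Open Source" \
  -e GIT_REPO_URL="https://github.com/facebook/react" \
  -e MARKDOWN_ONLY="false" \
  -v "$PWD/output:/output" \
  "$IMAGE"

preview="$*"
echo "$preview"   # dry run: show the command
# "$@"            # uncomment to execute it
```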
Auto-Detected Configuration
Note: This only works if the current directory is a Git repository with a GitHub remote URL configured.
Debugging Configuration
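A debugging run differs from the minimal one only in setting MARKDOWN_ONLY. The sketch below uses a placeholder image name and host output directory (both assumptions).

```shell
IMAGE="deepwiki-to-mdbook"   # placeholder image name (assumption)

# MARKDOWN_ONLY must be the exact string "true" to skip the mdBook build.
set -- docker run --rm \
  -e REPO="owner/repo" \
  -e MARKDOWN_ONLY="true" \
  -v "$PWD/output:/output" \
  "$IMAGE"

preview="$*"
echo "$preview"   # dry run: show the command
# "$@"            # uncomment to execute it
```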
Outputs only Markdown files to /output/markdown/, skipping the mdBook build phase.
Sources: README.md:28-88
Configuration Validation
The system performs validation on the REPO variable build-docs.sh:32-37:
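A sketch of such a check, assuming a simple emptiness test. The function name validate_repo is hypothetical; the messages are the ones quoted in the REPO section. The script exits on failure, whereas this sketch returns a status so it can be tried safely in an interactive shell.

```shell
# Hypothetical function: succeed when $1 (the resolved REPO) is non-empty,
# otherwise print the error and usage lines to stderr and fail.
validate_repo() {
  if [ -z "$1" ]; then
    echo "ERROR: REPO must be set or run from within a Git repository with a GitHub remote" >&2
    echo "Usage: REPO=owner/repo $0" >&2
    return 1
  fi
}
```

Under this sketch, any non-empty string passes, including one that does not follow the owner/repo pattern.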
Validation Rules:
- REPO must be non-empty after auto-detection
- No format validation is performed on the REPO value (e.g., owner/repo pattern)
- Invalid REPO values will cause failures during the scraping phase, not during validation
Other Variables:
- No validation is performed on BOOK_TITLE, BOOK_AUTHORS, or GIT_REPO_URL
- MARKDOWN_ONLY is not validated; any value other than "true" is treated as false
Sources: build-docs.sh:32-37
Configuration Debugging
To debug configuration values, check the console output at startup build-docs.sh:47-53:
Configuration:
Repository: facebook/react
Book Title: React Documentation
Authors: Meta Open Source
Git Repo URL: https://github.com/facebook/react
Markdown Only: false
This output shows the final resolved configuration values after auto-detection, derivation, and defaults are applied.
Sources: build-docs.sh:47-53