This documentation is part of the "Projects with Books" initiative at zenOSmosis.
The source code for this project is available on GitHub.
Configuration Reference
This document provides a comprehensive reference for all configuration options available in the DeepWiki-to-mdBook Converter system. It covers environment variables, their default values, validation logic, auto-detection features, and how configuration flows through the system components.
For information about running the system with these configurations, see Quick Start. For details on how auto-detection works internally, see Auto-Detection Features.
Configuration System Overview
The DeepWiki-to-mdBook Converter uses environment variables as its sole configuration mechanism. The build-docs.sh orchestrator script processes all configuration at runtime, and no configuration files are required. Defaults and Git-based auto-detection keep the required configuration to a minimum.
Configuration Flow Diagram
flowchart TD
User["User/CI System"]
Docker["docker run -e VAR=value"]
subgraph "build-docs.sh Configuration Processing"
AutoDetect["Git Auto-Detection\n[build-docs.sh:8-19]"]
ParseEnv["Environment Variable Parsing\n[build-docs.sh:21-26]"]
Defaults["Default Value Assignment\n[build-docs.sh:43-45]"]
Validate["Validation\n[build-docs.sh:32-37]"]
end
subgraph "Configuration Consumers"
Scraper["deepwiki-scraper.py\nREPO parameter"]
BookToml["book.toml Generation\n[build-docs.sh:85-103]"]
SummaryGen["SUMMARY.md Generation\n[build-docs.sh:113-159]"]
end
User -->|Set environment variables| Docker
Docker -->|Container startup| AutoDetect
AutoDetect -->|REPO detection| ParseEnv
ParseEnv -->|Parse all vars| Defaults
Defaults -->|Apply defaults| Validate
Validate -->|REPO validated| Scraper
Validate -->|BOOK_TITLE, BOOK_AUTHORS, GIT_REPO_URL| BookToml
Validate -->|No direct config needed| SummaryGen
Sources: build-docs.sh:1-206 README.md:41-51
Environment Variables Reference
The following table lists all environment variables supported by the system:
| Variable | Type | Required | Default | Description |
|---|---|---|---|---|
| REPO | String | Conditional | Auto-detected from Git remote | GitHub repository in owner/repo format. Required if not running in a Git repository with a GitHub remote. |
| BOOK_TITLE | String | No | "Documentation" | Title displayed in the generated mdBook documentation. Used in book.toml title field. |
| BOOK_AUTHORS | String | No | Repository owner (from REPO) | Author name(s) displayed in the documentation. Used in book.toml authors array. |
| GIT_REPO_URL | String | No | https://github.com/{REPO} | Full GitHub repository URL. Used for “Edit this page” links in mdBook output. |
| MARKDOWN_ONLY | Boolean | No | "false" | When "true", skips Phase 3 (mdBook build) and outputs only extracted Markdown files. Useful for debugging. |
Sources: build-docs.sh:21-26 README.md:44-51
Variable Details and Usage
REPO
Format: owner/repo (e.g., "facebook/react" or "microsoft/vscode")
Purpose: Identifies the GitHub repository to scrape from DeepWiki.com. This is the primary configuration variable that drives the entire system.
flowchart TD
Start["build-docs.sh Startup"]
CheckEnv{"REPO environment\nvariable set?"}
UseEnv["Use provided REPO value\n[build-docs.sh:22]"]
CheckGit{"Git repository\ndetected?"}
GetRemote["Execute: git config --get\nremote.origin.url\n[build-docs.sh:12]"]
ParseURL["Extract owner/repo using regex:\n.*github\\.com[:/]([^/]+/[^/\\.]+)\n[build-docs.sh:16]"]
SetRepo["Set REPO variable\n[build-docs.sh:16]"]
ValidateRepo{"REPO is set?"}
Error["Exit with error\n[build-docs.sh:33-37]"]
Continue["Continue with\nREPO=$REPO_OWNER/$REPO_NAME"]
Start --> CheckEnv
CheckEnv -->|Yes| UseEnv
CheckEnv -->|No| CheckGit
CheckGit -->|Yes| GetRemote
CheckGit -->|No| ValidateRepo
GetRemote --> ParseURL
ParseURL --> SetRepo
UseEnv --> ValidateRepo
SetRepo --> ValidateRepo
ValidateRepo -->|No| Error
ValidateRepo -->|Yes| Continue
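Auto-Detection Logic: The flowchart above traces how REPO is resolved, first from the environment and, failing that, from the Git remote URL.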
Sources: build-docs.sh:8-37
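The URL-parsing step of auto-detection can be sketched as follows. This is a minimal illustration under stated assumptions, not the contents of build-docs.sh: the function name `repo_from_remote_url` is hypothetical, and `sed` is used here simply as one way to apply the pattern shown in the flowchart.

```shell
# Hypothetical helper: extract owner/repo from a GitHub remote URL.
# Applies the pattern from the flowchart; prints nothing when the URL
# does not contain github.com.
repo_from_remote_url() {
  printf '%s\n' "$1" | sed -n -E 's#.*github\.com[:/]([^/]+/[^/\.]+).*#\1#p'
}

# HTTPS and SSH remotes resolve to the same owner/repo pair.
repo_from_remote_url "https://github.com/facebook/react.git"
repo_from_remote_url "git@github.com:facebook/react.git"

# In a real checkout the URL would come from:
#   git config --get remote.origin.url
```

With the pattern as shown, the repository name stops at the first dot, so a name that itself contains a dot would be cut short. Setting REPO explicitly sidesteps that case.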
Validation: The system exits with an error if REPO is not set and cannot be auto-detected:
ERROR: REPO must be set or run from within a Git repository with a GitHub remote
Usage: REPO=owner/repo $0
Usage in System:
- Passed as first argument to deepwiki-scraper.py (build-docs.sh:58)
- Used to derive REPO_OWNER and REPO_NAME (build-docs.sh:40-41)
- Used to construct the GIT_REPO_URL default (build-docs.sh:45)
BOOK_TITLE
Default: "Documentation"
Purpose: Sets the title of the generated mdBook documentation. This appears in the browser tab, navigation header, and book metadata.
Usage: Injected into the title field of the generated book.toml (build-docs.sh:87).
Examples:
- BOOK_TITLE="React Documentation"
- BOOK_TITLE="VS Code Internals"
- BOOK_TITLE="Apache Arrow DataFusion Developer Guide"
Sources: build-docs.sh:23 build-docs.sh:87
BOOK_AUTHORS
Default: Repository owner extracted from REPO
Purpose: Sets the author name(s) in the mdBook documentation metadata.
Default Assignment Logic (build-docs.sh:44): Shell parameter expansion sets BOOK_AUTHORS to REPO_OWNER only if BOOK_AUTHORS is unset or empty.
Usage: Injected into book.toml as a single-element authors array (build-docs.sh:88).
Examples:
- If REPO="facebook/react" and BOOK_AUTHORS is not set → BOOK_AUTHORS="facebook"
- Explicitly set: BOOK_AUTHORS="Meta Open Source"
- Multiple authors: BOOK_AUTHORS="John Doe, Jane Smith" (rendered as a single string in the array)
Sources: build-docs.sh:24 build-docs.sh:44 build-docs.sh:88
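The "unset or empty" behavior described above matches the `${VAR:-default}` form of POSIX parameter expansion. The sketch below assumes that form; the exact expression used in build-docs.sh is not reproduced here, and the `from_*` variable names are illustrative only.

```shell
REPO_OWNER="facebook"   # normally derived from REPO

# Case 1: BOOK_AUTHORS is unset, so the repository owner is used.
unset BOOK_AUTHORS
from_unset="${BOOK_AUTHORS:-$REPO_OWNER}"

# Case 2: BOOK_AUTHORS is set but empty; ':-' treats this like unset.
BOOK_AUTHORS=""
from_empty="${BOOK_AUTHORS:-$REPO_OWNER}"

# Case 3: an explicit value is kept unchanged.
BOOK_AUTHORS="Meta Open Source"
from_explicit="${BOOK_AUTHORS:-$REPO_OWNER}"

echo "unset    -> $from_unset"
echo "empty    -> $from_empty"
echo "explicit -> $from_explicit"
```

The colon matters: `${VAR-default}` without it would leave an empty BOOK_AUTHORS empty, which would produce an empty author string in book.toml.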
GIT_REPO_URL
Default: https://github.com/{REPO}
Purpose: Provides the full GitHub repository URL used for “Edit this page” links in the generated mdBook documentation. Each page includes a link back to the source repository.
Default Assignment Logic (build-docs.sh:45): When GIT_REPO_URL is not provided, it is built by appending REPO to https://github.com/.
Usage: Injected into the [output.html] section of book.toml (build-docs.sh:95).
Notes:
- mdBook automatically appends /edit/main/ or similar paths based on its heuristics
- The URL must be a valid Git repository URL for the edit links to work correctly
- Can be overridden for non-standard Git hosting scenarios
Sources: build-docs.sh:25 build-docs.sh:45 build-docs.sh:95
MARKDOWN_ONLY
Default: "false"
Type: Boolean string ("true" or "false")
Purpose: Controls whether the system executes the full three-phase pipeline or stops after Phase 2 (Markdown extraction with diagram enhancement). When set to "true", Phase 3 (mdBook build) is skipped.
flowchart TD
Start["build-docs.sh Execution"]
Phase1["Phase 1: Scrape & Extract\n[build-docs.sh:56-58]"]
Phase2["Phase 2: Enhance Diagrams\n(within deepwiki-scraper.py)"]
CheckMode{"MARKDOWN_ONLY\n== 'true'?\n[build-docs.sh:61]"}
CopyMD["Copy markdown to /output/markdown\n[build-docs.sh:64-65]"]
ExitEarly["Exit (skipping mdBook build)\n[build-docs.sh:75]"]
Phase3Init["Phase 3: Initialize mdBook\n[build-docs.sh:79-106]"]
BuildBook["Build HTML documentation\n[build-docs.sh:176]"]
CopyAll["Copy all outputs\n[build-docs.sh:179-191]"]
Start --> Phase1
Phase1 --> Phase2
Phase2 --> CheckMode
CheckMode -->|Yes| CopyMD
CopyMD --> ExitEarly
CheckMode -->|No| Phase3Init
Phase3Init --> BuildBook
BuildBook --> CopyAll
style ExitEarly fill:#ffebee
style CopyAll fill:#e8f5e9
Execution Flow with MARKDOWN_ONLY: The flowchart above shows where the pipeline branches after Phase 2.
Sources: build-docs.sh:26 build-docs.sh:61-76
Use Cases:
- Debugging diagram placement: Quickly iterate on diagram matching without waiting for mdBook build
- Markdown-only extraction: When you only need the Markdown source files
- Faster feedback loops: mdBook build adds significant time; skipping it speeds up testing
- Custom processing: Extract Markdown for processing with different documentation tools
Output Differences:
| Mode | Output Directory Structure |
|---|---|
| MARKDOWN_ONLY="false" (default) | /output/book/ (HTML site), /output/markdown/ (source), /output/book.toml (config) |
| MARKDOWN_ONLY="true" | /output/markdown/ (source only) |
Performance Impact: Markdown-only mode is approximately 3-5x faster, as it skips:
- mdBook initialization build-docs.sh:79-106
- SUMMARY.md generation build-docs.sh:109-159
- File copying to book/src build-docs.sh:164-166
- mdbook-mermaid asset installation build-docs.sh:169-171
- mdBook HTML build build-docs.sh:174-176
Sources: build-docs.sh:61-76 README.md:55-76
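The branch on MARKDOWN_ONLY can be pictured with the sketch below. It is an illustration only: `phases_for` and the phase labels it prints are hypothetical names, and the only behavior taken from the text is that the exact string "true" triggers the early exit.

```shell
# Hypothetical helper: list the phases that would run for a given
# MARKDOWN_ONLY value. Only the exact string "true" selects the short path.
phases_for() {
  if [ "${1:-false}" = "true" ]; then
    echo "scrape enhance-diagrams copy-markdown"
  else
    echo "scrape enhance-diagrams mdbook-build copy-all"
  fi
}

phases_for "true"
phases_for "false"
phases_for "TRUE"   # not the exact string "true", so the full pipeline runs
```

Because the comparison is a plain string match, values such as "TRUE", "1", or "yes" fall through to the full build.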
Internal Configuration Variables
These variables are derived or used internally and are not meant to be configured by users:
| Variable | Source | Purpose |
|---|---|---|
| WORK_DIR | Hard-coded: /workspace (build-docs.sh:27) | Temporary working directory inside container |
| WIKI_DIR | Derived: $WORK_DIR/wiki (build-docs.sh:28) | Directory where deepwiki-scraper.py outputs Markdown |
| OUTPUT_DIR | Hard-coded: /output (build-docs.sh:29) | Container output directory (mounted as volume) |
| BOOK_DIR | Derived: $WORK_DIR/book (build-docs.sh:30) | mdBook project directory |
| REPO_OWNER | Extracted from REPO (build-docs.sh:40) | First component of owner/repo |
| REPO_NAME | Extracted from REPO (build-docs.sh:41) | Second component of owner/repo |
Sources: build-docs.sh:27-30 build-docs.sh:40-41
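One way to split REPO into its two components is POSIX parameter expansion, sketched below. Whether build-docs.sh uses this technique or another tool such as `cut` is an assumption; only the resulting values are taken from the table above.

```shell
REPO="facebook/react"

# Assumption: split on the first "/" using parameter expansion.
REPO_OWNER="${REPO%%/*}"   # remove the longest suffix that starts with "/"
REPO_NAME="${REPO#*/}"     # remove the shortest prefix that ends with "/"

echo "owner=$REPO_OWNER name=$REPO_NAME"
```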
Configuration Precedence and Inheritance
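The system follows this precedence order for configuration values, from highest to lowest:
- Explicit environment variable provided by the user
- Auto-detection from the Git remote (REPO only)
- Values derived from REPO (BOOK_AUTHORS, GIT_REPO_URL)
- Hard-coded defaults (BOOK_TITLE, MARKDOWN_ONLY)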
Sources: build-docs.sh:8-45
Example Scenarios:
- User provides all values: every explicit value is used and no auto-detection occurs.
- User provides only REPO:
  - REPO: "facebook/react" (explicit)
  - BOOK_TITLE: "Documentation" (default)
  - BOOK_AUTHORS: "facebook" (derived from REPO)
  - GIT_REPO_URL: "https://github.com/facebook/react" (derived)
  - MARKDOWN_ONLY: "false" (default)
- User provides no values in a Git repo:
  - REPO: auto-detected from git config --get remote.origin.url
  - All other values derived or defaulted as above
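The scenarios above can be reproduced with a small resolution function. This is a sketch under stated assumptions rather than the script's actual code: `resolve_config` and its `detected_repo` argument are hypothetical, and the Git lookup is replaced by a plain argument so the example runs anywhere.

```shell
# Hypothetical function illustrating the precedence order.
# $1 is the repo found by auto-detection (may be empty).
resolve_config() {
  detected_repo="$1"
  # An explicit REPO wins; otherwise fall back to the detected value.
  REPO="${REPO:-$detected_repo}"
  [ -n "$REPO" ] || return 1
  # Values derived from REPO unless explicitly provided.
  REPO_OWNER="${REPO%%/*}"
  BOOK_AUTHORS="${BOOK_AUTHORS:-$REPO_OWNER}"
  GIT_REPO_URL="${GIT_REPO_URL:-https://github.com/$REPO}"
  # Fixed defaults.
  BOOK_TITLE="${BOOK_TITLE:-Documentation}"
  MARKDOWN_ONLY="${MARKDOWN_ONLY:-false}"
}

# Scenario: the user provides only REPO.
unset BOOK_TITLE BOOK_AUTHORS GIT_REPO_URL MARKDOWN_ONLY
REPO="facebook/react"
resolve_config "" && echo "$REPO | $BOOK_TITLE | $BOOK_AUTHORS | $GIT_REPO_URL | $MARKDOWN_ONLY"
```

Calling `resolve_config "owner/repo"` with REPO unset models the auto-detected scenario, and calling it with both empty returns a non-zero status, which corresponds to the validation failure.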
Generated Configuration Files
The system generates configuration files dynamically based on environment variables:
book.toml
Location: Created at $BOOK_DIR/book.toml (build-docs.sh:85), then copied to /output/book.toml (build-docs.sh:191)
Template Structure:
Sources: build-docs.sh:85-103
Variable Substitution Mapping:
| Template Variable | Environment Variable | Section |
|---|---|---|
| ${BOOK_TITLE} | $BOOK_TITLE | [book] |
| ${BOOK_AUTHORS} | $BOOK_AUTHORS | [book] |
| ${GIT_REPO_URL} | $GIT_REPO_URL | [output.html] |
Hard-Coded Values:
- language = "en" (build-docs.sh:89)
- default-theme = "rust" (build-docs.sh:94)
- [preprocessor.mermaid] configuration (build-docs.sh:97-98)
- Sidebar folding enabled at level 1 (build-docs.sh:100-102)
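Putting the substituted and hard-coded values together, the generated file could be produced with a heredoc along the following lines. This is a sketch, not the template from build-docs.sh: the key names `git-repository-url`, `command`, and `[output.html.fold]` are standard mdBook and mdbook-mermaid settings assumed here to be the ones the script writes, and a temporary directory stands in for $BOOK_DIR.

```shell
BOOK_TITLE="React Documentation"
BOOK_AUTHORS="Meta Open Source"
GIT_REPO_URL="https://github.com/facebook/react"

BOOK_DIR="$(mktemp -d)"   # stand-in for $WORK_DIR/book

# The heredoc delimiter is unquoted so the shell expands the three variables.
cat > "$BOOK_DIR/book.toml" <<EOF
[book]
title = "$BOOK_TITLE"
authors = ["$BOOK_AUTHORS"]
language = "en"

[output.html]
default-theme = "rust"
git-repository-url = "$GIT_REPO_URL"

[preprocessor.mermaid]
command = "mdbook-mermaid"

[output.html.fold]
enable = true
level = 1
EOF

cat "$BOOK_DIR/book.toml"
```

In a template like this the values are inserted without escaping, so a title or author string containing a double quote would yield invalid TOML.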
SUMMARY.md
Location: Created at $BOOK_DIR/src/SUMMARY.md (build-docs.sh:159)
Generation: Generated automatically from the file structure in $WIKI_DIR; no environment variable feeds into it directly. See SUMMARY.md Generation for details.
Sources: build-docs.sh:109-159
Configuration Examples
Minimal Configuration
Results (only REPO="owner/repo" is set):
- REPO: "owner/repo"
- BOOK_TITLE: "Documentation"
- BOOK_AUTHORS: "owner"
- GIT_REPO_URL: "https://github.com/owner/repo"
- MARKDOWN_ONLY: "false"
Full Custom Configuration
Auto-Detected Configuration
Note: This only works if the current directory is a Git repository with a GitHub remote URL configured.
Debugging Configuration
With MARKDOWN_ONLY="true", the system outputs only Markdown files to /output/markdown/ and skips the mdBook build phase.
Sources: README.md:28-88
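The scenarios in this section differ only in which variables are passed to the container. The sketch below prints candidate `docker run` commands instead of executing them. The image tag `deepwiki-to-mdbook`, the host output path, and the helper `docker_cmd` are assumptions for illustration; only the `-e VAR=value` form and the /output mount point come from this document.

```shell
IMAGE="deepwiki-to-mdbook"   # hypothetical image tag; use the tag you built

# Print (rather than run) a docker command for the given VAR=value pairs.
docker_cmd() {
  printf 'docker run --rm'
  for pair in "$@"; do
    printf ' -e "%s"' "$pair"
  done
  printf ' -v "%s/output:/output" %s\n' "$PWD" "$IMAGE"
}

# Minimal configuration: only REPO is set.
docker_cmd "REPO=owner/repo"

# Full custom configuration.
docker_cmd "REPO=facebook/react" \
           "BOOK_TITLE=React Documentation" \
           "BOOK_AUTHORS=Meta Open Source" \
           "GIT_REPO_URL=https://github.com/facebook/react"

# Debugging configuration: Markdown-only output.
docker_cmd "REPO=owner/repo" "MARKDOWN_ONLY=true"
```

Each printed line can be pasted into a terminal once the image tag and output path match your setup.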
Configuration Validation
The system validates only the REPO variable (build-docs.sh:32-37).
Validation Rules:
- REPO must be non-empty after auto-detection
- No format validation is performed on the REPO value (e.g., the owner/repo pattern is not checked)
- Invalid REPO values cause failures during the scraping phase, not during validation
Other Variables:
- No validation is performed on BOOK_TITLE, BOOK_AUTHORS, or GIT_REPO_URL
- MARKDOWN_ONLY is not validated; any value other than "true" is treated as false
Sources: build-docs.sh:32-37
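The non-empty check described above can be sketched as a small function. `validate_repo` is a hypothetical name, and unlike the real script this version returns a status instead of exiting and writes the message to stderr, so that it can be tried interactively.

```shell
# Hypothetical wrapper around the non-empty check.
# Returns 1 on failure rather than exiting the shell.
validate_repo() {
  if [ -z "${1:-}" ]; then
    echo "ERROR: REPO must be set or run from within a Git repository with a GitHub remote" >&2
    echo "Usage: REPO=owner/repo $0" >&2
    return 1
  fi
}

validate_repo "facebook/react" && echo "accepted: facebook/react"
# The format is not checked, so a malformed value also passes here
# and would only fail later, during scraping.
validate_repo "not-a-valid-format" && echo "accepted: not-a-valid-format"
validate_repo "" || echo "rejected: empty REPO"
```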
Configuration Debugging
To debug configuration values, check the console output at startup (build-docs.sh:47-53):
Configuration:
Repository: facebook/react
Book Title: React Documentation
Authors: Meta Open Source
Git Repo URL: https://github.com/facebook/react
Markdown Only: false
This output shows the final resolved configuration values after auto-detection, derivation, and defaults are applied.
Sources: build-docs.sh:47-53
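A summary of this shape can be reproduced outside the container to check what a given set of values would look like. The function below is a sketch: `print_config` is a hypothetical name, and the label text is copied from the sample output above rather than from the script.

```shell
REPO="facebook/react"
BOOK_TITLE="React Documentation"
BOOK_AUTHORS="Meta Open Source"
GIT_REPO_URL="https://github.com/facebook/react"
MARKDOWN_ONLY="false"

# Hypothetical helper mirroring the startup summary.
print_config() {
  echo "Configuration:"
  echo "Repository: $REPO"
  echo "Book Title: $BOOK_TITLE"
  echo "Authors: $BOOK_AUTHORS"
  echo "Git Repo URL: $GIT_REPO_URL"
  echo "Markdown Only: $MARKDOWN_ONLY"
}

print_config
```

When reading the real startup output, a value that shows the default (for example Authors equal to the repository owner) despite being set on the host usually means the variable was not passed into the container with `-e`.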