This documentation is part of the "Projects with Books" initiative at zenOSmosis.

The source code for this project is available on GitHub.

Configuration Reference

This document provides a comprehensive reference for all configuration options available in the DeepWiki-to-mdBook Converter system. It covers environment variables, their default values, validation logic, auto-detection features, and how configuration flows through the system components.

For information about running the system with these configurations, see Quick Start. For details on how auto-detection works internally, see Auto-Detection Features.

Configuration System Overview

The DeepWiki-to-mdBook Converter uses environment variables as its sole configuration mechanism. All configuration is processed by the build-docs.sh orchestrator script at runtime, with no configuration files required. The system provides intelligent defaults and auto-detection capabilities to minimize required configuration.

Configuration Flow Diagram

flowchart TD
    User["User/CI System"]
    Docker["docker run -e VAR=value"]

    subgraph "build-docs.sh Configuration Processing"
        AutoDetect["Git Auto-Detection\n[build-docs.sh:8-19]"]
        ParseEnv["Environment Variable Parsing\n[build-docs.sh:21-26]"]
        Defaults["Default Value Assignment\n[build-docs.sh:43-45]"]
        Validate["Validation\n[build-docs.sh:32-37]"]
    end

    subgraph "Configuration Consumers"
        Scraper["deepwiki-scraper.py\nREPO parameter"]
        BookToml["book.toml Generation\n[build-docs.sh:85-103]"]
        SummaryGen["SUMMARY.md Generation\n[build-docs.sh:113-159]"]
    end

    User -->|Set environment variables| Docker
    Docker -->|Container startup| AutoDetect
    AutoDetect -->|REPO detection| ParseEnv
    ParseEnv -->|Parse all vars| Defaults
    Defaults -->|Apply defaults| Validate
    Validate -->|REPO validated| Scraper
    Validate -->|BOOK_TITLE, BOOK_AUTHORS, GIT_REPO_URL| BookToml
    Validate -->|No direct config needed| SummaryGen

Sources: build-docs.sh:1-206 README.md:41-51

Environment Variables Reference

The following table lists all environment variables supported by the system:

| Variable | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| REPO | String | Conditional | Auto-detected from Git remote | GitHub repository in owner/repo format. Required if not running in a Git repository with a GitHub remote. |
| BOOK_TITLE | String | No | "Documentation" | Title displayed in the generated mdBook documentation. Used in book.toml title field. |
| BOOK_AUTHORS | String | No | Repository owner (from REPO) | Author name(s) displayed in the documentation. Used in book.toml authors array. |
| GIT_REPO_URL | String | No | https://github.com/{REPO} | Full GitHub repository URL. Used for “Edit this page” links in mdBook output. |
| MARKDOWN_ONLY | Boolean | No | "false" | When "true", skips Phase 3 (mdBook build) and outputs only extracted Markdown files. Useful for debugging. |

Sources: build-docs.sh:21-26 README.md:44-51

Variable Details and Usage

REPO

Format: owner/repo (e.g., "facebook/react" or "microsoft/vscode")

Purpose: Identifies the GitHub repository to scrape from DeepWiki.com. This is the primary configuration variable that drives the entire system.

Auto-Detection Logic:

flowchart TD
    Start["build-docs.sh Startup"]
    CheckEnv{"REPO environment\nvariable set?"}
    UseEnv["Use provided REPO value\n[build-docs.sh:22]"]
    CheckGit{"Git repository\ndetected?"}
    GetRemote["Execute: git config --get\nremote.origin.url\n[build-docs.sh:12]"]
    ParseURL["Extract owner/repo using regex:\n.*github\\.com[:/]([^/]+/[^/\\.]+)\n[build-docs.sh:16]"]
    SetRepo["Set REPO variable\n[build-docs.sh:16]"]
    ValidateRepo{"REPO is set?"}
    Error["Exit with error\n[build-docs.sh:33-37]"]
    Continue["Continue with\nREPO=$REPO_OWNER/$REPO_NAME"]

    Start --> CheckEnv
    CheckEnv -->|Yes| UseEnv
    CheckEnv -->|No| CheckGit
    CheckGit -->|Yes| GetRemote
    CheckGit -->|No| ValidateRepo
    GetRemote --> ParseURL
    ParseURL --> SetRepo
    UseEnv --> ValidateRepo
    SetRepo --> ValidateRepo
    ValidateRepo -->|No| Error
    ValidateRepo -->|Yes| Continue

Sources: build-docs.sh:8-37
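
The URL-parsing step in the flowchart can be illustrated with a short sketch. It applies the regex quoted in the diagram through sed; the function name repo_from_remote_url is hypothetical, and the actual build-docs.sh may perform the extraction differently.

```shell
#!/usr/bin/env bash
# Sketch only: derive owner/repo from a GitHub remote URL.
# repo_from_remote_url is a hypothetical name, not taken from build-docs.sh.
repo_from_remote_url() {
    # -n plus the trailing "p" prints only when the pattern matched,
    # so a non-GitHub URL produces an empty result.
    printf '%s\n' "$1" | sed -nE 's#.*github\.com[:/]([^/]+/[^/.]+).*#\1#p'
}

# Both HTTPS and SSH remotes are covered by the [:/] character class.
repo_from_remote_url "https://github.com/facebook/react.git"
repo_from_remote_url "git@github.com:microsoft/vscode.git"

# Inside a Git checkout, the input would come from:
#   repo_from_remote_url "$(git config --get remote.origin.url)"
```

Because the final character class excludes dots, the trailing .git suffix is dropped. The same property means a repository name that itself contains a dot would be cut at the first dot under this pattern.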

Validation: The system exits with an error if REPO is not set and cannot be auto-detected:

ERROR: REPO must be set or run from within a Git repository with a GitHub remote
Usage: REPO=owner/repo $0
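
A minimal sketch of such a check, assuming it is a plain emptiness test, is shown below. The helper name require_repo is hypothetical; the messages are the ones quoted above.

```shell
#!/usr/bin/env bash
# Sketch only: fail when the resolved REPO value is empty.
# require_repo is a hypothetical helper, not a function from build-docs.sh.
require_repo() {
    if [ -z "$1" ]; then
        echo "ERROR: REPO must be set or run from within a Git repository with a GitHub remote" >&2
        echo "Usage: REPO=owner/repo $0" >&2
        return 1
    fi
}

# A script would typically stop on failure:
#   require_repo "$REPO" || exit 1
```

The check only tests for a non-empty string, so a malformed value such as "react" would pass here and fail later.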

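Usage in System: REPO is passed to deepwiki-scraper.py as the repository to scrape, split into REPO_OWNER and REPO_NAME (build-docs.sh:40-41), and used to derive the defaults for BOOK_AUTHORS and GIT_REPO_URL (build-docs.sh:44-45).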

BOOK_TITLE

Default: "Documentation"

Purpose: Sets the title of the generated mdBook documentation. This appears in the browser tab, navigation header, and book metadata.

Usage: Injected into the title field of the book.toml configuration file (build-docs.sh:87).

Examples:

  • BOOK_TITLE="React Documentation"
  • BOOK_TITLE="VS Code Internals"
  • BOOK_TITLE="Apache Arrow DataFusion Developer Guide"

Sources: build-docs.sh:23 build-docs.sh:87

BOOK_AUTHORS

Default: Repository owner extracted from REPO

Purpose: Sets the author name(s) in the mdBook documentation metadata.

Default Assignment Logic: build-docs.sh:44

This uses shell parameter expansion to set BOOK_AUTHORS to REPO_OWNER only if BOOK_AUTHORS is unset or empty.
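
The behavior described here matches the ${VAR:-default} form of parameter expansion. The sketch below assumes that form; the exact expression on line 44 is not reproduced in this reference. It compares the unset, empty, and explicit cases.

```shell
#!/usr/bin/env bash
# Sketch only: the ":-" expansion falls back when the variable is unset OR empty.
REPO_OWNER="facebook"

unset BOOK_AUTHORS
from_unset="${BOOK_AUTHORS:-$REPO_OWNER}"      # variable does not exist

BOOK_AUTHORS=""
from_empty="${BOOK_AUTHORS:-$REPO_OWNER}"      # variable exists but is empty

BOOK_AUTHORS="Meta Open Source"
from_explicit="${BOOK_AUTHORS:-$REPO_OWNER}"   # explicit value is kept

printf '%s\n' "$from_unset" "$from_empty" "$from_explicit"
```

Without the colon (${VAR-default}), an empty string would be kept as-is, which would not match the "unset or empty" behavior described above.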

Usage: Injected into book.toml as a single-element authors array (build-docs.sh:88).

Examples:

  • If REPO="facebook/react" and BOOK_AUTHORS not set → BOOK_AUTHORS="facebook"
  • Explicitly set: BOOK_AUTHORS="Meta Open Source"
  • Multiple authors: BOOK_AUTHORS="John Doe, Jane Smith" (rendered as single string in array)

Sources: build-docs.sh:24 build-docs.sh:44 build-docs.sh:88

GIT_REPO_URL

Default: https://github.com/{REPO}

Purpose: Provides the full GitHub repository URL used for “Edit this page” links in the generated mdBook documentation. Each page includes a link back to the source repository.

Default Assignment Logic (build-docs.sh:45): If GIT_REPO_URL is not provided, it is set to https://github.com/ followed by the REPO value.

Usage: Injected into the [output.html] section of book.toml (build-docs.sh:95).

Notes:

  • mdBook automatically appends /edit/main/ or similar paths based on its heuristics
  • The URL must be a valid Git repository URL for the edit links to work correctly
  • Can be overridden for non-standard Git hosting scenarios

Sources: build-docs.sh:25 build-docs.sh:45 build-docs.sh:95

MARKDOWN_ONLY

Default: "false"

Type: Boolean string ("true" or "false")

Purpose: Controls whether the system executes the full three-phase pipeline or stops after Phase 2 (Markdown extraction with diagram enhancement). When set to "true", Phase 3 (mdBook build) is skipped.

Execution Flow with MARKDOWN_ONLY:

flowchart TD
    Start["build-docs.sh Execution"]
    Phase1["Phase 1: Scrape & Extract\n[build-docs.sh:56-58]"]
    Phase2["Phase 2: Enhance Diagrams\n(within deepwiki-scraper.py)"]
    CheckMode{"MARKDOWN_ONLY\n== 'true'?\n[build-docs.sh:61]"}
    CopyMD["Copy markdown to /output/markdown\n[build-docs.sh:64-65]"]
    ExitEarly["Exit (skipping mdBook build)\n[build-docs.sh:75]"]
    Phase3Init["Phase 3: Initialize mdBook\n[build-docs.sh:79-106]"]
    BuildBook["Build HTML documentation\n[build-docs.sh:176]"]
    CopyAll["Copy all outputs\n[build-docs.sh:179-191]"]

    Start --> Phase1
    Phase1 --> Phase2
    Phase2 --> CheckMode
    CheckMode -->|Yes| CopyMD
    CopyMD --> ExitEarly
    CheckMode -->|No| Phase3Init
    Phase3Init --> BuildBook
    BuildBook --> CopyAll

    style ExitEarly fill:#ffebee
    style CopyAll fill:#e8f5e9

Sources: build-docs.sh:26 build-docs.sh:61-76
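
The branch in the flowchart can be sketched as a function that lists the steps that would run for a given MARKDOWN_ONLY value. The function and step names are hypothetical labels for the boxes in the diagram, not identifiers from build-docs.sh.

```shell
#!/usr/bin/env bash
# Sketch only: print the pipeline steps for a given MARKDOWN_ONLY value.
# planned_steps and the step labels are hypothetical.
planned_steps() {
    echo "scrape-and-extract"     # Phase 1
    echo "enhance-diagrams"       # Phase 2
    # Exact string comparison: only the literal "true" selects the short path.
    if [ "$1" = "true" ]; then
        echo "copy-markdown"
        return 0
    fi
    echo "init-mdbook"            # Phase 3
    echo "build-html"
    echo "copy-all-outputs"
}

planned_steps "true"
planned_steps "false"
```

With a comparison of this kind, values such as "TRUE", "1", or "yes" take the full build path, since none of them equals the literal string "true".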

Use Cases:

  • Debugging diagram placement: Quickly iterate on diagram matching without waiting for mdBook build
  • Markdown-only extraction: When you only need the Markdown source files
  • Faster feedback loops: mdBook build adds significant time; skipping it speeds up testing
  • Custom processing: Extract Markdown for processing with different documentation tools

Output Differences:

| Mode | Output Directory Structure |
| --- | --- |
| MARKDOWN_ONLY="false" (default) | /output/book/ (HTML site), /output/markdown/ (source), /output/book.toml (config) |
| MARKDOWN_ONLY="true" | /output/markdown/ (source only) |

Performance Impact: Markdown-only mode is approximately 3-5x faster, as it skips all of Phase 3: mdBook initialization, the HTML build, and copying the book output.

Sources: build-docs.sh:61-76 README.md:55-76

Internal Configuration Variables

These variables are derived or used internally and are not meant to be configured by users:

| Variable | Source | Purpose |
| --- | --- | --- |
| WORK_DIR | Hard-coded: /workspace (build-docs.sh:27) | Temporary working directory inside container |
| WIKI_DIR | Derived: $WORK_DIR/wiki (build-docs.sh:28) | Directory where deepwiki-scraper.py outputs Markdown |
| OUTPUT_DIR | Hard-coded: /output (build-docs.sh:29) | Container output directory (mounted as volume) |
| BOOK_DIR | Derived: $WORK_DIR/book (build-docs.sh:30) | mdBook project directory |
| REPO_OWNER | Extracted from REPO (build-docs.sh:40) | First component of owner/repo |
| REPO_NAME | Extracted from REPO (build-docs.sh:41) | Second component of owner/repo |

Sources: build-docs.sh:27-30 build-docs.sh:40-41
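
One way to split REPO into its two components is prefix and suffix removal, sketched below. This is an assumption about the technique; lines 40-41 of build-docs.sh may use cut or another tool instead.

```shell
#!/usr/bin/env bash
# Sketch only: split owner/repo with parameter expansion.
REPO="facebook/react"

REPO_OWNER="${REPO%%/*}"   # strip the longest suffix that starts at a "/"
REPO_NAME="${REPO#*/}"     # strip the shortest prefix that ends at a "/"

echo "owner=$REPO_OWNER name=$REPO_NAME"
```

With these expansions, a value that contains no slash leaves both variables equal to the whole string rather than producing an error.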

Configuration Precedence and Inheritance

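The system resolves each configuration value in the following order of precedence:

  1. An explicitly set environment variable
  2. Auto-detection (REPO only, from the Git remote)
  3. A value derived from REPO (BOOK_AUTHORS, GIT_REPO_URL)
  4. A hard-coded default (BOOK_TITLE, MARKDOWN_ONLY)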

Sources: build-docs.sh:8-45

Example Scenarios:

  1. User provides all values:

    • All explicit values are used; no auto-detection occurs.
  2. User provides only REPO:

    • REPO: "facebook/react" (explicit)
    • BOOK_TITLE: "Documentation" (default)
    • BOOK_AUTHORS: "facebook" (derived from REPO)
    • GIT_REPO_URL: "https://github.com/facebook/react" (derived)
    • MARKDOWN_ONLY: "false" (default)
  3. User provides no values in Git repo:

    • REPO: Auto-detected from git config --get remote.origin.url
    • All other values derived or defaulted as above
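
The scenarios above can be reproduced with a small resolver, sketched here under the assumption that every value falls back through ${VAR:-default}. The function resolve_config and its argument (a stand-in for the auto-detected repository) are hypothetical, and the output layout only imitates the startup summary printed by the script.

```shell
#!/usr/bin/env bash
# Sketch only: resolve final values from the environment.
# $1 stands in for a repository auto-detected from the Git remote.
resolve_config() {
    local detected="$1"
    local repo="${REPO:-$detected}"      # an explicit REPO wins over detection
    local owner="${repo%%/*}"
    printf 'Repository:    %s\n' "$repo"
    printf 'Book Title:    %s\n' "${BOOK_TITLE:-Documentation}"
    printf 'Authors:       %s\n' "${BOOK_AUTHORS:-$owner}"
    printf 'Git Repo URL:  %s\n' "${GIT_REPO_URL:-https://github.com/$repo}"
    printf 'Markdown Only: %s\n' "${MARKDOWN_ONLY:-false}"
}

# Only REPO is provided (subshell keeps the caller's environment untouched).
(
    unset BOOK_TITLE BOOK_AUTHORS GIT_REPO_URL MARKDOWN_ONLY
    REPO="facebook/react"
    resolve_config ""
)

# Nothing is provided; "microsoft/vscode" plays the role of the detected remote.
(
    unset REPO BOOK_TITLE BOOK_AUTHORS GIT_REPO_URL MARKDOWN_ONLY
    resolve_config "microsoft/vscode"
)
```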

Generated Configuration Files

The system generates configuration files dynamically based on environment variables:

book.toml

Location: Created at $BOOK_DIR/book.toml (build-docs.sh:85) and copied to /output/book.toml (build-docs.sh:191)

Template Structure:

Sources: build-docs.sh:85-103

Variable Substitution Mapping:

| Template Variable | Environment Variable | Section |
| --- | --- | --- |
| ${BOOK_TITLE} | $BOOK_TITLE | [book] |
| ${BOOK_AUTHORS} | $BOOK_AUTHORS | [book] |
| ${GIT_REPO_URL} | $GIT_REPO_URL | [output.html] |
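
The substitution mapping above can be illustrated with a heredoc that expands the three variables into TOML. This is a reduced sketch, not the actual template: the helper name render_book_toml is hypothetical, the key git-repository-url is an assumption (it is a standard mdBook HTML option, but this reference does not name the key), and the hard-coded settings of the real file are omitted.

```shell
#!/usr/bin/env bash
# Sketch only: expand the three configurable values into a book.toml body.
# render_book_toml is hypothetical; git-repository-url is an assumed key name.
render_book_toml() {
    cat <<EOF
[book]
title = "${BOOK_TITLE}"
authors = ["${BOOK_AUTHORS}"]

[output.html]
git-repository-url = "${GIT_REPO_URL}"
EOF
}

BOOK_TITLE="React Documentation"
BOOK_AUTHORS="John Doe, Jane Smith"
GIT_REPO_URL="https://github.com/facebook/react"
render_book_toml
```

Because the authors value is substituted as one quoted string, a comma-separated list becomes a single array element. Plain substitution of this kind also performs no escaping, so a double quote inside any value would yield invalid TOML.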

Hard-Coded Values:

SUMMARY.md

Location: Created at $BOOK_DIR/src/SUMMARY.md (build-docs.sh:159)

Generation: Generated automatically from the file structure in $WIKI_DIR; it takes no direct environment variable input. See SUMMARY.md Generation for details.

Sources: build-docs.sh:109-159

Configuration Examples

Minimal Configuration

Results:

  • REPO: "owner/repo"
  • BOOK_TITLE: "Documentation"
  • BOOK_AUTHORS: "owner"
  • GIT_REPO_URL: "https://github.com/owner/repo"
  • MARKDOWN_ONLY: "false"

Full Custom Configuration

Auto-Detected Configuration

Note: This only works if the current directory is a Git repository with a GitHub remote URL configured.

Debugging Configuration

Outputs only Markdown files to /output/markdown/, skipping the mdBook build phase.
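
The minimal, full custom, and debugging configurations can be sketched as docker run invocations built from -e flags. The helper below prints each command instead of running it, so it can be reviewed first. The image name deepwiki-to-mdbook, the helper name docker_command, and the ./output host path are assumptions; only the -e mechanism and the /output container path come from this reference. The auto-detected case is left out because it depends on how the Git repository is made visible to the script.

```shell
#!/usr/bin/env bash
# Sketch only: print (not run) a docker command for a set of NAME=value pairs.
# IMAGE is a hypothetical tag; substitute the image you built or pulled.
IMAGE="deepwiki-to-mdbook"

docker_command() {
    local cmd="docker run --rm"
    local pair
    for pair in "$@"; do
        cmd="$cmd -e '$pair'"            # one -e flag per variable
    done
    # $(pwd) is printed literally so the shell that runs the command expands it.
    printf '%s -v "$(pwd)/output:/output" %s\n' "$cmd" "$IMAGE"
}

# Minimal configuration
docker_command REPO=owner/repo

# Full custom configuration
docker_command REPO=facebook/react BOOK_TITLE="React Documentation" \
    BOOK_AUTHORS="Meta Open Source" GIT_REPO_URL=https://github.com/facebook/react

# Debugging configuration
docker_command REPO=owner/repo MARKDOWN_ONLY=true
```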

Sources: README.md:28-88

Configuration Validation

The system performs validation on the REPO variable (build-docs.sh:32-37).

Validation Rules:

  • REPO must be non-empty after auto-detection
  • No format validation is performed on REPO value (e.g., owner/repo pattern)
  • Invalid REPO values will cause failures during scraping phase, not during validation

Other Variables:

  • No validation performed on BOOK_TITLE, BOOK_AUTHORS, or GIT_REPO_URL
  • MARKDOWN_ONLY is not validated; any value other than "true" is treated as false

Sources: build-docs.sh:32-37

Configuration Debugging

To debug configuration values, check the console output at startup (build-docs.sh:47-53):

Configuration:
  Repository:    facebook/react
  Book Title:    React Documentation
  Authors:       Meta Open Source
  Git Repo URL:  https://github.com/facebook/react
  Markdown Only: false

This output shows the final resolved configuration values after auto-detection, derivation, and defaults are applied.

Sources: build-docs.sh:47-53
