This documentation is part of the "Projects with Books" initiative at zenOSmosis.
The source code for this project is available on GitHub.
Auto-Detection Features
Purpose and Scope
This document details the automatic configuration detection mechanisms in the DeepWiki-to-mdBook converter. These features let the system run with minimal manual configuration by inferring settings from the Git environment and repository metadata.
The auto-detection system primarily operates in the build orchestration script and focuses on repository identification and related URL construction. For information about other configuration options that require explicit setting, see Configuration Reference. For the overall build orchestration process, see build-docs.sh Orchestrator.
Overview of Auto-Detection
The system implements two categories of auto-detection:
| Category | Features | Fallback Behavior |
|---|---|---|
| Primary Detection | Repository identification from Git remote | Fails with error if not detected and not provided |
| Derived Configuration | Author names, URLs, badge links | Uses sensible defaults based on detected repository |
The auto-detection executes during the initialization phase of scripts/build-docs.sh:8-46 before any content scraping or processing begins.
Repository Auto-Detection Flow
Detection Algorithm
Detection Algorithm Flow
Sources: scripts/build-docs.sh:8-38
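The detection flow can be sketched roughly as follows. This is a minimal illustration, not the actual contents of scripts/build-docs.sh: the function name resolve_repo is hypothetical, and only the REPO variable, the git config --get remote.origin.url lookup, the sed pattern components, and the error text are taken from this page.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the REPO resolution order; not the real build-docs.sh.

resolve_repo() {
  # 1. An explicitly provided REPO always wins.
  if [ -n "${REPO:-}" ]; then
    printf '%s\n' "$REPO"
    return 0
  fi

  # 2. Otherwise, read the origin remote of the current Git repository.
  #    Errors (no Git repository, no origin remote) leave remote_url empty.
  local remote_url
  remote_url=$(git config --get remote.origin.url 2>/dev/null || true)

  # Only GitHub remotes (SSH or HTTPS form) are recognized.
  case "$remote_url" in
    *github.com:* | *github.com/*)
      printf '%s\n' "$remote_url" |
        sed -E 's#.*github\.com[:/]([^/]+/[^/\.]+)(\.git)?.*#\1#'
      return 0
      ;;
  esac

  # 3. Nothing usable was found: report the problem and fail.
  echo "ERROR: REPO must be set or run from within a Git repository with a GitHub remote" >&2
  return 1
}
```

A caller could then write `REPO=$(resolve_repo) || exit 1`, so that a failed detection stops the build before any scraping starts.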
Git Remote Parsing
The repository detection uses a single sed regular expression to handle multiple GitHub URL formats:
Git Remote URL Parsing
graph LR
subgraph "Supported URL Formats"
HTTPS["https://github.com/owner/repo.git"]
HTTPSNOGIT["https://github.com/owner/repo"]
SSH["git@github.com:owner/repo.git"]
SSHNOGIT["git@github.com:owner/repo"]
end
subgraph "Extraction Process"
GitConfig["git config --get\nremote.origin.url"]
SedRegex["sed -E\ngithub\.com[:/]([^/]+/[^/\.]+)"]
RepoVar["REPO variable\nowner/repo"]
end
HTTPS --> GitConfig
HTTPSNOGIT --> GitConfig
SSH --> GitConfig
SSHNOGIT --> GitConfig
GitConfig --> SedRegex
SedRegex --> RepoVar
The parsing logic at scripts/build-docs.sh:16 uses this pattern:
sed -E 's#.*github\.com[:/]([^/]+/[^/\.]+)(\.git)?.*#\1#'
| Pattern Component | Purpose |
|---|---|
| .*github\.com | Match any characters before github.com |
| [:/] | Match either : (SSH) or / (HTTPS) separator |
| ([^/]+/[^/\.]+) | Capture group: owner/repo (stops at / or .) |
| (\.git)? | Optional .git suffix |
| .* | Match remaining characters |
| #\1# | Replace entire string with capture group 1 |
Sources: scripts/build-docs.sh:8-19
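To see how the expression behaves, the sketch below assembles it from the pattern components in the table above and applies it to the four supported URL formats. The wrapper function parse_github_remote is a hypothetical name used only for this illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper wrapping the sed expression built from the table's components.
parse_github_remote() {
  printf '%s\n' "$1" |
    sed -E 's#.*github\.com[:/]([^/]+/[^/\.]+)(\.git)?.*#\1#'
}

# Each of the four supported formats should reduce to "owner/repo".
for url in \
  "https://github.com/owner/repo.git" \
  "https://github.com/owner/repo" \
  "git@github.com:owner/repo.git" \
  "git@github.com:owner/repo"
do
  printf '%-36s -> %s\n' "$url" "$(parse_github_remote "$url")"
done
```

Two consequences of the pattern are worth keeping in mind. Because the capture group stops at the first dot, a repository name that contains a dot (for example owner/my.repo) is truncated to owner/my. And when the input does not contain github.com at all, the substitution does not match, so sed prints the URL unchanged rather than an empty string.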
Derived Configuration Values
Once the REPO value is established (either through auto-detection or explicit setting), the system derives several related configuration values automatically.
graph TB
REPO["REPO\n(owner/repo)"]
Split["String split on '/'"]
RepoOwner["REPO_OWNER\n(first segment)"]
RepoName["REPO_NAME\n(second segment)"]
subgraph "Derived URLs"
GitURL["GIT_REPO_URL\nhttps://github.com/owner/repo"]
DeepWikiURL["DEEPWIKI_URL\nhttps://deepwiki.com/owner/repo"]
end
subgraph "Derived Badges"
DeepWikiBadge["DEEPWIKI_BADGE_URL\nhttps://deepwiki.com/badge.svg"]
GitHubBadge["GITHUB_BADGE_URL\nimg.shields.io badge"]
end
subgraph "Default Metadata"
BookAuthors["BOOK_AUTHORS\n(defaults to REPO_OWNER)"]
end
REPO --> Split
Split --> RepoOwner
Split --> RepoName
RepoOwner --> BookAuthors
REPO --> GitURL
REPO --> DeepWikiURL
DeepWikiURL --> DeepWikiBadge
REPO --> GitHubBadge
Derivation Chain
Configuration Value Derivation
Sources: scripts/build-docs.sh:40-51
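As an illustration of the derivation chain, the following sketch splits REPO on the first slash and builds the values that are always constructed. It assumes shell parameter expansion for the split; the actual script may use a different mechanism (for example cut), and the example repository value is only a placeholder.

```shell
#!/usr/bin/env bash
# Sketch only: one possible way to derive the values from REPO.
REPO="jzombie/deepwiki-to-mdbook"   # example value

REPO_OWNER="${REPO%%/*}"   # everything before the first '/'
REPO_NAME="${REPO#*/}"     # everything after the first '/'

# Values that depend only on REPO.
DEEPWIKI_URL="https://deepwiki.com/${REPO}"
DEEPWIKI_BADGE_URL="https://deepwiki.com/badge.svg"

printf 'owner=%s\nname=%s\ndeepwiki=%s\nbadge=%s\n' \
  "$REPO_OWNER" "$REPO_NAME" "$DEEPWIKI_URL" "$DEEPWIKI_BADGE_URL"
```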
Default Value Assignment
The script uses shell parameter expansion with default values at scripts/build-docs.sh:44-46:
| Variable | Default Value | Condition |
|---|---|---|
| BOOK_AUTHORS | $REPO_OWNER | If not explicitly set |
| GIT_REPO_URL | https://github.com/$REPO | If not explicitly set |
| DEEPWIKI_URL | https://deepwiki.com/$REPO | Always constructed |
| DEEPWIKI_BADGE_URL | https://deepwiki.com/badge.svg | Always constructed |
| GITHUB_BADGE_URL | https://img.shields.io/badge/GitHub-{label}-181717?logo=github | Always constructed with URL encoding |
Sources: scripts/build-docs.sh:44-51
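The "if not explicitly set" behavior maps naturally onto the ${VAR:-default} form of shell parameter expansion. The sketch below is an assumption about how such defaults could be written, not a copy of the script; the function name apply_defaults is hypothetical, and the BOOK_TITLE default of "Documentation" is the hard-coded default listed under Configuration Precedence.

```shell
#!/usr/bin/env bash
# Sketch only: overridable defaults via ${VAR:-default}.
apply_defaults() {
  REPO_OWNER="${REPO%%/*}"

  # ${VAR:-default} keeps an existing non-empty value and otherwise
  # falls back to the default. Note that an empty string counts as unset.
  BOOK_TITLE="${BOOK_TITLE:-Documentation}"
  BOOK_AUTHORS="${BOOK_AUTHORS:-$REPO_OWNER}"
  GIT_REPO_URL="${GIT_REPO_URL:-https://github.com/$REPO}"
}

REPO="owner/repo"
unset BOOK_TITLE BOOK_AUTHORS GIT_REPO_URL
apply_defaults
printf '%s | %s | %s\n' "$BOOK_TITLE" "$BOOK_AUTHORS" "$GIT_REPO_URL"
```

With nothing set beforehand, the three values fall back to Documentation, owner, and https://github.com/owner/repo. Exporting BOOK_AUTHORS before the call would leave that value untouched.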
Badge URL Construction
The GitHub badge URL requires special encoding for the repository label at scripts/build-docs.sh:50-51:
GitHub Badge URL Encoding
The encoding is necessary because the badge service interprets - and / as special characters. The double-dash (--) escapes the hyphen, and %2F is the URL encoding for forward slash.
Sources: scripts/build-docs.sh:50-51
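A minimal sketch of this encoding, assuming sed is used for the two substitutions (the script itself may do this differently). The helper name github_badge_url is hypothetical; the URL template is the one listed in the default value table.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the GitHub badge URL for an owner/repo label.
github_badge_url() {
  local label
  # Double every '-' so the badge service treats it as a literal hyphen,
  # and percent-encode '/' as %2F.
  label=$(printf '%s' "$1" | sed -e 's/-/--/g' -e 's#/#%2F#g')
  printf 'https://img.shields.io/badge/GitHub-%s-181717?logo=github\n' "$label"
}

github_badge_url "jzombie/deepwiki-to-mdbook"
# prints https://img.shields.io/badge/GitHub-jzombie%2Fdeepwiki--to--mdbook-181717?logo=github
```

This sketch covers only the two characters discussed above. The badge service also gives underscores a special meaning, which the sketch does not handle.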
Configuration Precedence
The system follows a clear precedence order for all configurable values:
Configuration Precedence Order
| Priority | Source | Example |
|---|---|---|
| 1 (Highest) | Explicit environment variable | docker run -e REPO=owner/repo |
| 2 | Git auto-detection | git config --get remote.origin.url |
| 3 (Lowest) | Hard-coded default | BOOK_TITLE="Documentation" |
Sources: scripts/build-docs.sh:8-46
Error Handling
Repository Detection Failure
If repository detection fails and no explicit REPO value is provided, the script terminates with a descriptive error at scripts/build-docs.sh:34-38:
Repository Validation and Error Flow
The error message provides actionable guidance:
ERROR: REPO must be set or run from within a Git repository with a GitHub remote
Usage: REPO=owner/repo $0
Sources: scripts/build-docs.sh:34-38
graph TB
subgraph "Auto-Detection Phase"
DetectRepo["Detect/Set REPO"]
DeriveVars["Derive configuration\nvariables"]
end
subgraph "Template Processing"
LoadTemplate["Load header.html\nand footer.html"]
InvokeScript["Execute\nprocess-template.py"]
PassVars["Pass variables as\ncommand-line arguments"]
Substitute["Variable substitution\nin templates"]
end
subgraph "Available Variables"
VarRepo["REPO"]
VarTitle["BOOK_TITLE"]
VarAuthors["BOOK_AUTHORS"]
VarGitURL["GIT_REPO_URL"]
VarDeepWiki["DEEPWIKI_URL"]
VarDate["GENERATION_DATE"]
end
DetectRepo --> DeriveVars
DeriveVars --> LoadTemplate
LoadTemplate --> InvokeScript
InvokeScript --> PassVars
DeriveVars -.-> VarRepo
DeriveVars -.-> VarTitle
DeriveVars -.-> VarAuthors
DeriveVars -.-> VarGitURL
DeriveVars -.-> VarDeepWiki
DeriveVars -.-> VarDate
VarRepo --> PassVars
VarTitle --> PassVars
VarAuthors --> PassVars
VarGitURL --> PassVars
VarDeepWiki --> PassVars
VarDate --> PassVars
PassVars --> Substitute
Integration with Template System
Auto-detected values are automatically propagated to the template processing system, where they can be used as variables in header and footer templates.
Template Variable Propagation
Template Variable Propagation Flow
The invocation at scripts/build-docs.sh:205-213 passes all auto-detected and derived values to process-template.py.
Sources: scripts/build-docs.sh:195-234 README.md:34-36
Usage Examples
Auto-Detection in Local Development
When the Docker container is run from within a Git repository that has a GitHub remote configured, the script automatically:
- Detects REPO from git config --get remote.origin.url
- Derives BOOK_AUTHORS from the repository owner
- Constructs all URLs based on the detected repository
Explicit Override
Users can override auto-detection by explicitly setting environment variables such as REPO, BOOK_AUTHORS, and BOOK_TITLE. In this case:
- REPO uses the explicit value (no auto-detection)
- BOOK_AUTHORS uses the explicit value (not derived from REPO)
- BOOK_TITLE uses the explicit value
- URLs are still derived from the explicit REPO value
Sources: README.md:14-27 scripts/build-docs.sh:8-46
Detection Validation Output
The build script outputs configuration information after detection completes at scripts/build-docs.sh:53-59:
Configuration:
Repository: jzombie/deepwiki-to-mdbook
Book Title: Documentation
Authors: jzombie
Git Repo URL: https://github.com/jzombie/deepwiki-to-mdbook
Markdown Only: false
This output serves as verification that auto-detection and default derivation completed successfully before content processing begins.
Sources: scripts/build-docs.sh:53-59
Limitations and Considerations
Git Repository Requirement
Auto-detection only works when the Docker container is run with a mounted workspace that contains a Git repository with a GitHub remote. For non-Git scenarios or non-GitHub repositories, the REPO environment variable must be explicitly provided.
GitHub-Specific Detection
The URL parsing logic at scripts/build-docs.sh:16 specifically looks for github.com in the remote URL. Repositories hosted on other platforms (GitLab, Bitbucket, etc.) will not be auto-detected and require explicit configuration.
Single Remote Assumption
The detection reads from remote.origin.url specifically. If a repository has multiple remotes, only origin is consulted; if the primary remote uses a different name and no remote called origin exists, auto-detection fails.
Sources: scripts/build-docs.sh:8-19 scripts/build-docs.sh:34-38