# How to Set Up a Sphinx Autodoc Pipeline with Markdown and Mermaid

This document explains how to set up a **reliable Sphinx autodoc pipeline** for a Python codebase using:

- `autodoc` and `autosummary` for API documentation
- Markdown (`.md`) files via MyST
- Mermaid.js diagrams rendered during the Sphinx build
- A clean separation between source code, documentation source, and build output

The goal is a pipeline that is:

- Deterministic
- Reproducible
- Import-safe
- Suitable for long-lived projects

## Conceptual Overview

The pipeline has three responsibilities:

1. **Import Python code safely**
2. **Extract documentation from docstrings**
3. **Combine generated API docs with authored Markdown content**

Sphinx is used as the orchestrator, not as a content author.

## Required Project Structure

A minimal, sane layout looks like this:

```text
my_project/
├── src/
│   └── my_project/
│       ├── __init__.py
│       ├── core.py
│       └── utils.py
├── docs/
│   ├── conf.py
│   ├── index.md
│   ├── api/
│   │   └── index.md
│   ├── diagrams/
│   │   └── architecture.md
│   └── _static/
├── pyproject.toml
└── README.md
```

Key rules:

- Code lives under `src/`
- Docs live under `docs/`
- Sphinx never mutates source code
- Documentation output is disposable

## Step 1: Create a Virtual Environment

From the project root:

```bash
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
```

This environment must be active whenever you build docs.

## Step 2: Install Dependencies

Install Sphinx and required extensions:

```bash
python -m pip install \
    sphinx \
    sphinx-rtd-theme \
    myst-parser \
    sphinxcontrib-mermaid
```

Optionally freeze them later in `requirements.txt`.

## Step 3: Initialize the Sphinx Project

From the project root:

```bash
cd docs
sphinx-quickstart
```

Use these answers:

- Separate source and build directories: **No**
- Project name: your project name
- Author: your name
- Language: `en`

This creates:

- `conf.py`
- `index.rst` (you will replace this)

Delete `index.rst`.
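
Before configuring Sphinx, it can help to confirm that the active environment really contains the packages from Step 2. The following is a minimal sketch, not part of the pipeline itself: the helper name `installed_versions` and the `REQUIRED` list are illustrative assumptions, and the script relies only on the standard library's `importlib.metadata`.

```python
from importlib import metadata

# Distribution names as installed in Step 2 (adjust if your set differs).
REQUIRED = ["sphinx", "sphinx-rtd-theme", "myst-parser", "sphinxcontrib-mermaid"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in the active environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name}: {version or 'MISSING'}")
```

Any `MISSING` entry usually means the virtual environment from Step 1 is not active or Step 2 was run in a different interpreter.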
## Step 4: Configure `conf.py`

### Make Code Importable

At the top of `conf.py`:

```python
import os
import sys

# The path is relative to docs/, the directory that contains conf.py.
sys.path.insert(0, os.path.abspath("../src"))
```

This allows autodoc to import your package.

### Enable Extensions

```python
extensions = [
    "sphinx.ext.autodoc",
    "sphinx.ext.autosummary",
    "sphinx.ext.napoleon",
    "sphinx.ext.viewcode",
    "myst_parser",
    "sphinxcontrib.mermaid",
]
```

Enable autosummary generation:

```python
autosummary_generate = True
```

### Configure MyST Markdown

```python
myst_enable_extensions = [
    "colon_fence",
    "deflist",
    "fieldlist",
]
```

### Theme and Static Assets

```python
html_theme = "sphinx_rtd_theme"
html_static_path = ["_static"]
```

## Step 5: Replace `index.rst` with `index.md`

Create `docs/index.md`:

````md
# My Project Documentation

This site contains developer documentation for `my_project`.

## User and Conceptual Documentation

```{toctree}
:maxdepth: 2

diagrams/architecture
```

## API Reference

```{toctree}
:maxdepth: 2

api/index
```
````

This file is the root of the documentation graph.

## Step 6: Set Up API Autodoc Pages

### Create API Index

`docs/api/index.md`:

````md
# API Reference

```{toctree}
:maxdepth: 2

modules
```
````

### Create Autosummary Stub

`docs/api/modules.md`:

````md
# Modules

```{autosummary}
:toctree: generated
:recursive:

my_project
```
````

This tells Sphinx to:

- Import `my_project`
- Generate per-module API pages
- Place them under `docs/api/generated/`

You do **not** write those files by hand.

## Step 7: Write Good Docstrings

Sphinx autodoc is only as good as your docstrings.

Example:

```python
def parse_config(path: str) -> dict:
    """
    Parse a configuration file.

    Args:
        path: Path to the configuration file.

    Returns:
        Parsed configuration as a dictionary.
    """
```

Guidelines:

- Use Google-style or NumPy-style docstrings (the `napoleon` extension enabled in Step 4 parses both)
- Avoid side effects at import time
- Avoid runtime dependencies in module top-level code

## Step 8: Add Mermaid Diagrams

Create a Markdown file, `docs/diagrams/architecture.md`. Under MyST, use the `{mermaid}` directive form of the fence so that `sphinxcontrib-mermaid` handles the block instead of treating it as a plain code listing:

````md
# Architecture

```{mermaid}
graph TD
    A[User] --> B[CLI]
    B --> C[Core Engine]
    C --> D[Output]
```
````

Sphinx picks this up during the build. With the extension's default settings, the diagram source is embedded in the page and Mermaid.js renders it in the browser.

## Step 9: Build the Documentation

From the `docs/` directory:

```bash
python -m sphinx -b html . _build/html
```

Output appears in:

```text
docs/_build/html/
```

Open `index.html` in a browser to verify.

## Step 10: Iterate Safely

When modifying the pipeline:

- Change one thing at a time
- Rebuild often
- Treat `_build/` as disposable
- Never commit generated API files unless publishing artifacts

## Common Failure Modes

- Code imports fail because dependencies are missing
- Side effects execute at import time
- Modules rely on runtime configuration
- Autosummary files are edited by hand
- Markdown files are not added to a `toctree`

All of these are fixable, and all of them are avoidable.

## Recommended Practices

- Keep modules import-safe
- Treat documentation output as a build artifact
- Separate authored docs from generated docs
- Document the pipeline itself
- Automate builds only after manual success

## Summary

A Sphinx autodoc pipeline works when:

- Python code is importable
- Docstrings are authoritative
- Markdown provides narrative structure
- Mermaid diagrams illustrate architecture
- Sphinx assembles everything deterministically

Once set up correctly, the system scales without surprises.