Pattern Language Miner
A modular NLP engine for mining structured content patterns from unstructured text.
The Pattern Language Miner identifies recurring rhetorical patterns in Markdown, HTML, and plain text to support schema creation, structured authoring, and content reuse. Designed for AI-readiness, it helps convert human-authored text into machine-interpretable components.
Key Features
- Mines recurring “Context-Problem-Solution” patterns from documents
- Supports Markdown, HTML, and YAML input
- Generates JSON Schema-compatible outputs
- Modular architecture for extensibility
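To make the “Context-Problem-Solution” idea concrete, here is a minimal sketch that labels sentences with a naive cue-word heuristic and reports runs of three sentences in that order. The cue lists and the function names `label_sentences` and `find_cps_patterns` are assumptions made for illustration; this is a stand-in technique, not the engine's actual NLP pipeline.

```python
import re

# Hypothetical cue words; a real miner would use a richer NLP model.
CUES = {
    "context": {"when", "typically", "often"},
    "problem": {"however", "but", "problem", "fails"},
    "solution": {"therefore", "solution", "instead", "use"},
}


def label_sentences(text):
    """Split text into sentences and tag each with the first matching cue label."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    labeled = []
    for sentence in sentences:
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        label = next((name for name, cues in CUES.items() if words & cues), None)
        labeled.append((label, sentence))
    return labeled


def find_cps_patterns(labeled):
    """Return every run of three consecutive context, problem, solution sentences."""
    wanted = ["context", "problem", "solution"]
    labels = [label for label, _ in labeled]
    return [
        tuple(sentence for _, sentence in labeled[i:i + 3])
        for i in range(len(labels) - 2)
        if labels[i:i + 3] == wanted
    ]


sample = (
    "Teams often publish docs by hand. "
    "However, links break without warning. "
    "Instead, run a link checker in CI. "
    "This sentence is unrelated."
)
patterns = find_cps_patterns(label_sentences(sample))
print(len(patterns))  # 1
```

A cue-word pass like this is only a first approximation, but it shows the shape of the output: ordered triples of text spans that could later be mapped onto schema fields.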
Status: Active
Markdown Validator
A structural linter for Markdown used in static site generation.
This tool validates the structure of Markdown documents using declarative rules. Ideal for teams using static site generators like Hugo or DocFX, it enforces consistent formatting and structure across large content sets.
Key Features
- Validates section hierarchy and heading structure
- Customizable rule sets
- YAML frontmatter support
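As an illustration of a heading-structure rule, the sketch below flags ATX headings that skip a level (for example, an H1 followed directly by an H3). The function name `check_heading_hierarchy` and the message format are hypothetical, and the validator's real rule syntax is not shown here; this only demonstrates the kind of check such a rule might perform.

```python
import re

# Matches ATX headings such as "## Section title".
HEADING_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def check_heading_hierarchy(markdown_text):
    """Return (line_number, message) pairs for headings that skip a level."""
    problems = []
    previous_level = 0
    in_fence = False
    for line_number, line in enumerate(markdown_text.splitlines(), start=1):
        # Ignore anything inside fenced code blocks, where "#" is not a heading.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = HEADING_RE.match(line)
        if not match:
            continue
        level = len(match.group(1))
        if level > previous_level + 1:
            problems.append(
                (line_number, f"heading level jumps from {previous_level} to {level}")
            )
        previous_level = level
    return problems


print(check_heading_hierarchy("# Title\n\n### Deep\n\n## Ok\n"))
# [(3, 'heading level jumps from 1 to 3')]
```

Moving back up the hierarchy (H3 to H2) is allowed; only downward jumps of more than one level are reported.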
Status: Active
KWIC (Keyword-in-Context)
A lightweight linguistic tool for analyzing text contextually.
KWIC provides a classic corpus linguistics method for exploring text by keyword: each occurrence of a keyword is shown together with the words around it. It is useful for content analysis, information retrieval research, and editing pipelines.
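The keyword-in-context method can be sketched in a few lines. The function name `kwic`, its `width` parameter, and the simple `\w+` tokenizer are assumptions for illustration and may differ from the tool's own interface.

```python
import re


def kwic(text, keyword, width=3):
    """Return (left, keyword, right) tuples for each occurrence of keyword.

    width is the number of context words kept on each side.
    """
    tokens = re.findall(r"\w+", text)
    target = keyword.lower()
    rows = []
    for i, token in enumerate(tokens):
        if token.lower() == target:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, token, right))
    return rows


text = "The cat sat on the mat. The dog saw the cat run."
for left, word, right in kwic(text, "cat", width=2):
    # Right-align the left context so the keywords line up in a column.
    print(f"{left:>10} | {word} | {right}")
```

Matching is case-insensitive, and punctuation is dropped by the tokenizer, so context windows can span sentence boundaries.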
Status: Active
MeasureWords
Analyze your writing quantitatively.
MeasureWords offers a set of Python scripts that generate word count reports and other writing metrics for authors looking to track output or revise more strategically.
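A word count report of this kind might look like the following sketch. The function name `writing_metrics` and the chosen metrics are illustrative assumptions, not the actual MeasureWords scripts.

```python
import re
from collections import Counter


def writing_metrics(text):
    """Compute a few basic writing metrics for a piece of text."""
    words = re.findall(r"[A-Za-z']+", text)
    # Treat runs of ., !, or ? as sentence boundaries (a rough approximation).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    counts = Counter(word.lower() for word in words)
    return {
        "word_count": len(words),
        "sentence_count": len(sentences),
        "unique_words": len(counts),
        "average_sentence_length": len(words) / len(sentences) if sentences else 0.0,
        "most_common": counts.most_common(3),
    }


report = writing_metrics("I write. I revise. I write again!")
print(report["word_count"], report["sentence_count"], report["unique_words"])
# 7 3 4
```

Running the same function over each day's draft would give a simple way to track output over time.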
Status: Active
Exquisite Corpse Generator
A playful surrealist text generator.
Inspired by the Surrealist parlor game, this Python module generates randomized texts that blend user input with pattern-driven language logic.
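One way to blend user input with a fixed pattern is to fill a sentence template from user-supplied word lists. The template, the `exquisite_corpse` function, and its `seed` parameter below are hypothetical and only sketch the idea; the module's own pattern logic may work differently.

```python
import random

# Sentence pattern modeled on the game's classic adjective-noun-verb-adjective-noun shape.
TEMPLATE = ["The", "{adjective}", "{noun}", "{verb}", "the", "{adjective}", "{noun}"]


def exquisite_corpse(contributions, seed=None):
    """Fill TEMPLATE with random picks from the user's word lists.

    contributions maps "adjective", "noun", and "verb" to lists of words.
    Passing a seed makes the output reproducible.
    """
    rng = random.Random(seed)
    words = []
    for slot in TEMPLATE:
        if slot.startswith("{"):
            words.append(rng.choice(contributions[slot.strip("{}")]))
        else:
            words.append(slot)
    return " ".join(words) + "."


player_words = {
    "adjective": ["exquisite", "new", "velvet"],
    "noun": ["corpse", "wine", "clock"],
    "verb": ["drinks", "paints", "forgets"],
}
print(exquisite_corpse(player_words, seed=7))
```

Each player could supply one word list without seeing the others, which mirrors how the original game hides earlier contributions.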
Status: Stable
PDF Link Checker
Scan PDFs for broken or malformed hyperlinks.
This tool checks the validity of links in PDF files, helping content teams maintain quality assurance across publishing pipelines and documentation sets.
Status: Stable
Markdown to YAML Converter
A content engineering utility for transforming Markdown into structured YAML.
This script converts annotated Markdown into JSON or YAML that conforms to a given JSON Schema, enabling automated ingestion into content platforms or AI knowledge bases.
Status: Experimental
Information Retrieval Graph POC
A graph-based prototype for AI-assisted content search and evaluation.
This proof-of-concept builds an evaluable retrieval graph using Markdown/HTML and JSON-LD sources, offering a framework for benchmarking AI-generated answers against structured content.
Status: Experimental
Let’s Talk
Interested in contributing or adapting a solution for your own content workflow?