Pattern Language Miner

A modular NLP engine for mining structured content patterns from unstructured text.

GitHub

The Pattern Language Miner identifies recurring rhetorical patterns in Markdown, HTML, and plain text to support schema creation, structured authoring, and content reuse. Designed for AI-readiness, it helps convert human-authored text into machine-interpretable components.

Status: Active


Markdown Validator

A structural linter for Markdown used in static site generation.

GitHub

This tool validates the structure of Markdown documents using declarative rules. Ideal for teams using static site generators like Hugo or DocFX, it enforces consistent formatting and structure across large content sets.
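
To illustrate what validating structure with declarative rules can look like, here is a minimal Python sketch. It is not the tool's actual rule format or API: the rule names (`require_h1`, `max_heading_level`, `no_skipped_levels`) and the `validate` function are hypothetical, chosen only to show rules kept as data and checked by a small engine.

```python
import re

# Hypothetical rule set: names and semantics are assumptions for illustration.
RULES = {
    "require_h1": True,         # document must contain a level-1 heading
    "max_heading_level": 3,     # headings deeper than this are flagged
    "no_skipped_levels": True,  # e.g. going from "#" straight to "###" is flagged
}

def validate(markdown, rules=RULES):
    """Return a list of human-readable structure errors (empty if none)."""
    errors = []
    headings = []  # (line number, heading level)
    in_fence = False
    for number, line in enumerate(markdown.splitlines(), start=1):
        # Ignore anything inside fenced code blocks.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = re.match(r"(#{1,6})\s+\S", line)
        if match:
            headings.append((number, len(match.group(1))))

    if rules.get("require_h1") and not any(level == 1 for _, level in headings):
        errors.append("document has no level-1 heading")

    previous = 0
    for number, level in headings:
        if level > rules.get("max_heading_level", 6):
            errors.append(f"line {number}: heading level {level} exceeds maximum")
        if rules.get("no_skipped_levels") and level > previous + 1:
            errors.append(
                f"line {number}: heading jumps from level {previous} to {level}"
            )
        previous = level
    return errors

if __name__ == "__main__":
    for problem in validate("# Title\n\n### Too deep too soon\n"):
        print(problem)
```

Keeping the rules in a plain dictionary means they could be loaded from a configuration file and shared across a large content set without changing the checking code.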

Status: Active


KWIC (Keyword-in-Context)

A lightweight linguistic tool for analyzing text contextually.

GitHub

KWIC provides a classic corpus linguistics method for exploring text by keyword: each occurrence of a search term is shown with the words that surround it. It is useful for content analysis, information retrieval research, and editing pipelines.
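
As a rough illustration of the keyword-in-context technique in general (not this project's code), the Python sketch below finds every occurrence of a keyword and returns it with a few words of context on either side. The function name `kwic` and its `width` parameter are assumptions made for this example.

```python
import re

def kwic(text, keyword, width=3):
    """Return (left context, match, right context) for each keyword hit.

    `width` is the number of words of context kept on each side.
    Matching is case-insensitive and works on whole words only.
    """
    tokens = re.findall(r"\w+", text)
    rows = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, token, right))
    return rows

if __name__ == "__main__":
    sample = "The cat sat on the mat. The dog saw the cat run."
    for left, match, right in kwic(sample, "cat", width=2):
        # Right-align the left context so the keyword lines up in a column.
        print(f"{left:>12} | {match} | {right}")
```

Aligning the keyword in a central column is the conventional way to display a concordance, since it makes recurring neighbors easy to spot by eye.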

Status: Active


MeasureWords

Analyze your writing quantitatively.

GitHub

MeasureWords offers a set of Python scripts that generate word count reports and other writing metrics for authors looking to track output or revise more strategically.
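
For a sense of the kind of metrics such a report might contain, here is a small Python sketch. It is not taken from MeasureWords itself: the `word_report` function, the metric names, and the simple regular-expression tokenization are all assumptions for illustration.

```python
import re
from collections import Counter

def word_report(text, top=3):
    """Compute a few basic writing metrics for a piece of text."""
    # Naive tokenization: runs of letters and apostrophes count as words.
    words = re.findall(r"[A-Za-z']+", text.lower())
    # Naive sentence split on terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    average = round(len(words) / len(sentences), 1) if sentences else 0.0
    return {
        "words": len(words),
        "unique_words": len(set(words)),
        "sentences": len(sentences),
        "avg_sentence_length": average,
        "most_common": Counter(words).most_common(top),
    }

if __name__ == "__main__":
    report = word_report("The cat sat. The cat ran fast!")
    for name, value in report.items():
        print(f"{name}: {value}")
```

Running a function like this on each day's draft and logging the result is one simple way to track output over time.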

Status: Active


Exquisite Corpse Generator

A playful surrealist text generator.

GitHub

Inspired by the Surrealist parlor game, this Python module generates randomized texts that blend user input with pattern-driven language logic.

Status: Stable


Scan PDFs for broken or malformed hyperlinks.

GitHub

Checks PDF files for link validity, helping content teams maintain QA across publishing pipelines and documentation sets.

Status: Stable


Markdown to YAML Converter

A content engineering utility for transforming Markdown into structured YAML.

GitHub

This script helps convert annotated Markdown into JSON/YAML that complies with a given JSON Schema, enabling automated ingestion into content platforms or AI knowledge bases.

Status: Experimental


Information Retrieval Graph POC

A graph-based prototype for AI-assisted content search and evaluation.

GitHub

This proof-of-concept builds a retrieval graph from Markdown, HTML, and JSON-LD sources and is designed so that retrieval results can be evaluated, offering a framework for benchmarking AI-generated answers against structured content.

Status: Experimental


Let’s Talk

Interested in contributing or adapting a solution for your own content workflow?

Get in touch →