PTXprint MCP — Typeset scripture from a prompt

§ I. The pitch

A thin, opinionless layer over a deeply opinionated craft.

PTXprint is the tool Bible translation teams use to typeset Paratext projects into print-ready PDFs, and it is glorious. Hundreds of settings. Real diglot, polyglot, and study-Bible layouts. Real XeTeX under the hood. The MCP server you are looking at does not pretend to know any of that. It exposes filesystem-shaped IO and content-addressed job submission, then gets out of the way.

The opinions live next door, in a canon repository served by oddkit. But the agent doesn't talk to two MCPs — it talks to one. The docs(query) tool on this server proxies canon retrieval upstream, so the agent's loop is ask docs · understand · act · observe across a single MCP connection. One server, one concern — the design rationale is in §VI.

i.

For translation teams

Hand a translation agent your Paratext project — in any language, any script — and get a publication-quality PDF back. The agent knows when to ask, what to tweak, and when the result is ready to send to the press.

ii.

For agent builders

Three async tools. No domain quiz to pass. Submit a job, poll for status, cancel if it overruns. The server takes care of XeTeX, autofill passes, content-addressed caching, and surfacing failures in language a model can reason about.

iii.

For systems people

A Cloudflare Worker dispatches via service binding into a Container running PTXprint and XeTeX. Durable Objects hold per-job state. R2 stores content-addressed outputs. The SHA-256 of the canonical payload is the cache key, so identical jobs cost zero CPU.

§ II. Live demo · real MCP calls

Submit a job. Get a real PDF.

Both demo payloads are checked-in smoke fixtures from the repo's smoke/ directory and have been rendered before, so they cache-hit and return instantly — zero container CPU. The PDF below is the real artifact served from R2.

protocol

JSON-RPC 2.0 / MCP 2025-06-18

BSB · John · Gentium Plus · cache hit — instant PDF

actions

Each call is a real tools/call over MCP streamable-http. Response envelopes are shown verbatim. The page identifies itself with x-ptxprint-client headers so it appears on the transparency leaderboard.
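As a rough sketch of what one such call looks like on the wire, the snippet below builds (but does not send) a JSON-RPC 2.0 tools/call envelope and the headers for a streamable-http POST. The envelope shape and the MCP-Protocol-Version header follow the MCP specification. The https scheme, the contents of `arguments`, and the client label are illustrative assumptions, not the server's documented schema.

```python
import json

# Host and path as shown on this page; the https scheme is assumed.
MCP_ENDPOINT = "https://ptxprint.klappy.dev/mcp"


def build_tools_call(tool_name: str, arguments: dict, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 envelope for an MCP tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def build_headers(client_label: str) -> dict:
    """Headers for a streamable-http POST; the client label is self-chosen."""
    return {
        "Content-Type": "application/json",
        "Accept": "application/json, text/event-stream",
        "MCP-Protocol-Version": "2025-06-18",
        "x-ptxprint-client": client_label,
    }


# Hypothetical arguments: the real submit_typeset schema may differ.
envelope = build_tools_call("submit_typeset", {"books": ["JHN"]})
headers = build_headers("example-docs-reader")
body = json.dumps(envelope)
```

A real session would normally begin with an MCP initialize exchange before any tools/call; that step is omitted here to keep the sketch short.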

browser ⇌ ptxprint.klappy.dev/mcp idle

§ III. The canon, live

Ask the docs tool anything.

The MCP server's docs(query) tool searches the project's canon — the prose articles, specs, and governance documents that give an agent enough context to drive PTXprint. Type a question; see the actual answer plus the canon URIs that backed it.

docs(query, audience=headless)

§ IV. The contract

Three tools. One contract.

A typesetting job for a whole New Testament can take half an hour. Synchronous tools collide with every chat-shaped surface in existence. So the protocol is async: submit returns immediately, status is pollable, cancellation is honored.

i

tool · async

`submit_typeset`

Hand it a project, a config, a book selection. Returns a job_id immediately and a predicted output URL. Identical payloads cache-hit.

// returns immediately
{
  job_id: "611700a0…",
  payload_hash: "611700a0…",
  cached: true,
  predicted_pdf_url: "…/r2/…/pdf"
}

ii

tool · pollable

`get_job_status`

Per-pass progress, log tail, error list, overfull-box count. A human_summary string for downstream chat agents.

{
  state: "succeeded",
  progress: { passes_completed: 1 },
  overfull_count: 8,
  errors: [],
  human_summary: "Done. 61 pages."
}

iii

tool · safety valve

`cancel_job`

A 30-minute autofill pass needs a kill switch. SIGTERM to the subprocess; partial outputs preserved on disk; state moves to cancelled.

{
  ok: true,
  was_running: false,
  cancelled_at: "2026-04-30T23:24:00Z"
}
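The cancellation behaviour described above can be sketched with the standard library alone. This is a minimal illustration, not the server's code: the grace period and the escalation to a hard kill are assumptions (the description mentions only SIGTERM), and the sleeping child process is a stand-in for a long typesetting pass.

```python
import signal
import subprocess
import sys
from datetime import datetime, timezone


def cancel_subprocess(proc: subprocess.Popen, grace_seconds: float = 10.0) -> dict:
    """Send SIGTERM to a job's subprocess and report what happened."""
    was_running = proc.poll() is None  # None means the process has not exited
    if was_running:
        proc.send_signal(signal.SIGTERM)
        try:
            proc.wait(timeout=grace_seconds)
        except subprocess.TimeoutExpired:
            # Assumed escalation; the source describes SIGTERM only.
            proc.kill()
            proc.wait()
    return {
        "ok": True,
        "was_running": was_running,
        "cancelled_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


# Stand-in for a long typesetting pass: a child process that just sleeps.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
first = cancel_subprocess(child)
second = cancel_subprocess(child)  # cancelling again finds nothing running
```

Cancelling a job that has already exited still returns ok with was_running set to false, which matches the shape of the response specimen above.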

cache

SHA-256 of the canonical payload (RFC 8785 JCS) is the only cache key. No TTL. Identical jobs cost zero CPU and return the same R2 object.
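To make the cache-key idea concrete, here is a minimal sketch that hashes an approximately canonical JSON form. It is not a full RFC 8785 implementation: JCS also fixes number formatting and orders keys by UTF-16 code units, so this shortcut is only safe for ASCII keys and string, integer, or boolean values. The payload fields are hypothetical.

```python
import hashlib
import json


def canonical_json(payload: dict) -> bytes:
    """Approximate canonical form: sorted keys, no whitespace, UTF-8 bytes."""
    return json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def payload_hash(payload: dict) -> str:
    """Hex SHA-256 digest of the canonical bytes, usable as a cache key."""
    return hashlib.sha256(canonical_json(payload)).hexdigest()


# Hypothetical payloads: a and b have the same content in different key order.
a = {"project": "BSB", "books": ["JHN"], "font": "Gentium Plus"}
b = {"font": "Gentium Plus", "books": ["JHN"], "project": "BSB"}
c = {"project": "BSB", "books": ["MRK"], "font": "Gentium Plus"}

same_key = payload_hash(a) == payload_hash(b)       # key order does not matter
different_key = payload_hash(a) != payload_hash(c)  # content does
```

Because the key depends only on content, two callers who submit the same job independently land on the same stored object without any coordination.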

timeout discipline

Each request carries its own per-job timeout: the default is 30 min for autofill jobs and 5 min for simple ones. No platform-edge timeout is exposed to the caller.
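On the client side, the submit, poll, and cancel contract might be driven by a loop like the one below. This is a sketch under stated assumptions: `call_tool` is a hypothetical helper that performs one tools/call and returns the parsed result, the `job_id` argument name and the "running" and "failed" state names are guesses (only "succeeded" and "cancelled" appear on this page), and `FakeServer` exists solely so the loop can run without a network.

```python
import time
from typing import Callable

TERMINAL_STATES = {"succeeded", "failed", "cancelled"}  # "failed" is assumed


def wait_for_job(
    call_tool: Callable[[str, dict], dict],
    job_id: str,
    timeout_s: float,
    poll_interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll get_job_status until a terminal state, cancelling at the deadline."""
    deadline = clock() + timeout_s
    while True:
        status = call_tool("get_job_status", {"job_id": job_id})
        if status["state"] in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            call_tool("cancel_job", {"job_id": job_id})
            return call_tool("get_job_status", {"job_id": job_id})
        sleep(poll_interval_s)


class FakeServer:
    """In-memory stand-in for the MCP server (hypothetical behaviour)."""

    def __init__(self, polls_until_done: int):
        self.remaining = polls_until_done
        self.state = "running"  # assumed name for the in-progress state

    def call(self, name: str, args: dict) -> dict:
        if name == "cancel_job":
            self.state = "cancelled"
        elif self.state == "running":
            self.remaining -= 1
            if self.remaining <= 0:
                self.state = "succeeded"
        return {"state": self.state}


no_sleep = lambda seconds: None  # skip real waiting in this demonstration
quick = wait_for_job(FakeServer(3).call, "job-1", timeout_s=60, sleep=no_sleep)
slow = wait_for_job(FakeServer(10**6).call, "job-2", timeout_s=0, sleep=no_sleep)
```

Here `quick` ends in the succeeded state after three polls, while `slow` hits its deadline on the first poll and is cancelled.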

progress shape

Per-pass, not per-page. PTXprint doesn't expose useful per-page progress in headless mode — honest "pass 3 of ~5" beats fabricated percentages.

§ V. Live telemetry · this server

No information asymmetry.

Every tool call against ptxprint.klappy.dev writes one structural data point to ptxprint_telemetry. Same data the maintainer sees, queried over MCP from this page in your browser, right now. Identify yourself with an x-ptxprint-client header and you'll appear on the consumer leaderboard below.

DATASET

ptxprint_telemetry

cloudflare AE

Write SQL with semantic field names (event_type, tool_name, …); the worker rewrites them to positional refs. Schema: /diagnostics/schema

events · last 30d

activity · last 24h · this server

tool_call leaderboard · last 30d · ptxprint

SUM(_sample_interval) GROUP BY tool_name

consumer leaderboard · who is calling this server

companion · oddkit_telemetry — for context, the related canon service

live · telemetry_policy() — what this server tracks and why

audit · the SQL this page just ran (submitted & rewritten)

The page submits SQL with semantic field names (event_type, tool_name, consumer_label, …). The worker rewrites them to the positional blobN / doubleN form Cloudflare Analytics Engine actually accepts, and returns the rewritten SQL on each response so this audit can show both sides. The /diagnostics/schema endpoint is the canonical mapping; it's the same data the telemetry_schema MCP tool returns.
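The rewriting step can be illustrated with a small whole-word substitution. This is a sketch only: the column positions in `FIELD_MAP` are invented for the example (the real mapping is whatever /diagnostics/schema reports), and the worker's actual rewriter may work differently, for instance by parsing the SQL rather than pattern-matching it.

```python
import re

# Hypothetical mapping; the canonical one is served at /diagnostics/schema.
FIELD_MAP = {
    "event_type": "blob1",
    "tool_name": "blob2",
    "consumer_label": "blob3",
}


def rewrite_sql(sql: str, field_map: dict = FIELD_MAP) -> str:
    """Replace whole-word semantic field names with positional column names."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, field_map)) + r")\b")
    return pattern.sub(lambda m: field_map[m.group(1)], sql)


submitted = (
    "SELECT tool_name, SUM(_sample_interval) AS calls "
    "FROM ptxprint_telemetry WHERE event_type = 'tool_call' "
    "GROUP BY tool_name"
)
executed = rewrite_sql(submitted)
```

A pattern-based rewrite like this would also touch a field name that happens to appear inside a string literal, which is one reason a production rewriter might prefer a real SQL tokenizer.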

submitted — semantic

executed at AE — positional (rewriter output)

§ VI. Architecture

Vodka architecture.

Each MCP server holds opinions about exactly one concern. The PTXprint server holds none about typesetting craft — only about subprocess lifecycle, content-addressed caching, and sandboxed file IO. Domain knowledge lives next door, in canon. Agents see one MCP; PTXprint delegates canon retrieval to oddkit upstream when serving docs().

The agent's reasoning loop is one MCP wide: ask docs → understand → act → observe. Two services work in concert, one of them invisible to the agent. Each is thin enough for one person to maintain indefinitely.

flow (diagram legend): agent-visible MCP traffic · internal traffic the agent never sees

opinionless server

No piclist syntax. No adjlist semantics. No font tables. No USFM. The server treats every file as opaque text and every subprocess as opaque action.

content-addressed

Cache keys are SHA-256 hashes of the canonical payload, where canonical means serialized per RFC 8785 (JCS). No TTL. No staleness. Two identical jobs share one PDF.

async by design

Cloudflare's 30s Worker timeout collides with 30-minute autofill jobs. The two-step contract is the only honest answer.

canon-governed

Every architectural decision is encoded in OLDC+H artifacts and stored under canon/. The repo is the spec.

§ VII. Stack

Built on the shoulders of two giants.

edge runtime

Cloudflare

Workers — MCP transport, auth, dispatch via service binding
Containers — PTXprint + XeTeX + SIL Charis (standard-2: 1 vCPU, 6 GiB)
Durable Objects — per-job state, cancellation, polling
R2 — content-addressed PDF and log storage
Analytics Engine — public usage telemetry

typesetting

SIL & Paratext

PTXprint — Hosken, Penny, Gardner et al · headless CLI mode
XeTeX — Unicode-native typesetting engine
USFM — scripture markup as the source format
SIL Charis — bundled font for the English-first scope
LFF — Language Font Finder for BCP 47 → font resolution