Architecture

The whole codebase fits in one mental model: source → parser → AST → renderer. There are no plugins, no parser combinators, no build pipelines. This is intentional.

Pipeline

.noma source ──► fmt.ts ──► .noma source           (table re-alignment)
   │
   ▼  parser.ts                                    (or book.ts for manifests)
typed AST  ─────────►  validator.ts  ─►  diagnostics
   │
   ├──►  renderer-html.ts  ─►  HTML
   ├──►  renderer-llm.ts   ─►  LLM context (optionally selected/budgeted)
   ├──►  renderer-json.ts  ─►  JSON
   ├──►  renderer-noma.ts  ─►  .noma source        (roundtrip + patch)
   ├──►  renderer-markdown.ts ─► Markdown
   ├──►  renderer-docx.ts  ─►  DOCX
   │
   ├──►  ids.ts            ─►  ID / alias registry
   │
   ▼  pdf.ts               ─►  Chromium print
PDF

DOCX ──► docx-control-data.ts ─► bound control JSON ──► docx-control-sync.ts ─► .noma source
  └──► docx-review-data.ts  ─► native review JSON  ──► docx-review-sync.ts ─► .noma source

The parser does not validate. The renderers do not parse. The validator does not render. HTML/LLM/JSON/Noma/Markdown renderers are pure AST-to-string functions; the DOCX renderer is pure AST-to-Buffer and writes no files itself. docx-control-data.ts is the reverse bridge for form handoffs: DOCX package bytes in, bound ::control value JSON out. docx-control-sync.ts turns that value layer into source-preserving default= updates for matching controls. docx-review-data.ts reads native Word comment, revision, footnote, endnote, heading, and table parts back into JSON so review state can be inspected outside Word, and docx-review-sync.ts maps anchors back to source block IDs for heading updates, comment add/update/resolve, note insertion/update, table/dataset cell/row/column updates, and tracked-revision change-request patches. PDF generation is the one render path that performs I/O because it writes a temporary HTML file and drives Chromium. The patch layer has two paths: AST-level patch(doc, op) for callers already operating on a tree, and source-level patchSource(source, ops) for the CLI/MCP path that rewrites only the addressed source spans.

Modules

src/ast.ts

The discriminated Node union. Every other module imports types from here. Adding a node variant breaks every renderer's exhaustive switch — that is the desired safety property.

src/parser.ts

Hand-written line-based recursive descent. ~250 lines. Parses frontmatter, headings, fenced code, lists, quotes, thematic breaks, and arbitrary-depth directive blocks via colon-counting.

src/inline.ts

Tiny inline markup parser plus shared pipe-table helpers used by the parser, formatter, patch layer, and the ::table renderer. Three entry points: inlineToHtml, inlineToPlain, splitPipeRow; Markdown link labels support escaped literal brackets so DOCX-returned labels such as \[source\] render as [source], escaped table pipes render as literal pipes outside code spans while staying escaped in source, and table serializers preserve pipes inside code spans while escaping separator pipes elsewhere.

src/loader.ts

Pre-render source loaders for file-backed directives. inlineDatasetSources inlines CSV/TSV/JSON/YAML datasets; inlineFigureSources inlines local PNG/JPEG/GIF/SVG figure assets for DOCX. Renderers consume the enriched AST and never read the filesystem.

src/renderer-html.ts

AST → semantic HTML. Special-cases the typed blocks (claim/evidence/reasoning metadata, grid, card, technical API/reference blocks, metric/code/computation blocks, computed controls/metrics/plots, memory profile blocks, review/collaboration blocks, plot, agent_task, state_change, table, ...), resolves ::plot{dataset=, column=} against a per-render dataset registry, embeds a local runtime for standalone computed artifacts, and falls through to readable labeled panels for unknown or namespaced directive names.

src/renderer-site.ts

Book chapters → static Noma Space. Renders each chapter through the pure HTML renderer, rewrites cross-chapter wikilinks with depth-aware hrefs, writes shared theme/search assets, and injects sidebar navigation, breadcrumbs, page metadata, related pages, backlinks, copy/print actions, embedded search data, and a root _assets/search-index.json for agents.

src/renderer-llm.ts

AST → deterministic plain text with [TAG attr=value] markers. Designed to maximize signal density inside LLM context windows. Supports selection by node type or directive name plus a character budget for agent context windows, and evaluates computed formulas against control defaults so agents see static metric/series results without any browser runtime.

src/renderer-noma.ts

AST → .noma source. Roundtrip-safe printer that backs noma render --to noma. parse → renderNoma → parse preserves the AST modulo positions.

src/renderer-markdown.ts

AST → portable Markdown. Preserves normal Markdown prose, headings, lists, code, quotes, and pipe tables; converts [[id]] wikilinks to [id](#id); emits hidden HTML anchors for IDs and aliases; renders tasks as checklists, figures as images, callouts as GitHub-style admonitions, and directive tables as Markdown pipe tables. Noma-only semantics degrade to readable labels plus hidden noma:block comments so .md exports remain useful for humans while retaining agent context.

src/renderer-docx.ts

AST → WordprocessingML package. Generates a minimal DOCX ZIP directly: document part, core/app properties from frontmatter, header/footer/comment/comment-extended/footnote/endnote/settings parts, custom XML control-data parts, part-local relationship files for hyperlinks inside headers, footers, comments, footnotes, and endnotes, media parts for embedded figures and static SVG plots, styles, numbering, content types, relationships, section-level page setup with native section breaks, native document-protection settings for form handoffs, page-setup-aware native Word tables for ::table, ::dataset, ::grid, and ::columns, field-numbered table captions, field-numbered figure/plot captions, generated heading TOCs and caption lists with page-reference fields, Word REF fields for wikilinks to captioned blocks, framed ::card panels, typed ::memory / ::memory_index panels, metric KPI blocks, static computed metric/plot/table handoffs, technical API/reference panels, addressable ::code snippets, code-cell/output computation blocks, flattened ::hero / ::tabs / ::accordion containers, titled :::tab panels, framed ::sidebar asides, natural ::abstract / callout / note / warning / tip blocks, Office Math for ::math and inline math, native checkbox action-item rendering for ::agent_task / ::todo, native text/dropdown/date/checkbox content-control rendering for ::control with lock metadata and data bindings, action/export rendering for ::button and ::export_button, diagram/Plotly source fallbacks, page-number fields, page breaks, semantic review-block styling and metadata, review/provenance/confidence metadata blocks, targeted/threaded native comments with rich inline body content and resolved-state metadata, tracked change-request revisions, state-change deltas, clickable citations, target-anchored rich native footnotes/endnotes, generated bibliographies, block-ID bookmarks, internal wikilinks, external hyperlinks with rich Markdown labels including combined bold+italic spans, visible escaped table pipes outside code spans, and readable fallback labels for custom directives. Web-first blocks without Word equivalents degrade to labeled placeholders for review handoff.

src/docx-control-data.ts

DOCX package reader for bound form data. Opens the ZIP central directory, reads customXml/itemN.xml parts, extracts the urn:noma:controls value layer, falls back to visible noma-control:<id> content-control values from word/document.xml when that value layer is stale or missing, maps visible dropdown/combobox list controls back to select values, preserves Word w:cr carriage-return and w:br manual-break runs as line breaks, normalizes Word w:noBreakHyphen runs to -, preserves Word w:softHyphen runs as U+00AD soft hyphen characters, preserves Word w:tab and w:ptab runs as tabs, preserves Unicode w:sym glyphs, accepts those empty run-token elements whether serialized as self-closing or paired empty elements, ignores deleted and moved-from tracked ranges inside visible fields, including w:moveFromRangeStart / w:moveFromRangeEnd spans, while keeping moved-to ranges as current text, recovers generated text/symbol checkbox glyphs when native checkbox metadata was stripped, and reads native noma-task:<id> checkbox controls so noma docx-data and API callers can recover edited control values and task state as JSON.

src/docx-control-sync.ts

DOCX-to-source bridge for fillable form loops. Takes bound control data and task checkbox state, matches it to source ::control, ::agent_task, and ::todo IDs, and calls the source-preserving patcher to update only matching default=, done, and status attributes in the .noma file.

src/docx-review-data.ts

DOCX package reader for native Word review state. Reads word/comments.xml, word/commentsExtended.xml, word/footnotes.xml, word/endnotes.xml, native headings/tables/captions/metric/control/action labels/block titles/block bodies/metric values/metric metadata/block metadata, review-part relationship files, document relationships for table-cell hyperlinks, and tracked revisions/moves in word/document.xml, extracting current comment/note bodies and bookmarked table cells as lightweight Noma Markdown for bold/emphasis/code/internal wikilinks/external links, coalescing adjacent same-style Word runs before Markdown rendering including inside hyperlink labels, preserving formatted or custom internal #id hyperlink labels when the visible anchor text or a verified Noma-generated bookmark identifies the target, preserving intentional spaces inside hyperlink labels, escaping literal brackets inside returned hyperlink labels, percent-encoding whitespace and parentheses inside returned hyperlink targets, preserving Word w:cr carriage-return and w:br manual-break runs as line breaks, normalizing Word w:noBreakHyphen runs to -, preserving Word w:softHyphen runs as U+00AD soft hyphen characters, preserving Word w:tab and w:ptab runs as tabs, preserving Unicode w:sym glyphs, accepting those empty run-token elements whether serialized as self-closing or paired empty elements, ignoring deleted and moved-from tracked ranges including w:moveFromRangeStart / w:moveFromRangeEnd spans while keeping moved-to ranges as current text, preserving deleted or moved-from text and run tokens as revision old values, merging adjacent same-ID revision fragments before grouping insert/delete/replace proposals, grouping multiple adjacent delete/insert pairs in one paragraph as separate replacements while keeping text-separated delete/insert runs independent, treating missing native comment done metadata as unknown rather than unresolved, and returning heading/caption/metric-label/control-label/action-label/block-title titles, prose block body text, metric value text, metric metadata plus citation, technical, computation, computed-metric/plot, control, memory, and semantic block metadata fields, author/date metadata, resolved state, reply linkage, insert/delete/replace revision text including Word move pairs and range-marked moves, heading/caption/metric-label/control-label/action-label/block-title/block-body/metric-value/metric-metadata/block-metadata bookmarks, table-cell comment range/reference, note, and revision anchors inherited from preceding table bookmarks, cell-scoped layout-table review anchors inherited from nested child bookmarks or the outer layout fallback, authored tables/datasets and table-cell review anchors discovered inside layout cells, later framed directive body paragraph anchors inherited from the directive bookmark, and Noma-generated change-request source bookmarks for review-data handoff loops.

src/docx-review-sync.ts

DOCX-to-source bridge for native review state. Maps Word comment ranges/references, threaded replies, footnote/endnote references, heading anchors, caption anchors, metric-label/control-label/action-label/block-title/block-body/metric-value/metric-metadata and block-metadata anchors, table anchors, tracked revisions, and wrapper or range-marker tracked moves back to Noma bookmark IDs, updates accepted heading edits with update_heading, updates accepted caption edits with update_attribute or computed-plot body title: edits with replace_body, updates accepted metric, computed-metric, control, action label, block title, and prose block body edits with update_attribute or body-field edits with replace_body, updates accepted metric value edits with update_attribute, replace_body, and optional unit= removal, updates accepted metric metadata edits with update_attribute and remove_attribute, updates accepted citation, technical, computation, computed-metric/plot, control, memory, and semantic block metadata edits with update_attribute, replace_body, and remove_attribute, adds new anchored comments/replies/notes with add_comment, add_footnote, and add_endnote, updates existing source-position comments/notes, matched threaded replies, matched targeted notes, matched ::change_request attrs/bodies, simple matching ::dataset edits with update_dataset_cell, insert_dataset_row, delete_dataset_row, insert_dataset_column, delete_dataset_column, or replace_body, and simple matching ::table edits with update_table_header_cell, update_table_cell, insert_table_row, delete_table_row, insert_table_column, or delete_table_column, resolves existing source comments with resolve_comment, reopens comments by clearing stale resolution attrs with remove_attribute, treats source-bookmarked replies as returned thread state, compares alias and canonical IDs as the same review target so canonical Word bookmarks do not rewrite unchanged alias-authored wikilinks, internal links with escaped labels, table/dataset cell links, or target attrs, compares hyperlink labels by visible text so escaped/raw backslashes do not rewrite unchanged source links, treats source \| escapes and Word-returned literal pipes as unchanged visible text, matches target-only comment markup removals against source links whose visible labels contain escaped brackets, keeps metadata-conflicting same-text comments/replies/revisions distinct, ignores deleted/withdrawn source comments, notes, and change requests when matching returned Word artifacts even if older DOCX source bookmarks still point at them, marks missing sibling comments, including previously resolved source comments, targeted notes, and change requests status="deleted" with update_attribute, and imports unmatched tracked revisions or moves with add_change_request. Tables, headings, captions, labels, block bodies, metric values, metric metadata, and block metadata containing tracked revision markup are not applied as direct accepted edits. Computed-plot caption sync follows the DOCX-rendered label precedence (label=, title=, name=, body label:, body title:), updates the same source field on return, and adds title= when the displayed caption came only from the block ID fallback. Metric label sync follows the DOCX-rendered label precedence for ::metric (label=, title=, name=) and ::computed_metric (label=, title=, name=, body label:, body title:), updating the same source field on return and adding label= when the displayed label came only from the block ID fallback. Control label sync follows the DOCX-rendered label precedence for ::control (Label=, label=, then the Word control fallback), updating the same source field on return and adding label= when the displayed label came only from the fallback. Action label sync follows the DOCX-rendered label precedence for ::button and ::export_button (Label=, label=, body Label:, then the action fallback), updating the same source field on return and adding label= when the displayed label came only from the fallback. Block title sync follows the DOCX-rendered title precedence for titled directive headings, updating title=, caption=, or name= only when the rendered title maps unambiguously to that source field and adding title= when a fallback card/callout/sidebar/tab/memory/dataset/bibliography/technical/custom title is edited. Custom fallback heading metadata sync parses readable attribute summaries on unknown and namespaced directive headings, updating, adding, or removing matching attrs, including whole-summary deletion and comma-bearing attr values, while keeping renderer-only directive labels out of stored titles. Block body sync extracts framed prose paragraphs following Noma directive headings and applies accepted edits with replace_body only for supported body-only directives such as claims, cards, callouts, memory blocks, tasks, and semantic reasoning blocks; soft-wrapped source lines compare against Word's paragraph-normalized text before patching. Technical prose body sync extends that same replace_body path to body-only API/reference/instruction blocks, including ::api, ::endpoint, ::parameter, ::instruction, ::changelog, and non-language-backed ::query / ::example. Code body sync extracts rendered monospace directive bodies in code mode for ::code, ::code_cell, ::output, and language-backed ::query / ::example blocks, preserving code line breaks when accepted Word edits return through replace_body. Custom fallback body sync gives unknown and namespaced directives a default Word frame, then applies accepted body-only edits with replace_body so community-pack fallback panels still participate in the DOCX review loop. Metric value sync follows the DOCX-rendered value precedence for ::metric (value=, current=, amount=, then body text), updates the same source field on return, strips the rendered unit from stored numeric values when appropriate, and removes unit= when a reviewer deletes the unit from the visible value line. Metric metadata sync follows the DOCX-rendered metadata precedence for ::metric, updating or removing the same source attrs for existing change= / delta= and as_of= / asOf= / date= spellings while adding new metadata with the canonical change= and as_of= attrs. Block metadata sync follows the DOCX-rendered metadata labels for citations, technical API/reference blocks, computation blocks, computed metrics and plots, form controls, memory profile blocks, reasoning, review, provenance, confidence, and task blocks, updating or removing the same source attrs or body fields for existing alias spellings such as href=, baseUrl=, url=, location=, runtime=, count=, cell=, range=, suffix=, min=, max=, step=, lastSeen=, validUntil=, supersededBy=, due_at=, decided_at=, author=, and at=. Metadata extraction treats Word's visible · separator as structural only before another valid field label in the rendered field order for that block, preserving returned values such as source="CRM · status: stale" and owner="Ops · Q1: Finance" even when Word isolates the value separator into its own run.

src/validator.ts

AST → diagnostics list. Detects duplicate IDs, broken references (incl. [[wikilink]] targets), plot/dataset linkage errors, plot/figure issues, claim-without-evidence, risk/decision/agent-task shape rules, computed formula/control shape, stale citations, escape-hatch trust, state_change and change_request shape, and out-of-profile-directive when the document declares a profile.

src/patch.ts

Block-level patch ops: replace_block, replace_body, update_heading, add_comment, resolve_comment, remove_attribute, add_footnote, add_endnote, add_change_request, update_table_cell, update_table_header_cell, insert_table_row, delete_table_row, insert_table_column, delete_table_column, update_dataset_cell, insert_dataset_row, delete_dataset_row, insert_dataset_column, delete_dataset_column, move_block, add_block, delete_block, update_attribute, rename_id. AST-level patching returns a new tree; source-level patching reparses between ops and splices only the target ranges. add_comment creates targeted review notes near the addressed block or threaded replies with reply_to=, resolve_comment marks review notes closed without rewriting their bodies, remove_attribute clears stale non-id directive metadata without replacing a block, add_footnote and add_endnote add targeted note blocks near reviewed content, add_change_request creates targeted tracked-review deltas, table ops edit header cells, body cells, body rows, or columns in ID-bearing ::table directives and share parser-compatible pipe escaping for code spans, dataset ops edit one body cell, data row, or data column in simple ID-bearing inline datasets, move_block relocates an existing directive while preserving its body/attributes and normalizing fence depth as needed, and rename_id retargets reference attributes and [[wikilink]] references across the document.

src/ids.ts

AST → canonical ID registry. Used by noma ids to expose document-order IDs, alias mappings, source lines, and basic block metadata for agent discovery.

src/book.ts

YAML manifest loader. loadBook concatenates each chapter's parsed AST into a single DocumentNode, so every renderer works on books with no per-target wiring.

src/fmt.ts

Source formatter. Walks the file line-by-line, recognises GitHub-style pipe tables, rebuilds them to a single column width, and leaves everything else byte-identical (including tables inside fenced code blocks).

src/cli.ts

The noma init|parse|render|ids|check|export|patch|fmt|docx-data|docx-sync|docx-review-data|docx-review-sync|verify|diff command-line entry. Glue around parser/renderer/validator/patch modules; transaction-shaped --ops payloads are validated here before writing.

action.yml

Reusable GitHub Action wrapper. Installs the CLI from the action checkout by default, optionally runs noma check, renders the requested target, and uploads the output as a workflow artifact.

themes/default.css

The HTML renderer is theme-agnostic — it only emits semantic class names. The default theme is a single CSS file with custom properties for easy reskinning.

Why no plugin system (yet)

A plugin API will arrive once three things are true: the AST is stable, two or more independent block packs exist in the wild, and at least one community-contributed renderer ships externally.

Performance

The parser is hand-written precisely so optimization stays straightforward — if Noma ever needs to render a million-block book in under a second, the path is to port parser.ts to a streaming variant or to Rust, without touching the AST or any renderer.

Conventions

  • One module, one file. No utils/ dumping ground.
  • *No comments unless the why is non-obvious. Identifiers carry the what*.
  • TypeScript strict mode. noUncheckedIndexedAccess is on — array access returns T | undefined.
  • No build step in development. tsx runs TypeScript directly. npm run build exists for the published npm package.