Noma Research Memo -- LaTeX, Markdown, and HTML Pain Points

External research across X, Reddit, Hacker News, official Markdown and Overleaf material, GitHub issues, and Stack Overflow supports the current Noma wedge: LaTeX remains strongest for publication-grade typesetting, Markdown remains the best low-ceremony source for simple prose, HTML remains the best artifact target, and the open gap is a readable, structured source format that renders rich artifacts and gives agents stable patch targets.

Executive conclusion

Pain map

Observed pain	Where it shows up	Noma answer
LaTeX compile errors, generated-file errors, and Overleaf timeouts interrupt the writing loop	Overleaf docs, Reddit LaTeX/Overleaf threads	Keep math/PDF support but make artifact authoring diagnostics-first, live-previewable, and block-addressable
Markdown tables are tedious, fragile, and ugly at scale	Reddit, Markdown Guide, GitHub/Stack Overflow issues	Keep pipe tables for simple cases; use `::table`, datasets, and plots for richer data
Markdown dialects render differently	CommonMark/GFM docs, Obsidian/CommonMark Reddit thread	Own a typed parser and validator instead of relying on ambient Markdown flavor behavior
Rich documents become Markdown plus HTML/CSS/Mermaid soup	Reddit technical-doc thread, X HTML-vs-Markdown debate	Provide declarative layout and data directives with controlled escape hatches
Long agent outputs become linear walls	X, HN, Reddit Claude Code threads	Render navigable HTML artifacts from structured source
Raw HTML is hard to co-author, diff, and re-consume	HN and Reddit Claude Code threads	Treat HTML as generated artifact, not the source of truth
Documentation gets stale and hard to verify	Reddit documentation discussions, Stack Overflow/industry research	Use typed claims, evidence, citations, owners, stale checks, and patch transcripts
Agents rewrite too much	HN co-authoring concerns and Noma's own patch protocol work	Stable block IDs and patch operations
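
The table-fragility row above can be made concrete with a small experiment. The Python sketch below is illustrative only: `render_pipe_table` is a hypothetical helper written for this memo, not part of Noma or of any Markdown library. It pads each cell to its column width, then counts how many existing lines change when one wider row is added.

```python
def render_pipe_table(headers, rows):
    """Render a column-aligned Markdown pipe table.

    Hypothetical helper for illustration; assumes every row has
    the same number of cells as `headers`.
    """
    cells = [[str(c) for c in row] for row in [headers, *rows]]
    # Each column is as wide as its widest cell (at least 3 for '---').
    widths = [max(3, *(len(r[i]) for r in cells)) for i in range(len(headers))]

    def line(row):
        return "| " + " | ".join(c.ljust(w) for c, w in zip(row, widths)) + " |"

    rule = "| " + " | ".join("-" * w for w in widths) + " |"
    return "\n".join([line(cells[0]), rule, *(line(r) for r in cells[1:])])


headers = ["Format", "Role"]
rows = [["Markdown", "source"], ["HTML", "artifact"]]

before = render_pipe_table(headers, rows)
after = render_pipe_table(
    headers, rows + [["LaTeX (publication-grade)", "typesetting"]]
)

# Count how many of the original lines were rewritten by one added row.
changed = sum(a != b for a, b in zip(before.splitlines(), after.splitlines()))
print(before)
print(f"{changed} of {len(before.splitlines())} existing lines changed")
```

Because the new row widens both columns, every existing line is re-padded. That is the diff noise a hand-aligned table produces, and one reason a structured table source may be easier to review than aligned pipes.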

Findings

1. LaTeX is powerful, but the compile loop is heavy for agent-maintained artifacts

2. Markdown tables are the loudest everyday pain

3. Markdown's extension story is still fractured

4. People now want HTML because agent outputs became artifacts

5. Raw HTML loses as durable source

6. The agent-era gap is stable structure, not more formatting

Positioning recommendations

Recommendation	Why
Stop saying "beats Markdown" without context	Users defend Markdown for good reasons: editability, Git diffs, portability, and model familiarity
Say "HTML is an output, not the source" more often	The current X/HN/Reddit debate makes this immediately legible
Lead with tables, research memos, PR reviews, and decision artifacts	These are the recurring pain clusters where Markdown is too flat and raw HTML is too noisy
Show source and artifact side by side	The differentiator is the split: readable `.noma` source plus rich HTML/PDF/LLM output
Be LaTeX-friendly	Users who need equations and print output should see Noma as a lighter source/artifact loop around math-capable reports, not as an anti-TeX crusade
Treat agent patching as the wedge, not an extra feature	Stable IDs and source-preserving patches answer the co-authoring and diff objections to generated HTML
Keep escape hatches controlled	Users already worry about hidden or unsafe HTML; strict mode and LLM stripping are positioning assets

What Noma can credibly solve

Rich tables that do not require hand-maintained pipe alignment in every complex case.
Math-capable reports with live preview, diagnostics, and PDF output without making TeX the collaboration surface.
Structured claims, evidence, risks, decisions, citations, and stale checks.
Agent-safe targeted edits through block IDs and patch operations.
Beautiful standalone HTML without making authors or agents hand-write HTML.
Deterministic LLM context that is less noisy than HTML and more structured than Markdown.
Multi-format output from one source: HTML, PDF, JSON, LLM context, and round-tripped Noma.
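
To make "targeted edits through block IDs and patch operations" concrete, here is a minimal Python sketch. It is not Noma's patch protocol, which lives in the internal agent protocol document and is not reproduced in this memo. The `Block` and `Patch` types, the operation names, and `apply_patch` are all assumptions chosen for illustration. The point it tries to show is that a patch addressed to a stable ID touches one block and leaves every other block untouched.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Block:
    id: str    # stable identifier that survives edits (hypothetical shape)
    kind: str  # e.g. "paragraph", "claim", "table"
    text: str


@dataclass(frozen=True)
class Patch:
    op: str                        # "replace_text" | "insert_after" | "delete"
    target: str                    # ID of the block the patch addresses
    text: Optional[str] = None     # new text, for "replace_text"
    block: Optional[Block] = None  # new block, for "insert_after"


def apply_patch(doc: List[Block], patch: Patch) -> List[Block]:
    """Return a new document with one patch applied; `doc` is not mutated."""
    index = next((i for i, b in enumerate(doc) if b.id == patch.target), None)
    if index is None:
        raise KeyError(f"unknown block id: {patch.target}")

    if patch.op == "replace_text":
        if patch.text is None:
            raise ValueError("replace_text requires text")
        # Keep the ID and kind; only the text changes.
        updated = replace(doc[index], text=patch.text)
        return doc[:index] + [updated] + doc[index + 1:]

    if patch.op == "insert_after":
        if patch.block is None:
            raise ValueError("insert_after requires a block")
        if any(b.id == patch.block.id for b in doc):
            raise ValueError(f"duplicate block id: {patch.block.id}")
        return doc[:index + 1] + [patch.block] + doc[index + 1:]

    if patch.op == "delete":
        return doc[:index] + doc[index + 1:]

    raise ValueError(f"unsupported op: {patch.op}")


doc = [
    Block("intro", "paragraph", "Old intro."),
    Block("claim-1", "claim", "Tables are painful."),
]
patched = apply_patch(
    doc, Patch("replace_text", "claim-1", text="Markdown tables are painful at scale.")
)
```

In this sketch an unknown target ID raises an error rather than falling back to a guess, so an agent working from a stale view of the document fails loudly instead of rewriting the wrong block.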

What Noma should not claim

It should not claim Markdown is obsolete. Markdown is still ideal for short, linear prose.
It should not claim HTML is bad. HTML is the right artifact target and the right escape hatch for bespoke pages.
It should not claim LaTeX is obsolete. LaTeX remains the right tool for many journal and equation-heavy publication workflows.
It should not claim to solve live collaborative editing. Google Docs and Notion still own messy human collaboration.
It should not claim arbitrary app UI is core Noma. Pixel-perfect apps should stay in HTML, React, or similar tools.
It should not hide the adoption cost. A `.noma` document pays off only when structure, validation, rendering, or agent edits matter.

Next product work suggested by the research

agent_task

Revise `docs/comparison.noma` to frame the tradeoff as source layer vs artifact layer vs agent layer. Add a short note on the May 2026 HTML-vs-Markdown agent-output debate and why Noma chooses a split architecture.

agent_task

Create a demo that starts from one `.noma` source and renders a dense HTML review artifact, a print-ready PDF, and a scoped LLM context. The demo should target a pain cluster visible in the research: PR review, research synthesis, or complex comparison tables.

agent_task

Add a table-focused section to the getting-started or comparison docs covering the simple pipe table, the `::table` directive, and the dataset-backed plot, and explaining why each exists.

agent_task

Make source-preserving patching more central in the homepage and README. The clearest external objection to HTML artifacts is that humans lose the ability to co-author and review diffs; this is exactly where Noma is strongest.

Source notes

Overleaf documentation on using its error pane to locate and understand LaTeX compiler errors.

Overleaf documentation on compile timeouts, accumulated errors, heavy assets, and project-organization causes.

Reddit LaTeX thread about frustrating error messages and debugging .tex files.

Reddit LaTeX thread about Overleaf compile-timeout frustration and local workflow alternatives.

Original X post/thread by Thariq Shihipar that triggered the May 2026 HTML-vs-Markdown-for-agent-output discussion. X did not expose the post body in text mode during this research, so the memo cross-checks it via Simon Willison, Hacker News, Reddit, and the public companion gallery.

Simon Willison link post summarizing Thariq Shihipar's X argument and explaining why HTML output can improve agent-generated explanations.

Companion gallery of twenty self-contained HTML artifacts grouped by agent-output use case.

Hacker News discussion of the X post, including objections around human/LLM co-authoring, Markdown source, templating, token cost, and local HTML ergonomics.

Reddit discussion of HTML vs Markdown for Claude Code outputs, including token cost, diffs, security, Markdown storage, JSON structure, and case-by-case use of HTML artifacts.

Reddit thread on technical-doc pain in Markdown, especially tables, documentation drift, and Markdown mixed with HTML/CSS/Mermaid.

Reddit discussion of Markdown vs HTML in AI/document workflows, including token economics, editor needs, portability, and rich formatting.

Original Markdown syntax documentation by John Gruber.

GitHub Flavored Markdown specification, including rationale for a spec and extended table support.

Markdown Guide extended syntax reference covering availability, tables, cell limitations, footnotes, heading IDs, and other non-core features.

GitHub issue about Markdown tables in GitHub issues rendering too wide and requiring horizontal scrolling.

Stack Overflow question where a table rendered in VS Code preview but not on GitHub until the author added a blank line before the table.

Reddit thread about Obsidian Markdown and CommonMark compatibility, including user frustration around exact parsing behavior.

Internal Noma direction document defining the source/artifact/agent layer model and the core positioning.

Internal Noma agent protocol document covering stable IDs, patch operations, validation, and transcript-oriented agent collaboration.