Why Not Just Use JSON?

No existing format fully meets the requirements of AI agent audit logs. TAI is a necessary consolidation of features that are currently scattered across incompatible formats.


01 — Format Comparison

The Requirements

AI audit logs need features that no single existing format provides.

JSONL (JSON Lines)

✓ Pros

  • Streaming-friendly
  • Easy incremental parsing
  • Wide tooling adoption

✗ Fails because

  • Escaping Hell: Still requires escaping \n, \" inside strings
  • Truncation Fragility: A line cut off mid-write is invalid JSON, so the whole record is lost
  • No Multi-line: Everything must be a single line
  • LLM Reliability: LLMs often produce multi-line strings, breaking the format
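
The escaping and truncation points above can be seen with a minimal sketch using Python's standard json module. The record fields (role, content) and the read_jsonl helper are illustrative assumptions, not part of any TAI or JSONL specification.

```python
import json

# A multi-line, quote-heavy payload of the kind an LLM typically produces.
payload = 'Step 1: call "search"\nStep 2: summarize'

# JSONL needs the record on one physical line, so the serializer escapes
# the newline and the quotes inside the string.
line = json.dumps({"role": "agent", "content": payload})


def read_jsonl(text):
    """Parse JSONL text and return (records, number_of_lost_lines)."""
    records, lost = [], 0
    for raw in text.splitlines():
        if not raw.strip():
            continue
        try:
            records.append(json.loads(raw))
        except json.JSONDecodeError:
            lost += 1  # a truncated or malformed line cannot be recovered
    return records, lost


# Simulate a crash mid-write: the second record is cut off halfway.
log = line + "\n" + line[: len(line) // 2]
records, lost = read_jsonl(log)
```

The first record round-trips, but only because every newline and quote was escaped on the way in; the second record is dropped entirely, including the part of the payload that was already written.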

Markdown + YAML Frontmatter

✓ Pros

  • Native long-form text
  • No escaping
  • Familiar to developers

✗ Fails because

  • Not Stream-Safe: The frontmatter block must be complete before the body can be interpreted
  • Truncation: Truncation inside the frontmatter is fatal
  • Ambiguity: YAML's permissive syntax makes it difficult for LLMs to emit consistently
  • Modal Confusion: Multiple syntactic modes confuse LLMs
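
The stream-safety problem can be sketched without a YAML library at all, since it comes from the delimiter structure rather than from YAML itself. The split_frontmatter helper and the sample fields below are hypothetical and exist only to illustrate the failure mode.

```python
def split_frontmatter(text):
    """Return (frontmatter, body); raise ValueError if the header never closes."""
    lines = text.split("\n")
    if not lines or lines[0].strip() != "---":
        return "", text  # no frontmatter at all
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    raise ValueError("frontmatter never closed: document is unusable")


complete = "---\nid: 42\nparent_id: 41\n---\nThe agent replied:\nHello"
truncated = complete[:20]  # cut off inside the header

meta, body = split_frontmatter(complete)

try:
    split_frontmatter(truncated)
    truncated_ok = True
except ValueError:
    truncated_ok = False  # nothing can be salvaged from the partial header
```

A reader cannot tell where metadata ends and content begins until the closing delimiter arrives, so a document truncated before that point yields neither.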

XML / SGML

✓ Pros

  • Streamable (SAX parsers)
  • Truncation-tolerant with custom rules
  • Explicit opening/closing tags

✗ Fails because

  • Verbosity: Extremely verbose syntax
  • Escaping: Requires escaping <, &, "
  • Hallucination: LLMs frequently emit malformed or unclosed tags
  • Ergonomics: Not ergonomic for Markdown or multi-line text
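
A short sketch with Python's standard xml modules shows both the escaping cost and what a strict parser does with a truncated document. The frame and content element names are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Typical agent output: code-like text full of XML-significant characters.
content = 'if a < b && name == "x": ...'

# escape() handles &, < and > by default; the quote mapping is added explicitly.
escaped = escape(content, {'"': "&quot;"})

doc = f"<frame><content>{escaped}</content></frame>"
parsed = ET.fromstring(doc).find("content").text

# Simulate truncation by cutting into the closing tags.
truncated = doc[:-10]
try:
    ET.fromstring(truncated)
    truncated_rejected = False
except ET.ParseError:
    truncated_rejected = True  # a strict parser rejects the whole document
```

The payload survives the round trip only in escaped form, and the strict parser discards the truncated document. Recovering partial content would need custom handling on top of an event-based parser, which is the "custom rules" caveat listed under Pros.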

Binary Formats (Protobuf, MessagePack, CBOR)

✓ Pros

  • Compact and efficient
  • Strong typing
  • Wide adoption in infrastructure

✗ Fails because

  • LLM Output: LLMs can't reliably output clean binary
  • No Human Readability: Regulators can't open the file in Notepad
  • Not Suited: Not designed for natural-language payloads

02 — The Verdict

The Comparison Table

Feature | TAI | JSON | JSONL | YAML
------- | --- | ---- | ----- | ----
Truncation-safe | ✔ Auto-heal | ✘ Entire file invalid | ⚠ Last line lost | ✘ Breaks
Zero escaping in content | ✔ Content shield | ✘ Escape quotes, newlines | ✘ Escape quotes, newlines | ⚠ Indentation-dependent
Human-readable |  | ⚠ For simple data | ⚠ No structure across lines |
Atomic frames | ✔ Each [[frame]] independent | ✘ Monolithic | ✔ Each line independent | ✘ Monolithic
Metadata support | ✔ id, parent_id, timestamps | ⚠ Manual | ✘ No standard | ⚠ Manual
LLM generation reliability | ✔ Engine Mode / ⚠ Direct Mode | ✘ Fails on truncation | ✘ Fails on truncation | ✘ LLMs terrible at indentation
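
The "Atomic frames" row can be illustrated with a minimal sketch. It assumes only that a frame begins at a line of the form [[name]], as in the [[frame]] marker used in this document. The body syntax and the actual auto-heal rules are not specified here, so split_frames and its treatment of the final frame are hypothetical.

```python
import re

# Assumed header form: a line consisting solely of [[name]].
FRAME_HEADER = re.compile(r"^\[\[(\w+)\]\]\s*$")


def split_frames(text):
    """Group lines into (frame_name, body_lines) pairs, one per header line."""
    frames = []
    for line in text.splitlines():
        match = FRAME_HEADER.match(line)
        if match:
            frames.append((match.group(1), []))
        elif frames:
            frames[-1][1].append(line)  # body line of the current frame
    return frames


# A log cut off in the middle of the second frame's body.
log = "[[frame]]\nfirst answer\n[[frame]]\nsecond ans"
frames = split_frames(log)
```

Because each header starts a new unit, the first frame is unaffected by the truncation, and the partial second frame is kept as far as it was written instead of invalidating the file.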

03 — The Blockchain Question

Should TAI use blockchain?

No. Absolutely not.

Why this is a terrible idea:

  • Solves Zero Problems: TAI already provides tamper-evidence (append-only frames plus signatures), auditability (human-readable text), and provenance (id, parent_id, timestamps)
  • Destroys Core Value: TAI's killer feature is "A regulator can open the file in Notepad." Blockchain requires specialized tools
  • Performance Catastrophe: TAI frames are written at LLM output speed; a blockchain adds mining or consensus latency to every write
  • Kills Enterprise Adoption: The word "blockchain" triggers instant rejection from many legal and compliance teams

The Better Solution: Cryptographic Signatures (v0.1)

TAI includes Ed25519 signatures:

  1. Agent writes a batch of frames (e.g., 100 frames)
  2. Agent signs the batch with its private key
  3. Agent appends a [[signature]] frame containing the signature

This proves authenticity, integrity, and non-repudiation without network overhead, mining, or complexity. It's like a notary stamp, not a global distributed ledger.
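
The three steps can be sketched as follows, under stated assumptions. The layout of the [[signature]] frame (the covers and value fields), the length-prefixed digest, and the helper names are hypothetical. To stay runnable on the standard library alone, the demo signer is an HMAC stand-in, not Ed25519: HMAC is symmetric and does not provide non-repudiation, so a real deployment would pass an Ed25519 private-key signing function as the sign argument instead.

```python
import hashlib
import hmac


def frames_digest(frames):
    """SHA-256 over all frames in order, length-prefixed to avoid ambiguity."""
    h = hashlib.sha256()
    for frame in frames:
        data = frame.encode("utf-8")
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.digest()


def signature_frame(frames, sign):
    """Build a hypothetical [[signature]] frame covering `frames`."""
    signature = sign(frames_digest(frames))
    return f"[[signature]]\ncovers: {len(frames)}\nvalue: {signature.hex()}"


# Stand-in signer (assumption): replace with an Ed25519 signing function.
DEMO_KEY = b"demo-key-not-for-production"


def demo_sign(digest):
    return hmac.new(DEMO_KEY, digest, hashlib.sha256).digest()


# Step 1: the agent writes a batch of frames.
frames = [f"[[frame]]\nstep {i}" for i in range(100)]
# Steps 2 and 3: sign the batch and produce the frame to append.
sig_frame = signature_frame(frames, demo_sign)

# Any later edit to a covered frame changes the digest, so the stored
# signature no longer matches.
tampered = list(frames)
tampered[42] = "[[frame]]\nstep 42 (edited)"
tampered_frame = signature_frame(tampered, demo_sign)
```

Signing a digest of the whole batch keeps the cost at one signature per batch rather than one per frame, which fits the "notary stamp" framing: verification needs only the file and the public key.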


The Conclusion

The combination of triple-quoted strings, linear streaming, auto-closure, tolerance for partial documents, explicit section typing, and zero escaping is not present in any single existing format.

TAI is not a "new invention" but a necessary consolidation of features that are currently scattered and incompatible across existing formats.