[ANN] ToonFormat.jl - Token-Oriented Object Notation for Julia 🔵

Hello Julia community!

I’m excited to announce ToonFormat.jl, a community-driven Julia implementation of TOON (Token-Oriented Object Notation) - a compact, human-readable serialization format optimized for LLM contexts.

:package: GitHub: toon-format/ToonFormat.jl: 🔵 Community-driven Julia implementation of TOON
:books: Documentation: Home · ToonFormat.jl

What is TOON?

Following up on this discussion about the TOON format, we now have a fully compliant Julia implementation!

TOON is a line-oriented, indentation-based text format that encodes the JSON data model with explicit structure and minimal quoting. It achieves 30-60% token reduction compared to JSON while maintaining readability and deterministic structure - making it ideal for LLM applications where every token counts.

Key Features

  • Compact representation of tabular data with 40-60% token savings
  • Minimal quoting requirements for cleaner output
  • Explicit array lengths for validation
  • Multiple delimiter support (comma, tab, pipe)
  • Strict mode for validation
  • Key folding & path expansion for deeply nested structures
  • 100% compatible with the JSON data model
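
To make the "explicit array lengths" and delimiter points concrete, here is a minimal sketch of how a reader could check a tabular block against its own header. `parse_tabular` is a hypothetical helper written for this post, not part of ToonFormat.jl, and it assumes unquoted values that never contain the delimiter.

```julia
# Hypothetical helper, not part of ToonFormat.jl: parse one tabular block
# such as "users[2]{id,name,role}:" followed by indented rows.
function parse_tabular(block::AbstractString; delim::Char=',')
    lines = split(strip(block), '\n')
    m = match(r"^(\w+)\[(\d+)\]\{([^}]*)\}:$", strip(lines[1]))
    m === nothing && error("not a tabular header: " * lines[1])
    key = String(m.captures[1])
    declared = parse(Int, m.captures[2])
    fields = String.(split(m.captures[3], delim))
    rows = [String.(split(strip(l), delim)) for l in lines[2:end]]
    # The declared length makes truncated or padded input detectable.
    length(rows) == declared ||
        error("declared $declared rows, found $(length(rows))")
    # Every row must supply exactly one value per declared field.
    all(r -> length(r) == length(fields), rows) ||
        error("row width does not match the field list")
    return key, fields, rows
end

key, fields, rows = parse_tabular("users[2]{id,name,role}:\n  1,Alice,admin\n  2,Bob,user")
```

A block whose header says `[3]` but carries only one row would raise an error here, which is the kind of check the length marker enables.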

Quick Example

JSON (156 tokens):

{
  "users": [
    { "id": 1, "name": "Alice", "role": "admin" },
    { "id": 2, "name": "Bob", "role": "user" }
  ],
  "count": 2
}

TOON (89 tokens - 43% reduction):

users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user
count: 2
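
For intuition, the tabular form above can be produced by writing the shared field names once in the header and then one delimited row per object. The following is only a sketch under simplifying assumptions (all rows share the same fields, no value needs quoting); `tabular_block` is a hypothetical name and this is not how ToonFormat.jl is implemented.

```julia
# Hypothetical sketch, not ToonFormat.jl's implementation: render a vector of
# objects that all share the same fields as one TOON tabular block.
function tabular_block(key::AbstractString, rows::Vector{<:NamedTuple})
    fields = keys(first(rows))                      # column order from the first row
    header = string(key, '[', length(rows), "]{", join(fields, ','), "}:")
    body = ["  " * join(values(r), ',') for r in rows]
    return join([header; body], '\n')
end

users = [(id = 1, name = "Alice", role = "admin"),
         (id = 2, name = "Bob", role = "user")]
println(tabular_block("users", users))
```

NamedTuples are used here only because they keep field order fixed, which makes the header deterministic in this toy version.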

Usage

Installation

using Pkg
Pkg.add("ToonFormat")

Encoding

using ToonFormat

# Simple object
data = Dict("name" => "Alice", "age" => 30)
toon_str = ToonFormat.encode(data)
# name: Alice
# age: 30
# (key order may differ, since Julia's Dict does not preserve insertion order)

# Tabular arrays
users = [
    Dict("id" => 1, "name" => "Alice", "role" => "admin"),
    Dict("id" => 2, "name" => "Bob", "role" => "user")
]
toon_str = ToonFormat.encode(Dict("users" => users))
# users[2]{id,name,role}:
#   1,Alice,admin
#   2,Bob,user

Decoding

using ToonFormat

input = "name: Alice\nage: 30"
data = ToonFormat.decode(input)
# Dict("name" => "Alice", "age" => 30)
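
To show what decoding involves for the flat case above, here is a rough sketch that splits each line at the first colon and treats integer-looking values as numbers. `decode_flat` is a hypothetical name; the real decoder also handles nesting, arrays, quoting, booleans, floats, and null, none of which are modelled here.

```julia
# Hypothetical sketch: decode flat "key: value" lines, not the package's decoder.
function decode_flat(text::AbstractString)
    out = Dict{String,Any}()
    for line in split(text, '\n')
        isempty(strip(line)) && continue
        # Split at the first colon only, so values may contain colons.
        key, raw = strip.(split(line, ':'; limit=2))
        n = tryparse(Int, raw)
        out[String(key)] = n === nothing ? String(raw) : n
    end
    return out
end

decoded = decode_flat("name: Alice\nage: 30")
```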

Advanced: Key Folding

# Deep nesting made compact
data = Dict("api" => Dict("v1" => Dict("users" => Dict("endpoint" => "/api/v1/users"))))

options = ToonFormat.EncodeOptions(keyFolding="safe")
ToonFormat.encode(data, options=options)
# api.v1.users.endpoint: /api/v1/users
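
The idea behind key folding can be sketched as collapsing chains of single-key objects into one dotted path. `fold_keys` below is a hypothetical helper for illustration only; the package's "safe" mode may apply additional rules (for example about which keys are eligible) that this sketch does not model.

```julia
# Hypothetical helper illustrating the idea of key folding.
function fold_keys(d::AbstractDict, prefix::AbstractString="")
    out = Pair{String,Any}[]
    for (k, v) in d
        path = isempty(prefix) ? string(k) : string(prefix, '.', k)
        if v isa AbstractDict && length(v) == 1
            append!(out, fold_keys(v, path))   # single-key chain: keep folding
        else
            push!(out, path => v)              # leaf or multi-key object: stop
        end
    end
    return out
end

nested = Dict("api" => Dict("v1" => Dict("users" => Dict("endpoint" => "/api/v1/users"))))
folded = fold_keys(nested)
```

With the nested Dict from the example, `folded` holds a single pair whose key is the dotted path and whose value is the endpoint string.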

Specification Compliance

:white_check_mark: FULLY COMPLIANT with TOON Specification v3.0

  • 2090+ passing tests covering all normative requirements
  • 349/349 official fixture tests passing (100%)
  • Complete round-trip compatibility
  • All primitive types, tabular arrays, nested structures, and error conditions validated

Performance

Data Type          Token Reduction
Tabular data       40-60%
Nested objects     20-40%
Mixed structures   30-50%
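
For reference, a reduction percentage like these is just one minus the ratio of token counts. The one-liner below (`reduction_percent` is a name made up for this post) reproduces the 43% figure from the quick example's counts of 156 and 89 tokens; the counts themselves depend on the tokenizer used.

```julia
# Percentage of tokens saved when going from `json_tokens` to `toon_tokens`.
reduction_percent(json_tokens, toon_tokens) = 100 * (1 - toon_tokens / json_tokens)

saved = round(Int, reduction_percent(156, 89))   # 43
```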

Contributing

Contributions are welcome! The repository includes comprehensive tests and documentation. See CONTRIBUTING.md for guidelines.

License

MIT License © 2025 TOON Format Organization


Special thanks to @johannschopplich for creating the TOON specification and to everyone who showed interest in bringing this to Julia!

Feedback, issues, and contributions are very welcome. Looking forward to hearing your thoughts! :rocket:

Since TOON is basically just CSV with some additional metadata, can’t you use an optimized CSV package like CSV.jl to do the actual data reading/writing?

We are open to contributions, especially performance benchmarks that show where the current implementation stands and how much it could be improved by building on CSV.jl.