TOON: Token-Efficient Data Format for LLMs
Today I discovered TOON (Token-Oriented Object Notation), a serialization format designed specifically for LLMs that achieves 30-60% token reduction compared to JSON.
Why It's Better
Unlike JSON, TOON eliminates redundant syntax (braces, brackets, most quotes). Unlike CSV, it supports nested fields. It also declares array lengths and field names up front, which seems to help LLMs understand the data that follows and so improves accuracy.
Example
JSON:
{
  "users": [
    { "id": 1, "name": "Alice", "role": "admin" },
    { "id": 2, "name": "Bob", "role": "user" }
  ]
}
TOON:
users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user
Keys are declared once in the header, and the data then follows as comma-separated rows. Nesting uses YAML-style indentation.
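To make the header-plus-rows idea concrete, here is a rough Python sketch of the tabular case only. The names `to_toon_table` and `from_toon_table` are my own hypothetical helpers, not part of any TOON library. The sketch assumes flat rows with identical keys, does no quoting or escaping (a value containing a comma would break it), skips nesting entirely, and assumes rows are indented two spaces under the header. The decoder shows why the declared length is useful: a truncated table is rejected instead of being silently accepted.

```python
import re

def to_toon_table(name, rows, indent="  "):
    """Encode a list of flat dicts with identical keys as one TOON-style table."""
    if not rows:
        raise ValueError("this sketch needs at least one row to read the keys from")
    fields = list(rows[0])
    if any(list(r) != fields for r in rows):
        raise ValueError("all rows must have the same keys in the same order")
    # Header: name, declared row count, and the field names, stated once.
    field_list = ",".join(fields)
    lines = [f"{name}[{len(rows)}]{{{field_list}}}:"]
    for r in rows:
        # No quoting or escaping here (assumption: values contain no commas).
        lines.append(indent + ",".join(str(r[f]) for f in fields))
    return "\n".join(lines)

# Matches headers shaped like: users[2]{id,name,role}:
HEADER = re.compile(r"^(\w+)\[(\d+)\]\{([^}]*)\}:$")

def from_toon_table(text):
    """Decode one table; fail if the row count disagrees with the header."""
    header, *body = text.strip().splitlines()
    m = HEADER.match(header.strip())
    if m is None:
        raise ValueError(f"not a table header: {header!r}")
    name, declared, fields = m.group(1), int(m.group(2)), m.group(3).split(",")
    # Values are kept as strings; this sketch does no type inference.
    rows = [dict(zip(fields, line.strip().split(","))) for line in body if line.strip()]
    if len(rows) != declared:
        raise ValueError(f"header declares {declared} rows, found {len(rows)}")
    return name, rows

users = [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
]
encoded = to_toon_table("users", users)
print(encoded)
print(from_toon_table(encoded))
```

Anything beyond this simple case (nested objects, values that need quoting, mixed row shapes) is better left to a real TOON implementation than to this sketch.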