CROUT Format
CROUT (Crous Text) is a human-readable text serialization format. It's designed for debugging, inspection, and scenarios where readability matters more than size. CROUT features a token compression system that replaces repeated dictionary keys with single characters.
Basic Usage
import crous
data = {"name": "Alice", "age": 30, "active": True}
# Encode to CROUT text
text = crous.to_crout(data)
print(text)
# Output:
# CROUT1
# {s5:Alice , i30 , T}
# Decode from CROUT text
result = crous.from_crout(text)
assert result == data
Value Syntax
Every value in CROUT has a type prefix that indicates how to parse it:
| Prefix | Type | Example | Description |
|---|---|---|---|
| N | Null | N | Python None |
| T | True | T | Boolean true |
| F | False | F | Boolean false |
| i | Integer | i42, i-7 | 64-bit signed integer |
| f | Float | f3.14 | IEEE 754 double |
| s | String | s5:hello | Length-prefixed, binary-safe |
| b | Bytes | b4:deadbeef | Length-prefixed, hex-encoded |
| # | Tagged | #90:[i1,i2] | Tagged value with numeric tag |
| {} | Dict | {s3:key:i42} | Key-value mapping |
| [] | List | [i1,i2,i3] | Ordered sequence |
| () | Tuple | (i1,i2,i3) | Immutable sequence |
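The prefix dispatch described by the table can be sketched for the simplest cases. This is a minimal illustration, not the crous parser: `parse_scalar` is a hypothetical helper that handles only the N, T, F, i, and f prefixes and assumes the token has already been isolated from its surroundings.

```python
def parse_scalar(token: str):
    """Parse one N/T/F/i/f token (hypothetical helper, not part of crous)."""
    token = token.strip()
    if not token:
        raise ValueError("empty token")
    prefix, body = token[0], token[1:]
    if prefix in "NTF":
        # Null and booleans are single characters with no payload.
        if body:
            raise ValueError(f"unexpected payload after {prefix!r}: {body!r}")
        return {"N": None, "T": True, "F": False}[prefix]
    if prefix == "i":
        return int(body)    # handles a leading minus sign, e.g. "i-7"
    if prefix == "f":
        return float(body)  # float() also accepts "inf", "-inf" and "nan"
    raise ValueError(f"unsupported prefix: {prefix!r}")


print(parse_scalar("i42"), parse_scalar("i-7"), parse_scalar("f3.14"))
```

Strings, bytes, tagged values, and containers need a position-tracking parser because their payloads can span delimiters, so they are left out of this sketch.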
Token Compression
CROUT features a token table that replaces frequently-used dictionary keys with single-character tokens. This significantly reduces size for data with repeated keys.
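As a rough illustration of how such a token table could be built, the sketch below counts dictionary keys, keeps those that occur at least twice, orders them by frequency, and hands out single characters while skipping the type-prefix characters. The names `count_keys` and `assign_tokens` and the candidate alphabet are hypothetical and not part of the crous API; the real library may break ties or choose characters differently.

```python
from collections import Counter

RESERVED = set("sifbNTF")  # characters already used as type prefixes
# Assumption: candidates are ASCII letters; the real alphabet is not documented here.
ALPHABET = [
    c for c in "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
    if c not in RESERVED
]
MAX_TOKENS = 64


def count_keys(value, counts: Counter) -> None:
    """Recursively count string dictionary keys in nested data."""
    if isinstance(value, dict):
        for key, item in value.items():
            if isinstance(key, str):
                counts[key] += 1
            count_keys(item, counts)
    elif isinstance(value, (list, tuple)):
        for item in value:
            count_keys(item, counts)


def assign_tokens(data) -> dict:
    """Map token characters to keys that appear at least twice."""
    counts = Counter()
    count_keys(data, counts)
    # most_common() sorts by count, most frequent first; ties keep first-seen order.
    repeated = [key for key, n in counts.most_common() if n >= 2]
    # zip() also stops early if the candidate alphabet runs out.
    return dict(zip(ALPHABET, repeated[:MAX_TOKENS]))


records = [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]
print(assign_tokens(records))  # {'a': 'name', 'c': 'age'}
```

Under these assumptions "age" receives c rather than b, because b is reserved for the bytes prefix.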
import crous
# Data with repeated keys
data = [
    {"name": "Alice", "age": 30},
    {"name": "Bob", "age": 25},
    {"name": "Charlie", "age": 35},
]
text = crous.to_crout(data)
print(text)
# Output:
# CROUT1
# @ a=name
# @ c=age
# [{a:s5:Alice , c:i30} , {a:s3:Bob , c:i25} , {a:s7:Charlie , c:i35}]
# "name" → "a", "age" → "c" (single-character tokens)
Token Assignment
Tokens are single characters, excluding the type-prefix characters (s, i, f, b, N, T, F). Keys that appear at least twice get tokens, assigned in order of frequency (most frequent first), up to a maximum of 64 tokens.
Special Float Values
# Special float values
crous.to_crout(float('inf')) # "finf"
crous.to_crout(float('-inf')) # "f-inf"
crous.to_crout(float('nan')) # "fnan"
CROUT ↔ FLUX Conversion
Crous provides direct conversion between CROUT text and FLUX binary without going through Python objects:
import crous
data = {"name": "Alice", "scores": [98, 95, 100]}
# Python → CROUT text
crout_text = crous.to_crout(data)
# CROUT text → FLUX binary (direct, no Python intermediary)
flux_binary = crous.crout_to_flux(crout_text)
# FLUX binary → CROUT text (direct)
crout_back = crous.flux_to_crout(flux_binary)
# All representations are equivalent
assert crous.from_crout(crout_text) == crous.loads(flux_binary)
CROUT Format Header
Every CROUT document starts with the magic string CROUT1 followed by optional token definitions:
CROUT1 ← magic + version
@ a=name ← token "a" maps to key "name"
@ c=age ← token "c" maps to key "age"
[{a:s5:Alice , c:i30}] ← data using tokens
String Encoding
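Strings in CROUT use a length-prefix encoding: s{length}:{data}. Because a reader takes exactly the stated length after the colon instead of scanning for a terminator, the encoding is binary-safe: strings can contain any bytes, including null bytes, newlines, and other special characters.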
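A minimal sketch of this scheme, under one stated assumption: the length is taken to count the characters of a Python str, which matches the ASCII examples in this section, whereas the real implementation may count UTF-8 bytes. `encode_str` and `decode_str` are hypothetical helpers, not crous functions.

```python
def encode_str(text: str) -> str:
    """Encode text as s{length}:{data} (length assumed to count characters)."""
    return f"s{len(text)}:{text}"


def decode_str(buf: str, pos: int = 0):
    """Decode one string value starting at buf[pos].

    Returns (text, next_pos) so a caller can continue parsing after it.
    """
    if not buf.startswith("s", pos):
        raise ValueError(f"expected 's' at position {pos}")
    colon = buf.index(":", pos)          # first colon ends the length field
    length = int(buf[pos + 1:colon])
    start = colon + 1
    end = start + length
    if end > len(buf):
        raise ValueError("string is shorter than its declared length")
    # Slice by length only: colons, commas, or newlines in the data are harmless.
    return buf[start:end], end


print(encode_str("hello world"))   # s11:hello world
print(decode_str("s5:a:b,c , i1"))  # ('a:b,c', 8)
```

Returning the end position alongside the text lets a surrounding parser resume at the next separator without rescanning the payload.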
# String encoding examples:
# s0: → empty string ""
# s5:hello → "hello"
# s11:hello world → "hello world"
# s3:a\nb → "a\nb" (newline in string, length includes it)
Bytes Encoding
Bytes use hex encoding with a length prefix: b{length}:{hex}. The length is the number of decoded bytes (not the hex string length).
# Bytes encoding examples:
# b0: → b""
# b3:414243 → b"ABC" (hex for 0x41, 0x42, 0x43)
# b4:deadbeef → b"\xde\xad\xbe\xef"
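The relationship between the declared length and the hex payload can be sketched with Python's built-in `bytes.hex()` and `bytes.fromhex()`. The helpers `encode_bytes` and `decode_bytes` are hypothetical names used for illustration and are not part of the crous API.

```python
def encode_bytes(data: bytes) -> str:
    """Encode data as b{length}:{hex}, where length counts decoded bytes."""
    return f"b{len(data)}:{data.hex()}"


def decode_bytes(token: str) -> bytes:
    """Decode a b{length}:{hex} token and check the declared length."""
    if not token.startswith("b"):
        raise ValueError("expected 'b' prefix")
    length_text, sep, hex_text = token[1:].partition(":")
    if not sep:
        raise ValueError("missing ':' after length")
    data = bytes.fromhex(hex_text)
    # The hex text is twice as long as the data it represents.
    if len(data) != int(length_text):
        raise ValueError("declared length does not match decoded payload")
    return data


print(encode_bytes(b"ABC"))         # b3:414243
print(decode_bytes("b4:deadbeef"))  # b'\xde\xad\xbe\xef'
```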