CROUT Format

CROUT (Crous Text) is a human-readable text serialization format. It's designed for debugging, inspection, and scenarios where readability matters more than size. CROUT features a token compression system that replaces repeated dictionary keys with single characters.

Basic Usage

crout_basic.py

import crous

data = {"name": "Alice", "age": 30, "active": True}

# Encode to CROUT text
text = crous.to_crout(data)
print(text)
# Output (each key appears only once, so no tokens are assigned):
# CROUT1
# {s4:name:s5:Alice , s3:age:i30 , s6:active:T}

# Decode from CROUT text
result = crous.from_crout(text)
assert result == data

Value Syntax

Every value in CROUT has a type prefix that indicates how to parse it:

Prefix	Type	Example	Description
`N`	Null	`N`	Python `None`
`T`	True	`T`	Boolean true
`F`	False	`F`	Boolean false
`i`	Integer	`i42`, `i-7`	64-bit signed integer
`f`	Float	`f3.14`	IEEE 754 double
`s`	String	`s5:hello`	Length-prefixed, binary-safe
`b`	Bytes	`b4:deadbeef`	Length-prefixed, hex-encoded
`#`	Tagged	`#90:[i1,i2]`	Tagged value with numeric tag
`{}`	Dict	`{s3:key:i42}`	Key-value mapping
`[]`	List	`[i1,i2,i3]`	Ordered sequence
`()`	Tuple	`(i1,i2,i3)`	Immutable sequence
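To illustrate how a type prefix drives parsing, here is a minimal sketch that decodes only the simplest scalar forms from the table (`N`, `T`, `F`, `i`, `f`). The function name `parse_scalar` is hypothetical and is not part of the crous API; the real parser may work differently.

```python
def parse_scalar(text: str):
    """Decode a single CROUT scalar by looking at its first character.

    Illustrative only: handles N, T, F, i and f; other prefixes are rejected.
    """
    prefix, rest = text[:1], text[1:]
    if prefix == "N" and not rest:
        return None
    if prefix == "T" and not rest:
        return True
    if prefix == "F" and not rest:
        return False
    if prefix == "i":
        return int(rest)      # e.g. "i-7" -> -7
    if prefix == "f":
        return float(rest)    # e.g. "f3.14" -> 3.14
    raise ValueError(f"not a simple scalar: {text!r}")


print(parse_scalar("i42"))    # 42
print(parse_scalar("f3.14"))  # 3.14
```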

Token Compression

CROUT builds a token table that replaces frequently used dictionary keys with single-character tokens. Each such key is written once in the header and then referenced by one character in the data, which reduces output size for data with many repeated keys.

token_compression.py

import crous

# Data with repeated keys
data = [
    {"name": "Alice", "age": 30},
    {"name": "Bob", "age": 25},
    {"name": "Charlie", "age": 35},
]

text = crous.to_crout(data)
print(text)
# Output:
# CROUT1
# @ a=name
# @ c=age
# [{a:s5:Alice , c:i30} , {a:s3:Bob , c:i25} , {a:s7:Charlie , c:i35}]

# "name" → "a", "age" → "c" (single-character tokens; "b" is skipped because it is the bytes type prefix)

Token Assignment

Tokens are assigned from a safe alphabet that avoids the type prefixes (s, i, f, b, N, T, F). Keys that appear at least twice get tokens, assigned in order of frequency (most frequent first). A document can define at most 64 tokens.
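The assignment rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the library's implementation: the names `assign_tokens` and `SAFE_ALPHABET` are hypothetical, the alphabet is assumed to be the ASCII letters minus the type prefixes (which yields fewer than 64 characters, so the real alphabet must be larger), and ties are broken by first appearance.

```python
import string
from collections import Counter

TYPE_PREFIXES = set("sifbNTF")
# Assumption: the real safe alphabet is not documented here; ASCII letters
# without the type prefixes are used purely for illustration.
SAFE_ALPHABET = [c for c in string.ascii_letters if c not in TYPE_PREFIXES]
MAX_TOKENS = 64


def count_keys(value, counts):
    """Recursively count string dictionary keys in nested data."""
    if isinstance(value, dict):
        for key, item in value.items():
            if isinstance(key, str):
                counts[key] += 1
            count_keys(item, counts)
    elif isinstance(value, (list, tuple)):
        for item in value:
            count_keys(item, counts)


def assign_tokens(data):
    """Map each key seen at least twice to a single-character token."""
    counts = Counter()
    count_keys(data, counts)
    repeated = [(key, n) for key, n in counts.items() if n >= 2]
    repeated.sort(key=lambda pair: -pair[1])  # stable: ties keep first-seen order
    keys = [key for key, _ in repeated[:MAX_TOKENS]]
    return dict(zip(keys, SAFE_ALPHABET))


people = [
    {"name": "Alice", "age": 30},
    {"name": "Bob", "age": 25},
    {"name": "Charlie", "age": 35, "nickname": "Chuck"},
]
print(assign_tokens(people))  # {'name': 'a', 'age': 'c'}
```

Note that "nickname" appears only once and therefore gets no token.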

Special Float Values

import crous

# Special float values
crous.to_crout(float('inf'))      # "finf"
crous.to_crout(float('-inf'))     # "f-inf"
crous.to_crout(float('nan'))      # "fnan"
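As a rough sketch of how these forms relate to Python's own float handling, the helpers below (hypothetical names, not part of crous) produce and read the same `f`-prefixed text. They rely only on the fact that Python's `repr()` and `float()` already use the spellings `inf`, `-inf`, and `nan`.

```python
import math


def encode_float(value: float) -> str:
    """Format a float with the CROUT 'f' prefix (illustrative helper)."""
    return "f" + repr(float(value))


def decode_float(text: str) -> float:
    """Parse an 'f'-prefixed float, including inf, -inf and nan."""
    if not text.startswith("f"):
        raise ValueError(f"not a float value: {text!r}")
    return float(text[1:])


print(encode_float(float("inf")))   # finf
print(encode_float(float("-inf")))  # f-inf
print(encode_float(float("nan")))   # fnan
```

Because NaN never compares equal to itself, a round-trip check such as `result == data` fails for data containing NaN; use `math.isnan()` for that case.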

CROUT ↔ FLUX Conversion

Crous provides direct conversion between CROUT text and FLUX binary without going through Python objects:

conversion.py

import crous

data = {"name": "Alice", "scores": [98, 95, 100]}

# Python → CROUT text
crout_text = crous.to_crout(data)

# CROUT text → FLUX binary (direct, no Python intermediary)
flux_binary = crous.crout_to_flux(crout_text)

# FLUX binary → CROUT text (direct)
crout_back = crous.flux_to_crout(flux_binary)

# All representations are equivalent
assert crous.from_crout(crout_text) == crous.loads(flux_binary)
assert crous.from_crout(crout_back) == data

CROUT Format Header

Every CROUT document starts with the magic string CROUT1 followed by optional token definitions:

CROUT1                     ← magic + version
@ a=name                   ← token "a" maps to key "name"
@ c=age                    ← token "c" maps to key "age"
[{a:s5:Alice , c:i30}]    ← data using tokens
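A reader for this header layout might look like the following sketch. It is not the crous parser: `split_header` is a hypothetical helper, and it assumes one token definition per line in the form `@ <token>=<key>`, with key names that contain no newlines.

```python
MAGIC = "CROUT1"


def split_header(text: str):
    """Return (token_table, body) for a CROUT document.

    Assumes each token definition occupies one line: "@ <token>=<key>".
    """
    lines = text.split("\n")
    if lines[0].strip() != MAGIC:
        raise ValueError("missing CROUT1 magic line")
    tokens = {}
    index = 1
    while index < len(lines) and lines[index].startswith("@ "):
        token, _, key = lines[index][2:].partition("=")
        tokens[token] = key
        index += 1
    body = "\n".join(lines[index:])
    return tokens, body


doc = "CROUT1\n@ a=name\n@ c=age\n[{a:s5:Alice , c:i30}]"
tokens, body = split_header(doc)
print(tokens)  # {'a': 'name', 'c': 'age'}
print(body)    # [{a:s5:Alice , c:i30}]
```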

String Encoding

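Strings in CROUT use a length-prefix encoding: s{length}:{data}. Because a parser reads exactly the declared length after the colon instead of scanning for a terminator, the encoding is binary-safe: strings can contain any bytes, including null bytes, newlines, and other special characters.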

# String encoding examples:
# s0:        → empty string ""
# s5:hello   → "hello"
# s11:hello world → "hello world"
# s3:a\nb    → "a\nb" (newline in string, length includes it)
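The following sketch shows why a length prefix makes delimiters inside the string harmless. It assumes the length counts UTF-8 bytes, which the text above does not state explicitly (the ASCII examples are the same either way), and the helper names are hypothetical.

```python
def encode_string(value: str) -> bytes:
    """Encode a string as s{length}:{data}; length is assumed to be UTF-8 bytes."""
    raw = value.encode("utf-8")
    return b"s" + str(len(raw)).encode("ascii") + b":" + raw


def decode_string(buf: bytes, pos: int = 0):
    """Decode one string starting at pos; return (value, position after it)."""
    if buf[pos:pos + 1] != b"s":
        raise ValueError("expected 's' prefix")
    colon = buf.index(b":", pos)
    length = int(buf[pos + 1:colon].decode("ascii"))
    start = colon + 1
    end = start + length
    if end > len(buf):
        raise ValueError("string is truncated")
    # Exactly `length` bytes are consumed, so ':' or ',' inside the data is safe.
    return buf[start:end].decode("utf-8"), end


print(encode_string("hello"))          # b's5:hello'
print(decode_string(b"s5:a:b,c,i1"))   # ('a:b,c', 8)
```

The returned position lets a caller continue parsing the next value right after the string.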

Bytes Encoding

Bytes use hex encoding with a length prefix: b{length}:{hex}. The length is the number of decoded bytes (not the hex string length), so the hex payload is always twice that many characters long.

# Bytes encoding examples:
# b0:           → b""
# b3:414243     → b"ABC"  (hex for 0x41, 0x42, 0x43)
# b4:deadbeef   → b"\xde\xad\xbe\xef"
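A small sketch of this encoding using only the standard `bytes.hex()` and `bytes.fromhex()` methods. The helper names are hypothetical and not part of crous.

```python
def encode_bytes(value: bytes) -> str:
    """Encode bytes as b{length}:{hex}, where length counts decoded bytes."""
    return f"b{len(value)}:{value.hex()}"


def decode_bytes(text: str) -> bytes:
    """Decode a b{length}:{hex} value and check the declared length."""
    if not text.startswith("b"):
        raise ValueError("expected 'b' prefix")
    length_text, sep, hex_text = text[1:].partition(":")
    if not sep:
        raise ValueError("missing ':' after length")
    length = int(length_text)
    payload = bytes.fromhex(hex_text[:2 * length])  # two hex digits per byte
    if len(payload) != length:
        raise ValueError("hex payload shorter than declared length")
    return payload


print(encode_bytes(b"ABC"))         # b3:414243
print(decode_bytes("b4:deadbeef"))  # b'\xde\xad\xbe\xef'
```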
