Type System
Crous maps Python types to an internal crous_value tree, which is then encoded to one of the wire formats. This page documents every supported type, its encoding behavior, and edge cases.
Core Types
Null
Python's None maps to the CROUS NULL type (tag 0x00).
binary = crous.dumps(None) # 7 bytes (6-byte header + 1 tag byte)
crous.loads(binary) # None
Boolean
Booleans are encoded as single-byte tags: 0x01 for False, 0x02 for True. Since Python's bool is a subclass of int, Crous checks for bool first to preserve the type.
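The check order matters because isinstance(True, int) is True in Python. The following is a minimal sketch of that ordering, not Crous's actual code; classify is a hypothetical helper introduced only for illustration.

```python
def classify(value):
    """Return a type label, testing bool before int (hypothetical helper)."""
    # bool must be tested first: isinstance(True, int) is also True.
    if isinstance(value, bool):
        return "TRUE" if value else "FALSE"
    if isinstance(value, int):
        return "INT"
    raise TypeError(f"unsupported type: {type(value).__name__}")


print(classify(True))  # TRUE
print(classify(1))     # INT
```

Swapping the two isinstance checks would label True as INT, which is the type loss the bool-first check avoids.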
crous.dumps(True) # tag 0x02
crous.dumps(False) # tag 0x01
# Type preservation
assert type(crous.loads(crous.dumps(True))) is bool # not int!
Integer
Integers are stored as 64-bit signed values using zigzag encoding in FLUX format. Small integers (0 to 24 and -1 to -32) are encoded as a single tag byte.
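The tag ranges and the zigzag step can be sketched in pure Python. This is an illustration under stated assumptions, not the library's encoder: the varint is assumed to use 7-bit groups with the low group first (LEB128 style), the file header is omitted, and OverflowError stands in for CrousEncodeError. The names zigzag, varint and encode_int are hypothetical.

```python
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def zigzag(n):
    """Map a signed 64-bit integer to an unsigned one (0, -1, 1, -2 -> 0, 1, 2, 3)."""
    return (n << 1) ^ (n >> 63)


def varint(u):
    """Encode an unsigned integer in 7-bit groups, low group first (assumed layout)."""
    out = bytearray()
    while True:
        byte = u & 0x7F
        u >>= 7
        if u:
            out.append(byte | 0x80)  # continuation bit: more groups follow
        else:
            out.append(byte)
            return bytes(out)


def encode_int(n):
    """Encode one integer value, without the file header."""
    if not INT64_MIN <= n <= INT64_MAX:
        raise OverflowError("integer does not fit in 64 bits")
    if 0 <= n <= 24:
        return bytes([0x10 + n])         # POSINT: 0x10..0x28
    if -32 <= n <= -1:
        return bytes([0x29 + (-1 - n)])  # NEGINT: 0x29..0x48
    return b"\x03" + varint(zigzag(n))   # INT tag + zigzag varint


print(encode_int(24).hex())   # 28
print(encode_int(-32).hex())  # 48
```

Zigzag keeps small negative numbers short: -33 maps to 65, which fits in one varint byte under the assumed layout.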
# Small int optimization (single byte!)
crous.dumps(0) # tag 0x10 → 1 byte
crous.dumps(24) # tag 0x28 → 1 byte
crous.dumps(-1) # tag 0x29 → 1 byte
crous.dumps(-32) # tag 0x48 → 1 byte
# Larger integers use zigzag varint
crous.dumps(1000) # tag 0x03 + zigzag varint
# Full 64-bit range
crous.dumps(2**63 - 1) # max int64
crous.dumps(-(2**63)) # min int64
Integer Overflow
Integers outside the signed 64-bit range raise CrousEncodeError. Use default or a custom serializer to handle int values larger than 64 bits.
Float
Floating-point numbers are stored as 8-byte IEEE 754 doubles. Special values (NaN, Infinity, -Infinity) are preserved.
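The 8-byte layout can be reproduced with the standard struct module. A minimal sketch, assuming the big-endian byte order and FLOAT tag 0x04 listed in the Type Encoding Summary; encode_float and decode_float are hypothetical names and the file header is omitted.

```python
import math
import struct


def encode_float(x):
    """Pack a float as FLOAT tag 0x04 plus an 8-byte big-endian IEEE 754 double."""
    return b"\x04" + struct.pack(">d", x)


def decode_float(buf):
    """Inverse of encode_float for a 9-byte buffer."""
    if len(buf) != 9 or buf[0] != 0x04:
        raise ValueError("not a FLOAT value")
    return struct.unpack(">d", buf[1:])[0]


print(len(encode_float(3.14)))                               # 9
print(decode_float(encode_float(float("-inf"))))             # -inf
print(math.isnan(decode_float(encode_float(float("nan")))))  # True
```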
import math
crous.dumps(3.14) # IEEE 754 double
crous.dumps(float('inf')) # preserved
crous.dumps(float('-inf')) # preserved
crous.dumps(float('nan')) # preserved
# NaN comparison caveat
val = crous.loads(crous.dumps(float('nan')))
assert math.isnan(val) # True (but val != val)
String
Strings are stored as UTF-8 with a varint length prefix. The encoder validates UTF-8 encoding and rejects invalid sequences.
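The length prefix counts UTF-8 bytes, not characters. The sketch below illustrates this with a hypothetical encode_string helper; the varint layout (7-bit groups, low group first) is an assumption, and Python's own UnicodeEncodeError stands in for whatever error Crous raises on invalid input.

```python
def varint(u):
    """Encode an unsigned integer in 7-bit groups, low group first (assumed layout)."""
    out = bytearray()
    while u >= 0x80:
        out.append((u & 0x7F) | 0x80)
        u >>= 7
    out.append(u)
    return bytes(out)


def encode_string(text):
    """STRING tag 0x05 + varint(byte length) + UTF-8 data, without the file header."""
    data = text.encode("utf-8")  # raises UnicodeEncodeError for lone surrogates
    return b"\x05" + varint(len(data)) + data


print(encode_string("hello"))               # b'\x05\x05hello'
print(len("こんにちは".encode("utf-8")))      # 15

try:
    encode_string("\ud800")                 # lone surrogate: not valid UTF-8
except UnicodeEncodeError:
    print("rejected")
```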
crous.dumps("") # empty string (length 0)
crous.dumps("hello") # varint(5) + "hello"
crous.dumps("こんにちは") # varint(15) + UTF-8 bytes
# Full Unicode support
crous.dumps("🎉🐍💚") # emoji support
crous.dumps("مرحبا") # Arabic
crous.dumps("Привет") # Cyrillic
Bytes
Both bytes and bytearray are stored as raw byte sequences with a varint length prefix.
crous.dumps(b"\x00\x01\x02") # raw bytes
crous.dumps(bytearray([1, 2, 3])) # also works
# Round-trip always returns bytes (not bytearray)
result = crous.loads(crous.dumps(bytearray([1, 2, 3])))
assert type(result) is bytes
Container Types
List
Lists are encoded with a varint count followed by each element.
crous.dumps([1, 2, 3]) # varint(3) + elements
crous.dumps([]) # empty list (count 0)
crous.dumps([1, "two", 3.0]) # mixed types OK
crous.dumps([[1, 2], [3, 4]]) # nested lists
Tuple
Tuples have their own type tag (TUPLE), distinct from lists. Type is preserved on round-trip.
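A toy encoder and decoder shows why a separate tag is enough to preserve the type. This sketch uses the LIST (0x07) and TUPLE (0x0A) tag bytes from the Type Encoding Summary but is otherwise hypothetical: it handles only integers 0 to 24 and element counts below 128, so every count fits in one byte.

```python
LIST, TUPLE = 0x07, 0x0A


def encode(value):
    """Encode nested lists and tuples of integers 0..24 (counts below 128 only)."""
    if isinstance(value, (list, tuple)):
        if len(value) >= 128:
            raise ValueError("sketch handles single-byte counts only")
        tag = LIST if isinstance(value, list) else TUPLE
        return bytes([tag, len(value)]) + b"".join(encode(v) for v in value)
    if isinstance(value, int) and not isinstance(value, bool) and 0 <= value <= 24:
        return bytes([0x10 + value])
    raise TypeError("sketch supports small ints, lists and tuples only")


def decode(buf, pos=0):
    """Return (value, next_position)."""
    tag = buf[pos]
    if tag in (LIST, TUPLE):
        count, pos = buf[pos + 1], pos + 2
        items = []
        for _ in range(count):
            item, pos = decode(buf, pos)
            items.append(item)
        return (items if tag == LIST else tuple(items)), pos
    if 0x10 <= tag <= 0x28:
        return tag - 0x10, pos + 1
    raise ValueError(f"unexpected tag 0x{tag:02x}")


value, _ = decode(encode([1, (2, 3)]))
print(value)  # [1, (2, 3)]
```

The two containers share the same body layout; only the leading tag byte differs.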
# Tuples are NOT lists!
data_list = [1, 2, 3]
data_tuple = (1, 2, 3)
result_list = crous.loads(crous.dumps(data_list))
result_tuple = crous.loads(crous.dumps(data_tuple))
assert type(result_list) is list # ✓
assert type(result_tuple) is tuple # ✓ (preserved!)
Dictionary
Dictionaries are encoded with a varint count, then each key-value pair. Keys must be strings.
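One way to work with the string-key requirement is to convert keys before encoding. The helper below is a hypothetical pre-processing step, not part of Crous, and the conversion is lossy: the original key types are not recovered on decode, so it refuses to continue when two keys collapse to the same string.

```python
def stringify_keys(obj):
    """Recursively convert dict keys to str (hypothetical pre-processing helper)."""
    if isinstance(obj, dict):
        out = {}
        for key, val in obj.items():
            skey = key if isinstance(key, str) else str(key)
            if skey in out:
                raise ValueError(f"key collision after conversion: {skey!r}")
            out[skey] = stringify_keys(val)
        return out
    if isinstance(obj, (list, tuple)):
        # Rebuild the same container type around the converted elements.
        return type(obj)(stringify_keys(v) for v in obj)
    return obj


print(stringify_keys({1: "value", "nested": {2.5: [3]}}))
# {'1': 'value', 'nested': {'2.5': [3]}}
```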
crous.dumps({"a": 1, "b": 2}) # varint(2) + pairs
crous.dumps({}) # empty dict
# Keys MUST be strings
try:
    crous.dumps({1: "value"})
except crous.CrousEncodeError:
    print("Integer keys not supported!")
String Keys Only
Non-string dictionary keys raise CrousEncodeError.
Extended Types
Set
Sets are encoded as tagged values with tag 90, wrapping a list of the set's elements. On decode, the list is automatically reconstructed as a set.
data = {1, 2, 3, "four"}
binary = crous.dumps(data)
result = crous.loads(binary)
assert type(result) is set # ✓
assert result == data # ✓
Frozenset
Frozensets use tag 91 and are reconstructed as frozenset on decode.
data = frozenset([1, 2, 3])
result = crous.loads(crous.dumps(data))
assert type(result) is frozenset # ✓
Tagged Values
Tagged values wrap any value with a numeric tag. Tags 90 and 91 are reserved for set/frozenset. Tags 100+ are used by the custom serializer registry.
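The wrap-and-reconstruct idea can be modelled in memory without any wire format. In this sketch the Tagged class and the wrap and unwrap functions are hypothetical names chosen for illustration; only the tag numbers 90 and 91 come from this page.

```python
from dataclasses import dataclass
from typing import Any

SET_TAG, FROZENSET_TAG = 90, 91


@dataclass(frozen=True)
class Tagged:
    """Hypothetical in-memory form of a tagged value: numeric tag plus payload."""
    tag: int
    value: Any


def wrap(obj):
    """Wrap sets and frozensets as a tagged list; pass everything else through."""
    if isinstance(obj, frozenset):
        return Tagged(FROZENSET_TAG, list(obj))
    if isinstance(obj, set):
        return Tagged(SET_TAG, list(obj))
    return obj


def unwrap(obj):
    """Rebuild the original container from a tagged list."""
    if isinstance(obj, Tagged):
        if obj.tag == SET_TAG:
            return set(obj.value)
        if obj.tag == FROZENSET_TAG:
            return frozenset(obj.value)
        raise ValueError(f"no decoder for tag {obj.tag}")
    return obj


restored = unwrap(wrap(frozenset([1, 2, 3])))
print(type(restored).__name__, restored == frozenset([1, 2, 3]))  # frozenset True
```

A custom serializer registry would presumably extend the same dispatch with tags from 100 upward, mapping each tag to a decode function.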
Built-in Tag Assignments
| Tag | Type | Description |
|---|---|---|
| 80 | datetime | Named tag (parser only) |
| 81 | date | Named tag (parser only) |
| 82 | time | Named tag (parser only) |
| 83 | timedelta | Named tag (parser only) |
| 84 | decimal | Named tag (parser only) |
| 90 | set | Built-in set encoding |
| 91 | frozenset | Built-in frozenset encoding |
| 92 | complex | Named tag (parser only) |
| 100+ | Custom | Auto-assigned by register_serializer |
Type Encoding Summary
| Tag Byte | Type | Encoding |
|---|---|---|
| 0x00 | NULL | 1 byte |
| 0x01 | FALSE | 1 byte |
| 0x02 | TRUE | 1 byte |
| 0x03 | INT | 1 + zigzag varint |
| 0x04 | FLOAT | 1 + 8 bytes (big-endian) |
| 0x05 | STRING | 1 + varint(len) + data |
| 0x06 | BYTES | 1 + varint(len) + data |
| 0x07 | LIST | 1 + varint(count) + elements |
| 0x08 | DICT | 1 + varint(count) + pairs |
| 0x09 | TAGGED | 1 + varint(tag) + value |
| 0x0A | TUPLE | 1 + varint(count) + elements |
| 0x10–0x28 | POSINT | 1 byte (integers 0–24) |
| 0x29–0x48 | NEGINT | 1 byte (integers -1 to -32) |
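When inspecting encoded data by hand, the table can be turned into a small lookup. The describe_tag function below is a hypothetical helper that mirrors the rows above; it treats any byte not listed in the table as an error, which is an assumption about how unlisted values behave.

```python
FIXED_TAGS = {
    0x00: "NULL", 0x01: "FALSE", 0x02: "TRUE", 0x03: "INT", 0x04: "FLOAT",
    0x05: "STRING", 0x06: "BYTES", 0x07: "LIST", 0x08: "DICT",
    0x09: "TAGGED", 0x0A: "TUPLE",
}


def describe_tag(tag):
    """Return (type_name, inline_value); inline_value is None unless the tag
    byte itself carries the integer."""
    if tag in FIXED_TAGS:
        return FIXED_TAGS[tag], None
    if 0x10 <= tag <= 0x28:
        return "POSINT", tag - 0x10           # 0..24
    if 0x29 <= tag <= 0x48:
        return "NEGINT", -1 - (tag - 0x29)    # -1..-32
    raise ValueError(f"tag byte 0x{tag:02x} is not listed in the table")


print(describe_tag(0x28))  # ('POSINT', 24)
print(describe_tag(0x48))  # ('NEGINT', -32)
print(describe_tag(0x0A))  # ('TUPLE', None)
```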
Nesting Limits
Crous enforces a maximum nesting depth of 256 levels to prevent stack overflow. Attempting to encode deeper structures raises CrousEncodeError.
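Callers who build deeply nested data can measure depth before encoding. The sketch below is a hypothetical pre-check, not a Crous function; it assumes that a top-level container counts as level 1, which may differ from how the library counts. It walks the structure iteratively so that the check itself cannot hit Python's recursion limit.

```python
MAX_DEPTH = 256  # limit stated in this section


def nesting_depth(obj):
    """Return the container nesting depth of obj (a scalar has depth 0)."""
    deepest = 0
    stack = [(obj, 0)]
    while stack:
        item, depth = stack.pop()
        if isinstance(item, dict):
            children = item.values()
        elif isinstance(item, (list, tuple, set, frozenset)):
            children = item
        else:
            deepest = max(deepest, depth)
            continue
        deepest = max(deepest, depth + 1)
        stack.extend((child, depth + 1) for child in children)
    return deepest


nested = []
for _ in range(300):
    nested = [nested]  # wrap one level deeper each iteration

print(nesting_depth([[1, 2], [3, 4]]))    # 2
print(nesting_depth(nested) > MAX_DEPTH)  # True
```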
Size Limits
Individual strings and byte sequences are limited to 64 MB (67,108,864 bytes). This prevents memory exhaustion from malicious or corrupted data.
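A size pre-check is straightforward to write on the caller's side. In this sketch, fits_size_limit is a hypothetical helper; it assumes the limit applies to the UTF-8 byte length of a string and that a value of exactly 64 MB is still accepted, neither of which is confirmed here.

```python
MAX_BLOB_BYTES = 64 * 1024 * 1024  # 67,108,864 bytes


def fits_size_limit(value, limit=MAX_BLOB_BYTES):
    """Check a str (by UTF-8 length) or bytes-like value against a byte limit."""
    if isinstance(value, str):
        size = len(value.encode("utf-8"))
    elif isinstance(value, (bytes, bytearray)):
        size = len(value)
    else:
        raise TypeError("expected str, bytes or bytearray")
    return size <= limit


print(MAX_BLOB_BYTES)                # 67108864
print(fits_size_limit("こんにちは"))  # True
```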