Serialization Model
This section explains how MIND data is actually stored and moved around.
JSON and FlatBuffers
MIND uses two encodings:
- JSON → easy to read, debug, and prototype with.
- FlatBuffers → fast, binary, great for real-time systems (XR, robotics, agents).
General rule of thumb:
- Use JSON when you are:
  - designing schemas,
  - building tools,
  - doing offline analysis in Python, etc.
- Use FlatBuffers when you need:
  - low-latency streaming,
  - high-frequency data,
  - on-device agents.
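To make the JSON path concrete, here is a minimal sketch of loading a JSON-encoded Container for offline analysis in Python. The filename and the "streams"/"id"/"schema" field names are illustrative assumptions, not part of the spec.

import json

# Hypothetical filename; field names below are illustrative, not normative.
with open("session1.mind.json") as f:
    container = json.load(f)

# Walk whatever streams the Container declares.
for stream in container.get("streams", []):
    print(stream.get("id"), stream.get("schema"))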
Containers & External Files
A MIND Container is the “main file” that describes everything.
It can reference external files, like:
- session1_rgb.mp4 (camera video),
- session1_eeg.raw (EEG),
- session1_depth.bin (depth frames).
The Container holds the structure; external files hold the heavy data.
This is similar to:
- glTF + external buffers,
- BIDS datasets with separate modality files.
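A minimal sketch of this split, written as a Python dict for illustration (every field name here is an assumption, not normative):

container = {
    # Lightweight structure lives in the Container itself.
    "streams": [
        {"id": "pose_main", "schema": "MIND.pose/SkeletalPose@1.0.0"},
    ],
    # Heavy data is referenced, not embedded.
    "resources": [
        {"uri": "media/session1_rgb.mp4", "media_type": "video/mp4", "role": "cv_rgb_video"},
        {"uri": "media/session1_eeg.raw", "media_type": "application/octet-stream", "role": "eeg_raw"},
    ],
}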
URIs for Resources
External data is referenced by URIs:
{
"uri": "media/session1_rgb.mp4",
"media_type": "video/mp4",
"role": "cv_rgb_video"
}
This lets you store big assets wherever you want:
- local disk,
- cloud buckets,
- mounted drives.
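One reasonable way to resolve these references, sketched in Python: relative URIs are interpreted against the Container's own location, while absolute URIs (s3://, https://, file://) pass through untouched. The function and its behaviour are assumptions, not part of the spec.

from pathlib import Path
from urllib.parse import urlparse

def resolve_resource(container_path: str, uri: str) -> str:
    # Absolute URIs (s3://bucket/..., https://..., file://...) are returned as-is.
    if urlparse(uri).scheme:
        return uri
    # Relative URIs are resolved against the directory containing the Container.
    return str(Path(container_path).parent / uri)

# e.g. resolve_resource("/data/session1.mind.json", "media/session1_rgb.mp4")
# -> "/data/media/session1_rgb.mp4"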
How Schemas Are Found
Modalities, events, and metadata each have IDs like:
- MIND.pose/SkeletalPose@1.0.0
- MIND.cv/Pose2D@1.0.0
- MIND.skeleton/HumanDefault@1.0.0
The registry and optional schema_table tell you:
- which schema to use,
- what version it is,
- how to validate it.
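A sketch of how an ID like MIND.pose/SkeletalPose@1.0.0 might be split apart and matched against a registry; the registry shape shown here is an assumption for illustration.

def parse_schema_id(schema_id: str):
    # "MIND.pose/SkeletalPose@1.0.0" -> ("MIND.pose", "SkeletalPose", "1.0.0")
    namespace, rest = schema_id.split("/", 1)
    name, version = rest.split("@", 1)
    return namespace, name, version

# Hypothetical registry mapping full IDs to schema documents
# (in practice this would come from the registry and/or the schema_table).
registry = {
    "MIND.pose/SkeletalPose@1.0.0": {"type": "object"},  # placeholder schema
}

namespace, name, version = parse_schema_id("MIND.pose/SkeletalPose@1.0.0")
schema = registry[f"{namespace}/{name}@{version}"]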
Streaming & Appendability
You can write a MIND Container while a session is running.
Think:
- a robot performing a task,
- a human in XR,
- a subject in a biosensing experiment.
The Container is:
- “open” while logging,
- “finalized” after recording ends and the index is written.
This supports real-time training pipelines and online agents.
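A sketch of the open/finalize lifecycle, assuming an append-only JSON-lines log for records and a separate index written at the end; both choices (and all filenames and field names) are illustrative, not mandated by the format.

import json

# While the session is running: append one record per line as data arrives.
with open("session1.records.jsonl", "a") as log:
    record = {"t": 12.5, "stream": "pose_main", "data": [0.1, 0.2, 0.3]}
    log.write(json.dumps(record) + "\n")

# After recording ends: write the index and treat the Container as finalized.
index = {"pose_main": {"t_start": 0.0, "t_end": 312.7, "count": 9381}}
with open("session1.index.json", "w") as f:
    json.dump(index, f)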
Compression
Data can be compressed internally, like:
"compression": {
"codec": "zstd",
"block_size": 65536
}
This is optional, but very helpful for:
- long EEG sessions,
- high-res CV data,
- long-term archive storage.
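For example, compressing one block with zstd in Python might look like the sketch below (using the third-party zstandard package; the block handling around it is an assumption).

import zstandard as zstd

BLOCK_SIZE = 65536  # matches the "block_size" declared in the Container

compressor = zstd.ZstdCompressor()
decompressor = zstd.ZstdDecompressor()

raw_block = b"\x00" * BLOCK_SIZE          # e.g. one block of EEG samples
compressed = compressor.compress(raw_block)
restored = decompressor.decompress(compressed)
assert restored == raw_block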
JSON vs FlatBuffers in Practice
- JSON:
  - Good for research,
  - Great for open datasets,
  - Easy to parse with standard libraries.
- FlatBuffers:
  - Ideal for XR runtimes,
  - Robotics,
  - On-device learning/inference,
  - Servers with high-throughput requirements.
Most ecosystems will use both.
Indexing
The index is like a table of contents for the Container:
- “Where is pose stream X between t=10s and t=20s?”
- “Where are the events for episode 3?”
- “How long is the recording?”
It’s optional, but highly recommended for:
- training pipelines,
- interactive tools,
- server-side analytics.
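A sketch of how such an index could answer the "where is pose stream X between t=10s and t=20s" question, assuming the index stores per-stream chunk entries of (t_start, t_end, byte_offset, byte_length); this layout is illustrative, not prescribed by the format.

# Hypothetical per-stream index: chunks of (t_start, t_end, byte_offset, byte_length).
index = {
    "pose_main": [
        (0.0, 10.0, 0, 48_000),
        (10.0, 20.0, 48_000, 48_000),
        (20.0, 30.0, 96_000, 48_000),
    ],
}

def chunks_in_range(index, stream_id, t0, t1):
    # Return every chunk of the stream that overlaps [t0, t1).
    return [c for c in index.get(stream_id, []) if c[0] < t1 and c[1] > t0]

print(chunks_in_range(index, "pose_main", 10.0, 20.0))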
Summary
Serialization in MIND is designed to support:
- real-time control & streaming agents,
- high-volume training & analysis,
- multimodal datasets with big external files,
- incremental logging & online operation,
- long-term archival of complex experiments.