Serialization Model

This section explains how MIND data is stored on disk and moved between systems.


JSON and FlatBuffers

MIND uses two encodings:

  • JSON → easy to read, debug, and prototype with.
  • FlatBuffers → fast, binary, great for real-time systems (XR, robotics, agents).

General rule of thumb:

  • Use JSON when you are:
    • designing schemas,
    • building tools,
    • doing offline analysis in Python, etc.
  • Use FlatBuffers when you need:
    • low-latency streaming,
    • high-frequency data,
    • on-device agents.

Containers & External Files

A MIND Container is the “main file”: it describes the overall structure of a recording and references everything that belongs to it.

It can reference external files, like:

  • session1_rgb.mp4 (camera video),
  • session1_eeg.raw (EEG),
  • session1_depth.bin (depth frames).

The Container holds the structure; external files hold the heavy data.

This is similar to:

  • glTF + external buffers,
  • BIDS datasets with separate modality files.

URIs for Resources

External data is referenced by URIs:

{
  "uri": "media/session1_rgb.mp4",
  "media_type": "video/mp4",
  "role": "cv_rgb_video"
}

This lets you store big assets wherever you want:

  • local disk,
  • cloud buckets,
  • mounted drives.
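
A minimal sketch of how a reader might turn such a reference into a concrete location. It assumes that relative URIs are resolved against the directory holding the Container and that URIs with a scheme (for example `s3://` or `https://`) are passed through unchanged; neither rule is stated above. The function name `resolve_resource` and the file name `session1.mind.json` are hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def resolve_resource(container_path: str, uri: str) -> str:
    """Resolve a resource URI relative to the Container that references it.

    Assumption: relative URIs are relative to the Container's directory.
    """
    scheme = urlparse(uri).scheme
    # A one-letter "scheme" is more likely a Windows drive letter than a URI.
    if len(scheme) > 1:
        return uri  # s3://, https://, file://, ...: leave untouched
    return str(PurePosixPath(container_path).parent / uri)


# The resource entry from the example above.
resource = {
    "uri": "media/session1_rgb.mp4",
    "media_type": "video/mp4",
    "role": "cv_rgb_video",
}
local = resolve_resource("/data/session1/session1.mind.json", resource["uri"])
remote = resolve_resource("/data/session1/session1.mind.json", "s3://bucket/session1_eeg.raw")
```

Here `local` points into the `media/` folder next to the Container, while `remote` is returned as-is for a cloud client to fetch.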

How Schemas Are Found

Modalities, events, and metadata each have versioned schema IDs like:

  • MIND.pose/SkeletalPose@1.0.0
  • MIND.cv/Pose2D@1.0.0
  • MIND.skeleton/HumanDefault@1.0.0

The registry and optional schema_table tell you:

  • which schema to use,
  • what version it is,
  • how to validate it.
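
To make the ID format concrete, here is a small parsing sketch. The grammar (`namespace/Name@major.minor.patch`) is inferred from the three example IDs above, not taken from a formal definition, and `SchemaId` and `parse_schema_id` are hypothetical names.

```python
import re
from typing import NamedTuple


class SchemaId(NamedTuple):
    namespace: str                  # e.g. "MIND.pose"
    name: str                       # e.g. "SkeletalPose"
    version: tuple[int, int, int]   # (major, minor, patch)


# Assumed shape: <namespace>/<Name>@<major>.<minor>.<patch>
_SCHEMA_ID = re.compile(
    r"^(?P<ns>[A-Za-z0-9_.]+)/(?P<name>[A-Za-z0-9_]+)@(?P<ver>\d+\.\d+\.\d+)$"
)


def parse_schema_id(text: str) -> SchemaId:
    """Split a schema ID into namespace, name, and numeric version."""
    match = _SCHEMA_ID.match(text)
    if match is None:
        raise ValueError(f"not a schema ID: {text!r}")
    major, minor, patch = (int(part) for part in match["ver"].split("."))
    return SchemaId(match["ns"], match["name"], (major, minor, patch))


pose = parse_schema_id("MIND.pose/SkeletalPose@1.0.0")
```

A parsed ID like `pose` could then serve as the lookup key into the registry or a schema_table; how that lookup works is not covered here.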

Streaming & Appendability

You can write a MIND Container while a session is running.

Think:

  • a robot performing a task,
  • a human in XR,
  • a subject in a biosensing experiment.

The Container is:

  • “open” while logging,
  • “finalized” after recording ends and the index is written.

This supports real-time training pipelines and online agents.
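
The open/finalized lifecycle can be sketched as an append-only log that writes its index last. This is an illustration only: the JSON Lines layout, the `SessionLog` class, and the field names `t`, `entries`, and `duration` are assumptions, not the actual MIND on-disk format.

```python
import io
import json


class SessionLog:
    """Append-only record log that writes an index when finalized."""

    def __init__(self, stream):
        self._stream = stream
        self._entries = []      # (timestamp, byte offset) per record
        self._pos = 0
        self.finalized = False

    def append(self, t: float, record: dict) -> None:
        """Write one timestamped record while the log is still open."""
        if self.finalized:
            raise RuntimeError("log is finalized; no further appends")
        line = (json.dumps({"t": t, **record}) + "\n").encode("utf-8")
        self._entries.append((t, self._pos))
        self._stream.write(line)
        self._pos += len(line)

    def finalize(self) -> None:
        """Write the index as the last line and close the log for writing."""
        times = [t for t, _ in self._entries]
        index = {
            "entries": self._entries,
            "duration": (times[-1] - times[0]) if times else 0.0,
        }
        self._stream.write((json.dumps(index) + "\n").encode("utf-8"))
        self.finalized = True


buf = io.BytesIO()
log = SessionLog(buf)
log.append(0.0, {"stream": "pose", "value": 1})
log.append(0.5, {"stream": "pose", "value": 2})
log.finalize()
```

Writing the index last keeps every append cheap; under this layout, a consumer that reads the log before finalization simply has to scan records sequentially.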


Compression

Data inside a Container can be compressed, for example:

"compression": {
  "codec": "zstd",
  "block_size": 65536
}

This is optional, but very helpful for:

  • long EEG sessions,
  • high-res CV data,
  • long-term archive storage.
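
A rough sketch of what block-wise compression means in practice. It uses `zlib` from the Python standard library as a stand-in for zstd so that it runs without extra packages, and it assumes `block_size` counts uncompressed bytes per block; the text above does not specify this. The function names are hypothetical.

```python
import zlib


def compress_blocks(data: bytes, block_size: int = 65536) -> list[bytes]:
    """Compress `data` in independent blocks of `block_size` raw bytes.

    zlib is a stand-in here; a real writer would use the configured codec.
    """
    return [
        zlib.compress(data[start:start + block_size])
        for start in range(0, len(data), block_size)
    ]


def decompress_blocks(blocks: list[bytes]) -> bytes:
    """Reverse compress_blocks by decompressing and joining every block."""
    return b"".join(zlib.decompress(block) for block in blocks)


raw = bytes(200_000)               # placeholder payload, e.g. raw EEG samples
blocks = compress_blocks(raw)      # 200000 bytes -> 4 blocks of at most 64 KiB
restored = decompress_blocks(blocks)
```

Because each block is compressed on its own, a reader can decompress only the blocks it needs instead of the whole stream.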

JSON vs FlatBuffers in Practice

  • JSON:
    • good for research,
    • great for open datasets,
    • easy to parse with standard libraries.
  • FlatBuffers is a good fit for:
    • XR runtimes,
    • robotics,
    • on-device learning/inference,
    • servers with high-throughput requirements.

Most ecosystems will use both.


Indexing

The index is like a table of contents for the Container:

  • “Where is pose stream X between t=10s and t=20s?”
  • “Where are the events for episode 3?”
  • “How long is the recording?”

It’s optional, but highly recommended for:

  • training pipelines,
  • interactive tools,
  • server-side analytics.
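
A time-range question like the first one above can be answered with a binary search over a sorted index. The sketch below assumes a hypothetical index of `(start_time, byte_offset)` entries sorted by start time, where each chunk covers the span up to the next entry; the real index layout may differ.

```python
from bisect import bisect_right


def chunks_in_range(index, t_start, t_end):
    """Return the index entries whose chunks overlap [t_start, t_end].

    `index` is a list of (start_time, byte_offset) tuples sorted by start_time.
    """
    starts = [entry[0] for entry in index]
    # Last chunk that begins at or before t_start (clamped to the first chunk).
    first = max(bisect_right(starts, t_start) - 1, 0)
    # One past the last chunk that begins at or before t_end.
    last = bisect_right(starts, t_end)
    return index[first:last]


# Hypothetical index for one pose stream: a chunk every 5 seconds.
pose_index = [(0.0, 0), (5.0, 4096), (10.0, 8192),
              (15.0, 12288), (20.0, 16384), (25.0, 20480)]
hits = chunks_in_range(pose_index, 10.0, 20.0)
```

With such an index, a reader seeks straight to the first matching byte offset instead of scanning the Container from the start.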

Summary

Serialization in MIND is designed to support:

  • real-time control & streaming agents,
  • high-volume training & analysis,
  • multimodal datasets with big external files,
  • incremental logging & online operation,
  • long-term archival of complex experiments.