Serialization Model

This section is normative.

The Serialization Model defines how MIND data is encoded, stored, and exchanged. It supports both real-time and offline workflows for XR, robotics, CV, biosensing, mocap, and embodied AI.

Canonical Encodings

MIND defines two canonical encodings:

JSON (text-based)
FlatBuffers (binary)

Both encodings are normative.

JSON is REQUIRED for read and write.
FlatBuffers is REQUIRED for read, and RECOMMENDED for write, especially in real-time systems.

All normative fields and structures defined in this specification MUST have equivalent representations in both encodings.

Containers and External Resources

A Container is a logical unit representing a recording or dataset.

A Container MAY be serialized as:

a JSON document
a FlatBuffers binary
or both, in parallel

The Container file MUST be able to reference external resources, including:

video files
audio files
raw EEG/Bio data blocks
image sequences
point cloud or depth sequences

The Container MUST remain the authoritative descriptor of:

streams,
events,
metadata,
indices,
resource references.

External Resource References

External resources MUST be referenced via URI-like structures, for example:

{
  "uri": "media/session1_rgb.mp4",
  "media_type": "video/mp4",
  "role": "cv_rgb_video",
  "description": "Primary RGB video for CV pipeline"
}

Rules:

URIs MAY be absolute or relative.
Relative URIs MUST be interpreted relative to the Container’s location, unless otherwise specified.
media_type MUST be a valid MIME type string.
role SHOULD indicate the semantic role (e.g., cv_rgb_video, bio_eeg_raw).
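
The rules above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `resolve_resource`, the `container_uri` parameter, and the loose MIME pattern are assumptions made for this example, and the pattern only checks the `type/subtype` shape rather than the full media type grammar.

```python
import re
from urllib.parse import urljoin, urlparse

# Loose "type/subtype" check (an assumption; not the complete MIME grammar).
_MIME_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9!#$&^_.+-]*/[A-Za-z0-9][A-Za-z0-9!#$&^_.+-]*$")


def resolve_resource(ref: dict, container_uri: str) -> str:
    """Return the absolute URI of an external resource reference.

    `container_uri` is the location of the Container file itself
    (hypothetical parameter; the specification does not name it).
    """
    media_type = ref.get("media_type", "")
    if not _MIME_RE.match(media_type):
        raise ValueError(f"invalid media_type: {media_type!r}")

    uri = ref["uri"]
    if urlparse(uri).scheme:
        # Absolute URI: use as-is.
        return uri
    # Relative URI: interpret relative to the Container's location.
    return urljoin(container_uri, uri)
```

For instance, with a Container at the hypothetical location `https://example.org/data/session1/container.json`, the reference shown earlier would resolve to a path under `https://example.org/data/session1/media/`.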

Schema Binding and Lookup

Each stream, event, and metadata object MUST declare identifiers such as:

modality (for streams)
event_type (for events)
metadata_id (for metadata)

These identifiers MUST follow the namespaced, versioned identifier rules defined in Sections 5 and 7.

A decoder MUST be able to resolve these identifiers to concrete schemas via:

A central registry (e.g., MIND-Registry/modality_registry.json, MIND-Registry/metadata_registry.json), and/or
An optional schema_table within the Container, providing local mappings.

A schema_table, if present, MUST map identifiers to:

schema version
schema hash or fingerprint
encoding type (JSON, FlatBuffers)
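
One possible lookup procedure is sketched below in Python. The key names `version`, `hash`, and `encoding`, the `SchemaBinding` type, and the choice to consult the local schema_table before the central registry are all assumptions of this example; the specification only requires that identifiers be resolvable through one or both sources.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SchemaBinding:
    version: str
    fingerprint: str
    encoding: str  # "JSON" or "FlatBuffers"


def resolve_schema(identifier: str, registry: dict,
                   schema_table: Optional[dict] = None) -> SchemaBinding:
    """Resolve an identifier to a schema binding.

    Assumed precedence: the Container's schema_table (if any) first,
    then the central registry.
    """
    for source in (schema_table or {}, registry):
        entry = source.get(identifier)
        if entry is not None:
            return SchemaBinding(entry["version"], entry["hash"], entry["encoding"])
    raise LookupError(f"no schema found for identifier {identifier!r}")
```

A decoder that hits the `LookupError` path can still treat the object as an unknown modality, event, or metadata type and skip it, which keeps the lookup forward-compatible.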

Streaming and Appendability

Containers MUST support streaming and appendable behavior.

A Container is in exactly one of two states:

open (in-progress)
finalized (complete, with index)

While open:

streams MAY be appended with new samples,
events MAY be added,
metadata MAY be added or updated, subject to rules in Section 7.

When finalized:

no further modifications are allowed,
an index (Section 8.9) SHOULD be written.

Streaming Containers MUST remain valid partial Containers at all intermediate points, subject to relaxed indexing requirements.
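
The open and finalized states can be modeled as a small state machine. The Python sketch below is illustrative only: the class `ContainerWriter`, its method names, and the index layout (per-stream sample counts) are hypothetical and are not defined by this specification.

```python
class ContainerWriter:
    """Toy in-memory Container with open and finalized states."""

    def __init__(self):
        self.state = "open"
        self.streams = {}   # stream id -> list of (timestamp, value)
        self.events = []
        self.index = None   # written only at finalization

    def _require_open(self):
        if self.state != "open":
            raise RuntimeError("Container is finalized; no further modifications are allowed")

    def append_sample(self, stream_id, timestamp, value):
        self._require_open()
        self.streams.setdefault(stream_id, []).append((timestamp, value))

    def add_event(self, event):
        self._require_open()
        self.events.append(event)

    def snapshot(self):
        """Return a partial Container view that is usable while still open."""
        return {"state": self.state, "streams": self.streams,
                "events": self.events, "index": self.index}

    def finalize(self):
        self._require_open()
        # Hypothetical index layout: per-stream sample counts only.
        self.index = {sid: len(samples) for sid, samples in self.streams.items()}
        self.state = "finalized"
```

While open, `snapshot()` returns a Container view without an index, which mirrors the relaxed indexing requirement; after `finalize()`, every mutating call raises.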

Compression

MIND Containers MAY use internal compression for large data blocks.

Compression MUST be declared explicitly, for example:

{
  "compression": {
    "codec": "zstd",
    "block_size": 65536
  }
}

Rules:

Supported codecs SHOULD be documented in the registry.
Tools MUST NOT assume any particular compression codec.
A decoder that cannot handle a given codec MUST fail gracefully (for example, by reporting the unsupported codec rather than crashing or returning corrupt data).

External resources MAY also be compressed using any suitable means, as indicated by their MIME types.
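
A decoder might dispatch on the declared codec as in the Python sketch below. The names `decode_block` and `UnsupportedCodecError` are hypothetical, and the codec names `none` and `zlib` are assumptions chosen because zlib ships with the Python standard library; a real tool would register whichever codecs the registry documents, such as `zstd`.

```python
import zlib
from typing import Optional


class UnsupportedCodecError(Exception):
    """Raised when a block declares a codec this decoder does not support."""


# Codec name -> decompression function. Names are assumptions of this sketch.
_DECODERS = {
    "none": lambda data: data,
    "zlib": zlib.decompress,
}


def decode_block(block: bytes, compression: Optional[dict]) -> bytes:
    """Decompress a data block according to its explicit declaration."""
    # No declaration means the block is stored uncompressed.
    codec = (compression or {}).get("codec", "none")
    decoder = _DECODERS.get(codec)
    if decoder is None:
        # Fail gracefully: name the codec instead of guessing or crashing.
        raise UnsupportedCodecError(f"unsupported compression codec: {codec!r}")
    return decoder(block)
```

Raising a dedicated exception lets callers skip or report the affected block while continuing to read the rest of the Container.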

FlatBuffers Encoding Rules

FlatBuffers-encoded MIND Containers MUST:

Use Little-Endian byte order.
Follow the .fbs schemas defined under MIND-Schemas/flatbuffers/.
Represent all normative fields defined in this specification.

Tools MUST validate FlatBuffers payloads against the relevant .fbs schemas.
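
As a small illustration of the byte-order rule, the Python sketch below reads the first bytes of a FlatBuffers payload. It relies only on the general FlatBuffers layout (a 32-bit little-endian offset to the root table at byte 0, optionally followed by a 4-byte file identifier); the function name is hypothetical, and this pre-check does not replace validation against the .fbs schemas.

```python
import struct


def read_fb_header(buf: bytes):
    """Return (root_offset, bytes_4_to_8) for a FlatBuffers payload.

    Bytes 4..8 hold the file identifier only when the schema declares one.
    """
    if len(buf) < 8:
        raise ValueError("buffer too short to be a FlatBuffers payload")
    # "<I" = unsigned 32-bit integer, little-endian.
    (root_offset,) = struct.unpack_from("<I", buf, 0)
    if root_offset < 4 or root_offset >= len(buf):
        raise ValueError(f"root offset {root_offset} is outside the buffer")
    return root_offset, buf[4:8]
```

A payload written with the wrong byte order typically yields a root offset far beyond the buffer length, so even this cheap check catches it before full schema validation runs.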

JSON Encoding Rules

JSON-encoded MIND Containers MUST:

Use UTF-8 encoding.
Represent all required fields and structures as specified in Sections 3–7.
Use canonical field names as defined in the schemas.

Unknown fields MUST be ignored unless a schema or extension marks them as required.
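
The unknown-field rule could be applied at load time roughly as follows. In this Python sketch the function name `load_container` and the parameters `known_fields` and `required_fields` are hypothetical; how a schema or extension marks a field as required is not specified here, so the caller supplies that set.

```python
import json


def load_container(raw: bytes, known_fields, required_fields=frozenset()) -> dict:
    """Parse a JSON Container, dropping unknown top-level fields.

    `required_fields` lists fields that a schema or extension marks as
    required; encountering one that this reader does not know is an error.
    """
    # Strict UTF-8 decoding: invalid bytes raise UnicodeDecodeError.
    doc = json.loads(raw.decode("utf-8"))
    if not isinstance(doc, dict):
        raise ValueError("a JSON Container must be a JSON object")

    unknown = set(doc) - set(known_fields)
    blocked = unknown & set(required_fields)
    if blocked:
        raise ValueError(f"unsupported required fields: {sorted(blocked)}")

    # Keep only the fields this reader understands; ignore the rest.
    return {key: value for key, value in doc.items() if key in known_fields}
```

The same filtering would apply recursively to nested objects in a fuller implementation; only the top level is handled here to keep the sketch short.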

Indexing and Random Access

Containers MAY include an index structure to enable efficient querying and random access.

The index MAY contain:

per-stream sample offsets (time → byte offset / array index),
per-event offsets,
per-metadata references,
summary statistics (duration, sampling rates, etc.).

If present, the index MUST be consistent with the Container’s content. A Container whose index does not match its content MUST be treated as invalid.

ASCII sketch:

Container
 ├── header
 ├── metadata[]
 ├── streams[]
 ├── events[]
 └── index   ← optional
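
A time-to-offset lookup and a consistency check might look like the Python sketch below. It assumes a hypothetical per-stream index made of two parallel lists, sorted `times` and matching `offsets` (array indices into the stream's samples), and it allows the index to be sparse; the actual index layout is defined by the schemas, not by this example.

```python
from bisect import bisect_right
from typing import Optional, Sequence


def lookup_offset(times: Sequence[float], offsets: Sequence[int],
                  t: float) -> Optional[int]:
    """Return the offset of the last indexed sample at or before time t."""
    i = bisect_right(times, t) - 1
    return offsets[i] if i >= 0 else None


def index_matches(times: Sequence[float], offsets: Sequence[int],
                  sample_times: Sequence[float]) -> bool:
    """Check that every index entry points at a sample with that timestamp."""
    if len(times) != len(offsets):
        return False
    return all(
        0 <= off < len(sample_times) and sample_times[off] == ts
        for ts, off in zip(times, offsets)
    )
```

A reader that finds `index_matches` returning `False` would treat the Container as invalid rather than silently falling back to a linear scan.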

Minimum Conformance Requirements

A conformant implementation MUST:

Read and write JSON Containers.
Read FlatBuffers Containers.
Respect all normative structures defined in this specification.
Handle unknown modalities, events, and metadata in a forward-compatible way.

Implementations intended for real-time or embedded use SHOULD:

Support writing FlatBuffers Containers.
Support internal compression for performance and size.

Summary

The serialization model:

defines JSON and FlatBuffers as canonical encodings,
supports external resources via URIs,
allows streaming and appendable Containers,
supports optional internal compression,
defines clear schema binding behavior,
supports optional standardized indexing,
and specifies conformance behavior for implementations.
