Environment & Scene Graph

This section explains how environments work in MIND.


Why an Environment Model?

Human data, XR data, biosensing, robotics, and AI training all depend on context:

  • Where is the person?
  • Where is the robot?
  • What objects are in the scene?
  • What does the environment look like?

Environments give meaning to spatial data.


Scene Graph Basics

MIND uses a lightweight scene graph, simpler than USD but more expressive than glTF.

ASCII example:

world_root
├── room_frame
│   ├── table
│   │   └── cup
│   └── chair
└── robot_base
    └── robot_arm
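
One way this hierarchy might be serialized is as a flat node list with parent references. The sketch below is illustrative only; the field names (id, type, parent) are assumptions, not the normative MIND schema:

{
  "nodes": [
    { "id": "world_root", "type": "frame" },
    { "id": "room_frame", "type": "frame",  "parent": "world_root" },
    { "id": "table",      "type": "object", "parent": "room_frame" },
    { "id": "cup",        "type": "object", "parent": "table" },
    { "id": "chair",      "type": "object", "parent": "room_frame" },
    { "id": "robot_base", "type": "frame",  "parent": "world_root" },
    { "id": "robot_arm",  "type": "object", "parent": "robot_base" }
  ]
}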

Node Types

FrameNode

Just a transform: defines a coordinate frame.

ObjectNode

Represents real-world or virtual objects, with:

  • geometry
  • category
  • affordances

RegionNode

Areas of interest:

  • floor plane
  • table surface
  • no-go region
  • pickup zone

AnchorNode

Stable references (XR anchors, SLAM landmarks).
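
A rough sketch of how each node type might look in serialized form. All field names here (type, parent, transform, purpose) are illustrative assumptions:

{ "id": "room_frame", "type": "frame",  "transform": { "translation": [0, 0, 0], "rotation": [0, 0, 0, 1] } }
{ "id": "cup",        "type": "object", "parent": "table", "category": "cup" }
{ "id": "floor",      "type": "region", "parent": "room_frame", "purpose": "floor plane" }
{ "id": "anchor_01",  "type": "anchor", "parent": "world_root" }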


Geometry Handling

You can embed geometry directly:

"geometry": { "primitive": "box", "size": [1,1,1] }

Or reference external files:

"uri": "models/cup.glb"

Semantics & Affordances

ObjectNodes can tell agents how they might be used:

"category": "cup",
"affordances": { "can_grasp": true, "can_contain_liquid": true }

This is essential for embodied AI.
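
Putting geometry and semantics together, a complete ObjectNode might look like this sketch (id, type, and parent are assumed field names carried over from the sketches above):

{
  "id": "cup1",
  "type": "object",
  "parent": "table",
  "geometry": { "uri": "models/cup.glb" },
  "category": "cup",
  "affordances": { "can_grasp": true, "can_contain_liquid": true }
}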


Dynamic Nodes

Objects can move.

Examples:

  • tracked cups
  • robot arms
  • humans interacting with objects

Dynamic transforms come from streams such as:

  • ObjectPose
  • MarkerSets
  • CV tracking
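
As a sketch, a single ObjectPose sample might carry a timestamp plus position and orientation; the field names, units, and quaternion layout shown here are assumptions:

{ "t": 12.533, "object": "cup1", "position": [0.42, 0.10, 0.76], "orientation": [0.0, 0.0, 0.0, 1.0] }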

Linking Streams and Events

Nodes can reference:

  • Pose streams → how the object moves
  • Events → interactions involving the object

Example:

"linked_streams": ["cup1_pose"],
"linked_events": ["evt_grasp_23"]
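
In context, those references might hang directly off the node, as in this sketch (the surrounding fields are assumptions from the earlier examples):

{
  "id": "cup1",
  "type": "object",
  "linked_streams": ["cup1_pose"],
  "linked_events": ["evt_grasp_23"]
}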

Embedded vs External Environment

Small scenes → embed in the Container.
Large scenes → reference external environment files.

This keeps the format flexible across many workflows.
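
As a sketch of the two options, a Container might either embed the node list inline or point at an external file. The environment key and the file path are illustrative assumptions:

Embedded:

"environment": { "nodes": [ { "id": "world_root", "type": "frame" } ] }

External:

"environment": { "uri": "environments/lab_room.json" }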


Summary

MIND’s environment system:

  • describes physical & virtual worlds,
  • connects objects to events and streams,
  • supports geometry + semantics + movement,
  • is lightweight but expressive,
  • works across XR, robotics, CV, biosensing, embodied AI.