RUDRA / 003 - EMBODIED INTELLIGENCE STACK

Sensory intelligence
for machines that
move like living beings.

We capture the full human signal - sight, motion, touch, intent - across egocentric, wrist, body, and tactile sensors. The most exhaustive multi-modal stack feeding the most advanced VLA datasets in production.

6+
Synchronized sensor streams
1280²
Fisheye + wrist RGB
500Hz
9-axis IMU + mocap
VLA
Training-aware curation
EGOCENTRIC + WRIST CAPTURE
VLA FOUNDATION MODELS
MOTION CAPTURE + TACTILE
ROBOT-READY DATASETS
MULTI-MODAL SENSOR FUSION
BENGALURU · GLOBAL
/ 01 - STACK

From human perception to robot intelligence.

003 PILLARS
01

Capture

A full-body sensor stack on every operator. Egocentric fisheye RGB, wrist-mounted cameras, 4-camera SLAM array, 9-axis IMU at 500Hz, motion-capture markers, and hand-mounted tactile sensors - all hardware-synchronized.

02

Curate

We run our own VLA models internally. We know what training pipelines actually need, and we shape multi-modal datasets to that signal - aligned vision, proprioception, and contact at frame-accurate timing.

03

Deliver

The most advanced VLA-ready datasets shipping today. Structured, labelled, multi-sensor, and verified by humans. Drop straight into partner training loops.

/ 02 - HARDWARE

Wearables that see the way humans see.

Six-plus synchronized sensor streams on a single operator. Egocentric and wrist cameras, SLAM array, 9-axis IMU, motion-capture markers, and hand tactile sensors. Sub-millisecond timing. Every frame tagged with pose, gaze, and contact - ready to train the next generation of vision-language-action models.

Fisheye RGB: 1280×1280 @ 60fps
Wrist cameras: Stereo bi-manual
SLAM array: 4 × 640×480 @ 30fps
IMU: 9-axis @ 500Hz
Motion capture: Full-body markers
Tactile: Hand contact sensors
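
As a rough illustration of what pairing the 500Hz IMU stream with the 60fps fisheye frames involves, the sketch below matches each frame to its nearest IMU sample by timestamp. It assumes both streams are stamped in integer microseconds on a shared clock. The function names and the synthetic timestamps are hypothetical and do not describe the actual capture firmware or file layout.

```python
from bisect import bisect_left


def nearest_index(sorted_ts, t):
    """Return the index of the timestamp in sorted_ts closest to t."""
    i = bisect_left(sorted_ts, t)
    if i == 0:
        return 0
    if i == len(sorted_ts):
        return len(sorted_ts) - 1
    # Pick the closer neighbour; ties go to the earlier sample.
    return i if sorted_ts[i] - t < t - sorted_ts[i - 1] else i - 1


def align_imu_to_frames(frame_ts_us, imu_ts_us):
    """For each camera frame, return the index of the nearest IMU sample."""
    return [nearest_index(imu_ts_us, t) for t in frame_ts_us]


# Synthetic one-second recording (assumed values, not real capture data).
IMU_PERIOD_US = 2_000                                   # 500 Hz
imu_ts_us = [k * IMU_PERIOD_US for k in range(500)]
frame_ts_us = [round(k * 1_000_000 / 60) for k in range(60)]  # 60 fps

matches = align_imu_to_frames(frame_ts_us, imu_ts_us)
max_offset_us = max(
    abs(imu_ts_us[m] - t) for m, t in zip(matches, frame_ts_us)
)
```

At these rates each camera frame spans roughly 8.3 IMU samples, and a nearest-sample match can be at most half an IMU period (1 ms) away from the frame timestamp.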
/ 03 - PIPELINE

Raw human perception → foundation-model fuel.

  1. 00
    Human

    Operator wears the rig and performs a real-world task.

  2. 01
    Capture

    Egocentric, wrist, SLAM, IMU, mocap, and tactile - all fused.

  3. 02
    Annotate

    Human-in-the-loop labelling of intent, contact, and trajectory.

  4. 03
    Curate

    Our internal VLA loop scores and selects what actually trains.

  5. 04
    Deliver

    Drop-in datasets for partner foundation-model pipelines.
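
The Curate step above can be pictured as a filter-and-rank pass over annotated episodes. The following is a minimal sketch under stated assumptions: the `Episode` fields, the `score` value, and the threshold are all hypothetical placeholders, and the real scoring loop is not described on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Episode:
    episode_id: str
    human_verified: bool
    score: float  # hypothetical usefulness score from a model in the loop


def select_for_training(episodes, min_score, max_episodes):
    """Keep verified episodes at or above min_score, best first."""
    eligible = [
        e for e in episodes if e.human_verified and e.score >= min_score
    ]
    eligible.sort(key=lambda e: e.score, reverse=True)
    return eligible[:max_episodes]


# Made-up episodes, for illustration only.
episodes = [
    Episode("ep-001", True, 0.91),
    Episode("ep-002", True, 0.42),
    Episode("ep-003", False, 0.97),  # high score but not human-verified
    Episode("ep-004", True, 0.78),
]

selected = select_for_training(episodes, min_score=0.5, max_episodes=2)
selected_ids = [e.episode_id for e in selected]
```

Here the unverified episode is dropped regardless of its score, which mirrors the ordering of the pipeline: annotation and verification come before curation.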

/ 04 - SAMPLE CAPTURES

What our data actually looks like.

Raw egocentric capture, fused with annotation and 3D world reconstruction. Rendered in Rerun to show the depth of signal we hand to partner training pipelines.

RERUN · 3D WORLD MODEL
Electronics / sorting
Pick-place sequence · fine manipulation
Guacamole / prep
Bi-manual task · hand pose · object contact
Stationery / sorting
Multi-object sort · category reasoning
/ 05 - DATA PRODUCTS

A full catalogue of VLA-ready datasets.

Six product lines covering the full surface of embodied learning, from human locomotion to humanoid whole-body manipulation. Standardized schema, cross-embodiment compatible, validated on partner training loops.

006 PRODUCT LINES
Locomotion / Motion Capture
P-01

Whole-body human motion from monocular and multi-view video. Robot joint angles, SMPL-X parameters, time-aligned 3D skeletons.

SMPL-X · Joint angles · 3D skeleton
Humanoid walking · RL · imitation learning
Single / Dual-Arm Manipulation
P-02

Bi-manual demonstration data from human operators and teleop rigs. End-effector trajectories with frame-accurate sync.

TCP trajectory · Bi-manual · Teleop
Policy learning · behavior cloning
Humanoid Whole-Body Manipulation
P-03

Upper-limb and torso coordination for full humanoid embodiments. Captured with motion-capture markers and multi-DoF rigs.

Whole-body · Torso pose · Multi-DoF
Humanoid upper-limb · WBC policy
Egocentric Video
P-04

First-person streams from our head-mounted rigs. 180° fisheye RGB + wrist cameras + SLAM, temporally aligned with action.

Fisheye 1280² · Wrist RGB · SLAM
VLA pretraining · VLM grounding
Human-Object Interaction (HOI)
P-05

Structured contact, hand-object 6-DoF pose, and object state transitions. The signal that models actually need in order to learn manipulation.

6-DoF object · Hand-object · Contact state
Manipulation prediction · VLM training
Synthetic & Augmented Data
P-06

Sim and generative augmentation for long-tail and high-risk scenarios. Same schema as real capture, drop-in compatible.

Sim2real · Long-tail · Augmentation
Coverage expansion · safety scenarios
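
To make the "contact state" signal in the Human-Object Interaction line more concrete, here is a small sketch of how per-frame contact flags might be turned into transition events. It assumes a boolean contact flag per frame and no contact before the first frame. The event names are hypothetical and are not taken from any published schema.

```python
def contact_transitions(contact_flags):
    """Return (frame_index, event) pairs where the contact state changes."""
    events = []
    previous = False  # assumption: no contact before the first frame
    for idx, in_contact in enumerate(contact_flags):
        if in_contact and not previous:
            events.append((idx, "contact_start"))
        elif previous and not in_contact:
            events.append((idx, "contact_end"))
        previous = in_contact
    return events


# Illustrative flags: a grasp, a release, then a second touch.
flags = [False, False, True, True, True, False, True]
events = contact_transitions(flags)
```

Storing transitions rather than raw flags keeps the contact signal compact while remaining frame-indexed, so it can still be joined against pose and video by frame number.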
/ 06 - SPECIFICATIONS

Every dataset, fully specified.

Cross-embodiment, multi-modal, temporally aligned. We standardize the schema so partners can drop our data straight into the training pipelines they already run - no glue code, no format conversions, no surprises.

Data modalities: TCP trajectory · object trajectory · pointcloud · tactile · 6-DoF pose · IMU · gaze
Temporal alignment: Hardware-synchronized multi-modal capture · sub-millisecond accuracy
Sensor stack: Fisheye 1280² · wrist RGB stereo · 4× SLAM · 9-axis IMU 500Hz · mocap · hand tactile
Scenario coverage: Industrial · logistics · warehousing · retail · kitchen · service · in-the-wild
Robot embodiments: Cross-embodiment: Unitree G1, Tien Kung, Fourier, Galbot, custom arms
End-effector support: Inspire · Xhand · Allegro · Shadow · parallel grippers
Delivery format: LeRobot · RLDS · HDF5 · custom partner schemas
Quality control: Human-in-the-loop verification · VLA-scored curation · traceable versioning
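
For readers who want a feel for what a temporally aligned, multi-modal record could look like, the sketch below models one time step and a basic timestamp check. Every field name, unit, and the quaternion ordering is an assumption made for illustration. It is not the delivered schema and it does not use the LeRobot, RLDS, or HDF5 APIs.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]
Quat = Tuple[float, float, float, float]  # assumed (x, y, z, w) ordering


@dataclass
class Pose6DoF:
    position_m: Vec3
    orientation_xyzw: Quat


@dataclass
class Step:
    timestamp_us: int                      # shared-clock time, microseconds
    tcp_pose: Pose6DoF                     # end-effector (TCP) pose
    object_pose: Optional[Pose6DoF] = None
    imu_accel_mps2: Optional[Vec3] = None
    gaze_direction: Optional[Vec3] = None
    in_contact: bool = False


def is_valid(steps: List[Step]) -> bool:
    """True if timestamps strictly increase across the episode."""
    return all(
        cur.timestamp_us > prev.timestamp_us
        for prev, cur in zip(steps, steps[1:])
    )


identity = Pose6DoF((0.0, 0.0, 0.0), (0.0, 0.0, 0.0, 1.0))
good_episode = [Step(0, identity), Step(2_000, identity, in_contact=True)]
bad_episode = [Step(2_000, identity), Step(2_000, identity)]

good_ok = is_valid(good_episode)
bad_ok = is_valid(bad_episode)
```

Optional fields let the same record type cover episodes where a modality such as gaze or object pose was not captured, which is one simple way to keep a single schema across product lines.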
µs
Sync accuracy
6+
Sensor modalities
8+
Robot embodiments
100%
Human-verified

“The robots of today are blind. They move without understanding, act without perception, and fail the moment the world stops cooperating. We are here to change that.”

- RUDRA / FOUNDING DOCTRINE
/ 07 - PARTNER

Training a VLA model? Let's talk data.