Run sensor.calibrate() with the gripper open and unloaded. This zeros out offsets from finger self-contact and cable stress. Recalibrate if the arm is repositioned significantly.
Recording Tactile Demonstrations
Tactile data is the missing modality in most robot learning datasets. This page explains why it matters, how to record synchronized tactile + arm + camera streams, the extended dataset format, and how to train policies that use tactile inputs.
Why Tactile Data Improves Robot Learning
Vision tells a policy where the gripper is. Proprioception tells it how far the fingers are closed. Neither tells it whether the grasp is stable. A policy trained on vision + joint data alone must learn to infer grasp quality indirectly — from object motion, arm force limits, or trial-and-error during rollout. Adding tactile sensing provides direct contact state supervision: the policy receives a ground-truth signal distinguishing a secure grasp from a slip-prone one at every timestep of every demonstration. This is especially impactful for deformable, transparent, or variably-sized objects where visual grasp quality estimation is unreliable.
Hardware Setup for Synchronized Recording
A complete multi-modal recording rig requires three hardware layers, all synchronized to a common clock:
- Robot arm — provides joint positions, velocities, and end-effector pose at 100–500 Hz via USB or Ethernet. Use the arm SDK's timestamp API, not system time, to get hardware-stamped joint state.
- Paxini Gen3 sensor(s) — plugged into a powered USB hub mounted at the robot wrist. Each frame is timestamped by the host PC at USB interrupt time (nanosecond resolution, <0.5 ms jitter).
- Camera — one wrist-mounted camera (optional: one overhead camera). Use a USB or GigE camera with hardware trigger sync, or a software-triggered camera with known latency. Record at 30–60 fps.
All three sources write timestamps using the same monotonic host clock. The platform SDK's MultiSourceRecorder aligns frames at post-processing time using timestamp interpolation.
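The alignment step can be sketched as nearest-timestamp matching. The real work is done by the SDK's MultiSourceRecorder; the helper below is only an illustrative stand-in using NumPy, assuming each stream exposes a sorted array of nanosecond timestamps from the shared monotonic clock.

```python
import numpy as np

def align_to_reference(ref_ts_ns, src_ts_ns):
    """For each reference timestamp, return the index of the nearest source
    frame. Both arrays are monotonic nanosecond timestamps on the same clock."""
    idx = np.searchsorted(src_ts_ns, ref_ts_ns)        # first src frame >= ref
    idx = np.clip(idx, 1, len(src_ts_ns) - 1)
    left, right = src_ts_ns[idx - 1], src_ts_ns[idx]
    # pick whichever neighbor is closer in time
    return np.where(ref_ts_ns - left <= right - ref_ts_ns, idx - 1, idx)

# Example: a ~30 fps camera aligned against a ~120 Hz tactile stream.
cam_ts = np.arange(0, 1_000_000_000, 33_333_333, dtype=np.int64)  # ~30 Hz
tac_ts = np.arange(0, 1_000_000_000, 8_333_333, dtype=np.int64)   # ~120 Hz
matched = align_to_reference(cam_ts, tac_ts)
```

With a 120 Hz tactile stream, every camera frame lands within half a tactile period (about 4 ms) of its matched tactile frame, comfortably inside the alignment tolerances discussed below.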
Dataset Format — Extended LeRobot Schema
The Paxini Gen3 data collection pipeline extends the standard LeRobot HDF5 dataset format with additional tactile channels. Existing LeRobot tools (data loading, visualization, policy training) remain fully compatible — the new keys are simply ignored by pipelines that do not use them.
| HDF5 Key | Shape | Source |
|---|---|---|
| observation.state | (T, 7) | Arm joint positions + gripper width |
| observation.images.wrist | (T, H, W, 3) | Wrist camera (uint8 RGB) |
| action | (T, 7) | Target joint positions + gripper command |
| observation.tactile.pressure_map | (T, 8, 8) | Paxini Gen3 pressure array (kPa, float32) |
| observation.tactile.total_force_n | (T,) | Total normal force per frame (Newtons) |
| observation.tactile.in_contact | (T,) | Boolean contact flag per frame |
| observation.tactile.contact_centroid | (T, 2) | Contact centroid (row, col) per frame |
| meta/timestamps_ns | (T,) | Nanosecond timestamps for all channels |
The observation.tactile.* keys are the additions to the schema. All other keys follow the standard LeRobot format.
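The aggregated keys are derived from the raw pressure map. As a sketch of what each one contains, the function below recomputes them with NumPy; the taxel area and contact threshold are illustrative assumptions, not values from the sensor datasheet.

```python
import numpy as np

TAXEL_AREA_M2 = (2e-3) ** 2      # assumed 2 mm taxel pitch -- check the datasheet
CONTACT_THRESHOLD_KPA = 5.0      # illustrative contact threshold

def derive_tactile_channels(pressure_map_kpa):
    """pressure_map_kpa: (T, 8, 8) float32 -> the three aggregated keys."""
    # kPa -> Pa, then force = pressure * taxel area, summed over all taxels
    total_force_n = (pressure_map_kpa * 1e3 * TAXEL_AREA_M2).sum(axis=(1, 2))
    in_contact = pressure_map_kpa.max(axis=(1, 2)) > CONTACT_THRESHOLD_KPA
    rows, cols = np.meshgrid(np.arange(8), np.arange(8), indexing="ij")
    weight = pressure_map_kpa.sum(axis=(1, 2))
    weight = np.where(weight > 0, weight, 1.0)   # avoid divide-by-zero when empty
    contact_centroid = np.stack([
        (pressure_map_kpa * rows).sum(axis=(1, 2)) / weight,
        (pressure_map_kpa * cols).sum(axis=(1, 2)) / weight,
    ], axis=1)
    return total_force_n, in_contact, contact_centroid

# A single frame with 100 kPa on one taxel at (row 3, col 5):
pm = np.zeros((1, 8, 8), dtype=np.float32)
pm[0, 3, 5] = 100.0
force, contact, centroid = derive_tactile_channels(pm)
```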
Quality Checklist for Tactile Data
- The in_contact rising edge should coincide with the visible moment of fingertip-object contact in the camera feed. A lag >20 ms indicates a timestamp alignment issue.
- Use paxini.annotate.flag_slip_events(episode) to automatically flag slip events for review.
- If pressure_map.max() hits 600 kPa in any episode, the sensor is saturating. Reduce gripper force or use the palm variant (lower peak pressure per taxel) for heavier grasps.
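The lag and saturation checks above lend themselves to automation. The sketch below assumes the visual contact moment has already been annotated (e.g. from the camera feed) and operates directly on the dataset arrays; the function name and flag names are hypothetical, not part of the SDK.

```python
import numpy as np

SATURATION_KPA = 600.0
MAX_CONTACT_LAG_NS = 20_000_000   # 20 ms, per the checklist above

def check_episode(in_contact, timestamps_ns, visual_contact_ns, pressure_map):
    """Return quality flags for one episode's tactile stream."""
    # indices where in_contact transitions False -> True
    edges = np.flatnonzero(np.diff(in_contact.astype(np.int8)) == 1) + 1
    if edges.size == 0:
        lag_ok = False                              # no contact detected at all
    else:
        lag = int(timestamps_ns[edges[0]]) - int(visual_contact_ns)
        lag_ok = abs(lag) <= MAX_CONTACT_LAG_NS
    saturated = float(pressure_map.max()) >= SATURATION_KPA
    return {"contact_lag_ok": lag_ok, "saturated": saturated}

# Example: 100 Hz stream, tactile contact at frame 50 (t = 500 ms),
# annotated visual contact at 495 ms -> 5 ms lag, within tolerance.
ts = np.arange(100, dtype=np.int64) * 10_000_000
ic = np.zeros(100, dtype=bool)
ic[50:] = True
pm = np.full((100, 8, 8), 50.0, dtype=np.float32)
flags = check_episode(ic, ts, 495_000_000, pm)
```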
Policy Training with Tactile Inputs
To add tactile as an observation modality in ACT or Diffusion Policy, extend the observation config to include the pressure map or the aggregated scalar (total_force_n). The pressure map provides full spatial information but adds 64 floats per frame per sensor; the scalar is easier to integrate and sufficient for binary grasp quality tasks.
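The dimensional trade-off can be made concrete with a per-frame observation builder. This is a hypothetical helper for illustration only; the actual config mechanism depends on your ACT or Diffusion Policy implementation.

```python
import numpy as np

def build_obs(state, pressure_map=None, total_force_n=None):
    """Concatenate one frame's observation vector.
    state: (7,) joint positions + gripper width;
    pressure_map: (8, 8) kPa array; total_force_n: scalar force."""
    parts = [state]
    if pressure_map is not None:
        parts.append(pressure_map.reshape(-1))      # +64 dims, full spatial info
    if total_force_n is not None:
        parts.append(np.atleast_1d(total_force_n))  # +1 dim, cheap scalar option
    return np.concatenate(parts).astype(np.float32)

obs_full = build_obs(np.zeros(7), pressure_map=np.zeros((8, 8)))  # 7 + 64 dims
obs_lite = build_obs(np.zeros(7), total_force_n=0.0)              # 7 + 1 dims
```

Per sensor, the full pressure map grows the low-dimensional observation from 7 to 71 values, while the scalar option adds just one.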
For the full training walkthrough — including how to compute normalization statistics from your recorded dataset and evaluate against a vision-only baseline — see Unit 5 of the learning path.
For broader context on dexterous hand data collection strategies, see the Dexterous Hands guide.