How to Record a LeRobot-Compatible Dataset

LeRobot is Hugging Face's open-source robot-learning stack. Its dataset format has become the default for arm-scale imitation learning: episode-based, Parquet plus video shards, self-describing metadata, and native to the HuggingFace Hub. This tutorial walks through recording a clean 50-episode dataset from scratch with the lerobot-record CLI and publishing it to the Hub.

Data collection · Total time: about 2 hours · Difficulty: Beginner · Updated April 2026

What you will accomplish

At the end of this tutorial you will have: a 50-episode LeRobot dataset on disk, a visual inspection pass over it, normalization statistics, and a public or private dataset published to the HuggingFace Hub. That dataset is immediately compatible with LeRobot policy training (ACT, Diffusion Policy, VLA heads), reference implementations, and community evaluation.

The LeRobot dataset format stores each episode as a row range in a parquet file with joint states, actions, and references to video frames stored in separate MP4 shards. The format is designed to stream efficiently from the Hub, load cleanly into PyTorch DataLoaders, and be self-describing via a JSON metadata file.
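To make the per-timestep layout concrete, here is a mock of what one loaded sample roughly looks like. This is an illustrative sketch only: the exact keys depend on your robot config and LeRobot version, and the camera key names here follow the "top"/"wrist" naming used later in this tutorial.

```python
# Illustrative mock of one timestep in a LeRobot-style dataset.
# Key names and shapes are assumptions for a 6-DoF arm with two cameras.

def make_example_sample():
    """Return a mock dict mirroring the per-timestep layout described above."""
    return {
        "observation.state": [0.0] * 6,   # joint positions (6-DoF arm assumed)
        "action": [0.0] * 6,              # commanded joint targets
        # Image observations live in MP4 shards; the parquet row holds references.
        "observation.images.top": "reference -> videos/top shard, frame k",
        "observation.images.wrist": "reference -> videos/wrist shard, frame k",
        "timestamp": 0.033,               # seconds since episode start (30 FPS)
        "episode_index": 0,
        "frame_index": 1,
        "task": "Pick the red cube and place it in the blue bowl.",
    }

sample = make_example_sample()
print(sorted(sample.keys()))
```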

Prerequisites

Before you start you need: a supported robot arm (this tutorial assumes an SO-100 with a leader arm for teleoperation), one or two USB cameras, a machine with Python 3.10 or newer, and a HuggingFace account for publishing.

The steps

  1. Install LeRobot

    Install LeRobot from pip. Optional extras exist per hardware type:

    conda create -n lerobot python=3.10 -y
    conda activate lerobot
    pip install --upgrade pip
    pip install lerobot
    
    # hardware-specific extras (pick yours)
    pip install 'lerobot[feetech]'    # SO-100, Koch
    pip install 'lerobot[dynamixel]'  # Aloha, WidowX
    pip install 'lerobot[intelrealsense]'  # for RealSense cameras

    Verify the install: python -c "import lerobot; print(lerobot.__version__)". Check upstream at github.com/huggingface/lerobot for the current supported hardware matrix — the project moves quickly.

  2. Calibrate your arm

    Each supported arm has a calibration sub-command. General pattern:

    lerobot-calibrate \
      --robot.type=so100 \
      --robot.port=/dev/ttyACM0

    This discovers motor IDs, sets joint zero positions, and stores the calibration to ~/.cache/huggingface/lerobot/calibration/. Re-run if you ever remount the arm or swap cables.

  3. Verify cameras and teleop

    Run a teleop session with a live preview to sanity-check everything:

    lerobot-teleoperate \
      --robot.type=so100 \
      --robot.port=/dev/ttyACM0 \
      --robot.cameras='{"top": {"type":"opencv","index":0}, "wrist": {"type":"opencv","index":2}}' \
      --display_data=true

    Confirm both cameras stream at 30 FPS without tearing, arm tracks cleanly, and the preview window updates. If the arm jitters, check USB cable quality first — LeRobot arms are serial-bus sensitive to flaky cables.
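A frequent teleop failure is a typo or duplicate device index in the JSON passed to --robot.cameras. The validation below is this tutorial's own sanity check, not part of the LeRobot CLI; it only verifies the JSON is well-formed before you hand it to the real command.

```python
import json

def parse_camera_config(arg: str) -> dict:
    """Parse and sanity-check a --robot.cameras JSON string.

    The duplicate-index and known-type checks are this tutorial's own
    suggestions; LeRobot does its own validation at startup.
    """
    cams = json.loads(arg)
    indices = [c["index"] for c in cams.values() if "index" in c]
    if len(indices) != len(set(indices)):
        raise ValueError(f"duplicate camera indices: {indices}")
    for name, c in cams.items():
        if c.get("type") not in {"opencv", "intelrealsense"}:
            raise ValueError(f"camera {name!r} has unknown type {c.get('type')!r}")
    return cams

cfg = parse_camera_config(
    '{"top": {"type":"opencv","index":0}, "wrist": {"type":"opencv","index":2}}'
)
print(sorted(cfg))  # ['top', 'wrist']
```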

  4. Define the task and instruction

    Pick one well-defined task for your first dataset. "Pick the red cube and place it in the blue bowl" beats "do stuff with the cubes" every time. Your natural-language instruction will be saved verbatim on every episode and will be the only text conditioning a downstream policy sees.

    Tip: keep task success criteria binary and observable from the data. "Cube is in the bowl" is crisp. "Cube is placed gently" is not.
  5. Record 50 episodes

Now the main event: record 50 episodes with lerobot-record. Flag names shift slightly between releases — the general invocation pattern is:

    lerobot-record \
      --robot.type=so100 \
      --robot.port=/dev/ttyACM0 \
      --robot.cameras='{"top": {"type":"opencv","index":0}, "wrist": {"type":"opencv","index":2}}' \
      --dataset.repo_id=<your-username>/red_cube_blue_bowl \
      --dataset.num_episodes=50 \
      --dataset.single_task="Pick the red cube and place it in the blue bowl." \
      --dataset.fps=30 \
      --dataset.episode_time_s=20

    Between episodes, reset the scene to a slightly different initial configuration — vary cube position, bowl position, lighting angle. This is the single biggest factor in how well a trained policy generalizes. Take breaks. Operator fatigue after ~30 episodes is real; the quality of your last 20 demos matters more than you think.
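If you want the resets to be systematic rather than ad hoc, sample them. The sketch below draws randomized cube and bowl positions between episodes; the workspace bounds (meters, robot base frame) and the 10 cm minimum separation are illustrative numbers for a tabletop SO-100 setup, not LeRobot constants.

```python
import random

def sample_reset(seed=None):
    """Sample a randomized initial scene configuration for the next episode.

    Bounds are assumptions for a small tabletop workspace; adjust to yours.
    """
    rng = random.Random(seed)
    cube = (rng.uniform(0.15, 0.35), rng.uniform(-0.15, 0.15))
    # Keep the bowl at least 10 cm from the cube so the task stays non-trivial.
    while True:
        bowl = (rng.uniform(0.15, 0.35), rng.uniform(-0.15, 0.15))
        if (bowl[0] - cube[0]) ** 2 + (bowl[1] - cube[1]) ** 2 >= 0.10 ** 2:
            break
    return {"cube_xy": cube, "bowl_xy": bowl}

print(sample_reset(seed=0))
```

Print the sampled positions before each episode and place the objects accordingly; seeded sampling also makes your reset distribution reproducible.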

  6. Inspect the dataset

    Scrub through episodes with the built-in visualizer:

    lerobot-dataset-visualize --repo-id=<your-username>/red_cube_blue_bowl

    Watch for: camera dropouts, joint discontinuities, episodes where you missed the task, frame-rate variability. Common sanity check: plot action norm over time — spikes usually mean teleop artefacts.
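The action-norm check is easy to script. A pure-Python version, flagging timesteps whose L2 norm exceeds mean plus three standard deviations — the 3-sigma threshold is a heuristic, tune it for your arm:

```python
import math

def action_norm_spikes(actions, z=3.0):
    """Return timestep indices whose action L2 norm exceeds mean + z*std.

    A quick screen for teleop artefacts; z=3.0 is a heuristic default.
    """
    norms = [math.sqrt(sum(a * a for a in act)) for act in actions]
    mean = sum(norms) / len(norms)
    std = math.sqrt(sum((n - mean) ** 2 for n in norms) / len(norms))
    return [i for i, n in enumerate(norms) if n > mean + z * std]

# Smooth synthetic trajectory with one injected glitch at t=50.
traj = [[0.1 * math.sin(t / 10), 0.1 * math.cos(t / 10)] for t in range(100)]
traj[50] = [5.0, 5.0]
print(action_norm_spikes(traj))  # [50]
```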

  7. Drop bad episodes and compute stats

    Remove failed or messy episodes from the dataset metadata. LeRobot supports filtering by episode index. After cleanup, recompute normalization stats (mean / std per action and state dimension) so downstream training uses correct values. The lerobot CLI includes a compute-stats command for this.
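To see what stats recomputation must do, here is a standalone sketch (not the LeRobot implementation) that computes per-dimension mean and std over all timesteps of the kept episodes, skipping the dropped ones:

```python
import math

def compute_stats(episodes, drop=()):
    """Per-dimension mean/std over all kept timesteps.

    `episodes` maps episode index -> list of per-timestep vectors
    (e.g. actions). A standalone sketch of stats recomputation after
    dropping episodes, not the LeRobot code path.
    """
    rows = [v for idx, ep in episodes.items() if idx not in drop for v in ep]
    dims = len(rows[0])
    mean = [sum(r[d] for r in rows) / len(rows) for d in range(dims)]
    std = [
        math.sqrt(sum((r[d] - mean[d]) ** 2 for r in rows) / len(rows))
        for d in range(dims)
    ]
    return {"mean": mean, "std": std}

episodes = {0: [[1.0, 2.0], [3.0, 4.0]], 1: [[100.0, 100.0]]}  # episode 1 is bad
print(compute_stats(episodes, drop={1}))
# {'mean': [2.0, 3.0], 'std': [1.0, 1.0]}
```

Note how dropping episode 1 before computing stats keeps the outlier from skewing the normalization — exactly why stats must be recomputed after cleanup rather than reused.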

  8. Push to HuggingFace Hub

Authenticate once with huggingface-cli login, then push. lerobot-record can upload automatically at the end of a session; you can also push after the fact:

    huggingface-cli upload <your-username>/red_cube_blue_bowl \
      ~/.cache/huggingface/lerobot/<your-username>/red_cube_blue_bowl \
      --repo-type=dataset

The dataset is now public (or private). You can load it anywhere by passing the repo id to LeRobotDataset, e.g. LeRobotDataset("<your-username>/red_cube_blue_bowl"). Train a baseline ACT policy with one command to validate — if ACT can fit 50 demos in under an hour, your dataset is healthy.

What to do next

Once you have a clean LeRobot dataset, two high-value follow-ups: (1) train a policy on it — ACT is the easiest first baseline, and (2) fine-tune OpenVLA on the dataset to compare against the ACT baseline. If you are scaling up to serious data collection, the ALOHA bimanual teleop is the next step up in hardware; for humanoids, start with Unitree G1 camera calibration.

Common failure modes

Episodes have different lengths: expected — LeRobot pads during training. Keep recording at a fixed target duration, then let the framework handle length variance.

Huge dataset size: LeRobot video shards are H.264-encoded. If size is still a problem, drop to 15 FPS or 480p for cameras that do not need more resolution.
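A back-of-envelope size estimate helps you pick FPS and resolution before recording. The 4 Mbit/s per camera stream below is a rough assumption for 640x480 H.264 at 30 FPS around CRF 23; your encoder settings will shift it:

```python
def estimate_dataset_gb(episodes=50, episode_s=20, cameras=2, video_mbps=4.0):
    """Back-of-envelope dataset size from the recording knobs.

    video_mbps=4.0 is an assumed bitrate per camera stream; parquet
    scalars are negligible next to video, so they are ignored here.
    """
    video_bytes = episodes * episode_s * cameras * video_mbps * 1e6 / 8
    return video_bytes / 1e9

print(round(estimate_dataset_gb(), 2))  # 1.0
```

Halving FPS or dropping to a lower resolution roughly halves the bitrate term, which is why those are the first knobs to turn when size is a problem.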

Policy overfits instantly: not enough scene variation during collection. Randomize initial conditions more aggressively.

Hub upload fails: usually repo-exists or permission errors. Create the repo first with huggingface-cli repo create.

Deep dive: the LeRobot dataset format

LeRobot dataset layout on disk: a top-level directory containing meta/info.json, meta/stats.json, meta/episodes.jsonl, a data/ folder with parquet shards (usually one per episode), and a videos/ folder with MP4 shards. The info.json declares dataset features — joint state shape, action shape, camera names, FPS — and serves as the authoritative schema for loaders. The parquet rows are the per-timestep scalars and references; the MP4s hold the image observations, referenced by frame index.
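Since info.json is the authoritative schema, reading it directly is the quickest way to inspect a dataset on disk. The sketch below assumes the "fps" and "features" keys described above; key names may drift between format versions, so treat it as illustrative. The demo writes a minimal, hand-made info.json to a temp directory rather than downloading a real dataset:

```python
import json
import pathlib
import tempfile

def read_schema(dataset_root):
    """Load meta/info.json and return the declared fps and features.

    Assumes the "fps"/"features" keys of the current layout; check your
    installed LeRobot version for the exact schema.
    """
    info = json.loads((pathlib.Path(dataset_root) / "meta" / "info.json").read_text())
    return info["fps"], info["features"]

# Demo against a minimal hand-written info.json (hypothetical values).
root = pathlib.Path(tempfile.mkdtemp())
(root / "meta").mkdir()
(root / "meta" / "info.json").write_text(json.dumps({
    "fps": 30,
    "features": {"observation.state": {"shape": [6]}, "action": {"shape": [6]}},
}))
fps, features = read_schema(root)
print(fps, sorted(features))  # 30 ['action', 'observation.state']
```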

This design is deliberate: scalars in parquet give fast columnar access for batched dataloading, and MP4s give an order of magnitude better on-disk compression than per-frame image files without sacrificing random access. H.264 encoded at CRF 18 is nearly lossless for manipulation; CRF 23 gives roughly 3x smaller files with no visible quality impact on policy training for most tasks.

Deep dive: scene variation is the single biggest quality lever

If you have 2 hours for recording, spend the first 30 minutes planning scene variation, not the first 90 minutes recording baseline episodes. What to vary: the object's start position, the target's position, lighting angle and intensity, and, once the basic task works, distractor objects and background texture. Keep camera mounts fixed so viewpoints stay consistent across episodes.

Deep dive: from recording to training in one afternoon

The whole point of recording a LeRobot dataset is to train a policy on it. The baseline recipe once recording is done: lerobot-train --policy.type=act --dataset.repo_id=<your-username>/red_cube_blue_bowl (as with recording, flag names vary slightly between releases). On a single GPU, an ACT policy fits 50 episodes in about 45 minutes to 2 hours. You should see a steadily decreasing loss and action-space error. Evaluate on the real robot by running the trained checkpoint with lerobot-eval. If your dataset was clean, you should see 70%+ success on the trained task with 50 episodes, and 85%+ with 100.

Deep dive: when to stop recording and start training

Teams frequently over-record. The right heuristic: train a baseline ACT policy after every 25 episodes, measure success, and stop when the success curve flattens. You will often hit diminishing returns around 80 to 150 episodes for a single task. Spending that extra recording time on a different task instead produces a better multi-task policy.
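The "stop when the curve flattens" rule is simple enough to encode. The sketch below assumes success is measured at every 25-episode checkpoint and stops when the improvement between consecutive checkpoints drops below a threshold; both the cadence and the 5-point threshold are heuristics from the paragraph above:

```python
def stop_recording(success_rates, min_gain=0.05):
    """Given success rates measured every 25 episodes, return the episode
    count at which gains flattened, or None if still improving.

    min_gain=0.05 (5 points per checkpoint) is a heuristic threshold.
    """
    for i in range(1, len(success_rates)):
        if success_rates[i] - success_rates[i - 1] < min_gain:
            return (i + 1) * 25  # episodes recorded when the curve flattened
    return None  # still improving: keep recording

# Success measured at 25, 50, 75, 100 episodes.
print(stop_recording([0.30, 0.55, 0.72, 0.74]))  # 100
```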

Frequently asked questions

Can I use my own custom robot with LeRobot? Yes. Implement a Python class that matches the LeRobot Robot interface — connect, disconnect, read joint positions, send actions, read camera frames. A few hundred lines typically.
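A minimal mock of that surface looks like the class below. Method names follow the summary above (connect, disconnect, read joint positions, send actions, read camera frames); check the Robot base class in the lerobot source for the exact names and signatures your installed version expects.

```python
class MockRobot:
    """Minimal sketch of the surface a custom LeRobot robot exposes.

    Method names are assumptions from the interface summary above;
    a real driver replaces the in-memory state with serial/camera I/O.
    """

    def __init__(self, num_joints=6):
        self.num_joints = num_joints
        self.connected = False
        self._positions = [0.0] * num_joints

    def connect(self):
        self.connected = True  # a real driver opens serial ports / cameras here

    def disconnect(self):
        self.connected = False

    def read_joint_positions(self):
        return list(self._positions)

    def send_action(self, action):
        assert len(action) == self.num_joints
        self._positions = list(action)  # a real driver commands motors here

    def read_camera_frames(self):
        return {"top": None, "wrist": None}  # stand-in for real image arrays

robot = MockRobot()
robot.connect()
robot.send_action([0.1] * 6)
print(robot.read_joint_positions()[0])  # 0.1
```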

How big is a typical 50-episode dataset? Roughly 1 to 5 GB depending on camera count and resolution. Easy to host on the Hub free tier.

Can I record simulation data? Yes — LeRobot supports sim envs (ALOHA, PushT, Xarm). Sim data is great for pretraining baselines.

What about proprioception beyond joint angles? LeRobot supports arbitrary state dimensions. Declare them in the dataset info and the loader handles the rest.

Related tutorials and resources