LeRobot is Hugging Face’s open-source robotics stack for collecting data, training policies, running simulations, and sharing robotics datasets and models on the Hub. LeRobotDataset v3.0 standardizes robot learning data across sensorimotor time series, actions, multi-camera video, and task metadata. Its v3 layout stores high-frequency tabular signals in Parquet and visual streams as MP4 shards, with metadata that maps each episode back to its slice of those larger files.

Lance pairs well with LeRobot when you need high-performance random access, lazy multimodal blob reads, and a single table interface for curation, search, and training data preparation. The lerobot-lancedb package ships Lance-backed LeRobotDataset subclasses, and LanceDB can open Lance-formatted LeRobot datasets on the Hub directly through hf:// URIs.

Install

pip install lancedb lance lerobot-lancedb

Use Lance-backed LeRobotDataset loaders

LeRobotLanceDataset is the loader for Lance-backed datasets that store camera observations as decoded image frames. It is a drop-in replacement for LeRobotDataset, so existing policy training code keeps working with the usual PyTorch dataset and dataloader patterns.
For datasets that store camera observations as MP4 video segments, such as lance-format/lerobot-pusht-lance, use LeRobotLanceVideoDataset instead.
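
A minimal sketch of how the choice between the two loaders might look in code. The class names come from this guide; the module name lerobot_lancedb and the single repo-id constructor argument are assumptions inferred from the package name and the "drop-in replacement" description, so check the lerobot-lancedb API documentation for the real import path and signature.

```python
import importlib

# ASSUMPTION: the import name below is a guess based on the package name
# "lerobot-lancedb"; it is not confirmed by this guide.
LOADER_MODULE = "lerobot_lancedb"


def loader_class_name(stores_video: bool) -> str:
    """Pick the loader class named in this guide for the storage layout."""
    return "LeRobotLanceVideoDataset" if stores_video else "LeRobotLanceDataset"


def open_dataset(repo_id: str, stores_video: bool):
    """Instantiate the matching Lance-backed loader.

    ASSUMPTION: the loader accepts a repo id as its first argument,
    mirroring LeRobotDataset(repo_id).
    """
    module = importlib.import_module(LOADER_MODULE)
    loader_cls = getattr(module, loader_class_name(stores_video))
    return loader_cls(repo_id)


def make_dataloader(dataset, batch_size: int = 32):
    """Wrap any map-style dataset in a standard PyTorch DataLoader."""
    from torch.utils.data import DataLoader  # imported lazily; requires torch

    return DataLoader(dataset, batch_size=batch_size, shuffle=True)
```

Under these assumptions, open_dataset("lance-format/lerobot-pusht-lance", stores_video=True) would return the video loader, and make_dataloader would hand its samples to an existing training loop unchanged.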

Open LeRobot Lance tables with LanceDB

Lance-formatted LeRobot datasets published by lance-format expose each .lance file under data/ as a LanceDB table. The PushT dataset, for example, has frames, episodes, and videos tables. Opening the tables directly is handy for inspecting schemas, counting rows, sampling metadata, or building curation workflows before any data reaches the training loop.
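
The inspection step described above could be sketched as follows. The repo id and the three table names come from this guide; the exact URI shape (hf://datasets/<repo>/data) is an assumption based on the statement that the .lance files live under data/, and the helper names are illustrative only.

```python
PUSHT_REPO = "lance-format/lerobot-pusht-lance"
PUSHT_TABLES = ("frames", "episodes", "videos")  # table names listed in this guide


def hub_data_uri(repo_id: str) -> str:
    """Build the hf:// URI of the data/ directory holding the .lance files.

    ASSUMPTION: the URI shape is inferred from this guide, not verified here.
    """
    return f"hf://datasets/{repo_id}/data"


def summarize_tables(repo_id: str = PUSHT_REPO, table_names=PUSHT_TABLES) -> dict:
    """Return {table name: (row count, column names)} for each table."""
    import lancedb  # imported lazily; the call below needs network access

    db = lancedb.connect(hub_data_uri(repo_id))
    summary = {}
    for name in table_names:
        table = db.open_table(name)
        # table.schema is a pyarrow.Schema, so .names lists the columns.
        summary[name] = (table.count_rows(), table.schema.names)
    return summary
```

Calling summarize_tables() reads table metadata only, which makes it a cheap first look at a dataset before deciding what to load.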

Filter a frame window

Most robotics workflows want a deterministic slice by episode_index, frame_index, or task metadata long before training begins. LanceDB filters those rows without touching the video blobs. With the filtered set in hand, you can materialize a smaller local LanceDB database, add derived columns, attach embeddings, or build vector and scalar indexes for faster repeated access.
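
A rough sketch of the filter-then-materialize flow described above. The column names episode_index and frame_index come from this guide; the helper names, the default table name frames_subset, and the choice to index episode_index are illustrative assumptions. The table argument is expected to be a LanceDB table such as the frames table.

```python
def frame_window_filter(episode_index: int, start: int, stop: int) -> str:
    """SQL predicate selecting frames [start, stop) of one episode."""
    if start < 0 or stop <= start:
        raise ValueError("expected 0 <= start < stop")
    return (
        f"episode_index = {episode_index} "
        f"AND frame_index >= {start} AND frame_index < {stop}"
    )


def select_frame_window(table, episode_index: int, start: int, stop: int, columns):
    """Fetch only the named columns for the window as a pyarrow.Table.

    Columns that are not selected (for example video blobs) are not requested.
    """
    return (
        table.search()
        .where(frame_window_filter(episode_index, start, stop))
        .select(list(columns))
        .limit(stop - start)
        .to_arrow()
    )


def materialize_subset(rows, local_path: str, name: str = "frames_subset"):
    """Write filtered rows to a local LanceDB database and add a scalar index."""
    import lancedb  # imported lazily so the pure helpers work without it

    local_db = lancedb.connect(local_path)
    subset = local_db.create_table(name, data=rows, mode="overwrite")
    subset.create_scalar_index("episode_index")  # speeds up repeated filters
    return subset
```

For example, select_frame_window(frames, 3, 0, 100, ["episode_index", "frame_index"]) would return at most 100 rows from episode 3, which materialize_subset can then write to a local path for repeated access.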

Example Lance-formatted LeRobot datasets

LeRobot PushT

A Lance-formatted version of lerobot/pusht with frame, episode, and video tables.

LeRobot X-VLA Soft-Fold

A multi-camera robotics dataset packaged as Lance tables for frame-level and episode-level access.

More resources

LeRobotDataset v3.0

Hugging Face’s guide to the v3 dataset layout, streaming, transforms, and migration.

lerobot-lancedb

API documentation for the Lance-backed LeRobotDataset implementations.

When to use each interface

| Interface | Best for |
| --- | --- |
| LeRobotDataset | Standard LeRobot training loops and policy code |
| LeRobotLanceDataset | Drop-in training on Lance-backed image datasets |
| LeRobotLanceVideoDataset | Drop-in training on Lance-backed video datasets |
| LanceDB | Interactive inspection, filtering, curation, search, indexing, and materializing subsets |
| lance.dataset(...) | Lower-level schema, fragment, index, and blob access |
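
For the lower-level lance.dataset(...) route, a sketch along these lines may help. It assumes that each table lives at data/<table>.lance inside the Hub repo, as this guide describes, and that the hf:// URI can be passed straight to lance.dataset; the helper names are hypothetical.

```python
def lance_file_uri(repo_id: str, table: str) -> str:
    """Build the hf:// URI of a single .lance dataset under data/.

    ASSUMPTION: the path layout is inferred from this guide, not verified here.
    """
    return f"hf://datasets/{repo_id}/data/{table}.lance"


def describe_dataset(uri: str) -> dict:
    """Collect schema, row count, fragment count, and index metadata."""
    import lance  # imported lazily; opening a Hub URI needs network access

    ds = lance.dataset(uri)
    return {
        "schema": ds.schema,                  # pyarrow.Schema
        "rows": ds.count_rows(),
        "fragments": len(ds.get_fragments()),
        "indices": ds.list_indices(),
    }
```

A call such as describe_dataset(lance_file_uri("lance-format/lerobot-pusht-lance", "frames")) would report how the frames table is laid out on disk without reading its rows.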