Level 5 Open Data

Advancing self-driving technology, together.

Level 5 is developing a self-driving system for the Lyft network. We’re collecting and processing data from our autonomous fleet and sharing it with you.

We’re collaborating to solve one of the biggest engineering challenges.

At Level 5, we believe we can deliver the benefits of self-driving technology sooner by working together. We share data to empower the research community to accelerate machine learning innovation. Self-driving is too big, and too important, an endeavor for any one team to solve alone.

The road to autonomous driving.


Enter the Motion Prediction Competition

Experiment with the largest-ever self-driving Prediction Dataset to build motion prediction models and compete for $30K in prizes.


Prediction Dataset

Train motion prediction models with the largest collection of prediction data released to date. The dataset includes over 1,000 hours of logged movement from traffic agents, such as cars, cyclists, and pedestrians, that our autonomous fleet encountered on Palo Alto routes.

Learn More

Perception Dataset

Use raw camera and lidar inputs from our fleet of autonomous vehicles to train perception systems. To supplement the data, we’ve included human-labeled 3D bounding boxes of traffic agents and an underlying HD spatial semantic map.

Learn More

Data Collection

Level 5’s in-house sensor suite


Our vehicles are equipped with 40- and 64-beam lidars on the roof and bumper. They have an azimuth resolution of 0.2 degrees and jointly produce approximately 216,000 points at 10 Hz. The firing directions of all lidars are synchronized.
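As a back-of-envelope check, the quoted azimuth resolution and point total are consistent with each other. The sketch below is illustrative only: the beam counts per lidar are an assumption (a three-lidar configuration of 40-beam units reproduces the quoted total), not an official spec beyond the numbers above.

```python
# Illustrative arithmetic for the quoted lidar figures.
# Assumed configuration (hypothetical): three 40-beam lidars.
AZIMUTH_RES_DEG = 0.2                        # quoted azimuth resolution
STEPS_PER_REV = int(360 / AZIMUTH_RES_DEG)   # firing directions per revolution
BEAMS_PER_LIDAR = [40, 40, 40]               # assumption, not an official spec

points_per_sweep = sum(b * STEPS_PER_REV for b in BEAMS_PER_LIDAR)
print(points_per_sweep)                      # 216000, matching ~216,000 points

points_per_second = points_per_sweep * 10    # sweeps run at 10 Hz
print(points_per_second)                     # 2160000 points per second
```

At 0.2-degree resolution there are 1,800 firing directions per revolution, so 120 total beams yield exactly the ~216,000 points per sweep quoted above.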


Our vehicles are also equipped with six 360° cameras built in-house, plus one long-focal camera that points upward. The cameras are synchronized with the lidar so that the lidar beam is at the center of each camera's field of view when an image is captured.

Dataset Features

Real-world data from our autonomous fleet


High-definition semantic map

The datasets include a high-definition semantic map to provide context about traffic agents and their motion. The map features over 4,000 manually annotated semantic elements, including lane segments, pedestrian crosswalks, stop signs, parking zones, speed bumps, and speed humps. All semantic map elements are registered to an underlying geometric map, giving each element a precise location.
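To make the structure concrete, a semantic map like this can be thought of as a collection of typed elements, each carrying geometry registered to the underlying geometric map. The sketch below is a minimal hypothetical representation; the class name, field names, and coordinates are assumptions for illustration, not the released schema.

```python
# Hypothetical sketch of semantic map elements (not the actual dataset schema).
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class SemanticElement:
    element_type: str                     # e.g. "lane_segment", "crosswalk", "stop_sign"
    polygon: List[Tuple[float, float]]    # vertices in the geometric map's frame

# Illustrative elements with made-up coordinates.
elements = [
    SemanticElement("stop_sign", [(12.1, 4.3)]),
    SemanticElement("crosswalk", [(10.0, 0.0), (10.0, 3.5), (11.2, 3.5), (11.2, 0.0)]),
    SemanticElement("lane_segment", [(0.0, 0.0), (50.0, 0.0)]),
]

# A consumer of the map might index elements by type for quick lookup.
by_type: dict = {}
for e in elements:
    by_type.setdefault(e.element_type, []).append(e)

print(sorted(by_type))  # ['crosswalk', 'lane_segment', 'stop_sign']
```

Because every element's geometry lives in the shared geometric map frame, motion prediction models can query nearby crosswalks or lane segments directly from an agent's position.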


Variety of conditions

The datasets capture a variety of real-world scenarios, including vehicles, pedestrians, intersections, and multi-lane traffic.

Level 5 Team

We’re looking for great minds that think different.

Stay Informed

Subscribe for Level 5 updates