Datasets are the lifeblood of machine learning algorithms. They are, figuratively speaking, what teaches artificial intelligence (AI) systems facts about the real world. In areas like autonomous driving, it is crucial that they are of the highest quality.
For this reason, two of the most influential companies in the self-driving ecosystem, Scale Labs and Aptiv PLC’s nuTonomy, recently open-sourced a large research dataset developed primarily to support autonomous driving projects.
nuTonomy released an autonomous driving dataset dubbed nuScenes, which it claims surpasses public datasets such as the Udacity Self-Driving Car library, Baidu’s ApolloScape and KITTI in both size and accuracy.
San Francisco-based data labeling startup Scale, for its part, provided the annotations.
“We’re proud to provide the annotations … as the most robust open source multi-sensor self-driving dataset ever released. We believe this will be an invaluable resource for researchers in developing autonomous vehicle systems and one that will help to shape and accelerate their production for years to come,” said Scale CEO Alexandr Wang.
Self-driving cars depend on artificial intelligence models to make navigation decisions. Those models, in turn, have to be trained on massive amounts of sample data to reach the required accuracy. This is where the new dataset comes into play.
nuTonomy and Scale describe the library as the largest and most detailed autonomous driving dataset available.
According to both companies, that comparison includes DeepDrive, a dataset developed by University of California researchers that is widely considered a go-to choice for self-driving car projects.
DeepDrive consists of 100,000 images and 100,000 video sequences recorded during drives on public roads.
nuScenes, by comparison, comprises 1,000 20-second video clips, 1.4 million images and 400,000 3D scans taken using LIDAR sensors. What’s more, it includes 400,000 bounding boxes drawn on the images to mark objects of interest.
nuTonomy led the data gathering effort. The group, which was established in 2013 as an MIT spinoff, has created a “full-stack” software platform for self-driving vehicles. nuTonomy was purchased last October by Aptiv, formerly Delphi, for more than $400 million.
Scale, on the other hand, has secured about $22.6 million from investors such as Accel and offers a data labeling service for AI training data. Its platform was used to label the videos and photos that make up nuScenes.
Beyond that, most of the companies in the autonomous driving ecosystem that could potentially use nuScenes already work with either Scale or nuTonomy. Scale alone counts General Motors Co., Drive.ai Inc. and Lyft Inc. among its key customers.
Autonomous driving datasets are not a rare commodity. This summer, Oregon-based Flir Systems released about 10,000 labeled images captured by its thermal camera system.
The University of California, Berkeley, meanwhile, uploaded about 100,000 video sequences taken by RGB cameras, while Mapillary published 25,000 street-level photos.