The model retraining cycle

Alexander Lan
3 min read · Jul 31, 2024


Here is a quick example of badly annotated data, automatically detected by our system: of course this is a person, no doubt (taken from BDD100K):

Our method identifies underrepresented, badly annotated, and out-of-distribution data, allowing us to capture difficult-to-simulate variability in the real-world data distribution. By combining generative AI techniques (such as style transfer) with other approaches at a systematic, product level, we can develop faster, more robust, and more accurate perception models for real-world driving. This ultimately improves the safety and performance of self-driving cars and guides the development of advanced simulation tools.

Our analysis of the BDD100K dataset revealed high ratios of incorrectly labeled data: 7% for the “train” label, 4% for the “person” label, and 8% for the “car” label. Such errors can negatively impact model evaluation and results. Here are real-world examples from the dataset, along with our automatic identification of these cases.

Here we see how, through our graph representation of the data, we can detect these cases and automatically build a dataset.
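
As a rough illustration of what that graph looks like in practice, here is a minimal sketch, assuming precomputed image-crop embeddings from some visual encoder; the choice of k and of the cosine metric here is an illustrative assumption, not necessarily what our system uses:

```python
# A node per annotated object, an edge to each of its k most similar
# neighbors in embedding space. The embedding model is abstracted away;
# any encoder that produces one vector per image crop will do.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def build_knn_graph(embeddings: np.ndarray, k: int = 10):
    """Return, for each sample, the indices and cosine similarities
    of its k nearest neighbors."""
    nn = NearestNeighbors(n_neighbors=k + 1, metric="cosine").fit(embeddings)
    distances, indices = nn.kneighbors(embeddings)
    # Drop each sample's self-match and convert cosine distance
    # (1 - similarity) back to similarity.
    return indices[:, 1:], 1.0 - distances[:, 1:]
```

Every check described below is then a query over this neighborhood structure.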

Underrepresented examples:

Here we have a person in a car cluster: most of its neighbors are cars. It has very few neighbors with the same label, yet the dataset contains plenty of people overall. This is a prime example of an underrepresented modality that needs boosting using generative AI (we can use the person cluster to generate more permutations of this rare person modality).
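
In code, that neighborhood test might look like the following sketch; the thresholds are illustrative assumptions, not our production values:

```python
from collections import Counter
import numpy as np

def find_underrepresented(labels, neighbor_idx,
                          max_local_agreement=0.2, min_global_count=500):
    """Flag samples whose label is common globally but rare among
    their nearest neighbors (a rare modality of a common class)."""
    labels = np.asarray(labels)
    global_counts = Counter(labels.tolist())
    flagged = []
    for i, nbrs in enumerate(neighbor_idx):
        local_agreement = np.mean(labels[nbrs] == labels[i])
        if (local_agreement <= max_local_agreement
                and global_counts[labels[i]] >= min_global_count):
            flagged.append(i)  # e.g., the lone person inside a car cluster
    return flagged
```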

By focusing on rare modalities and hard-to-find scenarios, we can use generative AI techniques to augment the dataset and cover more of the space of possible scenarios, capturing difficult-to-simulate variability in the real-world data distribution. This is different from generating synthetic data from scratch: we use style transfer to find common modalities of a label and transform them into the rare modality (people in the day vs. people at night). This approach ultimately leads to more robust and accurate perception models for self-driving cars that can handle a wide range of real-world scenarios and challenges, while also guiding the development of more advanced simulation tools.
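
A minimal sketch of that augmentation step, with the generative model abstracted behind a hypothetical `style_transfer(image, target_style)` callable (the actual model and its interface are not described in this post):

```python
def augment_rare_modality(common_images, target_style, style_transfer, n_samples):
    """Restyle common-modality source images toward the rare modality
    (e.g., daytime pedestrians -> nighttime pedestrians) instead of
    synthesizing new objects from scratch."""
    return [style_transfer(img, target_style) for img in common_images[:n_samples]]

# Usage sketch: boost nighttime pedestrians from daytime examples.
# night_people = augment_rare_modality(day_people, "night",
#                                      style_transfer=my_transfer_model,
#                                      n_samples=200)
```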

Badly labeled data

Here we see a truck in the car cluster: it has many neighbors that are cars, as well as some trucks, but the larger truck cluster is further away. This is either a badly defined label or a bad annotation. We can resolve it automatically by changing the label to car (thus increasing consistency), or send it to annotation along with supporting information to help the annotator and the product team resolve it.
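
The resolution logic might be sketched like this, reusing the k-NN graph from above; the agreement threshold is an illustrative assumption:

```python
from collections import Counter
import numpy as np

def resolve_suspect(i, labels, neighbor_idx, auto_threshold=0.8):
    """Auto-relabel when neighbors agree strongly on another label;
    otherwise route to annotation with the neighborhood as evidence."""
    labels = np.asarray(labels)
    nbr_labels = labels[neighbor_idx[i]].tolist()
    majority, count = Counter(nbr_labels).most_common(1)[0]
    agreement = count / len(nbr_labels)
    if majority != labels[i] and agreement >= auto_threshold:
        # Consistent disagreement: relabel automatically (e.g., truck -> car).
        return {"action": "relabel", "new_label": majority}
    # Ambiguous: send to annotation with supporting information.
    return {"action": "annotate",
            "evidence": {"current_label": str(labels[i]),
                         "neighbor_labels": nbr_labels}}
```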

Badly defined labels

The numbers are the average similarity scores.

Here we see a case of badly defined labels: the Drivable area label contains many very similar images split across two different connected components. This indicates that the initial labeling requirements were improperly set.
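
A sketch of that check, assuming an adjacency matrix for the k-NN graph restricted to a single label; the SciPy/NumPy calls are standard, but the decision threshold for flagging a label is left open:

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def cross_component_similarity(embeddings, adjacency):
    """adjacency: binary (n, n) matrix of one label's k-NN graph.
    Returns the average cosine similarity between each pair of connected
    components; high values flag a suspect label definition."""
    n_comp, comp_ids = connected_components(csr_matrix(adjacency),
                                            directed=False)
    # Normalize embeddings so dot products are cosine similarities.
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = normed @ normed.T
    scores = {}
    for a in range(n_comp):
        for b in range(a + 1, n_comp):
            mask_a, mask_b = comp_ids == a, comp_ids == b
            scores[(a, b)] = float(sims[np.ix_(mask_a, mask_b)].mean())
    return scores
```

Two components that barely connect in the graph yet score a high average similarity are exactly the pattern shown above for Drivable area.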
