By Ahmet Gürhan, Machine Learning Engineer at Multiverse Computing
The work described in this article is the result of a collaborative effort across multiple teams at Multiverse Computing. You can find the full list of contributors at the end of this article.
Executive Summary
- Self-Supervised Learning: We eliminated the need for human annotation by training an AutoEncoder to model the mathematical distribution of "normal" road surfaces, allowing the pipeline to flag deviations without requiring labeled examples of every possible anomaly.
- Real-Time Edge Performance: Our pipeline achieves sub-0.3s latency with less than 1% reduction in accuracy after compression, delivering near-baseline performance optimized for immediate action in autonomous systems.
- 83.3% Compression: Using proprietary quantum-inspired tensorization, we reduced the model from 66M to 11M parameters while maintaining detection precision.

For machine learning engineers, anomaly detection is often an "anomalous" task itself. Unlike standard classification, road anomalies suffer from a "Cold Start" problem: there is a severe underrepresentation of the anomaly class and a nearly infinite variety of what an anomaly can actually be.
In a real-world driving environment, anything from a fallen shipping container to a stray animal can obstruct a path. To address this, we developed a self-supervised pipeline that achieves high generalization power by focusing on the context of the drivable area.
The Philosophy: Learning "Normal" to Find "Strange"
We define an anomaly as any object or condition within the drivable area that obstructs or obscures safe driving. Rather than trying to teach a model every possible road hazard, we focused on modeling the road surface mathematically. If a model understands the intrinsic features of a road surface, any external object will appear as a deviation from the distribution.
The Core Pipeline:
- Drivable Area Segmentation: Identifying the road surface within the frame using the BDD100K dataset.
- Reconstruction: An AutoEncoder (AE) encodes the road patch into a low-dimensional bottleneck and attempts to decode it back to the original image.
- Error Analysis: Because the AE is trained only on "clean" roads, it fails to reconstruct anomalous objects. The pixel-wise difference between the original and reconstructed image generates an anomaly heatmap.
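The reconstruction and error-analysis steps above can be sketched in PyTorch. The tiny AutoEncoder below is purely illustrative (the production model had 66M parameters before compression); the layer sizes and patch dimensions are assumptions for the sake of a runnable example:

```python
import torch
import torch.nn as nn

class RoadAE(nn.Module):
    """Minimal convolutional AutoEncoder for road patches (illustrative only)."""
    def __init__(self):
        super().__init__()
        # Encoder: compress the patch into a low-dimensional bottleneck.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Decoder: attempt to reconstruct the original patch from the bottleneck.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

def anomaly_heatmap(model, patch):
    """Pixel-wise reconstruction error: high values suggest an anomaly."""
    with torch.no_grad():
        recon = model(patch)
    return (patch - recon).abs().mean(dim=1)  # average error over RGB channels

model = RoadAE().eval()
patch = torch.rand(1, 3, 64, 64)  # stand-in for a segmented road patch
heatmap = anomaly_heatmap(model, patch)
print(heatmap.shape)  # torch.Size([1, 64, 64])
```

Because the encoder is trained only on clean road surfaces, an obstruction in the patch cannot be represented well in the bottleneck, so its reconstruction error, and hence its heatmap intensity, is high.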

Technical Spotlight: Solving Edge Constraints
Our initial baseline model featured 66 million parameters. To meet the strict requirements of edge deployment in autonomous vehicles, we had to drastically reduce this footprint.

Quantum-Inspired Compression (SDL): We utilized Singularity Deep Learning (SDL), our proprietary Python package, to perform high-order compression.
- The Method: SDL converts standard torch.nn.Conv2d layers into TensorConv2D layers using High Order Singular Value Decomposition (HOSVD) and tensor contractions.
- The Result: This replicates the behavior of standard convolutions while significantly reducing trainable parameters. We compressed the model to 11 million parameters, slightly improving accuracy metrics in the process.
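SDL itself is proprietary, but the underlying idea can be sketched with plain NumPy: factor a 4-D convolution weight W[out, in, kh, kw] into a small core tensor plus per-mode factor matrices via truncated HOSVD, storing far fewer parameters. The ranks below are made up for illustration and are not the ones used in the real model:

```python
import numpy as np

def hosvd(W, ranks):
    """Truncated HOSVD: per-mode SVD factors, then core via tensor contractions."""
    factors = []
    for mode, r in enumerate(ranks):
        # Unfold W along `mode` and keep the top-r left singular vectors.
        unfolded = np.moveaxis(W, mode, 0).reshape(W.shape[mode], -1)
        U, _, _ = np.linalg.svd(unfolded, full_matrices=False)
        factors.append(U[:, :r])
    core = W.copy()
    for mode, U in enumerate(factors):
        # Contract mode `mode` of the core with U^T to shrink that dimension.
        core = np.moveaxis(np.tensordot(U.T, np.moveaxis(core, mode, 0), axes=1), 0, mode)
    return core, factors

W = np.random.randn(64, 32, 3, 3)            # a sample conv weight tensor
core, factors = hosvd(W, ranks=(16, 8, 3, 3))
orig_params = W.size
comp_params = core.size + sum(f.size for f in factors)
print(orig_params, comp_params)  # the factored form stores far fewer parameters
```

At inference time, a tensorized layer applies the factors and core as a sequence of contractions instead of reassembling the full kernel, which is what keeps the parameter (and compute) footprint low.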
Engineering Hurdles & Mitigation Strategies
While the reconstruction-error heatmap highlights pixel-level deviations, converting it into localized anomaly detections and an image-level binary anomaly decision required solving three critical post-processing challenges:
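One common way to perform the heatmap-to-detections conversion (a hedged sketch of thresholding plus connected-component labeling, not necessarily the production post-processing) looks like this:

```python
import numpy as np
from scipy import ndimage

def heatmap_to_boxes(heatmap, threshold=0.5, min_area=20):
    """Threshold the heatmap, label connected regions, emit bounding boxes."""
    mask = heatmap > threshold
    labels, n = ndimage.label(mask)        # connected-component labeling
    boxes = []
    for i in range(1, n + 1):
        ys, xs = np.where(labels == i)
        if ys.size >= min_area:            # drop tiny speckle regions
            boxes.append((int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())))
    return boxes

heatmap = np.zeros((64, 64))
heatmap[10:30, 40:60] = 0.9                # one synthetic anomaly blob
print(heatmap_to_boxes(heatmap))  # [(40, 10, 59, 29)]
```

The `threshold` and `min_area` parameters are hypothetical knobs; in practice their values determine the trade-off between missed small obstacles and false alarms from reconstruction noise.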


Performance Metrics
Testing was conducted on an image-wise basis using a curated dataset. To calculate final scores, we used a Gaussian Mixture Model (GMM) on the anomaly heatmaps.
GMM Scoring Logic: The GMM fits two components to the heatmap pixel distribution (representing "high" and "low" error). The final score is derived as:
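The two-component fit can be sketched with scikit-learn. Since the exact production score formula is not reproduced here, the score below is one plausible variant (the mean of the high-error component), labeled as an assumption:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def gmm_anomaly_score(heatmap):
    """Fit a 2-component GMM to heatmap pixels; return the mean of the
    high-error component (an assumed scoring choice, not the exact
    production formula)."""
    pixels = heatmap.reshape(-1, 1)
    gmm = GaussianMixture(n_components=2, random_state=0).fit(pixels)
    return float(gmm.means_.max())

rng = np.random.default_rng(0)
clean = np.abs(rng.normal(size=(64, 64))) * 0.05   # low-error "clean road" heatmap
anomalous = clean.copy()
anomalous[20:40, 20:40] += 0.8                     # injected high-error region
print(gmm_anomaly_score(clean), gmm_anomaly_score(anomalous))
```

On a clean heatmap both components sit near zero, so the score stays low; an anomalous region pulls the high-error component's mean up, producing a clearly larger image-level score.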

Final Results
The results demonstrate that the model maintains strong detection performance while achieving significant parameter reduction, validating its effectiveness for real-time edge deployment.

AUC (Area Under the ROC Curve) measures how well the model separates normal vs. anomalous samples across all thresholds (higher = better overall discrimination). AP (Average Precision) summarizes the precision-recall tradeoff, making it more informative under class imbalance. Accuracy at Best Threshold (%) reports classification accuracy at the threshold that maximizes (TPR - FPR), i.e., the point that best separates true positives from false positives.
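These three metrics can be computed with scikit-learn; the labels and scores below are synthetic stand-ins, not the curated test set:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score, roc_curve

y_true = np.array([0, 0, 0, 0, 1, 1])            # 1 = anomalous image
scores = np.array([0.1, 0.2, 0.3, 0.35, 0.8, 0.9])  # image-level anomaly scores

auc = roc_auc_score(y_true, scores)
ap = average_precision_score(y_true, scores)

# "Accuracy at Best Threshold": choose the threshold maximizing TPR - FPR
# (Youden's J statistic), then report plain accuracy at that operating point.
fpr, tpr, thresholds = roc_curve(y_true, scores)
best = thresholds[np.argmax(tpr - fpr)]
acc = float(np.mean((scores >= best) == y_true))
print(auc, ap, acc)  # 1.0 1.0 1.0 for this cleanly separated toy data
```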

Conclusion
We took on an ambitious goal: detecting road anomalies that may put the driver in danger, even when those anomalies are rare, highly variable, and difficult to capture in a representative dataset. Recognizing that it is not feasible to collect examples of every possible anomaly, we shifted our approach toward modeling the road surface as accurately as possible and identifying deviations from that learned representation.
This foundation allowed us to flag unexpected elements within the drivable area, while post-processing converted intensity-based anomaly signals into discrete, analyzable object detections. Using our expertise in deep learning architectures and novel compression techniques, we compressed our end product by 83.3% while preserving nearly all of its accuracy.
I would like to thank Marlon Cajamarca Vega for leading the whole project from the beginning and my manager Ricardo García for suggesting this AutoEncoder-based idea and guiding us through the process. Thanks to my teammates Amira Bezine, Luis Castillo, and Hazem Chammakhi for their great work on the other branches of this project.
About Multiverse Computing
Multiverse Computing is the leader in quantum-inspired AI model compression. The company's deep expertise in quantum software led to the development of CompactifAI, a revolutionary compressor that reduces computing requirements and unleashes new use cases for AI across industries. Headquartered in Donostia, Spain, with offices in the United States, Canada, and across Europe, Multiverse serves more than 100 global customers, including Iberdrola, Bosch, and the Bank of Canada.