Computer Vision Artificial Intelligence Model Development

By Avionics Team | November 20, 2024

As part of our SUAS 2025 mission planning, the Avionics Team is actively developing a modular computer vision system capable of detecting and geolocating targets in real time. This system builds on SUAS guidelines for autonomous object detection and localization, and is designed to operate entirely through software without requiring changes to UAV hardware.


Our current implementation uses a Docker-based inference pipeline integrated with Roboflow to manage multiple computer vision models in parallel. This architecture enables reliable, high-performance image processing with minimal integration overhead.
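
To make the parallel-inference idea concrete, here is a minimal Python sketch of a client fanning one image out to all three models hosted in the container. The port, route, and model IDs below are illustrative placeholders, not our actual configuration:

    import concurrent.futures
    import requests

    INFERENCE_URL = "http://localhost:9001"  # assumed local port for the container
    MODEL_IDS = ["suas-yolov11l/1", "yolo-nas-l/1", "suas-yolov11x/1"]  # hypothetical IDs

    def infer(image_bytes: bytes, model_id: str) -> dict:
        """POST one image to one model's endpoint and return its JSON detections."""
        resp = requests.post(
            f"{INFERENCE_URL}/infer/{model_id}",  # illustrative route
            files={"image": ("frame.jpg", image_bytes, "image/jpeg")},
            timeout=10,
        )
        resp.raise_for_status()
        return resp.json()

    def infer_all(image_bytes: bytes) -> list[dict]:
        """Query all three models in parallel and collect their predictions."""
        with concurrent.futures.ThreadPoolExecutor(max_workers=len(MODEL_IDS)) as pool:
            futures = [pool.submit(infer, image_bytes, m) for m in MODEL_IDS]
            return [f.result() for f in futures]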


Detection Pipeline Overview:


  • Ingestion: Incoming images from the UAV’s onboard camera are split into overlapping tiles. This increases robustness by ensuring partially visible or small targets are not missed.
  • Parallel Model Inference: Each tile is processed by three different YOLO-based models:
    • YOLOv11-L: Custom trained for SUAS-relevant targets
    • YOLO-NAS-L: Optimized for low-latency, general-purpose inference
    • YOLOv11-X: Tuned for edge-case coverage
  • Consensus Fusion: Model predictions are merged using a consensus algorithm to reduce false positives and produce a final stitched prediction map (a sketch of the tiling and fusion steps follows this list).
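
Here is a minimal sketch of the tiling and fusion steps in plain Python. The tile size, overlap, IoU threshold, and two-of-three voting rule are illustrative choices rather than our exact competition parameters:

    from dataclasses import dataclass

    @dataclass
    class Box:
        x1: float
        y1: float
        x2: float
        y2: float
        score: float
        model: str  # which model produced this detection

    def tile_origins(width, height, tile=640, overlap=128):
        """Yield top-left corners of overlapping tiles covering the image.

        Detections made inside a tile are shifted by (x, y) back into
        full-image coordinates before fusion; edge tiles would be clamped
        or padded in practice.
        """
        step = tile - overlap
        for y in range(0, max(height - overlap, 1), step):
            for x in range(0, max(width - overlap, 1), step):
                yield x, y

    def iou(a: Box, b: Box) -> float:
        """Intersection-over-union of two boxes."""
        ix1, iy1 = max(a.x1, b.x1), max(a.y1, b.y1)
        ix2, iy2 = min(a.x2, b.x2), min(a.y2, b.y2)
        inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
        area_a = (a.x2 - a.x1) * (a.y2 - a.y1)
        area_b = (b.x2 - b.x1) * (b.y2 - b.y1)
        return inter / (area_a + area_b - inter) if inter else 0.0

    def consensus(boxes: list[Box], iou_thresh=0.5, min_models=2) -> list[Box]:
        """Keep a box only if detections from >= min_models distinct models overlap it."""
        kept = []
        for box in sorted(boxes, key=lambda b: b.score, reverse=True):
            supporters = {b.model for b in boxes if iou(box, b) >= iou_thresh}
            if len(supporters) >= min_models and not any(
                iou(box, k) >= iou_thresh for k in kept  # suppress duplicates of kept boxes
            ):
                kept.append(box)
        return kept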

Backend Integration:


The detection system is fully containerized using Docker and exposed through a lightweight HTTP interface. This setup allows any component, such as our flight computer or ground control station, to submit images and receive detection results with simple RESTful API calls.
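
From a consumer's perspective, a single round trip might look like the following. The /detect route and the response shape are assumptions for illustration; the container's actual API may differ:

    import requests

    def detect(image_path: str, server: str = "http://localhost:9001") -> list[dict]:
        """Submit one image to the detection container and return its detections."""
        with open(image_path, "rb") as f:
            resp = requests.post(f"{server}/detect", files={"image": f}, timeout=30)
        resp.raise_for_status()
        # Assumed response shape:
        # {"detections": [{"class": ..., "confidence": ..., "bbox": [...]}, ...]}
        return resp.json()["detections"]

    for det in detect("capture_0001.jpg"):
        print(det["class"], det["confidence"], det["bbox"])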


  • Consistent Environments: Containerization ensures identical performance across local development and field environments.
  • Offline Capability: The container runs locally, enabling real-time inference without internet dependency.
  • Ease of Updates: New model versions can be deployed by simply replacing the container image.

System Integration:


To meet the real-time demands of SUAS, the GCS includes a multi-threaded client architecture (see the sketch after this list):


  • Thread 1: Monitors and ingests new camera captures
  • Thread 2: Batches and sends frames to the detection server
  • Thread 3: Integrates detections with our geomatics pipeline to compute precise geocoordinates
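
A skeleton of that three-thread layout, using Python's queue module to pass work between stages. Here wait_for_new_capture and geolocate_detection are hypothetical stand-ins for our camera and geomatics code, and detect() is the helper from the earlier REST sketch:

    import queue
    import threading

    frames = queue.Queue()   # Thread 1 -> Thread 2
    results = queue.Queue()  # Thread 2 -> Thread 3

    def wait_for_new_capture() -> str:
        """Hypothetical helper: block until the camera writes a new image, return its path."""
        ...

    def geolocate_detection(frame: str, det: dict) -> tuple:
        """Hypothetical helper: combine a detection with telemetry to get lat/lon."""
        ...

    def watch_camera():
        # Thread 1: monitor and ingest new camera captures.
        while True:
            frames.put(wait_for_new_capture())

    def send_batches(batch_size: int = 4):
        # Thread 2: batch frames and send them to the detection server.
        while True:
            batch = [frames.get() for _ in range(batch_size)]
            for frame in batch:
                results.put((frame, detect(frame)))  # detect() as in the earlier REST sketch

    def geolocate():
        # Thread 3: hand detections to the geomatics pipeline for geocoordinates.
        while True:
            frame, dets = results.get()
            for det in dets:
                print(geolocate_detection(frame, det))

    for worker in (watch_camera, send_batches, geolocate):
        threading.Thread(target=worker, daemon=True).start()

    threading.Event().wait()  # keep the main thread alive while the workers run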

Why This Works for UAVs:


This design gives us high-accuracy vision capabilities with minimal impact on system complexity. By using standard APIs and containerization, the vision subsystem integrates cleanly with our existing software stack and runs efficiently on our mission computer hardware. This makes the system maintainable, scalable, and aligned with our SUAS strategy for autonomous object detection and geolocation.