AeroLoop, Autonomous Aircraft Detection System
A system that builds, improves, and validates its own models at the edge
Context
Built for the 2025 Edge Impulse Hackathon over one month (31 Oct to 30 Nov).
The concept: a fully autonomous pipeline that collects audio, validates samples, and improves its own aircraft detection model with minimal human oversight.
This wasn’t a model project.
It was a system project from end to end.
The Problem
Detecting aircraft in noisy outdoor environments generates mountains of irrelevant data. A Pi recording 24/7 might capture 1 hour of aircraft and 23 hours of wind, traffic, and silence.
Manually annotating a full day of recordings would take an estimated 23-34 hours.
Even worse: most negative samples are useless for training. What you need are hard negatives, the sounds that confuse the model, like machinery, cars, and construction.
The real question: How do you find hard negatives without drowning your annotators?
The Solution
Three techniques in one autonomous loop:
1. Ground Truth via Sensor Fusion
An RTL-SDR receiver decodes aircraft transponder signals. When a plane comes within a 3 km radius, the system triggers a 60-second recording, and the aircraft’s presence becomes the label.
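The radius check behind this trigger can be sketched as below. This is a minimal illustration, not the project's code: it assumes each aircraft arrives as a dict with optional `lat` and `lon` keys (the shape dump1090's JSON output is assumed to have), the function names are hypothetical, and it measures horizontal distance only, ignoring altitude.

```python
import math

EARTH_RADIUS_KM = 6371.0
TRIGGER_RADIUS_KM = 3.0  # from the text: record when a plane is within 3 km


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def aircraft_in_range(aircraft, station_lat, station_lon, radius_km=TRIGGER_RADIUS_KM):
    """Return the aircraft entries with a decoded position inside the radius."""
    nearby = []
    for ac in aircraft:
        if "lat" not in ac or "lon" not in ac:
            continue  # position not decoded yet, so distance is unknown
        if haversine_km(station_lat, station_lon, ac["lat"], ac["lon"]) <= radius_km:
            nearby.append(ac)
    return nearby
```

A recording would start whenever `aircraft_in_range(...)` returns a non-empty list. The same function with `radius_km=10.0` could serve the "no aircraft nearby" check used for negative sampling.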
2. Hard-Negative Mining at the Edge
When no aircraft is within 10 km, the system records 20 seconds of background noise and immediately runs on-device inference. If every prediction stays below 0.4 confidence, it deletes the sample. If any prediction reaches 0.4 or higher, it keeps the sample, because model confusion is valuable training data.
Result: 91-98% of negatives rejected on-device. ~5.5 hours of annotation time saved.
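The keep-or-delete rule can be written in a few lines. This is a sketch under stated assumptions: `aircraft_scores` is taken to be the model's aircraft-class confidence for each inference window in the 20-second recording, and the function names are hypothetical.

```python
KEEP_THRESHOLD = 0.4  # from the text: keep the sample if any prediction is at least 0.4


def is_hard_negative(aircraft_scores, threshold=KEEP_THRESHOLD):
    """True if at least one window looked aircraft-like to the model."""
    return any(score >= threshold for score in aircraft_scores)


def filter_negative(aircraft_scores):
    """Decide what to do with a background recording made with no aircraft nearby."""
    return "keep" if is_hard_negative(aircraft_scores) else "delete"
```

For example, `filter_negative([0.05, 0.12, 0.41])` keeps the recording because one window crossed the threshold, while `filter_negative([0.05, 0.12, 0.39])` deletes it.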
3. Remote MLOps Pipeline
Automated workflow: trim audio, upload to Edge Impulse, retrain, evaluate, build the TFLite model, and deploy to the Pi only if accuracy improves. It runs autonomously via SSH and the Edge Impulse (EI) API.
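The deploy-only-if-better gate at the end of that workflow can be sketched as follows. This is an illustration, not the project's code: `run_iteration`, `steps`, `evaluate`, and `deploy` are hypothetical names, and the real SSH and Edge Impulse API calls are stood in for by plain callables.

```python
def run_iteration(steps, evaluate, deploy, deployed_accuracy):
    """Run one loop iteration and deploy only if test accuracy improves.

    steps: callables run in order (e.g. trim, upload, retrain, build).
    evaluate: callable returning the candidate model's test accuracy.
    deploy: callable that pushes the candidate model to the device.
    Returns (accuracy of the model now deployed, whether a deploy happened).
    """
    for step in steps:
        step()
    candidate_accuracy = evaluate()
    if candidate_accuracy > deployed_accuracy:
        deploy()
        return candidate_accuracy, True
    # Candidate is no better: leave the currently deployed model in place.
    return deployed_accuracy, False
```

Comparing against a fixed test set is what makes this gate meaningful: accuracy figures from different iterations are only comparable if they are measured on the same samples.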
System Architecture
Raspberry Pi 4 (The Collector)
- Decodes ADS-B transponder signals (dump1090)
- Records audio via the Arduino Nano, which acts as a USB microphone
- Runs TFLite inference for negative filtering
- Stores validated samples with metadata
Arduino Nano 33 BLE Sense (The Target Device)
- 256 KB RAM / 1 MB flash constraint
- Triple duty: collection mic, validation target, production deployment
- Using the same hardware throughout ensures acoustic consistency
Local Machine (The Orchestrator)
- Remote download via SSH
- Streamlit annotation GUI with audio trimming
- Automated training, evaluation, conditional deployment
Key Design Decisions
Audio Processing
- 2-second windows (fits Nano memory, <500 ms inference)
- MFE DSP: 32,000 samples per window reduced to 1,984 features
- Compact CNN: 4x Conv2D, then 128 neurons, then a 2-class output
- Weighted moving average over 3 predictions for stable real-time output
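The smoothing step in the last bullet can be sketched as below. The window length of 3 comes from the text; the weights `(0.2, 0.3, 0.5)` and the class name `WeightedSmoother` are assumptions chosen for illustration, with the newest prediction weighted most heavily.

```python
from collections import deque


class WeightedSmoother:
    """Weighted moving average over the most recent predictions."""

    def __init__(self, weights=(0.2, 0.3, 0.5)):
        # Weights are ordered oldest to newest (illustrative values).
        self.weights = tuple(weights)
        self.history = deque(maxlen=len(self.weights))

    def update(self, score):
        """Add a new prediction and return the smoothed score."""
        self.history.append(score)
        # While the buffer is still filling, use only the newest weights
        # and renormalise so the result stays on the same scale.
        active = self.weights[-len(self.history):]
        return sum(w * s for w, s in zip(active, self.history)) / sum(active)
```

A single spurious high score is damped rather than passed straight through, and its influence fades to nothing after three further windows.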
Training Strategy
- 50:50 train/test split initially (needed headroom to measure improvement)
- Test set held fixed throughout; as training data was added, the split shifted to 80:20 by project end
- GPU training: <10 minutes per iteration
Results
Baseline (Iteration 0)
- 25 hours collection, 4.5 hours annotation
- 142.6 minutes total (40.5 aircraft, 102.1 negative)
- 94.73% test accuracy
Autonomous Loop (Iterations 1-4)
- Test accuracy: 94.73% to 95.9%
- Class distribution shifted naturally: 28:72 to 62:38, towards model weaknesses
- Negative drop rate: 91-98%
- Annotation time saved: 5.5 hours
The DSP Failure (Iterations 5-7)
Aircraft flight patterns changed (takeoffs vs. landings). The model adapted, but my custom DSP implementation diverged from Edge Impulse’s MFE block. On-device filtering broke (0% drop rate) while accuracy in Edge Impulse Studio kept improving.
Root cause: the 32-bit Pi couldn’t run EI’s Linux SDK. The custom NumPy/SciPy DSP worked initially but failed as the model grew more sophisticated.
The lesson: Production systems need production-grade tooling.
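One way to surface this kind of divergence early is a feature parity check: compute features for the same audio window with both implementations and compare them. The sketch below is illustrative only and is not something the project is stated to have used; the function names and the `1e-3` tolerance are assumptions, and obtaining the reference features is left out.

```python
def max_abs_diff(reference, candidate):
    """Largest element-wise gap between two feature vectors."""
    if len(reference) != len(candidate):
        raise ValueError(
            f"feature count mismatch: {len(reference)} vs {len(candidate)}"
        )
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)


def features_match(reference, candidate, tolerance=1e-3):
    """True if the custom DSP output stays within tolerance of the reference."""
    return max_abs_diff(reference, candidate) <= tolerance
```

Run against a handful of fixed audio windows after every retrain, a check like this would flag a mismatch as soon as the features drift, rather than when the on-device drop rate collapses.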
Final Production Model
- 97.11% test accuracy (F1: 0.96 aircraft, 0.98 negative)
- Total negatives rejected: 330.31 minutes (5.5 hours)
- Annotation time savings: 73% (7.5 hours to 2 hours)
- Real-world validation: Nano deployed at window, predictions aligned with FlightRadar24
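The headline savings figures can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted above; the variable names are illustrative.

```python
rejected_minutes = 330.31    # total negatives rejected on-device
hours_without_filter = 7.5   # annotation time before filtering
hours_with_filter = 2.0      # annotation time after filtering

hours_saved = hours_without_filter - hours_with_filter
percent_saved = 100 * hours_saved / hours_without_filter
rejected_hours = rejected_minutes / 60

print(f"saved {hours_saved:.1f} h ({percent_saved:.0f}%), "
      f"rejected {rejected_hours:.1f} h of audio")
# saved 5.5 h (73%), rejected 5.5 h of audio
```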
Why This Matters
Traditional ML: collect everything, label everything, train, then hope it works
AeroLoop: deploy model, collect only what improves it, retrain automatically, repeat
This approach scales. The 73% reduction in annotation time over 60 hours isn’t the story. The story is a self-improving system that respects human time, hardware constraints, and real-world messiness.
Domain Transferability
The methodology generalises to any domain with:
- Ground-truth sensor (SDR, GPS, camera, scheduled events)
- Target sensor (microphone, IMU, accelerometer)
- Sparse events in noisy data
Hypothetical applications: gunshot detection, predictive maintenance, wildlife monitoring.
Links
- GitHub: github.com/blakedownward/aero_loop
- Edge Impulse: studio.edgeimpulse.com/public/833123/live
- Dataset: zenodo.org/records/17765718
- Demo: youtu.be/uabWnYxGiCg
Built with Edge Impulse, Raspberry Pi, Arduino, and a philosophy: collect less, learn more.