Inspiration
What Inspired Me
The inspiration for this project came from the need for versatile object counting tools in scenarios where training data is scarce and diverse object types must be handled. Traditional counting models tend to be class-specific and often require substantial retraining for each new category. After discovering the CounTR model’s class-agnostic, exemplar-driven approach, I was motivated to build a solution that uses few-shot learning for automated counting on ARM devices, making advanced AI more accessible beyond cloud environments.
I chose the fruit detection and counting scenario because it represents a real-world challenge: accurately assessing crop yield and quality in agricultural settings. Fruit counting is important for growers, researchers, and supply chain companies, where object shapes, colors, and lighting conditions vary dramatically. By selecting fruits, I could demonstrate the flexibility of the few-shot CounTR model in solving practical counting tasks, especially where category-specific models would require retraining or manual annotation.
What I Learned
Throughout this project, I learned about:
The Counting Transformer (CounTR) technique and its attention-based mechanism for comparing exemplars and image patches
The concept and practical benefits of class-agnostic, few-shot object counting for real-world applications
ARM device optimization, including efficient model deployment for edge scenarios
Generating and interpreting density maps as part of the counting workflow
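To make the attention idea above concrete, here is a minimal sketch of scaled dot-product cross-attention in plain Python. It is not the CounTR implementation: the feature vectors are toy values, and the names `softmax` and `cross_attention` are hypothetical helpers introduced only for illustration. It assumes image patches act as queries and exemplars as keys, and shows how each patch ends up with a weight for every exemplar.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(patch_feats, exemplar_feats):
    """For each patch (query), return attention weights over exemplars (keys)."""
    dim = len(exemplar_feats[0])
    scale = math.sqrt(dim)  # standard scaled dot-product normalization
    weights = []
    for q in patch_feats:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in exemplar_feats]
        weights.append(softmax(scores))
    return weights

# Toy 2-D features (assumed values, not real model outputs).
patches = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
exemplars = [[1.0, 0.0], [0.0, 1.0]]
attn = cross_attention(patches, exemplars)

for i, row in enumerate(attn):
    print(f"patch {i}: " + ", ".join(f"{w:.3f}" for w in row))
```

Patches whose features resemble an exemplar receive a higher weight for it, which is the signal a counting model can then decode into a density map.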
How I Built the Project
Target device: MacBook Air M1
To strengthen the project, I used the deepNIR dataset ("Datasets for Generating Synthetic NIR Images and Improved Fruit Detection System Using Deep Learning Techniques," Sensors, 2022). This dataset offers both synthetic NIR and visible-spectrum images, making it possible to experiment with multi-modal inputs and robust fruit detection in challenging conditions (e.g., varying illumination or occlusions). I integrated the CounTR model, which was pre-trained on the large and diverse FSC-147 dataset: 6,135 images spanning 147 object categories, with an average of 56 objects per image.
I used a MATLAB-based implementation and adapted the pipeline to run efficiently on an ARM-based device.
The workflow involves selecting a few exemplar images of the target object, then processing a new image by comparing its patches against those exemplars using the model’s attention mechanism.
The model outputs a density map, which can be visualized directly and summed to obtain the count, with no extra training even when the image contains new categories or visual variations.
The codebase includes open-source libraries for deep learning and image processing.
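The density map step can be illustrated with a small, self-contained sketch. This is plain Python rather than the MATLAB pipeline used in the project, and it builds a synthetic density map instead of running a model: each object is assumed to contribute one Gaussian blob normalized to sum to 1, so summing the whole map recovers the count. The function names, map size, and object centers are all hypothetical.

```python
import math

def gaussian_blob(height, width, cy, cx, sigma=1.5):
    """Return a height x width Gaussian centered at (cy, cx), normalized to sum to 1."""
    blob = [[math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2 * sigma ** 2))
             for x in range(width)] for y in range(height)]
    total = sum(map(sum, blob))
    return [[v / total for v in row] for row in blob]

def make_density_map(height, width, centers, sigma=1.5):
    """Accumulate one unit-mass blob per object center."""
    density = [[0.0] * width for _ in range(height)]
    for cy, cx in centers:
        blob = gaussian_blob(height, width, cy, cx, sigma)
        for y in range(height):
            for x in range(width):
                density[y][x] += blob[y][x]
    return density

def count_from_density(density):
    """The estimated count is the sum of all density values."""
    return sum(map(sum, density))

# Three hypothetical fruit centers on a 24 x 32 map.
centers = [(5, 5), (10, 20), (15, 12)]
density = make_density_map(24, 32, centers)
estimated = count_from_density(density)
print(f"estimated count: {estimated:.2f}")
```

A real model's density map will not sum to an exact integer, so the total is usually rounded, and some implementations apply a scaling factor to the map before summing.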
Challenges Faced
Optimizing inference speed and memory usage for ARM hardware versus standard desktop GPUs.
Ensuring that the model remained accurate despite variations in object size, color, shape, and texture.
Finding reliable methods to select good exemplars for few-shot counting, since exemplar quality directly affects count accuracy.
Adapting pre-trained models so they run efficiently on resource-constrained devices without sacrificing flexibility.
Debugging data pipeline issues and visualizing intermediate outputs to validate the solution.
Accomplishments that I am proud of
Used exemplar images of target fruits as references.
Applied the CounTR model’s attention mechanism to compare image patches and exemplars, counting fruits automatically.
Incorporated data augmentation and domain adaptation techniques from the deepNIR dataset to boost detection reliability.
Built the pipeline to run efficiently on ARM devices, enabling edge deployment for farm or field monitoring.
What I learned
Machine learning and deep learning procedures and optimization techniques, picked up while studying and analyzing the dataset.
What's next for AI-powered app using ARM device
Expand object types: Adapt the app for counting flowers, leaves, and other agricultural products using different exemplars.
Real-time field deployment: Integrate with camera modules for continuous, on-site yield monitoring in farms and orchards.
Edge optimization: Further improve inference speed and power efficiency for low-cost, battery-powered ARM devices.
User-friendly interface: Develop a mobile/web dashboard for easy exemplar selection, image uploads, and result visualization.
Advanced data fusion: Add support for thermal, hyperspectral, or drone imagery alongside RGB/NIR inputs.
Collaborative annotation: Enable users to share exemplars and results, crowdsourcing improved detection performance.
Model update pipeline: Experiment with active learning to update exemplar sets and refine counts as new data arrives.