Inspiration
One of our team members spent a summer building a vision analytics system to identify bottlenecks in CT scanning workflows, tackling one of Ontario's most persistent healthcare problems: medical imaging wait times that can stretch months. The work was meaningful, but it came with a hard constraint. Research Ethics Board requirements (rightfully) prevented any real patient data from leaving local machines, which meant cloud-based annotation platforms like Roboflow and CVAT were completely off the table. The workaround involved synthetic data and team members roleplaying CT scan scenes, which was less than ideal. The deeper problem is that this isn't unique. Teams worldwide who are building technology for social good, healthcare, humanitarian response, or satellite imagery analysis face the same issue: sensitive data can't leave a controlled environment, and no annotation tooling exists for that reality. datascale is what we wished had existed.
What it does
datascale is a collaborative data annotation platform that runs entirely on your local network via Tailscale. Teams can create projects, upload images, and annotate them with bounding boxes and polygons either manually or with AI assistance.
The AI layer runs fully on-device on Apple Silicon. No cloud APIs, no data leaving your network. Users can click or drag a box to segment objects with MobileSAM, auto-segment entire images, or type a command like "annotate all dogs" to have a local AI agent find, segment, and label every matching object. A quality review agent audits annotations for label mismatches, geometric anomalies, and missing annotations. Tailscale-based admin controls manage access.
Challenges we ran into
Tailscale ACL policy confusion: we had applied machine tags to annotator machines. Tagged devices are treated as service accounts rather than being tied to a single user, so Tailscale could not attribute requests from those machines to a user, and the requests arrived without the injected Tailscale identity headers such as tailscale-user-login. As a result, our auth was failing. To fix this, we switched to a group policy approach to authentication.
Noise reduction with detection before segmentation: our initial approach asked SAM to segment everything in the image first, then filter by label, which meant wading through dozens of irrelevant masks. Flipping the order fixed it: YOLO-World takes the text query and returns only the high-confidence bounding boxes that match, then SAM2 generates pixel-level masks within those tight regions.
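The detect-then-segment order described above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: `annotate_by_text`, the `detect` and `segment` callables, and the `min_score` default are all hypothetical stand-ins for the YOLO-World and SAM2 wrappers, whose real interfaces are not shown in this write-up.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
Polygon = List[Tuple[float, float]]


@dataclass
class Detection:
    label: str
    box: Box
    score: float


@dataclass
class Annotation:
    label: str
    box: Box
    polygon: Polygon


def annotate_by_text(
    image: object,
    query: str,
    detect: Callable[[object, str], Sequence[Detection]],  # hypothetical detector wrapper
    segment: Callable[[object, Box], Polygon],  # hypothetical box-prompted segmenter wrapper
    min_score: float = 0.5,  # assumed threshold, not taken from the project
) -> List[Annotation]:
    """Detect matching boxes first, then segment only inside the boxes that survive."""
    annotations: List[Annotation] = []
    for det in detect(image, query):
        if det.score < min_score:
            continue  # drop low-confidence boxes before any mask is computed
        polygon = segment(image, det.box)
        annotations.append(Annotation(det.label, det.box, polygon))
    return annotations
```

Because the segmenter only ever sees boxes that already match the query, no irrelevant masks are produced in the first place, so there is nothing to filter afterwards.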
AI quality review: early implementations produced many false positives because SAM's segment-everything mode generates dozens of regions that weren’t relevant. We iterated by raising thresholds, adding CLIP filtering, capping issue counts, and using YOLO-World for targeted detection.
Real-time collaboration: making WebSocket connections work across Tailscale, Vite dev proxy, and CORS required careful routing.
Accomplishments that we're proud of
- Tailscale-native RBAC (role-based access control at the network level) via ACLs. Role checks use Tailscale identity to determine group membership, and ACLs control access to IPs at specific ports; the system fails closed if the admin service is down.
- Zero cloud dependencies for AI. MobileSAM, SAM2, YOLO-World, OpenCLIP, and Ollama all run locally on Apple Silicon to ensure data privacy.
- Natural language annotation works. Commands like "annotate every dog" generate precise polygon annotations on-device in one request, increasing annotator productivity.
- Tool-calling agent loop: A local LLM decides which vision tools to call (segment, count, describe, quality check) and chains them in multi-turn conversations.
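To illustrate the fail-closed role check from the first bullet above, here is a rough sketch under stated assumptions. The header name comes from the challenge described earlier; everything else (`resolve_role`, `fetch_groups`, `AdminServiceUnavailable`, and the group names `group:admins` and `group:annotators`) is hypothetical and only shows the shape of the logic, not our actual service.

```python
from typing import Callable, Mapping, Optional, Set

LOGIN_HEADER = "tailscale-user-login"  # identity header injected by Tailscale


class AdminServiceUnavailable(Exception):
    """Hypothetical error raised when the group lookup service cannot be reached."""


def resolve_role(
    headers: Mapping[str, str],
    fetch_groups: Callable[[str], Set[str]],  # hypothetical lookup: login -> group names
) -> Optional[str]:
    """Return the caller's role, or None (deny) whenever identity or groups are unknown."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    login = normalized.get(LOGIN_HEADER)
    if not login:
        return None  # e.g. a tagged device: no user identity, so no role
    try:
        groups = fetch_groups(login)
    except AdminServiceUnavailable:
        return None  # fail closed: an outage must never grant access
    if "group:admins" in groups:  # assumed group name
        return "admin"
    if "group:annotators" in groups:  # assumed group name
        return "annotator"
    return None
```

Returning `None` on every uncertain path means a request is only ever authorized when both the identity header and the group lookup succeed.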
What's next for datascale
- Granular roles (particularly the reviewer role).
- Smarter quality review using YOLO-World for label-aware detection.
- Active learning to surface informative unlabeled images.
- Model training integration to export annotations and fine-tuned models.
- Annotation versioning to track changes and roll back mistakes.
- Multi-device support beyond Apple Silicon to CUDA and CPU-only setups.
Built With
- express.js
- fastapi
- node.js
- ollama
- react
- sqlite
- tailscale
- vite
- zustand