Inspiration

Bio-Fed began with decentralization as its foundation. Much of today’s environmental data infrastructure is centralized: observations flow into a single platform, governed by a single authority, dependent on a single database. That creates structural fragility. Policy shifts, funding changes, or infrastructure failures can affect entire communities at once. Inspired by federated systems like ActivityPub, Bio-Fed explores a different model: one where each participant runs their own instance, retains ownership of their data, and connects to others through open protocols rather than a central gatekeeper. If one server disappears, the network continues.

The second motivation was location risk. Biodiversity data is powerful, but precise geographic coordinates can expose sensitive habitats, rare species, and vulnerable ecosystems. Instead of treating privacy as an optional setting, Bio-Fed enforces it architecturally. Exact latitude and longitude are never stored or federated; location is converted into a precision-controlled fixphrase before validation. A fixphrase is a human-readable set of two, three, or four words that represents a geographic area at different levels of granularity. Higher word counts mean finer precision; lower counts intentionally reduce specificity. Coordinate fields are actively rejected by the schema, so a stray coordinate cannot slip through validation by accident.
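As a rough illustration of the precision levels, the sketch below reduces a phrase to a chosen word count. It assumes that the leading words of a fixphrase identify the coarser area, so dropping trailing words lowers precision without moving the location. The function name `limitPrecision` and the sample words are hypothetical and are not taken from the Bio-Fed codebase.

```javascript
// Hypothetical helper: keep only the first `words` words of a fixphrase.
// Assumption: leading words describe the coarser area, so truncation
// reduces precision rather than changing the location.
function limitPrecision(phrase, words) {
  if (![2, 3, 4].includes(words)) {
    throw new RangeError("precision must be 2, 3, or 4 words");
  }
  const parts = phrase.trim().split(/\s+/);
  if (parts.length < words) {
    throw new RangeError(`phrase has only ${parts.length} word(s)`);
  }
  return parts.slice(0, words).join(" ");
}

// Placeholder words, not a real fixphrase.
console.log(limitPrecision("alpha bravo charlie delta", 2)); // prints: alpha bravo
```

A user who wants to publish only a coarse area would pick two words; four words keeps the finest precision the phrase carries.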

To support this, we built and open-sourced our own fixphrase web service so anyone can self-host it. The service runs in Docker, converts temporary browser-provided coordinates into fixphrases, and returns only the word-based representation. Coordinates exist only in memory long enough to perform the conversion and are never persisted. By releasing the service as open source, Bio-Fed ensures that the privacy layer is transparent, inspectable, and deployable by anyone.
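The conversion step described above might look roughly like the following. This is a sketch under stated assumptions, not the service's actual code: the request field names `latitude` and `longitude`, the function `convertToFixphrase`, and the injected `encode` function (standing in for the real fixphrase encoder, which is not reproduced here) are all hypothetical.

```javascript
// Sketch of the conversion step. `encode(lat, lon)` stands in for the real
// fixphrase encoder; its signature is an assumption for illustration.
function convertToFixphrase(body, encode) {
  const { latitude: lat, longitude: lon } = body ?? {};
  const valid =
    typeof lat === "number" && Number.isFinite(lat) && lat >= -90 && lat <= 90 &&
    typeof lon === "number" && Number.isFinite(lon) && lon >= -180 && lon <= 180;
  if (!valid) {
    // The error message deliberately does not echo the submitted values.
    return { status: 400, json: { error: "invalid coordinates" } };
  }
  // Coordinates live only in these local variables: this function does not
  // log them, store them, or include them in the response.
  return { status: 200, json: { fixphrase: encode(lat, lon) } };
}

// Usage with a stub encoder that returns placeholder words.
const stubEncode = () => "alpha bravo charlie delta";
console.log(convertToFixphrase({ latitude: 10, longitude: 20 }, stubEncode).json);
```

Keeping the handler a pure function of its input makes it easy to audit that the response body contains the phrase and nothing else.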

Finally, Bio-Fed is intentionally open and accessible. It uses open standards, open protocols, and open-source tooling. The goal is not simply to build an application, but to demonstrate that decentralized, privacy-preserving environmental infrastructure can be practical, auditable, and community-owned.

What it does

Bio-Fed is a federated, privacy-first biodiversity tracking system designed to let individuals and communities document species observations without relying on a central authority. Each participant runs their own instance, storing data locally in a lightweight database and connecting to other instances through ActivityPub. Observations are shared across the network using standard Create and Follow activities, preserving provenance as user@instance while avoiding central aggregation.
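To make the sharing step concrete, here is a minimal sketch of wrapping a Note in a Create activity. The property names follow the ActivityStreams vocabulary; the helper names, the `/followers` collection path, and the `/activity` id suffix are assumptions for illustration and may differ from what Bio-Fed actually emits.

```javascript
const AS_CONTEXT = "https://www.w3.org/ns/activitystreams";
const AS_PUBLIC = "https://www.w3.org/ns/activitystreams#Public";

// Hypothetical helper: provenance in user@instance form, from an actor URL.
function handleFor(username, actorUrl) {
  return `${username}@${new URL(actorUrl).host}`;
}

// Hypothetical helper: wrap a Note in a Create activity.
function buildCreateActivity({ actorUrl, noteId, content, published }) {
  const note = {
    id: noteId,
    type: "Note",
    attributedTo: actorUrl, // provenance travels with the object itself
    content,
    published,
    to: [AS_PUBLIC],
    cc: [`${actorUrl}/followers`], // assumed collection path
  };
  return {
    "@context": AS_CONTEXT,
    id: `${noteId}/activity`, // assumed id scheme
    type: "Create",
    actor: actorUrl,
    published,
    to: note.to,
    cc: note.cc,
    object: note,
  };
}

const activity = buildCreateActivity({
  actorUrl: "https://bio.example/users/alice",
  noteId: "https://bio.example/notes/1",
  content: "Observation shared from a local instance",
  published: "2024-01-01T00:00:00Z",
});
console.log(handleFor("alice", activity.actor)); // prints: alice@bio.example
```

Each follower's inbox receives this document directly from the originating instance, so no central aggregator is involved in delivery.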

At its core, Bio-Fed defines a structured observation format that supports human and machine-assisted records. Each entry includes taxonomic information, method metadata, timestamp, and a precision-controlled fixphrase location. Exact coordinates are never stored or federated. Instead, the system converts temporary browser-provided latitude and longitude into a two, three, or four word fixphrase, allowing users to choose the appropriate level of geographic precision. The schema actively rejects raw coordinate fields, ensuring that privacy is enforced technically rather than socially.
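A validator that rejects raw coordinate fields could be sketched as follows. The list of forbidden key names, the observation field names (`taxon`, `method`, `observedAt`, `fixphrase`), and both function names are assumptions chosen for illustration; the real schema may name or check things differently.

```javascript
// Assumed set of key names treated as raw coordinates.
const FORBIDDEN_KEYS = new Set([
  "lat", "lng", "lon", "latitude", "longitude", "coordinates", "geo",
]);

// Recursively collect the paths of any forbidden keys, at any depth.
function findCoordinateFields(value, path = "") {
  if (Array.isArray(value)) {
    return value.flatMap((item, i) => findCoordinateFields(item, `${path}[${i}]`));
  }
  if (value === null || typeof value !== "object") return [];
  return Object.entries(value).flatMap(([key, child]) => {
    const here = path ? `${path}.${key}` : key;
    const hit = FORBIDDEN_KEYS.has(key.toLowerCase()) ? [here] : [];
    return hit.concat(findCoordinateFields(child, here));
  });
}

// Reject the observation if it carries coordinates or lacks a 2-4 word fixphrase.
function validateObservation(obs) {
  const errors = findCoordinateFields(obs).map(
    (p) => `coordinate field not allowed: ${p}`
  );
  const words =
    typeof obs.fixphrase === "string" ? obs.fixphrase.trim().split(/\s+/) : [];
  if (words.length < 2 || words.length > 4) {
    errors.push("fixphrase must be 2 to 4 words");
  }
  return { ok: errors.length === 0, errors };
}

console.log(validateObservation({
  taxon: "Erithacus rubecula",
  method: "human",
  observedAt: "2024-05-01T08:00:00Z",
  fixphrase: "alpha bravo charlie", // placeholder words
  location: { latitude: 51.5 },     // rejected: raw coordinate
}).errors);
```

Checking nested objects and arrays matters here: a coordinate tucked inside a metadata blob is as much a leak as one at the top level.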

Bio-Fed also incorporates built-in authentication and role-based access control to protect administrative functions, while keeping federation endpoints open. Rate limiting, inbox logging, and actor or domain blocking provide resilience against abuse. Together, these components form a distributed environmental data infrastructure that balances openness, accountability, and ecological responsibility.
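One way to combine blocking and rate limiting in front of an inbox is sketched below. It is an in-memory, fixed-window illustration only: the factory name `createInboxGuard`, its options, and the choice to count requests per sending host are assumptions, not a description of Bio-Fed's actual limiter.

```javascript
// Hypothetical guard: block listed actors/domains, then apply a
// fixed-window rate limit per sending host.
function createInboxGuard({ limit, windowMs, blockedDomains = [], blockedActors = [] }) {
  const domains = new Set(blockedDomains.map((d) => d.toLowerCase()));
  const actors = new Set(blockedActors);
  const hits = new Map(); // host -> { start, count }

  return function check(actorUrl, now = Date.now()) {
    let host;
    try {
      host = new URL(actorUrl).hostname; // URL lowercases the hostname
    } catch {
      return { allowed: false, reason: "invalid actor" };
    }
    if (actors.has(actorUrl) || domains.has(host)) {
      return { allowed: false, reason: "blocked" };
    }
    const entry = hits.get(host);
    if (!entry || now - entry.start >= windowMs) {
      hits.set(host, { start: now, count: 1 }); // start a new window
      return { allowed: true };
    }
    entry.count += 1;
    return entry.count <= limit
      ? { allowed: true }
      : { allowed: false, reason: "rate limited" };
  };
}

const check = createInboxGuard({ limit: 2, windowMs: 60000, blockedDomains: ["spam.example"] });
console.log(check("https://spam.example/users/x")); // blocked
```

A refused request would typically still be written to the inbox log, so that an administrator can see who was turned away and why.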

How we built it

Bio-Fed was built deliberately as a minimal, transparent reference implementation rather than a complex platform. The backend uses Node.js with Express and SQLite, prioritizing portability and ease of self-hosting. SQLite was chosen over heavier database systems because each instance is intended to serve individuals or small communities, not centralized scale. This keeps deployment simple (one container, one database file) and makes backups and migrations straightforward. Docker was used from the start to ensure reproducibility and to reinforce the idea that anyone can run their own node without specialized infrastructure.

Federation is implemented directly using ActivityPub, with explicit handling of WebFinger discovery, Actor documents, Follow/Accept handshakes, and Create(Note) activities. Instead of abstracting this behind a large framework, the protocol logic was written explicitly to keep it auditable and understandable. This choice makes the system easier to reason about and extend, but it also means future improvements, such as full HTTP signature verification or shared inbox optimizations, will need to be layered carefully.
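Two of the pieces named above, WebFinger discovery and the Follow/Accept handshake, can be sketched as plain functions. WebFinger responses are served from `/.well-known/webfinger` as `application/jrd+json`; the `/users/<name>` actor path, the id scheme for the Accept, and the function names are assumptions for illustration rather than Bio-Fed's actual layout.

```javascript
// Build the WebFinger (JRD) document that points an acct: handle at its actor.
function webfingerResponse(username, domain) {
  return {
    subject: `acct:${username}@${domain}`,
    links: [
      {
        rel: "self",
        type: "application/activity+json",
        href: `https://${domain}/users/${username}`, // assumed actor path
      },
    ],
  };
}

// Answer an incoming Follow with an Accept, or null if it is not for us.
function acceptFollow(follow, localActorUrl) {
  const target =
    typeof follow.object === "string" ? follow.object : follow.object?.id;
  if (follow.type !== "Follow" || target !== localActorUrl) return null;
  return {
    "@context": "https://www.w3.org/ns/activitystreams",
    id: `${localActorUrl}#accepts/${encodeURIComponent(follow.id)}`, // assumed id scheme
    type: "Accept",
    actor: localActorUrl,
    object: follow, // echo the Follow so the sender can match it
  };
}

const follow = {
  id: "https://remote.example/activities/42",
  type: "Follow",
  actor: "https://remote.example/users/bob",
  object: "https://bio.example/users/alice",
};
console.log(acceptFollow(follow, "https://bio.example/users/alice").type); // prints: Accept
```

The Accept is then delivered to the follower's inbox, after which Create activities can be sent to that follower.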

Privacy enforcement is structural. The observation schema actively rejects coordinate fields, and all location handling is mediated through a self-hosted fixphrase service. Building and open-sourcing the fixphrase web service ensures that the privacy boundary is transparent and independently deployable. This architectural decision reduces risk but also constrains future features: any mapping or spatial analysis must operate within the fixphrase precision model rather than exact geospatial coordinates.

Authentication was implemented with built-in role-based access control using Argon2id for password hashing and server-side session storage. Admin and write actions are protected, while federation endpoints remain open. This separation keeps the network interoperable while securing local administration. Going forward, this structure makes it feasible to introduce multi-user instances, per-actor cryptographic keys, and more advanced moderation without redesigning the core.

Overall, Bio-Fed was built around explicit trade-offs: simplicity over scale, federation over centralization, enforced privacy over convenience. These decisions constrain certain high-resolution analytical features but create a resilient foundation for distributed, community-owned environmental infrastructure.

The overall concept was small enough that a single developer could conceptualise, design and build the entire system in a way that would provide a starting point for community development in future. Every choice was made with an eye to future development.

Challenges we ran into

Developing Bio-Fed surfaced several technical and architectural challenges, most of them rooted in the decision to prioritize decentralization and privacy from the outset.

One of the primary challenges was implementing ActivityPub directly. While the specification is well documented, real-world interoperability requires careful handling of WebFinger discovery, actor resolution, Follow/Accept flows, and Create delivery semantics. Even small inconsistencies in headers, content types, or JSON structure can cause federation to fail silently. Because Bio-Fed intentionally avoids large abstraction frameworks, the protocol logic had to be implemented explicitly and tested between multiple local instances. This made debugging more complex but ultimately improved our understanding and control of the federation layer.

Another challenge was enforcing privacy structurally rather than socially. It is easy to say “we won’t store coordinates,” but much harder to guarantee that no code path accidentally persists latitude or longitude. We addressed this by rejecting coordinate fields at schema validation and isolating the fixphrase conversion process so that coordinates exist only briefly in memory. This required careful API design and avoidance of overly verbose logging, especially during debugging.

Authentication and role management also introduced complexity. Federation endpoints must remain publicly accessible, while administrative and publishing routes must be protected. Designing middleware that cleanly separates these concerns, without breaking interoperability, required precise route scoping and role checks from the beginning.
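The route separation described above can be illustrated with a small Express-style middleware. This is a sketch: the role names, the `req.session.user` shape, and the route paths in the usage comment are assumptions, not Bio-Fed's actual configuration.

```javascript
// Hypothetical middleware factory: allow the request only if the logged-in
// user holds one of the listed roles.
function requireRole(...allowed) {
  return function (req, res, next) {
    const user = req.session && req.session.user;
    if (!user) {
      return res.status(401).json({ error: "login required" });
    }
    if (!allowed.includes(user.role)) {
      return res.status(403).json({ error: "forbidden" });
    }
    return next();
  };
}

// Usage sketch with Express (handlers and paths are placeholders):
//   app.get("/.well-known/webfinger", webfingerHandler); // federation: open
//   app.post("/inbox", inboxHandler);                    // federation: open
//   app.post("/api/observations", requireRole("admin", "editor"), createObservation);
//   app.post("/admin/blocks", requireRole("admin"), addBlock);
```

Attaching the check per route, rather than globally, keeps the federation endpoints reachable by remote servers that have no local session.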

Finally, balancing simplicity with extensibility was a constant tension. Choosing SQLite and a minimal Node/Express stack keeps the system deployable and understandable, but it requires thoughtful planning to ensure future additions, such as HTTP signature verification, multi-user actors, or moderation workflows, can be integrated without architectural rework.

These challenges reinforced the core philosophy of Bio-Fed: privacy, decentralization, and transparency demand deliberate design decisions, even when they increase short-term complexity.

If I were hired to develop something like this, I'd probably start by saying they had hired the wrong guy and should get a Rust developer. I considered using Rust for this project, but since I have never written any Rust I chose JavaScript. This proved to be a good choice: as time went on I kept imagining how other people could build on this system with me, and JavaScript is a much better environment for that sort of development.

Accomplishments that we're proud of

One of the accomplishments we are most proud of is building and open-sourcing the fixphrase server that underpins Bio-Fed’s privacy model. Rather than relying on a proprietary location encoding system or a third-party API, we created a self-hosted, Docker-ready web service that converts coordinates into human-readable word phrases at configurable precision levels. By releasing it openly, we ensured that anyone can inspect, deploy, and extend the privacy layer themselves. This makes the system transparent and reproducible, not dependent on hidden algorithms or external providers.

We are also proud of designing Bio-Fed as a complete, working federated system from scratch. It does not sit on top of an existing biodiversity platform; instead, it rethinks how environmental data can be shared at the protocol level. By using ActivityPub directly and defining a structured observation format with enforced privacy rules, Bio-Fed demonstrates that biodiversity tracking can be decentralized, interoperable, and secure without a central authority. The system handles discovery, follow handshakes, publication, validation, and moderation boundaries in a cohesive way.

Perhaps most importantly, Bio-Fed shows that environmental data infrastructure can be reimagined rather than incrementally patched. It introduces a model where location sensitivity is enforced technically, ownership is distributed, and participation does not require surrendering data to a platform. If adopted more widely, this approach could represent a new pattern for sharing ecological knowledge responsibly and collaboratively.

What we learned

One of the most important lessons from building Bio-Fed is that decentralization is not simply a deployment choice; it is an architectural discipline. Designing for federation from the beginning forces clarity around identity, ownership, and protocol boundaries. There is no implicit trust in a central database, so every interaction (Follow, Accept, Create) must be explicit and well defined. This made the system more transparent and resilient, but also required careful thinking about how state is shared, validated, and preserved across independent servers.

We also learned that privacy must be enforced structurally, not culturally. It is easy to promise not to store coordinates; it is much harder to design a system where storing them is technically impossible. By rejecting coordinate fields at schema level and isolating fixphrase conversion into a separate service, we discovered that privacy constraints can actually simplify long-term reasoning about data flows. Strong boundaries reduce ambiguity.

Another lesson was the importance of simplicity. Using Node, Express, SQLite, and Docker kept the system understandable and reproducible. Implementing ActivityPub directly, rather than abstracting it away behind a large framework, made federation mechanics clearer and more controllable. This choice required more hands-on debugging, but it improved auditability and extensibility.

Finally, we learned that environmental technology benefits from being designed as infrastructure rather than as a platform. When communities own their nodes, their data, and their moderation rules, participation becomes more durable. Bio-Fed reinforced the idea that environmental collaboration does not require centralization, only shared standards and careful design.

What's next for Biodiversity Federated

The next steps for Bio-Fed focus on strengthening federation, expanding usability, and preparing the system for real-world deployment beyond a reference implementation.

The first priority is full ActivityPub hardening. While the current implementation handles Follow, Accept, and Create flows, adding HTTP signature verification and outbound signing will significantly improve interoperability and security. This will allow Bio-Fed instances to integrate more confidently with the wider federated ecosystem and prevent spoofed activity delivery. Alongside this, improving shared inbox handling, replay protection, and signature auditing will make federation more robust.
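As a sketch of what outbound signing and verification could involve, the example below follows the draft "Signing HTTP Messages" scheme (`rsa-sha256` over `(request-target)`, `host`, `date`, and `digest`) that many ActivityPub servers use. It relies only on Node's built-in `crypto` module. The function names and key id are hypothetical, header names are assumed to be lowercase, and this is not Bio-Fed's planned implementation.

```javascript
const crypto = require("node:crypto");

// Digest header value for a request body.
function digestFor(body) {
  return "SHA-256=" + crypto.createHash("sha256").update(body).digest("base64");
}

// Build the string that is actually signed, one "name: value" line per header.
function signingString(method, path, headers, names) {
  return names
    .map((n) =>
      n === "(request-target)"
        ? `(request-target): ${method.toLowerCase()} ${path}`
        : `${n}: ${headers[n]}`
    )
    .join("\n");
}

// Produce the value of the Signature header for an outgoing request.
function signRequest({ keyId, privateKey, method, path, headers }) {
  const names = ["(request-target)", "host", "date", "digest"];
  const data = Buffer.from(signingString(method, path, headers, names));
  const signature = crypto.sign("sha256", data, privateKey).toString("base64");
  return `keyId="${keyId}",algorithm="rsa-sha256",headers="${names.join(" ")}",signature="${signature}"`;
}

// Check a Signature header against the sender's public key.
function verifyRequest({ signatureHeader, publicKey, method, path, headers }) {
  const params = Object.fromEntries(
    [...signatureHeader.matchAll(/(\w+)="([^"]*)"/g)].map((m) => [m[1], m[2]])
  );
  if (!params.headers || !params.signature) return false;
  const data = Buffer.from(
    signingString(method, path, headers, params.headers.split(" "))
  );
  return crypto.verify("sha256", data, publicKey, Buffer.from(params.signature, "base64"));
}

// Round trip with a throwaway key pair.
const { publicKey, privateKey } = crypto.generateKeyPairSync("rsa", { modulusLength: 2048 });
const body = JSON.stringify({ type: "Create" });
const headers = { host: "remote.example", date: new Date().toUTCString(), digest: digestFor(body) };
const signatureHeader = signRequest({
  keyId: "https://bio.example/users/alice#main-key", // hypothetical key id
  privateKey, method: "POST", path: "/inbox", headers,
});
console.log(verifyRequest({ signatureHeader, publicKey, method: "POST", path: "/inbox", headers })); // prints: true
```

A receiving server would additionally need to fetch the public key named by `keyId`, recompute the digest from the received body, and reject stale `date` values; those checks are what provide the spoofing and replay protection mentioned above.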

The second step is multi-user support within a single instance. Currently, each instance is effectively single-actor. Introducing multiple local users, each mapped to their own ActivityPub actor and keypair, will allow small communities, schools, or field groups to share infrastructure while retaining identity separation. This builds naturally on the existing role-based authentication system.

A third development path is structured machine observation ingestion. Bio-Fed already supports method metadata for human and machine-assisted records. Expanding this to support automated pipelines, such as audio classification or image recognition, while retaining human review flags will allow the platform to support hybrid citizen-science workflows.

Longer term, Bio-Fed could introduce optional community index nodes. These would not centralize data ownership but could aggregate public metadata or statistics across consenting instances to enable broader ecological analysis without collecting raw coordinates.

Ultimately, the goal is to evolve Bio-Fed from a working prototype into a durable, federated environmental infrastructure: secure, privacy-preserving, and capable of scaling through cooperation rather than centralization.

Updates

I've already started moving Bio-Fed away from inline HTML to a theming engine, which is the last step before I start expanding the server. The plan is for v0.2 to be released in the coming month, with the first server in the wild federating with the wider community and the existing ActivityPub servers.
