Inspiration
ShelfEngine came from a very simple frustration: I had an absurd number of bookmarks. Not 50. Not 200. I’m talking hundreds that slowly turned into “I know I saved it somewhere” chaos.
Folders worked until they didn’t. At some point, it stopped being storage and started being a graveyard. I wasn’t organizing knowledge. I was hoarding links.
The real problem wasn’t saving. It was recall.
I didn’t want another prettier folder tree. I wanted something that felt like querying my own memory. Type what I vaguely remember and get back the right thing. Fast. Local. Private.
That was the bar.
What it does
ShelfEngine is a local-first bookmark search engine built for people who’ve let their bookmark list spiral out of control.
You import your bookmarks once. After that, you can search using:
- Plain language
- Power operators like `site:` and `folder:`
- Quoted phrases
- Excludes and OR logic
It blends lexical matching with semantic retrieval so even vague queries work. And instead of just dumping results, it tells you why each link matched.
There’s also an optional Chrome extension bridge to sync changes, but the core product runs entirely in your browser.
No accounts. No backend dependency. No data leaving your machine.
How I built it
Frontend: React + TypeScript + Vite.
Storage: IndexedDB via Dexie.
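A minimal sketch of what the Dexie layer might look like. The database name, table name, and field list here are illustrative assumptions, not ShelfEngine's actual schema:

```typescript
import Dexie, { type Table } from "dexie";

// Hypothetical record shape; only the indexed fields appear in the schema
// string below, but non-indexed fields (like `embedding`) still persist.
interface Bookmark {
  id: string;
  url: string;
  title: string;
  folder: string;
  addedAt: number;      // ms epoch
  embedding?: number[]; // semantic vector, stored but not indexed
}

class ShelfDB extends Dexie {
  bookmarks!: Table<Bookmark, string>;

  constructor() {
    super("shelfengine"); // assumed database name
    this.version(1).stores({
      // First entry is the primary key; the rest are secondary indexes.
      bookmarks: "id, url, title, folder, addedAt",
    });
  }
}
```

Keeping large blobs like embeddings out of the index string avoids bloating IndexedDB's B-trees while still persisting them with each record.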
Embeddings: transformers.js inside a Web Worker so the UI never freezes.
The ranking logic blends lexical and semantic signals:
$$ \text{score} = \alpha \cdot s_{\text{semantic}} + (1-\alpha)\cdot s_{\text{lexical}} $$
Then I layer boosts and penalties for:
- Phrase matches
- Recency
- Noisy titles
- Field weights
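The blend plus boosts can be sketched as a single scoring function. The signal names, `ALPHA` value, and boost/penalty weights below are illustrative assumptions, not the production tuning:

```typescript
// Inputs to the ranking blend; field names are hypothetical.
interface RankSignals {
  semantic: number;     // cosine similarity, normalized to [0, 1]
  lexical: number;      // lexical match score, normalized to [0, 1]
  phraseMatch: boolean; // exact quoted-phrase hit
  ageDays: number;      // days since the bookmark was saved
  noisyTitle: boolean;  // e.g. "Untitled" or URL-as-title
}

const ALPHA = 0.6; // assumed weight on the semantic signal

function score(s: RankSignals): number {
  // Base blend: alpha * semantic + (1 - alpha) * lexical
  let x = ALPHA * s.semantic + (1 - ALPHA) * s.lexical;
  if (s.phraseMatch) x += 0.15;          // phrase-match boost
  x += 0.1 * Math.exp(-s.ageDays / 90);  // recency boost, decaying over ~3 months
  if (s.noisyTitle) x -= 0.05;           // noisy-title penalty
  return x;
}
```

Keeping the boosts additive on top of the normalized blend makes each result's score decomposable, which is also what makes a "why matched" explanation cheap to produce.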
I also built structured query parsing so it feels powerful without feeling academic. If you want to search like a power user, you can. If you don’t, you don’t have to.
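A stripped-down sketch of that kind of parser, assuming a much simpler grammar than the real one (no OR logic, no nesting); the field names are illustrative:

```typescript
interface ParsedQuery {
  terms: string[];    // plain keywords
  phrases: string[];  // quoted phrases, matched exactly
  site?: string;      // site: operator
  folder?: string;    // folder: operator
  excludes: string[]; // -term exclusions
}

function parseQuery(input: string): ParsedQuery {
  const q: ParsedQuery = { terms: [], phrases: [], excludes: [] };
  // Pull out quoted phrases first so their contents aren't re-tokenized.
  const rest = input.replace(/"([^"]+)"/g, (_match, phrase: string) => {
    q.phrases.push(phrase);
    return " ";
  });
  for (const tok of rest.split(/\s+/).filter(Boolean)) {
    if (tok.startsWith("site:")) q.site = tok.slice(5);
    else if (tok.startsWith("folder:")) q.folder = tok.slice(7);
    else if (tok.startsWith("-")) q.excludes.push(tok.slice(1));
    else q.terms.push(tok.toLowerCase());
  }
  return q;
}
```

Everything that isn't an operator falls through to plain terms, which is what keeps the casual path zero-syntax.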
Challenges
The hardest part wasn’t embeddings. It was consistency.
People search in messy ways:
- Half-remembered titles
- Random keywords
- “That article about vector databases from last month”
Balancing semantic intuition with deterministic lexical signals was tricky. If they disagreed, the ranking could feel random.
Sync reliability was another issue. Handling extension deltas, queue flushes, and resync flows without duplicating or corrupting state required strict ingestion rules.
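One such ingestion rule, sketched under assumptions: deltas keyed by id with last-writer-wins timestamps, plus tombstones so a replayed queue flush can't resurrect a deleted bookmark. The types and field names here are hypothetical, not the actual sync protocol:

```typescript
interface BookmarkDelta {
  id: string;
  op: "upsert" | "delete";
  updatedAt: number; // ms epoch from the extension
  title?: string;
  url?: string;
}

interface Entry {
  title: string;
  url: string;
  updatedAt: number;
  deleted: boolean; // tombstone flag
}

type Store = Map<string, Entry>;

function applyDeltas(store: Store, deltas: BookmarkDelta[]): void {
  for (const d of deltas) {
    const existing = store.get(d.id);
    // Last-writer-wins: drop any delta older than what we already hold.
    // This makes re-applying a flushed queue a no-op.
    if (existing && existing.updatedAt >= d.updatedAt) continue;
    if (d.op === "delete") {
      // Keep a tombstone instead of removing the key, so a replayed
      // older upsert can't bring the entry back.
      store.set(d.id, { title: "", url: "", updatedAt: d.updatedAt, deleted: true });
    } else {
      store.set(d.id, {
        title: d.title ?? "",
        url: d.url ?? "",
        updatedAt: d.updatedAt,
        deleted: false,
      });
    }
  }
}
```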
Performance mattered too. At around 1,000 bookmarks, indexing and scoring had to stay responsive or the whole thing felt fake.
What I’m proud of
- It actually solves the “absurd bookmark pile” problem.
- It’s truly local-first.
- The hybrid retrieval feels intelligent without being opaque.
- The “why matched” layer builds trust.
- The extension sync is optional, not required.
It doesn’t pretend to be a startup SaaS. It’s a focused system that does one thing well.
What I learned
Search quality is systems engineering.
It’s parsing + scoring + filtering + explainability + UX. If one part is weak, the whole thing feels unreliable.
I also learned:
- How to build browser-only data pipelines.
- How to offload heavy compute to Web Workers cleanly.
- How to think about eventual consistency without a backend.
- How to document architecture like I expect other engineers to read it.
What’s next
Before adding features, I want to harden it.
- Better regression fixtures.
- Repeatable performance instrumentation.
- Stronger sync edge-case coverage.
- More confidence in ranking behavior across weird queries.
The long-term goal isn’t “AI bookmarks.”
It’s disciplined, privacy-first personal search that works when your digital life becomes messy.