Inspiration
Every group trip we've planned hits the same wall: a group chat with eight people throwing out preferences, someone volunteering to "look into flights," and three days later nothing's booked. We wanted an agent that could actually do that comparison work — check real prices, not just suggest a vibe.
What it does
TripSync collects trip preferences from a group via a shareable link — budget, vibe, dates, must-haves. Once enough people respond, the organizer hits "Generate Trip Plan" and the agent takes over: it reconciles everyone's input, brainstorms candidate destinations, browses real flight prices, hotel rates, and local activities for each, then scores and ranks them against the group's stated preferences. Output: one report with ranked destination cards and a plain-English "why this fits" for each.
How we built it
A small Express server handles live group input: shareable trip codes, QR links, and real-time response tracking. When the organizer requests a plan, a four-stage pipeline runs: aggregate group responses into one preference set, brainstorm candidates (pure LLM reasoning), research each candidate in real Browserbase sessions (flights, hotels, activities), then score and rank the candidates against the group's actual preferences. Every reasoning step uses forced tool calls for structured output instead of prompting for JSON, which proved far more reliable across a multi-step chain. Each stage can also run standalone for isolated testing.
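To make the forced-tool-call idea concrete, here is a minimal sketch of what such a request and response handler could look like. The tool name `submit_ranking`, its schema, and both helper functions are hypothetical and not taken from the project. Only the `tools` and `tool_choice: { type: "tool", name }` shape follows the Anthropic Messages API, and the model ID is left as a parameter.

```typescript
// Hypothetical tool definition: the name and schema are illustrative only.
interface ToolDefinition {
  name: string;
  description: string;
  input_schema: Record<string, unknown>;
}

// Minimal view of a response content block (text or tool_use).
interface ContentBlock {
  type: string;
  name?: string;
  input?: unknown;
}

const rankTool: ToolDefinition = {
  name: "submit_ranking",
  description: "Return the ranked destinations with a short rationale for each.",
  input_schema: {
    type: "object",
    properties: {
      destinations: {
        type: "array",
        items: {
          type: "object",
          properties: {
            name: { type: "string" },
            score: { type: "number" },
            why: { type: "string" },
          },
          required: ["name", "score", "why"],
        },
      },
    },
    required: ["destinations"],
  },
};

// Build a request body. A tool_choice of type "tool" forces the model to
// call the named tool, so the reply arrives as schema-shaped input rather
// than free-form text that has to be parsed as JSON.
function buildForcedToolRequest(model: string, prompt: string, tool: ToolDefinition) {
  return {
    model,
    max_tokens: 1024,
    tools: [tool],
    tool_choice: { type: "tool" as const, name: tool.name },
    messages: [{ role: "user" as const, content: prompt }],
  };
}

// Pull the structured input out of the response's tool_use block.
function extractToolInput(content: ContentBlock[], toolName: string): unknown {
  const block = content.find((b) => b.type === "tool_use" && b.name === toolName);
  if (!block) throw new Error(`No tool_use block found for ${toolName}`);
  return block.input;
}
```

The object returned by `buildForcedToolRequest` would be passed to the SDK's message-creation call, and `extractToolInput` would then read the `content` array of the reply. Each pipeline stage could define its own tool in the same way.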
Challenges we ran into
Google Flights and Booking.com are dense, frequently changing pages, so reliable extraction took several rounds of iteration, plus a fallback source (Skyscanner) so that one site's quirks couldn't sink a destination's results. Multi-device form submissions also needed care to stay free of race conditions. We rely on Node's single-threaded event loop: each handler reads and writes shared state synchronously, and as long as nothing awaits between the read and the write, two submissions cannot interleave mid-update.
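The race-condition rule can be illustrated with a small sketch. The in-memory store, the `Preferences` shape, and `recordResponse` are hypothetical names chosen for illustration; the project's actual state handling may differ.

```typescript
// Hypothetical preference shape and in-memory store (not from the project).
interface Preferences {
  budget: number;
  vibe: string;
}

const responsesByTrip = new Map<string, Map<string, Preferences>>();

// Synchronous read-modify-write. There is no `await` between reading the
// trip's responses and writing them back, so the event loop cannot run
// another handler in the middle of this update.
function recordResponse(tripCode: string, userId: string, prefs: Preferences): number {
  const responses = responsesByTrip.get(tripCode) ?? new Map<string, Preferences>();
  responses.set(userId, prefs); // a resubmission overwrites the earlier answer
  responsesByTrip.set(tripCode, responses);
  return responses.size; // number of distinct respondents so far
}
```

If the handler awaited something (a database call, a fetch) between the `get` and the `set`, a second submission could read the old state in the gap and one of the two updates could be lost. Any async work is safer done before the read or after the write.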
Accomplishments that we're proud of
Our activities step originally pulled a full Wikivoyage page through Stagehand's extractor every run — ~70k tokens to find a "things to do" list. We rewrote it to grab only the relevant sections via a lightweight DOM script first, cutting that to ~2.6k tokens (~27x), with a fallback to full extraction if a page's structure ever changes. Small fix, big difference for a pipeline meant to run repeatedly.
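The "relevant sections first, full page as fallback" pattern might look roughly like the sketch below. It assumes the page has already been split into heading/text pairs (in the project that step is a DOM script, which is not shown here), and the `Section` type, the `selectRelevantText` function, and the default heading list are all illustrative assumptions.

```typescript
// Hypothetical representation of a page already split into sections.
interface Section {
  heading: string;
  text: string;
}

// Assumed headings of interest; the real list would depend on the source pages.
const WANTED_HEADINGS = ["See", "Do"];

// Keep only the wanted sections. If none match (for example, the page
// structure changed), fall back to the full text so that the heavier
// extraction path still has something to work with.
function selectRelevantText(
  sections: Section[],
  wanted: string[] = WANTED_HEADINGS,
): { text: string; usedFallback: boolean } {
  const wantedSet = new Set(wanted.map((h) => h.toLowerCase()));
  const hits = sections.filter((s) => wantedSet.has(s.heading.trim().toLowerCase()));
  if (hits.length === 0) {
    return { text: sections.map((s) => s.text).join("\n\n"), usedFallback: true };
  }
  return {
    text: hits.map((s) => `${s.heading}\n${s.text}`).join("\n\n"),
    usedFallback: false,
  };
}
```

Returning `usedFallback` alongside the text lets the caller log how often the cheap path misses, which is a useful early signal that a page layout has drifted.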
What we learned
The biggest lever for cost and reliability was separating "the agent should reason here" from "we already know where this data lives." Claude reasons where it adds value (reconciling preferences, scoring tradeoffs); everywhere else, the pipeline uses pre-mapped extraction.
What's next for TripSync
Two-way Google Calendar sync for the winning destination, group voting on final picks instead of one organizer deciding, and more research sources for larger or pickier groups.
Built With
- anthropic
- browserbase
- css
- html
- javascript
- typescript