Inspiration
SerpApi was first built at the beginning of 2017, when we wanted to programmatically download images from Google Images for ML. Surprisingly, no such tool existed.
Now, SerpApi is used by our customers for SEO and Local SEO, ad verification, OSINT, data for ML models, news monitoring, etc.
What it does
SerpApi scrapes search engine results pages (SERPs) and responds with JSON.
How we built it
We use a pool of residential proxies alongside our own proxies. We fetch HTML, extract data, and respond with JSON.
The interactive playground is built with React. Documentation is partially generated.
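The extract-and-respond part of that flow can be sketched in a few lines of Python. This is only an illustration, not SerpApi's actual parser: the `result` CSS class, the `ResultLinkParser` and `html_to_json` names, the `results` key, and the sample HTML are all assumptions made up for the example, and the fetch step through proxies is left out so the sketch stays self-contained.

```python
import json
from html.parser import HTMLParser


class ResultLinkParser(HTMLParser):
    """Collects title/link pairs from <a class="result"> elements (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self.results = []
        self._current = None  # result being built while inside a matching <a>

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        if tag == "a" and "result" in classes:
            self._current = {"title": "", "link": attr_map.get("href")}

    def handle_data(self, data):
        # Text inside nested tags (e.g. <b>) is appended to the current title.
        if self._current is not None:
            self._current["title"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self._current["title"] = self._current["title"].strip()
            self.results.append(self._current)
            self._current = None


def html_to_json(html: str) -> str:
    """Extracts result links from an HTML page and serializes them as JSON."""
    parser = ResultLinkParser()
    parser.feed(html)
    parser.close()
    ranked = [dict(position=i, **r) for i, r in enumerate(parser.results, start=1)]
    return json.dumps({"results": ranked})


# Stand-in for a fetched results page; real pages are far messier.
SAMPLE_HTML = """
<div>
  <a class="result" href="https://example.com/a">First <b>hit</b></a>
  <a class="nav" href="/next">Next</a>
  <a class="result" href="https://example.org/b">Second hit</a>
</div>
"""

if __name__ == "__main__":
    print(html_to_json(SAMPLE_HTML))
```

Working on raw HTML like this avoids running a browser, but it also means the parser has to be updated whenever the target markup changes.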
Challenges we ran into
Avoiding blocks; maintaining and extending support for many versions of target websites; extracting JavaScript-generated content.
Accomplishments that we're proud of
Customers pay for successful searches only; 99.95% SLA; 2-second average response time; mimicking TLS fingerprints.
What we learned
Working in a trusting and transparent environment is a lot simpler, more useful, and more fun.
Web-scraping without a browser is both simpler and harder.
It's possible to submit a PR with an lxml2 speed-up without prior C experience.
Deobfuscated JavaScript can be reverse-engineered.
What's next for SerpApi
Enterprise and Yearly Plans; Bulk API for export and import; support for more target websites (Amazon, Google Lens, Google Trends, etc.); self-updating parsers.