SerpApi

Inspiration

SerpApi was initially built at the beginning of 2017, when we wanted to programmatically download images from Google Images for ML. Surprisingly, no such tool existed.

Now, SerpApi is used by our customers for SEO and Local SEO, ad verification, OSINT, data for ML models, news monitoring, etc.

What it does

SerpApi scrapes search engine results pages (SERPs) and responds with JSON.

How we built it

We use a pool of residential proxies alongside our own proxies. We fetch HTML, extract data, and respond with JSON.
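As a rough illustration of the extract-and-respond step, here is a minimal Python sketch using only the standard library. It is not our production parser: the `result` CSS class, the `ResultParser` and `html_to_json` names, the sample HTML, and the shape of the JSON output are all assumptions made up for this example.

```python
import json
from html.parser import HTMLParser


class ResultParser(HTMLParser):
    """Collects link/title pairs from <a class="result"> elements.

    The "result" class is a hypothetical marker chosen for this sketch.
    """

    def __init__(self):
        super().__init__()
        self.results = []
        self._current = None  # result being built while inside a matching <a>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "result" in (attrs.get("class") or "").split():
            self._current = {"link": attrs.get("href"), "title": ""}

    def handle_data(self, data):
        # Text inside nested tags (e.g. <b>) is appended to the same title.
        if self._current is not None:
            self._current["title"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self._current["title"] = self._current["title"].strip()
            self.results.append(self._current)
            self._current = None


def html_to_json(html):
    """Parse already-fetched HTML and return a JSON string of results."""
    parser = ResultParser()
    parser.feed(html)
    parser.close()
    results = [dict(position=i, **r) for i, r in enumerate(parser.results, start=1)]
    return json.dumps({"results": results})


# Stand-in for a fetched page; a real page would come from an HTTP request.
SAMPLE_HTML = """
<div><a class="result" href="https://example.com/a">First <b>hit</b></a>
<a class="nav" href="/next">Next</a>
<a class="result" href="https://example.com/b">Second hit</a></div>
"""

if __name__ == "__main__":
    print(html_to_json(SAMPLE_HTML))
```

The sketch parses a static string so that it runs without network access; fetching through a proxy pool would happen before `html_to_json` is called.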

The interactive playground is built with React. Documentation is partially generated.

Challenges we ran into

Avoiding being blocked; maintaining and extending support for many versions of target websites; extracting JavaScript-generated content.

Accomplishments that we're proud of

Customers pay for successful searches only; a 99.95% SLA; a 2-second average response time; mimicking TLS fingerprints.

What we learned

Working in a trusting and transparent environment is a lot simpler, more useful, and more fun.

Web-scraping without a browser is both simpler and harder.

It's possible to submit a PR with an lxml2 speed-up without prior C experience.

Deobfuscated JavaScript can be reverse-engineered.

What's next for SerpApi

Enterprise and yearly plans; a bulk API for export and import; support for more target websites (Amazon, Google Lens, Google Trends, etc.); self-updating parsers.