
Link to video with more information on Traffiq here: https://youtu.be/uBk6meDX-dY

Inspiration

Traffiq was inspired by our team members' experiences trying to buy high-demand products online. It's a frustrating experience to enter a website when a new product has been restocked, only to encounter high latency and potential crashes, all while hoping that the product hasn't already sold out (chances are, it has).

This negative experience is a symptom of a larger problem. Consumer purchasing habits are shifting toward online shopping, partly due to the pandemic, which requires businesses of every size to establish and maintain an online presence. Smaller businesses often lack the resources and expertise to handle spikes in user traffic, leading to situations like the one described above: slow loading times, crashes, lost revenue, and frustration on both ends.

Objective

With Traffiq, we want to provide an intermediate waiting queue for websites that are expecting a large amount of traffic in a short amount of time. We also prioritized three key considerations:

- Accessible: Traffiq should allow almost any website to handle spikes in user traffic. Sites run by small businesses, individuals, or large companies are all accommodated. Website owners can also customize Traffiq to fit their site's individual needs.
- Self-contained: Beyond rerouting visitors from the target website, Traffiq doesn't touch any other part of the target website's backend, alleviating privacy concerns. The minimal setup required to integrate Traffiq also makes onboarding easier.
- Transparent: By sitting as a middleman between the visitor and the website, Traffiq can surface valuable information and enable live communication between both sides.

What it does

Traffiq consists of two main components: the queue service, or waiting page, and the admin dashboard for website owners.

Waiting Page
When a potential visitor tries to enter a website using Traffiq, they are redirected to the waiting page. The waiting page shows the visitor’s live position in the queue, estimated time to the front of the queue, and a customized title that can be updated by the site owner. Upon reaching the front of the queue, the page will change color and provide a button that will redirect the visitor to the target website.
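
To illustrate, here is a minimal sketch of how a waiting page like ours might poll for queue status; the endpoint path, response fields, and polling interval are illustrative assumptions rather than Traffiq's actual API:

```typescript
// Illustrative sketch of a waiting page polling for queue status.
// The endpoint, response fields, and 5s interval are assumptions.
async function pollQueueStatus(queueId: string, token: string): Promise<void> {
  const res = await fetch(`/api/queues/${queueId}/status?token=${token}`);
  const { position, etaSeconds, ready } = await res.json();

  document.getElementById("position")!.textContent = String(position);
  document.getElementById("eta")!.textContent = `~${Math.ceil(etaSeconds / 60)} min`;

  if (ready) {
    // Front of the queue: reveal the button that redirects to the target site.
    document.getElementById("enter-button")!.hidden = false;
  } else {
    setTimeout(() => pollQueueStatus(queueId, token), 5000);
  }
}
```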

Admin Dashboard
The admin dashboard allows the website owner to create queues, set endpoint URLs, broadcast custom titles, and set a target latency for the site. Target latency is the latency goal for the website: while users are in the queue, Traffiq sends requests to the target website to measure its latency. If the measured latency is above the target, Traffiq decreases the rate at which users are dequeued; if it is below the target, Traffiq increases the rate; and if it is right around the target, Traffiq maintains the rate. Once a queue has been created and launched, owners can also edit it as necessary.
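
As a rough sketch of the rate-control logic described above (the tolerance band and adjustment factors here are illustrative assumptions, not our exact parameters):

```typescript
// Simplified sketch of latency-driven dequeue-rate control.
// TOLERANCE and the 0.8/1.2 factors are illustrative, not our real values.
const TOLERANCE = 0.1; // +/-10% around the target counts as "right around"

function adjustDequeueRate(
  currentRate: number,       // users released per second
  measuredLatencyMs: number, // latency probed from the target site
  targetLatencyMs: number    // goal set by the site owner
): number {
  if (measuredLatencyMs > targetLatencyMs * (1 + TOLERANCE)) {
    return currentRate * 0.8; // site is slow: release users more slowly
  }
  if (measuredLatencyMs < targetLatencyMs * (1 - TOLERANCE)) {
    return currentRate * 1.2; // site has headroom: release users faster
  }
  return currentRate; // within the band: hold steady
}
```

In effect, this behaves like a simple feedback controller on the measured latency.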

How we built it

Traffiq is composed of the following components: the frontend, backend service, queue service, bouncer service, and dashboard service.

(architecture diagram)

Frontend

Backend Service

Queue Service

Bouncer Service

Dashboard Service

Graviton
We ran a performance test of our queue service comparing an AWS Graviton t4g instance against an x86-based t3a instance, and found a 98.88% performance-per-dollar improvement with Graviton. This was very beneficial to us given how performance-reliant our application is. We're glad to have been introduced to Graviton through this hackathon, and we're excited to continue using it in the future.
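
For reference, performance per dollar is just throughput divided by instance cost; the sketch below shows how a comparison like the 98.88% figure is computed (the inputs are placeholders, not our measured numbers):

```typescript
// Performance per dollar = throughput / hourly instance cost.
// Callers would supply measured throughput and on-demand pricing.
function perfPerDollar(requestsPerSec: number, pricePerHour: number): number {
  return requestsPerSec / pricePerHour;
}

// Percent increase of the Graviton (t4g) instance over the t3a baseline.
function percentIncrease(gravitonPpd: number, baselinePpd: number): number {
  return (gravitonPpd / baselinePpd - 1) * 100;
}
```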

Customer Value

We wanted to maintain a customer focus while building Traffiq, so we thought a lot about our potential customers and the value they would get out of our service. We identified two sample users: one website visitor and one website owner.

Website Visitor:

Website Owner:

Challenges we ran into

Over the course of the hackathon, we ran into a few technical challenges. The first was the transition from open-source software to AWS offerings. Our team was used to working with open-source software, so identifying which AWS offerings would benefit our project most was the first challenge. Once we made the transition, we quickly realized that AWS offered many capabilities that could improve our application, and we ended up using EC2, ElastiCache, and DynamoDB.

The next technical challenge was designing a fault-tolerant, robust, and scalable application. These were all major points of emphasis for Traffiq, and we decided to split the application into scalable microservices. Designing and structuring this architecture took a significant time investment. Another challenge was minimizing cross-dependencies among the various microservices. Ultimately, we designed the application so that the different components handled non-overlapping logic and worked together to form a fully functional, scalable system.

Another challenge we ran into was deterring bots from flooding the queue while also preventing users from skipping ahead by sending each other their unique URLs. We solved this by issuing JSON Web Tokens that fix each user's place in the queue, and by adding a number of anti-bot checks.
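
To give a flavor of the approach (a simplified sketch, not our exact implementation), a signed token can bind a user's queue position so it can't be forged; the secret name and payload shape below are illustrative:

```typescript
// Simplified sketch: binding a user's queue position into a signed JWT so
// it can't be forged or edited to claim an earlier spot. Secret name and
// payload shape are illustrative.
import jwt from "jsonwebtoken";

const SECRET = process.env.QUEUE_JWT_SECRET!; // hypothetical config

// Issued when a user joins the queue.
function issueQueueToken(userId: string, queueId: string, position: number): string {
  return jwt.sign({ sub: userId, queueId, position }, SECRET, { expiresIn: "1h" });
}

// Checked when the user tries to enter the site from the waiting page.
function verifyQueueToken(
  token: string
): { sub: string; queueId: string; position: number } | null {
  try {
    return jwt.verify(token, SECRET) as { sub: string; queueId: string; position: number };
  } catch {
    return null; // tampered or expired token
  }
}
```

Because the payload is signed, a visitor who shares their URL shares only their own spot; the token can't be edited to claim a better position.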

Accomplishments that we're proud of

One of the accomplishments from the hackathon is designing an algorithm to maintain a certain latency requirement on the target application. This non-trivial effort was critical in ensuring our service satisfied our customer needs and fulfilled our objective, so successfully creating and implementing the algorithm is something we’re very proud of.

We're also proud of how we integrated various AWS services, including EC2, ElastiCache, and DynamoDB.

Perhaps our proudest accomplishment from the hackathon is creating a service that solves a common, tangible problem that we encounter on a regular basis. Supply shortages and the increased emphasis on online distribution channels have led to frustrating situations for website visitors and owners alike, and Traffiq is a product that provides a solution to these issues of reliability and scalability. After the hackathon, we can look back on our product as something that can help real people with their problems, which we couldn’t be more proud of accomplishing.

What we learned

We learned a great deal over the course of this hackathon. Much of it pertains to the technical challenges and accomplishments outlined above, such as how to properly use and integrate various AWS services, how to build and structure microservices, and techniques to measure server load.

Beyond the technical lessons, we learned about building products with the customer in mind. Traffiq grew out of problems we've faced ourselves, and we found it critical to prioritize customer needs while designing and building the service. We didn't want to create something nobody would use, and staying oriented toward the customer taught us a lot.

What's next for TraffiQ

After the hackathon, we want to go out and find real customers. We plan on reaching out to small businesses, retailers, and shops: any business with an online presence that could use Traffiq. We want to talk to them, get feedback, and iterate. We want to keep building Traffiq with the customer in mind.

We also want to expand Traffiq's use cases. It doesn't have to be just online retailers and shops; Traffiq can serve anyone who needs an online, self-contained queuing service. There are many different use cases out there; we just need to identify and act on them.

On the technical end, there are many places we can take Traffiq further: more customization options for websites, more granular control over queues for website owners, and more information shown to visitors in the queue. The algorithm that controls how quickly users are let into the site could also incorporate machine learning, making Traffiq a smarter service.

We think Traffiq has real potential, and we’re excited to see where we can take it moving forward.

FAQ

Q: If websites are expecting a spike in user traffic, can’t they scale up themselves?
A: Yes, they can scale up themselves, but it is often non-trivial to do so. Many sites were built with a monolithic architecture and are hard to scale, and some run on non-cloud bare-metal servers with limited capacity. Scaling up is also often costly, and using something like Traffiq instead can save money. Handling large spikes typically requires overprovisioning servers, since autoscaling is not instant; without that excess capacity (which many small services cannot or may not be willing to afford), the application can be overloaded before newly provisioned servers start handling traffic. Finally, many small applications only need to handle bursts of traffic during certain periods. A small clothing company may only see spikes on Black Friday, and an indie game developer might post their game on Reddit once and receive a flood of traffic while generally having a small player base. In cases like these, it may not be worth the significant engineering effort to implement autoscaling in-house.

Q: How are users redirected to the waiting page if they visit the application website directly?
A: For most traffic, the load balancer or web server (e.g. Apache/nginx) can check for the existence of the traffiqToken URL query param and redirect to the queue site with a 303 HTTP status code. While this redirects most legitimate users, some malicious users may try to trick the system by setting that URL param themselves. For extra protection, you can add token verification logic to your application server (like with our Express NPM module).
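
As an illustration, such a verification middleware might look like the sketch below; QUEUE_URL and the verifyQueueToken helper are assumptions carried over from the earlier JWT sketch, and the actual Traffiq Express module's API may differ:

```typescript
// Sketch of server-side token verification in an Express app. QUEUE_URL and
// verifyQueueToken are assumptions (see the JWT sketch above); the real
// Traffiq Express module's API may differ.
import express from "express";

declare function verifyQueueToken(token: string): object | null; // from the JWT sketch

const app = express();
const QUEUE_URL = "https://queue.example.com"; // hypothetical waiting-page URL

app.use((req, res, next) => {
  const token = req.query.traffiqToken as string | undefined;
  if (!token || !verifyQueueToken(token)) {
    // Missing or invalid token: send the visitor back to the waiting page.
    return res.redirect(303, QUEUE_URL);
  }
  next(); // token checks out; serve the site normally
});
```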

Q: Isn’t Traffiq itself susceptible to huge spikes in user traffic?
A: Traffiq’s incremental server load when new users join is very low. Adding another user to the queue is an inexpensive operation, which means it takes much more traffic to cause a significant spike in usage.

Q: How does Traffiq know the rate at which to let people into the site?
A: Clients set a target application latency and Traffiq will attempt to control the flow of traffic into the site such that the mean latency stays below that number. This assumes that latency increases as the load increases, and allows the application to increase the number of clients it processes per second as it scales up or as clients leave the site.
