Inspiration
My involvement in this hackathon is somewhat down to happenstance. I took part in a different hackathon with Amazon on the topic of sustainability, although I was new to DevPost, and unfortunately I was unable to submit my project in time due to work commitments. I was then recommended to take part in this hackathon, and so I did. It differs from other hackathons on DevPost in that there is a significant public policy angle to the challenge posed, whereas most others are entirely focused on coding. This distinctiveness appealed to me because of my diverse background in economics, public policy and coding.
Counterfeiting is defined as the imitation of an authentic item, good or service. As someone from an economics background, I typically encounter the issue of counterfeiting in two situations. The first is the counterfeiting of currency or money. This is a problem almost as old as humanity itself and it carried severe penalties: in England in the 19th century, counterfeiting was considered so serious that it was punishable by death. Historically, counterfeiting money typically took the form of deliberately and slightly reducing the gold content of coins and then using them in daily transactions. In more recent times, counterfeiters use highly sophisticated means to imitate a currency and evade the security features built into banknotes. Peru is sometimes described as the hotspot for imitating US Dollars. Counterfeiting currencies is of concern to economists because of a phenomenon known as “Gresham’s Law”: as more counterfeit money enters circulation in the economy, people become less inclined to use the currency for fear that what they are trading is not the authentic currency but the fake one. Gresham’s Law is sometimes summarised as “bad money drives out good”.
The other situation where counterfeiting arises in economic theory is on the topic of information. Historically, information was not seen as important in the context of economic theory. It took the economist George Akerlof and his groundbreaking work The Market for Lemons: Quality Uncertainty and the Market Mechanism, published in 1970, to make the role of information in economic transactions more explicit, and for this work he received the Nobel Memorial Prize in Economic Sciences. In Akerlof’s work, he presents the situation where a person wishes to buy a used car. In a world of imperfect information, the buyer has no idea of the quality of the used car. The car can either be in really good condition (colloquially called a “peach”) or bad (colloquially called a “lemon”). Because of this uncertainty, the buyer is only prepared to pay an average price between the two extremes. But this creates a problem: because this price is likely to be less than that requested by sellers of good cars, those sellers leave the market, eventually causing the market to be filled with highly defective cars (“lemons”), and the market collapses upon itself. Although it is easy to overstate Akerlof’s analysis (for instance, his paper would presume that a platform such as eBay could not exist), it is nevertheless a cornerstone of economic thinking and his paper is among the most cited in economics (although, ironically, it was rejected several times). Both of these situations have to be stated because counterfeiting is, in the main, an economic problem, and therefore to understand counterfeiting one also needs a basic understanding of economics. Furthermore, it is necessary to state the above because consumers who interact with counterfeit goods can be divided into two categories. There are some consumers who have full or perfect information: these are consumers who knowingly buy counterfeit goods.
This is analogous to the situation described by David Foley in the early stages of the hackathon, in which there is a “subculture” around counterfeit goods. The other situation surrounding counterfeit goods can be described as one of imperfect information. This typically takes the form of the seller knowing more about the good than the consumer, that is to say, the seller passes off a counterfeit as an authentic item. The exact distribution between these two categories worldwide is unclear. When I asked Joe Wheatley, attorney-at-law at the Amazon Counterfeit Crimes Unit (CCU), he told me that he believed most of the problem with counterfeiting on Amazon came from consumers who unknowingly bought a counterfeit, as opposed to consumers knowingly buying counterfeits on Amazon.
I chose online shopping as the context to focus on because it exhibits this kind of asymmetric information more than other kinds of marketplaces. I believe that the browser extension can help engender trust by identifying potentially counterfeit goods for the end user.
What it does
The Beagle browser extension detects potential counterfeits by using a statistical discipline known as outlier analysis. Outlier analysis seeks to find outliers, that is, observations that deviate markedly from the rest of a given dataset. An outlier can be a single point or, as in some applications such as fraud detection, a series of points. The Beagle browser extension does this by looking at the typical price of an item sold on an online marketplace such as Amazon and then checking whether the price offered for an item differs statistically significantly from the norm. More concretely, suppose the typical price of a Rolex watch, sometimes known as the “Manufacturer’s Suggested Retail Price” (MSRP), is $12,000 USD. If one is sold for $100 USD then this would be classified as an outlier and would be flagged by Beagle as a potential counterfeit.
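As a rough illustration of the price check described above, here is a minimal JavaScript sketch. It is not Beagle’s actual code (the real scoring is done in R on the backend), and the choice of test is an assumption: it uses the modified z-score, which measures how far a price sits from the median in units of the median absolute deviation (MAD). The function names, the sample listing prices, and the decision to flag only unusually low prices are all hypothetical.

```javascript
// Median of a numeric array (the input is copied, not mutated).
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0
    ? (sorted[mid - 1] + sorted[mid]) / 2
    : sorted[mid];
}

// Modified z-score of `price` relative to `referencePrices`.
// 0.6745 rescales the MAD so the score is comparable to an ordinary z-score
// when the reference prices are roughly normally distributed.
function modifiedZScore(price, referencePrices) {
  const med = median(referencePrices);
  const mad = median(referencePrices.map((p) => Math.abs(p - med)));
  if (mad === 0) {
    // All reference prices are (nearly) identical: any deviation is extreme.
    if (price === med) return 0;
    return price < med ? -Infinity : Infinity;
  }
  return (0.6745 * (price - med)) / mad;
}

// Assumption: only prices far BELOW the norm are treated as suspicious.
function isSuspiciouslyCheap(price, referencePrices, threshold = 3.5) {
  return modifiedZScore(price, referencePrices) < -threshold;
}

// Hypothetical prices for the same watch model from other listings (USD).
const listings = [11800, 12000, 12100, 11950, 12250, 12000, 11900];

console.log(isSuspiciouslyCheap(100, listings));   // true
console.log(isSuspiciouslyCheap(11900, listings)); // false
```

The median and MAD are used here instead of the mean and standard deviation because a handful of fake listings would otherwise drag the “typical” price towards themselves. The cutoff of 3.5 is the value commonly suggested for the modified z-score and would need tuning on real data.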
The browser extension is called “Beagle” after the dog breed. There are two reasons for the name. The first is that the breed is frequently used by authorities to sniff out suspect items; in a similar way, it is hoped that the browser extension helps alert users to items that are potentially counterfeit on major internet stores. The second reason is the famous expression, “on the internet, nobody knows you’re a dog”. The expression calls to mind the anonymity that exists on the internet and the difficulty of knowing the authenticity of a particular person or entity, which contributes to the counterfeit goods problem in online shopping.
How we built it
As with almost all browser extensions, Beagle is built with HTML (HyperText Markup Language), CSS (Cascading Style Sheets), JSON (JavaScript Object Notation) and JavaScript. The statistical power of the extension, however, comes from a language called R. R is a powerful programming language specifically geared towards statistics. It has a vibrant community and is well suited to statistical techniques such as regression. It is because of R that we are able to conduct the outlier analysis on the data that is received. One difficulty, however, is running R on the front-end in the user’s browser. To get around this, R sits on a remote backend (it can be an actual website, Firebase, or something similar). A PHP script receives the numerical information that the browser extension sends as JSON and passes it to an R script through PHP’s exec() function. The output of R, which is an outlier score, is then passed back to PHP, which sends it to the browser extension. The attached diagram hopefully makes this clear. The user does not need to have the R or PHP files.
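The extension’s side of this round trip might look roughly like the sketch below. This is an illustration under stated assumptions rather than Beagle’s real code: the endpoint URL, the JSON field names (`price`, `referencePrices`, `score`) and all function names are hypothetical, since the write-up does not specify them.

```javascript
// Hypothetical endpoint; the real backend URL is not given in this write-up.
const SCORE_ENDPOINT = "https://example.com/beagle/score.php";

// Build the JSON body sent to the PHP script (field names are assumptions).
// Only finite, non-negative numbers are allowed through.
function buildScoreRequest(price, referencePrices) {
  if (!Number.isFinite(price) || price < 0) {
    throw new TypeError("price must be a non-negative finite number");
  }
  const cleaned = referencePrices.filter((p) => Number.isFinite(p) && p >= 0);
  return JSON.stringify({ price, referencePrices: cleaned });
}

// Parse the backend's reply, assumed here to look like {"score": -80.3}.
function parseScoreResponse(bodyText) {
  const data = JSON.parse(bodyText);
  if (typeof data.score !== "number" || Number.isNaN(data.score)) {
    throw new Error("backend reply did not contain a numeric score");
  }
  return data.score;
}

// Full round trip: POST the prices, then read back the outlier score.
// `fetchImpl` can be swapped for a stub so this runs without a network.
async function requestOutlierScore(price, referencePrices, fetchImpl = fetch) {
  const response = await fetchImpl(SCORE_ENDPOINT, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: buildScoreRequest(price, referencePrices),
  });
  if (!response.ok) {
    throw new Error(`backend returned HTTP ${response.status}`);
  }
  return parseScoreResponse(await response.text());
}
```

Because the values end up on a command line via exec(), restricting the payload to plain numbers is a sensible precaution. The PHP side would still need to validate its input again, since a request can be sent by anything, not just the extension.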
Challenges we ran into
I didn’t actually start off with this idea. My first idea was for a browser extension to use a form of statistical analysis known as hedonic regression: a form of regression analysis that breaks down an item or entity into its characteristics and then investigates what effect these characteristics have on the dependent variable. It is usually used to study the characteristics that affect house prices, e.g. size of windows, location, size and so on. The extension would have found items throughout a particular online marketplace and then broken each item down into various characteristics. For a shoe, for example, this would entail breaking the shoe down into sole, laces, colour and so on. It would then check whether the price charged for the shoe in the marketplace differs significantly from the price predicted by the hedonic regression. However, a major problem with this is that it is computationally complex: there are billions of items sold on online marketplaces such as Amazon, and so a browser extension would quickly run into bottlenecks. Another problem with analysing characteristics is that counterfeiters are becoming increasingly adept at what they do, in some cases having the ability to create near-identical items that can only be distinguished under a microscope. I later settled on analysing prices because research suggests that this is what counterfeiters tend to manipulate and, furthermore, it is what tends to attract consumers to purchase the counterfeit item in question.
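To make the abandoned hedonic regression idea more concrete, here is a small hand-rolled JavaScript sketch of it. It was never part of Beagle, and everything in it is hypothetical: the two shoe characteristics (leather or not, and shoe size), the sample prices, and the function names. It fits price = b0 + b1·x1 + b2·x2 + … by ordinary least squares using the normal equations, then predicts the price of a new listing from its characteristics.

```javascript
// Solve the square system A x = b by Gaussian elimination with partial pivoting.
function solveLinearSystem(A, b) {
  const n = b.length;
  // Work on an augmented copy so the inputs are not mutated.
  const M = A.map((row, i) => [...row, b[i]]);
  for (let col = 0; col < n; col++) {
    // Choose the row with the largest entry in this column as the pivot.
    let pivot = col;
    for (let r = col + 1; r < n; r++) {
      if (Math.abs(M[r][col]) > Math.abs(M[pivot][col])) pivot = r;
    }
    if (Math.abs(M[pivot][col]) < 1e-12) {
      throw new Error("characteristics are collinear; cannot fit");
    }
    [M[col], M[pivot]] = [M[pivot], M[col]];
    // Eliminate this column from every row below the pivot.
    for (let r = col + 1; r < n; r++) {
      const factor = M[r][col] / M[col][col];
      for (let c = col; c <= n; c++) M[r][c] -= factor * M[col][c];
    }
  }
  // Back substitution.
  const x = new Array(n).fill(0);
  for (let i = n - 1; i >= 0; i--) {
    let sum = M[i][n];
    for (let c = i + 1; c < n; c++) sum -= M[i][c] * x[c];
    x[i] = sum / M[i][i];
  }
  return x;
}

// Fit the regression coefficients [b0, b1, ...] from rows of characteristics.
function fitHedonic(rows, prices) {
  const X = rows.map((r) => [1, ...r]); // leading 1 gives the intercept b0
  const k = X[0].length;
  const XtX = Array.from({ length: k }, (_, i) =>
    Array.from({ length: k }, (_, j) =>
      X.reduce((s, row) => s + row[i] * row[j], 0)));
  const Xty = Array.from({ length: k }, (_, i) =>
    X.reduce((s, row, n) => s + row[i] * prices[n], 0));
  return solveLinearSystem(XtX, Xty);
}

// Predicted price for one item given its characteristics.
function predictPrice(coefficients, characteristics) {
  return [1, ...characteristics].reduce((s, v, i) => s + v * coefficients[i], 0);
}

// Hypothetical shoes: [isLeather (0 or 1), size], with prices in USD.
const shoes = [[0, 40], [1, 40], [0, 44], [1, 42], [0, 38]];
const shoePrices = [100, 130, 108, 134, 96];
const coefficients = fitHedonic(shoes, shoePrices);
console.log(predictPrice(coefficients, [1, 41]));
```

A listing priced far below the predicted value would then be the suspicious one. Even in this toy form the drawback mentioned above is visible: every product category needs its own set of characteristics and its own fitted model.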
Another problem that I ran into was deciding how to do the statistical calculations necessary to evaluate what is and what isn’t an outlier. The most natural language for doing this on the front-end is JavaScript. However, JavaScript has no built-in support for statistical tasks such as regression analysis. Furthermore, even if it did, using it would mean “reinventing the wheel” that languages with an extensive history in statistics, such as Python and R, already provide. I settled on R because this was the statistical language I was most familiar with.
A minor potential problem that one might consider is whether the code is publicly accessible. Whilst the JavaScript, HTML and CSS in the browser extension are publicly accessible, the R script on the backend is hidden from public sight, which makes it harder for counterfeiters to “game the system”. And even if it were publicly accessible, it is unlikely to make much difference, because counterfeiters would have to price their item close to that of the original item, which substantially weakens one of the ways counterfeiters try to lure their victims.
Accomplishments that we're proud of
I’m actually proud that I have been able to use outlier analysis in a real-world context. As someone from an economics background, outlier analysis is not a tool that is used often except in rare situations where the data might need to be cleansed.
What we learned
I really liked how I learned just how much counterfeiting affects the US and world economy. I was also glad to learn about the various ways Amazon confronts the problem of counterfeiting. I have also developed an interest in following counterfeiting stories – as someone from London, I recently read that a series of shops on Oxford Street were targeted in a raid based on their distribution of counterfeit goods.
What's next for Beagle: An extension that alerts you of possible counterfeits
One of the next steps would be to work with CINA to see if the browser extension has potential. If it does have potential, it would also be good to develop a means by which it can interact with mobile phones, as browsers on mobile phones do not typically have or permit extensions.
