There is a demand for a supermarket crowd tracker that helps shoppers optimize their shopping time and spreads supermarket visits more evenly across opening hours, in order to reduce waiting time and contagion risk. We were inspired by Google's COVID-19 Community Mobility Reports (https://www.blog.google/technology/health/covid-19-community-mobility-reports), which provide insights into what has changed in response to policies aimed at slowing the spread of COVID-19 by showing trends over time by geography, across categories of places such as retail and recreation, and groceries and pharmacies.
What it does
Our website, "ShopSafely", models the occupancy of supermarkets by analyzing aggregated location data from Swiss telecom companies, in compliance with Swiss law on data confidentiality and protection. At a time when social distancing is needed, this application helps people practice it effectively, especially in the post-quarantine period. It lets users monitor the flow of people going to the stores and gives recommendations aimed at reducing crowding in the stores.
We used anonymized, aggregated data from Swisscom.
As a small experiment, we used Swisscom's insights tool to generate predictions of the attendance at the supermarkets of two shopping malls, in Crissier and Ecublens.
How we are building it
We will first provide a proof of concept that uses randomized data simulating people's visits to local supermarkets (inspired by the Google Popular Now feature) and shows each person the closest "available"* shop. The goal is also to have a dummy heatmap, built from manually generated data, deployed on a platform built with common web development tools.
*available: we define a shop as available when it is not near full capacity. During this pandemic period, many supermarkets adopted a queue system in which each shop has a specific capacity and people arriving once it is reached have to wait in a queue.
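The "closest available shop" lookup described above could be sketched as follows. This is only an illustration under stated assumptions: the `Shop` fields, the function names, and the `near_full` cutoff of 0.9 are hypothetical and do not come from the project, and the distance is a plain great-circle distance rather than a walking or driving distance.

```python
import math
from dataclasses import dataclass


@dataclass
class Shop:
    # Hypothetical record; field names are assumptions for this sketch.
    name: str
    lat: float
    lon: float
    current_load: int
    max_load: int


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def closest_available_shop(user_lat, user_lon, shops, near_full=0.9):
    """Return the nearest shop whose load ratio is below `near_full`, or None.

    `near_full` is an assumed cutoff for "near full capacity".
    """
    available = [s for s in shops if s.current_load / s.max_load < near_full]
    if not available:
        return None
    return min(available, key=lambda s: haversine_km(user_lat, user_lon, s.lat, s.lon))
```

With this sketch, a shop that is very close but almost full is skipped in favor of a slightly farther shop that still has room.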
For now, we have implemented a website, "ShopSafely", which shows a heatmap of the city of Lausanne along with the location of the user and of the nearby markets. Colors encode the density (RealTimeLoad/MaxLoad):
Green denotes a density below 0.2
Yellow denotes a density between 0.2 and 0.7
Red denotes a density above 0.7
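The color thresholds above can be written as a small helper. This is a minimal sketch: the function name is hypothetical, and since the text does not say which color the exact boundary values 0.2 and 0.7 belong to, assigning both to yellow is an assumption.

```python
def density_color(real_time_load, max_load):
    """Map a store's density (RealTimeLoad/MaxLoad) to a heatmap color."""
    if max_load <= 0:
        raise ValueError("max_load must be positive")
    density = real_time_load / max_load
    if density < 0.2:
        return "green"
    if density <= 0.7:  # assumption: boundary values count as yellow
        return "yellow"
    return "red"
```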
How to optimize
Find a mathematical model that accurately represents the movement of the users.
Leverage the data collected from the users.
Motivate Swiss telecom companies to give us access to anonymized location data.
For a given position in space, we would like to estimate the probability of going to each store. We model each store as a Gaussian distribution centered on its coordinates, with a variance that depends on the occupation density of the store: the emptier the store, the larger the variance needs to be, so that an empty store attracts users from further away. Given these prior distributions, we obtain the position of the user and select the store with the highest posterior probability at that position.
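A minimal sketch of this store-selection model follows, assuming planar coordinates (for example kilometres in a local projection), an isotropic 2D Gaussian per store, and an equal prior weight for every store. The mapping from density to spread, `sigma = base_sigma * (2 - density)`, is an assumption chosen only so that emptier stores get a larger variance; the project's actual mapping is not given in the text.

```python
import math


def store_likelihood(user_xy, store_xy, density, base_sigma=1.0):
    """Isotropic 2D Gaussian density of the store, evaluated at the user position."""
    density = min(max(density, 0.0), 1.0)
    sigma = base_sigma * (2.0 - density)  # assumption: empty store -> twice the spread
    dx = user_xy[0] - store_xy[0]
    dy = user_xy[1] - store_xy[1]
    d2 = dx * dx + dy * dy
    return math.exp(-d2 / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)


def store_probabilities(user_xy, stores, base_sigma=1.0):
    """Normalize the likelihoods into a probability for each store.

    `stores` maps a store name to ((x, y), density).
    """
    likes = {
        name: store_likelihood(user_xy, xy, dens, base_sigma)
        for name, (xy, dens) in stores.items()
    }
    total = sum(likes.values())
    if total == 0.0:
        raise ValueError("user is too far from every store for this model")
    return {name: value / total for name, value in likes.items()}


def select_store(user_xy, stores, base_sigma=1.0):
    """Pick the store with the highest probability for this user."""
    probs = store_probabilities(user_xy, stores, base_sigma)
    return max(probs, key=probs.get)
```

In this sketch, a user standing midway between a full store and an empty one is sent to the empty one, while a user standing right next to the full store is still sent there, because the narrower Gaussian has a higher peak.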
What have we already implemented?
Our website "ShopSafely" is still hosted locally.
To run our simulations, we used dummy data based on the data collected for the month of March, adapted so that it better reflects the current situation.
Our recommendation algorithm has been implemented. It takes into account the real-time occupancy of each store, its maximum occupancy, and its distance from the user.
We used our data to run simulations that show how much our algorithm improves the overall occupancy of the stores.
Starting from the daily distribution in the telecom data, we generated realistic data by taking the average daily attendance and adding random noise to it. The simulation used a population of 50,000 customers and 10 stores. Each customer was assigned their closest store and a visiting time based on the store density. Each store was assigned a surface proportional to its number of assigned customers, and a threshold (delta_0) above which there is a risk of contagion. We ran a first simulation with no intervention, then a second one in which we tried to reallocate customers from overcrowded stores to emptier ones. The results show a significant decrease in the overall excess over capacity, across both time intervals and stores (see the figures). The final result: an 81.44% decrease in overall excess over capacity.
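The two-step simulation described above (baseline, then reallocation) could be sketched roughly as below. Everything specific here is an assumption made for illustration: the hourly `profile`, the noise range, the value of `delta_0`, the per-slot capacity `delta_0 * assigned customers`, and the greedy rule that moves excess customers to the cheapest (store, time slot) pair with spare room. It is not the project's code and is not expected to reproduce the 81.44% figure.

```python
import random


def simulate(n_customers=50_000, n_stores=10, delta_0=0.12, time_penalty=0.1, seed=0):
    """Return (excess_before, excess_after) summed over all stores and time slots."""
    rng = random.Random(seed)
    profile = [1, 2, 3, 4, 5, 4, 3, 4, 6, 5, 3, 1]  # hypothetical hourly attendance
    weights = [w * rng.uniform(0.8, 1.2) for w in profile]  # average profile plus noise
    n_slots = len(weights)
    stores = [(rng.random(), rng.random()) for _ in range(n_stores)]

    def dist(p, q):
        return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

    # Baseline: each customer goes to the closest store at a randomly drawn slot.
    counts = [[0] * n_slots for _ in range(n_stores)]
    assigned = [0] * n_stores
    for slot in rng.choices(range(n_slots), weights=weights, k=n_customers):
        pos = (rng.random(), rng.random())
        s = min(range(n_stores), key=lambda i: dist(pos, stores[i]))
        counts[s][slot] += 1
        assigned[s] += 1

    # Surface proportional to assigned customers; delta_0 sets the per-slot limit.
    capacity = [int(delta_0 * a) for a in assigned]

    def total_excess(c):
        return sum(
            max(0, c[s][t] - capacity[s]) for s in range(n_stores) for t in range(n_slots)
        )

    before = total_excess(counts)

    # Reallocation: move excess customers to the cheapest (store, slot) with room.
    moved = [row[:] for row in counts]
    for s in range(n_stores):
        for t in range(n_slots):
            excess = moved[s][t] - capacity[s]
            if excess <= 0:
                continue
            targets = sorted(
                ((j, u) for j in range(n_stores) for u in range(n_slots) if (j, u) != (s, t)),
                key=lambda ju: dist(stores[s], stores[ju[0]]) + time_penalty * abs(ju[1] - t),
            )
            for j, u in targets:
                room = capacity[j] - moved[j][u]
                if room <= 0:
                    continue
                k = min(excess, room)  # never push the target above its capacity
                moved[s][t] -= k
                moved[j][u] += k
                excess -= k
                if excess == 0:
                    break

    return before, total_excess(moved)
```

Because customers are only moved into slots that still have spare room, the excess after reallocation can never be larger than the baseline excess in this sketch. A call such as `before, after = simulate(n_customers=5000)` then gives the two totals to compare.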
As a feature experiment, we tried to use three weeks of Swisscom insights time-series data to forecast the occupancy of two supermarkets.
The data shows how many people were present in the area of each supermarket at a given hour and day between 03 Mar 2020 and 29 Mar 2020.
As a first approach, we tried a few forecasting models on the data and went a bit further with a variation of the ARIMA model, which gave inaccurate results.
We suspect that these poor results are due to the lack of data (only 3 weeks), the decreasing attendance at the supermarkets, and the irregularity of the cycles caused by what was happening in the city during that period (March 2020).
Pictures 2 and 3: The model was able to predict the attendance for the next day (30 Mar), since it had already seen 21 days, but was not able to predict correctly beyond that.
We decided to proceed with manually estimated data based on the dataset from Swisscom.