As a machine learning researcher, I always want to use machine learning and data science to solve real-world problems, particularly in education, the environment, and healthcare.
What it does
Our first objective was to aggregate the features (e.g. noise, temperature, humidity, wifi signal strength) for each location. However, raw latitude and longitude values cannot be aggregated reliably, because two readings taken at almost the same spot rarely share exactly the same coordinates. We therefore divided the spatial area (bounded by the minimum and maximum lat and lon values) into square grid cells of 5 meters by 5 meters, and then computed the mean of each feature for each cell (identified by a cell_id).
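The gridding step described above could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function names, the record layout (dicts with `lat`, `lon`, and feature keys), and the flat-earth approximation of about 111,320 meters per degree of latitude are all assumptions made for the example.

```python
import math
from collections import defaultdict

CELL_SIZE_M = 5.0                # cell edge length, taken from the write-up
METERS_PER_DEG_LAT = 111_320.0   # approximate value, assumed for this sketch


def make_cell_mapper(min_lat, min_lon, max_lat, max_lon, cell_size_m=CELL_SIZE_M):
    """Return a function that maps (lat, lon) to an integer cell_id."""
    # A degree of longitude shrinks with latitude, so scale by cos(mid latitude).
    mid_lat = math.radians((min_lat + max_lat) / 2.0)
    meters_per_deg_lon = METERS_PER_DEG_LAT * math.cos(mid_lat)

    height_m = (max_lat - min_lat) * METERS_PER_DEG_LAT
    width_m = (max_lon - min_lon) * meters_per_deg_lon
    n_rows = max(1, math.ceil(height_m / cell_size_m))
    n_cols = max(1, math.ceil(width_m / cell_size_m))

    def cell_id(lat, lon):
        row = int((lat - min_lat) * METERS_PER_DEG_LAT // cell_size_m)
        col = int((lon - min_lon) * meters_per_deg_lon // cell_size_m)
        # Clamp so points on the bounding-box edges stay inside the grid.
        row = min(max(row, 0), n_rows - 1)
        col = min(max(col, 0), n_cols - 1)
        return row * n_cols + col

    return cell_id


def aggregate_by_cell(records, cell_id, features):
    """Compute the mean of each feature per cell, skipping missing values."""
    values = defaultdict(lambda: defaultdict(list))
    for rec in records:
        cid = cell_id(rec["lat"], rec["lon"])
        for name in features:
            if rec.get(name) is not None:
                values[cid][name].append(rec[name])
    return {
        cid: {name: sum(vals) / len(vals) for name, vals in per_feature.items()}
        for cid, per_feature in values.items()
    }
```

Two readings taken about a meter apart fall into the same cell and are averaged together, while a reading 50 meters away gets its own cell_id.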
Looking at the WifiAccessPoints column, we found four main types of PDSB network:
- PDSB_Wifi: mainly for school students
- PDSB_GUEST: mainly for guests and visitors to the schools
- PDSB_Media: we assume it is mainly for downloading large media or streaming
- PDSB_Admin: for school admins and staff
For each location (lat, lon), we extract the mean, minimum, and maximum signal strength for each of the four network types above. We also extract the number of networks available in each category at each location.
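A rough sketch of this per-location extraction is shown below. The write-up does not describe the raw format of the WifiAccessPoints column, so the example assumes it has already been parsed into a list of `(ssid, signal_strength)` pairs; that layout, the exact-match rule on the SSID, and the output key names are all hypothetical.

```python
from statistics import mean

# The four network types named in the write-up.
NETWORK_TYPES = ("PDSB_Wifi", "PDSB_GUEST", "PDSB_Media", "PDSB_Admin")


def wifi_features(access_points):
    """Summarise signal strength per network type for one location.

    `access_points` is assumed to be a list of (ssid, signal_strength)
    pairs; networks outside NETWORK_TYPES are ignored.
    """
    features = {}
    for net in NETWORK_TYPES:
        strengths = [s for ssid, s in access_points if ssid == net]
        features[f"{net}_count"] = len(strengths)
        # Use None when a network type is not visible at this location.
        features[f"{net}_mean"] = mean(strengths) if strengths else None
        features[f"{net}_min"] = min(strengths) if strengths else None
        features[f"{net}_max"] = max(strengths) if strengths else None
    return features
```

The resulting dictionary can be merged into the location's record so that the wifi features are averaged per grid cell along with temperature, humidity, and noise.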
These grid-wise aggregated features can be used to find locations where:
(i) students feel more comfortable studying, based on temperature, humidity, and noise level
(ii) students get a better internet connection, with more networks to choose from
(iii) parents and visitors get a better GUEST network connection
We also provide a color visualization of these features on a map (using folium) for better understanding. Each CircleMarker on the folium map is colored according to the range its feature value falls into. Clicking on any CircleMarker opens a popup showing the feature value at that location.
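The map described above could be built along these lines. This is only a sketch under assumptions: the four-color palette, the equal-width value ranges, the helper names, and the `cells` layout (dicts with `lat`, `lon`, and the feature value) are illustrative and not taken from the notebook. folium is imported inside the function so the color helper can be used without it.

```python
PALETTE = ["blue", "green", "orange", "red"]  # low to high; assumed palette


def color_for_value(value, vmin, vmax, palette=PALETTE):
    """Pick a color by splitting [vmin, vmax] into equal-width ranges."""
    if vmax <= vmin:
        return palette[0]
    frac = (value - vmin) / (vmax - vmin)
    idx = int(frac * len(palette))
    # Clamp so vmax (and any out-of-range value) maps to a valid color.
    return palette[min(max(idx, 0), len(palette) - 1)]


def build_feature_map(cells, feature):
    """Draw one CircleMarker per cell, colored by `feature`."""
    import folium  # third-party dependency, needed only for drawing

    values = [c[feature] for c in cells]
    vmin, vmax = min(values), max(values)
    center = [
        sum(c["lat"] for c in cells) / len(cells),
        sum(c["lon"] for c in cells) / len(cells),
    ]
    fmap = folium.Map(location=center, zoom_start=18)
    for c in cells:
        color = color_for_value(c[feature], vmin, vmax)
        folium.CircleMarker(
            location=[c["lat"], c["lon"]],
            radius=5,
            color=color,
            fill=True,
            fill_color=color,
            popup=f"{feature}: {c[feature]:.1f}",  # shown on click
        ).add_to(fmap)
    return fmap
```

In a Jupyter notebook, evaluating the returned map object in a cell displays it inline; `fmap.save("map.html")` writes it to a standalone page.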
We also compute how temperature, humidity, noise, and wifi signal strength change across different sessions of the day (early morning, morning, noon, etc.) in each grid cell.
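One way to sketch this session-wise analysis is shown below. The write-up names only some of the sessions, so the full list and the hour boundaries here are assumptions, as are the function names and the record layout (dicts with `cell_id`, a `timestamp` as a `datetime`, and the feature value).

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# (start hour, session name); boundaries are assumed for illustration.
SESSIONS = [
    (5, "early morning"),
    (8, "morning"),
    (12, "noon"),
    (14, "afternoon"),
    (17, "evening"),
    (21, "night"),
]


def session_of(ts):
    """Map a datetime to its session of the day."""
    label = "night"  # hours before the first boundary belong to the night
    for start_hour, name in SESSIONS:
        if ts.hour >= start_hour:
            label = name
    return label


def mean_by_cell_and_session(records, feature):
    """Average `feature` for every (cell_id, session) pair."""
    groups = defaultdict(list)
    for rec in records:
        key = (rec["cell_id"], session_of(rec["timestamp"]))
        groups[key].append(rec[feature])
    return {key: mean(vals) for key, vals in groups.items()}
```

Comparing the values for one cell_id across sessions then shows, for example, whether a given spot gets noisier around noon.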
These analyses can help with the daily planning of leisure-area usage. For example, quieter, more comfortable spaces with good internet connectivity can be used as study spaces. During summer or at certain times of day, areas with higher temperature and humidity can be supplied with extra water and refreshments. School administrators can use this tool for daily planning, arranging leisure activities and study spaces, and providing better connectivity for guests and visitors.
How I built it
I programmed the analysis in a Jupyter notebook, using folium for the map visualizations.
Challenges I ran into
- Converting the raw location data into grids, which is a complex task.
- Extracting the different features from the WifiAccessPoints column.
Accomplishments that I'm proud of
We were able to complete most of the work in a very short time, and we hope our analysis can help improve the availability of school resources.
What I learned
How to handle spatio-temporal data from the education field.