Growing the Mobile App Store Business

Inspiration

We live in a world where media strongly influences our lives, and the spread of smartphones and laptops has amplified that influence. Mobile applications designed by brilliant engineers have made our lives more convenient. However, many mobile apps offer similar products and compete fiercely, so app designers and providers need to understand users' habits in order to find and serve their target consumers. We are therefore particularly interested in analyzing users' habits across different brands (e.g., Apple, Samsung, and Blackberry) and in whether those habits differ significantly between brands.

What it does

Our goals are twofold. 1) To help app stores better advertise their apps by following the four steps of an app download: when, why, how, and what. Based on the data given, we identify patterns in user habits and give app developers and providers suggestions accordingly. For example, we find that games, social networking, utilities, weather, and music are generally the five most downloaded types of apps, so we recommend that app stores promote apps to their customers based on this pattern. 2) To help each app store better advertise the store itself and its corresponding mobile phone brand. We construct two models, a linear SVM and a nonlinear 4-layer neural network, to predict which mobile phone brand a person uses from their demographic information. In our data, Apple, Blackberry, Android, Nokia, and Samsung are the five brands that people prefer the most. These two models can help app stores target customers based on individuals' demographic information and thus advertise their phones more efficiently.
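
The ranking of app types described above can be reproduced with a simple frequency count. The snippet below is a minimal sketch, not our actual analysis code: the function name `top_categories` and the sample responses are hypothetical, and it assumes each respondent may select several app categories.

```python
from collections import Counter


def top_categories(responses, n=5):
    """Return the n most frequently chosen categories with their share of all selections."""
    # Flatten the per-respondent selections into one stream of category names.
    counts = Counter(category for chosen in responses for category in chosen)
    total = sum(counts.values())
    return [(category, count / total) for category, count in counts.most_common(n)]


# Hypothetical survey answers: one list of selected categories per respondent.
sample_responses = [
    ["Games", "Weather"],
    ["Games", "Music", "Utilities"],
    ["Social Networking", "Games"],
    ["Weather", "Social Networking"],
    ["Finance"],
]

top = top_categories(sample_responses, n=2)
for category, share in top:
    print(f"{category}: {share:.0%} of all selections")
```

With real survey data, the same count could be run per app store to see whether the ranking changes from one store to another.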

How we built it

We first use Python to read the data and clean it, keeping the variables we are particularly interested in. We then export the dataset to RStudio for exploratory data analysis. Using R, we plot the data to analyze users' habits for each of the five app stores. Based on the patterns in users' feedback, we come up with suggestions for app stores and providers to better serve their current customers. In addition, given the large amount of data, we use Python for the modeling, which predicts users' phone brand preference from their gender, age, marital status, occupation, income, etc. In this way, we can further identify potential customers for different kinds of mobile applications, and app stores may use our models to recommend products to users according to their profiles.
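
Before categorical demographic answers can be fed to an SVM or a neural network, they have to be turned into numbers. One common way to do this is one-hot encoding, sketched below. This is an illustration under assumptions rather than our exact preprocessing: the field names, the sample rows, and the helper functions `build_vocabulary` and `one_hot` are all hypothetical.

```python
def build_vocabulary(rows, fields):
    """Collect the sorted distinct values observed for each categorical field."""
    return {field: sorted({row[field] for row in rows}) for field in fields}


def one_hot(row, vocabulary):
    """Encode one respondent as a flat 0/1 vector, one slot per (field, value) pair."""
    vector = []
    for field, values in vocabulary.items():
        vector.extend(1 if row[field] == value else 0 for value in values)
    return vector


# Hypothetical cleaned survey rows; "brand" is the label the models would predict.
rows = [
    {"gender": "F", "age_group": "18-24", "marital_status": "single", "brand": "Apple"},
    {"gender": "M", "age_group": "25-34", "marital_status": "married", "brand": "Samsung"},
    {"gender": "F", "age_group": "25-34", "marital_status": "single", "brand": "Nokia"},
]
fields = ["gender", "age_group", "marital_status"]

vocabulary = build_vocabulary(rows, fields)
features = [one_hot(row, vocabulary) for row in rows]
labels = [row["brand"] for row in rows]
print(features[0], labels[0])
```

Each respondent ends up with exactly one active slot per field, so the resulting vectors can be passed to either a linear or a nonlinear classifier.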

Challenges we ran into

In the beginning, the survey questions were so diverse that it was hard to settle on a single topic to explore; we solved this by organizing the analysis around the four steps of app downloading. When constructing the two models (one linear and one nonlinear), we modified the parameters of the 4-layer neural network several times to achieve higher accuracy. However, after more than 10 rounds of modification, the accuracies of both models were still around 40%. We discussed possible reasons and concluded that this might be due to the unbalanced nature of the original dataset.
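
One way to check whether class imbalance is limiting a classifier is to compare its accuracy with a majority-class baseline and to compute per-class weights that could be used during training. The sketch below is not something we ran during the hackathon. The label counts are hypothetical, and the weighting rule (total samples divided by the number of classes times the class count) is one common "balanced" heuristic, taken here as an assumption.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent class."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Hypothetical, deliberately unbalanced brand labels.
brand_labels = ["Apple"] * 5 + ["Samsung"] * 3 + ["Blackberry"] + ["Nokia"]

baseline = majority_baseline_accuracy(brand_labels)
weights = balanced_class_weights(brand_labels)
print(f"majority-class baseline accuracy: {baseline:.2f}")
for brand, weight in weights.items():
    print(f"{brand}: weight {weight:.2f}")
```

If a model's accuracy is close to the baseline, it may be doing little more than predicting the dominant brand; giving rare brands larger weights is one possible remedy to try.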

Accomplishments that we're proud of

We make two contributions to the app store and its corresponding mobile phone brands, combining data visualization with data analysis and machine learning model construction.

What we learned

When a graph has numerous outcomes on both the x-axis and the y-axis, we learned to base the analysis on heat maps annotated with percentages, which give a more direct visualization. We also learned to apply the SVM and 4-layer neural network models from our machine learning classes; such a real-world application is invaluable. Moreover, making the most of our resources is another important lesson the Boom Shaka Laka team learned. Specifically, during our group brainstorm, we split the mobile phone user survey into two parts so that we could explore them separately and generate more results, and thus more impact.
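
The percentages shown in such a heat map come from a cross-tabulation in which each row is normalized to sum to 100. The following is a minimal sketch of that computation in Python, with a hypothetical helper `row_percentages` and made-up (brand, app category) pairs; it only builds the table and leaves the plotting to whichever tool is used.

```python
from collections import Counter, defaultdict


def row_percentages(pairs):
    """Cross-tabulate (row, column) pairs and convert each row to percentages."""
    counts = defaultdict(Counter)
    for row_key, col_key in pairs:
        counts[row_key][col_key] += 1
    table = {}
    for row_key, row_counts in counts.items():
        row_total = sum(row_counts.values())
        table[row_key] = {col: 100.0 * count / row_total for col, count in row_counts.items()}
    return table


# Hypothetical (brand, downloaded app category) observations.
observations = [
    ("Apple", "Games"), ("Apple", "Games"), ("Apple", "Music"), ("Apple", "Weather"),
    ("Samsung", "Games"), ("Samsung", "Utilities"),
]

table = row_percentages(observations)
for brand, shares in table.items():
    cells = ", ".join(f"{category} {share:.0f}%" for category, share in shares.items())
    print(f"{brand}: {cells}")
```

Normalizing by row makes brands with very different numbers of respondents directly comparable, which is what makes the percentage view easier to read than raw counts.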