This repository contains a set of scripts to automate the process of gathering data from malware samples, training a machine learning model on that data, and plotting its classification accuracy.
Make a copy of config-template.ini called config.ini and edit it.
Ensure that the "tools" subdirectory has been initialized ("$ git submodule update --init tools").
Either use get_samples.py to download samples or copy them into "all_apks" from another source.
sort_malicious.py uses andrototal.org to sort them into "malicious_apk" and "benign_apk" folders.
extract_apks.sh unpacks the .apk files into folders and checks the AndroidManifest.xml files for validity.
parse_xml.py reads the AndroidManifest.xml files and puts the permissions requested by each app into "app_permission_vectors.json".
run_trials.sh runs the tensorflow_learn.py script (where the ML happens) a number of times and writes the results to "results.csv".
plot_data.py plots the data produced by the previous step using matplotlib.
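
As a rough illustration of the permission-extraction step, the sketch below shows one way the permissions in a decoded AndroidManifest.xml could be turned into a 0/1 vector. It is not the code in parse_xml.py. The function names, the sample manifest, the vocabulary, and the JSON layout are all assumptions made for the example, and it assumes the manifest has already been decoded to plain-text XML.

```python
import json
import xml.etree.ElementTree as ET

# XML namespace used for android:* attributes in AndroidManifest.xml.
ANDROID_NS = "http://schemas.android.com/apk/res/android"


def extract_permissions(manifest_xml):
    """Return the sorted, de-duplicated permission names in a manifest string."""
    root = ET.fromstring(manifest_xml)
    names = set()
    for elem in root.iter("uses-permission"):
        # Namespaced attributes are addressed as "{namespace}name".
        name = elem.get("{%s}name" % ANDROID_NS)
        if name:
            names.add(name)
    return sorted(names)


def to_vector(permissions, vocabulary):
    """Encode permissions as a 0/1 list aligned with the given vocabulary."""
    requested = set(permissions)
    return [1 if perm in requested else 0 for perm in vocabulary]


# Hypothetical manifest and vocabulary, for illustration only.
SAMPLE_MANIFEST = """
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
          package="com.example.app">
    <uses-permission android:name="android.permission.INTERNET" />
    <uses-permission android:name="android.permission.SEND_SMS" />
</manifest>
"""

VOCABULARY = [
    "android.permission.CAMERA",
    "android.permission.INTERNET",
    "android.permission.SEND_SMS",
]

if __name__ == "__main__":
    perms = extract_permissions(SAMPLE_MANIFEST)
    # Assumed layout: package name mapped to its permission vector.
    print(json.dumps({"com.example.app": to_vector(perms, VOCABULARY)}))
```

A fixed vocabulary keeps every app's vector the same length, which is what a classifier needs as input. The real "app_permission_vectors.json" may be laid out differently.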