The project aimed to identify fraudulent transactions by detecting known patterns, such as flow patterns and circular patterns, using machine learning. The known patterns were first identified by visualizing the sample data provided by Credit Suisse and filtering it on factors such as transaction time, frequency, source location and destination. The data matching these patterns was then selected as the training set for a Random Forest classifier. The initial visualization was done in Neo4j (as provided by Credit Suisse).
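To make the circular pattern concrete, here is a minimal sketch of how money returning to its origin account could be picked out of a list of transfers. It is not the code we ran at the event: the function name `find_cycles`, the `(sender, receiver)` tuple layout and the `max_len` cutoff are all assumptions made for illustration.

```python
def find_cycles(transactions, max_len=4):
    """Return simple cycles of up to max_len transfers, as lists of accounts.

    transactions: iterable of (sender, receiver) pairs (hypothetical layout).
    Each cycle is reported once, starting from its smallest account ID.
    """
    graph = {}
    for sender, receiver in transactions:
        graph.setdefault(sender, set()).add(receiver)

    cycles = []

    def walk(start, node, path):
        for nxt in sorted(graph.get(node, ())):
            if nxt == start:
                # The money is back where it started: record the loop.
                cycles.append(path[:])
            elif nxt > start and nxt not in path and len(path) < max_len:
                path.append(nxt)
                walk(start, nxt, path)
                path.pop()

    for start in sorted(graph):
        walk(start, start, [start])
    return cycles


transfers = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D"), ("D", "E")]
print(find_cycles(transfers))  # [['A', 'B', 'C']]
```

Accounts that appear in a reported cycle could then be labeled as suspicious when building a training set, although which cycles are genuinely fraudulent would still need to be checked against the other factors.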
The group then attempted to identify the patterns using Dijkstra's algorithm, assigning a weight to each transaction based on the criteria listed above, and visualized the results as directed graphs in MATLAB. In parallel, a machine learning model was trained on the gathered training data. However, the group ran into issues, caused either by poor training data or by not having identified the appropriate data. Despite these issues, it was overall a very positive experience, and we are proud of having accomplished so much in less than 24 hours. It was also a great learning experience in the fields of data science and computer science.
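The Dijkstra-based step mentioned above can be sketched roughly as follows. This is an illustrative Python version rather than our MATLAB code, and it assumes each transaction has already been reduced to a single non-negative weight; the `(sender, receiver, weight)` layout and the name `cheapest_path` are hypothetical.

```python
import heapq


def cheapest_path(edges, source, target):
    """Dijkstra's algorithm on a directed transaction graph.

    edges: iterable of (sender, receiver, weight) with weight >= 0.
    Returns (path, cost), or (None, inf) if target is unreachable.
    """
    graph = {}
    for sender, receiver, weight in edges:
        graph.setdefault(sender, []).append((receiver, weight))

    best = {source: 0.0}   # lowest known cost to reach each account
    prev = {}              # predecessor on the cheapest known path
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was found already
        for nxt, weight in graph.get(node, ()):
            new_cost = cost + weight
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                prev[nxt] = node
                heapq.heappush(queue, (new_cost, nxt))

    if target not in best:
        return None, float("inf")
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], best[target]


edges = [("A", "B", 1.0), ("B", "C", 2.0), ("A", "C", 5.0), ("C", "D", 1.0)]
print(cheapest_path(edges, "A", "D"))  # (['A', 'B', 'C', 'D'], 4.0)
```

With a weighting where low cost means "more suspicious" (for example, short time gaps between transfers), the cheapest path between two accounts would highlight the most likely flow of funds. How to combine time, frequency and location into one weight is the part that would need real tuning.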
Future work could build on the ideas listed above. Probabilistic methods, further machine learning and null hypothesis testing could also be applied to improve the training data, or simply to give the user additional information about the confidence of the results.
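As one possible reading of the null hypothesis idea, the sketch below computes an exact binomial p-value. The null hypothesis assumed here is that the detector does no better than picking transactions at random, so each flagged transaction is fraudulent with probability equal to the overall fraud rate. The function name and its parameters are hypothetical, and this was not implemented during the event.

```python
from math import comb


def binomial_p_value(hits, flagged, base_rate):
    """Probability of at least `hits` true frauds among `flagged`
    transactions if flags were assigned at random (the null hypothesis).

    base_rate: assumed overall fraction of fraudulent transactions.
    """
    return sum(
        comb(flagged, k) * base_rate**k * (1 - base_rate) ** (flagged - k)
        for k in range(hits, flagged + 1)
    )


# Example: 8 of 20 flagged transactions turn out to be fraudulent,
# while only 5% of all transactions are assumed to be fraudulent.
print(binomial_p_value(8, 20, 0.05))
```

A small p-value would suggest that the flagged set is unlikely to be a random selection, and the value could be shown to the user next to each batch of flagged transactions as a rough confidence indicator.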