Inspiration
This project stems from our dedication to enhancing organizational transparency and accountability. Through analyzing data from multiple sources and drawing on anomaly detection algorithms, we've learned how nuanced and challenging it is to integrate diverse datasets into a single model. Understanding data behavior across disparate systems and spotting subtle patterns has not only strengthened our technical capabilities but also deepened our knowledge of data-driven compliance.
What it does
In today’s complex business landscape, ensuring data integrity and identifying anomalies is crucial for maintaining regulatory compliance and mitigating financial risk. Our project, Anomaly Detection - Employee 360, is inspired by the need for a comprehensive approach to detect fraudulent and non-compliant activities within an organization’s extensive data ecosystem. By leveraging machine learning techniques, particularly Isolation Forest and Graph Neural Networks (GNN), we aim to provide an AI-driven solution that effectively monitors diverse datasets—from CRM and SAP systems to Medical Information and Grants.
How we built it
Our approach involved a multi-stage development process:
Data Collection & Feature Engineering: We first gathered data from CRM, SAP, T&E, Medical Info, and Grants. Extensive feature engineering was then applied to create meaningful attributes that amplify the detection of anomalous activities.
Model Development & Comparison: We experimented with Isolation Forest for its efficiency in identifying outliers in high-dimensional data. We also incorporated Graph Neural Networks to capture complex relationships and dependencies within our data.
Ontology and Entity Resolution: A critical step in our process involved creating an ontology to define relationships and align entities across systems, allowing for more accurate anomaly detection.
Deployment & Productionalize: To ensure usability, we developed and implemented models through a scalable, production-ready framework, allowing seamless integration into existing infrastructure.
Challenges we ran into
Accomplishments that we're proud of
Implementing and fine-tuning both Isolation Forest and Graph Neural Networks (GNN) for anomaly detection was a proud milestone. We developed a model architecture that effectively identifies complex patterns of non-compliance and potential fraud, adding significant value to organizational data monitoring.
What we learned
Developing an anomaly detection model with high accuracy while ensuring compliance with strict data privacy and regulatory standards was a major challenge. We also faced technical constraints due to data integration issues and performance limitations. Creating an efficient ontology to support entity resolution and relationship mapping was particularly complex. Additionally, balancing model precision with the flexibility needed to detect evolving patterns of non-compliance required constant tuning and testing.
What's next for Anomaly Detection of claims data
With Anomaly Detection - Employee 360, we hope to set a new standard for data integrity in enterprise environments, demonstrating how advanced AI can drive responsible and proactive compliance. Moving forward, we plan to enhance model adaptability and streamline the onboarding of new data sources, ensuring our solution remains robust and scalable for future needs.
Built With
- databricks
- neptune
- starburst
Log in or sign up for Devpost to join the conversation.