Inspiration

Social engineering scams such as phishing, vishing, and smishing are a growing concern, evolving in sophistication and scale as artificial intelligence advances. These scams threaten everyone, but they disproportionately target vulnerable communities. The Federal Trade Commission's (FTC) October 2021 Staff Report, "Serving Communities of Color," highlighted that minority communities are frequently targeted by fraud in areas such as education, credit products, and government impersonation. Further, a survey by Malwarebytes revealed that Black people, Indigenous people, and people of color (BIPOC) are at a higher risk of falling victim to identity theft than others. This project seeks to address these disparities by developing an AI-driven solution to detect, analyze, and combat social engineering scams, with a specific focus on protecting marginalized communities from online fraud.

What it does

The application features three main functionalities aimed at enhancing user security.

Check if an email is a phishing attempt: Users can paste an email they have received into the application, which uses the fine-tuned Gemini model to analyze its content. The model compares the email against the phishing patterns it learned during fine-tuning and tells users whether the email is likely to be a phishing attempt. This functionality helps users avoid falling victim to malicious tactics designed to steal information or money.

Check if a call is a scam attempt: Users can upload audio recordings in MP3 format to assess the legitimacy of potential scam calls. The application uses Google Speech-to-Text to convert the audio into text. The transcribed text is then analyzed by the fine-tuned Gemini model, which has been trained on transcripts of known scam calls. By analyzing the transcript, the application can offer insights into the legitimacy of the call, helping users make informed decisions about how to respond.

Ask Questions on General Security Guidelines: In addition to threat detection, the application serves as an educational resource by allowing users to ask questions related to general security guidelines. This feature provides users with personalized advice on various topics, such as password management, recognizing phishing attempts, and safe online practices. By fostering an understanding of security measures, the application empowers users to take proactive steps in safeguarding their personal information.

How we built it

Fine-tuning the Gemini 1.5 Flash model:

Data collection: The data collection process for fine-tuning the model involves creating two specialized datasets. First, the phishing email dataset includes both human-generated and LLM-generated phishing and genuine emails. This diverse dataset is essential because modern phishing attempts increasingly use AI to craft highly convincing, genuine-looking emails. By incorporating LLM-generated emails, the model becomes capable of identifying subtle signs of phishing in both traditional and AI-generated content. Second, the scam call dataset consists of transcriptions of scam and genuine calls. Using transcriptions allows the model to analyze linguistic patterns typical in scam calls, such as urgency, manipulation, or requests for personal information, while also learning to distinguish legitimate conversations. Together, these datasets ensure the model can accurately detect phishing emails and scam calls, even when they mimic genuine interactions.

Data transformation: For preparing the dataset for Gemini fine-tuning, the phishing email and scam call datasets are restricted to 200 samples in total, with an equal split of 100 genuine and 100 phishing or scam entries. The dataset is then structured into two columns: the input column, which contains the pre-processed and cleaned text, such as the body of the email or the transcribed conversation from a call, and the output column, which holds the corresponding label, either "phishing" or "genuine" for emails, and "scam" or "genuine" for call transcriptions. This format aligns with Gemini’s expected input-output structure for fine-tuning, allowing the model to learn to classify future data based on these labelled examples. Each row presents the text in a readable format paired with the appropriate label, ensuring that the model can process and understand the relationship between the input and output efficiently.
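The transformation described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual preprocessing code: the helper names `build_balanced_dataset` and `to_csv`, the whitespace-only cleaning step, and the fixed random seed are all assumptions made for the example. Only the two-column `input`/`output` layout and the 100/100 split come from the description.

```python
import csv
import io
import random


def build_balanced_dataset(samples, positive_label, per_class=100, seed=0):
    """Pick an equal number of malicious and genuine samples.

    `samples` is a list of (text, label) pairs, where label is either
    `positive_label` ("phishing" or "scam") or "genuine".
    Returns a shuffled list of {"input": ..., "output": ...} rows.
    """
    rng = random.Random(seed)  # fixed seed (assumption) for reproducibility
    positives = [s for s in samples if s[1] == positive_label]
    genuine = [s for s in samples if s[1] == "genuine"]
    if len(positives) < per_class or len(genuine) < per_class:
        raise ValueError("not enough samples to build a balanced dataset")

    chosen = rng.sample(positives, per_class) + rng.sample(genuine, per_class)
    rng.shuffle(chosen)

    # Hypothetical cleaning step: collapse runs of whitespace only.
    return [
        {"input": " ".join(text.split()), "output": label}
        for text, label in chosen
    ]


def to_csv(rows):
    """Serialize the rows into a two-column CSV string (input, output)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["input", "output"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling `build_balanced_dataset(emails, "phishing")` on a larger pool of labelled emails would yield 200 rows with 100 entries per class, which `to_csv` then writes out in the two-column layout.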

The model is fine-tuned for 10 epochs with a batch size of 4 and a learning rate of 0.001. With 200 samples and 4 examples per batch, each epoch consists of 50 batches. The learning rate of 0.001 balances convergence speed and stability as the model adjusts its weights from the gradients computed during backpropagation. Ten epochs give the model enough passes over the data to capture the nuances of the phishing and scam call data, improving its ability to differentiate between genuine and malicious communications. In the testing phase, the model processes each entry in the testing dataset and generates a prediction based on the patterns learned during fine-tuning. These predictions are then compared against the actual labels.
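The comparison of predictions against labels in the testing phase could look something like the sketch below. The writeup does not say which metrics were computed, so accuracy, precision, and recall are assumptions here, as is the function name `evaluate`.

```python
def evaluate(predictions, labels, positive_label):
    """Compare predicted labels with the actual labels.

    `positive_label` is the malicious class ("phishing" or "scam").
    Returns accuracy, precision, and recall as floats in [0, 1].
    """
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")

    pairs = list(zip(predictions, labels))
    correct = sum(p == y for p, y in pairs)
    true_pos = sum(p == positive_label and y == positive_label for p, y in pairs)
    false_pos = sum(p == positive_label and y != positive_label for p, y in pairs)
    false_neg = sum(p != positive_label and y == positive_label for p, y in pairs)

    # Guard each ratio against an empty denominator.
    return {
        "accuracy": correct / len(pairs) if pairs else 0.0,
        "precision": true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0,
        "recall": true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0,
    }
```

Recall matters most for this use case, since a missed scam (a false negative) is usually more costly to the user than a false alarm.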

Integrating with UI:

We built a React and Flask application that interacts with the fine-tuned model to analyze phishing emails and scam calls. The React frontend serves as the user interface, allowing users to input text or upload audio for analysis; the input is then sent to the Flask backend for processing. Once the model generates a response, the Flask backend returns the prediction to the React frontend for display.
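A minimal sketch of what the Flask side of this flow might look like for the email check is shown below. The route path `/api/check-email`, the JSON field names `text` and `label`, and the `classify_email` function are all hypothetical; in particular, `classify_email` is a keyword-based stand-in so the example runs without model access, where the real backend would call the fine-tuned Gemini model.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def classify_email(text):
    """Stand-in for the fine-tuned model call (hypothetical).

    A crude keyword check is used here only so the sketch is runnable.
    """
    suspicious = ("verify your account", "urgent", "password")
    lowered = text.lower()
    return "phishing" if any(key in lowered for key in suspicious) else "genuine"


@app.route("/api/check-email", methods=["POST"])
def check_email():
    # The React frontend is assumed to POST JSON like {"text": "<email body>"}.
    payload = request.get_json(silent=True)
    text = payload.get("text") if isinstance(payload, dict) else None
    if not isinstance(text, str) or not text.strip():
        return jsonify({"error": "Field 'text' is required."}), 400

    # Return the prediction for the frontend to display.
    return jsonify({"label": classify_email(text)})
```

The app can be started with `flask run`, and the frontend would then send the email body to the endpoint and render the returned label. An audio endpoint would follow the same shape, with a transcription step before classification.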

Challenges we ran into

The Gemini model's character limits (40,000 for input and 5,000 for output) restricted us to only 200 samples for fine-tuning. This made it difficult to cover a diverse range of scams, as we could include only a limited number of examples, potentially hindering the model's ability to generalize.
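One way to check examples against these limits before fine-tuning is sketched below. It assumes the limits apply to each example individually, which the text above does not state explicitly; the constant and function names are illustrative.

```python
MAX_INPUT_CHARS = 40_000   # input limit quoted above
MAX_OUTPUT_CHARS = 5_000   # output limit quoted above


def fits_limits(example):
    """Return True if one {"input", "output"} example is within both limits."""
    return (
        len(example["input"]) <= MAX_INPUT_CHARS
        and len(example["output"]) <= MAX_OUTPUT_CHARS
    )


def split_by_limits(examples):
    """Separate examples into those that fit the limits and those that do not."""
    kept = [ex for ex in examples if fits_limits(ex)]
    dropped = [ex for ex in examples if not fits_limits(ex)]
    return kept, dropped
```

Reviewing the dropped examples, rather than silently discarding them, makes it easier to see whether long emails or long call transcripts are being excluded disproportionately.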

Gathering a dataset for scam calls was challenging due to privacy concerns. Many relevant datasets are not publicly available, as releasing this information could invade the privacy of users involved in such calls.

Future Scope

Looking ahead, one potential direction is to enhance the model’s capabilities by increasing the data size used for fine-tuning, which would provide a more robust foundation for recognizing a wider range of phishing attempts and scams. Additionally, integrating real-time capabilities with devices to alert users of potential scams or phishing attempts could significantly improve user protection.
