Step 1: Project Setup

  1. Environment Setup:

    • Install Python and the necessary libraries. You can use Anaconda or a virtual environment.
    • Install PyTorch: `pip install torch torchvision`
    • Install additional libraries: `pip install numpy pandas matplotlib scikit-learn`
  2. Prepare Your Workspace:

    • Create a new directory for your project. Inside this directory, create subdirectories for:
      • Data
      • Notebooks (for Jupyter notebooks, if you prefer)
      • Scripts (for your Python scripts)

Step 2: Data Collection

  1. Gather Network Traffic Data:

    • You need historical network traffic data. If you don’t have access to real data, you can simulate it using random data or find datasets online (e.g., Kaggle, UCI Machine Learning Repository).
    • Ensure your dataset contains time-stamped traffic volume data (e.g., packets per minute, bytes transferred).
  2. Data Format:

    • Your data should ideally be in CSV format, with columns for timestamps and traffic volumes:

      timestamp, traffic_volume
      2024-01-01 00:00:00, 150
      2024-01-01 00:01:00, 200
      ...
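If real data is hard to come by, the step above suggests simulating it. A minimal sketch of generating a synthetic dataset in the expected format (the daily cycle shape, noise level, and file path are arbitrary assumptions):

```python
import os
import numpy as np
import pandas as pd

# Purely synthetic data: a daily sinusoidal cycle plus Gaussian noise,
# sampled once per minute for a week. All numbers here are arbitrary.
rng = np.random.default_rng(42)
timestamps = pd.date_range('2024-01-01', periods=7 * 24 * 60, freq='min')
minutes = np.arange(len(timestamps))
daily_cycle = 150 + 75 * np.sin(2 * np.pi * minutes / (24 * 60))
noise = rng.normal(0, 15, len(timestamps))
traffic = np.clip(daily_cycle + noise, 0, None).round().astype(int)

df = pd.DataFrame({'timestamp': timestamps, 'traffic_volume': traffic})
os.makedirs('data', exist_ok=True)
df.to_csv('data/traffic_data.csv', index=False)
```

This produces the `data/traffic_data.csv` file that the preprocessing step below expects.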

Step 3: Data Preprocessing

1. Load the Data:
   ```python
   import pandas as pd

   # Load your dataset
   data = pd.read_csv('data/traffic_data.csv', parse_dates=['timestamp'])
   data.set_index('timestamp', inplace=True)
   ```

2. Visualize the Data:
   - Plot the traffic data to understand its trends.
   ```python
   import matplotlib.pyplot as plt

   plt.figure(figsize=(12, 6))
   plt.plot(data.index, data['traffic_volume'], label='Traffic Volume')
   plt.title('Network Traffic Over Time')
   plt.xlabel('Timestamp')
   plt.ylabel('Traffic Volume')
   plt.legend()
   plt.show()
   ```
3. Normalize the Data:
   - Normalize the traffic volume to help the model converge faster.
   ```python
   from sklearn.preprocessing import MinMaxScaler

   scaler = MinMaxScaler()
   data['traffic_volume'] = scaler.fit_transform(data[['traffic_volume']])
   ```


4. Create Sequences:
   - Create sequences for RNN input. For instance, if you want to predict the next hour based on the last 10 minutes:
   ```python
   import numpy as np

   def create_sequences(data, seq_length):
       sequences = []
       labels = []
       for i in range(len(data) - seq_length):
           seq = data[i:i + seq_length]
           label = data[i + seq_length]
           sequences.append(seq)
           labels.append(label)
       return np.array(sequences), np.array(labels)

   seq_length = 10  # For 10 time steps
   X, y = create_sequences(data['traffic_volume'].values, seq_length)
   ```

Step 4: Splitting the Data

  1. Split Data into Training and Test Sets:

     ```python
     train_size = int(len(X) * 0.8)  # 80% training data
     X_train, X_test = X[:train_size], X[train_size:]
     y_train, y_test = y[:train_size], y[train_size:]
     ```

  2. Reshape for RNN:

     • RNNs expect input in the form of (batch_size, sequence_length, features).

     ```python
     X_train = X_train.reshape((X_train.shape[0], X_train.shape[1], 1))  # Add feature dimension
     X_test = X_test.reshape((X_test.shape[0], X_test.shape[1], 1))
     ```

Step 5: Build the RNN Model

  1. Define the RNN Model:

     ```python
     import torch
     import torch.nn as nn

     class SimpleRNN(nn.Module):
         def __init__(self, input_size, hidden_size, output_size):
             super(SimpleRNN, self).__init__()
             self.rnn = nn.RNN(input_size, hidden_size, batch_first=True)
             self.fc = nn.Linear(hidden_size, output_size)

         def forward(self, x):
             out, _ = self.rnn(x)          # Outputs for all time steps
             out = self.fc(out[:, -1, :])  # Linear layer on the last time step
             return out

     # Initialize the model
     input_size = 1    # Number of features
     hidden_size = 64  # You can adjust this
     output_size = 1   # Predicting one value
     model = SimpleRNN(input_size, hidden_size, output_size)
     ```

Step 6: Train the Model

  1. Set Up Loss Function and Optimizer:

     ```python
     criterion = nn.MSELoss()  # Mean Squared Error for regression
     optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
     ```
  2. Training Loop:

     ```python
     num_epochs = 100  # Adjust as needed
     for epoch in range(num_epochs):
         model.train()
         inputs = torch.tensor(X_train, dtype=torch.float32)
         labels = torch.tensor(y_train, dtype=torch.float32).reshape(-1, 1)  # Match output shape (N, 1)

         optimizer.zero_grad()  # Clear previous gradients
         outputs = model(inputs)  # Forward pass
         loss = criterion(outputs, labels)  # Compute loss
         loss.backward()  # Backward pass
         optimizer.step()  # Update weights

         if (epoch + 1) % 10 == 0:  # Print every 10 epochs
             print(f'Epoch [{epoch + 1}/{num_epochs}], Loss: {loss.item():.4f}')
     ```

Step 7: Evaluate the Model

  1. Make Predictions:

    model.eval()
    with torch.no_grad():
       test_inputs = torch.tensor(X_test, dtype=torch.float32)
       predictions = model(test_inputs).numpy()
    
  2. Inverse Transform Predictions:

     • Convert predictions back to the original scale.

     ```python
     predictions = scaler.inverse_transform(predictions)
     ```
  3. Plot Results:

     ```python
     plt.figure(figsize=(12, 6))
     plt.plot(data.index[seq_length + train_size:], predictions, label='Predicted Traffic Volume')
     plt.plot(data.index[seq_length + train_size:], scaler.inverse_transform(y_test.reshape(-1, 1)), label='Actual Traffic Volume', alpha=0.5)
     plt.title('Traffic Volume Predictions')
     plt.xlabel('Timestamp')
     plt.ylabel('Traffic Volume')
     plt.legend()
     plt.show()
     ```
    

Step 8: Analyze Results

  1. Evaluate Model Performance:

     • Calculate metrics such as RMSE and MAE to evaluate how well your model is performing:

     ```python
     import numpy as np
     from sklearn.metrics import mean_squared_error, mean_absolute_error

     y_true = scaler.inverse_transform(y_test.reshape(-1, 1))
     rmse = np.sqrt(mean_squared_error(y_true, predictions))
     mae = mean_absolute_error(y_true, predictions)
     print(f'RMSE: {rmse:.4f}, MAE: {mae:.4f}')
     ```

  2. Interpret Predictions:

     • Analyze when the model predicts a spike in traffic and consider how you might adjust network resources in those cases.
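One simple way to operationalise that analysis is to flag time steps whose predicted volume exceeds a statistical threshold. A minimal sketch (the `k = 2.0` multiplier and the example values are arbitrary assumptions):

```python
import numpy as np

def flag_spikes(predictions, k=2.0):
    """Return indices where predicted traffic exceeds mean + k * std."""
    preds = np.asarray(predictions, dtype=float).ravel()
    threshold = preds.mean() + k * preds.std()
    return np.flatnonzero(preds > threshold), threshold

# Hypothetical de-normalized predictions with one clear spike at index 3
spike_idx, threshold = flag_spikes([100, 110, 105, 400, 95, 102])
print(spike_idx)  # [3]
```

The flagged indices can then be mapped back to timestamps via `data.index` to decide when extra network capacity might be needed.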

Step 9: Deployment (Optional)

  1. Save Your Model:

     ```python
     torch.save(model.state_dict(), 'rnn_model.pth')
     ```
  2. Create a Script for Predictions:

     • You can write a script to load the model and make predictions on new data in the future.
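Such a script only needs the model class, the saved weights, and the same preprocessing. A hedged sketch (the placeholder input is a stand-in for real normalized observations, and the load call is commented out so the snippet runs without a trained checkpoint):

```python
import numpy as np
import torch
import torch.nn as nn

# The architecture must be re-declared before loading the weights,
# because a state_dict file stores parameters only, not the class.
class SimpleRNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.rnn = nn.RNN(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        out, _ = self.rnn(x)
        return self.fc(out[:, -1, :])

model = SimpleRNN(input_size=1, hidden_size=64, output_size=1)
# model.load_state_dict(torch.load('rnn_model.pth'))  # uncomment once trained
model.eval()

# Shape the most recent (already normalized) observations as (batch, seq, features)
seq_length = 10
recent = np.random.rand(seq_length).astype(np.float32)  # placeholder input
x = torch.tensor(recent).reshape(1, seq_length, 1)
with torch.no_grad():
    pred = model(x)  # one predicted (normalized) value, shape (1, 1)
```

Remember to apply the same `MinMaxScaler` (and its `inverse_transform`) that was fit during training.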

Step 10: Presentation

  1. Prepare Your Findings:

    • Summarize the key insights from your analysis, including how well the model performed and any potential recommendations for network adjustments.
  2. Create Visual Aids:

    • Use charts and graphs from your analysis to create an engaging presentation.

Conclusion

This detailed step-by-step guide outlines everything you need to implement your RNN project for predicting network traffic based on historical data. Each step builds on the previous one, ensuring that you have a clear path to follow.

Inspiration

We chose to build a project related to artificial intelligence because it is now such an important part of society. Roughly 67% of people worldwide are connected to the internet, so improving internet security can impact many lives. It is also a good challenge, since we have been wanting to deepen our knowledge of machine learning. We therefore decided to build a product that supports telecommunications practices.

We also wanted to incorporate AI because it is a flexible technology that has not yet been widely applied in this area. AI has been used to automate customer service, but far less to maximise network connections.

SD-WAN

What it does

The AI model takes in a CSV file with a network's activity over a certain time period and reports whether that activity indicates a malicious attack. A user could input their network's data and, by anticipating malicious attacks, respond before serious consequences occur.

It can currently predict brute-force attacks.

How we built it

We trained the model in Python using a Jupyter notebook and connected it to an HTML, CSS, and JS frontend with Flask.
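As a rough illustration of that Flask wiring (not our actual code), a single endpoint could accept the CSV upload and return the model's verdict. Here `classify_traffic` is a stand-in for the trained model, and its threshold is an arbitrary assumption:

```python
import csv
import io

from flask import Flask, jsonify, request

app = Flask(__name__)

def classify_traffic(rows):
    """Stand-in for the trained model: flags the upload as malicious when
    mean traffic volume exceeds an arbitrary threshold."""
    volumes = [float(r['traffic_volume']) for r in rows]
    return sum(volumes) / len(volumes) > 250

@app.route('/predict', methods=['POST'])
def predict():
    # Expect a multipart upload with a 'file' field: a timestamp,traffic_volume CSV
    text = io.StringIO(request.files['file'].read().decode('utf-8'))
    rows = list(csv.DictReader(text))
    return jsonify({'malicious': classify_traffic(rows)})

# app.run(debug=True)  # uncomment to serve locally
```

The frontend then only needs a file-upload form that POSTs to `/predict` and renders the JSON response.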

Challenges we ran into

Accessing datasets: we had to switch from our first idea, predicting future network failures from past failures, because suitable datasets were not available.

We were also running short on time; we had to work as a team to debug our code after realising there was an error with the output.

Accomplishments that we're proud of

We successfully learnt to work with artificial intelligence and found a way to improve network connections.

What we learned

We have learnt from our mistakes to organise our time better, as we had trouble finishing within the time limit.

What's next for Shield Net

In the future, we would like Shield Net to generate more statistics to help analyse network connections, such as when the network crashes, at what time it usually crashes, and at what speed.
