The problem with our managed SOC (called Cloud Security Center) was that security alerts arrived through different dashboards and mailboxes. The information was fragmented, and we needed to consolidate and correlate it (e.g. which user used which workstation). Our SecOps department spent much of its time consolidating information from the different Security & Compliance products, and we needed to operate in a multi-tenant environment.
We needed to optimize our information gathering in a safe and secure environment that supports automation. The Protect solutions were implemented by our SecAdmins via the Security Baseline. The SecOps workload is to detect, investigate and respond to security alerts and to minimize the number of false positives (our security baseline helps with that).
What it does
We leveraged the Security Graph API to read the security alerts, first via a pull mechanism and, in the next version, via a push mechanism using notifications. We built the solution so that it can connect to any ticketing system through that system's API; in our case we used FreshService.
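To make the pull mechanism concrete, here is a minimal Python sketch, not our production code. It assumes the `security/alerts` endpoint of Microsoft Graph v1.0 and follows `@odata.nextLink` for paging. The `get_json` callable (which would perform an authenticated HTTP GET) and the ticket field names in `to_ticket` are hypothetical and do not reflect any particular ticketing system's API.

```python
from typing import Callable, Dict, Iterator, Optional
from urllib.parse import quote

GRAPH_ALERTS_URL = "https://graph.microsoft.com/v1.0/security/alerts"


def build_alerts_url(since_iso: str) -> str:
    """Build the alerts URL, filtered to alerts created at or after since_iso."""
    odata_filter = f"createdDateTime ge {since_iso}"
    return f"{GRAPH_ALERTS_URL}?$filter={quote(odata_filter)}"


def poll_alerts(get_json: Callable[[str], Dict], since_iso: str) -> Iterator[Dict]:
    """Yield every alert, following @odata.nextLink until the last page.

    get_json is a hypothetical helper: it takes a URL and returns the
    decoded JSON body of an authenticated GET request.
    """
    url: Optional[str] = build_alerts_url(since_iso)
    while url:
        page = get_json(url)
        yield from page.get("value", [])
        url = page.get("@odata.nextLink")


def to_ticket(alert: Dict) -> Dict:
    """Map an alert to a generic ticket payload (field names are illustrative)."""
    severity = alert.get("severity", "unknown")
    title = alert.get("title", "Security alert")
    return {
        "subject": f"[{severity}] {title}",
        "description": alert.get("description", ""),
        "external_id": alert["id"],
    }
```

Injecting `get_json` keeps the polling loop independent of the HTTP client and token handling, which makes it easy to exercise with canned pages.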
We read alerts from all security providers supported by the Graph API, such as Azure Security Center, Identity Protection Center, Windows Defender Advanced Threat Protection and Microsoft Cloud App Security, and present them along the user chain objects Identity, Devices, Apps & Data. The same approach applies to the Infrastructure solution.
The solution supports multi-tenancy, and we safely store the data used to enrich the tickets (e.g. classifying an object such as a user, device or data as belonging to C-level management).
To elaborate on that, we gather information about security from the following sources:
- Graph API - Security
  - Secure Score
  - Secure Score Control Profiles (for recommendations on raising the secure score)
- Graph API - Sign-In
  - Retrieve the last sign-ins of a user to determine whether authentication was performed with MFA enforced. If so, the security alert is closed automatically.
- Graph API - Directory Audit
  - Monitor users for strange behavior, especially sign-ins from an unfamiliar location.
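The MFA check on recent sign-ins could look roughly like the sketch below. It assumes sign-in records shaped like the Graph `signIn` resource, with a `status.errorCode` of 0 for success and an `authenticationRequirement` field. The closing policy itself (at least one successful sign-in, and every successful sign-in used MFA) is an assumption made for illustration, not a description of our exact rule.

```python
from typing import Dict, Iterable


def signed_in_with_mfa(sign_in: Dict) -> bool:
    """True when the sign-in record reports multi-factor authentication."""
    return sign_in.get("authenticationRequirement") == "multiFactorAuthentication"


def can_auto_close(alert_user: str, sign_ins: Iterable[Dict]) -> bool:
    """Decide whether an alert for alert_user may be closed automatically.

    Assumed policy: look only at the user's successful sign-ins; close when
    there is at least one and all of them were performed with MFA.
    """
    user = alert_user.lower()
    successful = [
        s for s in sign_ins
        if (s.get("userPrincipalName") or "").lower() == user
        and (s.get("status") or {}).get("errorCode") == 0
    ]
    return bool(successful) and all(signed_in_with_mfa(s) for s in successful)
```

Requiring at least one successful sign-in avoids closing an alert simply because no sign-in data was returned.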
In addition to the Graph API, we also use the Azure Resource REST API to gather:
- All Azure Resources
  - We get notified about resources that are not monitored by our SOC and are therefore a possible source of breaches. This is done by reading the tags on the Azure resources.
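The tag check can be illustrated with a short sketch. It assumes resources shaped like the items returned by the Azure Resource Manager resource list (each with an `id` and an optional `tags` object). The tag name `soc-monitored` and the expected value `true` are hypothetical; the actual tag convention is not described here.

```python
from typing import Dict, Iterable, List

# Hypothetical tag that marks a resource as covered by the SOC.
MONITORED_TAG = "soc-monitored"


def unmonitored_resources(resources: Iterable[Dict]) -> List[str]:
    """Return the ids of resources that are not tagged as SOC-monitored."""
    missing = []
    for resource in resources:
        tags = resource.get("tags") or {}
        # Azure treats tag names as case-insensitive, so compare them lowercased.
        # Lowercasing the value as well is a choice made for this sketch.
        normalized = {key.lower(): str(value).lower() for key, value in tags.items()}
        if normalized.get(MONITORED_TAG) != "true":
            missing.append(resource["id"])
    return missing
```

Resources without any `tags` object are reported too, since an untagged resource is exactly the case that should trigger a notification.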
How we built it
Using Azure Functions for specific serverless logic such as:
- Polling Security Alerts
- Receiving Notifications from Graph API
- Managing Graph API notification subscriptions
- Processing and transforming data
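The notification-related functions above can be sketched in Python as follows. Graph change notifications start with a validation handshake: the endpoint receives a `validationToken` query parameter and must echo it back as plain text. Later notifications carry the `clientState` that was set on the subscription. The function names, the return shape and the one-day subscription lifetime are assumptions for this sketch; the real functions would be bound to an HTTP trigger.

```python
import hmac
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional, Tuple


def build_subscription(notification_url: str, client_state: str, now: datetime,
                       lifetime: timedelta = timedelta(days=1)) -> Dict:
    """Build the body for POST /subscriptions (lifetime is an assumed value)."""
    expires = (now.astimezone(timezone.utc) + lifetime).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "changeType": "updated",
        "notificationUrl": notification_url,
        "resource": "security/alerts?$filter=status eq 'newAlert'",
        "expirationDateTime": expires,
        "clientState": client_state,
    }


def handle_notification(query: Dict[str, str], body: Optional[Dict],
                        expected_client_state: str) -> Tuple[int, str, List[str]]:
    """Return (HTTP status, response body, alert ids to process)."""
    # Validation handshake: echo the token back as the plain-text response.
    if "validationToken" in query:
        return 200, query["validationToken"], []

    alert_ids = []
    for item in (body or {}).get("value", []):
        received = (item.get("clientState") or "").encode("utf-8")
        # Ignore notifications that do not carry our shared secret.
        if not hmac.compare_digest(received, expected_client_state.encode("utf-8")):
            continue
        resource_id = (item.get("resourceData") or {}).get("id")
        if resource_id:
            alert_ids.append(resource_id)
    # Acknowledge quickly; the alerts themselves are fetched and processed later.
    return 202, "", alert_ids
```

Returning only the alert ids keeps the receiving function small; fetching and transforming the full alert can then happen in a separate step.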
Using Logic Apps:
- Receiving Alerts from Azure Functions
- Pushing updates of Alerts back into Azure
- Enriching alerts with data retrieved from the other above-mentioned APIs
- Posting a ticket to a third party cloud-based ticket system
Besides Functions and Logic Apps, we use CosmosDB to detect duplicate security alerts. We also use an Azure Data Lake to store information so that we can use AI to further analyze the security alerts.
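One way to detect duplicates is to derive a fingerprint from fields that overlapping providers are likely to share and to check it against previously seen fingerprints. The sketch below is an assumption about how this could work, not our exact matching rule: the chosen fields (`azureTenantId`, `category`, the users and hosts involved, and the creation date) are illustrative, and a plain Python set stands in for the CosmosDB container.

```python
import hashlib
from typing import Dict, Set


def alert_fingerprint(alert: Dict) -> str:
    """Hash the fields assumed to identify 'the same incident' across providers."""
    users = sorted((u.get("userPrincipalName") or "").lower()
                   for u in alert.get("userStates") or [])
    hosts = sorted((h.get("fqdn") or "").lower()
                   for h in alert.get("hostStates") or [])
    parts = [
        alert.get("azureTenantId") or "",
        (alert.get("category") or "").lower(),
        ",".join(users),
        ",".join(hosts),
        # Day bucket (YYYY-MM-DD); the window size is a tunable assumption.
        (alert.get("createdDateTime") or "")[:10],
    ]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


def is_duplicate(alert: Dict, seen: Set[str]) -> bool:
    """Return True if an equivalent alert was seen before; otherwise record it."""
    fingerprint = alert_fingerprint(alert)
    if fingerprint in seen:
        return True
    seen.add(fingerprint)
    return False
```

A fixed day bucket is simple but will miss duplicates that straddle midnight; a sliding time window would be a natural refinement.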
Challenges we ran into
- Not storing any sensitive data about users and resources that comes out of the Azure security products anywhere other than our ticket system.
- Making the solution unreachable from the outside (verified via pen-testing).
- Not storing any secrets in configs; only Azure Key Vault is used.
- Not every security provider exposed through the Graph API supports notifications; some can only be polled.
Accomplishments that we are proud of
- Making a highly scalable solution.
- Keeping monthly costs very low by not only pulling all the alerts at a set interval but also subscribing to notifications.
- Everything is stateless, making it easy to resend/retry part of the solution and also to update the production environment while it's live.
- Making data in CosmosDB available only to the tenant that the code is executing for, by using the tenant id as the partition key. Because the partition key is required, it is virtually impossible to retrieve data from a different tenant.
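The tenant isolation idea from the last point can be shown with a small in-memory stand-in. This is not the CosmosDB SDK; `TenantScopedStore` and its methods are hypothetical and only demonstrate the principle that every read and write must name a tenant id, which plays the role of the required partition key.

```python
from typing import Dict, List


class TenantScopedStore:
    """In-memory stand-in for a container partitioned by tenant id."""

    def __init__(self) -> None:
        # One dictionary of items per tenant (the "partition").
        self._partitions: Dict[str, Dict[str, Dict]] = {}

    @staticmethod
    def _require(tenant_id: str) -> None:
        # Refuse any operation that does not name a tenant.
        if not tenant_id:
            raise ValueError("tenant_id (partition key) is required")

    def upsert(self, tenant_id: str, item: Dict) -> None:
        """Insert or replace an item inside the given tenant's partition."""
        self._require(tenant_id)
        stored = dict(item, tenantId=tenant_id)
        self._partitions.setdefault(tenant_id, {})[item["id"]] = stored

    def query(self, tenant_id: str) -> List[Dict]:
        """Return only the items that belong to the given tenant."""
        self._require(tenant_id)
        return list(self._partitions.get(tenant_id, {}).values())
```

Because there is no method that reads across partitions, code running for one tenant has no way to ask for another tenant's data by accident.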
What we learned
The learning curve from day one in a Cloud world was the use of the Security Graph API itself (version control) and the output of the different providers: some providers give more in-depth information and some give only basic information. Second, we needed to find a solution for duplicates across providers, e.g. Azure AD Identity Protection overlaps with Microsoft Cloud App Security (Threats).
What's next for Cloud Security Operations Center
- Adding AI to correlate more data (sources) and use intelligence to reduce the number of false positives
- Integrating SOC services like vulnerability scanning and new(s) malware reports for our customer (CISO) to give them peace of mind