The use of stolen credentials, obtained via phishing or other means, is becoming one of the greatest challenges security professionals face. Based on the notion that people's behaviour forms predictable patterns, I thought it must be possible to "profile" users' successful authentications and therefore detect the use of misappropriated credentials. Like all good ideas, the mechanism of this correlation search came to me as I was trying to get to sleep one night and was consequently developed on the back of an envelope.
Regarding the name of this entry: at the time of submission to this competition, I needed to quickly come up with a name that succinctly describes this correlation search, but "Stolen Credentials" or similar struck me as too mundane. "The Third Man" (an obvious movie reference) sounds better, and initially was just an allusion to a bad actor. However, after more thought, if we consider the first man to be the administrator, the second man to be the user, and the third man to be the bad actor, I think the analogy works quite well.
How it works
This correlation search takes the CIM Authentication data model and enriches it with autonomous system information and an abstraction of time, then creates a statistical "fingerprint" of each user's behaviour in relation to what, when, where and how they successfully authenticate. A significant deviation from a user's pattern triggers the alert. Although this sounds relatively straightforward, this correlation search's ability to detect anomalous behaviour derives from its unique high-level abstraction of circumstances.
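To make the "abstraction of time" idea concrete, here is a minimal Python sketch of one possible way to do it. The bucketing scheme (weekday or weekend, plus a six-hour block) and the function name `time_bucket` are my own assumptions for illustration and are not necessarily what the correlation search itself uses.

```python
from datetime import datetime

# Hypothetical names for four six-hour blocks of the day (an assumption).
BLOCK_NAMES = ["night", "morning", "afternoon", "evening"]

def time_bucket(ts: datetime) -> str:
    """Collapse an exact timestamp into a coarse, repeatable label."""
    # weekday() returns 0 for Monday through 6 for Sunday.
    day_type = "weekend" if ts.weekday() >= 5 else "weekday"
    # Integer-divide the hour into blocks: 0-5, 6-11, 12-17, 18-23.
    block = BLOCK_NAMES[ts.hour // 6]
    return f"{day_type}-{block}"
```

Coarse buckets like these mean that logging in at 09:05 one day and 09:40 the next counts as the same behaviour, which is what lets a pattern emerge from a modest amount of history.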
Most alerts of a similar nature use geographical lookups (known for being less than reliable). This alert instead uses a logical network topology lookup to determine where a user authenticates from in relation to the structure of the Internet. This is significant because a typical user accesses the Internet from half a dozen or fewer Autonomous Systems (akin to postcodes at a global scale) in their day-to-day activities. By using statistics to model the likelihood of a user authenticating from a particular part of the Internet in relation to the other aspects of the tuple (what resource is being accessed, when it is being accessed, and how it is being accessed), we can determine how confident we can be that the credentials being used and the person they represent are one and the same.
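The tuple-based reasoning above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses simple relative frequency as the statistic, and the function names and the 0.05 threshold are hypothetical choices of mine, not the actual statistics used by the search.

```python
from collections import Counter, defaultdict

def build_fingerprints(events):
    """Count each (what, when, where, how) tuple per user.

    events: iterable of (user, what, when, where, how), where 'where'
    is an AS number and 'when' is an abstracted time bucket.
    """
    counts = defaultdict(Counter)
    for user, what, when, where, how in events:
        counts[user][(what, when, where, how)] += 1
    return counts

def likelihood(fingerprints, user, observed):
    """Relative frequency of the observed tuple in the user's history."""
    seen = fingerprints.get(user)
    if not seen:
        return None  # no history, so no basis for a judgement
    return seen[observed] / sum(seen.values())

def is_anomalous(fingerprints, user, observed, threshold=0.05):
    """Flag tuples the user has rarely or never produced (threshold assumed)."""
    p = likelihood(fingerprints, user, observed)
    return p is not None and p < threshold
```

For example, a user whose history is almost entirely ("vpn", "weekday-morning", 64512, "password") would score close to 1.0 for that tuple, while the same credentials appearing from a never-before-seen Autonomous System would score 0.0 and be flagged.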
Challenges I ran into
- The out-of-the-box Autonomous System lookup (asn_by_cidr) in ES doesn't work because the ip field isn't in CIDR format. It took a great deal of time and energy to produce a CIDR-based ASN lookup.
- Other than my own honeypot's open dataset, I couldn't find any open authentication dataset, so I created my own using SA-eventgen.
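To illustrate what a CIDR-based ASN lookup has to do, here is a minimal Python sketch using the standard library `ipaddress` module. It is not the Splunk lookup itself. The table below is made-up sample data that uses documentation address ranges and private-use AS numbers, and the helper names are hypothetical.

```python
import ipaddress

# Sample data only: documentation prefixes mapped to private-use ASNs.
ASN_TABLE = [
    ("192.0.2.0/24", 64512),
    ("198.51.100.0/24", 64513),
    ("198.51.100.128/25", 64514),  # more specific prefix inside the /24
]

def build_index(rows):
    """Parse CIDR strings and order them most-specific first."""
    index = [(ipaddress.ip_network(cidr), asn) for cidr, asn in rows]
    index.sort(key=lambda item: item[0].prefixlen, reverse=True)
    return index

def lookup_asn(index, ip):
    """Return the ASN of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(ip)
    for network, asn in index:
        # Compare only like with like (IPv4 vs IPv6) before testing membership.
        if addr.version == network.version and addr in network:
            return asn
    return None

index = build_index(ASN_TABLE)
```

Sorting by prefix length means the first hit is the most specific one, so `lookup_asn(index, "198.51.100.200")` resolves to the /25 rather than the enclosing /24. A linear scan is fine for a sketch, although a real table with hundreds of thousands of prefixes would want a trie or similar structure.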
What's next for Third Man Correlation Search
- IPv6 support
- The ASN lookup needs a long-term solution, and I'll work with Splunk to look at how this can be achieved