What is metadata? According to Wikipedia, it is defined as "data that provides information about other data."
A practical example: if you make a phone call to someone, the call itself is a piece of data. The time, your location, the duration of the call, topics discussed, and so on are the metadata about the call.
As John Oliver put it: if you called your ex 12 times last night from outside a bar between 1 a.m. and 4 a.m., and each call lasted 15 minutes, someone looking only at this metadata could be fairly certain you left some pretty pathetic voicemails.
Metadata provides intelligence. If you have seen the movie Snowden, you will have heard the term metadata a lot. That is because the NSA and other intelligence agencies have for years drawn on many different data sources to improve their ability to interpret data. Phone calls, web traffic, travel, credit card data, text messages, contacts, and geolocation have all been used to gain additional context.
Metadata & Proofpoint
Proofpoint ITM collects metadata to provide intelligence on insider threats. When a user logs into a system, the Proofpoint agent collects the timestamp and duration of the session, the login account, the system name, the remote endpoint the user connected from, the total number of slides, and a link to the session video.
In the above image, clicking the “+” button lets the Security and IT teams expand the user session they want to evaluate. They can then see each user action that took place (application names, window titles, URLs, website names, process names, USB insertions, file copies, print jobs, and commands), as seen below:
Why is the Proofpoint metadata valuable?
Simply put, it provides context about user actions rather than system or machine data.
When a user types on the keyboard or clicks the mouse, the Proofpoint agent collects the types of metadata outlined above behind the scenes.
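To make the shape of this metadata concrete, here is a minimal Python sketch of how a session and its user actions might be modeled. The class names, field names, and sample values are all hypothetical, chosen only to mirror the fields listed above; this is not Proofpoint's actual schema or API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List


@dataclass
class UserAction:
    """One recorded user action (hypothetical schema, not Proofpoint's)."""
    application: str
    window_title: str
    url: str = ""


@dataclass
class SessionRecord:
    """Session-level metadata mirroring the fields listed in the text."""
    login_account: str
    system_name: str
    remote_endpoint: str
    start: datetime
    duration: timedelta
    slide_count: int
    video_link: str
    actions: List[UserAction] = field(default_factory=list)

    def summary(self) -> str:
        # One-line, human-readable view of the session.
        return (f"{self.login_account} on {self.system_name} "
                f"from {self.remote_endpoint}: {len(self.actions)} actions, "
                f"{self.slide_count} slides")


# All values below are made up for illustration.
session = SessionRecord(
    login_account="kdonovan",
    system_name="LAB-WS01",
    remote_endpoint="10.0.0.5",
    start=datetime(2020, 1, 1, 23, 16),
    duration=timedelta(minutes=12),
    slide_count=37,
    video_link="https://itm.example.invalid/sessions/1",
    actions=[
        UserAction("Google Chrome", "Dropbox", "https://www.dropbox.com"),
        UserAction("File Explorer", "Documents"),
    ],
)

print(session.summary())
```

The point of the sketch is that each record is already phrased in terms of a person and what they did, so no correlation across log sources is needed to read it.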
To add more context, a common form of insider threat is data exfiltration.
Let’s pop over to the Proofpoint lab and explore the scenario of a user exfiltrating data via Dropbox™. We will compare the data that Proofpoint collects vs network/system data.
Here is the Proofpoint metadata trail of a user copying a file to Dropbox:
It doesn’t take a technical wizard or a data scientist to work out what this user did in this session. We see Kevin Donovan logging in at 11:16 PM, launching Google Chrome, and copying a file (“SensitiveDoc.doc”) to Dropbox.
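Because the trail is expressed as user actions, a detection rule over it can be very simple. The following Python sketch scans an ordered list of actions and flags file copies whose destination looks like cloud storage. The `Action` type, the `kind` labels, and the `CLOUD_STORAGE_HINTS` list are assumptions made for illustration, not part of any Proofpoint product.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical list of destination keywords treated as cloud storage.
CLOUD_STORAGE_HINTS = ("dropbox",)


@dataclass
class Action:
    """One step in a user activity trail (illustrative schema)."""
    user: str
    kind: str              # e.g. "login", "app_launch", "file_copy"
    detail: str            # login time, application name, or file name
    destination: str = ""  # only meaningful for file copies


def find_cloud_copies(trail: List[Action]) -> List[Action]:
    """Return file-copy actions whose destination matches a cloud hint."""
    return [
        action for action in trail
        if action.kind == "file_copy"
        and any(hint in action.destination.lower()
                for hint in CLOUD_STORAGE_HINTS)
    ]


# The scenario described in the text, written as a trail.
trail = [
    Action("Kevin Donovan", "login", "11:16 PM"),
    Action("Kevin Donovan", "app_launch", "Google Chrome"),
    Action("Kevin Donovan", "file_copy", "SensitiveDoc.doc", "Dropbox"),
]

flagged = find_cloud_copies(trail)
for action in flagged:
    print(f"{action.user} copied {action.detail} to {action.destination}")
```

A real deployment would need a far richer rule set; the sketch only shows why user-level metadata needs little interpretation before it can be acted on.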
Here is what the same scenario would look like in Wireshark (network tool):
Here is what this would look like in Splunk (SIEM tool):
Wireshark captures packets at the network layer, and Splunk tries to make sense of many different log sources, whereas Proofpoint focuses on the endpoint and, specifically, on user data.
Leveraging User Data to Combat Insider Threat
User data is one of the most important sources of data to leverage when it comes to combating insider threat. Two well-respected industry sources, the CERT Division of the Software Engineering Institute at Carnegie Mellon University and the National Insider Threat Task Force (NITTF), provide commentary on the importance of monitoring user data:
“UAM [User Activity Monitoring] refers to the technical capability to observe and record the actions and activities of an individual, at any time, on any device accessing…information [used] to detect insider threats and support authorised investigations” - NITTF Guide
“Oftentimes UAM serves as the starting point and core of an insider threat analysis hub.” - CERT
With capabilities like rich user-specific metadata, Proofpoint is ushering in a new era of insider threat detection & prevention.