AlienVault® USM Anywhere™

USM Anywhere Log Data Enhancement

When evaluating threats to your systems, the more complete the initial picture of an incident is, the more accurate and efficient USM Anywhere can be in identifying and responding to those threats.

Log data is one of the key sources of this threat data context. Log messages can tell us a tremendous amount about network events. Every network connection, authentication request, file transfer, and privilege escalation, as well as many other events that happen on the network or in the cloud, generates a log message.

However, many messages were never intended to be used for security purposes. There are no official standards for log contents (although there are best practices), so logs are often inconsistent. Two messages from the same product or vendor may carry completely different information, and most vendors have different logging standards and terminology. Even when a vendor has invested significantly in log consistency and completeness, most log messages still lack valuable context.

For example, look at a typical log message generated by an authentication event:

	"outcome" : "Allow",
	"type" : "Authentication",
	"source" : "",
	"destination" : "",
	"time" : "2018-10-17T19:03:26+00:00"

This message is rather brief. Let’s take a look at what needs to be improved for security purposes.

Data Normalization

The first step USM Anywhere takes when it digests your system logs is to normalize them so that all incoming data uses the same terminology. In this context, normalization means mapping incoming data to a standard terminology. For example, one vendor may use the term "outcome" and another "result" to describe the success or failure of an authentication attempt. USM Anywhere normalizes these two different attributes, replacing them with a single, standard term. Likewise, fields such as "source," "source_ip," "client," and "client_ip" all need to be mapped to the same term so that events from different vendors can be used for correlation and alarm generation.

Below is an example of how normalization works. Note that USM Anywhere preserves the original log message as a best practice, in case you need to share it with a vendor or need to refer to the original alert. This means that the normalization phase of message processing will likely increase the size of the log message by around 100%.

	"log" : "{ \"outcome\" : \"Allow\",
	           \"type\" : \"Authentication\",
	           \"source\" : \"\",
	           \"destination\" : \"\" }",
	"source_address" : "",
	"destination_address" : "",
	"event_outcome" : "ALLOW",
	"event_name" : "Authentication",
	"timestamp_occured" : "2018-10-17T19:03:26+00:00"
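
As an illustration, the normalization step can be sketched as a field-name mapping. The mapping table and function below are hypothetical, not USM Anywhere's actual schema or implementation, and they rename fields only; value normalization (for example "Allow" becoming "ALLOW") would be a further step:

```python
import json

# Hypothetical vendor-field -> normalized-field mapping; the real
# product's schema is far larger and not reproduced here.
FIELD_MAP = {
    "outcome": "event_outcome",
    "result": "event_outcome",
    "source": "source_address",
    "source_ip": "source_address",
    "client": "source_address",
    "client_ip": "source_address",
    "destination": "destination_address",
    "type": "event_name",
    "time": "timestamp_occured",
}

def normalize(raw_event):
    """Map vendor field names to standard terms, preserving the
    original message under the "log" key as a best practice."""
    normalized = {"log": json.dumps(raw_event)}
    for key, value in raw_event.items():
        normalized[FIELD_MAP.get(key, key)] = value
    return normalized

event = {"outcome": "Allow", "type": "Authentication",
         "source": "", "destination": ""}
print(normalize(event)["event_outcome"])  # Allow
```

Because the original message is carried along verbatim under "log", even this minimal sketch roughly doubles the message size, matching the ~100% growth noted above.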

Data Enrichment

Normalization allows us to analyze all of the log messages USM Anywhere receives. Given the incomplete nature of so many log messages, it also makes sense to use this same process to add valuable information to the log that helps improve incident detection.

That’s the idea behind data enrichment. The USM Anywhere infrastructure has a large amount of contextual data about the network and systems that it can attach to the log message to fill in the gaps and enhance threat detection. It can also leverage many databases of information such as the location of certain IP addresses, device types, and known threats.

Let’s look at some examples of how this is done to improve incident detection.

Device Identity

The majority of servers rely on DHCP for dynamic IP address allocation. From a security point of view, this means that identifying and containing threats is much more difficult. By the time a system is identified as compromised, it may be on the network in a completely different place with a completely different IP address. To address that problem, USM Anywhere will use the network context it has to collect and include the MAC address and/or a unique identifier for the system, depending on which are known:

"source_asset_id" : "f8ebb373-b551-43d0-a628-a00771b5d0c1",
"source_mac" : "98:01:A7:B4:D8:47",	
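
A minimal sketch of this identity enrichment, assuming a hypothetical in-memory asset inventory keyed by current IP address (a real deployment would track DHCP leases as addresses are reassigned):

```python
# Hypothetical asset inventory; the IP key and structure are
# illustrative, not USM Anywhere's actual data model.
ASSET_INVENTORY = {
    "10.0.1.23": {
        "asset_id": "f8ebb373-b551-43d0-a628-a00771b5d0c1",
        "mac": "98:01:A7:B4:D8:47",
    },
}

def enrich_identity(event):
    """Attach a stable asset ID and MAC address so the event stays
    attributable even after DHCP reassigns the IP address."""
    asset = ASSET_INVENTORY.get(event.get("source_address", ""))
    if asset:
        event["source_asset_id"] = asset["asset_id"]
        event["source_mac"] = asset["mac"]
    return event

e = enrich_identity({"source_address": "10.0.1.23"})
print(e["source_mac"])  # 98:01:A7:B4:D8:47
```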

Geolocation

Knowing where your network connections are terminating is important when deciding whether traffic should be permitted, blocked, or more carefully monitored. Geolocation can play a role in deciding if a given incident is worthy of more attention. Therefore, USM Anywhere augments logs with geolocation information for the source and destination. In the example below, this data quickly allows an operator to determine that this particular destination is probably not an issue:

"destination_address" : "",
"destination_name" : "AD Server",
"destination_asset_id" : "8cdf98a1-533d-9ec2-b5bc-3424caecef15",
"destination_organisation" : "Microsoft Azure",
"destination_city" : "Redmond",
"destination_fqdn" : "",
"destination_hostname" : "ad",
"destination_latitude" : "47.6801",
"destination_longitude" : "-122.1206",
"destination_region" : "WA",
"destination_country" : "US",
"destination_country_registered" : "US",
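
Geolocation enrichment can be sketched the same way. The lookup table below is hypothetical (the address is from the reserved documentation range 198.51.100.0/24); a production system would query a maintained GeoIP database instead:

```python
# Hypothetical GeoIP lookup table, keyed by exact IP for simplicity;
# real GeoIP databases are indexed by network prefix.
GEO_DB = {
    "198.51.100.7": {"city": "Redmond", "region": "WA", "country": "US",
                     "latitude": "47.6801", "longitude": "-122.1206"},
}

def enrich_geo(event):
    """Attach destination_* geolocation fields when the destination
    address is found in the GeoIP table."""
    geo = GEO_DB.get(event.get("destination_address", ""))
    if geo:
        for field, value in geo.items():
            event["destination_" + field] = value
    return event

e = enrich_geo({"destination_address": "198.51.100.7"})
print(e["destination_city"])  # Redmond
```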

Collection Details and Flags

USM Anywhere also includes additional information about how the log message was acquired and how it was processed. We include this information to give the security analyst and correlation algorithms insight into the source of the log, when it was received by a sensor, and how it was processed. For example, was_fuzzied = true means the log message came from a source for which USM Anywhere has no specialized plug-in, so some fields may not have been normalized. If such a log is key to an investigation, the operator should review the original log message to ensure nothing was overlooked.

"timestamp_received" : "1518626634857",
"was_fuzzied" : false,
"sensor_name" : "3aee1c56-5736-463c-a022-c3ac005ca82e",
"plugin" : "Azure Authentication",
"plugin_device_type" : "Cloud Platform",	
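
An analyst workflow could use this flag to route events for extra scrutiny. A small sketch, using the metadata fields shown above:

```python
def needs_manual_review(event):
    """True when the source had no specialized plug-in, meaning some
    fields may not have been normalized and the raw log deserves a look."""
    return bool(event.get("was_fuzzied", False))

meta = {
    "timestamp_received": "1518626634857",
    "was_fuzzied": False,
    "sensor_name": "3aee1c56-5736-463c-a022-c3ac005ca82e",
    "plugin": "Azure Authentication",
    "plugin_device_type": "Cloud Platform",
}
print(needs_manual_review(meta))  # False
```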

Impact on Log Storage

Because USM Anywhere adds data to log messages, the size of the original log message inevitably grows. Very sparse messages can grow by as much as 1,860%. The messages themselves are still small in the grand scheme of things, typically growing from less than 250 bytes to as much as 2.6 KB. Still small, but these clearly add up over time. The good news is that the amount of metadata added is stable: it doesn't grow or shrink much across different event classes, so with careful planning, storage use remains quite predictable. For larger events (for example, events coming from Network IDS and AWS), the percentage drops significantly since the messages start out quite large. However, for small events such as the one in our example, enrichment can have a noticeable impact on the total amount of data stored.

Here are some syslog- and AWS-heavy data points for planning purposes:

Syslog-heavy deployment

From a sample size of 599,979 events

  • Total size including enriched data in bytes: 1,612,790,164
  • Total size of just log data in bytes: 145,781,057
  • Average log size in bytes: 243
  • Average log size with enriched data: 2,688
  • Increase in size: 1106%

AWS-heavy deployment

From a sample size of 500,000 events

  • Total size including enriched data in bytes: 1,934,740,282
  • Total size of just log data in bytes: 711,502,141
  • Average log size in bytes: 1,423
  • Average log size with enriched data: 3,868
  • Increase in size: 272%
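
The "Increase in size" figures above express the enriched size as a percentage of the raw size, which can be checked with simple arithmetic:

```python
def size_increase_pct(avg_raw_bytes, avg_enriched_bytes):
    """Enriched size as a percentage of raw size, rounded to
    the nearest whole percent."""
    return round(avg_enriched_bytes / avg_raw_bytes * 100)

print(size_increase_pct(243, 2688))   # syslog-heavy: 1106
print(size_increase_pct(1423, 3868))  # AWS-heavy: 272
```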

What Happens When We Hit Our Tier Limit?

If you find yourself running into problems with inadequate storage space, your first step should be to review your logging strategy with AT&T Cybersecurity Technical Support or your service provider. It may be that you don’t need to send as many logs as you are. However, we always prefer to err on the side of logging too much rather than too little, since lost logs cannot be recovered and security investigations can lead in unexpected directions.

When approaching your monthly storage limit in USM Anywhere, you have two choices: just rely on transient mode, or actively prune your consumption with Event Filters.

Transient Mode

USM Anywhere calculates how much space you have consumed so far in the month and projects how much you will consume by month's end. If projected consumption exceeds the monthly capacity, transient mode is turned on automatically. When transient mode is turned on, it's important to understand the following:

  • All events are still stored within cold storage. Transient mode does not affect cold storage.
  • All events are still correlated. You will not miss any alarms being generated because of transient mode.
  • Alarms, Vulnerabilities, and Configuration Issues will still be generated and stored. You will not miss any security issues because of transient mode.
  • All events are dropped before persisting into hot storage. So while you will see alarms, you will not see the events that generated them. Additionally, you will not be able to see new events within your events view.
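
The projection behavior can be illustrated with a simple linear extrapolation. This formula is an assumption for illustration only; the document does not specify USM Anywhere's actual calculation:

```python
def projected_monthly_usage(used_gb, day_of_month, days_in_month):
    """Linear projection of month-end consumption from usage so far
    (illustrative; not the product's actual algorithm)."""
    return used_gb / day_of_month * days_in_month

def transient_mode_active(used_gb, day, days, monthly_capacity_gb):
    """Transient mode engages when projected usage exceeds capacity."""
    return projected_monthly_usage(used_gb, day, days) > monthly_capacity_gb

# 600 GB consumed by day 15 of a 30-day month projects to 1200 GB,
# exceeding a 1000 GB tier.
print(transient_mode_active(600, 15, 30, 1000))  # True
```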

Event Filtering

If you want to be proactive with your data consumption, consider reducing the amount of data stored by using filters. Event filtering drops events before they enter correlation and persistence, and therefore before they consume any of the monthly storage allotment. Filtering lets you define a set of rules on event fields; events that match a rule are dropped. This makes it easy to keep specific types of events from entering the system. When using filtering, it's important to understand the impact:

  • Filtered Events will not be stored within cold storage.
  • Filtered Events will not be correlated. Alarms will not be generated off filtered Events.
  • Filtered Events will be dropped from going into hot storage. You will not see them within your Events view.

When using filters, make sure you define the criteria for dropping Events precisely. If a filter rule is too broad, you may drop Events that you are interested in keeping.
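
A sketch of field-matching filter rules, using the normalized fields from the earlier examples (the rule format here is hypothetical, not USM Anywhere's actual filter syntax):

```python
# Hypothetical filter rules: an event is dropped when every field
# in some rule matches the event's value exactly.
FILTER_RULES = [
    {"event_name": "Authentication", "event_outcome": "ALLOW",
     "destination_name": "AD Server"},
]

def is_filtered(event):
    """True if any rule fully matches; matched events are dropped
    before correlation and before they reach storage."""
    return any(all(event.get(field) == value for field, value in rule.items())
               for rule in FILTER_RULES)

# A narrow rule drops only successful logins to this one server;
# a rule matching on event_name alone would also drop failed logins
# you likely want to keep.
print(is_filtered({"event_name": "Authentication",
                   "event_outcome": "ALLOW",
                   "destination_name": "AD Server"}))  # True
```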

Is There Any Way of Freeing Space?

If you are in transient mode and wish to free up space to allow for more Events, you can purge the last 7 days of Events. This will only affect Events and will not purge any Alarms that were generated by them. Additionally, the purge does not affect any Events that are in cold storage. This will remove those Events from hot storage and you will not see them within your Events view.

Compliance Considerations for Filtering and Purging

It is important to remember that most security compliance regimes require retention of 90 days of logs, so purging logs may put you in violation of your compliance obligations. It is also important to understand whether filtering out data has compliance implications. If a filter would restrict the amount of data logged by a system subject to PCI or similar requirements, check with your compliance team first.