The case for a common taxonomy for the description of malicious behavior

June 26, 2015  |  Russ Spitler

The task of defending our environments from attack is made more difficult by the lack of a common taxonomy for describing the malicious behavior we observe. Each security control we deploy describes the threats it can detect in a different manner, and each provides little insight into the nature of the behavior being reported. This lack of consistency makes it difficult to understand the potential impact of a report and difficult to relate incidents reported by one control to those reported by another.

The Problem

The ever-changing landscape of threats is hard enough to keep track of, but to complicate matters each security researcher and vendor describes the latest threats in their own terms. These terms are often quite complicated, and understanding them requires substantial knowledge not only of the methods hackers employ but also of the way the particular research team providing the report describes those methods. This makes the task of the typical analyst far harder when they are trying to understand exactly what is going on in their environment. To fully illustrate this problem, let’s take a look at a few examples.

Snort Free Ruleset:

The Snort free ruleset is a set of signatures for the Snort IDS engine provided by the Sourcefire VRT research team. This is one of the most widely distributed sets of IDS rules.

FILE-PDF PDF with large embedded JavaScript - JS string attempt

This alert is reporting that someone is attacking one of the users in the environment by trying to deliver an exploit embedded in a PDF. Even someone who only reads the news would know that PDFs are a common attack vector; a junior security analyst would know that hackers often use the JavaScript engine embedded in PDF viewers as a way of executing a malicious payload.

EXPLOIT-KIT Cool Exploit kit java payload detection

This alert is describing an observed attack launched from an exploit kit. Again, a junior security analyst may know that exploit kits are subscription-based platforms that provide hackers with exploits for the latest discovered vulnerabilities. In this case the payload is targeting a known vulnerability in Java. Use of an exploit kit does not affect the potential impact of the attack, but it does provide some understanding of the nature of the vulnerability being targeted (a client running a web browser) and potentially some insight into the nature of the attacker (who bought ‘off the shelf’ malicious software).

BROWSER-OTHER Novell Messenger Client nim URI handler buffer overflow attempt

This alert is reporting an attack attempting to exploit a component of the Novell Messenger client. This is a little easier to decipher: a buffer overflow is a very traditional exploit technique that attempts to execute code on the target machine. This particular exploit is targeting a known vulnerability in a program that is used on a user’s desktop.

FILE-IDENTIFY Microsoft Office Access file magic detected

This alert is reporting that a file being transferred was identified as a Microsoft Office Access file. The file was identified by inspecting the few bytes at the beginning of the file that encode the file type, called ‘file magic.’

McAfee AntiVirus

McAfee AntiVirus is one of the most popular antivirus programs. The alerts generated by this program are updated daily by the McAfee security research team. The naming of these alerts is particularly obtuse, which is ironic considering these alerts are most often displayed to a completely naïve end user.


Adware-HotBar.f!886F6F2A1226

This alert is indicating that a Trojan[1] that gathers information about browsing habits and displays unwanted advertisements has been detected on the machine. The string of numbers is the beginning of the unique file identifier for the Trojan. To complicate matters even more, other antivirus vendors refer to this particular Trojan as:

  • Adware/Win32.Hotbar
  • Win32:HotBar-BL
  • Generic_r.EZ (Adware)
  • ADSPY/AdSpy.Gen2
  • Trojan.Generic.7444697
  • PUA.Win32.Packer.Upx-53
  • W32/HotBar.L.gen!Eldorado

AlienVault Labs Correlation Rules for OSSIM

OSSIM is an open-source SIEM platform that comes with a number of security controls embedded in a common framework. AlienVault Labs is the security research team that produces correlation rules for the OSSIM platform as well as for the commercial alternative made by AlienVault.

Service attack, successful denial of service against Microsoft SSL server DST_IP (MS04-011)

This alert is indicating that an attacker exploited a known vulnerability (MS04-011) and successfully performed a denial of service attack.

Attack, file /etc/passwd access on DST_IP

This alert indicates that an attacker attempted to access the file that stores user account information on Unix-based systems.

Malware, Spyware Hotbar detected on SRC_IP

This alert indicates that an infection of the same Trojan described by the McAfee alert examined above (Adware-HotBar.f!886F6F2A1226) was detected on the machine identified by ‘SRC_IP.’

While the intention of these alerts can be determined with some domain expertise, it is a task that requires substantial cognitive engagement. The unfortunate nature of the world we live in today is that these alerts often come dozens at a time. Even an expert user, spending a minimal amount of time interpreting the intention of each alert, can quickly be overwhelmed. Let us examine the issues in the current naming conventions.

Lack of Context

Alerts are descriptions of behaviors in the environment they relate to. That behavior may be the result of a user’s action, the automated actions of software installed in the environment, or malicious activity. The examples above do not provide context as to the nature of the behavior being described. For example:

  • Adware-HotBar.f!886F6F2A1226
  • FILE-IDENTIFY Microsoft Office Access file magic detected
  • FILE-PDF PDF with large embedded JavaScript - JS string attempt

These all describe different types of behavior: an infection by a Trojan, the transfer of a simple file, and an attack. However, this cannot easily be determined without domain knowledge.

Colloquial Naming

Vendors have taken a colloquial approach to the naming of their alerts. While this approach may seem more user friendly, ultimately it makes efficient comprehension of the alert more difficult.

  • Attack, file /etc/passwd access on DST_IP
  • Service attack, successful denial of service against Microsoft SSL server DST_IP (MS04-011)

In order to understand the alert, the end user must read the entire name and then process it to work out what is being described. The use of natural language means that, when reading two alerts, the same piece of information can appear in drastically different places, and comprehension requires full cognitive engagement.

NOTE: The McAfee alert names examined above take a very different form, but the critique still applies; they sit at the other extreme, being perhaps a little too technical.


While there is certainly no consistency across vendors, even within a single vendor the alerts fail to provide information in a consistent manner. Take, for example, the Snort rules:

  • BROWSER-OTHER Novell Messenger Client nim URI handler buffer overflow attempt
  • EXPLOIT-KIT Cool Exploit kit java payload detection
  • FILE-IDENTIFY Microsoft Office Access file magic detected
  • FILE-PDF PDF with large embedded JavaScript - JS string attempt

In this set of rules there is a basic structure: each alert starts with an identifier that is common to a number of alerts (peers not shown in this example set). Unfortunately, those identifiers are not always used to describe the same properties of the alert.

  • BROWSER-OTHER - attempts to classify the target of the observed attack
  • EXPLOIT-KIT - describes the nature of the tool used to launch the attack
  • FILE-IDENTIFY - describes the nature of the action the IDS engine has performed
  • FILE-PDF - describes the nature of the payload used in a particular attack

While all of these properties are certainly useful information, failure to describe the same property of each alert in a consistent manner quickly creates confusion and requires additional thought to understand the relevance of the information.

While these issues are apparent in the examples above, it is not the intention to pick on this set of vendors. The same patterns can be observed throughout the security industry. The failure to address this causes the amount of expertise needed to effectively leverage these tools to increase dramatically. It also places an additional burden on those actually responsible for handling the alerts as they arise.

Prior Work

To date there have been substantial efforts to propose taxonomies for the enumeration and categorization of events (logs). What is being discussed here has a markedly different goal. A common taxonomy for describing malicious behavior is intended to provide consistency in the explicit description of alerts and events for the purpose of consumption by a security analyst. The point is not to provide a generic, all-encompassing description of observable behavior, as some event taxonomies such as CEE and CEF attempt to do.

In addition, there are some efforts related to describing ‘Cyber Observables’ (CybOX) as well as indicators of compromise (OpenIOC), which attempt to describe an instance of malicious behavior. These systems are methods for describing the detection of malicious behavior, not for categorizing the output. For example, OpenIOC provides structured steps for detecting Zeus malware infections by checking for files, network events, etc. The purpose of a common taxonomy for describing malicious behavior is to give a consistent name to the output of that analysis.

A similar approach for categorizing common errors found in software was the original basis for this work. This can be found in the ‘7 Pernicious Kingdoms’ paper[2].

Introduction to the ‘Intrusion Kill Chain’

Categorizing malicious behavior is a difficult task. Fortunately, a substantial amount of work has been done to this end by the Lockheed Martin CSIRT and published in the paper ‘Intelligence-Driven Computer Network Defense Informed by Analysis of Adversary Campaigns and Intrusion Kill Chains.’ The premise of this approach is to apply the military concept of a ‘kill chain’ to the stages of a breach. The ‘kill chain’ in military terms is the idea that for a particular incident to occur there is a series of necessary preliminary events each of which must occur in order before the ultimate incident can happen. Disruption of any preliminary event can prevent the occurrence of the ultimate incident. The Lockheed team mapped the stages of a breach to this concept and produced the following:

  • Reconnaissance – gathering data on the targeted environment
  • Weaponization – developing or selecting an exploit for use on a vulnerability in the target environment
  • Delivery – delivering the exploit to the target environment
  • Exploitation – the successful exploitation of the vulnerability
  • Installation – establishing a persistent presence in the target environment
  • Command and Control – the persistent presence retrieving additional instructions for action
  • Action on Objectives – actually carrying out the actions necessary to accomplish the original objectives, such as exfiltrating data
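The ordering of these stages, and the idea that disrupting any one of them prevents the ultimate incident, can be sketched in a few lines of Python. This is only an illustration of the concept, not something taken from the Lockheed Martin paper; the names `KillChainStage`, `furthest_stage_reached` and `attack_succeeds` are hypothetical.

```python
from enum import IntEnum
from typing import AbstractSet, Optional


class KillChainStage(IntEnum):
    """The seven stages listed above, in the order they must occur."""
    RECONNAISSANCE = 1
    WEAPONIZATION = 2
    DELIVERY = 3
    EXPLOITATION = 4
    INSTALLATION = 5
    COMMAND_AND_CONTROL = 6
    ACTION_ON_OBJECTIVES = 7


def furthest_stage_reached(
    disrupted: AbstractSet[KillChainStage],
) -> Optional[KillChainStage]:
    """Return the last stage completed before the first disrupted stage.

    Returns None when the very first stage is disrupted.
    """
    reached = None
    for stage in KillChainStage:  # iterates in definition order
        if stage in disrupted:
            break
        reached = stage
    return reached


def attack_succeeds(disrupted: AbstractSet[KillChainStage]) -> bool:
    """The ultimate incident happens only if no stage was disrupted."""
    return furthest_stage_reached(disrupted) is KillChainStage.ACTION_ON_OBJECTIVES
```

For example, `attack_succeeds({KillChainStage.DELIVERY})` is false: the attacker completes Weaponization but never reaches the later stages.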

The model proposed provides very fine granularity to describe the kill chain and provides an excellent conceptual model for an attacker-centric defense. However, there are a few issues to be considered when drawing from this model to describe malicious behavior.

A Recursive Model

The series of events described above can be conceptually mapped to the breach of an entire environment, but that approach has some shortcomings. Often, during the course of a breach, an attacker first attacks a single host and then moves laterally, or pivots, within the environment to ultimately achieve their objectives. The model as described above does not account for this type of pivoting inside the environment. Other proposals have been made to expand the kill chain to account for these additional steps. Alternatively, the kill chain model can be conceptually mapped to the breach of a single machine. The behaviors necessary for lateral movement within an environment are the same as those in the first stage of an attack (as an attacker attempts to breach the initial target within an organization), regardless of how many times they have been repeated before; each time a computer is compromised, all of the steps of the kill chain are necessary. This approach keeps the model simple and also provides some functional advantages to be explored later.


From the perspective of deploying preventative controls, the granularity of the kill chain is quite beneficial. A user might focus on disrupting the ‘exploitation’ stage of a buffer overflow (a common attack methodology) by configuring the systems they are defending to use executable space protection (a common protection against those attacks). Or they could configure a network-based IDS device to attempt to detect the actual ‘delivery’ of the exploit payload. Deploying defenses, and analyzing the controls deployed, at each of these stages makes good sense. However, when describing malicious behavior there are a few stages where the ability to differentiate is not necessarily important. For example, differentiating between an observed exploit and the subsequent installation of malicious software is not always possible: one event may be observed without the other.

Disruptive Attacks

The ‘Intrusion Kill Chain’ was created to describe the stages in an intrusion; however, not all incidents are necessarily intrusions. The notable exception is the Denial of Service attack. Attacks of this nature are driven by one of two objectives: to distract from a concurrent intrusion attempt, or to act as a form of vandalism. For these types of attacks the later stages of the ‘Intrusion Kill Chain’ are irrelevant, but the first four stages are still applicable when describing the event: Reconnaissance, Weaponization, Delivery, and Exploitation.

Application of the ‘Intrusion Kill Chain’ to Categorize Behavior

With an understanding of the ‘Intrusion Kill Chain’ model and the potential issues in its use to describe malicious behavior, we can start down the path of creating a methodology for using this model to categorize observed behavior. Before describing such a methodology let us first review the issues we originally set out to address:

  • Lack of Context – the categorization must provide the user context as to the nature of the behavior observed
  • Structure – the categorization must provide clear structure with explicitly defined purposes for each element in the structure to ensure clear communication.
  • Consistency – the categorization must not be ambiguous. The behavior being described must have one and only one way of being categorized to ensure consistency as different users apply the categorization.

In addition to the original problems that needed to be addressed, there are also some requirements the methodology must meet to ensure it is flexible enough to address the range of behaviors it will be responsible for describing. That is, any model must be both complete, to provide consistency when a user encounters a new behavior, and extensible, to be flexible enough to describe any behavior observed.


Intent – Strategy – Method

The categorization uses a three-tiered model for describing an observed behavior. The first tier is the ‘Intent’ of the behavior; this roughly maps to the ‘Intrusion Kill Chain’ and provides an understanding of the context of the behavior. The second tier is the ‘Strategy’ the attacker took, which describes the methodology employed. The third tier is the ‘Method’ of the behavior, which describes the details of the particular methodology.


The intent describes the context of the behavior that is being observed. These intents roughly map to the stages of the ‘Intrusion Kill Chain’ but are collapsed so as to ensure that each is discrete.

  • Reconnaissance & Probing - observed behavior indicating an actor attempting to discover information about your organization. This is broad-based, including everything from port scans to social engineering to open-source intelligence.
  • Delivery & Attack - observed behaviors indicating an attempted delivery of an exploit. This can include detection of malicious email attachments, network-based detection of known attack payloads or analysis-based detection of known attack strategies such as SQL Injection.
  • Exploitation & Installation - observed indicators of successful exploit of a vulnerability or a remote access trojan or backdoor being installed on the system.
  • System Compromise - observed indicators of a compromised system.
  • Environmental Awareness - observed behavior and status about the environment being monitored. This includes information about services running, behavior of users in the environment, and the configuration of the systems.


The strategy describes the broad-based strategy or behavior that is detected. The intention is to describe the strategy the malicious user is using to achieve their goal. For example, when trying to exploit a known vulnerability in a web browser the attacker is launching a ‘Client-Side Attack - Known Vulnerability.’


The method describes the particular method employed by the actor. To further the previous example, the method would provide additional detail on the target of the attack and the particular vulnerability: ‘Firefox - CVE-2008-4064.’
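As a rough illustration, the three tiers could be represented as a small data structure. The sketch below is one possible encoding, not a published schema; the class and attribute names (`Intent`, `Categorization`, `label`) are hypothetical, while the intent values and the example strategy and method are the ones given above.

```python
from dataclasses import dataclass
from enum import Enum


class Intent(Enum):
    """First tier: the five intents described above."""
    RECONNAISSANCE_AND_PROBING = "Reconnaissance & Probing"
    DELIVERY_AND_ATTACK = "Delivery & Attack"
    EXPLOITATION_AND_INSTALLATION = "Exploitation & Installation"
    SYSTEM_COMPROMISE = "System Compromise"
    ENVIRONMENTAL_AWARENESS = "Environmental Awareness"


@dataclass(frozen=True)
class Categorization:
    """One observed behavior, described by the three tiers."""
    intent: Intent   # context of the behavior (closed set)
    strategy: str    # broad approach taken by the actor (extensible)
    method: str      # details of the particular approach (extensible)

    def label(self) -> str:
        """Render the tiers in a fixed order so every alert reads the same way."""
        return f"{self.intent.value} / {self.strategy} / {self.method}"


# The browser example from the text above.
example = Categorization(
    intent=Intent.DELIVERY_AND_ATTACK,
    strategy="Client-Side Attack - Known Vulnerability",
    method="Firefox - CVE-2008-4064",
)
print(example.label())
```

Keeping the intent as a closed enumeration while leaving strategy and method as free text is a design choice made for this sketch: it reflects the requirement that the model be complete at the top tier yet extensible below it.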

Categorization in Action

Let us examine the examples provided in the first section using the new categorization.

Each original alert is listed below, followed by its new categorization.

FILE-PDF PDF with large embedded JavaScript - JS string attempt

  • Intent: Delivery & Attack
  • Strategy: Client-Side Attack – PDF
  • Method: JavaScript Payload – JS String

EXPLOIT-KIT Cool Exploit kit java payload detection

  • Intent: Delivery & Attack
  • Strategy: Client-Side Attack – Exploit Kit
  • Method: Cool Exploit Kit – Java Payload

BROWSER-OTHER Novell Messenger Client nim URI handler buffer overflow attempt

  • Intent: Delivery & Attack
  • Strategy: Client-Side Attack – Known Vulnerability
  • Method: Novell Messenger Client – CVE-2013-1085

FILE-IDENTIFY Microsoft Office Access file magic detected

  • Intent: Environmental Awareness
  • Strategy: File Transfer – Microsoft Office Access
  • Method: File Magic Detection

Adware-HotBar.f!886F6F2A1226

  • Intent: System Compromise
  • Strategy: Trojan - Adware
  • Method: Hotbar

Service attack, successful denial of service against Microsoft SSL server DST_IP (MS04-011)

  • Intent: Exploitation & Installation
  • Strategy: Denial of Service
  • Method: Microsoft SSL Server – CVE-2003-0533

Attack, file /etc/passwd access on DST_IP

  • Intent: Reconnaissance & Probing
  • Strategy: Attempted System File Access
  • Method: Unix - /etc/passwd

Malware, Spyware Hotbar detected on SRC_IP

  • Intent: System Compromise
  • Strategy: Trojan - Adware
  • Method: Hotbar
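One simple way to apply such a mapping in practice is a lookup table keyed by the vendor's alert name. The following is a minimal sketch under that assumption; `CATEGORIZATION_BY_ALERT_NAME` and `categorize` are hypothetical names, the entries are a subset of the examples above, and dashes are written as plain hyphens for simplicity.

```python
from typing import NamedTuple, Optional


class Categorization(NamedTuple):
    intent: str
    strategy: str
    method: str


# Vendor-specific alert names mapped to the common three-tier description.
CATEGORIZATION_BY_ALERT_NAME = {
    "FILE-PDF PDF with large embedded JavaScript - JS string attempt":
        Categorization("Delivery & Attack", "Client-Side Attack - PDF",
                       "JavaScript Payload - JS String"),
    "EXPLOIT-KIT Cool Exploit kit java payload detection":
        Categorization("Delivery & Attack", "Client-Side Attack - Exploit Kit",
                       "Cool Exploit Kit - Java Payload"),
    "Attack, file /etc/passwd access on DST_IP":
        Categorization("Reconnaissance & Probing", "Attempted System File Access",
                       "Unix - /etc/passwd"),
    "Adware-HotBar.f!886F6F2A1226":
        Categorization("System Compromise", "Trojan - Adware", "Hotbar"),
    "Malware, Spyware Hotbar detected on SRC_IP":
        Categorization("System Compromise", "Trojan - Adware", "Hotbar"),
}


def categorize(alert_name: str) -> Optional[Categorization]:
    """Return the common categorization, or None for an unmapped alert."""
    return CATEGORIZATION_BY_ALERT_NAME.get(alert_name)


# Two differently named alerts from two controls resolve to the same description.
same = (categorize("Adware-HotBar.f!886F6F2A1226")
        == categorize("Malware, Spyware Hotbar detected on SRC_IP"))
print(same)
```

Because the antivirus alert and the correlation rule resolve to an identical categorization, incidents reported by the two controls can be related by a simple equality check rather than by reading both names.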

As can be seen, the concerns laid out in the original analysis of these alerts are addressed. The categorization convention provides a clear structure for the alert and provides the user with a clear context and consistency.

Additional Benefits

Beyond improving the ability for users to comprehend the alerts, this approach also affords an additional benefit by having a well-defined structure. Priority of alerts can be guided by the stage the alert relates to and further, automated, analysis of the alerts can also be employed. For example, an alert related to ‘Delivery & Attack’ originating from an internal host such as:

Intent: Delivery & Attack

Strategy: Client-side Attack – Known Vulnerability

Method: Microsoft Windows Explorer - CVE-2006-4690

Could then be used to generate an alert like the following:

Intent: System Compromise

Strategy: Internal Delivery & Attack

Method: Delivery & Attack Alert - Internal Host

The structure of the alerts allows for automatic contextual analysis of the environment the alert is generated in. Such automation can be used to more efficiently leverage detection capabilities and improve incident response efficiency.
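The derived alert above can be produced by a very small rule. The sketch below assumes each alert carries a source address and that the internal address ranges are known; the `Alert` type, the `INTERNAL_NETWORKS` list and the function names are hypothetical, and the networks shown are placeholders to be replaced with the ranges actually in use.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional

# Assumption: these ranges stand in for the monitored environment's own networks.
INTERNAL_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.0.0/16"),
]


@dataclass(frozen=True)
class Alert:
    intent: str
    strategy: str
    method: str
    source_ip: str


def is_internal(ip: str) -> bool:
    """True when the address falls inside one of the internal networks."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in INTERNAL_NETWORKS)


def derive_compromise_alert(alert: Alert) -> Optional[Alert]:
    """An internal host launching an attack is itself treated as compromised."""
    if alert.intent == "Delivery & Attack" and is_internal(alert.source_ip):
        return Alert(
            intent="System Compromise",
            strategy="Internal Delivery & Attack",
            method="Delivery & Attack Alert - Internal Host",
            source_ip=alert.source_ip,
        )
    return None


observed = Alert(
    intent="Delivery & Attack",
    strategy="Client-side Attack - Known Vulnerability",
    method="Microsoft Windows Explorer - CVE-2006-4690",
    source_ip="10.0.0.5",
)
derived = derive_compromise_alert(observed)
```

The rule only needs to inspect the intent field, which is what the fixed structure makes possible; no parsing of free-form alert names is required.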

What AlienVault is doing to help

Since June 2013 the AlienVault Labs team has delivered all of its threat intelligence using this taxonomy. Over the course of the last two years we have seen a huge benefit for our end users as a result. The first thing we have seen is a drastic reduction in the domain expertise needed to understand the alarms generated by our system. Our users have plenty of other responsibilities during the day, so reducing the time it takes for them to understand an alarm from a minute to a few seconds is a huge benefit. The second benefit we have seen is a drastically simplified mechanism for prioritizing their work. Users no longer need to rely on an opaque risk equation to order their alarms; they can simply look at the intent to discover the alarms that need their attention first. The end result is that our users can look at the most important alarms first, and it takes them less time to understand and react effectively.
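Ordering alarms by intent can be as simple as a sort with a fixed ranking. The sketch below is illustrative only and is not a description of how any AlienVault product works; the ranking in `INTENT_PRIORITY` (later stages of an attack first) and the alarm dictionaries are assumptions made for the example.

```python
# Assumption: intents closer to a confirmed compromise are handled first.
INTENT_PRIORITY = {
    "System Compromise": 0,
    "Exploitation & Installation": 1,
    "Delivery & Attack": 2,
    "Reconnaissance & Probing": 3,
    "Environmental Awareness": 4,
}


def prioritize(alarms):
    """Return alarms ordered by intent; unknown intents sort last.

    sorted() is stable, so alarms with the same intent keep their arrival order.
    """
    return sorted(
        alarms,
        key=lambda alarm: INTENT_PRIORITY.get(alarm["intent"], len(INTENT_PRIORITY)),
    )


alarms = [
    {"name": "port scan", "intent": "Reconnaissance & Probing"},
    {"name": "Hotbar detected", "intent": "System Compromise"},
    {"name": "PDF exploit delivery", "intent": "Delivery & Attack"},
]
ordered_names = [alarm["name"] for alarm in prioritize(alarms)]
```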

[1] Non-replicating malicious software embedded in a shell program intended to hide its true purpose
