YARA Rules for Finding and Analyzing in InfoSec

March 29, 2018  |  Monty St John


If you work anywhere in security, you do a lot of searching, analyzing, and alerting.  It’s the underpinning for almost any keyword you can use to describe the actions we take when working.  The minute any equation I’m working on comes down to “finding” or “analyzing”, I know what to reach for: YARA.  The variables of the equation really don’t matter.  A quick interrogation of a file to find out about its contents?  Digging through source code to find a specific algorithm?  Determining if something is malicious or safe to whitelist?  YARA handles those use cases and plenty more.  Really, it comes down to finding things: finding fragments of what I’m looking for, whether I want to do so directly, by absence, via a pattern, or through some form of calculus.  YARA is my go-to.

Outlining what it can do at a high level is simple to express, but it’s unreasonable to expect that you are as familiar with YARA as I am.  If you are up for a little exploration, dive into the details with me for a minute.

Delving into Details of Data

When it comes to finding, the question is which “whole” thing I am looking for, or which “fragment” of a whole I am looking to find.  In YARA-speak, that’s a detection or a detection fragment.  Just like bacon makes everything better, so do examples.  As a detection, we are going to use “Alienvault”.  It’s a recognizable term, after all, and one we want to find.  However, perhaps it’s not written exactly as we spelled it.  To combat spelling, spacing, and other issues, we can break the whole thing we are looking to find into detection fragments.  Those might be “Alien” and “vault”.  Written in a rule, that would look something like this:

rule at_whole_frag {

      meta:
            description = "simple detection and detection fragment logic"

      strings:
            $whole = "Alienvault"
            $frag1 = "Alien"
            $frag2 = "vault"

      condition:
            $whole or ($frag1 and $frag2)
}

The syntax and structure of YARA are pretty intuitive, so I’m going to skip going into full detail about them.  I chatted about the basics of YARA previously on Alienvault and it’s a good primer to get started.  Equally, you can jump into one of our classes and really get into the details.  Regardless, you have to give your rule a name that identifies it, in this case “at_whole_frag”.  Then, within a pair of curly brackets, you have three internal sections: “meta”, “strings”, and “condition”.  The meta and strings sections are handled like variable assignments.  The condition section is written to return a Boolean value.  If true, the rule matches, and if false, it does not.  The condition line supports the usual Boolean, comparison, and arithmetic operators, along with string counts, string offsets, and looping.
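As a rough illustration of what the condition section can express beyond plain Boolean logic, here is a minimal sketch that reuses the two fragments from the rule above.  The rule name, the 64-byte window, and the count threshold are made up for illustration and are not part of the original example.

```yara
rule at_condition_operators_sketch {
    meta:
        description = "hypothetical sketch of counting, comparison, and looping in a condition"
    strings:
        $frag1 = "Alien"
        $frag2 = "vault"
    condition:
        // counting: "Alien" appears at least twice in the file
        #frag1 >= 2 and
        // comparison: the first "Alien" sits before the first "vault"
        @frag1[1] < @frag2[1] and
        // looping: for some occurrence of "Alien", a "vault" starts
        // within the 64 bytes that follow it (64 is an arbitrary choice)
        for any i in (1..#frag1) : (
            $frag2 in (@frag1[i]..@frag1[i] + 64)
        )
}
```

Every string declared in the strings section is referenced in the condition here, since YARA rejects a rule that declares a string and never uses it.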

What we did in the previous example was very simple ASCII text detection.  We can shift those detections to Unicode (wide) strings, remove issues with upper and lower case, or include negation logic at the condition line to look for absence, or negative space.

rule av_whole_frag_alt {

      meta:
            description = "simple detection and detection fragment logic with a little more spice."

      strings:
            $whole = "Alienvault" fullword nocase
            $frag1 = "Alien" ascii wide
            $frag2 = "vault" ascii wide

      condition:
            ($frag1 and $frag2) and not $whole
}


The changes we made here reflect the above points.  The detection fragments now look for “Alien” and “vault” in both ASCII and wide (two bytes per character, UTF-16 style) encodings.  The “whole” detection looks for “Alienvault” regardless of upper or lower case, and matches only when it’s a complete word bounded by non-alphanumeric characters.  Lastly, the condition line has been rebuilt to express logic that will only match when the two fragments are present and the whole is not, showing a negation check.  A file containing “Alien vault” or “Alienvaults”, for example, would match, while one containing the standalone word “Alienvault” would not.  We could do more, but that’s a good depiction of the heart of direct or negative detection with YARA.
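To make the “wide” modifier a little more concrete, the sketch below spells out by hand the bytes it asks YARA to look for.  The rule and string names are hypothetical.  The hex values are simply the ASCII codes for “Alien”, with a zero byte interleaved after each character in the wide case.

```yara
rule wide_modifier_by_hand {
    meta:
        description = "hypothetical sketch: 'Alien' as ASCII bytes and as wide bytes"
    strings:
        // roughly what "Alien" wide searches for: each ASCII byte followed by 0x00
        $alien_wide_hex = { 41 00 6C 00 69 00 65 00 6E 00 }
        // the same fragment as plain ASCII bytes
        $alien_ascii_hex = { 41 6C 69 65 6E }
    condition:
        any of them
}
```

Writing “Alien” ascii wide is the shorter way to say the same thing, which is why the modifiers are usually the better choice.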

Describing Patterns with YARA

Where YARA shines very brightly is in describing patterns.  If you have used grep or regex, then you likely understand what I mean about searching via patterns.  YARA effectively does both of these things, plus a lot more with patterns.  When you see rules that leverage patterns, you begin to see a person’s craftsmanship.  Patterns are descriptive in nature.  You use YARA to outline a concept in a file, like an algorithm, a repeating set of data, or a structured output of data, and as a means of describing a combination of knowns and unknowns.

Before dropping deeply into this example, I want to introduce a powerful concept.  When you put more than one rule together in a file to build a ruleset, you can use a rule as part of the condition of another rule.  The only real sticky part here is that YARA reads the rules in a set from top to bottom.  That means a rule has to be placed before any rule that uses it in the ruleset, or the ruleset will fail to compile.  Let’s combine this with an algorithm to look for data.

To keep this compact, I’m going to focus on the algorithm and reference the rule we are going to import.  This rule will be called IsHTML and its job is to match on HTML files.  This will be brought into the condition of our rule described below.
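Since the IsHTML rule itself is not shown here, the following is only a guess at what such a helper might look like, paired with a small consumer rule to show the ordering requirement.  Both rule bodies, the string choices, and the 256-byte window are assumptions for illustration.

```yara
// Assumed stand-in for IsHTML: the real rule is not shown in this article.
rule IsHTML {
    meta:
        description = "hypothetical sketch: match files that start like an HTML document"
    strings:
        $doctype = "<!DOCTYPE html" nocase
        $html_tag = "<html" nocase
    condition:
        // accept either marker near the start of the file
        $doctype in (0..256) or $html_tag in (0..256)
}

// Defined after IsHTML, so the reference resolves when the ruleset is compiled.
rule small_html_file {
    meta:
        description = "hypothetical consumer rule that reuses IsHTML in its condition"
    condition:
        IsHTML and filesize < 1MB
}
```

Swapping the order of the two rules in the file would make the reference to IsHTML undefined at compile time.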

rule detect_shell_in_div {

      meta:
            description = "Looking for a target value within a set of <div> tags"
            webshell = "wso_webshell"

      strings:
            // placeholder: the original target value did not survive; substitute the string you are hunting for
            $my_target = "TARGET_VALUE"
            $divopen = "<div>"
            $divclose = "</div>"

      condition:
            // I only want to look at HTML files and exclude HTML in other files.
            IsHTML and
            /* Here is the iteration.  The "@" symbol in YARA means position.  The "#" symbol in YARA is a reference to the count.  Here, I'm looking for a value within the position of the starting and ending <div> tags within the HTML file, and going until I reach the end of the count of opening <div> tags. */
            for any i in (1..#divopen) : ( $my_target in (@divopen[i]..@divclose[i]) )
}

A Little Bit of Math

Pattern calculus is a great way to perform threat hunting techniques (grouping, stacking, clustering, etc.).  I like to call it “verbal” YARA, since you describe the actions, e.g., “within X of”, “inside of”, “compared to”, “stacked with”, “constant to”, etc.  If I have a favorite, it’s to look inside of a defined area for a detection.  I have an example of that for you below.  The rule logic is longer than what we’ve previously described, but bear with me.  I’ll break it down with comments in the rule.  As a note, this leverages the hash module of YARA.

import "hash"

rule looking_inside {

      meta:
            description = "looking inside the last 200 bytes for a hash match."

      condition:
            /* IsPE represents a rule we've previously defined to find portable executable files. */
            IsPE and
            // and we are looking for a specific file size only
            filesize < 420KB and
            /* This beautiful expression says to hash 20 bytes of the file at a time, starting at the end of the file and working backward until 200 bytes are reached, to see if any chunk matches the provided hash value.  Note that hash.md5 takes an offset and a size. */
            for any i in (20, 40, 60, 80, 100, 120, 140, 160, 180, 200) : ( hash.md5(filesize - i, 20) == "302f73788a2dcfac52f4a9b3397c35f6" )
}
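The IsPE helper is referenced but not shown, so here is a hedged sketch of one common way to write it, along with a variant of the hash loop expressed as a range instead of an explicit list.  Both rules are illustrative assumptions rather than the versions used above.  The one thing taken from the YARA hash module is that hash.md5 accepts an offset followed by a size.

```yara
import "hash"

// Assumed stand-in for IsPE: the article does not show the author's version.
rule IsPE {
    meta:
        description = "hypothetical sketch: MZ header plus PE signature check"
    condition:
        // "MZ" at offset 0, then "PE\0\0" at the offset stored at 0x3C
        uint16(0) == 0x5A4D and
        uint32(uint32(0x3C)) == 0x00004550
}

rule looking_inside_range {
    meta:
        description = "hypothetical variant: the same 20-byte windows, written as a range"
    condition:
        IsPE and
        filesize < 420KB and
        // i = 1..10 gives window starts at filesize-20, filesize-40, ... filesize-200
        for any i in (1..10) : (
            hash.md5(filesize - (20 * i), 20) == "302f73788a2dcfac52f4a9b3397c35f6"
        )
}
```

The range form makes it easier to change the window size or the number of windows later, at the cost of being slightly less obvious at a glance than the explicit list.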


Some Figure-Ground Reversal

Let me cap this off by describing what I consider the most elegant use case for YARA.  Its finest hour is when you need to tackle something for which no community exists to draw on, no easy store of information is on hand, and you can’t fall back on old faithful, e.g., Google it.

YARA can provide the platform to allow you to identify what the issue isn’t.  You won’t be able to do this with a single rule, but it can definitely be done with the right composition of rules.  By describing what you know, it can help isolate the unknowns.  Once you know where something isn’t, you can exclude those locations within the file and narrow the scope of investigation to where it might be.

It also can find a solution by helping you invert the problem.  Example: you have a file that has been packed with an unknown packing utility.  Instead of trying to identify the packer, identify which packer it isn’t and then isolate out its characteristics.  It will rapidly thin out the list of possible suspects.
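One way to sketch that inversion is a small ruleset where each known packer gets its own rule and a final rule flags whatever none of them explain.  All rule names below are hypothetical, and only two packers are covered.  The markers used are section names that UPX and ASPack commonly leave behind in a packed PE file.

```yara
rule known_packer_upx {
    strings:
        $s0 = "UPX0"
        $s1 = "UPX1"
    condition:
        // "MZ" at offset 0 plus both UPX section names
        uint16(0) == 0x5A4D and all of them
}

rule known_packer_aspack {
    strings:
        $s0 = ".aspack"
    condition:
        uint16(0) == 0x5A4D and $s0
}

rule packed_by_something_else {
    meta:
        description = "hypothetical sketch: a PE that none of the known-packer rules explain"
    condition:
        uint16(0) == 0x5A4D and
        not known_packer_upx and
        not known_packer_aspack
}
```

On its own the last rule matches any executable that is not covered by the rules above it, including files that are not packed at all, so it is only useful against a sample you already believe to be packed.  Each packer rule you add shortens the list of suspects.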

Take a Step to the Left, and Then a Step to the Right

Another fun technique to tackle these types of problems is to use YARA to move laterally to define the problem.  Example: I know where one algorithm in a file is located, but I don’t know where the problem algorithm is.  Via observation, I’ve derived a likely chain of execution that tests to be true.  With that in hand, I can use YARA to move laterally from my known point in the file and test, via detection fragments, where in the file the problem algorithm lies.
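A minimal sketch of that lateral move, assuming the known algorithm and the hunted fragment can each be described by a short byte pattern, might look like this.  The two hex patterns and the 512-byte window are placeholders chosen for illustration, not values from a real case.

```yara
rule lateral_from_known_point {
    meta:
        description = "hypothetical sketch: search a window that follows a known anchor"
    strings:
        // placeholder anchor: stands in for the algorithm whose location is known
        $known = { 55 8B EC 83 EC }
        // placeholder fragment: stands in for a piece of the algorithm being hunted
        $frag = { 33 C0 [0-8] C3 }
    condition:
        // for each anchor hit, look for the fragment in the 512 bytes that follow it
        for any i in (1..#known) : (
            $frag in (@known[i]..@known[i] + 512)
        )
}
```

Widening, narrowing, or shifting the window is how you step left or right from the known point until the fragment turns up.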


Every one of these techniques is a form of pattern matching or pattern calculus logic, and YARA handles them well.  Its use is pretty much limited only by your imagination in applying it, and it has a robust, very active community that not only creates rules but also enhances and betters YARA itself.  Hopefully, I’ve whetted your appetite to learn more.  If so, the YARA GitHub repository and the program documentation are the places to go next.  Dig in, build some rules, and share them along with your use cases for them.
