“Machine learning” is the new “it” buzzword in security. As a result, it’s being thrown around fairly loosely on vendor websites and in marketing materials. Not only is that unfortunate for anyone looking for a straight answer on how machine learning can help their company stay secure, it also fosters a general sense of confusion around what the term actually means.
To help clear things up, let’s take a closer look at six of the most common misconceptions around the use of machine learning in security and separate the facts from the fiction.
Myth #1: Machine learning is a form of protection
Perhaps the biggest misunderstanding around machine learning, perpetuated by vendors with vague marketing claims, is that it’s some new product or feature that can keep companies safe on its own. The fact is machine learning doesn’t provide protection; it informs how existing protection operates, by enabling more in-depth and accurate analysis.
Myth #2: Machine learning is only being used by next generation antivirus solutions
Currently, the most common application of machine learning in endpoint security is analyzing file attributes to predict whether a file on disk is malicious before it has the chance to execute (in other words, the same job antivirus has been doing for years).
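To make the idea concrete, here is a minimal sketch of static file-attribute classification. The two features (file entropy and a count of suspicious imports), the sample values, and the nearest-centroid approach are all illustrative assumptions; production models use thousands of features and far more sophisticated training.

```python
# Toy static-analysis classifier: predict malicious/benign from file
# attributes, before the file ever executes. Features and values here
# are hypothetical; real AV/NGAV models are vastly larger.

from math import dist

# Tiny "training set": (entropy, suspicious_import_count) per sample.
MALICIOUS = [(7.6, 12.0), (7.9, 9.0), (7.2, 15.0)]
BENIGN = [(5.1, 1.0), (4.8, 0.0), (5.6, 2.0)]

def centroid(points):
    """Average each feature across the labeled samples."""
    return tuple(sum(c) / len(points) for c in zip(*points))

MAL_CENTER = centroid(MALICIOUS)
BEN_CENTER = centroid(BENIGN)

def predict(features):
    """Nearest-centroid prediction: 'malicious' or 'benign'."""
    return ("malicious"
            if dist(features, MAL_CENTER) < dist(features, BEN_CENTER)
            else "benign")

print(predict((7.8, 11.0)))  # high entropy, many suspicious imports -> malicious
print(predict((5.0, 1.0)))   # typical benign profile -> benign
```

The point of the sketch is that the model outputs a prediction about a file at rest, which is exactly the job signature- and heuristic-based antivirus has always done, just with learned features instead of hand-written rules.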
But machine learning isn’t limited to building a better AV or next-gen AV mousetrap. New solutions are also using machine learning to move endpoint security in a different direction. Rather than simply analyzing static file attributes and predicting what a program will do before it’s executed, for example, Barkly analyzes program behaviors during runtime, in an effort to identify and block executing malware in the act.
Myth #3: Machine learning is only being applied to analyzing files
While solutions that rely on file scanning (ex: next generation antivirus) have obvious trouble detecting fileless attacks — with no file, there’s nothing to scan — other solutions (such as Barkly) are using machine learning to analyze system activity and predict whether a particular combination of system calls and commands is indicative of an attack in progress.
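One simple way to reason about activity-based detection is to score sequences of system calls against patterns seen in known attacks. The sketch below uses call-pair (bigram) matching; the call names and the process-injection pattern are illustrative assumptions, not any vendor's actual detection logic.

```python
# Toy behavior-based detection: score a process's system-call trace
# against call-pair (bigram) patterns drawn from known attacks.
# The call names and patterns here are illustrative only.

ATTACK_BIGRAMS = {
    ("OpenProcess", "WriteProcessMemory"),        # classic injection step
    ("WriteProcessMemory", "CreateRemoteThread"), # remote thread start
}

def bigrams(calls):
    """Adjacent call pairs observed in the trace."""
    return set(zip(calls, calls[1:]))

def suspicion_score(calls):
    """Fraction of known-bad bigrams observed in this call trace."""
    return len(bigrams(calls) & ATTACK_BIGRAMS) / len(ATTACK_BIGRAMS)

trace = ["OpenProcess", "WriteProcessMemory", "CreateRemoteThread"]
print(suspicion_score(trace))                        # -> 1.0
print(suspicion_score(["ReadFile", "CloseHandle"]))  # -> 0.0
```

Note that nothing here touches a file on disk: the input is runtime activity, which is why this style of analysis can apply to fileless attacks where file scanning cannot.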
Myth #4: Machine learning models don’t need to be re-evaluated for months
The fact is that machine learning is not “set it and forget it.” Models are only as good as the data they analyze. Improvements to protection depend on frequent, rigorous re-training of the model by providing data with high fidelity to the real world. The more limited the data — in terms of quantity, quality, and frequency — the lower the model’s ceiling for providing accurate results.
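The effect of skipping re-training can be shown with a deliberately tiny model. In this sketch the "model" is just a decision threshold fit to labeled samples; the data is synthetic and the one-feature setup is an assumption for brevity, but it illustrates how accuracy degrades as the threat landscape drifts and recovers after refitting.

```python
# Toy illustration of model drift: a threshold "model" trained on old
# data loses accuracy on fresh samples until it is re-trained.
# All data here is synthetic.

def train(samples):
    """'Model' = threshold halfway between benign and malicious means."""
    benign = [x for x, label in samples if label == 0]
    malicious = [x for x, label in samples if label == 1]
    return (sum(benign) / len(benign) + sum(malicious) / len(malicious)) / 2

def accuracy(threshold, samples):
    """Fraction of samples the threshold classifies correctly."""
    return sum((x > threshold) == bool(label) for x, label in samples) / len(samples)

old_data = [(2, 0), (3, 0), (8, 1), (9, 1)]
new_data = [(5, 0), (6, 0), (11, 1), (12, 1)]  # the landscape has drifted

stale = train(old_data)
print(accuracy(stale, new_data))            # degraded: 0.75
print(accuracy(train(new_data), new_data))  # restored after re-training: 1.0
```

A real pipeline does the same thing at scale: continuously collect fresh labeled telemetry, measure the deployed model against it, and refit before accuracy erodes.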
Myth #5: Machine learning-based protection generates a lot of false positives
Machine learning models can be trained not only to recognize malware but also to recognize goodware samples (even custom software unique to specific organizations). This is the approach we take at Barkly, and it allows our models to adapt and reduce false positives.
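The intuition can be sketched in a few lines: if an organization's in-house software is added to the benign training data, it stops being flagged. The single-feature distance check and the specific values below are illustrative assumptions, not Barkly's actual model.

```python
# Toy goodware-aware scoring: flag a sample as malicious only if it is
# far from every known-benign profile. One feature dimension, for brevity;
# all values are hypothetical.

def predict(feature, benign_profiles, threshold=2.0):
    """'malicious' if the sample is far from all known goodware."""
    nearest = min(abs(feature - b) for b in benign_profiles)
    return "malicious" if nearest > threshold else "benign"

benign = [1.0, 1.5]   # generic goodware profiles
custom_app = 6.0      # unusual-looking in-house tool

print(predict(custom_app, benign))   # false positive: "malicious"
benign.append(custom_app)            # re-train with the org's own goodware
print(predict(custom_app, benign))   # now correctly "benign"
```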
Myth #6: Machine learning is a black box
A potential downside often raised about machine learning is that once a model is trained and learning on its own, there’s little insight into why it makes the determinations it makes. The truth is that while some machine learning models are black boxes, others can expose their logic, providing more constructive value for researchers.
There is no doubt machine learning has changed endpoint security and will continue to do so. It’s important for security and IT professionals to understand exactly how the technology works, so they can make the best and most meaningful purchasing decisions possible.