In this PICNIC podcast, Kevin Conklin is talking about security analytics. What exactly are they? Can they help make us safer? Why aren’t they already helping?
Our guest Mike Paquette, director of product and security markets at Elastic, has over 15 years’ experience developing IT security products. At Elastic, he is involved with the machine learning analytics products that help keep us safe.
“Security analytics is the application of a variety of techniques to the data that’s available within the IT infrastructure to help people understand what’s going on,” Paquette said. “If there are hackers or other adversaries trying to steal information, you’ll be able to find it earlier and quickly.”
Good news: Security analytics is a good thing.
Bad news: Some people aren’t using it.
“The threat landscape has changed pretty dramatically,” Paquette said.
Security never gets easier, either, because the threat landscape changes all the time. “So you’re always having to keep up and try to stay ahead,” Paquette said.
In an IT security environment, security analytics is fairly common.
“Most organizations with mature IT organization within them are using some form of analytics,” Paquette confirmed. Scanning for known malware signatures is a basic example.
But as to the more advanced technology that a modern interpretation of analytics requires in the constantly shifting web of attack behaviors? “Unfortunately, most organizations haven’t taken that step yet,” Paquette said.
What Can Analytics Do?
Anti-virus is a signature-based technology that tries to identify known bad e-mails, programs, or executables.
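As a rough illustration of what "signature-based" means, the sketch below checks a file's SHA-256 digest against a set of known-bad digests. The signature set and the sample payload are made up for this example, and real anti-virus products use far richer signatures than a whole-file hash.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical signature database: digests of files already known to be bad.
# The only entry is the digest of a harmless sample string, for illustration.
KNOWN_BAD = {sha256_hex(b"pretend this is malware")}


def is_known_bad(data: bytes, signatures: set[str] = KNOWN_BAD) -> bool:
    """Return True only when the digest exactly matches a known signature."""
    return sha256_hex(data) in signatures


print(is_known_bad(b"pretend this is malware"))   # True
print(is_known_bad(b"pretend this is malware!"))  # False
```

The second call shows the limitation: changing a single byte produces a different digest, so a file nobody has catalogued yet goes unnoticed.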
An example of more automated security analytics would be something like FireEye, which scrutinizes incoming traffic to see if it’s bad. “These programs are not only based on signatures but they’re actually able to deduce from watching a program run whether or not it might be malware,” Paquette said.
It’s less common, but the IT environment is taking steps toward these more advanced analytics.
“The technology is maturing rapidly. This will be technology that we might talk about as machine learning or even artificial intelligence,” Paquette said.
“It’s making its way into the tool set of IT security teams,” he said. But the majority of organizations don’t have advanced analytics technology in place—yet.
Elastic is the company behind a number of open source software products.
The primary one is Elasticsearch, a data store and search engine that can be used in a wide variety of applications.
“Security analytics is in very many ways a search problem,” Paquette said.
If you can ingest all of your logs into a data store, and then search through them at very high speeds, you’ve got the basis of security analytics. “Because you’ve gained that visibility, you can see attack behaviors before it’s too late,” Paquette said.
This requires an IT organization to decide that it wants to put all this data in one place, and to choose a mechanism that can store lots of data, structured or unstructured.
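To make the "search problem" framing concrete, here is a minimal in-memory sketch of the idea: log lines are ingested into an inverted index and then retrieved by keyword. It is not how Elasticsearch is implemented or called; the `LogIndex` class, its tokenizer, and the sample log lines are all assumptions made for illustration.

```python
import re
from collections import defaultdict


class LogIndex:
    """Toy inverted index: maps each token to the ids of the lines containing it."""

    def __init__(self) -> None:
        self.lines: list[str] = []
        self.postings: dict[str, set[int]] = defaultdict(set)

    def ingest(self, line: str) -> None:
        """Store the line and index every lowercase token in it."""
        doc_id = len(self.lines)
        self.lines.append(line)
        for token in re.findall(r"[a-z0-9.]+", line.lower()):
            self.postings[token].add(doc_id)

    def search(self, *terms: str) -> list[str]:
        """Return the lines that contain all of the given terms."""
        if not terms:
            return []
        matches = set.intersection(
            *(self.postings.get(term.lower(), set()) for term in terms)
        )
        return [self.lines[i] for i in sorted(matches)]


index = LogIndex()
index.ingest("sshd failed login for admin from 10.0.0.5")
index.ingest("sshd accepted login for alice from 10.0.0.7")
index.ingest("dns query example.com from 10.0.0.5")

print(index.search("failed", "login"))
print(index.search("10.0.0.5"))
```

Once logs from different sources share one index, a single query such as the IP address lookup above pulls together activity that would otherwise sit in separate files.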
And companies are just starting to wake up to this.
For years, there’s been a security technology called Security Information and Event Management, or SIEM. “When SIEM was first designed, the scale of data being produced by the IT infrastructure was pretty modest by today’s standards—maybe gigabytes of data per day,” Paquette said.
Compare that to large organizations today that produce tens of terabytes of data per day. “The ability for traditionally designed SIEMs to be able to ingest, index, and search that kind of data just isn’t there.”
“So they need a new kind of search engine data store to be able to gain visibility to that data through the process of searching,” Paquette said.
What Does Machine Learning Have To Do With It?
Once you have all the data organized, then you get to the good stuff. Analytics could mean anything from visualizing your data through dashboards and graphics to applying machine learning technology.
“Machine learning can be used for lots and lots of different aspects of security,” Paquette said.
Elastic has a machine learning technology that models the “normal” state of IT infrastructure data and notifies you when the data behaves unusually. This form of advanced analytics for anomaly detection can really help speed the discovery of attacks in progress.
“Humans are really good at finding anomalies—if you’ve got a well-visualized pattern of a small number of data samples,” Paquette said. “But if you’ve got really complex data and you’ve got many fields that might be unusual, then humans just run out of steam.”
Machine learning just becomes automated anomaly detection. Nobody has to bother looking at a screen to know when to investigate unusual data because the machine finds it.
Basically, you can delegate log scanning to an entity that’s way better at it than a human is.
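The idea of modeling "normal" and flagging departures from it can be sketched with a very simple statistical baseline. The example below is a stand-in, not Elastic's machine learning: it assumes a single metric (logins per hour, with invented values) and flags any observation more than three standard deviations from the baseline mean.

```python
from statistics import mean, stdev


def find_anomalies(baseline, observed, threshold=3.0):
    """Return (index, value) pairs that sit far outside the baseline.

    A value is flagged when it differs from the baseline mean by more
    than `threshold` sample standard deviations.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    return [
        (i, value)
        for i, value in enumerate(observed)
        if abs(value - mu) > threshold * sigma
    ]


# Hypothetical logins-per-hour counts from a quiet period.
baseline = [42, 38, 45, 40, 41, 39, 44, 43]
# New observations, one of which is far outside the usual range.
observed = [40, 44, 97, 41]

print(find_anomalies(baseline, observed))  # [(2, 97)]
```

A production system would also have to account for trends, daily and weekly cycles, and many fields at once, which is where a fixed threshold like this one stops being enough.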
Security analysts typically read log files into databases and sort through them for known signatures or thresholds of certain values.
“The first form of automation that we’re talking about in security analytics is really taking that to the next level where the analyst is still in control,” Paquette said.
Say you’re an analyst who is nervous about your organization’s IT infrastructure because you suspect that if data were snuck out over the DNS protocol, you might not be able to see it. You could use machine learning to say: keep an eye on my DNS activity, model it, and let me know if anything is unusual.
“That’s a really powerful step,” Paquette said. Being able to tell a machine to let you know when things are unusual.
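One way to picture "model my DNS activity" is to track how many distinct names are queried under each parent domain, since data smuggled out over DNS is often encoded into many one-off subdomains. The sketch below is only an illustration under that assumption: the two-label `parent_domain` rule is naive, the multiplier of 3 is arbitrary, and all domain names are placeholders.

```python
from collections import defaultdict


def parent_domain(name: str) -> str:
    """Naively treat the last two labels as the parent domain.

    Real code would need a public suffix list to handle names like co.uk.
    """
    return ".".join(name.rstrip(".").lower().split(".")[-2:])


def unique_subdomains(queries):
    """Count distinct queried names per parent domain."""
    seen = defaultdict(set)
    for query in queries:
        seen[parent_domain(query)].add(query.rstrip(".").lower())
    return {parent: len(names) for parent, names in seen.items()}


def unusual_domains(baseline_queries, new_queries, factor=3):
    """Flag parents with far more distinct names than any baseline parent had."""
    baseline = unique_subdomains(baseline_queries)
    ceiling = max(baseline.values(), default=0) * factor
    current = unique_subdomains(new_queries)
    return sorted(parent for parent, count in current.items() if count > ceiling)


baseline_queries = [
    "www.example.com", "mail.example.com",
    "api.example.org", "www.example.org",
]
# Twenty one-off names under a single parent, as a tunnel might produce.
new_queries = ["www.example.com"] + [
    f"chunk{i:04x}.tunnel.test" for i in range(20)
]

print(unusual_domains(baseline_queries, new_queries))  # ['tunnel.test']
```

The analyst stays in control here in the sense Paquette describes: a person chooses what to watch, and the machine does the counting and comparison.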
Will Analytics Always Keep Evolving?
But all this makes you wonder: Is security just a never-ending goal, or is it ever really achievable?
When the good guys and the bad guys are often equally knowledgeable about SIEM rules and anomaly detection, the war between them can feel eternal.
“It is a dynamic, escalating battle,” Paquette admitted. “If you’ve got anything that can be monetized, including information, that has a target on it. The adversaries will try to find ways to leak that, to steal that, to monetize it.”
If you’re a good guy, it can feel like the adversary has the advantage, because they only need one weakness to get in, whereas the defenders have to protect against every vulnerability. This struggle is not going away anytime soon, either.
But if you have an anomaly detection tool that can tell when something is different, any attacker is going to have to dedicate a lot of time to studying what is normal for the environment in order to move through it without immediately setting off alarms.
“With a little bit of guidance up front by the analyst, the technology is indeed sensitive enough to detect those anomalies,” Paquette said.
Of course, if the data were completely normal, there wouldn’t be any exfiltration. “If you’ve got the capability to model the normal activity, you should be able to find any type of significant exfiltration,” he said.
When Will Analytics Win?
Clever adversaries know how to work around anomaly detection that looks at just the volume of network traffic. They just exfiltrate data in small packets over a period of time.
“Automated anomaly detection with machine learning looks at other metrics in addition to just the traffic volume and it’s able to detect all kinds of unusual patterns that may be indicative of that exfiltration,” Paquette said.
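The point about looking beyond traffic volume can be shown with a small sketch that checks several metrics against their own history. The metric names and numbers are invented, and the three-standard-deviation rule is a simple stand-in for whatever a real machine learning model would do.

```python
from statistics import mean, stdev


def unusual_metrics(history, today, threshold=3.0):
    """Return the names of metrics whose value today is far from their history."""
    flagged = []
    for metric, values in history.items():
        mu, sigma = mean(values), stdev(values)
        if abs(today[metric] - mu) > threshold * sigma:
            flagged.append(metric)
    return flagged


# Hypothetical per-host history over five days.
history = {
    "bytes_per_connection": [1200, 1150, 1300, 1250, 1180],
    "connections_per_day": [300, 320, 310, 290, 305],
}
# Small transfers that look normal one at a time, but far more of them.
today = {"bytes_per_connection": 1210, "connections_per_day": 2400}

print(unusual_metrics(history, today))  # ['connections_per_day']
```

A detector that watched only the size of each transfer would see nothing wrong in this data; the second metric is what gives the slow leak away.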
So there’s hope.
A significant factor is developing analytics that can be used in real time—so people can see breaches as they’re happening and protect their data right away. Instead of finding out six months later from the FBI that their data has been lost.
Security professionals refer to the time period between when a breach happened and when it was noticed as the dwell time. A few years ago, that could be nine months or so. A shockingly long time.
“Reports from the end of the 2016 year show that the gap has closed considerably,” Paquette said. “Now the median time is about 100 days. It went from absolutely terrible to just really bad.”
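For readers who want to see how a median dwell time figure is arrived at, the sketch below computes it from pairs of breach and detection dates. The three incidents are invented purely to show the arithmetic and do not come from any report.

```python
from datetime import date
from statistics import median

# Hypothetical incidents: (date the breach began, date it was noticed).
incidents = [
    (date(2016, 1, 10), date(2016, 4, 19)),
    (date(2016, 3, 1), date(2016, 3, 3)),
    (date(2016, 2, 1), date(2016, 11, 1)),
]


def dwell_days(breached: date, detected: date) -> int:
    """Dwell time in whole days between breach and detection."""
    return (detected - breached).days


dwell_times = [dwell_days(start, found) for start, found in incidents]

print(dwell_times)          # [100, 2, 274]
print(median(dwell_times))  # 100
```

The median is used rather than the mean so that one very long-running breach does not dominate the figure.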
Major exfiltration can’t happen overnight. If you could discover a breach within a day or two, you could knock out 90% of the problem.
“This type of technology that we’re talking about can certainly help accelerate that reduction in dwell time,” Paquette said.
How long until dwell time is down to one day?
“Five years. That technology is going to advance rapidly. The bad guys will still have the advantage, but it will be harder and harder for them,” Paquette predicted.
To learn more about anomaly detection from Elastic, navigate to elastic.co, Products, and X-Pack. Machine learning is one of the key capabilities of X-Pack, which is part of the company’s Elastic Stack product.
And as always, listen to the PICNIC podcast to hear more about data, security, machine learning, and the problem in the chair, not in the computer.