Event Recording

John Tolbert - The Information Protection Lifecycle

Name: John Tolbert - The Information Protection Lifecycle
Uploaded: 2019-12-12T12:00:00+01:00
Duration: 20 min 56 s

John Tolbert

Lead Analyst

KuppingerCole

Posted on Dec 12, 2019

Too often those of us in the cybersecurity space get wrapped up in comparing, deploying, and managing point solutions. While this is a necessary consequence of both the fragmented nature of the market and the highly specialized nature of our work, sometimes we need to step back and look at the big picture. What kind of information am I charged with protecting? How can I discover and keep track of it all? What kinds of controls can I apply? How can data be protected in different environments, on different platforms, etc? We'll look at the various stages in the life and death of information and how to best manage and protect it.

Transcript

Well, good morning. Welcome back to day two of our Cybersecurity Leadership Summit. Hope you enjoyed yesterday and learned a lot. I certainly had fun; I thought it was a lot of good material. So I'm John Tolbert, Lead Analyst here at KuppingerCole. And today I wanted to start off by talking about a new concept we're introducing, called the information protection life cycle and/or framework. I think it's a little bit of both, because I talk about the beginning and end of data.

But some of my colleagues pointed out that it's a little bit like a framework too, because I try to address the different kinds of tools that are involved in each of these six phases. So just at a high level here: we start with asset discovery and classification, and then access control; encryption, masking, and tokenization; detection, that being detection of threats; deception, which is kind of a new up-and-coming area; and then disposition. You know, I think it is one of the more difficult areas, how to do a full information protection plan. It's very complex.

You know, there are a variety of different data formats that you have to consider: structured versus unstructured data, data associated with applications, distributed storage. You know, it can be anywhere: on desktops, mobiles, your network, the cloud. And then there's just a lack of consistency and interoperability between the different kinds of tools that are out there for protecting information. I showed a couple of these slides in one of my sessions yesterday, but I think it's important to look at some of the recent trends around motivations for cybercrime and cyber espionage.

You see, as of September, the numbers were, you know, 84% cybercrime. So that's a lot of fraud perpetrated against individuals as well as businesses. About 13% was cyber espionage, and that would be, you know, corporations spying on one another and, in some cases, state actors trying to steal information from companies as well. And then looking at some of the attack techniques: you know, we are often focused on malware, the preventing thereof, the detection thereof.

And I think rightly so, because you can see 48% of attacks involved malware of some kind, followed by about 13 and a half percent account hijacking, and then targeted attacks are almost 11%. And targeted attacks are definitely the APT, advanced persistent threat, kinds of attacks. If you look at the targets, you can see there's a slice of the pie for everyone, you know, depending on what industry you're in.

Individuals are still being hit pretty hard, but then it goes all across the range of public administration, defense, social security, healthcare, finance, manufacturing. So everybody's experiencing some form of cyber attack or another. So about this life cycle: we have to start with discovery and classification, 'cause you can't really protect what you don't know you have.

So in many cases, tools will allow you to do things like metadata tagging, or placing metadata associated with a file that might contain its classification or categorization, so that you can then use that in access control decisions. If it's data inside a database, then it gets more complex, because you have to either create other tables or other ways of representing the sensitivity of data inside the database itself. Access control: here I mean really granular authorization. And then encryption, tokenization, and masking.

We all know we should be encrypting things as much as possible: in transit, in use, and at rest. And then there's also the possibility of redacting data or pseudonymizing data. Detection.

You know, historically we've been able to do a pretty good job of looking at things on the network. Thanks to things like CASB, which we'll mention more in a minute, you can do that in the cloud. EDR now allows us to have more visibility on the endpoint. Deception: this is interesting, because there are now quite a few, I think about a dozen, different vendors that do what I'd call distributed deception platforms. The idea there is to put up fake resources to lure bad guys to look at the decoys rather than your important data, and sort of trap them there and figure out what they're doing.

And then lastly, dispose of it: archive or delete the data when you no longer need it. So, discovery and classification. I think, you know, this is becoming a little bit easier now, actually, thanks to things like GDPR, where we're required to do data privacy impact assessments, things like that. So organizations are taking a look at the data that they have on hand and categorizing it.
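As a rough illustration of the metadata tagging idea in this phase, here is a minimal Python sketch that scans file contents and attaches a classification label. The patterns, the label names (`restricted`, `confidential`, `internal`), and the `classify` helper are all assumptions made up for this example; real discovery and classification tools are far more sophisticated.

```python
import re
from dataclasses import dataclass, field

# Hypothetical patterns and labels, chosen only for illustration.
PATTERNS = {
    "restricted": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),       # looks like a US SSN
    "confidential": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),  # looks like an email address
}
# Labels ordered from most to least sensitive.
ORDER = ["restricted", "confidential"]


@dataclass
class TaggedFile:
    name: str
    classification: str
    matched: list = field(default_factory=list)


def classify(name: str, text: str) -> TaggedFile:
    """Tag a file with the most sensitive label whose pattern appears in it."""
    matched = [label for label in ORDER if PATTERNS[label].search(text)]
    classification = matched[0] if matched else "internal"
    return TaggedFile(name, classification, matched)
```

The resulting `classification` value is the kind of attribute an access control system could later consume when deciding who may open the file.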

And again, for metadata tagging, that's using tools to, hopefully automatically, sort through data files and figure out, you know, how they should be classified, which then can be paired with an access control system and appropriate decisions made about that. And yeah, with databases, you know, it's not just SQL anymore. As we all know, there are a lot of NoSQL types of databases too, and that makes it even more complex to adequately represent the sensitivity of the data inside the database. So, access control. You might be familiar with this model.

This is the good old XACML reference model, where we have policy information points (in many cases that's like an LDAP repository); policy administration points, where business users write the policies that govern who can get access to what; and policy enforcement points, which mediate the decisions.

So a user will, let's say, have a request to get access to some document on a web server. The PEP, like plug-in code, intercepts that, sends it off to the PDP, and asks: should this user be able to get access to this? You know, and that relies on looking at attributes about the user, attributes about the resources themselves, and then various environmental factors. What makes it more complicated is that there are quite a few different access control protocols and standards that are already out there. We see a lot of usage of JSON for this: JWT, JSON Web Tokens. OAuth is very pervasive.

You know, people think of SAML as mostly for federated authentication, but I know a lot of organizations will, you know, draw additional attributes and put 'em in a SAML assertion so that you can do authorization and access control based off of the SAML assertion too. And then there's UMA, or User-Managed Access, which was designed to enable users to have a greater say in what happens, you know, more specifically to their PII. And then last but not least, our good old friend XACML, which I was proud to be involved with for a while, but, you know, I think it was so complex it really didn't catch on as well as it should have.

So in thinking about doing access control at the desktop level, thinking about using DLP programs: you know, you need to install an agent. Users would, let's say, try to download data files to thumb drives or upload 'em to different websites. You can sort of have more control over that.

If you have, like, a DLP policy administration point and then put agents on the machines, you can stop those things by policy. DLP is another kind of technology that I think had a lot of promise but got bogged down because of complexity. On the mobile side, you can sort of do the same thing with unified endpoint management or mobile device management solutions. But again, that requires some policy infrastructure in order to be able to stop those kinds of things. CASB stands for cloud access security broker.

And I think this is an important component here because, as we've heard many, many times before from many people, there are lots of companies out there that have shadow IT: organizations that, you know, maybe are associated with different lines of business, and they just don't feel like they're getting what they need out of corporate IT. So they run out and, you know, use infrastructure as a service or software as a service, and IT doesn't even know what's going on. So you can use CASB for the discovery phase and in access control here as well.

But I think the real power of CASB is in allowing you to design access control policies that can kind of mirror what you do on your on-premises systems too. So you can actually use CASB solutions to do the access control and serve as that enforcement point for all the cloud services that your organization may be using. And to me, this makes it a pretty critical IPLC, or information protection life cycle, component. You've gotta be able to know where your data is, control it, and keep those policies consistent with what you're doing inside your organization.
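The enforcement-point and decision-point flow described in the access control discussion, whether the enforcement point is a web plug-in, a DLP agent, or a CASB, can be sketched in a few lines of Python. This is only an attribute-based toy under stated assumptions: the `Request` type, the clearance ranking, and the rule that restricted data requires the corporate network are invented for illustration and are not taken from XACML or from any product.

```python
from dataclasses import dataclass


@dataclass
class Request:
    subject: dict      # attributes about the user
    resource: dict     # attributes about the resource, e.g. its classification tag
    environment: dict  # environmental factors, e.g. network location


# Hypothetical ranking of classification labels and clearances.
CLEARANCE_RANK = {"internal": 0, "confidential": 1, "restricted": 2}


def pdp_decide(req: Request) -> str:
    """Policy decision point: evaluate attributes and return "Permit" or "Deny"."""
    clearance = CLEARANCE_RANK.get(req.subject.get("clearance"), -1)
    # Unknown classifications get an unreachable rank, so the PDP fails closed.
    needed = CLEARANCE_RANK.get(req.resource.get("classification"), 99)
    if clearance < needed:
        return "Deny"
    # Assumed rule: restricted data is only served on the corporate network.
    if needed >= CLEARANCE_RANK["restricted"] and not req.environment.get(
        "on_corporate_network", False
    ):
        return "Deny"
    return "Permit"


def pep_fetch(req: Request, documents: dict) -> str:
    """Policy enforcement point: intercept the request and ask the PDP first."""
    if pdp_decide(req) != "Permit":
        raise PermissionError("access denied by policy")
    return documents[req.resource["id"]]
```

Keeping the decision logic in one place and calling it from every enforcement point is what lets cloud and on-premises policies stay consistent.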

So: encryption, tokenization, masking. Encryption in transit: you know, more and more we see this being used effectively. SSL's outdated; we should all be using TLS 1.2 or better. But also think about the risks of implementation errors, remembering Heartbleed especially. There is still widespread use of unencrypted protocols inside networks. It's better to replace those with things like SSH or SFTP. Application-level encryption is great too.

Things like S/MIME for email, and then encrypting your network tunnels with either SSH or IPsec. And then just enforce encryption, again by policy wherever possible, to block unencrypted protocols on your firewalls or your borders inside your network segments. Encryption at rest: you know, there are several different ways of looking at that. There's disk encryption, you know, that's kind of designed to keep disks safe if they get removed from machines, but then you've got file-level encryption.

Again, it's better to do this by policy, maybe leveraging those metadata attributes that were applied during the discovery and classification phase, and making sure that you encrypt the things that are important to you. Database encryption: again, there are different kinds of technologies. Sometimes they may use proprietary infrastructure, but the encryption needs to be standards-based. And then figuring out the right mix of encryption, depending on the kinds of things that you use in your organization, is something to be considered. So we've encrypted it at rest. We've encrypted it in transit.
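On the encryption-in-transit point (TLS 1.2 or better, enforced by policy), one way a Python client can refuse older protocol versions is to set a floor on its TLS context. This is a minimal sketch using the standard library `ssl` module; the helper name `strict_client_context` is made up for this example.

```python
import ssl


def strict_client_context() -> ssl.SSLContext:
    """Build a client-side TLS context that refuses anything older than TLS 1.2."""
    # create_default_context() enables certificate and hostname verification.
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx
```

The context can then be handed to standard clients, for example `urllib.request.urlopen(url, context=strict_client_context())`, so that a handshake with a server offering only older protocol versions fails instead of silently downgrading.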

What about when we use it? There are some, kind of, academic usages of what's called homomorphic encryption, and that's like working on encrypted data without decrypting it. It doesn't really scale that well yet. Some organizations, I know, use what we call secure enclaves. So that's like a very secure work environment you've gotta VPN into, and all the data is kept inside that environment. That can be costly and difficult to maintain. It's not something that you can really do in a cloud or a SaaS environment that well either.

Masking, pseudonymization, and tokenization: at heart, these are all forms of data substitution. You've got the irreversible kinds. Pseudonymization: this is especially around, you know, making sure that you can't associate data with a particular user. Static masking: you know, I was involved in a project years ago where we were doing this, you know, in the US, to get rid of Social Security numbers. So you effectively destroy the Social Security number except for the last four digits.

On the reversible side, there are format-preserving encryption types. In the payment card industry, they've used vaulted tokenization for many years, but we see a trend now more toward the vaultless tokenization kind. This kind of alleviates the risk from certain parties in the payment transaction ecosystem. So there's a lot of value in being able to do that. Detection.
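As a rough sketch of two of the data substitution techniques just described, the Python below shows static masking of a Social Security number (irreversible, keeping only the last four digits) and a toy vaulted tokenizer (reversible, but only through a lookup in the vault). The function and class names, the `tok_` prefix, and the in-memory dictionaries are assumptions for illustration; a production vault would be a hardened, access-controlled service, and vaultless schemes work differently.

```python
import secrets


def mask_ssn(ssn: str) -> str:
    """Static masking: irreversibly keep only the last four digits."""
    digits = [c for c in ssn if c.isdigit()]
    if len(digits) != 9:
        raise ValueError("expected a 9-digit SSN")
    return "***-**-" + "".join(digits[-4:])


class TokenVault:
    """Vaulted tokenization: random tokens, with the mapping held in a vault."""

    def __init__(self):
        self._by_token = {}
        self._by_value = {}

    def tokenize(self, value: str) -> str:
        # Reuse the existing token so the same value always maps to one token.
        if value in self._by_value:
            return self._by_value[value]
        token = "tok_" + secrets.token_hex(8)
        self._by_token[token] = value
        self._by_value[value] = token
        return token

    def detokenize(self, token: str) -> str:
        # Only a party with access to the vault can reverse the substitution.
        return self._by_token[token]
```

Systems that only ever see the token carry no card data of their own, which is how tokenization takes risk away from parties in the transaction chain.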

So, you know, probably for 20 years we've had things like intrusion detection systems. Some are network-based, some are host-based, and, you know, I think people have gotten value out of those, but at the same time they're really, really difficult to fine-tune. It takes a lot of security analysts sitting around trying to figure out what's good and what's bad, and then being able to not be overwhelmed with false positives.

So IDS systems are kind of tried and true, but they're also kind of tired, you know. So there are newer solutions that have been around for a while that tend to be called, now, network threat detection and response. The main difference there is, instead of being based on a lot of static rules, they use machine learning algorithms to crunch through tons and tons of data and develop a baseline and then look for anomalies.
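The "develop a baseline, then look for anomalies" idea can be shown with a deliberately simple statistical stand-in. This sketch is not machine learning and is not how any particular product works; it just flags observations that sit more than a chosen number of standard deviations from the baseline mean. The function name, the threshold of 3.0, and the idea of feeding it something like connections per minute are assumptions for illustration.

```python
from statistics import mean, stdev


def find_anomalies(baseline, observed, threshold=3.0):
    """Return observed values that deviate strongly from the baseline.

    baseline: historical measurements considered normal (at least two values)
    observed: new measurements to check
    threshold: how many standard deviations count as anomalous
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: anything different is unusual.
        return [x for x in observed if x != mu]
    return [x for x in observed if abs(x - mu) / sigma > threshold]
```

With a baseline of `[100, 102, 98, 101, 99, 100, 103, 97]` (mean 100, sample standard deviation 2), the observations `[101, 99, 250, 104]` yield `[250]`. Unlike a static rule, the cutoff moves with whatever the baseline data looks like.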

So there are definitely some advantages in the newer approaches with network threat detection and response. On the server side, you've got SIEM, you know, dumping all the server logs into a huge data lake where you can then automate processes to look for bad things that may have happened. EDR, like I was mentioning earlier, I think is a nice tool. It's not really for everybody, because it's really complex to run, but it also gives you visibility into the endpoint that you would not have had otherwise.

I mean, you can capture lots of great stuff going across the network, but EDR allows you to instrument endpoints and collect information, like around process launches, process injections, registry changes, and then be able to see the big picture across your entire enterprise. So I think there's value in each of these newer technologies, but AI and ML notwithstanding, they still require very knowledgeable human analysts to get any value out of. And I thought I'd mention the MITRE ATT&CK framework, 'cause everybody's heard of MITRE ATT&CK.

It's kind of an evolution past, you know, the Lockheed Martin Cyber Kill Chain, which was very front-loaded on prevention. Here you see, with the MITRE ATT&CK framework, there's just a little bit about prevention; it's mostly about detecting and responding.

So I recommend checking out the MITRE ATT&CK page, because there's lots of information about, you know, all the different kinds of attacks that can happen. So, deception. Think about our old honeypots.

You know, somebody might stick a server outside the network and just see what happens to it. That idea has evolved considerably, and now we have, like I said, about a dozen vendors that have full-blown distributed deception platforms that are designed to draw attackers into these environments so that you can watch them and keep them away from, you know, your really good stuff that you don't want attackers messing around with. It also gives you an opportunity to learn about their TTPs, and it can actually be used to complement all these other monitoring technologies.

It's not a replacement for that. So how it works is, you know, prior to using deception technologies... The hackers always wear hoods, I guess. I don't really understand that, but whenever I see a person with a hood on the street, I assume they've gotta be a hacker.

Now, you know, they'd compromise an endpoint, get on your network, and then they'd kind of have free rein around the inside of your network. Then let's think about using deception technology. You put up decoys: maybe something that looks like an admin workstation, something that looks like a really valuable server, and then things that look like IoT or SCADA devices, so that you draw them into that. And you keep these invisible on your network so that, you know, your regular users aren't gonna find these by accident.
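Because regular users have no reason to find the decoys, any connection to one is worth a look. A minimal Python sketch of that logic, working over connection records rather than live traffic, might look like the following. The decoy addresses, descriptions, and the `Connection` record are hypothetical; commercial deception platforms deploy and manage decoys for you and do much more than this.

```python
from dataclasses import dataclass

# Hypothetical decoy inventory: address -> what the decoy pretends to be.
DECOYS = {
    "10.0.9.10": "fake admin workstation",
    "10.0.9.20": "fake SCADA controller",
}


@dataclass
class Connection:
    src: str
    dst: str
    port: int


def decoy_alerts(connections):
    """Return an alert for every connection whose destination is a decoy."""
    alerts = []
    for c in connections:
        if c.dst in DECOYS:
            alerts.append(
                f"{c.src} touched decoy {c.dst} ({DECOYS[c.dst]}) on port {c.port}"
            )
    return alerts
```

There is no threshold or baseline to tune here: legitimate traffic should never reach a decoy, so each hit is an alert on its own.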

So the moment you get activity in this decoy network, you know you've got somebody poking around that shouldn't be. Then you can watch and figure out what's going on and hopefully keep them away from your real assets. Motivations for this: there are some areas where, you know, you really can't put, like, EDR agents on a lot of devices. OT devices, SCADA devices, they don't support a lot of the security tools that we are accustomed to using.

This would be great in the power grid, or for many medical devices. I've heard from some vendors that they're getting brand-new, like, MRI machines from various places around the world that come pre-installed with malware. And there's no way you can put an agent on an X-ray machine. So having this duplicate network can help you figure out if there are bad guys in your hospital doing things that they shouldn't be doing. And it's gotten enough recognition that even NIST is now saying consider using this. So I'll try to speed up here at the end. Strengths and weaknesses.

You know, there are a number of things. It helps you cut the mean time to respond. A lot of surveys show that, you know, the bad guys will be on your network for more than six months before they're discovered. So the idea here is you find them and get rid of them faster. It still takes very knowledgeable people to run it; that's the threat. The weakness is that it can be difficult to implement and administer, and at this stage it can be difficult to convince executives to spend money on a deception kind of project. But the opportunities are great.

You know, the market shows it's a really growing kind of technology. So lastly, disposition. You probably already know where I'm going here.

You know, data retention laws govern how long we have to keep certain kinds of data, and then what we should do with it. Quickly, I would just say, you know, archive the things that you don't actively need every day, and then use the principles of data minimization: get rid of what you don't need. Just simply get rid of it and reduce your risk, and reduce your risk from regulatory exposure as well. So I put this together. It's incomplete.

I haven't finished it, but I'm trying to think of the tools that map to each one of these components. Feel free to hit me up afterward. I'll be around all day, or email me later. This is a work in progress. I think there's a lot of value in trying to put together something that's focused on information protection, because we've got cybersecurity frameworks of different kinds, and ISO, but I just want something that's really focused on: what do we do? How do we protect our most critical information? And with that, I will bring up Clara Jordan.
