The content of this post is solely the responsibility of the author. AT&T does not adopt or endorse any of the views, positions, or information provided by the author in this article.
The cloud has revolutionized the way we do business. It has made it possible for us to store and access data from anywhere in the world, and it has also made it possible for us to scale our businesses up or down as needed.
However, the cloud also brings with it new challenges. One of the biggest challenges is just keeping track of all of the data that is stored in the cloud. This can make it difficult to identify and respond to security incidents.
Another challenge is that the cloud is a complex environment. There are many different services and components that can be used in the cloud, and each of these services and components has different types of data stored in different ways. This can make it difficult to identify and respond to security incidents.
Finally, since cloud systems scale up and down much more dynamically than anything we’ve seen in the past, then the data we need to understand the root cause and scope of an incident can disappear in the blink of an eye.
In this blog post, we will discuss the challenges of cloud forensics and incident response, and we will also provide some tips on how to address these challenges.
How to investigate a compromise of a cloud environment
When you are investigating a compromise of a cloud environment, there are a few key steps that you should follow:
- Identify the scope of the incident: The first step is to identify the scope of the incident. This means determining which resources were affected and how the data was accessed.
- Collect evidence: The next step is to collect evidence. This includes collecting log files, network traffic, metadata, and configuration files.
- Analyze the evidence: The next step is to analyze the evidence. This means looking for signs of malicious activity and determining how the data was compromised.
- Respond to the incident and contain it: The next step is to respond to the incident. This means taking steps to mitigate the damage and prevent future incidents. For example with a compromise of an EC2 system in AWS, that may include turning off the system or updating the firewall to block all network traffic, as well as isolating any associated IAM roles by adding a DenyAll policy. Once the incident is contained, that will give you more time to investigate safely in detail.
- Document the incident: The final step is to document the incident. This includes creating a report that describes the incident, the steps that were taken to respond to the incident, and the lessons that were learned.
What data can you get access to in the cloud?
Getting access to the data required to perform an investigation to find the root cause is often harder in the cloud than it is on-prem. That’s as you often find yourself at the mercy of the data the cloud providers have decided to let you access. That said, there are a number of different resources that can be used for cloud forensics, including:
- AWS EC2: Data you can get includes snapshots of the volumes and memory dumps of the live systems. You can also get cloudtrail logs associated with the instance.
- AWS EKS: Data you can get includes audit logs and control plane logs in S3. You can also get the docker file system, which is normally a versioned filesystem called overlay2. You can also get the docker logs from containers that have been started and stopped.
- AWS ECS: You can use ecs execute or kubectl exec to grab files from the filesystem and memory.
- AWS Lambda: You can get cloud trail logs and previous versions of lambda.
- Azure Virtual Machines: You can download snapshots of the disks in VHD format.
- Azure Kubernetes Service: You can use “command invoke” to get live data from the system.
- Azure Functions: A number of different logs such as "FunctionAppLogs”.
- Google Compute Engine: You can access snapshots of the disks, downloading them in VMDK format.
- Google Kubernetes Engine: You can use kubectl exec to get data from the system.
- Google Cloud Run: A number of different logs such as the application logs.
Figure 1: The various data sources in AWS
Tips for cloud forensics and incident response
Here are a few tips for cloud forensics and incident response:
- Have a plan: The first step is to have an explicit cloud incident response plan. This means having a process in place for identifying and responding to security incidents in each cloud provider, understanding how your team will get access to the data and take the actions they need.
- Automate ruthlessly: The speed and scale of the cloud means that you don’t have the time to perform steps manually, since the data you need could easily disappear by the time you get round to responding. Use the automation capabilities of the cloud to set up rules ahead of time to execute as many as possible of the steps of your plan without human intervention.
- Train your staff: The next step is to train your staff on how to identify and respond to security incidents, especially around those issues that are highly cloud centric, like understanding how accesses and logging work.
- Use cloud-specific tools: The next step is to use the tools that are purpose built to help you to identify, collect, and analyze evidence produced by cloud providers. Simply repurposing what you use in an on-prem world is likely to fail.