Defense in Depth in the Cloud

Oct 9 / Irina Zarzu
We once ran into a situation where a security group was left wide open: it allowed all traffic, even to our Amazon ECS cluster. At first, this seemed acceptable since only the internal team members had IAM permissions to interact with the cluster. But we quickly realized the problem: if credentials were ever compromised, the SG was already exposing everything at the network level.

Instead of relying only on SGs, we tightened our defenses through network segmentation. A Network ACL was configured to explicitly allow only trusted subnets and IP ranges to reach the ECS port, while denying all other traffic.

Furthermore, we added an IAM IP-based policy to deny requests made from any IP address outside the trusted ranges. IAM evaluates that condition on every AWS API call, independently of the packet filtering inside the VPC. The IAM layer would still block unauthorized users, but the NACL provided an additional safeguard, a way to stop malicious traffic before it could ever reach the security group.

This layered approach gave us confidence: even if one layer failed, another stood ready to enforce the boundary.



Overview: Why Private Subnets Are Not Enough


In the previous article we tackled subnet segmentation: why and when to use private and public subnets. In this article we explore the next steps for enforcing security: security groups, NACLs, and IAM IP-based access control policies.

By default, AWS does not classify subnets as "public" or "private". It is up to the architect to define these through configuration such as route tables and internet access. Security groups can provide strong instance-level protection if they are configured properly. However, used alone, they do not offer layered security: a single misconfiguration can expose internal resources directly to the internet.

To ensure data protection in your VPC, you need to implement several security layers, starting with network segmentation. If a bad actor passes through one layer, the next layer is there to stop them. In security, we can never say that we have enforced enough layers and that resources are fully protected.

What we can do instead is make life harder for anyone who tries to breach our environment, and make their attempts far less likely to succeed.

Moreover, if a breach does happen and we are asked how the attacker managed to exfiltrate data, under no circumstances should the answer be: "The resources were deployed in public subnets with ingress rules set to all." I would propose to save our honor by starting with the basics: subnet segmentation, properly configured security groups and NACLs, and IAM policies that filter requests by source IP address.

1. Security Groups

A security group acts as a virtual firewall that controls inbound and outbound traffic at the instance level. A security group has two types of rules:
  • Inbound / ingress rules - specify which incoming traffic is allowed.
  • Outbound / egress rules - specify which traffic is allowed to leave the instances.

Security groups are stateful: if a user or another service is authorized to send traffic to an instance, the response is automatically allowed, regardless of the outbound rules. This simplifies many use cases, like allowing a server to receive HTTP requests and respond without needing a matching egress rule.
It’s fundamental to note that security groups support only "Allow" rules; there is no "Deny" rule as in NACLs. So, instead of blocking traffic explicitly, you enforce security by not allowing it in the first place. Regarding the evaluation logic, AWS checks all rules together and allows the traffic if any one rule matches.
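
The allow-only, any-rule-matches logic can be illustrated with a small local model. This is only a sketch: the `IngressRule` class, the `ingress_allowed` function, and the sample rules are hypothetical names made up for illustration, and statefulness is not modeled.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class IngressRule:
    """One allow rule: protocol, inclusive port range, and source CIDR."""
    protocol: str
    from_port: int
    to_port: int
    cidr: str


def ingress_allowed(rules, protocol, port, source_ip):
    """Return True if any rule matches; there is no deny rule and no ordering."""
    src = ip_address(source_ip)
    return any(
        rule.protocol == protocol
        and rule.from_port <= port <= rule.to_port
        and src in ip_network(rule.cidr)
        for rule in rules
    )


# Hypothetical web-tier group: HTTPS from anywhere, SSH from a VPN range only.
web_sg = [
    IngressRule("tcp", 443, 443, "0.0.0.0/0"),
    IngressRule("tcp", 22, 22, "10.8.0.0/24"),
]

print(ingress_allowed(web_sg, "tcp", 443, "198.51.100.7"))  # HTTPS from the internet
print(ingress_allowed(web_sg, "tcp", 22, "198.51.100.7"))   # SSH from outside the VPN
print(ingress_allowed(web_sg, "tcp", 22, "10.8.0.15"))      # SSH from inside the VPN
```

Because nothing can be denied explicitly, the only way to block the SSH attempt from outside the VPN is to make sure no rule covers it.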

From a cloud security perspective, this is one of the foundational layers of defense. For me, it comes down to least privilege: think of security groups as enforcers of least privilege at the network level of your workloads. You should apply the principle not only to IAM roles, but also to network traffic.

Best practices to apply when you create security groups:
  • Do not allow inbound from ::/0 or 0.0.0.0/0 to remote server administration ports (SSH - 22 or RDP - 3389); even better, opt for Session Manager.
  • Remove all inbound and outbound rules from the default security group of every VPC, so that it permits no traffic.
  • Only allow the minimum required ports (e.g., 443 instead of 0–65535).
  • Scope access to specific IP ranges or VPC peers - cidr_blocks, source_security_group_id, or my favorite, the AWS prefix_list_ids to restrict to AWS services.
  • Use different security groups per tier (e.g., frontend, backend, DB) to isolate behavior.
  • Ensure all changes to security groups are monitored: VPC Flow Logs + CloudWatch Logs Insights/Kinesis/your SIEM; GuardDuty findings for exposed ports/port scans; CloudTrail for SG changes (and alert on public-open).
In well-designed environments, security groups complement subnet design: public subnets expose only the minimum interfaces (like a public ALB), while everything behind it is protected with tightly scoped, well-structured security groups.
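
The first best practice above lends itself to an automated check. The sketch below assumes the rules are available as dicts shaped like the `IpPermissions` entries of the EC2 DescribeSecurityGroups response; the function name `open_admin_ports` and the sample data are hypothetical.

```python
ADMIN_PORTS = {22, 3389}            # SSH and RDP
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}  # "anywhere" for IPv4 and IPv6


def open_admin_ports(ip_permissions):
    """Return the admin ports that the given ingress permissions open to the world."""
    exposed = set()
    for perm in ip_permissions:
        cidrs = {r["CidrIp"] for r in perm.get("IpRanges", [])}
        cidrs |= {r["CidrIpv6"] for r in perm.get("Ipv6Ranges", [])}
        if not cidrs & OPEN_CIDRS:
            continue  # this permission is scoped to specific ranges
        protocol = perm.get("IpProtocol")
        if protocol == "-1":
            exposed |= ADMIN_PORTS  # "-1" means all protocols and all ports
            continue
        if protocol not in ("tcp", "udp"):
            continue  # e.g. ICMP, where the port fields carry type/code instead
        low, high = perm.get("FromPort"), perm.get("ToPort")
        if low is None or high is None:
            continue
        exposed |= {port for port in ADMIN_PORTS if low <= port <= high}
    return sorted(exposed)


# Hypothetical sample: HTTPS open to all, every TCP port open to ::/0, SSH from a VPN.
sample_permissions = [
    {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
     "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
    {"IpProtocol": "tcp", "FromPort": 0, "ToPort": 65535,
     "Ipv6Ranges": [{"CidrIpv6": "::/0"}]},
    {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
     "IpRanges": [{"CidrIp": "10.8.0.0/24"}]},
]

print(open_admin_ports(sample_permissions))
```

In a real account the input would presumably come from a `describe_security_groups` call; here the check is kept offline so it can be run and tested without credentials.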

Below are some scenarios where security groups are commonly applied:

  1. An RDS instance hosts sensitive data. Its security group allows traffic only from the backend servers' security group. Even if someone knows the RDS endpoint, they can’t connect to the database unless their traffic comes from a resource in the trusted SG.
  2. A bastion host is used for SSH into private EC2 instances. Its security group allows port 22 from the corporate VPN CIDR only. An attacker who scans the public IP cannot connect, because they are not inside the VPN.

2. Network Access Control Lists (NACLs)

After implementing subnet-level segmentation and applying Security Groups at the resource level, the next network defense layer to apply is the Network Access Control List (NACL), a stateless firewall applied at the subnet level.

Unlike Security Groups, which are attached to resources like EC2 or RDS, NACLs are associated with entire subnets, filtering traffic entering or leaving all resources within that subnet. They act as a first line of control before the traffic reaches the instance’s security group.
Each NACL contains a set of rules for both:
  • Inbound traffic - allow only the exact ports and peers that must reach resources in the subnet (e.g., 443 from web/ALB CIDRs); everything else is denied by default.
  • Outbound traffic - permit only required destinations and ports (e.g., 443 to VPC endpoints or specific services) plus the necessary ephemeral ports to those peers.
Each rule specifies:
  • Protocol - you should specify the exact protocol (tcp/udp/icmp) instead of -1 (all), so only the intended traffic type is permitted.
  • Port range - use the service ports for forward traffic and the minimal ephemeral range (typically 1024–65535) for return paths.
  • Source/destination CIDR block - a best practice is to choose the smallest possible CIDRs (peer subnets or known ranges) and avoid 0.0.0.0/0.
  • Action: Allow or Deny - I prefer explicit allow rules for required flows and rely on the implicit deny.
  • Rule number - determines the evaluation order (first match wins), starting from the lowest-numbered rule. For example, if two rules numbered 100 and 200 both match a packet, rule 100 is applied and rule 200 is never evaluated.
What I would like to highlight is that unlike Security Groups, NACLs support both Allow and Deny rules. This makes them useful when you want to explicitly block certain traffic, for example, blacklisting a malicious IP range.
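
The first-match-wins ordering is easy to get wrong, so here is a minimal local model of it. It is a sketch under simplifying assumptions: the rule tuples, the `evaluate_nacl` function, and the CIDR ranges are hypothetical, and only protocol, port, and source CIDR are compared.

```python
from ipaddress import ip_address, ip_network

# Each entry: (rule number, action, protocol, (low port, high port), source CIDR).
# Hypothetical inbound rules for a private application subnet.
inbound_rules = [
    (200, "allow", "tcp", (443, 443), "10.0.0.0/16"),
    (100, "deny", "tcp", (0, 65535), "10.0.66.0/24"),  # explicitly blocked range
]


def evaluate_nacl(rules, protocol, port, source_ip):
    """Return the action of the lowest-numbered matching rule.

    If no numbered rule matches, the packet is denied, as the final
    catch-all rule of a network ACL would do.
    """
    src = ip_address(source_ip)
    for _number, action, proto, (low, high), cidr in sorted(rules, key=lambda r: r[0]):
        if proto == protocol and low <= port <= high and src in ip_network(cidr):
            return action  # first match wins; later rules are not consulted
    return "deny"


print(evaluate_nacl(inbound_rules, "tcp", 443, "10.0.66.9"))  # hits rule 100 first
print(evaluate_nacl(inbound_rules, "tcp", 443, "10.0.1.20"))  # falls through to rule 200
print(evaluate_nacl(inbound_rules, "tcp", 22, "10.0.1.20"))   # no rule matches
```

The blocked range sits inside the allowed /16, which is why the deny rule needs the lower number: if the numbers were swapped, the allow rule would match first and the deny would never be reached.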

In addition to the above best practices for Network ACLs, the CIS AWS benchmark recommends:
  • Do not allow inbound from 0.0.0.0/0 to remote server administration ports (SSH - 22 or RDP - 3389). This means that if you do not use SSM and have ports 22 and 3389 open, restrict the inbound traffic to trusted IPs only.
  • Monitor the changes made to NACLs: real-time monitoring can be achieved by directing CloudTrail logs to CloudWatch Logs or another SIEM. It is important to set alarms, because when one is triggered, you know that something malicious might be happening.

Keep in mind that NACLs are stateless, meaning the return traffic must be explicitly allowed. If an inbound rule allows traffic on port 443, you must also create an outbound rule that lets the response leave; the response travels to the client's ephemeral port, not to port 443. As you can imagine, this can be powerful, but it also adds complexity if it is not documented or tested correctly.
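
A tiny model makes the stateless behavior concrete. This is a deliberately simplified sketch: it looks at ports only and ignores protocol and CIDR, and the function names and the client port 50123 are hypothetical.

```python
def port_allowed(allowed_ranges, port):
    """True if the port falls inside any (low, high) range; otherwise implicit deny."""
    return any(low <= port <= high for low, high in allowed_ranges)


def round_trip_ok(inbound_ranges, outbound_ranges, service_port, client_port):
    """Check both directions separately, as a stateless filter does.

    The request is matched inbound on the service port; the reply is matched
    outbound on the client's ephemeral port. Nothing is remembered in between.
    """
    request_passes = port_allowed(inbound_ranges, service_port)
    reply_passes = port_allowed(outbound_ranges, client_port)
    return request_passes and reply_passes


inbound = [(443, 443)]            # HTTPS requests into the subnet
outbound_missing = []             # no outbound rule: replies are dropped
outbound_fixed = [(1024, 65535)]  # ephemeral range for return traffic

print(round_trip_ok(inbound, outbound_missing, 443, 50123))
print(round_trip_ok(inbound, outbound_fixed, 443, 50123))
```

Note that an outbound rule for port 443 alone would not help here, because the reply is addressed to the client's ephemeral port.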

For a better understanding, here are some examples of how NACLs can strengthen your network defenses:

  1. An infected EC2 instance tries to exfiltrate data over a non-standard port (TCP 9000). The Network ACL only allows outbound traffic on port 443 and on ephemeral ports to known peer CIDRs. The connection is dropped because NACLs apply to outbound traffic as well as inbound, and TCP port 9000 to an unknown destination matches no allow rule.
  2. An incident response team suspects malicious behavior in a subnet where multiple EC2 instances are running. To quarantine the whole subnet quickly, without hunting down and updating each security group, they update the NACL to deny all inbound and outbound traffic.

3. IAM Access Control Based on IP Address 

In addition to subnet-based controls, security groups and NACLs, you can enforce security at the identity level using AWS IAM identity-based policies that restrict access to resources based on the user’s IP address.

In my opinion, this layer is often overlooked, and I was glad to include it in my research. Please keep it in mind, it’s more important than you might think, and it can also appear in AWS exams.

Applying this type of policy denies all requests that do not come from the specified IP addresses / CIDR ranges. The aws:SourceIp condition key matches the public IP address a request comes from (it is available only for public IP address ranges), and aws:ViaAWSService lets you exempt calls that an AWS service makes on your behalf (service-to-service calls). Together they allow you to:
  • Permit access only from trusted corporate networks, VPNs, or specific office IPs.
  • Deny access attempts from unauthorized IP ranges.
  • Control access to AWS services that sit outside the VPC (regional services typically reached through public endpoints, such as S3, CloudFront, or Secrets Manager), where access must be controlled at the API / IAM level.
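
A policy of this kind could look like the sketch below, which builds the JSON document in Python. It follows the common pattern of an explicit deny with a NotIpAddress condition; the function name and the Sid are hypothetical, and the CIDR blocks are the documentation ranges this article uses as examples. Since the statement only denies, it grants nothing by itself: the permissions a user needs must still be allowed elsewhere.

```python
import json


def deny_outside_trusted_ips(trusted_cidrs):
    """Build an identity-based policy that denies requests from other public IPs."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideTrustedIps",  # hypothetical statement ID
                "Effect": "Deny",
                "Action": "*",
                "Resource": "*",
                "Condition": {
                    # Deny when the caller's public IP is NOT in the trusted ranges...
                    "NotIpAddress": {"aws:SourceIp": list(trusted_cidrs)},
                    # ...unless an AWS service is making the call on the principal's behalf.
                    "Bool": {"aws:ViaAWSService": "false"},
                },
            }
        ],
    }


policy = deny_outside_trusted_ips(["203.0.113.0/24", "198.51.100.0/24"])
print(json.dumps(policy, indent=2))
```

Both condition operators must be true for the deny to apply, so calls that AWS services make on the principal's behalf are not caught by the IP restriction.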

Such a policy adds another critical layer of defense: even if a subnet or security group is misconfigured, unauthorized access can still be blocked at the identity level. IAM policies are evaluated on every AWS API request, separately from the packet filtering done in the VPC. They enforce identity-based access, so even a request that the network layer lets through (VPC subnets, security groups, Network ACLs) is blocked at the identity level if it fails the aws:SourceIp condition. The condition compares the requester's IP with the authorized IPs listed in the policy. For example, if a malicious actor manages to get credentials and tries to use them from an outside IP, they will not be able to call AWS services or access data, because their address does not belong to the CIDR blocks listed in the policy (for example, "203.0.113.0/24" and "198.51.100.0/24").

If a bad actor manages to bypass the NACLs and security groups, they will be stopped by the IAM IP-based policy, because their IP is not part of the range the policy allows.
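
The comparison performed by the condition can be approximated locally with Python's `ipaddress` module. This is only an illustration of the logic, not how IAM is implemented; the function name `request_denied` and the test addresses are hypothetical.

```python
from ipaddress import ip_address, ip_network

# Documentation ranges standing in for the trusted office / VPN networks.
TRUSTED_CIDRS = ("203.0.113.0/24", "198.51.100.0/24")


def request_denied(source_ip, via_aws_service=False, trusted=TRUSTED_CIDRS):
    """Mimic the deny condition: source IP outside every trusted range AND
    the call is not made by an AWS service on the principal's behalf."""
    addr = ip_address(source_ip)
    outside = not any(addr in ip_network(cidr) for cidr in trusted)
    return outside and not via_aws_service


print(request_denied("203.0.113.25"))                      # inside the office range
print(request_denied("192.0.2.44"))                        # unknown address
print(request_denied("192.0.2.44", via_aws_service=True))  # service-to-service call
```

Stolen credentials used from the unknown address fall into the second case: the keys are valid, but the request still matches the deny.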

I believe we all understand better with examples, so here is how IP-based policies add an extra layer of protection:

  1. A developer accidentally commits IAM access keys to GitHub. An attacker finds them within minutes and tries to run AWS CLI commands from their laptop in another country. The IAM policy checks the attacker’s IP and sees it’s not from the company VPN range, so access is denied.
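  2. An S3 bucket’s access policy was left wide open by mistake. Normally, this would expose data to the internet. A policy attached only to IAM users and roles constrains those principals and would not stop anonymous requests, so the same IP restriction has to be enforced as an explicit deny that also covers the bucket. With that deny in place, requests that do not come from the office IP block are rejected, and no external attacker can exploit the misconfiguration.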

Conclusion

While subnet segmentation is the foundation of network isolation in AWS, it is not enough on its own. Security groups, NACLs, and IAM policies each play a distinct role in enforcing access control, from instance-level to subnet-level to identity-level protection. What makes a cloud environment secure is not a perfect configuration, but a defense-in-depth strategy where each layer compensates for the possible failure of the others.