As my knowledge and experience of Cloud networking grew from designing network architectures over time and also more of lately from reviewing client network architectures, I've come to realise and appreciate the need to designing a proper network architecture that includes the long-term considerations, as early as possible - especially before a projects begins and definately before any resources are deployed into any VPCs.
In the past, I didn't have much of an interest into how a network was configured in an office, or how routing to a publicly accessible on-premise hosted application was set up when I first started in IT, this is mainly due to understanding very little about networking and also just because the networking was looked after by somebody else. It is only when I started using Cloud services where it allowed me to learn networking, much easier, perhaps because I was able to design, build and play around with my own dedicated isolated network in minutes without worrying about breaking things.
In this blog we will illustrate what a Monolithic Subnet looks like, and the problems that comes along with them; and illustrate one way to break down a Monolithic Subnet into multiple smaller Subnets - how solutions can benefit from designing workloads to leverage dedicated individual Subnets where each Subnet is for a set of common resources type or grouping. VPCs is also susceptible to becoming monolithics so I will write a separate blog about it in the future. Workloads should always be deployed across multiple AZs architecture for high-availability but to make this blog more digestable we will talk about a one AZ architecture.
Monolithic Subnet example
We often see systems evolve over time whether they are applications or databases, that get to a point where they are too big to run, maintain or work with: these systems are commonly known as Monolithic Applications or Monolithic Databases.
Networks and the constructs of Networking can also be susceptible to becoming a monolith, early signs and symptoms could be: 1) CIDR block based rules in Security Groups or NACLs encompasses IP addresses of resources that should not be opened up to: a cause of this may be due to the number of different groups of resources within a Subnet where the IP addresses of each resource is non-deterministic – it may be difficult to design a minimum set of CIDR block values for rules to satisfy the least privilege principle. 2) conversely, CIDR block rules in Security Groups or NACLs with too many granular rules may also be a sign of a Monolithic Subnet – the common symptom are quotas of rules being reached too often.
Let’s take the example of 4 groups of compute resources, each group has a different network traffic usage behaviour than the other groups – group #1 communicates with resources in VPC X and VPC Y, while group #3 communicates with resources in VPC Y and VPC Z.
This is an example of how the groups of resources could be represented in a Subnet ordered by their Private IP address:
Often CIDR block rules that are too broad are used, which opens access to resources that should not be included. The following rules also allows in resources Groups #2 and #4 to communicate with resources in VPC X, Y and Z, when they are not expected to interaction with resources in any of those VPCs.
Conversely, implementing granular rules to follow best practice of least privilege may lead to quotas of Security Groups and NACLs to be reached; in any case, least privilege should be followed.
Solution
The solution is to break the groups of resources down into a Subnet for each Group. There is no hard rule that states a VPC must contain X number of tiers of Subnets - Subnets are used to group similar resources with similar network traffic patterns, if there are many groupings of resources then it is perfectly fine to create as many number of tiers of Subnets – one Subnet for each Grouping.
As a result, rules are more specific, targeted and makes it straight forward to implement the least privilege principle.
When groups of resources are broken down from a monolith Subnet into multiple Subnets, there are other benefits created as a by-product:
- With each resource group deployed in a separate dedicated Subnet early on it will likely reduce or eliminate (a good solution is to not have a problem to begin with) future re-work that combats increased architecture complexity, which may often require re-deploying resources into new Subnets - to me this is unnecessary effort if we can avoid it, especially for resources that requires a lot of manual effort to deploy
- NACLs rules are broken down and grouped into its respective resources and Subnet, which leads to fewer number of rules in a NACL – reduce possibility of reaching the quotas
- When all resources are deployed within one Subnet only Security Groups could be leveraged to implement firewall rules, but when resources are broken down into multiple Subnets then NACLs can be leveraged as well
- Security Posture is improved because certain traffic does not enter the Subnet from adjacent Subnets if NACL rules are implemented appropriately
- Depending on how granular you break down your monolithic Subnets, if it is a very fine break down then you are setting up your network architecture to be in a position to implement tighter controls gearing towards a micro-segmentation network architecture
This solution compliments the use of networking solutions in other blogs I have written:
- Maintain a Prefix List of EC2 Private IP Addresses using EventBridge
- Work-around for cross-account Transit Gateway Security Group Reference
- AWS Prefix List
- Swiss Cheese Network Security: Factorising Security Group Rules into NACLs and Security Group Rules