Content Delivery Networks (CDNs) are a vital backbone of the internet, responsible for content delivery. These days, nearly every kind of online content relies on a CDN: news websites, articles, online shopping, videos, and social media feeds.
A CDN addresses the well-known “latency” issue, which is, by definition, the annoying delay between the moment a user requests a web page and the moment its content actually appears on the screen.
There are various reasons for this delay, but the most common one is the physical distance between the end-user and the website’s origin server. A CDN’s goal is to virtually shorten this physical distance and improve site speed and performance. CDNs improve the user experience and make more efficient use of network resources. To know more, read the article: What Is CDN?
Now, to get a clear view of how this works, let’s have a closer look at CDN infrastructure architecture.
What Is CDN Architecture?
A CDN provider configures a website to be served from specified PoPs (Points of Presence). Each PoP caches the data needed by visitors (users) within its area. As a result, visitors are directed to the closest PoP, which delivers the requested data from its cache and goes back to the origin server only when needed. All of this is made possible by the CDN architecture.
CDN architecture provides features to optimize content in different ways.
Pull zones are areas of the CDN architecture that automatically pull static content from the origin server and create a copy that is distributed and served to users on request. The CDN provider specifies which servers the CDN pulls the cached data from and the correct path for pulling it.
Push zones are the areas of the CDN architecture from which data is distributed. Content is uploaded to various storage servers and pushed to PoPs based on users’ requests.
Edge servers are the servers at the points of presence that cache data and send it to the nearest visitors (users). A CDN provider chooses the best edge servers for a client’s needs, and only the areas of the CDN architecture that match the requested locations are activated.
CDN Architecture Layers
DNS Layer: The DNS layer finds the CDN server closest to the visitor’s browser.
Reverse Proxy Layer: The reverse proxy layer stands in for the website’s server, so the browser treats it as the website’s origin server. It also adds performance and security features such as caching or firewall protection.
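To illustrate what "closest" can mean in the DNS layer, here is a minimal sketch that picks the geographically nearest PoP by great-circle distance. The PoP names, their coordinates, and the function names are assumptions made for this example. Real CDNs also weigh network topology and server load, not just geography.

```python
import math

# Hypothetical PoP locations (name -> (latitude, longitude)); illustrative only.
POPS = {
    "frankfurt": (50.11, 8.68),
    "singapore": (1.35, 103.82),
    "new_york": (40.71, -74.01),
}

def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def nearest_pop(user_location, pops=POPS):
    """Return the name of the PoP geographically closest to the user."""
    return min(pops, key=lambda name: haversine_km(user_location, pops[name]))

# A visitor in Paris would be mapped to the Frankfurt PoP in this toy setup.
print(nearest_pop((48.86, 2.35)))
```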
CDN Architecture Key Components
Most CDN architecture is designed using the following key components:
Delivery Nodes: These nodes deliver the content to the end-user. Delivery nodes are servers containing caches that run one or more content delivery applications. Generally, they are placed as close to the end-users as possible. Content can be stored in the delivery nodes manually (Push CDN). A Push CDN makes content available instantly, but the content provider must push it again every time it is updated. The delivery nodes can also request content from the origin nodes according to cache expiration rules (Pull CDN). A Pull CDN requests content from the content provider automatically, but the first delivery of each item is slower because it has to be fetched from the origin. Once the content is cached in the delivery nodes, later requests are served straight from the cache.
Storage Nodes: Storage nodes store copies of the original data, which are distributed to the delivery nodes. To support tiered caching, storage nodes can be deployed in a hierarchical model.
Origin Nodes: Origin nodes are the primary sources of the original content and enable it to be distributed across the network from the content owner’s infrastructure.
Control Nodes: Control nodes host the management, routing, and monitoring components of the CDN infrastructure architecture.
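The difference between Push and Pull delivery described above can be sketched with a toy cache. The class name DeliveryNode, the fetch_from_origin callable, and the TTL value are all assumptions made for this illustration. It is not how any particular CDN implements its delivery nodes.

```python
import time

class DeliveryNode:
    """Toy delivery node cache that supports both push and pull population."""

    def __init__(self, fetch_from_origin, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch_from_origin   # callable: path -> content
        self._ttl = ttl_seconds           # cache expiration rule for pulled content
        self._clock = clock
        self._cache = {}                  # path -> (content, expiry time)

    def push(self, path, content):
        """Push CDN: the provider uploads content ahead of any request.

        In this sketch, pushed content never expires and stays until replaced.
        """
        self._cache[path] = (content, float("inf"))

    def get(self, path):
        """Pull CDN: serve from cache, fetching from the origin on a miss or expiry."""
        entry = self._cache.get(path)
        if entry is not None and entry[1] > self._clock():
            return entry[0], "HIT"
        content = self._fetch(path)       # slow path: round trip to the origin
        self._cache[path] = (content, self._clock() + self._ttl)
        return content, "MISS"

# Example: the first pull is a miss, the second is served from the cache.
node = DeliveryNode(lambda path: "content of " + path)
print(node.get("/logo.png"))
print(node.get("/logo.png"))
```

Injecting the clock as a parameter is only a convenience for experimenting with expiry without waiting for real time to pass.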
Usually, CDN nodes are deployed in multiple locations. The number of nodes and servers depends on the CDN architecture. Some CDNs reach hundreds of nodes with hundreds of servers across their various points of presence (PoPs), while others build a global network and maintain a small number of geographical PoPs.
What Are Points of Presence?
CDN PoPs are strategically located datacenters containing numerous caching servers that use different techniques (such as SSD storage and GZip compression) to optimize file storage and content delivery. Their primary mission is to reduce round-trip time by bringing the content closer to the website’s visitors.
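As a small illustration of the compression technique mentioned above, the following sketch compresses a made-up static asset with Python's standard gzip module. The asset contents are invented for the example, and actual savings depend heavily on the file type.

```python
import gzip

# Hypothetical static asset: repetitive text such as CSS usually compresses well.
asset = b"body { margin: 0; padding: 0; }\n" * 200

compressed = gzip.compress(asset)        # what a caching server could store or send
restored = gzip.decompress(compressed)   # what the visitor's browser reconstructs

print(len(asset), "bytes before compression,", len(compressed), "bytes after")
```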
To be more specific about the CDN PoP architecture, PoPs store and cache static content. The user’s request is not sent to the origin datacenter; instead, it is served by the local PoP over a shorter distance. This decreases the RTT (Round Trip Time), resulting in smoother performance.
If the requested content is not available in the local PoP, the PoP fetches it from the origin datacenter, typically by reusing an existing TCP connection. Reusing a connection avoids the round trips of a new handshake, and its congestion window has already grown toward its maximum, so data can flow between the two endpoints at full rate right away.
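A rough back-of-the-envelope calculation shows why distance and connection reuse matter. The sketch below assumes signals travel through fiber at about 200,000 km/s (roughly two thirds of the speed of light) and ignores queuing and processing delays. The distances and round-trip counts are assumptions chosen for illustration.

```python
# Approximate signal speed in optical fiber; an assumption for this estimate.
FIBER_SPEED_KM_PER_S = 200_000

def propagation_rtt_ms(distance_km):
    """Round-trip propagation delay in milliseconds over the given one-way distance."""
    return 2 * distance_km * 1000 / FIBER_SPEED_KM_PER_S

def fetch_time_ms(distance_km, round_trips):
    """Delay before a response arrives when the exchange needs several round trips."""
    return propagation_rtt_ms(distance_km) * round_trips

# Assumed scenario: origin 6000 km away, local PoP 100 km away.
# A brand-new connection is assumed to cost 3 round trips
# (TCP handshake, TLS handshake, request/response); a reused one costs 1.
print("origin, new connection:   ", fetch_time_ms(6000, 3), "ms")
print("origin, reused connection:", fetch_time_ms(6000, 1), "ms")
print("local PoP, reused:        ", fetch_time_ms(100, 1), "ms")
```

With these assumed figures, the origin has a propagation RTT of about 60 ms against about 1 ms for the local PoP, and reusing a connection to the origin removes two of the three round trips.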
CDN PoP Architecture
Generally, PoP locations can have different designs. Modern CDNs try to hold as little state as possible. State consumes memory and registers, whereas a stateless device needs no registers and processes each packet independently. Security devices, however, hold a lot of state because they rely on stateful functions and Deep Packet Inspection (DPI). Growing amounts of state hurt performance and add complexity.
A CDN therefore performs best with a disciplined, clean state design: some state exists, but it does not interfere with packet processing.
Numerous CDN PoP architectures run equal-cost multi-path routing (ECMP) all the way down to the server hosts at the local PoP. ECMP allows packets for a single destination to be forwarded over multiple next-hop paths of equal cost. ECMP deployments provide load balancing, help datacenters scale, and remove the need to run expensive routers. Everything, including Secure Sockets Layer (SSL) connections, is terminated on the host. This is possible by deploying cheaper Layer 3 switches and standards-based BGP.
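A common way to spread traffic across equal-cost paths is to hash each flow's 5-tuple, so that all packets of one connection follow the same path. The sketch below illustrates that idea only. The function name, the use of SHA-256, and the addresses are assumptions for this example, and real switches use much simpler hardware hash functions.

```python
import hashlib

def ecmp_next_hop(src_ip, src_port, dst_ip, dst_port, protocol, next_hops):
    """Pick one of several equal-cost next hops by hashing the flow's 5-tuple."""
    flow = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}-{protocol}".encode()
    digest = hashlib.sha256(flow).digest()
    # Use the first 8 bytes of the digest as an integer and map it onto the hops.
    index = int.from_bytes(digest[:8], "big") % len(next_hops)
    return next_hops[index]

# Hypothetical server hosts behind one Layer 3 switch.
hosts = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]

# Packets of the same flow always land on the same host.
print(ecmp_next_hop("192.0.2.10", 40000, "198.51.100.1", 443, "tcp", hosts))
print(ecmp_next_hop("192.0.2.10", 40000, "198.51.100.1", 443, "tcp", hosts))
```

Keeping a flow on one host matters here because the connection (for example, an SSL session) is terminated on that host.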
The designs contain a BGP route reflector (RR), which is deployed in each CDN PoP and runs iBGP to the local hosts and eBGP out to the WAN. A BGP RR is an alternative to a full mesh of iBGP speakers, acting as a central point for iBGP sessions, which increases scalability and performance. These days, load balancing is done on pure IP in hardware, without specialist load balancing devices or expensive Layer 3 routers.
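The scalability gain of a route reflector over a full iBGP mesh comes down to simple counting. The sketch below assumes a single route reflector with every other router as its client. The function names are made up for this illustration.

```python
def full_mesh_sessions(routers):
    """iBGP sessions needed when every router peers with every other router."""
    return routers * (routers - 1) // 2

def route_reflector_sessions(routers):
    """iBGP sessions needed when one router acts as RR and the rest are its clients."""
    return routers - 1

for n in (5, 20, 100):
    print(n, "routers:", full_mesh_sessions(n), "sessions in a full mesh,",
          route_reflector_sessions(n), "with one route reflector")
```

For 20 routers this gives 190 sessions in a full mesh against 19 with a single route reflector. Production designs usually add a second reflector for redundancy, which this sketch leaves out.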
How Are Users Connected to PoPs?
Previously, the physical layout and logical design of a PoP with ECMP/BGP were discussed. Now the question is: how are users’ requests directed to the correct PoP?
There are two primary methods, described below:
DNS-Based Load Balancing
Traditionally, CDN designs had each PoP advertise a different IP address to the WAN, which told users where it was located. Now, this ability is combined with geolocation DNS-based load balancing, which sends users’ requests to one of the datacenters, each configured with a specific IP address. DNS-based services make it possible to control how users are placed and mapped to individual datacenters. This means that if a datacenter is overloaded, users won’t be sent there.
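The mapping logic can be sketched as a lookup that prefers the nearest datacenter but skips any that are overloaded. The datacenter names, load figures, region preferences, and threshold below are hypothetical. The IP addresses come from a range reserved for documentation.

```python
# Hypothetical datacenters with their current load (0.0 = idle, 1.0 = full).
DATACENTERS = {
    "eu-west": {"ip": "203.0.113.10", "load": 0.95},
    "eu-central": {"ip": "203.0.113.20", "load": 0.40},
    "us-east": {"ip": "203.0.113.30", "load": 0.55},
}

# Assumed preference order per user region, nearest datacenter first.
REGION_PREFERENCES = {
    "FR": ["eu-west", "eu-central", "us-east"],
    "DE": ["eu-central", "eu-west", "us-east"],
    "US": ["us-east", "eu-west", "eu-central"],
}

OVERLOAD_THRESHOLD = 0.90  # assumed cut-off above which a datacenter is avoided

def resolve(user_region, datacenters=DATACENTERS):
    """Return the IP of the nearest datacenter that is not overloaded."""
    preferences = REGION_PREFERENCES.get(user_region, list(datacenters))
    for name in preferences:
        if datacenters[name]["load"] < OVERLOAD_THRESHOLD:
            return datacenters[name]["ip"]
    # Every datacenter is overloaded: fall back to the first preference anyway.
    return datacenters[preferences[0]]["ip"]

# A user in France is steered away from the overloaded eu-west datacenter.
print(resolve("FR"))
```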
Anycast
Anycast uses the internet’s natural routing to select PoPs. With Anycast, each PoP is assigned the same IP address. The single IP address is advertised from multiple PoP locations, and each user’s traffic follows the routing toward the PoP that is the fewest hops away. The DNS infrastructure itself is built on Anycast. Anycast is an excellent tool for mitigating and absorbing DDoS attacks. It also avoids the correlation issues of DNS-based mapping, and it can be used with a high DNS TTL.
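The selection behaviour can be mimicked with a toy model in which every PoP advertises the same address and the user's traffic goes to whichever PoP is the fewest hops away. In reality the choice is made by routers, not by a function call, so treat this purely as an illustration. The PoP names, hop counts, and address are invented.

```python
ANYCAST_IP = "198.51.100.1"  # one shared address (documentation range), assumed

def select_pop(hops_to_pop):
    """Return the PoP with the fewest hops; ties are broken alphabetically."""
    return min(sorted(hops_to_pop), key=hops_to_pop.get)

# Hop counts from one hypothetical user to each PoP advertising ANYCAST_IP.
hops = {"amsterdam": 3, "london": 5, "frankfurt": 4}
print(ANYCAST_IP, "->", select_pop(hops))

# If the nearest PoP stops advertising the address, traffic shifts to the next one.
del hops["amsterdam"]
print(ANYCAST_IP, "->", select_pop(hops))
```

The second call shows why Anycast helps with resilience: no DNS change is needed for traffic to move to another PoP.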
To sum up, a strong CDN architecture must deliver performance, scalability, reliability, and responsiveness.
Performance: Minimizing latency is the main purpose of a CDN. A good CDN architecture should be built for optimal connectivity, with PoPs located at the major networking hubs through which data travels. A trusted CDN provider will design its network to reduce round-trip times and improve bandwidth utilization.
Scalability: CDNs are built for high-speed, high-volume routing. As a result, CDNs can handle very large amounts of traffic. A CDN architecture must offer ample networking, processing, caching, and computing resources at every server. A CDN’s DDoS protection also builds on this scalability, using dedicated servers built for attack mitigation.
Reliability: CDN infrastructure architecture is designed to make downtime statistically improbable, ensuring resilience and high availability and enabling CDN providers to commit to Service Level Agreements (SLAs). A reputable CDN like the ArvanCloud CDN service will manage internal failover, disaster recovery, and additional hardware/software redundancy.
Responsiveness: A CDN architecture with a global network must stay responsive by propagating configuration quickly. Every configuration change has to be communicated across all PoPs, and the larger and more geographically spread the network is, the more this propagation process matters.