Kevin Bogusch | Senior Competitive Intelligence Analyst | January 24, 2024
The costs associated with data egress are an unwelcome surprise many organizations face early in their cloud migration journeys. While cloud services such as virtual machines and storage have fixed pricing, the cost to move data out of the cloud is variable and often unpredictable.
When organizations learn this, they frequently consider re-architecting their applications to reduce the amount of data traveling between network segments, either using techniques such as compression and deduplication or by adding a caching layer to reduce the bits traveling across the wire. Ideally, companies moving to the cloud will design their applications with these considerations in mind to minimize data transfer fees and improve performance.
Cloud data egress refers to data that leaves a given location in a cloud provider’s network for another location. That destination could be an on-premises data center, a cloud data center in a different region, another “availability zone” (data center) in the same cloud service, or even another virtual network within the same data center.
Cloud providers vary, sometimes significantly, in how, and how much, they charge for data egress, so it’s important to understand the components that factor into their data egress calculations. Regions, for example, are locations typically consisting of multiple data centers within a tight geographic boundary. Cloud services normally provide ample network connectivity between regions to support highly available, cross-regional applications. Data moving from one region to another is metered and typically charged based on the number of gigabytes of data sent.
Availability zones are the other primary cloud service availability concept. While each cloud provider has a slightly different name for this notion, including “availability domains” or simply “zones,” the concept encompasses multiple data centers in the same geographic region that have diversity in their networking and power providers and, therefore, are unlikely to fail simultaneously. Cloud providers encourage running services across multiple availability zones for high availability, but data moving from one zone to another is metered and may be charged per gigabyte.
Organizations gain several benefits when moving IT systems into a cloud computing model. Among them are unlocking the business value of modern cloud economics, faster IT improvements, higher levels of availability, and earlier access to new technologies, such as machine learning and artificial intelligence.
But a potentially “hidden” aspect of cloud economics is data egress, which can come in multiple forms. Cloud service providers track and meter network traffic moving from one location in their clouds to another—from region to region, zone to zone, or even one virtual network to another in the same region.
Inbound traffic—that is, data coming into the cloud, aka data ingress—is almost always free of charge, no matter the volume, at every cloud provider. Outbound traffic—data egress—is treated differently. Whether data is headed to the internet, on-premises data centers, offices, or even another cloud region, charges are assessed at varying rates depending on the cloud provider, the specific service, and the network path. Typically, data egress comes at a fixed price per gigabyte up to a certain threshold, with discounts applied for data transmission beyond that level.
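The tiered model described above (a fixed per-gigabyte price up to a threshold, with lower rates beyond it) can be sketched in a few lines of Python. The rate card below is purely hypothetical and is not any provider's published pricing; the function name and tier layout are assumptions made for illustration.

```python
def tiered_egress_cost(gb, tiers):
    """Return the cost of `gb` gigabytes of egress under marginal tiers.

    `tiers` is an ordered list of (tier_size_gb, price_per_gb) pairs.
    A tier size of None means "everything beyond the previous tiers".
    """
    remaining = gb
    cost = 0.0
    for size, price in tiers:
        if remaining <= 0:
            break
        # Bill only the portion of traffic that falls inside this tier.
        billed = remaining if size is None else min(remaining, size)
        cost += billed * price
        remaining -= billed
    return cost


# Hypothetical rate card (illustrative numbers only, not real pricing).
EXAMPLE_TIERS = [
    (100, 0.00),     # assumed free allowance: first 100 GB
    (10_000, 0.09),  # assumed rate for the next 10,000 GB
    (None, 0.07),    # assumed discounted rate beyond that
]

for volume_gb in (50, 500, 10_200):
    cost = tiered_egress_cost(volume_gb, EXAMPLE_TIERS)
    print(f"{volume_gb:>6} GB -> ${cost:,.2f}")
```

Swapping in a provider's actual tiers for a given service and network path turns this into a rough first-pass estimate; real bills also depend on the destination and the service involved.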
Cloud data egress traffic can be standard network traffic from cloud virtual machines to other cloud virtual machines, such as data moving from a database to a business user’s spreadsheet. Another example, and a more significant concern in terms of cost, is traffic related to cloud-hosted storage assets used on websites or in mobile applications. A customary practice for cloud object stores is to house website assets separately from the website application; this improves performance and developer efficiency. In this architecture, a website would load images directly from a cloud object store, such as Amazon Web Services (AWS) S3, Azure Blob Storage, Google Cloud Storage, or Oracle Cloud Infrastructure (OCI) Object Storage, where those files exist as HTTP endpoints, as opposed to storing those files in a web server. When images are loaded directly from cloud storage, the application will incur charges for both read operations on the storage and the network costs associated with transferring that data from the cloud to the internet.
Busy web applications may have instances in multiple cloud regions, and storing their images in a cloud account in a single remote region can incur hefty data egress charges. While a major website or application would likely use a content delivery network (CDN) to reduce costs and improve performance, new or rapidly growing sites may not be architected to use CDNs and so may incur expensive egress charges.
These are just some examples of data egress in cloud applications. Traffic between virtual machines and database instances, virtual network peering, private links to platform-as-a-service offerings, and other architectures may all incur significant egress fees.
Cloud providers charge for data egress to recoup the infrastructure costs associated with building the large networks going into their data centers and the bandwidth between sites, which can cost millions or billions of dollars. Investing in undersea networking, as Microsoft did in 2017 when it built a 4,104-mile undersea cable between the US East Coast and Spain, is one example of cloud providers’ investments in large-scale network infrastructure to connect their data centers.
Cloud providers also continually invest to ensure the reliability and performance of these networks, and egress charges support those investments. Customers benefit from not having to plan and work with multiple providers to develop a global network. While cloud data egress charges can be frustrating, they’re part of the price organizations must pay to take advantage of the worldwide public cloud infrastructure.
A significant difference between the public cloud and on-premises deployments is that in the public cloud, all charges are on a single invoice. On the other hand, on-premises deployments typically involve separate bills for storage, servers, networking components, and services that were likely procured from multiple vendors and at different times. Organizations also pay separately for their data centers’ power and cooling or rent for a colocation facility, as well as for various software and support costs. Seeing the equivalent cloud service charges for all that on a single invoice can be a shock to unprepared IT managers.
Additionally, most of the expenses associated with on-premises deployments are fixed, while public cloud providers offer many computing solutions at variable costs. Variable costs can be beneficial because they rise and fall as computing requirements go up and down, so organizations can avoid overpaying for capacity they don’t need. While this variable cost model is often helpful for cloud services with changing demand patterns, misconfigurations or poor design can lead to outsize cloud bills.
Data egress charges are calculated in several ways, depending on the type of service and the cloud provider. However, the basic process is the same.
To get a sense of what this could mean in the real world, consider that the median size of all pages on the web was 2,315 kilobytes in June 2022, according to HTTP Archive’s Web Almanac. Suppose a cloud-hosted website comprises median-size pages and receives 10,000 monthly visitors, with each viewing two pages. That website would generate approximately 44.2 GB of cloud data egress every month:
10,000 x 2 = 20,000
20,000 x 2,315 KB = 46.3 million KB
46.3 million KB / 1,048,576 KB per GB = 44.155 GB
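The same arithmetic can be expressed as a small Python helper, which makes it easy to rerun the estimate with a site's own traffic figures. The function and parameter names are illustrative assumptions; the inputs are the figures from the example above.

```python
KB_PER_GB = 1024 * 1024  # 1,048,576 KB in one GB, as used in the text


def monthly_egress_gb(visitors, pages_per_visit, page_size_kb):
    """Estimate monthly egress in GB from page views and page weight."""
    total_kb = visitors * pages_per_visit * page_size_kb
    return total_kb / KB_PER_GB


# Figures from the example: 10,000 visitors, 2 pages each, 2,315 KB per page.
egress = monthly_egress_gb(10_000, 2, 2_315)
print(f"{egress:.3f} GB per month")
```

This estimate ignores browser caching, API calls, and downloads, so actual egress for a real site may be noticeably higher or lower.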
Next, let’s look at some real-world examples of cloud data egress costs. First is a very simple example based on Azure pricing as of June 2023 for traffic from a web server outbound to the internet.
Now let’s examine a slightly more complex scenario for an AWS customer. Developers uploaded 1 TB of data to an S3 storage bucket and then transferred 4 TB to an AWS region outside the United States and 2 TB to the internet. The costs as of June 2023 would be as follows:
In this scenario, storage costs represent less than 10% of the total monthly data egress charges. The critical point to understand is how much traffic an application or service will generate and what that will cost for the types of services being used. A high-volume website could reduce its spending in the above example by using a content delivery network (CDN) to reduce the amount of traffic flowing directly to the internet. CDNs cache data files closer to the predicted users and in the CDN providers’ own networks, so the net cost of their services is commonly far less than the data egress charges for serving the same files from the cloud every time.
It’s worth noting that Oracle Cloud Infrastructure gives customers up to 10 TB of data egress each month at no charge. So, looking at the AWS example, an OCI customer would pay only $25.24 in data storage fees for the same amount of data egress. Similarly, the $882.18 of network egress fees incurred in the Azure example would be included in the cost of the VM for OCI customers.
As seen in the above examples, data egress charges can escalate quickly. This unconstrained cost model means potential misconfigurations, such as connecting to the wrong endpoint of a geo-replicated storage account, can lead to out-of-control cloud bills. So how does an IT team manage this reality?
It’s critical to determine cloud data egress spending sooner rather than later to minimize costs. The following five tips can help:
Monitoring and managing egress fees is one challenge. Lowering them is another. To reduce egress costs, enterprise IT teams need to go a level deeper into the technical stack.
A CDN reads an organization’s website assets once and then caches them at an edge location much closer to the customer. Using a CDN means that when a customer requests an image or a file from your website, it’s served from the CDN rather than directly from your cloud-hosted web server. The costs of a CDN are much lower than data egress charges, and CDNs allow customers to experience better load times. While using a CDN may require minor changes to application code, the cost and performance benefits are almost always worth it.
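As a rough illustration of why a CDN tends to lower the bill, the sketch below compares serving all traffic directly from the cloud with serving it through a CDN at a given cache hit ratio. Every price and the hit ratio are hypothetical, and the model assumes that cache misses are billed at the normal origin egress rate, which varies by provider.

```python
def egress_cost_with_cdn(total_gb, cache_hit_ratio,
                         origin_price_per_gb, cdn_price_per_gb):
    """Estimate monthly cost when traffic is served through a CDN.

    Assumptions (illustrative only): every delivered GB is billed at the
    CDN rate, and only cache misses are fetched from the cloud origin and
    billed at the origin egress rate.
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    origin_gb = total_gb * (1.0 - cache_hit_ratio)
    return origin_gb * origin_price_per_gb + total_gb * cdn_price_per_gb


# Hypothetical inputs: 10,000 GB delivered, 90% of requests served from cache.
total_gb = 10_000
direct = total_gb * 0.09  # all traffic straight from the cloud origin
with_cdn = egress_cost_with_cdn(total_gb, 0.90,
                                origin_price_per_gb=0.09,
                                cdn_price_per_gb=0.02)
print(f"direct: ${direct:,.2f}  with CDN: ${with_cdn:,.2f}")
```

The savings depend heavily on the hit ratio: content that changes often or is rarely requested twice benefits far less than static images and scripts.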
It’s not always possible to compress network traffic, but infrastructure traffic—think VM to VM over a virtual network—can commonly be compressed for a tiny trade-off in increased CPU cycles. The incremental cost of that CPU use is orders of magnitude less than the network egress fees.
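To get a feel for the trade-off, the following sketch compresses a repetitive sample payload with Python's standard `zlib` module and reports how many bytes would cross the wire. The payload is made up for illustration; real savings depend on how compressible the traffic is, and already-compressed data such as JPEG images or video will shrink very little.

```python
import zlib


def compression_savings(payload: bytes, level: int = 6):
    """Return (original_size, compressed_size) in bytes for a payload."""
    compressed = zlib.compress(payload, level)
    # Sanity check: the receiver must be able to restore the exact bytes.
    assert zlib.decompress(compressed) == payload
    return len(payload), len(compressed)


# Hypothetical sample: repetitive JSON-like records, which compress well.
sample = b'{"status": "ok", "region": "us-east", "value": 42}\n' * 2_000

original, compressed = compression_savings(sample)
saved = 1 - compressed / original
print(f"{original} bytes -> {compressed} bytes ({saved:.1%} fewer bytes sent)")
```

Because egress is billed per gigabyte transferred, the reduction in bytes maps directly onto the egress portion of the bill, at the cost of some CPU time on both ends.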
Many cloud services, even within the same cloud provider, provide multiple ways to get nearly identical functionalities. For example, there are almost 20 ways to run a container in AWS. As part of the planning process, organizations should price each approach, including network charges, which can vary significantly.
A dedicated network connection between the cloud provider and your site can be expensive, but a private link does give you unlimited use of the connection once purchased. For organizations that can’t avoid a high volume of cloud data egress, this investment can actually reduce the total cost of ownership.
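One way to judge whether a dedicated connection pays off is a simple break-even calculation: the fixed monthly cost of the link divided by the per-gigabyte saving it provides. The sketch below assumes a flat monthly fee and a flat per-gigabyte rate on each path; all figures are hypothetical and real pricing structures are usually more involved.

```python
def break_even_gb(fixed_monthly_fee, standard_price_per_gb,
                  dedicated_price_per_gb=0.0):
    """Monthly egress volume (GB) above which a dedicated link is cheaper.

    Returns None when the dedicated path offers no per-GB saving,
    because the fixed fee can then never be recovered.
    """
    saving_per_gb = standard_price_per_gb - dedicated_price_per_gb
    if saving_per_gb <= 0:
        return None
    return fixed_monthly_fee / saving_per_gb


# Hypothetical figures: $300 per month for the link, $0.09 per GB over the
# internet, $0.02 per GB over the dedicated connection.
threshold = break_even_gb(300.0, 0.09, 0.02)
print(f"Break-even at roughly {threshold:,.0f} GB per month")
```

Organizations whose monthly egress sits well above the threshold are the ones for which the investment is likely to reduce total cost of ownership.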
For organizations running a hybrid cloud model, with a mix of on-premises and hosted systems, moving to an “all-cloud” model can reduce the amount of data flowing out of the cloud because everything is located in one place.
These changes can be difficult to implement, especially for production applications, and in some cases they require application engineers to make significant changes. But the cost savings frequently make those efforts worthwhile. Setting up a dedicated network connection, such as Azure ExpressRoute or AWS Direct Connect, can require coordination with both the organization’s cloud provider and its network provider, but both Microsoft and Amazon price data egress over these direct connections at a discount compared with the rate for normal outbound egress.
Not surprisingly, different cloud providers have different pricing structures. Azure, for example, charges for traffic from one virtual network to other connected virtual networks, even within the same cloud region. Thus, it’s well worth carefully evaluating what your data egress costs might be, especially when first migrating to the public cloud.
Oracle is well aware of the high data egress costs charged by other cloud providers and has worked to make reduced data costs a major benefit of its cloud services. Oracle Cloud Infrastructure (OCI) offers low networking prices that enable enterprises to move, at low cost, significant volumes of data, including services that frequently consume the highest amounts of bandwidth, such as live video streaming, videoconferencing, and gaming. As with other clouds, inbound data transfer is free; however, OCI offers 10 TB of data egress each month at no cost—considerably more than AWS’s 100 GB per month—allowing organizations to save significantly on data egress. Additionally, OCI’s outbound bandwidth costs are up to 25% less than AWS’s, which is a big benefit when you use services that require large amounts of bandwidth.
One of the major cloud economics considerations faced by cloud customers is the bandwidth for data egress from cloud services and networks. The associated costs can be a large surprise on the cloud bill, and the charges can be highly variable. One of the more important aspects of cloud architecture is minimizing egress costs by taking advantage of technologies, such as compression, content delivery networks, and caching layers, that can boost the performance of a site or application while also lowering costs.
Is an egress fee included in cloud storage?
No. A couple of factors contribute to the cost of cloud storage. The first and most apparent is the volume of data stored in the storage account or bucket. The second is the number of reads and writes of the data during the billing period. Both factors are affected by the tier (hot, cool, cold, or archive) of the data: the hotter the tier, the lower the cost of read and write transactions but the higher the storage cost per volume of data. These storage and transactional charges are separate from data egress fees, which are billed in addition when data leaves over the network.
What are data ingress and egress cloud charges?
Ingress and egress charges are costs associated with data transfer across cloud regions and zones, to and from the internet, and to and from on-premises networks. Data ingress (inbound to the cloud) generally is free, but all providers charge for data egress (outbound from the cloud provider). Egress charges are based on the volume of data and its destination, such as the internet, another availability zone, or another cloud region. They are referred to as unbound or unbounded charges, meaning they’re variable and can be an expensive surprise to cloud customers.
Why does AWS charge for egress?
Like other cloud providers, AWS charges for data egress across most of its services to cover the costs of building and maintaining its network. AWS doesn’t include the cost of data egress in the pricing of services. That allows Amazon to charge lower prices on services such as virtual machines and storage buckets. Customers whose applications perform high levels of data egress end up paying the bill for that network transmission infrastructure, while customers whose applications transfer less data pay a smaller proportion of the bill.