Alan Zeichick | Content Strategist | May 31, 2023
Putting business data to good use requires a return-on-investment calculation—just as surely as investing in a factory, an office expansion, or an R&D effort does. Our organizations can’t operate without data—data about customers, products, transactions, employees, finances, the economy, and competitors. We need that data to grow and thrive. Yet high-quality data comes with a price tag to acquire, store, manage, secure, and analyze it. The more data companies have, the better they can serve customers and collaborate with partners, and the more time, effort, and resources they must invest in the entire data ecosystem. Businesses benefit from consistently treating data with this kind of ROI mindset.
This article explores the cost side of the data ROI equation, focusing on ways to control and minimize the costs of acquiring, storing, securing, and using that data.
Data costs are the expenses associated with acquiring, maintaining, securing, and using business data. Many of those costs are clear-cut. The data itself has to live somewhere, whether on-premises on a hard drive or storage array, for example, or in cloud-based storage (which itself consists of physical hard drives). There’s software to organize that data, such as a content management system, relational database, data warehouse or data lake, or other structure; that software carries commercial license costs or, for open source solutions, subscription and support costs. The data must be backed up, requiring additional storage and software to manage those backups and to enable limited recovery if some data is lost or full restoration if there’s a physical disaster.
There also may be license fees or other costs to acquire data from a third-party provider. There needs to be security and access controls, perhaps to conform to industry or government regulations and to address privacy concerns. There are costs associated with validating the data as well as ensuring or improving the quality of the data, such as by correcting out-of-date information.
There also may be costs to make full use of the data, which requires software for user interfaces, analytics, and reports and even deep learning or artificial intelligence software to discover insights.
Finally, there are costs tied to performance and scalability. When data grows from megabytes to terabytes or even petabytes, it requires sophisticated software, careful planning, and potentially automation tools to maintain and use that data, plus the hardware to store and access it at large scale. And for each one of the data costs noted above, companies must hire skilled people to manage and operate their data management tools.
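To make these categories concrete, here’s a minimal sketch, in Python, of an annual data cost and ROI model; every line item and figure is a hypothetical placeholder for illustration, not a benchmark.

```python
# Hypothetical annual data cost model; every figure below is a
# placeholder to illustrate the cost categories, not a benchmark.
ANNUAL_COSTS = {
    "storage": 120_000,           # on-premises arrays or cloud storage
    "software_licenses": 90_000,  # database, warehouse, backup tooling
    "third_party_data": 30_000,   # licensed external datasets
    "security_compliance": 45_000,
    "backup_and_dr": 25_000,      # backup storage and disaster recovery
    "analytics_tools": 40_000,
    "staff": 250_000,             # DBAs, data engineers, analysts
}

def total_data_cost(costs: dict[str, float]) -> float:
    """Sum all cost line items for the year."""
    return sum(costs.values())

def data_roi(annual_value: float, costs: dict[str, float]) -> float:
    """Classic ROI: (value - cost) / cost."""
    cost = total_data_cost(costs)
    return (annual_value - cost) / cost

if __name__ == "__main__":
    print(f"Total annual data cost: ${total_data_cost(ANNUAL_COSTS):,.0f}")
    print(f"ROI at $900k of annual value: {data_roi(900_000, ANNUAL_COSTS):.1%}")
```

With these placeholder numbers, the total comes to $600,000 a year and a 50% ROI at $900,000 of value delivered; the point of the exercise is simply to force every cost category onto the same ledger as the value the data produces.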
Key Takeaways
Minimizing data costs starts with understanding what kind of data an organization has. Some of that data is relational—that is, it can be thought of as living in rows and columns. Other data is unstructured and might consist of documents, images, videos, and binary files. Once an organization understands the data assets it has, the next step is determining the best format for storing them—a relational database, NoSQL database, document repository, etc.—and considering database consolidation opportunities. It’s also essential to know where the data comes from, where it resides, and where and how it will be used.
Once an organization understands its data and where best to store it, the next step is to adopt a flexible data architecture that accounts for all of those data sources and uses and lets the organization optimize data acquisition, management, storage, and analysis. A key element of this architecture is finding the right data governance model to determine how the data will be used. Another is choosing the right on-premises or cloud data management systems to minimize costs while maximizing performance, flexibility, security, and usefulness. Together, these steps give an organization the ability to assess the value and use of any tranche of data and take the right steps to minimize the costs of delivering that value.
No matter how much data a business holds today, there’s more coming in every day, perhaps every second. Much of that data is needed to drive business operations, conduct transactions, serve customers and partners, empower management, drive financial reporting, and ensure compliance. Some of it, though, might be of very little value. Below are 11 ways to minimize the costs of acquiring, transforming, storing, securing, and using all that data. In some cases, these steps might lead to indirect savings, rather than direct budget reductions, due to increased business agility, staff productivity, or other efficiencies.
1. Determine the most appropriate data management systems based on your anticipated use cases and data volumes, considering, for example, transactional databases, data warehouses, data lakes, and machine learning tools. Consolidating data and workloads into fewer databases can reduce costs for software licensing and data management; choosing the best type of data storage and management technology can lower costs by reducing the work needed to create and maintain integrations.
2. Cloud-based data management systems may offer scalability and manageability beyond that of an on-premises system, at a lower total cost, with the advantages of better resiliency, connectivity, security, and management services. The cloud can also lower staffing costs for infrastructure management.
3. Manual processes for data management are hard to scale and prone to human error and inconsistently applied policies. Automated processes, such as those found in an autonomous database, offer predictability and strong security along with labor cost savings.
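As one illustration of what such automation replaces, here’s a minimal Python sketch of a scheduled retention job that applies a single policy consistently; the table and column names are hypothetical, and a production system would rely on the database’s own scheduler or an autonomous service’s built-in automation rather than a hand-rolled script.

```python
# Minimal sketch: automatically archive rows older than a retention
# window, replacing an error-prone manual cleanup. Table and column
# names ("orders", "orders_archive", "created_at") are hypothetical;
# assumes orders_archive already exists with the same schema as orders.
import sqlite3
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 365

def apply_retention(conn: sqlite3.Connection) -> int:
    """Move expired rows to an archive table, then delete them."""
    cutoff = (datetime.now(timezone.utc)
              - timedelta(days=RETENTION_DAYS)).isoformat()
    with conn:  # one transaction: archive and delete together, or not at all
        conn.execute(
            "INSERT INTO orders_archive SELECT * FROM orders WHERE created_at < ?",
            (cutoff,),
        )
        cur = conn.execute("DELETE FROM orders WHERE created_at < ?", (cutoff,))
    return cur.rowcount  # number of rows archived on this run
```

Run on a schedule, a job like this enforces the same policy every time, which is exactly what manual cleanups tend not to do.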
4. Data governance policies describe how your organization optimizes and secures its data, as well as how it can leverage that data to support business operations. Strong data governance policies can eliminate data redundancies, among other advantages, meaning that less data needs to be stored, backed up, and analyzed.
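As a small, hedged illustration of the redundancy point, this Python sketch deduplicates customer records on a normalized key; the field names and records are hypothetical.

```python
# Sketch: remove redundant customer records by normalizing a key field
# and keeping the first occurrence. Field names and data are hypothetical.
import pandas as pd

customers = pd.DataFrame({
    "email": ["pat@example.com", "PAT@example.com ", "lee@example.com"],
    "name":  ["Pat Jones", "Pat Jones", "Lee Chen"],
})

# Normalize the key so trivially different duplicates match.
customers["email_key"] = customers["email"].str.strip().str.lower()

deduped = customers.drop_duplicates(subset="email_key", keep="first")
print(f"Removed {len(customers) - len(deduped)} redundant record(s)")
```

Every duplicate removed is one less record to store, back up, secure, and analyze.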
5. Using a leading open source database system can deliver many advantages, including a large, diverse developer community; reliability; a wide ecosystem of tools and software; the ability to customize the software; and, of course, lower software licensing costs. Whether open source lowers your total costs requires careful financial analysis. Managed cloud services based on open source software offer another option for tapping into these advantages.
6. Data is what you need to run day-to-day transactions and operations. That’s a vital start, but the real competitive edge comes from analytics. Analysis turns your data into insights that can help you spot trends, lower operating costs, increase revenue, and better serve your customers. This could include big data initiatives that use AI to cull insights from large and diverse data stores. A word of caution: Data analysis should increase the “return” side of your ROI equation, but it won’t likely lower your total data management costs, since you’re adding the cost of analytical tools.
7. Data cleansing involves correcting errors and inconsistencies in your data’s rows and columns, according to both industry-standard and customized rules. While raw, uncorrected data may be fine for transactions, data analysis is more accurate—and more useful—when the data is clean. Clean data can also take less effort (and expense) to analyze. Be wary, however, of overselling the cost-saving advantages of data hygiene: the amount of data removed likely isn’t huge, and cleansing itself has a cost, so the benefit here comes mostly from better analysis rather than lower costs.
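Here’s a minimal Python sketch of rule-based cleansing with pandas; the rules, column names, and values are hypothetical examples rather than industry standards, and the format="mixed" option assumes pandas 2.0 or later.

```python
# Sketch of rule-based data cleansing with pandas; the rules and
# column names here are hypothetical examples, not industry standards.
import pandas as pd

raw = pd.DataFrame({
    "order_date": ["2023-05-01", "05/02/2023", "not a date"],
    "country":    ["US", "usa ", "United States"],
    "amount":     ["100.00", " 250 ", None],
})

COUNTRY_MAP = {"us": "US", "usa": "US", "united states": "US"}

clean = raw.copy()
# Rule 1: parse dates in mixed formats; unparseable values become NaT.
# (format="mixed" requires pandas >= 2.0.)
clean["order_date"] = pd.to_datetime(clean["order_date"],
                                     format="mixed", errors="coerce")
# Rule 2: standardize country names to one canonical code.
clean["country"] = (clean["country"].str.strip().str.lower()
                    .map(COUNTRY_MAP).fillna(clean["country"]))
# Rule 3: coerce amounts to numbers; values that fail become NaN.
clean["amount"] = pd.to_numeric(clean["amount"].str.strip(), errors="coerce")

# Flag rows that failed any rule for human review.
bad_rows = clean[clean.isna().any(axis=1)]
print(f"{len(bad_rows)} row(s) flagged for review")
```

The pattern matters more than the specific rules: codify each cleansing rule once, apply it everywhere, and route the failures to people instead of silently dropping them.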
8. Whether data operations are on-premises or in the cloud, network traffic analysis shows where things are working efficiently—and where there are unnecessary bottlenecks. Monitoring usage and network activity helps you identify where configuration changes can boost performance and user productivity. Network monitoring might also spot cases where data access is consuming excess compute and storage resources, pointing to an opportunity for a more efficient architecture that lowers costs.
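As a small illustration of mining usage data for bottlenecks, this Python sketch aggregates a hypothetical query log to surface the tables consuming the most time; a real deployment would read from your database’s or network monitor’s own logs rather than this made-up CSV format.

```python
# Sketch: find the tables consuming the most query time in a
# hypothetical log (CSV with columns: query_id, table_name, duration_ms).
import csv
from collections import defaultdict

def slowest_tables(log_path: str, top_n: int = 5) -> list[tuple[str, float, int]]:
    """Return (table, total duration in ms, query count), worst first."""
    totals: dict[str, float] = defaultdict(float)
    counts: dict[str, int] = defaultdict(int)
    with open(log_path, newline="") as f:
        for row in csv.DictReader(f):
            totals[row["table_name"]] += float(row["duration_ms"])
            counts[row["table_name"]] += 1
    # Rank by total time consumed: the likely bottleneck candidates.
    ranked = sorted(
        ((t, totals[t], counts[t]) for t in totals),
        key=lambda item: item[1],
        reverse=True,
    )
    return ranked[:top_n]

# Usage: for table, ms, n in slowest_tables("query_log.csv"): print(table, ms, n)
```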
9. Where is your data coming from? Where do you get the data you rely upon the most? Analyzing and then visualizing the lineage of your key data can help you optimize data governance to leverage this data most efficiently, whether it’s generated internally or comes from outside sources—especially with big data. Again, this probably isn’t a huge money saver, but it may spot unneeded or underused third-party data you’re paying for.
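A hedged sketch of the lineage idea: given a hypothetical mapping of datasets to their upstream sources, a short traversal in Python reveals every feed a key report actually depends on, including paid third-party data.

```python
# Sketch: trace the lineage of a dataset through a hypothetical
# dataset -> upstream-sources mapping. All names are made up.
LINEAGE = {
    "revenue_dashboard": ["sales_mart"],
    "sales_mart": ["orders_db", "vendor_feed"],
    "orders_db": [],
    "vendor_feed": [],  # paid third-party data
}

def upstream_sources(dataset: str) -> set[str]:
    """Return every dataset the given one transitively depends on."""
    seen: set[str] = set()
    stack = [dataset]
    while stack:
        for parent in LINEAGE.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

print(upstream_sources("revenue_dashboard"))
# -> {'sales_mart', 'orders_db', 'vendor_feed'}
```

Any paid source that never shows up in the lineage of a dataset anyone uses is a candidate for cancellation.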
10. You can manage your data architecture, servers, resources, and applications yourself—or you can let a specialist handle those technical needs for you. This lets you focus on your business, rather than on the intricacies of data management, with greater efficiency and lower risk. Plus, the specialized staff and tools of a service provider may be able to do the job at a lower cost. It’s worth running the numbers.
11. Some parts of your business are very dependent on data—but which data is crucial? How is the data being used? Where and when is it being used? Who’s using it? Use these insights to guide the best use of your technology resources and data management budget.
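One way to act on those questions: rank datasets by access frequency and recency to find cold data that could move to cheaper storage. This Python sketch uses hypothetical access statistics, and the thresholds are illustrative, not recommendations.

```python
# Sketch: rank datasets by how recently and how often they're used,
# to find cold data that could be archived. All records are hypothetical.
from datetime import date

access_stats = [  # (dataset, accesses in last 90 days, last access date)
    ("orders", 12_400, date(2023, 5, 30)),
    ("parts_catalog", 310, date(2023, 5, 12)),
    ("legacy_exports", 2, date(2022, 11, 3)),
]

today = date(2023, 5, 31)
for name, hits, last in sorted(access_stats, key=lambda rec: rec[1]):
    idle_days = (today - last).days
    # Illustrative thresholds: rarely touched and long idle -> archive.
    tier = "archive candidate" if hits < 10 and idle_days > 90 else "keep hot"
    print(f"{name}: {hits} accesses, idle {idle_days} days -> {tier}")
```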
The goal of a data cost reduction program is to help you do more with lower costs: Gain greater business insights and operating responsiveness from your data while spending less to manage that data.
Many organizations—large and small—are reducing the cost of data by leveraging the cloud and modern data architectures.
Data helps your business function, supporting everything from billing to transaction logs, from documents to parts catalogs, from price lists to inventory. Using that operating data more effectively unlocks new opportunities. But every day, that data is growing—and with it the cost. Fortunately, you can take steps to minimize your cost of data while still driving business growth and improving efficiency.
HeatWave lets you use automated and integrated generative AI and machine learning in one cloud service for transactions and lakehouse-scale analytics. Companies can eliminate the cost and complexity of separate analytics and vector databases, machine learning services, and ETL processes—while avoiding the latency and security risks of data movement between data stores. With built-in, machine learning–powered automation, developers and DBAs can save significant time, further increase performance, and reduce costs. HeatWave is available on Oracle Cloud Infrastructure (OCI), Amazon Web Services (AWS), Microsoft Azure, and in customers’ data centers with OCI Dedicated Region.
HeatWave Lakehouse also delivers significantly better query performance and price-performance. Many fast-growing organizations use HeatWave to simplify their data infrastructure and reduce their data management costs while improving performance, scalability, security, and productivity.
What is the first step in exiting a data center?
When you are planning to exit a data center, conduct a thorough survey of applications, data, services, users, and security requirements. Everything on that survey will require a migration plan, whether it’s to “lift and shift” the existing applications and data into the cloud, choose new applications, or build new applications from scratch.
What is the lifespan of equipment in a data center?
Major parts of data center infrastructure, such as HVAC (heating, ventilation, and air conditioning) systems, power distribution, and physical security systems, could last a decade or longer with regular maintenance. Computing equipment, such as servers, routers, switches, and storage, is good for three to five years, as a rule of thumb, before becoming obsolete.
Who is responsible for security in the cloud?
Physical security of the cloud infrastructure—the servers, network infrastructure, and so on—is managed by the cloud providers. Responsibility for securing the software and services is shared between the cloud provider and the enterprise.
How long does it take to exit a data center?
Plan for a full data center exit to take months; a larger IT infrastructure could take years. It all depends on the size of the data center, its complexity, and the amount of data. Much of that time will be consumed by taking a thorough inventory, developing plans, creating and testing new software (if required), and training. As with moving offices, the actual migration and exit is a relatively short phase once all the planning is complete.
Learn how to take advantage of generative AI, build machine learning models, query data in object storage, or explore other HeatWave topics of interest.