Oracle Horizon data warehouse enables self-service with Autonomous Database

Oracle’s self-service Horizon data warehouse uses Autonomous Database to enable developers to spend more time building new features.

Autonomous Database is a key component of our self-service lakehouse, now at 2.2 petabytes and growing. We use Autonomous Database because we can integrate it tightly with our data lake, because the savings in administration have enabled us to deliver more new functionality to our users, and because it has greatly simplified our self-service strategy.

Ron Sielinski, Vice President Engineering, Data Warehouse, Oracle

Business challenges

Water cooler conversations can be interesting, but when Oracle Cloud Infrastructure (OCI) teams want to know what’s really going on, they go to the Horizon data warehouse. Horizon is the repository of record for over 150 different teams, including engineering, support, finance, capacity planning, and more. Those teams can generate reports or perform analysis on all available data, not just their own. It contains telemetry, log files, application state data, revenue, consumption, and much more.

A key design philosophy of Horizon was to “get out of the way” of users. So almost all aspects of Horizon are self-service, including report generation, onboarding new users, and data governance. Because each partner has different data needs, even loading and transforming new data is self-service. Teams just go to the Horizon portal and identify the data source, format, and transformations needed; the ingest and the extract, transform, and load (ETL) work are then done for them, with no coding required. All this self-service helps to cut down on support work and improve partner satisfaction. In fact, this creates a virtuous cycle where improved satisfaction and results mean more teams want to join, bringing new data and capabilities for everybody.
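As a rough illustration of what such a self-service request might capture, the sketch below models the three inputs named above (data source, format, and transformations) as a small Python record with basic validation. Everything in it is hypothetical: the `PipelineRequest` class, the `SUPPORTED_FORMATS` set, and the example source URI are assumptions made for illustration, not part of the Horizon portal.

```python
from dataclasses import dataclass

# Hypothetical list of accepted formats; the article does not say which
# formats the Horizon portal supports.
SUPPORTED_FORMATS = {"csv", "json", "parquet"}


@dataclass(frozen=True)
class PipelineRequest:
    """Illustrative record of what a team declares for a new pipeline."""

    source: str                            # where the data comes from
    data_format: str                       # how the data is encoded
    transformations: tuple[str, ...] = ()  # named steps to apply, in order

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the request is usable."""
        problems = []
        if not self.source.strip():
            problems.append("source is required")
        if self.data_format.lower() not in SUPPORTED_FORMATS:
            problems.append(f"unsupported format: {self.data_format}")
        return problems


# Example with a made-up source URI and a made-up transformation name.
request = PipelineRequest("oci://example-bucket/logs", "json", ("drop_nulls",))
print(request.validate())
```

The point of a declarative request like this is that the team describes what it needs and the platform decides how to run it, which is what makes a "no coding required" workflow possible.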

Why Oracle chose Autonomous Database

Horizon was built from the ground up alongside OCI, so the first version was built on-premises. As OCI gained more capabilities, the system was moved to the cloud. That on-premises data warehouse needed 1.5 to 2 full-time people to maintain and operate it. At the time of the migration, this was a large fraction of the available resources in the Horizon data warehouse team. Oracle wanted to allocate those resources to developing new functionality, not just keeping the lights on. So Oracle Autonomous Database was selected for analytics and data warehousing because it automated the tasks that were then being performed manually. It also greatly reduced the work needed to make the data warehouse support the self-service strategy of Horizon.

Results

Today, the Horizon data warehouse is provisioned with 120 OCPUs and has over 200 TB of data under management, with a further 2 petabytes in object storage. Thanks to the self-service capabilities, there are more than 1,800 data pipelines and 1,000 analytics dashboards meeting the data and analytics needs of 150-plus partners. Adoption grew over 50% in the last year, and Horizon currently has more than 2,000 users every month.

Horizon is built using a lakehouse architecture. The data lake is used for raw data, transformations, and staging. Data is moved as needed into the data warehouse, and older data, typically more than 6 months old, is archived back to the data lake. Archived data is still accessible (via external tables) so partners can continue to use their existing tools to run analytics on older data.
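The age-based tiering described above can be sketched as a simple rule. This is a minimal illustration, not Horizon's actual implementation: the function names and tier labels are hypothetical, and the fixed six-month cutoff is an assumption (the text says only "typically more than 6 months old").

```python
import calendar
from datetime import date

ARCHIVE_AFTER_MONTHS = 6  # assumed fixed cutoff; the article says "typically"


def months_ago(today: date, months: int) -> date:
    """Return the date `months` calendar months before `today`.

    The day is clamped to the last day of the target month, so that
    31 August minus 6 months gives 28 (or 29) February.
    """
    total = today.year * 12 + (today.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    day = min(today.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def storage_tier(data_date: date, today: date) -> str:
    """Classify data as living in the warehouse or archived to the data lake."""
    cutoff = months_ago(today, ARCHIVE_AFTER_MONTHS)
    # Data strictly older than the cutoff is archived; archived data would
    # still be queryable from the warehouse through external tables.
    return "data_lake_archive" if data_date < cutoff else "warehouse"


print(storage_tier(date(2022, 1, 15), today=date(2022, 4, 29)))
print(storage_tier(date(2021, 6, 1), today=date(2022, 4, 29)))
```

Because archived data remains reachable through external tables, a rule like this changes where the data is stored without changing how partners query it.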

Building a lakehouse requires a whole platform, not just a data lake and a data warehouse. Horizon uses over 10 OCI tools, including OCI Data Flow, OCI Data Catalog, and many other infrastructure services. The Horizon self-service model already supports Oracle Analytics Cloud for analysts. The company is investigating adding machine learning services to simplify access for data scientists.

Horizon is a business-critical system. If it’s not working, then the business sees an immediate impact. So Horizon is protected in two ways. First, Autonomous Data Guard provides failover. Like Horizon itself, Data Guard is self-service: one click and you’re done. But being up and running is necessary, not sufficient. So second, to ensure suitable levels of performance, autoscaling is enabled (also a single click) so additional OCPUs are added autonomously when needed to meet transient demand.

Administering the database no longer requires the up to two full-time employees that the earlier, non-autonomous data warehouse needed. The value that brings to the Horizon data warehouse team is much more than a cost savings. Those people are now part of a larger group extending and modernizing the system, or adding new functionality. The practical effect is that the team can now accept most feature requests, while partners are happier with the service provided, which is reflected in significantly increased adoption. Perhaps best of all, there are now no more late-night calls to address database issues, so everybody sleeps better.

Autonomous Database is a core component of an enterprise-scale lakehouse, with well over 2 petabytes of data serving more than 150 teams. As it does with every customer, it saves on administration costs and downtime. But the big story is what that savings enables. Autonomous operations have been key to the self-service model that has been so successful at increasing adoption and user satisfaction. And because the Horizon data warehouse team has been freed up to do higher-value work supporting the business, rather than performing database administration, it has also helped to increase team morale.

Published: April 29, 2022