A data lakehouse is a modern, open architecture that enables you to store, understand, and analyze all your data. It combines the power and richness of data warehouses with the breadth and flexibility of the most popular open source data technologies you use today. A data lakehouse can be built from the ground up on Oracle Cloud Infrastructure (OCI) to work with the latest AI frameworks and prebuilt AI services like Oracle’s language service.
Easily bring together, analyze, and find new insights from all your data, such as invoices, forms, text, audio, and video.
Learn how a data lakehouse on OCI provides an efficient, automated platform that integrates all your data (whether in a data warehouse, data lake, or application output) and adds analytics and machine learning capabilities to help you get the most value out of your data.
Learn about patterns, best practices, and architectures for deploying a lakehouse on Oracle Cloud Infrastructure.
The most successful customers engage with cloud specialists from the start. Our cloud engineers provide guidance on planning, architecting, prototyping, and managing cloud migrations, so you can move faster and with more confidence.
Oracle makes it easy to transform your organization and analytics team into a lakehouse solutions team using the skills and leveraging the investments you’ve already made. Easily extend your data warehouse and data lakes into a data lakehouse, move and modernize the data lakes you’ve built on premises, or start from your Oracle SaaS data.
Our customers can easily migrate existing open source data lakes, or build new ones, with our fully managed services such as Oracle Big Data Service and Oracle Cloud Infrastructure Data Flow. Spark, Hive, HBase, and many more services can be easily deployed and scaled on OCI.
Data Flow is a serverless Spark service that lets our customers focus on their Spark workloads without managing any infrastructure.
Oracle Autonomous Data Warehouse enables fast and scalable queries directly over any data in the object store. A single query can join data in Autonomous Data Warehouse and a data lake.
For our current data warehouse customers, this is the fastest and easiest path to transform your warehouse into a data lakehouse, enabling you to store and analyze all data, while using the applications, tools, and skills you already have.
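The idea of a single query that joins warehouse tables with data lake files can be illustrated with a small, self-contained sketch. It uses Python's built-in sqlite3 module purely as a stand-in for the warehouse and is not Autonomous Data Warehouse code. The table names, column names, and sample CSV are all hypothetical, and the sketch copies the file into a temporary table, whereas an external table in a real warehouse would read the object in place.

```python
import csv
import io
import sqlite3

# Hypothetical "data lake" object: a raw CSV file as it might sit in an object store.
LAKE_CSV = """customer_id,event,amount
1,click,0
1,purchase,40
2,purchase,15
3,click,0
"""


def total_purchases_by_customer(lake_csv: str) -> list:
    """Join curated warehouse rows with raw lake events in one SQL query."""
    conn = sqlite3.connect(":memory:")  # stand-in for the warehouse (assumption)

    # Warehouse side: a curated dimension table.
    conn.execute(
        "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "Acme"), (2, "Globex"), (3, "Initech")],
    )

    # Lake side: expose the file as a table. Here it is copied into a temp
    # table; a real external table would query the file where it lives.
    conn.execute(
        "CREATE TEMP TABLE lake_events (customer_id INTEGER, event TEXT, amount REAL)"
    )
    rows = [
        (int(r["customer_id"]), r["event"], float(r["amount"]))
        for r in csv.DictReader(io.StringIO(lake_csv))
    ]
    conn.executemany("INSERT INTO lake_events VALUES (?, ?, ?)", rows)

    # One query spanning both sources.
    result = conn.execute(
        """
        SELECT c.name, SUM(e.amount)
        FROM customers AS c
        JOIN lake_events AS e ON e.customer_id = c.customer_id
        WHERE e.event = 'purchase'
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for name, total in total_purchases_by_customer(LAKE_CSV):
        print(name, total)
```

The point of the sketch is the shape of the final query: analysts keep writing ordinary SQL, and the lake data appears as just another table in the join.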
Oracle customers want to build advanced, machine learning-based analytics over their Oracle SaaS data, or any SaaS data. Our easy-to-use data integration connectors for Oracle SaaS make it easy to create a lakehouse that analyzes your SaaS data alongside all your other data, and they reduce time to solution.
All of our data lakehouse services are built on high-scale, low-cost OCI Object Storage, use OCI Data Catalog for unified data definition, integrate easily with Oracle's AI services, and use Oracle Data Integration for scalable data ingestion and movement within the lakehouse.
Oracle Big Data Service is a Hadoop-based data lake to store and analyze large amounts of raw customer data. A managed service, Oracle Big Data Service comes with a fully integrated stack that includes both open source and Oracle value-added tools that simplify your IT operations. Oracle Big Data Service makes it easier for enterprises to manage, structure, and extract value from organization-wide data.
Oracle Cloud Infrastructure Data Flow is a fully managed Apache Spark service with no infrastructure for customer IT teams to deploy or manage. Data Flow lets developers deliver applications faster because they can focus on application development without getting distracted by operations.
Autonomous Data Warehouse is a cloud data warehouse service that eliminates the complexities of operating a data warehouse or data warehouse center, securing data, and developing data-driven applications. Oracle uses machine learning to completely automate all routine data warehousing tasks, ensuring higher performance, reliability, security, and operational efficiency.
MySQL HeatWave is the only service that enables database admins and app developers to run OLTP and OLAP workloads directly from their MySQL database. This eliminates the need for complex, time-consuming, and expensive data movement and integration with a separate analytics database.
Oracle Cloud Infrastructure Data Catalog helps data professionals across the organization search, explore, and govern data using an inventory of enterprise-wide data assets. It automatically harvests metadata across an organization’s data stores and provides a common metastore for data lakes. Data Catalog simplifies the definition of business glossaries and curated information about data assets located in Oracle Cloud Infrastructure and other locations so data consumers can easily find needed data.
Simplify your complex data extract, transform, and load processes (ETL/E-LT) into data lakes and warehouses for data science and analytics with Oracle’s modern, no-code data flow designer.
Oracle Data Integrator provides advanced data migration for extract, transform, and load. It is optimized for Oracle databases, such as Oracle Autonomous Database and Oracle Database Exadata Cloud Service, as well as on-premises databases, and it includes best-in-class support for heterogeneous sources and targets.
Oracle GoldenGate enables high-availability, real-time data integration, change data capture, data replication, transformations, and verification between operational and analytical enterprise systems.
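Change data capture, mentioned above, can be pictured as replaying a stream of row-level changes from an operational system onto an analytical copy. The following is a minimal, generic sketch of that idea in plain Python. It does not use or represent Oracle GoldenGate's API or trail format; the `ChangeEvent` type, its fields, and the `apply_changes` helper are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One captured row-level change (hypothetical shape, not a GoldenGate record)."""
    op: str                      # "insert", "update", or "delete"
    key: int                     # primary key of the affected row
    row: Optional[dict] = None   # new row image; None for deletes


def apply_changes(target: Dict[int, dict], events: List[ChangeEvent]) -> Dict[int, dict]:
    """Replay captured changes, in order, onto a copy of the target table."""
    replica = dict(target)  # leave the caller's table untouched
    for ev in events:
        if ev.op in ("insert", "update"):
            if ev.row is None:
                raise ValueError(f"{ev.op} event for key {ev.key} has no row image")
            replica[ev.key] = dict(ev.row)
        elif ev.op == "delete":
            replica.pop(ev.key, None)  # deleting a missing row is a no-op
        else:
            raise ValueError(f"unknown operation: {ev.op!r}")
    return replica


if __name__ == "__main__":
    analytical_copy = {1: {"name": "Acme", "tier": "silver"}}
    captured = [
        ChangeEvent("update", 1, {"name": "Acme", "tier": "gold"}),
        ChangeEvent("insert", 2, {"name": "Globex", "tier": "bronze"}),
        ChangeEvent("delete", 1),
    ]
    print(apply_changes(analytical_copy, captured))
```

Because only the changes travel, the analytical copy can stay close to real time without repeatedly reloading full tables. A production replication tool also handles ordering guarantees, transactions, and verification, which this sketch leaves out.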
The Streaming service is a real-time, serverless, Apache Kafka-compatible event streaming platform for developers and data scientists. Streaming is tightly integrated with Oracle Cloud Infrastructure, Oracle Database, Oracle GoldenGate, and Oracle Integration and Migration. The service also provides out-of-the-box integrations for hundreds of third-party products across categories such as DevOps, databases, big data, and SaaS applications.
Object Storage enables customers to store any type of data in its native format. This is ideal for building modern applications that require scale and flexibility, as it can be used to consolidate multiple data sources for analytics, backup, or archive purposes.
Experian improved performance by 40% and reduced costs by 60% when it moved critical data workloads from other clouds to a data lakehouse on OCI, speeding data processing and product innovation while expanding credit opportunities worldwide.
Oracle partner solutions leverage and augment data lakehouses on OCI.
You can create big data clusters with options for node shapes and storage sizes. In this workshop, you create a non-HA cluster and assign small shapes to the nodes. This cluster is perfect for testing applications.
Learn how Spark developers and data scientists can create, edit, and run Spark jobs at any scale without the need for clusters, an operations team, or highly specialized Spark knowledge.
Learn how to set up users, access, and policies, build a new catalog, and then harvest data from object storage, databases, and on-premises data sources.
Learn how to set up Data Integration, connect to data sources, ingest and transform data, and load data to object store and/or Oracle Databases.