Oracle Cloud Infrastructure AI Blueprints is a suite of prepackaged, OCI-verified blueprints that provides consistent, repeatable deployments of GenAI workloads in minutes, with built-in observability.
Oracle Cloud Infrastructure AI Blueprints helps you deploy, scale, and monitor AI workloads in production in minutes. AI Blueprints are OCI-verified, no-code deployment blueprints for popular GenAI workloads. They include clear hardware recommendations with NVIDIA GPUs; opinionated software stack components, such as NVIDIA NIM; and prepackaged observability tools. This enables you to deploy AI workloads without having to make software stack decisions or manually provision the infrastructure. You can also leverage AI Blueprints' advanced infrastructure features, such as multi-instance GPUs or autoscaling based on inference latency, with a few simple configuration changes. With these capabilities, AI Blueprints reduces GPU onboarding time for scaled, mission-critical deployments from weeks to minutes.
OCI AI Blueprints are available to any OCI user for free.
OCI AI Blueprints is available on GitHub, where you will find directions for how to:
To test an OCI AI Blueprint, create a separate compartment and OCI Kubernetes Engine cluster. Deploying OCI AI Blueprints in the newly created compartment isolates any potential impact to your tenancy.
The following containers and resources are deployed in the tenancy:
All available blueprints are listed here.
To run an inference benchmarking blueprint, deploy a vLLM blueprint, then use a tool such as LLMPerf to run benchmarks against your inference endpoint.
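At its core, this kind of benchmarking collects per-request latencies against the endpoint and summarizes them. As a rough sketch of the summary such a tool produces (the timing values below are invented for illustration; LLMPerf itself reports richer metrics, such as time-to-first-token and inter-token latency):

```python
import statistics

# Hypothetical per-request end-to-end latencies (seconds) collected
# from repeated calls to a vLLM inference endpoint.
latencies = [0.82, 0.91, 1.05, 0.88, 1.21, 0.95, 1.02, 0.87, 1.10, 0.93]

# Summary statistics of the kind a benchmarking tool reports.
mean = statistics.mean(latencies)
p50 = statistics.median(latencies)
p95 = statistics.quantiles(latencies, n=20)[-1]  # 95th-percentile cut point

print(f"mean={mean:.3f}s p50={p50:.3f}s p95={p95:.3f}s")
```

Percentile latencies like p95 matter more than the mean for production sizing, because autoscaling decisions (see below on KEDA) typically key off tail latency.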
Use kubectl to inspect pod logs in your OCI Kubernetes Engine cluster. You can also inspect logs from the AI Blueprints portal.
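For example, a typical kubectl session looks like the following (the namespace and pod name are placeholders; substitute the ones used in your cluster):

```shell
# List pods so you can find the one running your blueprint workload.
kubectl get pods -n <namespace>

# Show the last 100 log lines from a pod; add -f to stream continuously.
kubectl logs <pod-name> -n <namespace> --tail=100
```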
Yes, OCI AI Blueprints leverage KEDA for application-driven autoscaling. See the documentation for more information.
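Under the hood, KEDA scales a workload based on a metric query. AI Blueprints configures this for you, so the following is purely an illustration of what a KEDA latency-driven trigger looks like; the deployment name, Prometheus address, and metric query are assumptions for the sketch, not values AI Blueprints uses:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: vllm-autoscale            # hypothetical name
spec:
  scaleTargetRef:
    name: vllm-inference          # hypothetical Deployment name
  minReplicaCount: 1
  maxReplicaCount: 4
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring:9090   # assumed address
        query: histogram_quantile(0.95, sum(rate(request_latency_seconds_bucket[2m])) by (le))
        threshold: "2"            # scale out when p95 latency exceeds ~2s
```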
Any NVIDIA GPUs available in your OCI region, such as A10, A100, or H100.
Yes, you can deploy OCI AI Blueprints to an existing cluster by following the instructions here.
To run multiple blueprints on the same node, we recommend that you enable shared node pools. Read more here.
Oracle Cloud Infrastructure Data Science and Oracle Cloud Infrastructure Data Science Quick Actions are PaaS offerings intended to help you build and deploy AI applications on managed compute instances. AI Blueprints is an IaaS booster. OCI AI Blueprints is ideal for customers deploying GenAI workloads to reserved instances in their tenancy. In the initial stages of the customer journey, AI Blueprints helps with pre-sales POCs, LLM benchmarking, and quick prototyping of end-to-end AI applications, such as retrieval-augmented generation (RAG). In the later stages, customers can use AI Blueprints for production workloads on Kubernetes clusters with advanced configurations like autoscaling and distributed inference.
OCI Generative AI service is a PaaS offering. AI Blueprints is an IaaS booster. OCI AI Blueprints is ideal for customers deploying GenAI workloads to reserved instances in their tenancy. In the initial stages of the customer journey, AI Blueprints helps with pre-sales POCs, LLM benchmarking, and quick prototyping of end-to-end AI applications, such as RAG. In the later stages, customers can use AI Blueprints for production workloads on Kubernetes clusters with advanced configurations like autoscaling and distributed inference.
You can deploy custom LLMs or most models available on Hugging Face with our vLLM blueprint as long as the models are compatible with vLLM.
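If you want to sanity-check a model's vLLM compatibility before deploying it through the blueprint, you can serve it with vLLM's OpenAI-compatible server; the model name below is only an example, and on older vLLM versions the equivalent invocation is `python -m vllm.entrypoints.openai.api_server --model <model>`:

```shell
# Serve a Hugging Face model with vLLM's OpenAI-compatible server
# (example model name; any vLLM-compatible model works).
vllm serve mistralai/Mistral-7B-Instruct-v0.2 --port 8000
```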
Yes.
Yes. You can use other solutions such as Ollama, TensorRT, and NIM.
Yes. We have a blueprint specifically for CPU inference that runs Ollama on CPUs.
Yes.
AI Blueprints currently provides an API (a CLI is in development). You can also leverage the Kueue CLI for job orchestration and scheduling with AI Blueprints.
With OCI AI Blueprints, you can benefit in the following ways: