AI Blueprints FAQ

Overview and availability

What is OCI AI Blueprints?

Oracle Cloud Infrastructure AI Blueprints is a suite of prepackaged and verified blueprints for OCI that provide consistent and repeatable deployments of GenAI workloads in minutes with built-in observability.

What does OCI AI Blueprints do for customers?

Oracle Cloud Infrastructure AI Blueprints helps you deploy, scale, and monitor AI workloads in production in minutes. AI Blueprints are OCI-verified, no-code deployment blueprints for popular GenAI workloads. They include clear hardware recommendations with NVIDIA GPUs; opinionated software stack components, such as NVIDIA NIM; and prepackaged observability tools. This enables you to deploy AI workloads without having to make software stack decisions or manually provision the infrastructure. You can also leverage AI Blueprints' advanced infrastructure features, such as multi-instance GPUs or autoscaling based on inference latency, with a few simple configuration changes. With these capabilities, we reduce GPU onboarding time for scaled, mission-critical deployments from weeks to minutes.

What is the cost to use OCI AI Blueprints?

OCI AI Blueprints are available to any OCI user for free.

Get started with OCI AI Blueprints

Where can I find OCI AI Blueprints?

OCI AI Blueprints can be found on GitHub. The GitHub repository includes directions for how to:

  1. Install the OCI AI Blueprints platform in your tenancy and access OCI AI Blueprints’ UI/API
  2. Deploy and monitor an AI Blueprint
  3. Undeploy the blueprint when you are finished

What is the safest way to test OCI AI Blueprints in my tenancy?

To test an OCI AI Blueprint, create a separate compartment and OCI Kubernetes Engine (OKE) cluster. Deploying OCI AI Blueprints in the newly created compartment isolates any potential impact from the rest of your tenancy.
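
If you use the OCI CLI, you can create the isolated compartment as follows (the tenancy OCID and names below are placeholders; you can also create the compartment in the OCI Console):

```shell
# Create an isolated compartment under the tenancy root (OCID is a placeholder)
oci iam compartment create \
  --compartment-id ocid1.tenancy.oc1..<unique-id> \
  --name ai-blueprints-test \
  --description "Isolated compartment for testing OCI AI Blueprints"
```

You can then create the OKE cluster inside this compartment before installing OCI AI Blueprints.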

Which containers and resources get deployed in my tenancy?

The following containers and resources are deployed in your tenancy:

  1. OCI AI Blueprints front-end and back-end containers
  2. Grafana and Prometheus (monitoring)
  3. MLflow (experiment tracking)
  4. KEDA (application-based autoscaling)
  5. Kueue (job queueing and scheduling)
  6. KubeRay (Ray cluster management)

Where can I see the full list of blueprints?

All available blueprints are listed here.

Operating with OCI AI Blueprints

How can I run LLM inference benchmarking?

To benchmark LLM inference, deploy a vLLM blueprint, then use a tool such as LLMPerf to run benchmarking against your inference endpoint.
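
A benchmarking tool such as LLMPerf reports metrics like end-to-end latency percentiles and token throughput for your endpoint. As an illustration of the kind of aggregation involved (this is a hand-rolled sketch, not LLMPerf's implementation, and the sample values are made up):

```python
import statistics

def summarize(samples):
    """Summarize benchmark samples given as (latency_seconds, output_tokens) pairs."""
    latencies = sorted(s[0] for s in samples)
    total_tokens = sum(s[1] for s in samples)
    total_time = sum(s[0] for s in samples)
    n = len(latencies)
    return {
        "p50_latency_s": statistics.median(latencies),
        "p99_latency_s": latencies[min(n - 1, int(0.99 * n))],
        "tokens_per_s": total_tokens / total_time,
    }

# Example: four request samples collected against an inference endpoint
samples = [(0.8, 100), (1.0, 120), (1.2, 110), (2.0, 90)]
print(summarize(samples))
```

In practice, a benchmarking tool collects the (latency, tokens) pairs for you by timing real requests against the endpoint's API.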

How do I check logs for troubleshooting?

Use kubectl to inspect pod logs in your OCI Kubernetes Engine cluster. You can also view logs from the AI Blueprints portal.
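
For example, assuming your blueprint's pods run in the `default` namespace (the namespace and pod name below are placeholders):

```shell
# List pods to find the blueprint workload
kubectl get pods -n default

# Stream logs from a specific pod
kubectl logs -f <pod-name> -n default

# Show recent events if a pod is not starting
kubectl describe pod <pod-name> -n default
```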

Does OCI AI Blueprints support autoscaling?

Yes, OCI AI Blueprints leverage KEDA for application-driven autoscaling. See the documentation for more information.

Which GPUs are compatible?

Any NVIDIA GPUs available in your OCI region, such as A10, A100, or H100.

Can I deploy to an existing OCI Kubernetes Engine cluster?

Yes, you can deploy OCI AI Blueprints to an existing cluster by following the instructions here.

How do I run multiple blueprints on the same node?

To run multiple blueprints on the same node, we recommend that you enable shared node pools. Read more here.

What is the difference between OCI Data Science/Quick Actions and AI Blueprints?

Oracle Cloud Infrastructure Data Science and Oracle Cloud Infrastructure Data Science Quick Actions are PaaS offerings intended to help you build and deploy AI applications on managed compute instances. AI Blueprints is an IaaS booster. OCI AI Blueprints is ideal for customers deploying GenAI workloads to reserved instances in their tenancy. In the initial stages of the customer journey, AI Blueprints helps with pre-sales POCs, LLM benchmarking, and quick prototyping of end-to-end AI applications, such as retrieval-augmented generation (RAG). In the later stages, customers can use AI Blueprints for production workloads on Kubernetes clusters with advanced configurations like autoscaling and distributed inference.

What is the difference between OCI Generative AI service and AI Blueprints?

OCI Generative AI service is a PaaS offering. AI Blueprints is an IaaS booster. OCI AI Blueprints is ideal for customers deploying GenAI workloads to reserved instances in their tenancy. In the initial stages of the customer journey, AI Blueprints helps with pre-sales POCs, LLM benchmarking, and quick prototyping of end-to-end AI applications, such as RAG. In the later stages, customers can use AI Blueprints for production workloads on Kubernetes clusters with advanced configurations like autoscaling and distributed inference.

Which models can I deploy with OCI AI Blueprints?

You can deploy custom LLMs or most models available on Hugging Face with our vLLM blueprint as long as the models are compatible with vLLM.
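
One way to check compatibility before deploying is to serve the model with vLLM on a local GPU machine (the model name below is only an example; serving requires a supported NVIDIA GPU):

```shell
# Serve a Hugging Face model with vLLM's OpenAI-compatible server (example model)
vllm serve Qwen/Qwen2.5-7B-Instruct

# In another terminal, send a test request to the local endpoint
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "Qwen/Qwen2.5-7B-Instruct", "prompt": "Hello", "max_tokens": 16}'
```

If vLLM can load and serve the model, it should also work with the vLLM blueprint.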

Can I deploy multimodal models?

Yes.

Can I serve LLMs with inference engines other than vLLM?

Yes. You can use other solutions such as Ollama, TensorRT, and NIM.

What if I don’t have GPUs yet? Can I deploy LLMs to CPUs with AI Blueprints?

Yes. We have a blueprint specifically for CPU inference that runs Ollama on CPUs.

Can I use AI Blueprints with NIM and NeMo?

Yes.

Does AI Blueprints have a command-line interface (CLI) and an API?

AI Blueprints currently provides an API; a CLI is in development. You can also use the Kueue CLI for job orchestration and scheduling with AI Blueprints.

What value does AI Blueprints provide?

With OCI AI Blueprints, you can benefit in the following ways:

  • Deploy GenAI workloads in minutes through a simplified setup flow with blueprints and clear guidance.
  • Shorten time to production and realize the value of OCI compute for GenAI sooner by reducing time spent on initial setup and ongoing maintenance.
  • Onboard to GPUs for GenAI on a self-service basis, with extensive documentation and an emphasis on end-user experience through easy-to-consume portals and APIs.