# Guide to Dynamo CLI
After installing Dynamo with the commands below, you interact with it primarily through its CLI.
```bash
apt-get update
DEBIAN_FRONTEND=noninteractive apt-get install -yq python3-dev python3-pip python3-venv libucx0
python3 -m venv venv
source venv/bin/activate
pip install "ai-dynamo[all]"
```
## Dynamo workflow
The Dynamo CLI provides the following sub-commands (illustrative invocations of `dynamo run` and `dynamo serve` follow the list).
- 🏃 `dynamo run`: quickly spin up a server to experiment with a specified model, input, and output target.
- 🫴 `dynamo serve`: compose a graph of workers locally and serve it.
- 🔨 (Experimental) `dynamo build`: containerize the entire graph, or parts of it, into one or more containers.
- 🚀 (Experimental) `dynamo deploy`: deploy to Kubernetes with Helm charts or custom operators.
- ☁️ (Experimental) `dynamo cloud`: interact with your Dynamo Cloud server.
For more detailed examples of serving LLMs with disaggregated serving, KV-aware routing, and more, see the LLM Deployment Examples.