About the Dynamo Command Line Interface — Dynamo
Title: About the Dynamo Command Line Interface#
URL Source: https://docs.nvidia.com/dynamo/archive/0.3.1/guides/cli_overview.html?userAgent=PromptingBot%2F1.0.0
Published Time: Wed, 02 Jul 2025 21:58:30 GMT
Markdown Content: The Dynamo CLI serves, containerizes, and deploys Dynamo applications efficiently. It provides intuitive commands to manage your Dynamo services.
CLI Capabilities#
With the Dynamo CLI, you can:
- Chat with models quickly using run
- Serve multiple services locally using serve
- Package your services into an archive (called a dynamo artifact) using build
- Deploy pipelines to Dynamo Cloud using deploy
Commands#
run#
Use run to start an interactive chat session with a model. This command executes the dynamo-run Rust binary under the hood. For more details, see Running Dynamo.
Example#
dynamo run deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
serve#
Use serve to run your defined inference graph locally. You’ll need to specify your file and intended class using the file:Class syntax. For more details, see Serving Inference Graphs.
Usage#
dynamo serve [SERVICE]
Arguments#
SERVICE: Specify the service to start using file:Class syntax
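The file:Class syntax can be illustrated with a small Python sketch. This is not Dynamo's own parser: the names `ServiceTarget` and `parse_service_target` are hypothetical, and the sketch assumes the first colon separates the file part from the class part.

```python
from typing import NamedTuple


class ServiceTarget(NamedTuple):
    """Hypothetical container for the two halves of a file:Class argument."""

    file: str
    class_name: str


def parse_service_target(spec: str) -> ServiceTarget:
    """Split a SERVICE argument such as 'hello_world:Frontend' into its parts.

    Assumption: the first ':' is the separator. The real CLI may resolve
    the file part differently (for example, relative to --working-dir).
    """
    file_part, sep, class_part = spec.partition(":")
    if not sep or not file_part or not class_part:
        raise ValueError(f"expected file:Class, got {spec!r}")
    return ServiceTarget(file_part, class_part)


target = parse_service_target("hello_world:Frontend")
print(target.file, target.class_name)
```

Here `hello_world` names the file and `Frontend` names the class defined in it, matching the example commands in this section.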
Flags#
- --file/-f: Path to an optional YAML configuration file. For configuration examples, see the SDK docs
- --dry-run: Print the dependency graph and values without starting services
- --service-name: Start only the service with the specified name
- --working-dir: Set the directory for finding the Service instance
- Additional flags following the Class.key=value pattern are passed to the service constructor. For details, see the configuration section of the SDK docs
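To make the Class.key=value pattern concrete, the following Python sketch groups such tokens by class name. It is only an illustration and not how Dynamo handles them internally. Assumptions: tokens may carry a leading `--`, values are kept as plain strings, and the keys `greeting` and `retries` are invented for the example.

```python
from collections import defaultdict


def group_service_overrides(tokens: list[str]) -> dict[str, dict[str, str]]:
    """Group Class.key=value tokens by class name. Values stay as strings."""
    overrides: dict[str, dict[str, str]] = defaultdict(dict)
    for token in tokens:
        # Assumption: a leading "--" may be present and is not part of the class name.
        body = token.removeprefix("--")
        target, eq, value = body.partition("=")
        class_name, dot, key = target.partition(".")
        if not eq or not dot or not class_name or not key:
            raise ValueError(f"expected Class.key=value, got {token!r}")
        overrides[class_name][key] = value
    return dict(overrides)


# Hypothetical keys; real services define their own constructor arguments.
example = group_service_overrides(["--Frontend.greeting=hello", "--Middle.retries=3"])
print(example)
```

Each inner dictionary corresponds to the values that would reach one service's constructor under these assumptions.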
Example#
cd examples
# Start the Frontend, Middle, and Backend components
dynamo serve hello_world:Frontend
# Start only the Middle component in the graph that is discoverable from the Frontend service
dynamo serve --service-name Middle hello_world:Frontend
build#
Use build to package your inference graph and its dependencies into an archive. Combine this with the --containerize flag to create a single Docker container for your inference graph. As with serve, you point toward the first service in your dependency graph. For more details, see Serving Inference Graphs.
Usage#
dynamo build [SERVICE]
Arguments#
SERVICE: Specify the service to build using file:Class syntax
Flags#
- --working-dir: Specify the directory for finding the Service instance
- --containerize: Choose whether to create a container from the dynamo artifact after building
Example#
cd examples/hello_world
dynamo build hello_world:Frontend
deploy#
Use deploy to create a pipeline on Dynamo Cloud using either interactive prompts or a YAML configuration file. For more details, see Deploying Inference Graphs to Kubernetes.
Usage#
dynamo deploy [PIPELINE]
Arguments#
PIPELINE: The pipeline to deploy. Required; there is no default
Flags#
- --name/-n: Set the deployment name. Required; there is no default
- --config-file/-f: Specify the configuration file path. Required; there is no default
- --wait/--no-wait: Choose whether to wait for deployment readiness. Defaults to wait
- --timeout: Set maximum deployment time in seconds. Defaults to 3600
- --endpoint/-e: Specify the Dynamo Cloud deployment endpoint. Required; there is no default
- --help/-h: Display command help
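As a rough sketch of how the deploy flags fit together, the Python helper below assembles an argument list for `dynamo deploy` and rejects missing required values before anything is run. The function name `build_deploy_command` and all sample values (pipeline name, config file, endpoint URL) are hypothetical. Placing flags before the positional argument and separating each flag from its value with a space are assumptions about the CLI's parsing, not documented behavior.

```python
import shlex


def build_deploy_command(
    pipeline: str,
    name: str,
    config_file: str,
    endpoint: str,
    wait: bool = True,
    timeout: int = 3600,
) -> list[str]:
    """Assemble an argv list for `dynamo deploy` from the flags listed above."""
    required = {
        "pipeline": pipeline,
        "name": name,
        "config_file": config_file,
        "endpoint": endpoint,
    }
    missing = [label for label, value in required.items() if not value]
    if missing:
        raise ValueError(f"missing required values: {', '.join(missing)}")
    return [
        "dynamo", "deploy",
        "--name", name,
        "--config-file", config_file,
        "--endpoint", endpoint,
        "--wait" if wait else "--no-wait",
        "--timeout", str(timeout),  # seconds; 3600 is the documented default
        pipeline,
    ]


# All values below are placeholders for illustration only.
argv = build_deploy_command(
    pipeline="my-pipeline",
    name="hello-world",
    config_file="deploy.yaml",
    endpoint="http://localhost:8080",
)
print(shlex.join(argv))
```

The resulting list can be passed to `subprocess.run(argv, check=True)` once the values point at a real Dynamo Cloud endpoint.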
For a detailed deployment example, see Operator Deployment.