# About the Dynamo Command Line Interface

URL Source: https://docs.nvidia.com/dynamo/archive/0.2.1/guides/cli_overview.html
Published Time: Thu, 05 Jun 2025 21:23:32 GMT

The Dynamo CLI is a powerful tool for serving, containerizing, and deploying Dynamo applications. It leverages core pieces of the BentoML deployment stack and provides a range of commands to manage your Dynamo services.
The Dynamo CLI lets you:

- `run` - quickly chat with a model
- `serve` - run a set of services locally (via `depends()` or `.link()`)
- `build` - create an archive of your services (called a *bento*)
- `deploy` - create a pipeline on Dynamo Cloud
## Commands

### run

The `run` command lets you quickly chat with a model. Under the hood, it runs the `dynamo-run` Rust binary. For details, see Running Dynamo.

Example:

```bash
dynamo run deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
```
### serve

The `serve` command lets you run a defined inference graph locally. You point to your file and intended class using `file:Class` syntax. For details, see Serving Inference Graphs.

Usage:

```bash
dynamo serve [SERVICE]
```

Arguments:

- `SERVICE` - The service to start, specified using `file:Class` syntax.
Flags:

- `--file`/`-f` - Path to an optional YAML configuration file. An example YAML file can be found in the configuration section of the SDK docs.
- `--dry-run` - Print the dependency graph and values without starting any services.
- `--service-name` - Serve only the specified service; the rest of the discoverable components in the graph are not started.
- `--working-dir` - Specify the directory in which to find the Service instance.
- Any additional flags of the form `Class.key=value` are parsed and passed to the constructor of the target service. See the configuration section of the SDK docs for more details.
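To illustrate the `--file` option, a minimal configuration file might look like the following sketch. The service name `Frontend` and the keys shown are hypothetical placeholders, not values from this guide; consult the configuration section of the SDK docs for the actual schema:

```yaml
# config.yaml - hypothetical keys for illustration only
Frontend:
  model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
  port: 8000
```

Such a file would then be supplied with the `-f` flag, e.g. `dynamo serve hello_world:Frontend -f config.yaml`.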
Example:

```bash
cd examples

# Spin up Frontend, Middle, and Backend components
dynamo serve hello_world:Frontend

# Spin up only the Middle component in the graph that is discoverable from the Frontend service
dynamo serve --service-name Middle hello_world:Frontend
```
build#
The build commmand allows you to package up your inference graph and its dependancies and create an archive of it. This is commonly paired with the --containerize flag to create a single docker container that runs your inference graph. As with serve, you point toward the first service in your dependency graph. For details about dynamo build, see Serving Inference Graphs.
Usage
dynamo build [SERVICE]
Arguments
SERVICE- The service to build. You use file:Class syntax to specify the service.
Flags:

- `--working-dir` - Specify the directory in which to find the Service instance.
- `--containerize` - Whether to containerize the Bento after building.
Example:

```bash
cd examples/hello_world
dynamo build hello_world:Frontend
```
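Building on the example above, the `--containerize` flag can be added to produce a Docker image in the same step. This is a sketch under the assumption that Docker is available locally; image naming and tagging depend on your Dynamo version and Docker setup:

```bash
cd examples/hello_world

# Package the graph and build a Docker image from the resulting bento
dynamo build --containerize hello_world:Frontend
```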
### deploy

The `deploy` command creates a pipeline on Dynamo Cloud using parameters given at the prompt or in a YAML configuration file. For details, see Deploying Inference Graphs to Kubernetes.

Usage:

```bash
dynamo deploy [PIPELINE]
```

Arguments:

- `PIPELINE` - The pipeline to deploy. Required; no default.
Flags:

- `--name` or `-n` - Deployment name. Required; no default.
- `--config-file` or `-f` - Configuration file path. Required; no default.
- `--wait`/`--no-wait` - Whether to wait for the deployment to be ready. Defaults to `--wait`.
- `--timeout` - The number of seconds that can elapse before the deployment times out. Defaults to 3600.
- `--endpoint` or `-e` - The Dynamo Cloud endpoint where the pipeline should be deployed. Required; no default.
- `--help` or `-h` - Display in-line help for `dynamo deploy`.
Example:
For a detailed example, see Operator Deployment.
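As a rough sketch of a non-interactive invocation, the required flags listed above could be supplied together on the command line. The pipeline name, deployment name, file path, and endpoint below are placeholders for illustration, not values from this guide:

```bash
# Hypothetical invocation - substitute your own pipeline, names, and endpoint
dynamo deploy my_pipeline \
  --name my-deployment \
  --config-file deployment.yaml \
  --endpoint <your-dynamo-cloud-endpoint> \
  --no-wait
```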