KubeDL automatically tunes optimal container-level configurations for an ML model before it is deployed as an inference service. This auto-configuration workflow is developed as an independent project, Morphling. GitHub →

[Figure: Morphling Stack]

Morphling Overview

Morphling tunes optimal configurations for your ML/DL model serving deployments. It searches for the best container-level configurations (e.g., resource allocations and runtime parameters) through empirical trials, in which a few sampled configurations are deployed and evaluated for performance.
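The trial-based search described above can be sketched as a simple random search over a small configuration space. Everything here (the search space, the `measure_rps` stand-in, the random-search strategy) is illustrative only, not Morphling's actual implementation:

```python
import random

# Hypothetical search space of container-level configurations:
# resource allocations plus a runtime parameter.
SEARCH_SPACE = {
    "cpu": [1, 2, 4],          # CPU cores
    "memory_gb": [2, 4, 8],    # container memory
    "batch_size": [1, 8, 32],  # serving runtime parameter
}

def measure_rps(config):
    # Stand-in for a real stress test against the deployed model server;
    # a fake score keeps this sketch runnable.
    return config["cpu"] * 10 + config["batch_size"] * 0.5

def random_search(budget, seed=0):
    """Sample `budget` configurations and keep the best performer."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(budget):  # the sampling budget caps the number of trials
        config = {k: rng.choice(v) for k, v in SEARCH_SPACE.items()}
        score = measure_rps(config)  # evaluate the performance objective
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score
```

In the real system each trial deploys the model container with the sampled configuration and stress-tests it, so keeping the trial budget small is what makes the search practical.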

[Figure: Morphling Workflow]


Key benefits include:

  • Automated tuning workflows hidden behind simple APIs.
  • Out-of-the-box stress-test clients for ML model serving.
  • Cloud-agnostic: tested on AWS, Alicloud, etc.
  • ML-framework-agnostic: supports popular frameworks such as TensorFlow and PyTorch.
  • Equipped with a variety of customizable hyper-parameter tuning algorithms.

Core APIs

Morphling requires users to specify a ProfilingExperiment for configuration tuning, including:

  • the ML model container (e.g., the Pod template)
  • the performance objective function
  • tunable configuration parameters, with their types and search ranges
  • the sampling algorithm
  • the sampling budget
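Putting the five pieces together, a ProfilingExperiment spec could look roughly like the sketch below. The apiVersion, field names, and values are assumptions for illustration, not the authoritative Morphling schema; consult the project's API reference for the real CRD:

```yaml
# Illustrative sketch only: apiVersion and field names are assumptions.
apiVersion: morphling.kubedl.io/v1alpha1
kind: ProfilingExperiment
metadata:
  name: mobilenet-tuning
spec:
  objective:                    # performance objective function
    type: maximize
    objectiveMetricName: qps
  algorithm:
    algorithmName: grid         # sampling algorithm
  maxNumTrials: 9               # sampling budget
  tunableParameters:            # parameters with types and search ranges
    - category: resource
      parameters:
        - name: cpu
          parameterType: discrete
          feasibleSpace:
            list: ["1", "2"]
        - name: memory
          parameterType: discrete
          feasibleSpace:
            list: ["2Gi", "4Gi"]
  servicePodTemplate:           # the ML model container (Pod template)
    template:
      spec:
        containers:
          - name: model-server
            image: example/mobilenet-serving:latest
```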

Getting started

Install using YAML files

Install Morphling using YAML files. Go →


Run model serving configuration examples. Go →


See Morphling Workflow for how Morphling tunes ML serving configurations automatically in a Kubernetes-native way.

Developer Guide

To develop/debug Morphling controller manager locally, please check the Developer Guide and Debug Guide.


If you have any questions or want to contribute, GitHub issues or pull requests are warmly welcome.