KubeDL Model tracks a model’s versions and lineage using Kubernetes CRDs. It uses container images to version models: each model version produces a corresponding image that contains all of the model artifacts.
In short, a KubeDL training job generates the KubeDL model, which can later be referenced by KubeDL Serving to serve the model directly.
KubeDL provides two CRDs, Model and ModelVersion, to accomplish this:
ModelVersion describes one version of a model: its location, who created it, the image repo that the model image is pushed to, and so on. The backend controller watches this CRD and uses Kaniko to build an image that incorporates all the artifacts. Check the full CRD spec.
An example ModelVersion YAML is shown below:
apiVersion: model.kubedl.io/v1alpha1
kind: ModelVersion
metadata:
  name: mv-3
  namespace: default
spec:
  # The model name for the model version
  modelName: model1
  # The entity (user or training job) that creates the model
  createdBy: user1
  # The image repo to push the generated model
  imageRepo: modelhub/resnet
  # The storage will be mounted at /kubedl-model inside the training container.
  # Therefore, the training code should export the model at /kubedl-model path.
  storage:
    # The local storage to store the model
    localStorage:
      # The local host path to export the model
      path: /foo
      # The node where the chief worker runs to export the model
      nodeName: kind-control-plane
By default, KubeDL mounts the storage (for example, a local filesystem or NFS) at /kubedl-model inside each training container. The training code inside the container must therefore export the model under /kubedl-model, so that it is also present on the external storage.
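As an illustration of the export step described above, the following is a minimal Python sketch of training code that writes its artifacts under /kubedl-model. The helper name `export_model` and the file names `model.bin` and `metadata.json` are hypothetical and are not part of KubeDL; the only detail taken from this document is the default mount path.

```python
import json
from pathlib import Path

# Default mount point of the model storage inside the training container,
# as stated in the text above.
DEFAULT_EXPORT_DIR = "/kubedl-model"


def export_model(weights: bytes, metadata: dict,
                 export_dir: str = DEFAULT_EXPORT_DIR) -> Path:
    """Write model artifacts under export_dir.

    Hypothetical helper for illustration only; it is not a KubeDL API.
    """
    target = Path(export_dir)
    target.mkdir(parents=True, exist_ok=True)
    # The file names are assumptions; use whatever layout your serving code expects.
    (target / "model.bin").write_bytes(weights)
    (target / "metadata.json").write_text(json.dumps(metadata))
    return target
```

At the end of training, a call such as `export_model(serialized_weights, {"modelName": "model1"})` would place the files on the mounted storage. The `export_dir` parameter exists only so the function can be exercised outside a training container.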
The ModelVersion controller then triggers a Kaniko container that mounts the same storage and generates an image that includes the model artifacts. KubeDL also automatically injects the environment variable KUBEDL_MODEL_PATH=/kubedl-model into the generated model image.
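Code running in a container based on the generated model image can use the injected variable to locate the artifacts. The sketch below assumes only that KUBEDL_MODEL_PATH is set as described; the helper name `list_model_artifacts` is hypothetical, and the fallback to /kubedl-model when the variable is absent is an assumption for running outside such an image.

```python
import os
from pathlib import Path
from typing import List, Mapping


def list_model_artifacts(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the relative paths of all files under the model directory.

    Hypothetical helper for illustration only; it is not a KubeDL API.
    """
    # KUBEDL_MODEL_PATH is the variable the text says is injected into the image.
    # The default value here is an assumption for use outside a model image.
    model_dir = Path(env.get("KUBEDL_MODEL_PATH", "/kubedl-model"))
    if not model_dir.is_dir():
        return []
    return sorted(
        str(path.relative_to(model_dir))
        for path in model_dir.rglob("*")
        if path.is_file()
    )
```

Reading the path from the environment rather than hard-coding it keeps the loading code independent of where the artifacts end up in the image.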
A Model is associated with multiple ModelVersions. It only aggregates information about its ModelVersions in its status, such as the latest ModelVersion, as shown below. When a Model is deleted, all of its ModelVersions are deleted along with it. Check the full CRD spec.
status:
  latestVersion:
    imageName: modelhub/resnet:v1c072
    modelVersion: mv-3
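A client that has fetched a Model object as a dictionary (for example, by parsing JSON output from the Kubernetes API) could read the latest version from the status fields shown above. This is a sketch: the function name `latest_version` is hypothetical, and it assumes only the `status.latestVersion.imageName` and `status.latestVersion.modelVersion` fields from the example.

```python
from typing import Optional, Tuple


def latest_version(model: dict) -> Tuple[Optional[str], Optional[str]]:
    """Return (imageName, modelVersion) from a Model's status, or (None, None).

    Hypothetical helper for illustration only; it is not a KubeDL API.
    """
    # The status or latestVersion may be missing before any ModelVersion exists.
    latest = (model.get("status") or {}).get("latestVersion") or {}
    return latest.get("imageName"), latest.get("modelVersion")


# Sample object mirroring the status snippet in this document.
example_model = {
    "status": {
        "latestVersion": {
            "imageName": "modelhub/resnet:v1c072",
            "modelVersion": "mv-3",
        }
    }
}
image_name, model_version = latest_version(example_model)
```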