Experiment tracking dashboard enhancement
Team dashboard (resource usage tracking)
Managed Airflow / Airflow integration
Dataset versioning - dvc integration
Dataset, output directory NFS mount
Save/load files with savvihub SDK
If you would like to request any features or support, please contact us ([email protected]).
You can find the best hyperparameters with SavviHub's new tuning techniques. Currently, we support grid search, random search, and Bayesian optimization. Just choose the hyperparameters you want, and let us optimize them. (docs)
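As an illustration of what a grid search does under the hood, here is a minimal, framework-agnostic sketch in plain Python. The objective function and hyperparameter names are purely hypothetical; this is not the SavviHub API, only the search strategy it automates:

```python
import itertools

def objective(lr, batch_size):
    # Hypothetical stand-in for training a model and returning validation loss.
    # A real objective would run a full training job for each configuration.
    return (lr - 0.01) ** 2 + abs(batch_size - 64) / 1000

# Search space: one list of candidate values per hyperparameter.
grid = {
    "lr": [0.001, 0.01, 0.1],
    "batch_size": [32, 64, 128],
}

# Grid search evaluates every combination and keeps the best one.
best_params, best_loss = None, float("inf")
for lr, batch_size in itertools.product(grid["lr"], grid["batch_size"]):
    loss = objective(lr, batch_size)
    if loss < best_loss:
        best_params, best_loss = {"lr": lr, "batch_size": batch_size}, loss

print(best_params)
```

Random search samples combinations instead of enumerating them all, and Bayesian optimization uses previous results to pick the next combination to try; both plug into the same evaluate-and-compare loop.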
T4 GPU support
Attach local volume to workspace
Now you can freely spin up your development environment. Start your workspace and connect via JupyterLab or SSH.
Service has been redesigned and replaced by Workspace.
Dataset versioning is available via DVC integration.
You can add a local dataset. Local datasets are mounted via the NFS protocol in experiments.
You can download your logs and metrics on the experiment detail page.
Workspace is renamed to Organization
Now, your source code does not have to be on GitHub to run an experiment on SavviHub. Create a CLI-driven project and run
savvihub experiment run on your local terminal without git push.
$ sv experiment run
[?] Select project: cli-driven-example
  version-control-example
> cli-driven-example
[?] Experiment message:
[?] Start command: python main.py
[?] Please choose a cluster: aws-apne2-prod1 (SavviHub)
> aws-apne2-prod1 (SavviHub)
  on-premise-cluster (Custom)
[?] Please choose a resource: v1.v100-1.mem-52.spot (GPU(V100) x 1 / CPU 8 Cores / Memory 52GB)
  v1.cpu-2.mem-6.spot (CPU 2 Cores / Memory 6GB)
  v1.cpu-2.mem-6 (CPU 2 Cores / Memory 6GB)
  v1.cpu-4.mem-13 (CPU 4 Cores / Memory 13GB)
  v1.k80-1.mem-52 (GPU(K80) x 1 / CPU 4 Cores / Memory 52GB)
  v1.k80-8.mem-480 (GPU(K80) x 8 / CPU 32 Cores / Memory 480GB)
  v1.v100-1.mem-52 (GPU(V100) x 1 / CPU 8 Cores / Memory 52GB)
  v1.v100-4.mem-232 (GPU(V100) x 4 / CPU 32 Cores / Memory 232GB)
  v1.cpu-0.mem-1 (CPU shared / Memory 1GB)
  v1.k80-1.mem-52.spot (GPU(K80) x 1 / CPU 4 Cores / Memory 52GB)
  v1.cpu-4.mem-13.spot (CPU 4 Cores / Memory 13GB)
> v1.v100-1.mem-52.spot (GPU(V100) x 1 / CPU 8 Cores / Memory 52GB)
  v1.v100-8.mem-480 (GPU(V100) x 8 / CPU 96 Cores / Memory 480GB)
  v1.k80-16.mem-724 (GPU(K80) x 16 / CPU 64 Cores / Memory 724GB)
[?] Please choose a kernel image: savvihub/kernels:py37.full-cpu (Python 3.7 (All Packages))
  savvihub/kernels:py36.full-cpu (Python 3.6 (All Packages))
> savvihub/kernels:py37.full-cpu (Python 3.7 (All Packages))
  savvihub/kernels:py36.full-cpu.jupyter (Python 3.6 (JupyterLab))
  savvihub/kernels:py37.full-cpu.jupyter (Python 3.7 (JupyterLab))
  tensorflow/tensorflow:1.14.0-py3 (Tensorflow 1.14.0)
  tensorflow/tensorflow:1.15.5-py3 (Tensorflow 1.15.5)
  tensorflow/tensorflow:2.0.4-py3 (Tensorflow 2.0.4)
  tensorflow/tensorflow:2.2.1-py3 (Tensorflow 2.2.1)
  tensorflow/tensorflow:2.3.2 (Tensorflow 2.3.2)
  tensorflow/tensorflow:2.4.1 (Tensorflow 2.4.1)
  tensorflow/tensorflow:2.3.0 (TensorFlow 2.3.0 (Tensorboard))
Upload the zipped local project
Experiment 1 is running. Check the experiment status at below link
https://savvihub.com/example-workspace/cli-driven-example/experiments/1
Edit has been added to service actions. A new service can easily be created from an existing service's configuration.
You can modify a stopped service's name, computing resource, start command, exposed ports, environment variables, and SSH key. Name and ports can be updated even while the service is running.
Edit experiment name and message on experiment details
Pass an experiment message with the -m, --message option of savvihub experiment run
Add a progress bar during savvihub dataset files upload
Display the experiment message on the output of savvihub experiment list
You can check your payment based on on-demand instance usage on the settings/billing page. (docs)
If you use a spot instance for your experiment, it automatically continues after a spot interruption. All you need to do is make your experiment resilient to termination: for example, save a checkpoint every epoch and restart the experiment from the saved checkpoint. You can check the details here.
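A minimal, framework-agnostic sketch of the save-and-resume pattern described above. The checkpoint file name and training loop are illustrative assumptions, not a SavviHub API; in a real experiment you would checkpoint model weights (e.g. with your framework's own save function) to the experiment's output volume:

```python
import json
import os

CHECKPOINT = "checkpoint.json"  # illustrative path, assumed to survive an interruption

def train_one_epoch(state):
    # Placeholder for real training work; here we just advance a counter.
    state["step"] += 1
    return state

# Resume from the last checkpoint if one exists (e.g. after a spot interruption).
if os.path.exists(CHECKPOINT):
    with open(CHECKPOINT) as f:
        state = json.load(f)
    start_epoch = state["epoch"] + 1
else:
    state = {"epoch": -1, "step": 0}
    start_epoch = 0

for epoch in range(start_epoch, 5):
    state = train_one_epoch(state)
    state["epoch"] = epoch
    # Persist progress every epoch so an interruption loses at most one epoch.
    with open(CHECKPOINT, "w") as f:
        json.dump(state, f)
```

Because the script derives its starting epoch from the checkpoint, rerunning the same command after an interruption picks up where the last completed epoch left off.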
You can use A100 spot instances in the us-west-2 (Oregon) region. Select us-west-2 as the default region when creating a workspace. You can only select one region per workspace for now.
Save & load checkpoint example added in savvihub/examples.
Fix broken web terminal link in custom clusters.
You can use your own private Docker registry on Docker Hub & AWS ECR. Go to Workspace > Settings > Integrations and register your credentials. (Docs)
Now you can connect to your experiment / service with a terminal on the web, or with a native SSH connection. (Docs)
You might want to access the experiment container after it runs. Termination protection allows you to do that. If you check the checkbox, the experiment will enter idle status after it finishes. (Docs)
(Fix) Log collection on various Docker/Kubernetes runtime configurations (e.g. RKE)
You can configure the container log path with the kubernetes.logContainerPath Helm value. For RKE, install with --set kubernetes.logContainerPath=/var/log/containers. (Ref)
(Fix) Remove Prometheus dependency
SavviHub no longer installs Prometheus during agent installation.