Deploys the llama2 7b model with 40 layers offloaded to the GPU. Inference is accelerated by CUDA 11.
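As an illustration, the sketch below shows what 40-layer GPU offloading looks like when llama.cpp is driven through the llama-cpp-python bindings; the bindings, model path, and prompt are assumptions for the example rather than part of this deployment.

```python
# Minimal sketch (assumption: the model is served via llama-cpp-python,
# built with CUDA support). The model path and prompt are placeholders.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-2-7b.Q4_K_M.gguf",  # hypothetical local GGUF file
    n_gpu_layers=40,  # offload 40 transformer layers to the GPU
)

out = llm("Q: What accelerates inference here? A:", max_tokens=32)
print(out["choices"][0]["text"])
```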
### CUDA Driver Issues
If you see `CUDA driver version is insufficient for CUDA runtime version` when making the request, the node is likely running an NVIDIA driver that is not [compatible with the CUDA version](https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html).
Upgrade the driver manually on the node (see [here](https://github.com/awslabs/amazon-eks-ami/issues/1060) if you are using CUDA 11 with an EKS AMI), or try a different CUDA version.
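To confirm which driver is actually installed on the node before upgrading, a quick check is sketched below. It simply shells out to `nvidia-smi`, which ships with the driver, so the only assumption is that it is on the node's PATH.

```python
# Minimal sketch: print the NVIDIA driver version installed on the node so it
# can be compared against the driver requirements of the CUDA 11 runtime.
# Assumes nvidia-smi is on the node's PATH (it is installed with the driver).
import subprocess

driver = subprocess.run(
    ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
    capture_output=True, text=True, check=True,
).stdout.strip()
print(f"Installed NVIDIA driver: {driver}")
# The header of plain `nvidia-smi` output also shows "CUDA Version: X.Y",
# which is the newest CUDA runtime this driver supports.
```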