If you want to run Ollama on a specific GPU or multiple GPUs, this tutorial is for you. By default, Ollama utilizes all available GPUs, but sometimes you may want to dedicate a specific GPU or a subset of your GPUs for Ollama's use. The idea for this guide originated from the following issue: Run Ollama on dedicated GPU.
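Under the hood, GPU selection comes down to the CUDA_VISIBLE_DEVICES environment variable: a CUDA process only sees the devices listed in it. A minimal sketch of how the variable behaves (using printenv as a stand-in for ollama serve, since a variable prefixed to a command applies only to that one command):

```shell
#!/bin/bash
# Start from a clean slate for the demonstration.
unset CUDA_VISIBLE_DEVICES

# Setting the variable inline scopes it to the single command that follows.
# `printenv` stands in here for a real consumer such as `ollama serve`.
child=$(CUDA_VISIBLE_DEVICES=0 printenv CUDA_VISIBLE_DEVICES)
echo "child saw: $child"

# The parent shell itself is untouched afterwards.
echo "parent sees: ${CUDA_VISIBLE_DEVICES:-unset}"
```

This one-off form only helps if you launch `ollama serve` by hand; since Ollama normally runs as a systemd service, the variable has to go into the service file instead, which is what the script below automates.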
-
Create a script; let's call it ollama_gpu_selector.sh:
nano ollama_gpu_selector.sh
Paste the following code into it:
#!/bin/bash

# Validate input
validate_input() {
    if [[ ! $1 =~ ^[0-4](,[0-4])*$ ]]; then
        echo "Error: Invalid input. Please enter numbers between 0 and 4, separated by commas."
        exit 1
    fi
}

# Update the service file with CUDA_VISIBLE_DEVICES values
update_service() {
    # Check if CUDA_VISIBLE_DEVICES environment variable exists in the service file
    if grep -q '^Environment="CUDA_VISIBLE_DEVICES=' /etc/systemd/system/ollama.service; then
        # Update the existing CUDA_VISIBLE_DEVICES values
        sudo sed -i 's/^Environment="CUDA_VISIBLE_DEVICES=.*/Environment="CUDA_VISIBLE_DEVICES='"$1"'"/' /etc/systemd/system/ollama.service
    else
        # Add a new CUDA_VISIBLE_DEVICES environment variable
        sudo sed -i '/\[Service\]/a Environment="CUDA_VISIBLE_DEVICES='"$1"'"' /etc/systemd/system/ollama.service
    fi

    # Reload and restart the systemd service
    sudo systemctl daemon-reload
    sudo systemctl restart ollama.service

    echo "Service updated and restarted with CUDA_VISIBLE_DEVICES=$1"
}

# Check if arguments are passed
if [ "$#" -eq 0 ]; then
    # Prompt user for CUDA_VISIBLE_DEVICES values if no arguments are passed
    read -p "Enter CUDA_VISIBLE_DEVICES values (0-4, comma-separated): " cuda_values
    validate_input "$cuda_values"
    update_service "$cuda_values"
else
    # Use arguments as CUDA_VISIBLE_DEVICES values
    cuda_values="$1"
    validate_input "$cuda_values"
    update_service "$cuda_values"
fi
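You can exercise the validation logic in isolation before pointing the script at your real service file; a small sketch using the same regex (note that the 0-4 range assumes at most five GPUs — widen it if you have more):

```shell
#!/bin/bash
# Same pattern the script's validate_input uses: digits 0-4, comma-separated.
is_valid() {
    [[ $1 =~ ^[0-4](,[0-4])*$ ]]
}

is_valid "0"    && echo "0: accepted"
is_valid "1,3"  && echo "1,3: accepted"
is_valid "5"    || echo "5: rejected (out of range)"
is_valid "0, 1" || echo "0, 1: rejected (no spaces allowed)"
```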
-
Make the script executable and run it with administrative privileges:
chmod +x ollama_gpu_selector.sh
sudo ./ollama_gpu_selector.sh
It will prompt you for the GPU number (the primary GPU is always 0); you can enter comma-separated values to select more than one.
-
Use the command
nvidia-smi -L
to list the index and UUID of each of your GPUs.
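nvidia-smi -L prints one line per device, and the leading number is the index that CUDA_VISIBLE_DEVICES refers to. A sketch of pulling that index out of such a line (the sample below is illustrative, not output from a real machine):

```shell
#!/bin/bash
# Illustrative sample of a `nvidia-smi -L` output line.
sample='GPU 1: NVIDIA GeForce RTX 3090 (UUID: GPU-xxxxxxxx)'

# The leading number is the CUDA device index.
index=$(echo "$sample" | sed -n 's/^GPU \([0-9]*\):.*/\1/p')
echo "device index: $index"
```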
You can also add aliases for easier switching. For example, if you have two Nvidia GPUs of different models, as I do, open your shell configuration file:
nano ~/.bashrc
If you are using Zsh, then use the following command:
nano ~/.zshrc
Go to the end of the file and set your aliases. For example:
# Alias definitions for easier switching
# ($SCRIPT_LOCATION is a placeholder: replace it with the directory
# that contains ollama_gpu_selector.sh)
alias 3090_ollama="sudo $SCRIPT_LOCATION/ollama_gpu_selector.sh 1"
alias 4090_ollama="sudo $SCRIPT_LOCATION/ollama_gpu_selector.sh 0"
Finally, update the current terminal session:
source ~/.bashrc
And for Zsh
source ~/.zshrc
Now you can run
3090_ollama
to make Ollama default to the second GPU (GPU numbering starts at 0, so index 1 is the second card).
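To confirm what the script actually wrote, you can grep the service file. The sketch below runs against a temporary copy so it is runnable anywhere; on your machine, point grep at /etc/systemd/system/ollama.service instead:

```shell
#!/bin/bash
# Temporary stand-in for /etc/systemd/system/ollama.service
# after the script has run with argument "1".
svc=$(mktemp)
printf '[Service]\nEnvironment="CUDA_VISIBLE_DEVICES=1"\n' > "$svc"

# Extract the device list the service will see.
devices=$(grep -o 'CUDA_VISIBLE_DEVICES=[0-9,]*' "$svc")
echo "$devices"

rm -f "$svc"
```

On a live system, `systemctl show ollama --property=Environment` also reports what the running unit was given.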
With the provided script, you force Ollama to use only one GPU (or a chosen subset). As far as I know, Ollama supports multi-GPU out of the box.
If you've already used the script, you can manually reverse its effect by opening the service file:
sudo nano /etc/systemd/system/ollama.service
Then, remove the line:
Environment="CUDA_VISIBLE_DEVICES=X"
Save the changes and exit the editor to revert Ollama back to its original multi-GPU behavior.
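The removal can also be done non-interactively, with the same sed approach the script itself uses. A sketch on a temporary copy of the file (on a real system, target /etc/systemd/system/ollama.service with sudo instead):

```shell
#!/bin/bash
# Temporary stand-in for the real service file; the ExecStart line is
# just example content to show that only the target line is deleted.
svc=$(mktemp)
printf '[Service]\nEnvironment="CUDA_VISIBLE_DEVICES=1"\nExecStart=/usr/bin/ollama serve\n' > "$svc"

# Delete the CUDA_VISIBLE_DEVICES line, leaving everything else intact.
sed -i '/^Environment="CUDA_VISIBLE_DEVICES=/d' "$svc"

remaining=$(cat "$svc")
echo "$remaining"

rm -f "$svc"
```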
By the way, don't forget to run:
sudo systemctl daemon-reload
sudo systemctl restart ollama.service
afterwards so systemd picks up the change.