@Madhusakth
Last active March 17, 2024 12:14
Steps for NVIDIA driver, CUDA 10.0 and cuDNN 7.4 installation on Ubuntu 18.04


After spending hours working out the correct driver version and how to install the NVIDIA driver, CUDA and cuDNN, I have collected the steps here.

You can find the required CUDA and cuDNN version for tensorflow-gpu here: https://www.tensorflow.org/install/source

If you'd like a clean install, first uninstall any previous NVIDIA drivers and CUDA versions.

If the driver and CUDA were installed with apt-get:

sudo apt-get purge "*nvidia*" "*cublas*" "cuda*" "nsight*" "libcudnn*" "libnccl*"
sudo apt-get autoremove
sudo apt-get autoclean
sudo rm -rf /usr/local/cuda*

Check whether any NVIDIA driver or CUDA packages remain using:

dpkg -l | grep -i nvidia
dpkg -l | grep -i cuda
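
The two grep commands print every matching line, including packages that are already gone. A minimal sketch of a stricter filter follows; the function name leftover_gpu_packages is made up for this example. It assumes the usual dpkg -l layout, where the first column is the package state and the second is the package name.

```shell
# Hypothetical helper (not part of the original steps): reads `dpkg -l` output
# on stdin and prints the names of packages that are still installed ("ii") or
# have leftover config files ("rc") and whose name mentions nvidia, cuda or cudnn.
leftover_gpu_packages() {
    awk '($1 == "ii" || $1 == "rc") && tolower($2) ~ /nvidia|cuda|cudnn/ { print $2 }'
}
```

Run it as: dpkg -l | leftover_gpu_packages. On a clean system it should print nothing.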

If you installed the NVIDIA driver using a .run file:

sudo /usr/bin/nvidia-uninstall

After the above steps your system should be clean and ready for installation.

If you plan to use tensorflow-gpu, make sure you know the exact CUDA and cuDNN versions it requires, along with the gcc version.

In my case I wanted cuda-10.0 and cuDNN-7.4 for tensorflow-gpu-1.13. I have multiple gcc versions installed on my system. Please refer to this post for details: https://linuxconfig.org/how-to-switch-between-multiple-gcc-and-g-compiler-versions-on-ubuntu-20-04-lts-focal-fossa
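
Before building anything, it can help to check that the active gcc is not newer than the CUDA toolkit accepts. The sketch below only compares major versions. The function name gcc_major_at_most is made up, and the limit you pass is your own assumption: I believe the limit for CUDA 10.0 is gcc 7, but confirm it in NVIDIA's installation guide.

```shell
# Hypothetical helper: succeeds when the major version in $1 (for example
# "7.5.0" or just "7") is less than or equal to the limit given in $2.
gcc_major_at_most() {
    major=${1%%.*}          # keep everything before the first dot
    [ "$major" -le "$2" ]
}
```

For example: gcc_major_at_most "$(gcc -dumpversion)" 7 && echo "gcc is fine". My samples were built with GCC 4.8.5, which passes this check.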

I downloaded the nvidia driver file from their website. Note that you need to install the appropriate nvidia driver based on the cuda version you are installing. (It took me a lot of time to figure this out!)

If you go to their website and search by your NVIDIA product, you'd get the latest driver: https://www.nvidia.com/download/index.aspx?lang=en-us

However, for the appropriate driver version based on cuda please download from https://www.nvidia.com/Download/driverResults.aspx/148589/en-us

NVIDIA Driver installation

sudo sh NVIDIA-Linux-x86_64-430.34.run

nvidia-smi should return

PATH$ nvidia-smi
Thu Sep  9 21:38:26 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.34       Driver Version: 430.34       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 207...  Off  | 00000000:01:00.0 Off |                  N/A |
| 21%   41C    P0     1W / 215W |      0MiB /  7982MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Finally, reboot after nvidia driver installation

sudo reboot
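
After the reboot, one way to confirm that the installed driver is new enough for the CUDA version you want is a version-aware comparison. This is a sketch: version_at_least is a made-up name, and it relies on sort -V from GNU coreutils. Which minimum to pass is for you to confirm in NVIDIA's release notes; 410.48 is simply the driver bundled with the CUDA 10.0 runfile used in this guide.

```shell
# Hypothetical helper: succeeds when version $1 is greater than or equal to
# version $2. `sort -V` orders version strings, so the smaller of the two
# comes out first; the check passes when that smaller one is the minimum.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

For example: version_at_least "$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)" 410.48 && echo "driver is new enough".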

CUDA install

Download the .run file from https://developer.nvidia.com/cuda-10.0-download-archive

Install using the below command:

sudo sh cuda_10.0.130_410.48_linux.run

It will appear like this:

Do you accept the previously read EULA?
accept/decline/quit: accept

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 410.48?
(y)es/(n)o/(q)uit: n

Install the CUDA 10.0 Toolkit?
(y)es/(n)o/(q)uit: y

Enter Toolkit Location
 [ default is /usr/local/cuda-10.0 ]: 

Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: 
Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y

Install the CUDA 10.0 Samples?
(y)es/(n)o/(q)uit: y

Enter CUDA Samples Location
 [ default is PATH ]: y

Samples location must be an absolute path
Enter CUDA Samples Location
 [ default is PATH ]: 

Installing the CUDA Toolkit in /usr/local/cuda-10.0 ...
Installing the CUDA Samples in PATH ...
Copying samples to PATH/NVIDIA_CUDA-10.0_Samples now...
Finished copying samples.

===========
= Summary =
===========

Driver:   Not Selected
Toolkit:  Installed in /usr/local/cuda-10.0
Samples:  Installed in PATH

Please make sure that
 -   PATH includes /usr/local/cuda-10.0/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-10.0/lib64, or, add /usr/local/cuda-10.0/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-10.0/bin

Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-10.0/doc/pdf for detailed information on setting up CUDA.

***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 384.00 is required for CUDA 10.0 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run -silent -driver

Logfile is /tmp/cuda_install_10720.log

Follow the command-line prompts and say 'no' for driver install and 'yes' for all the other cases. During the license agreement, hit space and that should move through the page faster.
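
If you'd rather not answer the prompts by hand, the runfile can also be driven by flags. I did not use this route, so treat it as a sketch: the flags --silent, --toolkit and --samples are the ones I believe the CUDA 10.0 runfile accepts (run the file with --help to confirm). Leaving out --driver is what corresponds to answering 'no' to the driver prompt. The helper cuda_silent_install_cmd is made up; it only prints the command so you can review it before running it.

```shell
# Hypothetical helper: prints (does not run) a non-interactive install command
# for the runfile named in $1. It selects the toolkit and the samples and
# deliberately leaves out the driver. The flag names are an assumption; check
# them against the runfile's own --help output.
cuda_silent_install_cmd() {
    printf 'sudo sh %s --silent --toolkit --samples\n' "$1"
}
```

For example, cuda_silent_install_cmd cuda_10.0.130_410.48_linux.run prints the command for the runfile used above.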

Export the CUDA paths: open your .bashrc, paste the two export lines at the end of the file, save it, and then reload it with source:

vim ~/.bashrc
export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64:$LD_LIBRARY_PATH
export PATH=/usr/local/cuda-10.0/bin:$PATH
source ~/.bashrc 
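
The plain export lines add the CUDA directories again every time .bashrc is sourced. If LD_LIBRARY_PATH starts out empty, they also leave a trailing colon, which the loader treats as the current directory. A small sketch that avoids both follows; prepend_once is a made-up helper name, not something CUDA provides.

```shell
# Hypothetical helper: prints the colon-separated list $2 with directory $1
# prepended, unless $1 is already one of its entries. An empty list yields
# just $1, with no stray colon.
prepend_once() {
    case ":$2:" in
        *":$1:"*) printf '%s\n' "$2" ;;
        *)        printf '%s\n' "$1${2:+:$2}" ;;
    esac
}

# These two lines would replace the plain export lines in ~/.bashrc.
export PATH="$(prepend_once /usr/local/cuda-10.0/bin "$PATH")"
export LD_LIBRARY_PATH="$(prepend_once /usr/local/cuda-10.0/lib64 "${LD_LIBRARY_PATH:-}")"
```

Sourcing .bashrc several times then leaves each directory in the list exactly once.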

To test CUDA installation run the following

cd /usr/local/cuda/samples
sudo make -k
./bin/x86_64/linux/release/deviceQuery

That should give you something like

./bin/x86_64/linux/release/deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce RTX 2070 SUPER"
  CUDA Driver Version / Runtime Version          10.1 / 10.0
  CUDA Capability Major/Minor version number:    7.5
  Total amount of global memory:                 7982 MBytes (8370061312 bytes)
  (40) Multiprocessors, ( 64) CUDA Cores/MP:     2560 CUDA Cores
  GPU Max Clock rate:                            1770 MHz (1.77 GHz)
  Memory Clock rate:                             7001 Mhz
  Memory Bus Width:                              256-bit
  L2 Cache Size:                                 4194304 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  1024
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 3 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Compute Preemption:            Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.1, CUDA Runtime Version = 10.0, NumDevs = 1
Result = PASS

You have successfully installed CUDA!
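
If you want to check the deviceQuery result from a script instead of reading the whole report, the two lines that matter are the driver/runtime versions and the final result. This is a sketch that assumes the output format shown above; both function names are made up.

```shell
# Hypothetical helper: reads deviceQuery output on stdin and succeeds only if
# it contains the line "Result = PASS".
device_query_passed() {
    grep -q '^Result = PASS'
}

# Hypothetical helper: reads deviceQuery output on stdin and prints the
# driver and runtime CUDA versions from the "CUDA Driver Version / Runtime
# Version" line, in the form "driver=10.1 runtime=10.0".
device_query_versions() {
    sed -n 's/^ *CUDA Driver Version \/ Runtime Version *\([0-9.]*\) \/ \([0-9.]*\).*/driver=\1 runtime=\2/p'
}
```

For example: ./bin/x86_64/linux/release/deviceQuery | device_query_passed && echo "CUDA OK". In my output the driver reports CUDA 10.1 and the runtime is 10.0, which is the expected combination here: the 430.34 driver supports up to 10.1 and the installed toolkit is 10.0.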

cuDNN installation

Download the .deb files for the appropriate cuDNN version from https://developer.nvidia.com/cudnn. You'll need an account, so go ahead and create one. Download the runtime library, the developer library, and the code samples and user guide (deb) for your Ubuntu version.

sudo dpkg -i libcudnn7_7.4.2.24-1+cuda10.0_amd64.deb
sudo dpkg -i libcudnn7-dev_7.4.2.24-1+cuda10.0_amd64.deb
sudo dpkg -i libcudnn7-doc_7.4.2.24-1+cuda10.0_amd64.deb
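
A quick check before building the samples is to read the version macros straight from the cuDNN header. This sketch assumes the developer package exposes the header as /usr/include/cudnn.h (adjust the path if yours lives elsewhere). The function name cudnn_header_version is made up; the macros CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL are the ones cudnn.h defines.

```shell
# Hypothetical helper: prints MAJOR.MINOR.PATCH from the CUDNN_MAJOR,
# CUDNN_MINOR and CUDNN_PATCHLEVEL macros in the header file given as $1.
cudnn_header_version() {
    awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
         $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
         $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
         END { print major "." minor "." patch }' "$1"
}
```

For example: cudnn_header_version /usr/include/cudnn.h. With the packages above, the version printed should agree with the package version, 7.4.2.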

To test the cuDNN installation, copy /usr/src/cudnn_samples_v7/ to any folder you can write to, then cd into that folder and build the mnistCUDNN sample:

cd cudnn_samples_v7/mnistCUDNN/
make clean && make
./mnistCUDNN

This should return

PATH:~/Desktop/Cadence/bin_quant/driver-files/cudnn_samples_v7/mnistCUDNN$ ./mnistCUDNN 
cudnnGetVersion() : 7402 , CUDNN_VERSION from cudnn.h : 7402 (7.4.2)
Host compiler version : GCC 4.8.5
There are 1 CUDA capable devices on your machine :
device 0 : sms 40  Capabilities 7.5, SmClock 1770.0 Mhz, MemSize (Mb) 7982, MemClock 7001.0 Mhz, Ecc=0, boardGroupID=0
Using device 0

Testing single precision
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm ...
Fastest algorithm is Algo 0
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.015264 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.026432 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.034240 time requiring 3464 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.059200 time requiring 207360 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.059680 time requiring 2057744 memory
Resulting weights from Softmax:
0.0000000 0.9999399 0.0000000 0.0000000 0.0000561 0.0000000 0.0000012 0.0000017 0.0000010 0.0000000 
Loading image data/three_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 0.9999288 0.0000000 0.0000711 0.0000000 0.0000000 0.0000000 0.0000000 
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 0.9999820 0.0000154 0.0000000 0.0000012 0.0000006 

Result of classification: 1 3 5

Test passed!

Testing half precision (math in single precision)
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm ...
Fastest algorithm is Algo 0
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.012288 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.020544 time requiring 28800 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.024768 time requiring 3464 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.044832 time requiring 207360 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.049696 time requiring 2057744 memory
Resulting weights from Softmax:
0.0000001 1.0000000 0.0000001 0.0000000 0.0000563 0.0000001 0.0000012 0.0000017 0.0000010 0.0000001 
Loading image data/three_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 1.0000000 0.0000000 0.0000714 0.0000000 0.0000000 0.0000000 0.0000000 
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 1.0000000 0.0000154 0.0000000 0.0000012 0.0000006 

Result of classification: 1 3 5

Test passed!

Phew! We installed all three of them!

Note: The one detail that took me hours was that the NVIDIA driver version needs to match the version that CUDA needs. Although it should have worked, installing the NVIDIA driver directly from the CUDA installer never worked for me. Also, the latest NVIDIA driver for my GPU didn't work with cuda-10.0 either.

Edit 1: To install 470.94 driver: https://www.nvidia.com/en-us/geforce/drivers/

Tue May 17 18:28:57 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.94       Driver Version: 470.94       CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| 21%   44C    P0    24W / 215W |      0MiB /  7982MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

UPDATE: If nvidia-smi fails after reboot, re-install the driver.

sudo sh NVIDIA-Linux-x86_64-470.94.run

Thu Aug  4 15:38:30 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.94       Driver Version: 470.94       CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| 21%   46C    P0    29W / 215W |      0MiB /  7982MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
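
To notice a broken driver right after a reboot instead of in the middle of a training run, a small check can go in a login script. This is a sketch; driver_status is a made-up name, and the optional argument exists only so the function can be tried without a GPU.

```shell
# Hypothetical helper: runs the probe command (default: nvidia-smi) quietly
# and prints whether the driver answered. Passing another command as $1 is
# only meant for trying the function out on a machine without a GPU.
driver_status() {
    if "${1:-nvidia-smi}" > /dev/null 2>&1; then
        echo "driver OK"
    else
        echo "driver not responding: re-run the NVIDIA .run installer"
    fi
}
```

Calling driver_status with no arguments probes nvidia-smi. If it reports a problem, re-running the driver .run file as described above is what fixed it for me.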
