
Setting Up CUDA on Arch Linux for GTX 750 Ti

CUDA and GTX 750 Ti

FUCK THE COMPLICATED MESS THAT IS CUDA, ESPECIALLY WHEN DEALING WITH OLDER GPUS!!!!!!!! SO I WRITE THIS TO HELP MYSELF NOT FORGET THESE THINGS!!!

Step 1: Installing the NVIDIA Proprietary Drivers

The first step is to install the NVIDIA proprietary driver, which is required for CUDA to function properly. For Maxwell-based GPUs like the GTX 750 Ti, use the nvidia package:

  • Go to the Arch Wiki and follow the instructions for installing the correct driver.
  • For the default kernel, install: sudo pacman -S nvidia
  • If you're using the LTS kernel, install: sudo pacman -S nvidia-lts instead
  • For any other/custom kernel, install: sudo pacman -S nvidia-dkms (pick whichever one matches your kernel)
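After installing (and rebooting), it's worth checking the driver actually loaded before touching CUDA. A minimal sketch, assuming the proprietary driver exposes the usual /proc interface (nvidia-smi works too):

```python
# Minimal check that the NVIDIA kernel driver is loaded.
# Assumption: the proprietary driver exposes /proc/driver/nvidia/version,
# as it normally does; returns None when the driver isn't loaded.
from pathlib import Path

def nvidia_driver_version(proc_file="/proc/driver/nvidia/version"):
    p = Path(proc_file)
    if not p.exists():
        return None  # driver not loaded (or a different path on your system)
    return p.read_text().splitlines()[0]  # first line names the driver version

if __name__ == "__main__":
    print(nvidia_driver_version() or "NVIDIA driver not loaded")
```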

Step 2: Install CUDA 10.1

Once the drivers are installed, we can proceed with installing CUDA 10.1, the latest version that plays nicely with the GTX 750 Ti (Compute Capability 5.0). We need to install both cuda and cudnn. You have two main installation methods:

  1. Install CUDA from the NVIDIA website.
  2. Install an older version of CUDA using Arch Linux Archive.

Method 1: Install CUDA from NVIDIA's Website

- Note: These steps are taken from here!

  1. Install CUDA:
    1. Go to the NVIDIA CUDA Downloads page and select:
      • Linux
      • x86_64
      • Ubuntu (choose the latest version) -> runfile (local)
    2. Copy the provided wget command and execute it in the terminal:
      wget https://developer.download.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda_10.1.243_418.87.00_linux.run
    3. Run the installer:
      chmod +x cuda_10.1.243_418.87.00_linux.run
      sudo ./cuda_10.1.243_418.87.00_linux.run
    4. Skip the driver installation since it's already done via pacman.
    5. Set the environment paths in `~/.bashrc`:
      export PATH=/usr/local/cuda-10.1/bin:$PATH
      export LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64:$LD_LIBRARY_PATH
    6. Reload `~/.bashrc`:
      source ~/.bashrc
  2. Install cuDNN:
    1. Go to the NVIDIA cuDNN Downloads page and select:
      • Download cuDNN v8.0.5 (November 9th, 2020), for CUDA 10.1
      • cuDNN Library for Linux (x86_64)
    2. Follow the installation instructions on that page. Once you extract the file, a folder called "cuda" will appear, containing two folders: "lib64" and "include".
    3. The guide shows this command (for a newer cuDNN):
      "sudo cp -P cudnn-linux-x86_64-8.9.7.29_cuda12-archive/lib/libcudnn* /usr/local/cuda/lib64/"
    4. For our version, that is the same as:
      "sudo cp -P path/where/you/extracted/cudnn/cuda/lib64/libcudnn* /usr/local/cuda/lib64/"
    5. Same command, different file names. Don't forget to copy the headers from "include" into /usr/local/cuda/include/ as well.

Method 2: Install CUDA via Arch Linux Archive

- These steps are taken from here.

  1. Download and install the required CUDA packages from the Arch Linux Archive:
    1. Select cuda-10.1.243-2-x86_64.pkg.tar.xz
    2. Select cudnn-8.0.5.39-1-x86_64.pkg.tar.zst
    sudo pacman -U https://archive.archlinux.org/packages/c/cuda/cuda-10.1.243-2-x86_64.pkg.tar.xz
    sudo pacman -U https://archive.archlinux.org/packages/c/cudnn/cudnn-8.0.5.39-1-x86_64.pkg.tar.zst
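Whichever method you used, a quick sanity check of the install tree saves pain later. A small sketch, assuming the runfile installs under /usr/local/cuda-10.1 while the Arch package installs under /opt/cuda (pass whichever root applies to you):

```python
# Sanity-check a CUDA install tree: the nvcc binary plus the cuDNN
# files we copied in. Pass the root matching your install method.
from pathlib import Path

def check_cuda_tree(root):
    root = Path(root)
    expected = {
        "nvcc": root / "bin" / "nvcc",
        "cudnn header": root / "include" / "cudnn.h",
        "cudnn library": root / "lib64",  # should contain libcudnn* files
    }
    return {name: path.exists() for name, path in expected.items()}

if __name__ == "__main__":
    # runfile install (Method 1); use "/opt/cuda" for the Arch package (Method 2)
    for name, ok in check_cuda_tree("/usr/local/cuda-10.1").items():
        print(f"{name}: {'OK' if ok else 'MISSING'}")
```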

Step 3: Install Python 3.7 for TensorFlow

TensorFlow 2.1.0 is compatible with CUDA 10.1, but it requires Python 3.7. To install Python 3.7 on Arch Linux:

  • Install Python 3.7 from the AUR:
    yay -S python37
  • Create a virtual environment:
    python3.7 -m venv tf_env
    source tf_env/bin/activate
  • Install TensorFlow GPU 2.1.0:
    pip install tensorflow-gpu==2.1.0
  • Also, install the required version of protobuf:
    pip uninstall protobuf
    pip install protobuf==3.20.*
  • Lastly, if you're importing a model (e.g. a .pt file), you need h5py version 2.10.0, otherwise it throws a core._numpy-style error:
    pip install h5py==2.10.0
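With tf_env active, a quick way to confirm TensorFlow actually sees the card. This uses tf.config.experimental.list_physical_devices, which already existed in the TF 2.1 era:

```python
# Check whether tensorflow-gpu can see the GPU.
# Assumption: run inside the tf_env virtualenv from the steps above.
def tf_gpu_status():
    try:
        import tensorflow as tf
    except ImportError:
        return "tensorflow not installed"
    gpus = tf.config.experimental.list_physical_devices("GPU")
    return "GPU available" if gpus else "no GPU visible"

if __name__ == "__main__":
    print(tf_gpu_status())
```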

Step 4: Install TensorRT for Deep Learning Inference

- These steps are taken from this blog guide.

- TensorRT 6 is required for deep learning inference on CUDA 10.1. To install it:

  1. Go to the TensorRT download page.
  2. Select "TensorRT 6" and fill out the stupid survey.
  3. Download the latest Ubuntu package for CUDA 10.1 (the tar archive, NOT the .deb).
  4. Extract it and copy the inner folder to your home directory:
    tar -xvf TensorRT-6.0.1.5.Ubuntu-18.04.x86_64-gnu.cuda-10.1.cudnn7.6.tar.gz
    cp -r TensorRT-6.0.1.5.Ubuntu-18.04.x86_64-gnu.cuda-10.1.cudnn7.6/TensorRT-6.0.1.5 ~
  5. Then go to your home dir and open "TensorRT-6.0.1.5/doc/pdf/TensorRT-Installation-Guide.pdf".
  6. Follow the tar installation instructions there, and consult the blog guide as well.
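One thing the tar install hinges on is getting TensorRT's lib directory onto LD_LIBRARY_PATH. A tiny sketch to check that, assuming you copied TensorRT-6.0.1.5 into your home dir as above:

```python
# Check whether the TensorRT lib directory is on LD_LIBRARY_PATH.
# Assumption: TensorRT-6.0.1.5 was copied to the home dir as in step 4.
import os
from pathlib import Path

def tensorrt_on_ld_path(trt_home="~/TensorRT-6.0.1.5"):
    lib = str(Path(trt_home).expanduser() / "lib")
    entries = os.environ.get("LD_LIBRARY_PATH", "").split(":")
    return lib in entries

if __name__ == "__main__":
    if not tensorrt_on_ld_path():
        print("add to ~/.bashrc: export LD_LIBRARY_PATH=$HOME/TensorRT-6.0.1.5/lib:$LD_LIBRARY_PATH")
```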

Step 5: Install Python 3.8 for PyTorch

PyTorch 1.8.1 is compatible with CUDA 10.1. Go to this page and Ctrl+F search for "cuda 10.1"; the latest Python version for it is Python 3.8. To install Python 3.8 on Arch Linux:

  • Install Python 3.8 from the AUR:
    yay -S python38
  • Create a virtual environment:
    python3.8 -m venv pytorch_env
    source pytorch_env/bin/activate
  • Install PyTorch with GPU support:
    pip install torch==1.8.1+cu101 torchvision==0.9.1+cu101 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
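Same idea for PyTorch: inside pytorch_env, torch.cuda.is_available() tells you whether the cu101 build found the GPU:

```python
# Check whether the cu101 PyTorch build can see the GPU.
# Assumption: run inside the pytorch_env virtualenv from the steps above.
def torch_gpu_status():
    try:
        import torch
    except ImportError:
        return "torch not installed"
    if not torch.cuda.is_available():
        return "no GPU visible"
    return f"GPU: {torch.cuda.get_device_name(0)}"

if __name__ == "__main__":
    print(torch_gpu_status())
```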

That's it!! Thank you, PyTorch, for simply showing what we need in a nice table. Fuck TensorFlow!!!

Step 6: Install g++-8 for CUDA Compilation

To compile CUDA programs, you need g++-8, as newer GCC versions are not compatible with CUDA 10.1:

  • Install it from the AUR (the package is called gcc8, which provides gcc-8 and g++-8):
    yay -S gcc8
  • Point nvcc at it when compiling: nvcc -ccbin g++-8 your_program.cu

Step 7: Patch the "math_functions.h" File

Recently, a bug involving NVIDIA's "math_functions.h" made it impossible to compile CUDA programs.

  • Navigate to "/usr/local/cuda/include/math_functions.h" and follow the patch instructions provided here: NVIDIA forum thread.

Step 8: Reboot and Verify Installation

Once everything is installed, reboot your system for good measure. Now, using the virtual environments you created, run a small Python program (ChatGPT can write one) to test the GPU with tensorflow-gpu and PyTorch. It SHOULD WORK!!!!!

Conclusion

These are the things we did:

  1. NVIDIA driver from the Arch package
  2. CUDA 10.1 and cuDNN v8.0.5
    1. For TensorFlow:
      • Python 3.7
      • pip install tensorflow-gpu==2.1.0
      • pip install protobuf==3.20.*
      • TensorRT 6
    2. For PyTorch:
      • Python 3.8
      • pip install torch==1.8.1+cu101 torchvision==0.9.1+cu101 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
  3. g++-8
  4. Patching /usr/local/cuda/include/math_functions.h

Additional Notes:

I made a bash script for compiling and running CUDA and MPI programs; you can use it to test your own programs. You should also end up with a folder in your home dir called "NVIDIA_CUDA-10.1_Samples". (After CUDA 10, all NVIDIA CUDA samples moved to their GitHub repo.)

  • My Bash Scripts
  • While testing "NVIDIA_CUDA-10.1_Samples", make sure the Makefiles use g++-8 and gcc-8, as they assume your system has them by default. Also check out this post.
  • If you didn't see "NVIDIA_CUDA-10.1_Samples", then follow this guide and see the "Verification and Testing Procedures" part.
  • Recently, NVIDIA added GSP firmware to the fallback image when using the kms hook. This filled the "/boot" partition for some people. To manage it, please follow this guide.
  • This guide also shows how to increase the size of your "/boot" partition.
  • If you get the following error while compiling a program:
    /usr/include/linux/types.h:12:27: error: expected initializer before '__s128'
                      typedef __signed__ __int128 __s128 __attribute__((aligned(16)));
                                                ^~~~~~

    Then follow this guide. Essentially all you need to do is the following:
    1. sudo vim /usr/include/linux/types.h
    2. At line "12" you will see the following line:
      typedef __signed__ __int128 __s128 __attribute__((aligned(16)));
    3. Just comment it out and compile.
    4. Remember! Be sure to uncomment the line after compiling, since this is a system header file.
    5. This is considered a bug in CUDA. You can read about it here.
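Since forgetting to restore the header is easy, here's a small sketch that comments/uncomments one line of a file for you. The path and line number (12) match the error above; this is a hypothetical helper, so try it on a copy of the file first, and you'll need root for the real header:

```python
# Toggle a "// " comment on one line of a file (1-indexed), so the
# __s128 typedef can be disabled for nvcc and restored afterwards.
# Hypothetical helper; run it against a copy first, and with care on
# /usr/include/linux/types.h (needs root).
from pathlib import Path

def toggle_line_comment(path, lineno, marker="// "):
    p = Path(path)
    lines = p.read_text().splitlines(keepends=True)
    idx = lineno - 1
    if lines[idx].startswith(marker):
        lines[idx] = lines[idx][len(marker):]  # uncomment (restore)
    else:
        lines[idx] = marker + lines[idx]       # comment out
    p.write_text("".join(lines))
    return lines[idx]

# usage (as root):
#   toggle_line_comment("/usr/include/linux/types.h", 12)  # before compiling
#   toggle_line_comment("/usr/include/linux/types.h", 12)  # after, to restore
```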