IOMMU on NVIDIA graphics

A simplified version of the ArchWiki article "PCI passthrough via OVMF", with added scripts for turning the NVIDIA GPU on and off.

Host configuration

  1. Append the kernel parameters intel_iommu=on iommu=pt and reboot. (Do not append intel_iommu=on on an AMD machine.)
  2. Run the following script to find the IOMMU group that the dGPU belongs to, and record the ids of the group members. Ids are printed in the form [1234:d231].
    #!/bin/bash
    # List every IOMMU group together with the devices it contains
    shopt -s nullglob
    for g in /sys/kernel/iommu_groups/*; do
        echo "IOMMU Group ${g##*/}:"
        for d in "$g"/devices/*; do
            echo -e "\t$(lspci -nns "${d##*/}")"
        done
    done
    
  3. Turn off the NVIDIA card (if it is turned on). This can be done with the following script:
    #!/bin/bash
    # PCI addresses of the PCIe controller and the NVIDIA card, prefixed with domain 0000
    CONTROLLER_BUS_ID="0000:$(lspci | grep Controller | grep PCI | awk '{print $1}')"
    DEVICE_BUS_ID="0000:$(lspci | grep NVIDIA | awk '{print $1}')"
    MODULES_UNLOAD=(nvidia_drm nvidia_modeset nvidia_uvm nvidia)
    
    if ! lsmod | grep -q nvidia; then
            echo "Error: no NVIDIA module is loaded" 1>&2
            exit 1
    fi
    
    for module in "${MODULES_UNLOAD[@]}"; do
        echo "Unloading module ${module}"
        sudo modprobe -r "${module}"
    done
    
    if lsmod | grep -q nvidia; then
        echo "Error: module unload failed. Run nvidia-smi to find the processes still using the card and kill them." 1>&2
        exit 1
    else
        echo 'Removing the NVIDIA device from the kernel'
        sudo tee /sys/bus/pci/devices/${DEVICE_BUS_ID}/remove <<<1
        echo 'Enabling powersave for the PCIe controller'
        sudo tee /sys/bus/pci/devices/${CONTROLLER_BUS_ID}/power/control <<<auto
    fi
    
  4. Now run modprobe vfio_pci ids=id1,id2,... using the ids recorded in step 2, separated by commas and without the brackets [].
  5. Turn on the dGPU.
    #!/bin/bash
    # PCI addresses of the PCIe controller and the NVIDIA card, prefixed with domain 0000
    CONTROLLER_BUS_ID="0000:$(lspci | grep Controller | grep PCI | awk '{print $1}')"
    DEVICE_BUS_ID="0000:$(lspci | grep NVIDIA | awk '{print $1}')"
    echo 'Turning the PCIe controller on to allow card rescan'
    sudo tee /sys/bus/pci/devices/${CONTROLLER_BUS_ID}/power/control <<<auto
    echo 'Waiting 1 second'
    sleep 1
    if [[ ! -d /sys/bus/pci/devices/${DEVICE_BUS_ID} ]]; then
            echo 'Rescanning PCI devices'
            sudo tee /sys/bus/pci/rescan <<<1
            echo "Waiting 3 seconds for rescan"
            sleep 3
    fi
    echo 'Turning the card on'
    sudo tee /sys/bus/pci/devices/${DEVICE_BUS_ID}/power/control <<<auto
    

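Steps 2 and 4 can be glued together with a small helper. This is only a sketch: the function name ids_from_lspci is made up here, and it assumes the lines fed to it come from lspci -nn (the format printed by the script in step 2), where each device id appears as [vendor:device].

```shell
#!/bin/bash
# Hypothetical helper: read `lspci -nn` lines on stdin and print the
# vendor:device ids as one comma-separated list, ready for `ids=`.
ids_from_lspci() {
    # Device ids look like [10de:1f91]; class codes such as [0302]
    # contain no colon, so they are not matched.
    grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' \
        | tr -d '[]' \
        | LC_ALL=C sort -u \
        | paste -sd, -
}
```

Feeding it the lines printed for the dGPU's group gives a list that can be pasted after modprobe vfio_pci ids= as is. Which group members to include is still your call; the helper only does the formatting.
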
To turn the dGPU off again, first run modprobe -r vfio_pci, then rerun the turn-off script from step 3.
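
Before loading or unloading anything, it can help to see which driver currently owns the card. The sketch below reads the driver symlink that sysfs keeps for each PCI device. The function name bound_driver and the SYSFS_PCI override are assumptions made for this example (the override only exists so the function can be tried against a fake directory tree).

```shell
#!/bin/bash
# Hypothetical helper: print the kernel driver bound to a PCI device,
# or "none" if no driver is bound or the device is not present.
# SYSFS_PCI may be overridden for testing; the default is the real sysfs path.
bound_driver() {
    local dev=$1
    local link="${SYSFS_PCI:-/sys/bus/pci/devices}/${dev}/driver"
    if [[ -L $link ]]; then
        # The symlink points at .../drivers/<name>; keep only <name>.
        basename "$(readlink "$link")"
    else
        echo none
    fi
}
```

For the card used in this note, bound_driver 0000:01:00.0 should print vfio-pci after step 4 and none once the module has been removed.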

Note that writing to the sysfs files (via sudo tee) to power off the GPU while the nvidia or vfio-pci modules are still loaded causes the writing process to hang, and it cannot be terminated even with kill -9.
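
One way to avoid that hang is to check for the modules before touching sysfs. The following guard is a sketch, not part of the original scripts: the function name safe_to_power_off and the MODULES_FILE override are invented for this example, and the module list simply mirrors the ones named in this note.

```shell
#!/bin/bash
# Hypothetical guard: return 0 only if none of the nvidia or vfio_pci
# modules are loaded. MODULES_FILE may be overridden for testing;
# /proc/modules is the file lsmod itself reads.
safe_to_power_off() {
    local modules_file=${MODULES_FILE:-/proc/modules}
    local loaded
    # The first column of /proc/modules is the module name (underscored).
    loaded=$(awk '$1 ~ /^(nvidia|nvidia_drm|nvidia_modeset|nvidia_uvm|vfio_pci)$/ {print $1}' "$modules_file")
    if [[ -n $loaded ]]; then
        echo "Refusing to continue, modules still loaded: ${loaded//$'\n'/ }" >&2
        return 1
    fi
}
```

Calling it as safe_to_power_off || exit 1 just before the sudo tee lines of the turn-off script stops the script instead of leaving an unkillable process behind.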

Guest configuration

An example follows. The relevant line is -device vfio-pci,host=01:00.0, where 01:00.0 is the PCI address of the dGPU, which can be found by running lspci | grep -i nvidia. On my machine this prints 01:00.0 3D controller: NVIDIA Corporation TU117M [GeForce GTX 1650 Mobile / Max-Q] (rev a1), hence host=01:00.0.

#!/usr/bin/env sh

PROCESSORS=2
MEMORY=2048

qemu-system-x86_64 -enable-kvm \
		-cpu host \
		-machine q35 \
		-nographic \
		-smp $PROCESSORS \
		-m $MEMORY \
		-device virtio-net,netdev=net0 -netdev user,id=net0,hostfwd=tcp::2222-:22 \
		-device virtio-balloon \
		-drive file=FreeBSD-13.0-CURRENT-amd64.qcow2,if=virtio,aio=native,cache.direct=on \
		-device vfio-pci,host=01:00.0
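
The host= address does not have to be typed by hand. As a rough sketch (the function name nvidia_host_addr is invented here, and it assumes the first NVIDIA line in the lspci output is the dGPU you want to pass through), the address can be pulled from the first column of that line:

```shell
#!/bin/bash
# Hypothetical helper: read `lspci` output on stdin and print the PCI
# address (first column) of the first line that mentions NVIDIA.
nvidia_host_addr() {
    awk 'tolower($0) ~ /nvidia/ { print $1; exit }'
}
```

The QEMU line could then read -device vfio-pci,host="$(lspci | nvidia_host_addr)". On a machine with more than one NVIDIA function (for example an HDMI audio device), check that the first match is the one you mean.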