@theodric
Last active December 24, 2024 08:48
Notes on device passthrough configuration for KVM hosts
2024-12-05: I have moved this information to a git repo which includes BIOS screenshots.
Please go here for all future updates:
https://github.com/theodric/kvm-vfio-notes
---------------------
Notes on getting KVM VFIO working on my hardware:
- CPU: Ryzen 7 5700G
- MB: Biostar B550T-SILVER
- Host GFX: Ryzen 7 5700G iGPU
- Passthrough GFX: PCIe Nvidia RTX 4000 SFF Ada Generation 20GB
- BIOS version: B55AK830 (AMD AM4 AGESA 1.2.0.Cc) (https://www.biostar.com.tw/app/en/mb/introduction.php?S_ID=1010&data-type=DOWNLOAD)
As of 2024-12-05, KVM VFIO passthrough of a current-gen Nvidia card while retaining the Ryzen 5700G's IGP as the host's framebuffer console is fully working.
I now provide a set of screenshots documenting every BIOS option, whether I changed it or not, as a means of providing a "known-good snapshot" of a working BIOS configuration. You will find them in the /screenshots directory of this repo.
December 2024 ADDENDUM TO BELOW NOTES:
1. I have found that the linux-amd-znver3 kernel reliably works for passthrough, initializing devices in a sane order and respecting boot-time blacklist kernel args: https://aur.archlinux.org/packages/linux-amd-znver3
2. I have since managed to get the Nvidia RTX A2000 12GB working in passthrough
3. The RTX 4000 SFF Ada Generation 20GB works as a drop-in replacement for the A2000.
4. Stability is much improved. I don't know what to blame for the remaining crashes, but they are infrequent enough (i.e. one crash every few months) that I haven't spent time on them.
5. I have recently transitioned away from Arch to openSUSE Tumbleweed, which works out-of-the-box as a passthrough host and does not constantly break.
6. I am separately maintaining variants of the linux-amd-znver3 and linux-amd-znver2 configs which have been tweaked to work on openSUSE and are available at https://github.com/theodric/linux-amd-zen2-zen3/
2022 NOTES - FULLY WORKING PASSTHROUGH
In this configuration, I have two fully independent "computer stations" that can be used simultaneously:
1. the host Linux system, on which I have a KDE desktop
2. the guest VM, which can be Windows, Linux, or (theoretically) macOS (but that's illegal, so don't break the law, OK?)
Arch Linux
CPU: Ryzen 7 5700G
MB: Biostar B550T-SILVER
Host GFX: Ryzen 7 5700G iGPU
Passthrough GFX: PCIe MSI Radeon 560 ITX (fully working in Windows and Linux guests)
(new: Nvidia RTX A2000 12GB - so far not cooperating in any way - curse you, Nvidia!)
In UEFI setup:
  In Advanced -> PCI Subsystem Settings:
    Enable Above 4G decoding
    Disable Resizable BAR
    Enable SR-IOV support
    Enable BME DMA Mitigation
  In Advanced -> CSM Support:
    Option ROM Execution:
      Network - UEFI
      Storage - UEFI
      Video - Legacy - forces PCIe GPU to legacy mode so it doesn't get grabbed by UEFI, which would stop it from being free for passthrough to the VM
      Other PCI Device - UEFI
  In Chipset:
    In North Bridge Configuration:
      IOMMU -> Enabled
      Primary video device -> IGD Video
Append these kernel args: iommu=pt and vfio-pci.ids=<PCI IDs of all cards to be passed through>
Find those IDs and check their IOMMU group assignments with this script:
[root@cube ~]# cat vfio.sh
#!/bin/bash
# List every IOMMU group and the PCI devices (with vendor:device IDs) in it.
shopt -s nullglob
for g in $(find /sys/kernel/iommu_groups/* -maxdepth 0 -type d | sort -V); do
    echo "IOMMU Group ${g##*/}:"
    for d in "$g"/devices/*; do
        echo -e "\t$(lspci -nns "${d##*/}")"
    done
done
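To look up the group of a single device instead of listing everything, the same sysfs tree can be read directly. This is a minimal sketch: the function name iommu_group_of and its optional sysfs-root argument are made up for illustration, and the address in the usage comment is a placeholder.

```shell
#!/bin/bash
# Print the IOMMU group number of one PCI device.
# Usage: iommu_group_of <pci-address> [sysfs-root]
# The optional sysfs-root (default /sys) is an assumption added so the
# function can be pointed at a test directory.
iommu_group_of() {
    local addr=$1 sysfs=${2:-/sys}
    local link="$sysfs/bus/pci/devices/$addr/iommu_group"
    # iommu_group is a symlink whose target ends in the group number
    [ -L "$link" ] || { echo "no IOMMU group for $addr" >&2; return 1; }
    basename "$(readlink "$link")"
}

# On a live host (placeholder address):
#   iommu_group_of 0000:01:00.0
```

If the function reports no group at all, the IOMMU is most likely still disabled in the UEFI setup.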
I am passing through 1 or 2 ports of the onboard USB 3.2 controller that is in its own PCIe device group, because the other controllers are in groups with other devices and don't break out cleanly even with the additional kernel args, resulting in instability.
The WLAN socket on this board functions perfectly as a standard PCIe 4x slot with an adapter cable, but again, it's part of a group of several other devices. However, adding a USB3 card here helps make up for the ports lost to passthrough.
-> TODO: see if my PCIe switch works on this
Works with all kernels tested so far:
[root@cube ~]# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-linux-zen root=UUID=cbde20e5-0804-4179-ae95-7c1752d86f04 rw loglevel=3 net.ifnames=0 biosdevname=0 noibrs noibpb nopti nospectre_v2 nospectre_v1 l1tf=off nospec_store_bypass_disable no_stf_barrier mds=off tsx=on tsx_async_abort=off mitigations=off iommu=pt vfio-pci.ids=1002:67ff,1002:aae0,1022:1639
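A quick way to confirm that both passthrough arguments actually reached the running kernel is to test the string from /proc/cmdline. A minimal sketch; the function name check_passthrough_args is hypothetical and only the two arguments named above are checked.

```shell
#!/bin/bash
# Report whether a kernel command line carries the two passthrough arguments.
# Usage: check_passthrough_args "<cmdline string>"
# Returns 0 if both are present, 1 otherwise.
check_passthrough_args() {
    # Pad with spaces so each argument can be matched as a whole word
    local cmdline=" $1 " missing=0
    case "$cmdline" in
        *" iommu=pt "*) echo "iommu=pt: present" ;;
        *) echo "iommu=pt: MISSING"; missing=1 ;;
    esac
    case "$cmdline" in
        *" vfio-pci.ids="*) echo "vfio-pci.ids: present" ;;
        *) echo "vfio-pci.ids: MISSING"; missing=1 ;;
    esac
    return "$missing"
}

# On a live host:
#   check_passthrough_args "$(cat /proc/cmdline)"
```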
After that, it's just a matter of assigning the PCIe devices and/or host USB devices to the VM in either the virt-manager GUI, or else directly in the config file if you're a masochist.
To get rid of the Spice display that is auto-populated, you may have to resort to editing XML, because the GUI will often refuse to let you delete it, citing some dependency.
After that, you can just toggle the virtual (non-GPU) display on and off as required by changing the Video -> Model setting in the VM properties between "None" and the other options.
######### OLD 2021 NOTES - PROXMOX ON MAC PRO 5,1 ################
The device to be passed through can't be held by any driver.
OSX-KVM instructions say to blacklist the graphics card driver and use only integrated graphics, but that doesn't work for me because my Socket 1366 Xeon does not have onboard graphics, and both of the video cards in the system use the amdgpu kernel module.
The solution is to use the sysfs interface to unbind the device I will pass through to macOS from the driver, like so:
root@kvm-mule:/sys/bus/pci/drivers/amdgpu# lspci -vv | grep -i amd
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Baffin [Radeon RX 550 640SP / RX 560/560X] (rev cf) (prog-if 00 [VGA controller])
    Kernel driver in use: amdgpu
    Kernel modules: amdgpu
01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Baffin HDMI/DP Audio [Radeon RX 550 640SP / RX 560/560X]
02:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Tahiti XT [Radeon HD 7970/8970 OEM / R9 280X] (prog-if 00 [VGA controller])
    Kernel modules: radeon, amdgpu
02:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Tahiti HDMI Audio [Radeon HD 7870 XT / 7950/7970]
root@kvm-mule:/sys/bus/pci/drivers/amdgpu# echo -n 0000:01:00.0 > unbind #detach - works live
root@kvm-mule:/sys/bus/pci/drivers/amdgpu# echo -n 0000:01:00.0 > bind #reattach - works live
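The two echo commands above can be wrapped so they work from any directory and refuse anything other than bind or unbind. A minimal sketch: the name pci_driver_ctl and the optional sysfs-root argument are assumptions for illustration, and it needs root on a live host.

```shell
#!/bin/bash
# Detach or reattach a PCI device from a kernel driver via sysfs.
# Usage: pci_driver_ctl unbind|bind <driver> <pci-address> [sysfs-root]
# sysfs-root defaults to /sys; the parameter exists only for testing.
pci_driver_ctl() {
    local action=$1 driver=$2 addr=$3 sysfs=${4:-/sys}
    local ctl="$sysfs/bus/pci/drivers/$driver/$action"
    case "$action" in
        bind|unbind) ;;
        *) echo "action must be bind or unbind" >&2; return 2 ;;
    esac
    # The control file only exists while the driver is registered
    [ -e "$ctl" ] || { echo "no such driver directory: $driver" >&2; return 1; }
    # Same effect as: echo -n <addr> > /sys/bus/pci/drivers/<driver>/<action>
    printf '%s' "$addr" > "$ctl"
}

# On a live host, as root:
#   pci_driver_ctl unbind amdgpu 0000:01:00.0
#   pci_driver_ctl bind   amdgpu 0000:01:00.0
```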
@PapaBaer1968

Hello theodric,
your vfio script was the last kick to get my system w. pci-passthrough running.
Same MB/CPU, GTX1660Super on Debian 12(sip).
Thank you very much.
best regards
PapaBaer68

@theodric
Author

Awesome! Glad I could do something useful for someone 🙂

@PapaBaer1968

Ah, you're on the night-shift too.
Have you found a solution to get lm-sensors to work with the MB?

@theodric
Author

I ran 'sensors-detect' and said YES to everything, and what I got was:

[root@cube ~]# sensors
k10temp-pci-00c3
Adapter: PCI adapter
Tctl:         +56.8°C

amdgpu-pci-0600
Adapter: PCI adapter
vddgfx:        1.44 V
vddnb:       843.00 mV
edge:         +34.0°C
PPT:          38.00 W

nvme-pci-0500
Adapter: PCI adapter
Composite:    +48.9°C  (low  =  -5.2°C, high = +83.8°C)
                       (crit = +87.8°C)

acpitz-acpi-0
Adapter: ACPI interface
temp1:        +56.0°C  (crit = +127.0°C)

I guess I didn't really care to mess with it further :)

@PapaBaer1968

Yes, that's what I get.
I'll try to get more out of it.
Thank you

@lreeves

lreeves commented Mar 5, 2024

I have no idea what "BME DMA Mitigation" does or means but finding this Gist fixed an issue I was having all day trying to get my kernel to initialize an AMD GPU properly. Thanks!

@theodric
Author

theodric commented Mar 9, 2024

"Enable or disable (default) Bus Master Attribute that is disabled after PCI enumeration for PCI bridges after SMM is locked."

Why that would stop a card coming up is a mystery to me, but hey, if it works, it works :)

@Theadre

Theadre commented Dec 18, 2024

Awesome! Glad I could do something useful for someone 🙂

Hello theodric,

I'm on openSUSE Tumbleweed

  • CPU: Ryzen 7 5750GE
  • MB: Asus B550I
  • Host GFX: Radeon RX 6800
  • Passthrough GFX: Nvidia Quadro RTX 4000

07:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU104GL [Quadro RTX 4000] [10de:1eb1] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:12a0]
Kernel driver in use: vfio-pci
Kernel modules: nouveau
07:00.1 Audio device [0403]: NVIDIA Corporation TU104 HD Audio Controller [10de:10f8] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:12a0]
Kernel driver in use: vfio-pci
Kernel modules: snd_hda_intel
07:00.2 USB controller [0c03]: NVIDIA Corporation TU104 USB 3.1 Host Controller [10de:1ad8] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:12a0]
Kernel driver in use: vfio-pci
Kernel modules: xhci_pci
07:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU104 USB Type-C UCSI Controller [10de:1ad9] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:12a0]
Kernel driver in use: vfio-pci
Kernel modules: i2c_nvidia_gpu

[ 0.013261] [ T0] AMD-Vi: Unknown option - 'on'
[ 0.048938] [ T0] AMD-Vi: Using global IVHD EFR:0x206d73ef22254ade, EFR2:0x0
[ 0.344591] [ T1] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[ 0.345193] [ T1] AMD-Vi: Extended features (0x206d73ef22254ade, 0x0): PPR X2APIC NX GT IA GA PC GA_vAPIC
[ 0.345197] [ T1] AMD-Vi: Interrupt remapping enabled
[ 0.345197] [ T1] AMD-Vi: X2APIC enabled
[ 0.580940] [ T1] AMD-Vi: Virtual APIC enabled

Could you please help me pass my Quadro RTX 4000 through to a VM? After I added the PCI device, I get this error message when starting the VM guest:

error starting domain: internal error: QEMU unexpectedly closed the monitor (vm='Ename'): 2024-12-18T22:09:49.683873Z qemu-system-x86_64: -device {"driver":"vfio-pci","host":"0000:07:00.0","id":"hostdev0","bus":"pci.0","addr":"0xa"}: vfio 0000:07:00.0: group 15 is not viable
Please ensure all devices within the iommu_group are bound to their vfio bus driver.

Traceback (most recent call last):
File "/usr/share/virt-manager/virtManager/asyncjob.py", line 71, in cb_wrapper
callback(asyncjob, *args, **kwargs)
File "/usr/share/virt-manager/virtManager/asyncjob.py", line 107, in tmpcb
callback(*args, **kwargs)
File "/usr/share/virt-manager/virtManager/object/libvirtobject.py", line 57, in newfn
ret = fn(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/share/virt-manager/virtManager/object/domain.py", line 1405, in startup
self._backend.create()
File "/usr/lib64/python3.11/site-packages/libvirt.py", line 1379, in create
raise libvirtError('virDomainCreate() failed')
libvirt.libvirtError: internal error: QEMU unexpectedly closed the monitor (vm='Ename'): 2024-12-18T22:09:49.683873Z qemu-system-x86_64: -device {"driver":"vfio-pci","host":"0000:07:00.0","id":"hostdev0","bus":"pci.0","addr":"0xa"}: vfio 0000:07:00.0: group 15 is not viable
Please ensure all devices within the iommu_group are bound to their vfio bus driver.

@theodric
Author

theodric commented Dec 23, 2024

Hi Theadre, you're running a different board than I am, so I can't give you specific advice, but here are some things you could try.

It looks like your RTX 4000 exposes 4 devices with the following PCI IDs:
10de:1ad9, 10de:1ad8, 10de:10f8, 10de:1eb1

Specify those in your kernel cmdline (in /etc/default/grub) e.g. vfio-pci.ids=10de:1ad9,10de:1ad8,10de:10f8,10de:1eb1 and run sudo grub2-mkconfig -o /boot/grub2/grub.cfg to make the change take effect.
-- as I say in the gist, you should append at least these two kernel args: iommu=pt and vfio-pci.ids=<PCI IDs of all cards to be passed through>
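Rather than copying the IDs by hand, the vfio-pci.ids= argument can be assembled from lspci output. A minimal sketch: the function name vfio_ids_arg is made up, and it assumes plain `lspci -nn` lines (without -k or -v, whose Subsystem lines also contain ID pairs and would be picked up too).

```shell
#!/bin/bash
# Read `lspci -nn` style lines on stdin and print a vfio-pci.ids= argument
# built from the [vendor:device] pair on each device line.
vfio_ids_arg() {
    local ids
    # Class codes such as [0300] have no colon, so only ID pairs match
    ids=$(grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' \
        | tr -d '[]' | LC_ALL=C sort -u | paste -sd, -)
    [ -n "$ids" ] && echo "vfio-pci.ids=$ids"
}

# On a live host, for every function of the card in slot 07:00:
#   lspci -nn -s 07:00 | vfio_ids_arg
```

The IDs come out sorted rather than in slot order, which makes no difference to the kernel.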

There are other considerations like VFIO only working reliably when all devices to be passed through:

  1. are the only things in their IOMMU group (unless you specify a not-recommended kernel argument that enables passing through only part of an IOMMU group), and
  2. are ALL passed through at the same time.

To determine your IOMMU group allocations, download and run https://github.com/theodric/kvm-vfio-notes/blob/main/vfio.sh to print which IOMMU groups your PCIe devices, including the RTX 4000, are in. If anything other than the 4 PCI IDs I mentioned at the beginning is in the same IOMMU group, then either pull the RTX out, physically move it to another slot, and try again, or (unsafe, will probably crash your system eventually) add the pcie_acs_override=downstream and vfio_iommu_type1.allow_unsafe_interrupts=1 flags to your cmdline. Note that pcie_acs_override only exists in kernels that carry the ACS override patch. If you don't have the slots to choose from, you're better off having the card that is not going to be passed through assigned to an IOMMU group that is shared with something that won't be passed through.
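The "group 15 is not viable" error usually means something in that group is still held by a non-VFIO driver. A minimal sketch that lists such devices; the name group_not_on_vfio and the optional sysfs-root argument are assumptions for illustration. It is deliberately simpler than the kernel's own check: devices with no driver are skipped, and any PCI bridges it lists (typically on pcieport) are generally not the problem.

```shell
#!/bin/bash
# List devices in an IOMMU group that are bound to a driver other than vfio-pci.
# Usage: group_not_on_vfio <group-number> [sysfs-root]
# Prints "<pci-address> <driver>" per offender; returns 1 if any were found.
group_not_on_vfio() {
    local group=$1 sysfs=${2:-/sys} dev drv found=0
    for dev in "$sysfs/kernel/iommu_groups/$group/devices"/*; do
        [ -e "$dev" ] || continue
        # Devices with no driver bound have no driver symlink; skip them
        [ -L "$dev/driver" ] || continue
        drv=$(basename "$(readlink "$dev/driver")")
        if [ "$drv" != "vfio-pci" ]; then
            echo "${dev##*/} $drv"
            found=1
        fi
    done
    return "$found"
}

# On a live host, for the group named in the error message:
#   group_not_on_vfio 15
```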

If that is still not working, you may need to try detaching the driver (looks like nouveau) from the card before trying to pass it through, as I had to do when experimenting on a Mac Pro. This will blank the monitor, of course, so make sure you have another way to get in, which I guess would be your AMD GPU.
I have only tried detaching an AMD card/amdgpu driver, but in principle it should work with an Nvidia card and nouveau driver as well. The relevant commands extracted from the above document are:

#detach - works live
root@kvm-mule:/sys/bus/pci/drivers/amdgpu# echo -n 0000:01:00.0 > unbind

#reattach - works live
root@kvm-mule:/sys/bus/pci/drivers/amdgpu# echo -n 0000:01:00.0 > bind

You of course will need to substitute the actual IDs of your card within the correct driver there!
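To find out which driver directory to substitute, the driver currently holding a device can be read from sysfs. A minimal sketch; pci_driver_of and its optional sysfs-root argument are hypothetical names used only for illustration.

```shell
#!/bin/bash
# Print the name of the kernel driver currently bound to a PCI device,
# or "none" if nothing is bound.
# Usage: pci_driver_of <pci-address> [sysfs-root]
pci_driver_of() {
    local addr=$1 sysfs=${2:-/sys}
    local link="$sysfs/bus/pci/devices/$addr/driver"
    # driver is a symlink into /sys/bus/pci/drivers/<name> while bound
    [ -L "$link" ] || { echo "none"; return 0; }
    basename "$(readlink "$link")"
}

# On a live host (address taken from the lspci listing earlier in the thread):
#   pci_driver_of 0000:07:00.0
```

The printed name is the directory under /sys/bus/pci/drivers/ whose unbind file takes the device address.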

If that's STILL not working then you could blacklist nouveau entirely in your cmdline using modprobe.blacklist=nouveau. This should have the effect of ensuring that nothing has grabbed on to the card within the OS so it is free for KVM to do things to.
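After rebooting with the blacklist in place, it is worth confirming that the module really stayed out. A minimal sketch that checks /proc/modules; the function name module_loaded and the optional file argument are assumptions for illustration, and drivers built into the kernel never appear in that file.

```shell
#!/bin/bash
# Succeed if the named kernel module appears in the loaded-module list.
# Usage: module_loaded <name> [modules-file]
# modules-file defaults to /proc/modules; the parameter exists for testing.
module_loaded() {
    # /proc/modules always spells names with underscores
    local name=${1//-/_} modules=${2:-/proc/modules}
    # One module per line, name in the first field
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$modules"
}

# On a live host:
#   module_loaded nouveau && echo "nouveau is still loaded"
```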

You can refer to the new repo for this information over at https://github.com/theodric/kvm-vfio-notes where I have screenshots of all my selected BIOS options. I have to enable a couple of possibly counter-intuitive settings to get the Nvidia card NOT to be activated by the BIOS or grabbed by the kernel, leaving it available for my VMs to use. I use the iGPU inside the CPU for the video console on the host, and the Nvidia card is ONLY used passed through to KVM VMs with its own monitor/input devices.

Do let me know if this works, or doesn't! Good luck.

@Theadre

Theadre commented Dec 24, 2024

Hi theodric,

I gave up with Nvidia...
I replaced my GPU with an AMD one. It seems to be all good now.
Thank you for replying to me though!

Nvidia better not exist for linux world, have a good one!
Cheers,
