How to get FreeBSD + Bhyve + SR-IOV to work

Sadly, for now, only one VM at a time works. I tested with an X552 10GBASE-T NIC and a 10-Gigabit X540-AT2, on FreeBSD 12.1-RELEASE-p10 with ix(4) driver 3.3.14.

Preparations

Checks

Check that the device you want to use, in this case ix0, shows up in /dev/iov. If it does not, the problem is the device driver, a BIOS setting, the CPU, or the PCI root bridge. The device needs the SR-IOV capability and the PCIe bridges need the Access Control Services (ACS) capability. SR-IOV must be enabled in the BIOS, in the PCIe settings. The output of pciconf -c -l shows the capabilities:
# pciconf -c -l
...
pcib2@pci0:0:1:0:       class=0x060400 card=0x086d15d9 chip=0x6f028086 rev=0x02 hdr=0x01
    cap 0d[40] = PCI Bridge card=0x086d15d9
    cap 05[60] = MSI supports 2 messages, vector masks 
    cap 10[90] = PCI-Express 2 root port max data 128(256) ARI disabled
                 link x0(x4) speed 0.0(8.0) ASPM disabled(L1)
                 slot 1 power limit 0 mW
    cap 01[e0] = powerspec 3  supports D0 D3  current D0
    ecap 000b[100] = Vendor 1 ID 2
    ecap 000d[110] = ACS 1
    ecap 0001[148] = AER 1 0 fatal 0 non-fatal 0 corrected
    ecap 000b[1d0] = Vendor 1 ID 3
    ecap 0019[250] = PCIe Sec 1 lane errors 0
    ecap 000b[280] = Vendor 1 ID 5
    ecap 000b[300] = Vendor 1 ID 8

...
Note the ACS capability (ecap 000d).
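The ACS check can be scripted instead of read by eye. The following is a minimal sketch, not part of the original setup: the function name acs_devices is made up for illustration, and it assumes the pciconf -c -l layout shown above (an unindented name@pci... header line followed by indented cap/ecap lines).

```shell
# Read `pciconf -c -l` output on stdin and print the name of every
# device that advertises the ACS extended capability (ecap 000d).
# acs_devices is a hypothetical helper name, not a FreeBSD tool.
acs_devices() {
    awk '
        # Header line of a device, e.g. "pcib2@pci0:0:1:0: class=..."
        $1 ~ /@pci/ { dev = $1; sub(/@.*/, "", dev); next }
        # Extended capability line, e.g. "ecap 000d[110] = ACS 1"
        $1 == "ecap" && $2 ~ /^000d\[/ { print dev }
    '
}
```

On the host this would be run as pciconf -c -l | acs_devices. The bridge above the NIC (pcib2 in the example output) should appear in the list.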

Check that the vmm.ko module loads. If it does not, you have an incorrect BIOS setting or a CPU that is too old or unsupported. VT-d or AMD-Vi must be enabled in the CPU settings of the BIOS to enable the IOMMU. After the iovctl command has created the VFs, the output of pciconf -c -l shows the SR-IOV state of the device:

# pciconf -c -l
...
ix0@pci0:3:0:0: class=0x020000 card=0x15ad15d9 chip=0x15ad8086 rev=0x00 hdr=0x00
    cap 01[40] = powerspec 3  supports D0 D3  current D0
    cap 05[50] = MSI supports 1 message, 64 bit, vector masks 
    cap 11[70] = MSI-X supports 64 messages, enabled
                 Table in map 0x20[0x0], PBA in map 0x20[0x2000]
    cap 10[a0] = PCI-Express 2 endpoint max data 256(512) FLR
                 link x1(x1) speed 2.5(2.5) ASPM L1(L0s/L1)
    cap 03[e0] = VPD
    ecap 0001[100] = AER 2 0 fatal 0 non-fatal 1 corrected
    ecap 0003[140] = Serial 1 0000c9ffff000000
    ecap 000e[150] = ARI 1
    ecap 0010[160] = SR-IOV 1 IOV enabled, Memory Space enabled, ARI enabled
                     4 VFs configured out of 64 supported
                     First VF RID Offset 0x0080, VF RID Stride 0x0002
                     VF Device ID 0x15a8
                     Page Sizes: 4096 (enabled), 8192, 65536, 262144, 1048576, 4194304
    iov bar  [184] = type Memory, range 64, base 0xfb200000, size 16384, enabled
    iov bar  [190] = type Memory, range 64, base 0xfb210000, size 16384, enabled
    ecap 000d[1b0] = ACS 1
    ecap 0018[1c0] = LTR 1
...
Note that SR-IOV is enabled, and note the number of configured VFs and the maximum number supported.
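To pull those two numbers out of the output in a script, something like the sketch below could be used. The function name vf_counts is hypothetical, and the pattern assumes the exact "N VFs configured out of M supported" wording shown in the output above.

```shell
# Read `pciconf -c -l` output on stdin and print
# "<configured> <supported>" for each SR-IOV capable device.
# vf_counts is a hypothetical helper name.
vf_counts() {
    sed -n 's/^.* \([0-9][0-9]*\) VFs configured out of \([0-9][0-9]*\) supported.*$/\1 \2/p'
}
```

On the host this would be run as pciconf -c -l | vf_counts. The first number should match num_vfs from the iovctl configuration and must not exceed the second.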

Configuration

/boot/loader.conf

...
vmm_load="YES"
if_ix_updated_load="YES"
...

/etc/rc.conf

...
iovctl_files="/etc/iovctl-ix0.conf"
...

/etc/iovctl-ix0.conf

PF {
	device: "ix0";
	num_vfs: 4;
}
DEFAULT {
	passthrough: true;
	allow-set-mac: true;
	allow-promisc: true;
}
This gives you four ppt devices that you can pass through to the VMs. Then list the PCI devices with pciconf -l and find the passthrough device IDs, for example:
# pciconf -l | grep ^ppt
ppt0@pci0:3:0:128:	class=0x020000 card=0x15ad15d9 chip=0x15a88086 rev=0x00 hdr=0x00
ppt1@pci0:3:0:130:	class=0x020000 card=0x15ad15d9 chip=0x15a88086 rev=0x00 hdr=0x00
ppt2@pci0:3:0:132:	class=0x020000 card=0x15ad15d9 chip=0x15a88086 rev=0x00 hdr=0x00
ppt3@pci0:3:0:134:	class=0x020000 card=0x15ad15d9 chip=0x15a88086 rev=0x00 hdr=0x00
Here you see PCI device 3:0:128 for the first device. Put that number in the configuration of the first VM (bhyve expects it with slashes, as 3/0/128). In the second VM you put 3:0:130, and so on.
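The conversion from the pciconf selector to the bus/slot/function form used on the bhyve command line can be sketched as follows. The function name ppt_slots is made up for illustration. It assumes the name@pciDOMAIN:BUS:SLOT:FUNCTION: selector format shown above and drops the domain.

```shell
# Read `pciconf -l` output on stdin and print, for every ppt device,
# its name and the bus/slot/function string for bhyve passthru.
# ppt_slots is a hypothetical helper name.
ppt_slots() {
    awk '$1 ~ /^ppt[0-9]+@pci/ {
        split($1, part, "@")      # part[1] = "ppt0", part[2] = "pci0:3:0:128:"
        split(part[2], sel, ":")  # sel[1] = "pci0", then bus, slot, function
        print part[1], sel[2] "/" sel[3] "/" sel[4]
    }'
}
```

On the host, pciconf -l | ppt_slots would then print one line per VF, such as "ppt0 3/0/128", ready to paste after -p for vmrun.sh or after passthru, in a bhyve -s argument.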

/vm/friet1-ppt.sh

A FreeBSD guest.
...
sh /usr/share/examples/bhyve/vmrun.sh -m 2048 -d friet1.dsk -p 3/0/128 -c 2 friet1

ketamine-install.sh

A NetBSD guest.
echo "(cd0) ./NetBSD-9.0-amd64.iso" > install.map
echo "(hd1) ./nbsd1.dsk" >> install.map


echo "knetbsd -h -r cd0a (cd0)/netbsd ; boot" | grub-bhyve -r cd0 -M 1G -S -m  install.map ketamine
sleep 2

bhyve -A -H -P -S -s 0:0,hostbridge -s 1:0,lpc \
    -s 2:0,passthru,3/0/128 \
    -s 3:0,virtio-blk,./ketamine.dsk \
    -s 4:0,ahci-cd,./NetBSD-9.0-amd64.iso \
    -l com1,stdio -c 2 -m 1G ketamine

bhyvectl --vm=ketamine --destroy
Remove the (hd1) items to run the VM after installation.

Further work

The ugly part

It only works for one VM. As soon as a second VM runs ifconfig on its NIC, networking stops in every VM. If you then shut down the second VM and run ifconfig ixv0 down followed by ifconfig ixv0 up on the first VM, it works again.

Also, even with allow-promisc: true, promiscuous mode is not possible; it fails with BIOCPROMISC: Operation not supported.

References

As with a lot of things, once you know how to do them, they are easy.

Note

And now for something completely different: do not put untagged VLAN interfaces (such as ix0.7) in a bridge. Performance will be horrible. Ping works, but there is packet loss, resulting in lots of TCP retransmits and lots of duplicate ACKs. 8 kB/s is optimistic. Just put the tagged interface (ix0) in the bridge and configure the VLAN in the guest. This has security implications if the guest is untrusted, because the guest can then see and send traffic on every VLAN carried by ix0.
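As a rough sketch of that layout in rc.conf terms: everything except ix0 and VLAN 7 is an assumption for illustration, namely the names bridge0 and tap0 on the host, a virtio NIC that shows up as vtnet0 in a FreeBSD guest, and DHCP on the VLAN. Adjust it to your own setup.

```shell
# Host /etc/rc.conf: bridge the tagged parent interface, not ix0.7.
# bridge0 and tap0 are assumed names.
cloned_interfaces="bridge0 tap0"
ifconfig_ix0="up"
ifconfig_bridge0="addm ix0 addm tap0 up"

# Guest /etc/rc.conf (FreeBSD guest, NIC assumed to be vtnet0):
# create VLAN 7 inside the guest and configure the address there.
vlans_vtnet0="7"
ifconfig_vtnet0="up"
ifconfig_vtnet0_7="DHCP"
```

The two halves belong in different files on different machines; they are shown together only to make the split of responsibilities visible.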
webmaster@itsx / f+b+s / created 2020-9-26 / last update 2020-10-4