cn7322 / cn2360 loading firmware just like it's nothing
[ 765.142769] LiquidIO 0000:01:00.0: Initializing device 177d:9702.
[ 765.142946] LiquidIO 0000:01:00.0: trs:64 max_vfs:52 rings_per_vf:1 pf_srn:52 num_pf_rings:12
[ 776.352906] LiquidIO 0000:01:00.0: Firmware version: 1.7.2
[ 776.352911] LiquidIO 0000:01:00.0: octeon_download_firmware: Loading 1 images
[ 776.352915] LiquidIO 0000:01:00.0: Loading firmware 1285952 at 21000000
[ 776.375532] LiquidIO 0000:01:00.0: Writing boot command: bootoct 0x21000000 numcores=$(numcores) time_sec=1774939247 time_nsec=539216810
[ 776.403319] LiquidIO 0000:01:00.1: Initializing device 177d:9702.
[ 776.403448] LiquidIO 0000:01:00.1: trs:64 max_vfs:52 rings_per_vf:1 pf_srn:52 num_pf_rings:12
[ 779.282224] LiquidIO 0000:01:00.1: Running NIC (1500000000 Hz)
[ 779.282237] LiquidIO 0000:01:00.0: Running NIC (1500000000 Hz)
[ 779.926064] LiquidIO 0000:01:00.1: eth%d VLAN filter enabled
[ 779.938085] LiquidIO 0000:01:00.1 eth0: RX Checksum Offload Enabled
[ 779.949063] LiquidIO 0000:01:00.1 eth0: TX Checksum Offload Enabled
[ 779.996062] LiquidIO 0000:01:00.0: eth%d VLAN filter enabled
[ 780.008063] LiquidIO 0000:01:00.0 eth1: RX Checksum Offload Enabled
[ 780.020060] LiquidIO 0000:01:00.0 eth1: TX Checksum Offload Enabled
[ 807.766335] LiquidIO 0000:01:00.1 eth0: Link Down
[ 807.766342] LiquidIO 0000:01:00.1 eth0: Max MTU changed from 1500 to 16000
[ 807.768389] LiquidIO 0000:01:00.1: eth0 interface is opened
[ 810.244896] LiquidIO 0000:01:00.0 eth1: Link Down
[ 810.244902] LiquidIO 0000:01:00.0 eth1: Max MTU changed from 1500 to 16000
[ 810.246497] LiquidIO 0000:01:00.0: eth1 interface is opened

FlorianHeigl commented Mar 31, 2026

also it has no link, of course, and ethtool-6.19-r0 still kinda can't detect link anyway.
maybe the VFs would work and the PF has no function.

or rather, maybe it's not announcing 10g to Linux and also doesn't read the DAC cables' link speed?
they got away with a lot of peculiarities for sure. (till they didn't)

Settings for eth0:
        Supported ports: [ FIBRE ]
        Supported link modes:   25000baseCR/Full
                                25000baseKR/Full
                                25000baseSR/Full
        Supported pause frame use: Symmetric
        Supports auto-negotiation: No
        Supported FEC modes: Not reported
        Advertised link modes:  25000baseCR/Full
                                25000baseKR/Full
                                25000baseSR/Full
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: No
        Advertised FEC modes: Not reported
        Speed: Unknown!
        Duplex: Unknown! (255)
        Auto-negotiation: off
        Port: FIBRE
        PHYAD: 0
        Transceiver: external
        Current message level: 0x00000007 (7)
                               drv probe link
        Link detected: no


FlorianHeigl commented Mar 31, 2026

loading vsw fw working

tschike:/lib/firmware/liquidio# ls -l
total 37180
-rw-r--r-- 1 root root 424336 Jan 22 13:21 lio_210nv_nic.bin.zst
-rw-r--r-- 1 root root 403609 Jan 22 13:21 lio_210sv_nic.bin.zst
-rw-r--r-- 1 root root 403835 Jan 22 13:21 lio_23xx_nic.bin.zst
-rw-r--r-- 1 root root 20434408 Mar 31 09:16 lio_23xx_vsw.bin
-rw-r--r-- 1 root root 15970313 Mar 31 09:16 lio_23xx_vsw.bin.zst
-rw-r--r-- 1 root root 424325 Jan 22 13:21 lio_410nv_nic.bin.zst

note filesize will roughly match

modprobe liquidio fw_type=vsw

[ 94.751588] LiquidIO 0000:01:00.0: Initializing device 177d:9702.
[ 94.751695] LiquidIO 0000:01:00.0: trs:64 max_vfs:52 rings_per_vf:1 pf_srn:52 num_pf_rings:12
[ 106.081088] LiquidIO 0000:01:00.0: Firmware version: 1.7.2
[ 106.081092] LiquidIO 0000:01:00.0: octeon_download_firmware: Loading 1 images
[ 106.081095] LiquidIO 0000:01:00.0: Loading firmware 20433096 at 21000000
[ 106.439666] LiquidIO 0000:01:00.0: Writing boot command: setexpr n10 $(numcores) - 6; setexpr lxn $(n10) / 4; setexpr acn $(n10) - $(lxn); bootoctlinux 0x21000000 numcores=0x$(lxn) mem=1392M time_sec=1774941774 time_nsec=52734709
[ 106.462455] LiquidIO 0000:01:00.1: Initializing device 177d:9702.
[ 106.462631] LiquidIO 0000:01:00.1: trs:64 max_vfs:52 rings_per_vf:1 pf_srn:52 num_pf_rings:12
[ 130.220406] LiquidIO 0000:01:00.1: Running NIC (1500000000 Hz)
[ 130.220424] LiquidIO 0000:01:00.0: Running NIC (1500000000 Hz)
[ 130.851837] LiquidIO 0000:01:00.0: eth%d VLAN filter enabled
[ 130.866174] LiquidIO 0000:01:00.0 eth0: RX Checksum Offload Enabled
[ 130.873838] LiquidIO 0000:01:00.0 eth0: TX Checksum Offload Enabled
[ 130.873844] LiquidIO 0000:01:00.1: eth%d VLAN filter enabled
[ 130.884865] LiquidIO 0000:01:00.1 eth1: RX Checksum Offload Enabled
[ 130.895841] LiquidIO 0000:01:00.1 eth1: TX Checksum Offload Enabled
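The "filesize will roughly match" note can be checked with plain arithmetic. A minimal sketch, using only the two numbers already shown above (the `ls -l` size of `lio_23xx_vsw.bin` and the "Loading firmware" value from dmesg); `size_delta` is a made-up helper name, and the guess that the leftover bytes are a file header is an assumption, not something verified against the driver.

```bash
# Sizes copied from the ls -l output and the dmesg line above.
file_bytes=20434408      # lio_23xx_vsw.bin on disk (uncompressed)
loaded_bytes=20433096    # "Loading firmware 20433096 at 21000000"

# size_delta FILE_BYTES LOADED_BYTES (hypothetical helper)
# Prints how many bytes of the file were not part of the loaded image.
size_delta() {
    echo $(( $1 - $2 ))
}

size_delta "$file_bytes" "$loaded_bytes"    # prints 1312
```

So the driver picked up the uncompressed vsw file and not the .zst next to it; the 1312 byte difference is small enough to be a header (assumption).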

timeout might be 10g vs 25g link issue
[ 253.172159] LiquidIO 0000:01:00.0: lio_process_ordered_list:
[ 253.172164] LiquidIO 0000:01:00.0: cmd 1/14/0/0 failed,
[ 253.172167] LiquidIO 0000:01:00.0: timeout (4294920802, 4294920800)
[ 314.610213] LiquidIO 0000:01:00.0: lio_process_ordered_list:
[ 314.610218] LiquidIO 0000:01:00.0: cmd 1/14/0/0 failed,
[ 314.610221] LiquidIO 0000:01:00.0: timeout (4294982241, 4294982240)
[ 314.610252] LiquidIO 0000:01:00.0: eth0 interface is opened
[ 333.988718] LiquidIO 0000:01:00.1: eth1 interface is opened

more hangups from this
tschike:~# ip link set $LIO_MACVLAN_PF0 master $LIO_BOND_MGMT
but sets promisc mode too
[ 412.058835] LiquidIO 0000:01:00.0 eth0: entered promiscuous mode

This didn't work, but PF0 did?
tschike:~# ip link set $LIO_MACVLAN_PF1 master $LIO_BOND_MGMT
Not enough information: "dev" argument is required.

not reachable
tschike:~# ping -q 169.254.1.1 -c 1
PING 169.254.1.1 (169.254.1.1): 56 data bytes

--- 169.254.1.1 ping statistics ---
1 packets transmitted, 0 packets received, 100% packet loss

maybe missing
tschike:~# echo 1 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
[ 642.441599] LiquidIO_VF 0000:01:00.3: OCTEON_CN23XX VF: 1 ioqs
[ 642.447181] LiquidIO 0000:01:00.0: driver for VF0 was loaded
[ 642.465171] LiquidIO_VF 0000:01:00.3 eth4: RX Checksum Offload Enabled
[ 642.476142] LiquidIO_VF 0000:01:00.3 eth4: TX Checksum Offload Enabled

worked well, ping still lolnope
fw is (hopefully) correctly loaded
tschike:~# cat /sys/module/liquidio/parameters/fw_type
vsw
gonna have to check serial console output on card pray mercy etc

35% chance stupid card wanting 25g link
20% chance fw wrong / different
45% nothing works because because

50% of 45% because dumb
tschike:~# set | grep MAC
LIO_MACVLAN_PF0='lio-mcvlan1'
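The `set | grep MAC` output suggests that `LIO_MACVLAN_PF1` was never set in that shell, which would explain the earlier `"dev" argument is required` error: the variable expanded to nothing. A minimal sketch of a guard for this, assuming a POSIX shell; `require_set` is a hypothetical helper, not part of the original script.

```bash
# require_set NAME... (hypothetical helper)
# Fails if any of the named shell variables is unset or empty,
# so a command never runs with a silently missing interface name.
require_set() {
    for name in "$@"; do
        # Indirect lookup of the variable called $name, POSIX style.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "variable $name is unset or empty" >&2
            return 1
        fi
    done
}

# Intended use (not executed here):
# require_set LIO_MACVLAN_PF1 LIO_BOND_MGMT &&
#     ip link set "$LIO_MACVLAN_PF1" master "$LIO_BOND_MGMT"
```

Running the script with `set -u` would catch the same class of mistake, at the cost of aborting on the first unset variable.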

@FlorianHeigl

plugged a 25g cable (the same cable) into both ports and rebooted

tried setting up 10 vfs per pci device node beforehand (need to load module first)
ran the script unmodified except eth0/eth1
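The VF setup mentioned above (10 VFs per PF, after the module is loaded) can be sketched like this. It only uses the `sriov_numvfs` sysfs path already shown earlier; `set_numvfs` and its `SYSFS_ROOT` parameter are made up for illustration, the latter only so the function can be tried against a scratch directory instead of the real /sys.

```bash
# set_numvfs SYSFS_ROOT COUNT BDF... (hypothetical helper)
# Writes COUNT into sriov_numvfs of each listed PF.
set_numvfs() {
    root=$1
    count=$2
    shift 2
    for bdf in "$@"; do
        node="$root/bus/pci/devices/$bdf/sriov_numvfs"
        # The node only exists once a driver with SR-IOV support is bound.
        if [ ! -w "$node" ]; then
            echo "missing or not writable: $node (module loaded?)" >&2
            return 1
        fi
        echo "$count" > "$node"
    done
}

# Intended use on this box (not executed here):
# modprobe liquidio fw_type=vsw
# set_numvfs /sys 10 0000:01:00.0 0000:01:00.1
```

If VFs are already enabled, the kernel generally wants a 0 written first before a different non-zero count is accepted.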

tschike:~# bash -x liovswgogo

  • PF0=eth0
  • PF1=eth1
  • LIO_BOND_MGMT=lio-bond-mgmt
  • LIO_MACVLAN_PF0=lio-mcvlan0
  • LIO_MACVLAN_PF1=lio-mcvlan1
  • LIO_HOST_MGMT_IP4_ADDR=169.254.1.2
  • LIO_MGMT_IP4_ADDR=169.254.1.1
  • LIO_MGMT_IP4_MASK=16
  • ip link set eth0 up
  • ip link set eth1 up
  • ip link add lio-mcvlan0 link eth0 type macvlan
  • ip link add lio-mcvlan1 link eth1 type macvlan
  • ip link add lio-bond-mgmt type bond
  • echo balance-rr
  • ip link set lio-mcvlan0 master lio-bond-mgmt
  • ip link set lio-mcvlan1 master lio-bond-mgmt
  • ip addr add 169.254.1.2/16 dev lio-bond-mgmt
  • ip link set lio-bond-mgmt up

[ 149.062532] LiquidIO_VF 0000:01:09.3 eth22: RX Checksum Offload Enabled
[ 149.073528] LiquidIO_VF 0000:01:09.3 eth22: TX Checksum Offload Enabled
[ 149.073551] pci 0000:01:09.4: [177d:9712] type 00 class 0x020000 PCIe Endpoint
[ 149.073679] pci 0000:01:09.4: Adding to iommu group 43
[ 149.073819] LiquidIO_VF 0000:01:09.4: Initializing device 177d:9712.
[ 149.073824] LiquidIO_VF 0000:01:09.4: enabling device (0000 -> 0002)
[ 149.074939] LiquidIO_VF 0000:01:09.4: OCTEON_CN23XX VF: 1 ioqs
[ 149.080568] LiquidIO 0000:01:00.1: driver for VF9 was loaded
[ 149.097517] LiquidIO_VF 0000:01:09.4 eth23: RX Checksum Offload Enabled
[ 149.108522] LiquidIO_VF 0000:01:09.4 eth23: TX Checksum Offload Enabled
[ 149.108550] LiquidIO 0000:01:00.1: oct->pf_num:1 num_vfs:10
[ 149.108554] LiquidIO 0000:01:00.1: 10 VFs requested; only 0 enabled
[ 253.145309] LiquidIO 0000:01:00.0: lio_process_ordered_list:
[ 253.145316] LiquidIO 0000:01:00.0: cmd 1/14/0/0 failed,
[ 253.145320] LiquidIO 0000:01:00.0: timeout (4294920801, 4294920800)
[ 253.145360] LiquidIO 0000:01:00.0: eth0 interface is opened
[ 253.155227] LiquidIO 0000:01:00.1: eth1 interface is opened
[ 253.170706] LiquidIO 0000:01:00.0 eth0: entered promiscuous mode
[ 253.181076] lio-bond-mgmt: (slave lio-mcvlan0): Enslaving as an active interface with an up link
[ 253.181914] LiquidIO 0000:01:00.1 eth1: entered promiscuous mode
[ 253.183075] lio-bond-mgmt: (slave lio-mcvlan1): Enslaving as an active interface with an up link

ethtool hangs when querying the interface eth0

after very long (30s-60s) timeout
[ 253.181076] lio-bond-mgmt: (slave lio-mcvlan0): Enslaving as an active interface with an up link
[ 253.181914] LiquidIO 0000:01:00.1 eth1: entered promiscuous mode
[ 253.183075] lio-bond-mgmt: (slave lio-mcvlan1): Enslaving as an active interface with an up link
[ 314.583526] LiquidIO 0000:01:00.0: lio_process_ordered_list:
[ 314.583531] LiquidIO 0000:01:00.0: cmd 1/14/0/0 failed,
[ 314.583534] LiquidIO 0000:01:00.0: timeout (4294982241, 4294982240)
[ 376.021890] LiquidIO 0000:01:00.0: lio_process_ordered_list:
[ 376.021897] LiquidIO 0000:01:00.0: cmd 1/14/0/0 failed,
[ 376.021900] LiquidIO 0000:01:00.0: timeout (4295043681, 4295043680)
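The timeouts come at a very regular cadence, which is easier to see from the numbers than from the log. A minimal sketch, assuming the second value in each `timeout (a, b)` line is a jiffies-style tick counter (not checked against the driver source); `delta_ticks` is a made-up helper.

```bash
# delta_ticks EARLIER LATER (hypothetical helper)
# Prints the distance between two tick values from "timeout (a, b)" lines.
delta_ticks() {
    echo $(( $2 - $1 ))
}

# Second values of the three timeout lines above.
delta_ticks 4294920800 4294982240    # prints 61440
delta_ticks 4294982240 4295043680    # prints 61440
```

The dmesg timestamps of the same lines are about 61.44 s apart (253.145 -> 314.583 -> 376.021), so 61440 ticks per interval fits a 1000 Hz tick. That looks more like one periodic command failing on every run than like random hangs, but which command that is was not verified here.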

same shit
tschike:~# ethtool eth0
Settings for eth0:
        Supported ports: [ FIBRE ]
        Supported link modes:   10000baseKR/Full
                                25000baseCR/Full
                                25000baseKR/Full
                                25000baseSR/Full
                                10000baseCR/Full
                                10000baseSR/Full
        Supported pause frame use: Symmetric
        Supports auto-negotiation: No
        Supported FEC modes: Not reported
        Advertised link modes:  25000baseCR/Full
                                25000baseKR/Full
                                25000baseSR/Full
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: No
        Advertised FEC modes: Not reported
        Speed: Unknown!
        Duplex: Unknown! (255)
        Auto-negotiation: off
        Port: FIBRE
        PHYAD: 0
        Transceiver: external
        Current message level: 0x00000007 (7)
                               drv probe link
        Link detected: no

and of course vsw doesn't answer
tschike:~# ping -q 169.254.1.1 -c 1
PING 169.254.1.1 (169.254.1.1): 56 data bytes

--- 169.254.1.1 ping statistics ---
1 packets transmitted, 0 packets received, 100% packet loss

i bet this is just the fucking link, which should never be the case


FlorianHeigl commented Mar 31, 2026

tried to be an ass about this and just run nmap against the /16 but no. I think we really don't have link. need to try other-err- MOAR - dac cables.
tschike:~# nmap -p 22 169.254.0.0/16
Starting Nmap 7.98 ( https://nmap.org ) at 2026-03-31 10:03 +0200
WARNING: eth_send of ARP packet returned -1 rather than expected 42 (errno=105: No buffer space available)
WARNING: eth_send of ARP packet returned -1 rather than expected 42 (errno=105: No buffer space available)

iirc you can't pass 10g speed while loading the module. and in uboot I have it set to 10g.


FlorianHeigl commented Mar 31, 2026

tschike:/sys/module/liquidio/parameters# ls
console_bitmask ddr_timeout debug fw_type

there might be some hope for the link speed?

update_link_status <----
liquidio_xmit
liquidio_set_mac
setup_nic_devices
liquidio_init_nic_module
nic_starter
octeon_device_init
liquidio_probe
octeon_unmap_pci_barx
octeon_destroy_resources

tschike:/tmp# strings liquidio-core.ko | head
b4,;N=;
}RF1
teZ`
lio_get_regs
wait_for_sc_completion_timeout
lio_23xx_reconfigure_queue_count
lio_get_link_ksettings
lio_set_link_ksettings <----
packets
bytes

yeah nothing there. console_bitmask will allow setup of serial/pci console I guess....

I think i'm getting to the point where I have spent enough time for it to be fair to ask AaronW once or twice.
just if he doesn't reply i'll be in limbo for a year or two...


FlorianHeigl commented Mar 31, 2026

looking at lio_set_link_ksettings was a winner: https://lists.openwall.net/netdev/2018/05/08/21
went upstream in 2018. idk why it reports a false/incomplete speed list in ethtool then.

tschike:/tmp# ethtool -s eth0 speed 10000
netlink error: link settings update failed
netlink error: Not supported

[ 1093.406719] LiquidIO 0000:01:00.0: lio_set_link_ksettings: Changing speed is not supported

this is odd. it should be fully supported, but I suppose that lio_get_link_ksettings has a problem, thus not querying, and set settings has a kill switch apparently.


FlorianHeigl commented Mar 31, 2026

we're at no_speed_setting for some reason that stays unclear to me.

static int lio_set_link_ksettings(struct net_device *netdev,
                                  const struct ethtool_link_ksettings *ecmd)
{
        const int speed = ecmd->base.speed;
        struct lio *lio = GET_LIO(netdev);
        struct oct_link_info *linfo;
        struct octeon_device *oct;

        oct = lio->oct_dev;

        linfo = &lio->linfo;

        if (!(oct->subsystem_id == OCTEON_CN2350_25GB_SUBSYS_ID ||
              oct->subsystem_id == OCTEON_CN2360_25GB_SUBSYS_ID))
                return -EOPNOTSUPP;

        if (oct->no_speed_setting) {
                dev_err(&oct->pci_dev->dev, "%s: Changing speed is not supported\n",
                        __func__);
                return -EOPNOTSUPP;
        }

octeon_device.h:#define OCTEON_CN2350_25GB_SUBSYS_ID 0X7177d
01:00.0 0200: 177d:9702 (rev 03)

IDK about that 7 but the rest seems fine?

what the hell ok anyway
I know the card can do 10g...

/** Driver identifies chips by these Ids, created by clubbing together
 *  DeviceId+RevisionId; Where Revision Id is not used to distinguish
 *  between chips, a value of 0 is used for revision id.
 */
#define  OCTEON_CN68XX                0x0091
#define  OCTEON_CN66XX                0x0092
#define  OCTEON_CN23XX_PF_VID         0x9702
#define  OCTEON_CN23XX_VF_VID         0x9712

/**RevisionId for the chips */
#define  OCTEON_CN23XX_REV_1_0        0x00
#define  OCTEON_CN23XX_REV_1_1        0x01
#define  OCTEON_CN23XX_REV_2_0        0x80 

/**SubsystemId for the chips */
#define  OCTEON_CN2350_10GB_SUBSYS_ID_1 0X3177d
#define  OCTEON_CN2350_10GB_SUBSYS_ID_2 0X4177d
#define  OCTEON_CN2360_10GB_SUBSYS_ID   0X5177d
#define  OCTEON_CN2350_25GB_SUBSYS_ID   0X7177d
#define  OCTEON_CN2360_25GB_SUBSYS_ID   0X6177d

So I have a rev03 and that is what? REV_2_0?
I have a 2360 so we should find, for the pf, which ethtool runs against

0x9702 , 0x80 , 0X6177d
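On the "7" in 0X7177d: the lspci line above only shows vendor:device (177d:9702) and the revision, not the subsystem IDs, so it can't confirm either of the 25GB constants. A minimal sketch, assuming the driver builds `subsystem_id` as `subsystem_vendor | (subsystem_device << 16)` (I believe liquidio_probe does this, but it was not re-checked here). Under that assumption the leading digit is the subsystem device ID and 177d is the subsystem vendor. `compose_subsys_id` is a made-up helper.

```bash
# compose_subsys_id SUBSYS_VENDOR SUBSYS_DEVICE (hypothetical helper)
# Packs the two 16-bit PCI subsystem IDs the way the driver constants look:
# device in the upper half, vendor in the lower half.
compose_subsys_id() {
    printf '0x%x\n' $(( ($2 << 16) | $1 ))
}

compose_subsys_id 0x177d 0x0006    # prints 0x6177d
compose_subsys_id 0x177d 0x0007    # prints 0x7177d

# On the real machine the inputs can be read from sysfs (not executed here):
# dev=/sys/bus/pci/devices/0000:01:00.0
# compose_subsys_id "$(cat "$dev/subsystem_vendor")" "$(cat "$dev/subsystem_device")"
```

Comparing that output against the five SUBSYS_ID defines tells which branch the card actually takes, instead of guessing from the part number.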

lio_core.c is how we get no_speed_setting:

        if (retval == IQ_SEND_FAILED) { <--- we're not here
                dev_info(&oct->pci_dev->dev, "Failed to send soft command\n");
                octeon_free_soft_command(oct, sc);
                retval = -EIO;
        } else { <--- we're here
                retval = wait_for_sc_completion_timeout(oct, sc, 0); <--- we see many timeouts maybe we are here
                if (retval)
                        return retval; <---- but we don't return?

                retval = resp->status;  <--- idk
                if (retval) {  <--- we don't go here so seems irrelevant
                        dev_err(&oct->pci_dev->dev,
                                "%s failed retval=%d\n", __func__, retval);
                        retval = -EIO;
                } else {  <---- we go here so we have not a retval?
                        u32 var;

                        var = be32_to_cpu((__force __be32)resp->speed); <--- no idea
                        oct->speed_setting = var; <--- 0xffff why
                        if (var == 0xffff) { <--- bad
                                /* unable to access boot variables <--- meanie!
                                 * get the default value based on the NIC type  <---- (see larger note)
                                 */
                                if (oct->subsystem_id ==
                                                OCTEON_CN2350_25GB_SUBSYS_ID ||
                                    oct->subsystem_id ==
                                                OCTEON_CN2360_25GB_SUBSYS_ID) { <--- match
                                        oct->no_speed_setting = 1; <--- set
                                        oct->speed_setting = 25; <--- set
                                } else {
                                        oct->speed_setting = 10;
                                }
                        }

                }
                WRITE_ONCE(sc->caller_is_done, true);
        }

        return retval;

note:
maybe we need to load a newer uboot first, once again, but then we won't be able to easily unload. but nowhere would you see that this wouldn't work just because the port is down? or, wait: does it happen if the port is down at boot and no speed was set? then it defaults to a bad speed, no link, no speed_setting found, no way to set it?
that would support my impression that this is a bug of sorts
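The branch walked through above can be restated as a tiny model to make the dead end explicit. This is a sketch in shell, not driver code: `speed_fallback` is a made-up name, and the constants are the ones from the header excerpt above. It prints the resulting `speed_setting` and `no_speed_setting` for a given subsystem ID and firmware response.

```bash
# speed_fallback SUBSYS_ID RESP_SPEED (hypothetical model of the excerpt above)
# Prints "<speed_setting> <no_speed_setting>".
speed_fallback() {
    subsys=$(( $1 ))
    speed=$(( $2 ))
    no_speed_setting=0
    # 0xffff is the "unable to access boot variables" answer in the excerpt.
    if [ "$speed" -eq $(( 0xffff )) ]; then
        if [ "$subsys" -eq $(( 0x7177d )) ] || [ "$subsys" -eq $(( 0x6177d )) ]; then
            speed=25
            no_speed_setting=1    # this is what later makes ethtool -s fail
        else
            speed=10
        fi
    fi
    echo "$speed $no_speed_setting"
}

speed_fallback 0x6177d 0xffff    # prints "25 1"
speed_fallback 0x5177d 0xffff    # prints "10 0"
speed_fallback 0x6177d 10        # prints "10 0"
```

So on a 25GB subsystem ID a single 0xffff answer is enough to pin the port at 25 and block `ethtool -s`, which matches the "Changing speed is not supported" message. Why the firmware answers 0xffff is still the open question.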
