System Information
- OS: Arch Linux
- Kernel: linux-zen 6.13.4
# lspci -nn | grep Mellanox
0c:00.0 Ethernet controller [0200]: Mellanox Technologies MT27520 Family [ConnectX-3 Pro] [15b3:1007]
The build is configured with --disable-inband.
Problem Description
On v4.30.0-1 (or 0975751, which is the newest commit I can get to build from source):
# mstflint -d 0c:00.0 query
Segmentation fault (core dumped)
On v4.29.0-1:
# mstflint -d 0c:00.0 query
Image type: FS2
FW Version: 2.42.5044
FW Release Date: 21.10.2018
Product Version: 02.42.50.44
Rom Info: version_id=8025 type=CLP
type=UEFI version=14.11.48 cpu=AMD64
type=PXE version=3.4.754
Device ID: 4103
Description: Node Port1 Port2 Sys image
GUIDs: b88303ffff93f430 b88303ffff93f431 b88303ffff93f432 b88303ffff93f433
MACs: b8830393f431 b8830393f432
VSD:
PSID: HP_1370110017
INI revision: 0xc592a23f
git bisect points to commit 4f91c64:
$ git bisect bad
4f91c64509a0eeab2c53e0d59bf8b9039b1def42 is the first bad commit
commit 4f91c64509a0eeab2c53e0d59bf8b9039b1def42 (HEAD)
Author: Shelly Sela <[email protected]>
Date: Sun Sep 8 09:06:00 2024
[CX8, QTM3 and above] - add support for Late LF mode
Description:
Late LF support: the driver will detect that the device is in Late LF mode to determine the right access method.
LF in newer devices: in contrast to legacy LF, newer devices have VSC exposed. The driver will distinguish between a functional device and a LF device by the VSC type.
When running MFT tools on a device in late LF mode, the user will get the same behavior received from a device in LF mode.
HLD - https://confluence.nvidia.com/pages/viewpage.action?pageId=3113761891
Tested OS: linux
Tested devices: ConnectX8
Tested flows: see detailed test and outputs in the bottom of the HLD
Known gaps (with RM ticket): n/a
Issue: 3909481 4043079
include/mtcr_ul/mtcr_com_defs.h | 2 +-
include/mtcr_ul/mtcr_mf.h | 3 +-
kernel/mst.h | 2 +-
kernel/mst_kernel.h | 9 +++--
kernel/mst_main.c | 85 +++++++++++++++++++++++++++++++++-------------
mtcr_freebsd/mtcr_ul.c | 38 +++++++++++++++------
mtcr_ul/mtcr_ul.c | 2 +-
mtcr_ul/mtcr_ul_com.c | 88 ++++++++++++++++++++++++++++--------------------
mtcr_ul/mtcr_ul_com.h | 6 ++++
mtcr_ul/mtcr_ul_icmd_cif.c | 12 +++----
small_utils/mtserver.c | 2 +-
11 files changed, 164 insertions(+), 85 deletions(-)
Backtrace on 0975751:
>>> bt
#0 0x0000555555650ef8 in mtcr_pcicr_mread4 (mf=<optimized out>, offset=<optimized out>,
value=0x7fffffffced8) at mtcr_ul_com.c:585
#1 mtcr_pcicr_mread4 (mf=0x55555572df60, offset=<optimized out>, value=0x7fffffffced8)
at mtcr_ul_com.c:570
#2 0x0000555555651142 in mread4_ul (mf=mf@entry=0x55555572df60, offset=<optimized out>,
value=value@entry=0x7fffffffced8) at mtcr_ul_com.c:274
#3 0x000055555564b58d in mread4 (mf=mf@entry=0x55555572df60, offset=<optimized out>,
value=value@entry=0x7fffffffced8) at mtcr_ul.c:46
#4 0x0000555555657b69 in read_device_id (mf=mf@entry=0x55555572df60,
device_id=device_id@entry=0x7fffffffced8) at mtcr_ul_com.c:4073
#5 0x000055555564a5af in dm_get_device_id_inner (mf=0x55555572df60,
ptr_dm_dev_id=ptr_dm_dev_id@entry=0x7fffffffcfdc, ptr_hw_dev_id=0x7fffffffcfe0,
ptr_hw_rev=0x7fffffffcfe4) at tools_dev_types.c:696
#6 0x000055555564a825 in dm_get_device_id (mf=<optimized out>, ptr_dm_dev_id=0x7fffffffcfdc,
ptr_hw_dev_id=<optimized out>, ptr_hw_rev=<optimized out>) at tools_dev_types.c:715
#7 0x00005555555bcc7e in FwOperations::IsDeviceSupported (fwParams=...) at fw_ops.cpp:930
#8 0x00005555555bf7a8 in FwOperations::FwOperationsCreate (fwParams=...) at fw_ops.cpp:949
#9 0x0000555555591eea in SubCommand::openOps (this=this@entry=0x5555556fb3a0,
ignoreSecurityAttributes=ignoreSecurityAttributes@entry=false, ignoreDToc=ignoreDToc@entry=false)
at subcommands.cpp:633
#10 0x00005555555aa500 in SubCommand::preFwOps (this=0x5555556fb3a0,
ignoreSecurityAttributes=<optimized out>, ignoreDToc=<optimized out>) at subcommands.cpp:847
#11 0x00005555555ae94d in QuerySubCommand::executeCommand (this=0x5555556fb3a0) at subcommands.cpp:4713
#12 0x000055555558e659 in Flint::run (this=this@entry=0x5555556f8d50, argc=argc@entry=4,
argv=argv@entry=0x7fffffffe758) at flint.cpp:278
#13 0x0000555555577dc8 in main (argc=4, argv=0x7fffffffe758) at flint.cpp:287
The crash is on this line (mstflint/mtcr_ul/mtcr_ul_com.c, line 586 in 42be686):
u_int32_t tmp = ((u_int32_t*)mf->bar_virtual_addr)[offset / 4];
mf->bar_virtual_addr is 0xffffffffffffffff here, which doesn't seem right.
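For what it's worth, 0xffffffffffffffff is the bit pattern of (void*)-1, which is what mmap() returns as MAP_FAILED on Linux. So one possible reading is that the BAR mapping failed (or was never attempted for this device) and the pointer was dereferenced anyway. Below is a minimal sketch of the kind of guard that would turn the segfault into an error return. It is not the mstflint code: `struct bar_map`, its `bar_size` field, and `checked_read4` are hypothetical names made up for illustration, and only `bar_virtual_addr` comes from the line above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical stand-in for the part of mstflint's mfile that matters here. */
struct bar_map {
    void  *bar_virtual_addr; /* result of mmap(); may be MAP_FAILED */
    size_t bar_size;         /* mapped length in bytes (assumed field) */
};

/*
 * Read one 32-bit word at a byte offset inside the mapping.
 * Returns 0 on success, -1 if the mapping is unusable or the
 * offset is misaligned or out of range.
 */
int checked_read4(const struct bar_map *m, size_t offset, uint32_t *value)
{
    assert(value != NULL);

    /* A failed mmap() yields MAP_FAILED, i.e. (void *)-1, not NULL. */
    if (m == NULL || m->bar_virtual_addr == NULL ||
        m->bar_virtual_addr == MAP_FAILED) {
        return -1;
    }

    /* Reject misaligned offsets and reads past the end of the mapping. */
    if (offset % 4 != 0 || m->bar_size < 4 || offset > m->bar_size - 4) {
        return -1;
    }

    *value = ((const volatile uint32_t *)m->bar_virtual_addr)[offset / 4];
    return 0;
}
```

With a guard like this, `query` would presumably fail with a readable error on this card instead of crashing. That would still leave open why the mapping is invalid after 4f91c64 when it was fine on v4.29.0-1.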