Conversation

@yizhanglinux yizhanglinux force-pushed the add-nvme-format-test branch from 320045b to c91107c Compare June 13, 2025 07:05
nlbaf=$(nvme id-ns --output-format=json "$TEST_DEV" | jq '.nlbaf')

for lbaf in $(seq 0 "$nlbaf"); do
	nvme format --lbaf="$lbaf" --force "$TEST_DEV" >> "${FULL}"
done
Contributor

@igaw igaw Jun 13, 2025

Not sure if this is going to work well. In the nvme-cli nightly CI runs there is such a test, and it needed a bit of a safety net to get it working right; e.g. the device needs time to finish the operation before you can issue the next one.

https://github.com/linux-nvme/nvme-cli/blob/438304a1267d44d4cee6fd5db4d8f7a07551405c/tests/nvme_format_test.py#L120

https://github.com/linux-nvme/nvme-cli/actions/runs/15625258463/job/44018152431

At least you need to check whether the expected formatting operation has actually been done.
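One way to implement such a check, sketched under the assumption that the in-use LBA format index lives in the low four bits of the FLBAS field reported by nvme id-ns (as the NVMe spec defines it for namespaces with at most 16 formats):

```shell
#!/bin/bash
# Minimal sketch: verify that "nvme format --lbaf=N" actually took effect by
# reading FLBAS back from Identify Namespace. Bits 0:3 of FLBAS select the
# in-use LBA format (assuming the namespace has at most 16 formats).

# Pure parsing helper: extract the in-use LBAF index from a raw FLBAS value.
get_inuse_lbaf() {
	local flbas="$1"
	echo "$(( flbas & 0xf ))"
}

# Hypothetical device-side usage (the jq field name "flbas" is assumed from
# nvme-cli's JSON output; TEST_DEV and lbaf come from the test's loop):
#   flbas=$(nvme id-ns --output-format=json "$TEST_DEV" | jq '.flbas')
#   inuse=$(get_inuse_lbaf "$flbas")
#   [[ "$inuse" == "$lbaf" ]] ||
#       echo "$TEST_DEV formatted to lbaf:$inuse, expected:$lbaf"
```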

Contributor Author

Thanks @igaw for the review. I have five NVMe disks on my local server, and the test works as expected.
It seems that nvme-cli doesn't have an option to wait for the formatting operation to finish.
Do you have any suggestions for checking whether the formatting is done?

Contributor

Oh wait, I confused formatting with the create-namespace operation. Still, you should add a check that the format has actually changed after the nvme format command.

You are right, nvme-cli doesn't have anything like a 'wait for operation to complete' option. Something to consider for sure. Thanks for the input.

Anyway, you might want to add a retry/check loop nevertheless, e.g.:

https://github.com/linux-nvme/nvme-cli/blob/438304a1267d44d4cee6fd5db4d8f7a07551405c/tests/nvme_test.py#L389
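The linked helper's approach could be sketched in the test's bash style as a small bounded-retry wrapper; names and parameters here are illustrative, not an existing blktests API:

```shell
#!/bin/bash
# Sketch of a retry/check loop in the spirit of the nvme-cli test helper
# linked above: poll a condition a bounded number of times with a delay,
# giving the device time to finish the format before the next operation.

retry() {
	local tries="$1" delay="$2"
	shift 2
	local i
	for ((i = 0; i < tries; i++)); do
		"$@" && return 0   # condition met, stop polling
		sleep "$delay"
	done
	return 1                   # condition never met within the budget
}

# Hypothetical usage against a device ("format_took_effect" is an assumed
# helper that compares the in-use LBAF with the requested one):
#   retry 10 1 format_took_effect "$TEST_DEV" "$lbaf" ||
#       echo "format to lbaf:$lbaf timed out"
```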

Contributor Author

Thanks, just updated.

@kawasaki
Collaborator

@yizhanglinux Thanks for this PR. At first look, this test case looks useful to me. I tried to run the test case for a QEMU NVMe device on kernel v6.16-rc3 with the patch series titled "fix atomic limits check v2" applied. I observed a lockdep WARN on the first run. More importantly, I observed the test case hang on the 2nd run. For the record, I have uploaded the dmesg observed at the WARN and the hang.
dmesg.txt

Yi, do you expect the WARN and the hang?

@yizhanglinux yizhanglinux force-pushed the add-nvme-format-test branch from c91107c to 2181300 Compare June 27, 2025 03:55
@yizhanglinux
Contributor Author

> @yizhanglinux Thanks for this PR. At first look, this test case looks useful to me. I tried to run the test case for a QEMU NVMe device on kernel v6.16-rc3 with the patch series titled "fix atomic limits check v2" applied. I observed a lockdep WARN on the first run. More importantly, I observed the test case hang on the 2nd run. For the record, I have uploaded the dmesg observed at the WARN and the hang. dmesg.txt
>
> Yi, do you expect the WARN and the hang?

Yeah, it seems the test found a new issue on the kernel side. I didn't hit such a failure before; let me retest it on the latest linux-block/for-next now.

@yizhanglinux yizhanglinux changed the title nvme/065: add nvme format test with supported LBA format [draft] nvme/065: add nvme format test with supported LBA format Jun 27, 2025
@yizhanglinux yizhanglinux force-pushed the add-nvme-format-test branch 3 times, most recently from 41dd731 to fcd0020 Compare June 27, 2025 08:47
@yizhanglinux
Contributor Author

I tried 6 NVMe disks with the latest linux-block/for-next and didn't reproduce the WARNING/hang issue, but there is one Intel NVMe SSD that reports supported lbaf 0 to lbaf 6, yet the format operation failed for most of them; maybe there is a FW issue with it.

# nvme list
Node                  Generic               SN                   Model                                    Namespace  Usage                      Format           FW Rev
--------------------- --------------------- -------------------- ---------------------------------------- ---------- -------------------------- ---------------- --------
/dev/nvme0n1          /dev/ng0n1                  S39WNA0K201139 Dell Express Flash PM1725a 1.6TB AIC     0x1          1.60  TB /   1.60  TB      4 KiB +  8 B   1.2.1
/dev/nvme1n1          /dev/ng1n1            S795NC0X201793       SAMSUNG MZWLO1T9HCJR-00A07               0x1          0.00   B /   1.92  TB    512   B +  0 B   OPPA4B5Q
/dev/nvme2n1          /dev/ng2n1            2135312ADFD1         Micron_9300_MTFDHAL3T8TDP                0x1          3.84  TB /   3.84  TB    512   B +  0 B   11300DY0
/dev/nvme3n1          /dev/ng3n1            S64FNE0R802879       SAMSUNG MZQL2960HCJR-00A07               0x1        960.20  GB / 960.20  GB      4 KiB +  0 B   GDC5302Q
/dev/nvme4n1          /dev/ng4n1            CVFT6011001V1P6DGN   INTEL SSDPEDMD016T4                      0x1          1.60  TB /   1.60  TB    512   B +  0 B   8DV10171
/dev/nvme5n1          /dev/ng5n1            3F50A00H0LR3         KIOXIA KCMYDRUG1T92                      0x1          0.00   B /   1.92  TB    512   B +  0 B   1UET7104

# ./check nvme/065
nvme/065 => nvme0n1 (Test nvme format NVMe disk with supported LBA format) [passed]
    runtime  3.004s  ...  3.009s
nvme/065 => nvme2n1 (Test nvme format NVMe disk with supported LBA format) [passed]
    runtime  11.128s  ...  11.164s
nvme/065 => nvme3n1 (Test nvme format NVMe disk with supported LBA format) [passed]
    runtime  2.114s  ...  2.121s
nvme/065 => nvme4n1 (Test nvme format NVMe disk with supported LBA format) [failed]
    runtime  40.918s  ...  41.772s
    --- tests/nvme/065.out	2025-06-27 03:05:27.324156330 -0400
    +++ /root/blktests/results/nvme4n1/nvme/065.out.bad	2025-06-27 04:40:34.653108677 -0400
    @@ -1,2 +1,12 @@
     Running nvme/065
    +NVMe status: Invalid Format: The LBA Format specified is not supported(0x10a)
    +/dev/nvme4n1 formatted to lbaf:0, expected:1
    +NVMe status: Invalid Format: The LBA Format specified is not supported(0x10a)
    +/dev/nvme4n1 formatted to lbaf:0, expected:2
    +NVMe status: Invalid Format: The LBA Format specified is not supported(0x10a)
    +/dev/nvme4n1 formatted to lbaf:3, expected:4
    ...
    (Run 'diff -u tests/nvme/065.out /root/blktests/results/nvme4n1/nvme/065.out.bad' to see the entire diff)
nvme/065 => nvme5n1 (Test nvme format NVMe disk with supported LBA format) [passed]
    runtime  2.775s  ...  6.236s

# nvme id-ns /dev/nvme4n1 | tail -7
lbaf  0 : ms:0   lbads:9  rp:0x2 (in use)
lbaf  1 : ms:8   lbads:9  rp:0x2
lbaf  2 : ms:16  lbads:9  rp:0x2
lbaf  3 : ms:0   lbads:12 rp:0
lbaf  4 : ms:8   lbads:12 rp:0
lbaf  5 : ms:64  lbads:12 rp:0
lbaf  6 : ms:128 lbads:12 rp:0
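A note on reading the table above: lbads is log2 of the LBA data size, so lbaf 0-2 are 512 B formats and lbaf 3-6 are 4 KiB formats, with ms giving the per-block metadata bytes. A one-line conversion sketch:

```shell
# lbads (from nvme id-ns) is log2 of the LBA data size in bytes.
lba_size() { echo "$(( 1 << $1 ))"; }
# e.g. lbads:9 is a 512 B format, lbads:12 a 4096 B (4 KiB) format
```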

# cat results/nvme4n1/nvme/065.full
Success formatting namespace:ffffffff
Success formatting namespace:ffffffff
Success formatting namespace:ffffffff

# cat results/nvme4n1/nvme/065.out.bad
Running nvme/065
NVMe status: Invalid Format: The LBA Format specified is not supported(0x10a)
/dev/nvme4n1 formatted to lbaf:0, expected:1
NVMe status: Invalid Format: The LBA Format specified is not supported(0x10a)
/dev/nvme4n1 formatted to lbaf:0, expected:2
NVMe status: Invalid Format: The LBA Format specified is not supported(0x10a)
/dev/nvme4n1 formatted to lbaf:3, expected:4
NVMe status: Invalid Format: The LBA Format specified is not supported(0x10a)
/dev/nvme4n1 formatted to lbaf:3, expected:5
NVMe status: Invalid Format: The LBA Format specified is not supported(0x10a)
/dev/nvme4n1 formatted to lbaf:3, expected:6
Test complete

@yizhanglinux yizhanglinux changed the title [draft] nvme/065: add nvme format test with supported LBA format nvme/065: add nvme format test with supported LBA format Jun 27, 2025
@kawasaki
Collaborator

kawasaki commented Oct 3, 2025

Today, I tried to run the test with the v6.17-rc5 kernel. When I repeated the test case for a 4 KB block size QEMU NVMe device, the kernel hung:
dmesg_Oct_3_2025.txt

This symptom is similar to the one I observed in June, so I guess the same cause still exists. udev-worker calls the munmap() system call, then an Oops happens.
