Use Native Ram Init (NRI) on haswell boards (not chromebox's preppy borrowed MRC blob) #1923

Draft: wants to merge 19 commits into base: master

Conversation

gaspar-ilom
Contributor

@gaspar-ilom gaspar-ilom commented Mar 9, 2025

T440p/W541 only. Untested, so only for people with external flashers. Result so far: about 10s spent in romstage, with a lot of debug output that still truncates the cbmem log even with cbmem sized up to 1 MB.

Quickly hacked together. Probably I still missed something, but I will let you have a look.
Takes upstream NRI patch train from https://review.coreboot.org/c/coreboot/+/64186/9 and changes Heads coreboot configs for t440p/w541 to test results on top of coreboot 24.12


  • @gaspar-ilom cherry-pick tlaurion@e42b913
  • edit board configs docs
  • understand what happens in the 10s spent in romstage, during which the user is left in the dark without a bootsplash, unlike on other boards
  • Decide if 19s of boot time before reaching Heads is good enough (vs 50s with preppy's MRC blob under master)
  • Document
  • merge
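
For a sense of scale on the 19s vs 50s item in the list above, here is a quick worked calculation. It is plain shell integer arithmetic; the function name is made up for this sketch and the two figures are the ones quoted in the list.

```shell
#!/bin/sh
# Integer percentage of boot time saved when going from old_s to new_s seconds.
# percent_saved is a hypothetical helper, not part of Heads.
percent_saved() {
    old_s=$1
    new_s=$2
    echo $(( ($old_s - $new_s) * 100 / $old_s ))
}

# 50s with the MRC blob vs 19s with NRI: (50 - 19) * 100 / 50
percent_saved 50 19   # prints 62
```

That is 31 seconds saved per boot, or roughly 62% of the time previously spent before reaching Heads.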

@tlaurion See also #1825 (comment) and #1711 (comment)

TODOs:

Before merge:

So, t440p/w541 board owners, now would be a good time to compare ROMs from before and after this PR:

  • sudo systemd-analyze blame
  • suspend/resume
  • cbmem -t on master vs this PR
  • sluggishness/unresponsiveness felt/measured
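
One possible way to keep the before/after comparison honest is to save each report under a label and diff the two files afterwards. This is only a sketch: the helper names, the file naming and the `BLAME_CMD` knob are assumptions made for illustration, and only `systemd-analyze blame` comes from the list above.

```shell
#!/bin/sh
# Save a labelled report so "master" and "pr" runs can be diffed later.
# BLAME_CMD is a knob invented for this sketch; override it for dry runs.
BLAME_CMD=${BLAME_CMD:-systemd-analyze blame}

# capture_report LABEL [OUTDIR]: write the report to OUTDIR/blame-LABEL.txt
capture_report() {
    label=$1
    outdir=${2:-.}
    $BLAME_CMD > "$outdir/blame-$label.txt"
}

# compare_reports [OUTDIR]: unified diff of the two saved reports.
# diff exits 1 when the files differ, which is expected here.
compare_reports() {
    outdir=${1:-.}
    diff -u "$outdir/blame-master.txt" "$outdir/blame-pr.txt"
}
```

Usage would be `capture_report master` after booting the master ROM, `capture_report pr` after flashing and booting the ROM from this PR, then `compare_reports`.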

Board owners:

AFTER MERGE:

  • remove references in board owners haswell docs telling MRC blobs are needed
  • others?

EDIT: 4cb6985 should probably be merged into #1908 as a cleanup -> done

@tlaurion
Collaborator

tlaurion commented Mar 9, 2025

EDIT: 4cb6985 should probably be merged into #1908 as a cleanup

@gaspar-ilom done

tlaurion added a commit to tlaurion/heads that referenced this pull request Mar 9, 2025
…oreboot.modify_and_save_oldconfig_in_place

Input for linuxboot#1923

Signed-off-by: Thierry Laurion <[email protected]>
@tlaurion
Collaborator

tlaurion commented Mar 9, 2025

@gaspar-ilom cherry-pick tlaurion@e42b913

Would be nice if you gave repro instructions under 1b3cd51 to dump patches in the right place for audit/repro/future patchsets needing to be cherry-picked for future work to use this as ref

@tlaurion
Collaborator

tlaurion commented Mar 9, 2025

Wow @gaspar-ilom ! That was fast!

Edited OP for testing!

@MattClifton76

Does this need tested? I have everything still out and setup if it doesn't work.

@tlaurion
Collaborator

tlaurion commented Mar 9, 2025

Does this need tested? I have everything still out and setup if it doesn't work.

@MattClifton76 :
Testing needed indeed! Only for t440p and w541. The changes I suggested for the coreboot configs are only aesthetic on my side, and the CircleCI builds succeeded: testing will tell us the state of NRI (no more blobs for memory init).

See OP for what needs comparing. Suspend/resume needs to work, and if no performance regressions are noticeable, this will pass the tests here.

Other changes needed are basically docs related.

Edit: @MattClifton76 you do not seem to be a board owner for either of the boards though.

gaspar-ilom pushed a commit to gaspar-ilom/heads that referenced this pull request Mar 9, 2025
…oreboot.modify_and_save_oldconfig_in_place

Input for linuxboot#1923

Signed-off-by: Thierry Laurion <[email protected]>
@gaspar-ilom
Contributor Author

* [ ]  edit board configs docs

Is that what you had in mind? I could not find any references in the files under boards/. Do you think we should mention NRI there?

* [ ]  remove references in board owners haswell docs telling MRC blobs are needed

Is that referring to heads-wiki?

@gaspar-ilom
Contributor Author

Would be nice if you gave repro instructions under 1b3cd51 to dump patches in the right place for audit/repro/future patchsets needing to be cherry-picked for future work to use this as ref

Hmm, do you think I should change the commit message or where exactly do you mean?

The steps I did:

  1. In coreboot: git checkout 5d291de6011a56bfd767c4bcdfdc3aa6ee87a2dd && git format-patch 7b36319fd9..HEAD --start-number=10. 7b36319fd9 is the last commit before the haswell-NRI patch train. Then move the resulting patch files under respective patches/coreboot-24.12/ directory in heads
  2. Remove make dependencies on the mrc.bin blobs and the scripts to create these blobs from the board config and in the blob directory.
  3. Run make menuconfig for the two haswell boards. (I have done that in coreboot, hence the diff in tlaurion@e42b913.) Load the old config file. Under chipset: select "[NOT COMPLETE] Use native raminit". Save in place.

I think that's it. Add to the commit message?
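
To illustrate what `--start-number=10` in step 1 does to the exported file names, here is a small throwaway demonstration. It builds a scratch repository with made-up commits and never touches coreboot or Heads; `HEAD~2` stands in for 7b36319fd9, the last commit before the patch train.

```shell
#!/bin/sh
# Throwaway demo: shows how --start-number=10 numbers the exported patches.
# The repository, commits and identity below are made up for this sketch.
demo_format_patch() {
    repo=$(mktemp -d) || return 1
    out=$(mktemp -d) || return 1
    (
        cd "$repo" || exit 1
        git init -q .
        for subject in "base commit" "second commit" "third commit"; do
            echo "$subject" >> file.txt
            git add file.txt
            git -c user.name=demo -c user.email=demo@example.invalid \
                -c commit.gpgsign=false commit -q -m "$subject"
        done
        # HEAD~2 plays the role of the last commit before the patch train.
        git format-patch -q HEAD~2..HEAD --start-number=10 -o "$out"
    ) || return 1
    # List the generated patch files, then clean up the scratch directories.
    ls "$out"
    rm -rf "$repo" "$out"
}

demo_format_patch
```

The two commits after the base come out as `0010-second-commit.patch` and `0011-third-commit.patch`. Starting the numbering at a chosen offset presumably keeps the new files sorted after patches already present in the target directory.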

@MattClifton76

MattClifton76 commented Mar 9, 2025

[image]
Fresh boot: [image: Screenshot_2025-03-09_19-46-07]
From suspend: [image: Screenshot_2025-03-09_19-58-07]

Flashed, re-ownership, LUKS, Qubes, suspend to RAM appears to work on Qubes when selected from the menu.

@tlaurion
Collaborator

tlaurion commented Mar 10, 2025

[image]
Fresh boot: [image: Screenshot_2025-03-09_19-46-07]
From suspend: [image: Screenshot_2025-03-09_19-58-07]

Flashed, re-ownership, LUKS, Qubes, suspend to RAM appears to work on Qubes when selected from the menu.

Awesome! so this works with tlaurion@e42b913, really good news! Thanks @MattClifton76

@tlaurion
Collaborator

tlaurion commented Mar 10, 2025

* [ ]  edit board configs docs

Is that what you had in mind? I could not find any references in the files under boards/. Do you think we should mention NRI there?

* [ ]  remove references in board owners haswell docs telling MRC blobs are needed

Is that referring to heads-wiki?

@gaspar-ilom : Quick searches below. (Ignore heads-wiki for now, that can be done later: there is no t440p/w541 disassembly guide nor anything else for these boards in the wiki. Those were promised by the people who did the original port, who moved on to other things after the merge. One day they will be contributed back, or not, by board owners wanting to contribute back :) )


user@localhost:~/heads-wiki$ find ./ -name "*.md" | xargs grep -Rni mrc
./About/Heads-threat-model.md:169: (Intel's MRC and ME firmware, for instance), but the bulk of the vendor
./About/FAQ.md:104:Maybe. x230 has very few (MRC) since it has native vga init.
user@localhost:~/heads-wiki$ cd ~/heads
user@localhost:~/heads$ grep -Rni mrc boards/
user@localhost:~/heads$ grep -Rni mrc blobs/
blobs/t440p/README.md:10:- `mrc.bin` - Consists of Intel’s Memory Reference Code (MRC) and [is used to initialize the DRAM](https://doc.coreboot.org/northbridge/intel/haswell/mrc.bin.html).
blobs/t440p/README.md:17:When building any T440p board variant with `make`, the build system will download a copy of the MRC and Intel ME. We extract `mrc.bin` from a Chromebook firmware image and `me.bin` from a Lenovo firmware update.
blobs/w541/README.md:10:- `mrc.bin` - Consists of Intel’s Memory Reference Code (MRC) and [is used to initialize the DRAM](https://doc.coreboot.org/northbridge/intel/haswell/mrc.bin.html).
blobs/w541/README.md:17:When building any W541 board variant with `make`, the build system will download a copy of the MRC and Intel ME. We extract `mrc.bin` from a Chromebook firmware image and `me.bin` from a Lenovo firmware update.
user@localhost:~/heads$ 

@tlaurion
Collaborator

tlaurion commented Mar 10, 2025

Would be nice if you gave repro instructions under 1b3cd51 to dump patches in the right place for audit/repro/future patchsets needing to be cherry-picked for future work to use this as ref

Hmm, do you think I should change the commit message or where exactly do you mean?

The steps I did:

1. In coreboot: `git checkout 5d291de6011a56bfd767c4bcdfdc3aa6ee87a2dd && git format-patch 7b36319fd9..HEAD --start-number=10`. 7b36319fd9 is the last commit before the haswell-NRI patch train. Then move the resulting patch files under respective `patches/coreboot-24.12/` directory in heads

2. Remove make dependencies on the mrc.bin blobs and the scripts to create these blobs from the board config and in the blob directory.

3. Run make menuconfig for the two haswell boards. (I have done that in coreboot, hence the diff in [tlaurion@e42b913](https://github.com/tlaurion/heads/commit/e42b913a122626471a438c3c4eec72f340aa5940).) Load the old config file. Under chipset: select "[NOT COMPLETE] Use native raminit". Save in place.

I think that's it. Add to the commit message?

Yes, that's what I try to do with everything I do, so that commit messages always contain a "repro" section where relevant and others can arrive at the same result. Here, one would have to replicate your steps exactly to make sure that the patches you took from coreboot and the coreboot patches you put in the patch directory match. Reproducing is the job of the person who merges the patchwork; otherwise we trust blindly, which is not recommended for security projects. Patches should be bit-for-bit the same, and patched with another patch if we need to change something from where we got it. That also shows upstream what to modify if they want to replicate what was done here, and lets CircleCI arrive at the final hash, which even coreboot devs would be able to replicate. It was once suggested that Heads become a base for testing patches for boards used by real users. This is kind of what we are doing here. Awesome and quick work @gaspar-ilom :) those board owners should give you a tip if you tell them where to do so! I love just guiding here, seems like you get the gist of Heads! Thank you for your collaboration!

Note on your point 3, and my commit message for a206e15
./docker_repro.sh make BOARD=UNTESTED_XYZ-maximized coreboot.modify_and_save_oldconfig_in_place
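
After saving the config in place, a quick sanity check can confirm that the native raminit option actually ended up in the saved file. The sketch below assumes the Kconfig symbol is `CONFIG_USE_NATIVE_RAMINIT`; that name and the example path are assumptions to verify against the coreboot tree and the Heads `config/` directory, not something stated in this thread.

```shell
#!/bin/sh
# Report whether a saved coreboot config file selects native raminit.
# CONFIG_USE_NATIVE_RAMINIT is an assumed symbol name; check it in Kconfig.
uses_native_raminit() {
    grep -q '^CONFIG_USE_NATIVE_RAMINIT=y$' "$1"
}

# Example call (the path is hypothetical):
#   uses_native_raminit config/coreboot-t440p.config && echo "NRI enabled"
```

The function returns success only when the option is set to `y`, so a line such as `# CONFIG_USE_NATIVE_RAMINIT is not set` counts as disabled.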

@tlaurion
Collaborator

tlaurion commented Mar 10, 2025

@gaspar-ilom since @MattClifton76 confirmed things work on t440p, you could as well do
./docker_repro.sh make BOARD=UNTESTED_XYZ board.move_untested_to_tested

Which from Makefile does

board.move_untested_to_tested:
        @echo "Moving $(BOARD) from UNTESTED to tested status"
        @NEW_BOARD=$$(echo $(BOARD) | sed 's/^UNTESTED_//'); \
        INCLUDE_BOARD=$$(grep "include \$$(pwd)/boards/" boards/$(BOARD)/$(BOARD).config | sed 's/.*boards\/\(.*\)\/.*/\1/'); \
        NEW_INCLUDE_BOARD=$$(echo $$INCLUDE_BOARD | sed 's/^UNTESTED_//'); \
        echo "Updating config file: boards/$(BOARD)/$(BOARD).config"; \
        sed -i 's/$(BOARD)/'$${NEW_BOARD}'/g' boards/$(BOARD)/$(BOARD).config; \
        sed -i 's/'$$INCLUDE_BOARD'/'$$NEW_INCLUDE_BOARD'/g' boards/$(BOARD)/$(BOARD).config; \
        echo "Renaming config file to $${NEW_BOARD}.config"; \
        mv boards/$(BOARD)/$(BOARD).config boards/$(BOARD)/$${NEW_BOARD}.config; \
        echo "Renaming board directory to $${NEW_BOARD}"; \
        mv boards/$(BOARD) boards/$${NEW_BOARD}; \
        echo "Updating .circleci/config.yml"; \
        sed -i "s/$(BOARD)/$${NEW_BOARD}/g" .circleci/config.yml; \
        echo "Operation completed for $(BOARD) -> $${NEW_BOARD}"
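
The core of that target is the `sed 's/^UNTESTED_//'` rename. A minimal sketch of the name derivation, runnable outside the build tree, could look like this; the helper names and the board name are illustrative only, and nothing here moves any file.

```shell
#!/bin/sh
# Derive the post-move board name the same way the Makefile target does.
strip_untested() {
    echo "$1" | sed 's/^UNTESTED_//'
}

# Dry run: print the config rename without touching the tree.
preview_move() {
    board=$1
    new_board=$(strip_untested "$board")
    echo "boards/$board/$board.config -> boards/$new_board/$new_board.config"
}

# UNTESTED_t440p-maximized is used here purely as an example board name.
preview_move UNTESTED_t440p-maximized
```

Because the pattern is anchored with `^`, a board name without the prefix passes through unchanged.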

Each time I have to do something that I feel I will have to redo in the future, I add helpers. Either in global Makefile or in modules/* makefiles

The current helpers are

user@localhost:~/heads$ make 
Display all 108 possibilities? (y or n)
all                                                          hidapi                                                       mbedtls.clean
bash                                                         hidapi.clean                                                 modules.clean
bash.clean                                                   initrd                                                       msrtools
board.move_tested_to_unmaintained                            initrd.clean                                                 msrtools.clean
board.move_tested_to_untested                                inject_gpg                                                   musl-cross-make
board.move_unmaintained_to_tested                            json-c                                                       musl-cross-make.clean
board.move_untested_to_tested                                json-c.clean                                                 ncurses
board.move_untested_to_unmaintained                          kexec                                                        ncurses.clean
busybox                                                      kexec.clean                                                  npth
busybox.clean                                                libaio                                                       npth.clean
cairo                                                        libaio.clean                                                 packages
cairo.clean                                                  libassuan                                                    payload
coreboot-24.12                                               libassuan.clean                                              pciutils
coreboot-24.12.clean                                         libgcrypt                                                    pciutils.clean
coreboot.modify_and_save_oldconfig_in_place                  libgcrypt.clean                                              pinentry
coreboot.modify_defconfig_in_place                           libgpg-error                                                 pinentry.clean
coreboot.save_in_defconfig_format_in_place                   libgpg-error.clean                                           pixman
coreboot.save_in_oldconfig_format_in_place                   libksba                                                      pixman.clean
cryptsetup2                                                  libksba.clean                                                popt
cryptsetup2.clean                                            libpng                                                       popt.clean
dropbear                                                     libpng.clean                                                 qrencode
dropbear.clean                                               libusb                                                       qrencode.clean
e2fsprogs                                                    libusb.clean                                                 real.clean
e2fsprogs.clean                                              linux                                                        real.gitclean
echo_modules                                                 linuxboot.run                                                real.gitclean_keep_packages
exfatprogs                                                   linux.clean                                                  real.gitclean_keep_packages_and_build
exfatprogs.clean                                             linux.modify_and_save_defconfig_in_place                     real.remove_canary_files-extract_patch_rebuild_what_changed
fbwhiptail                                                   linux.modify_and_save_oldconfig_in_place                     run
fbwhiptail.clean                                             linux.prompt_for_new_config_options_for_kernel_version_bump  tpmtotp
flashprog                                                    linux.save_in_defconfig_format_in_place                      tpmtotp.clean
flashprog.clean                                              linux.save_in_olddefconfig_format_in_place                   util-linux
flashtools                                                   linux.save_in_versioned_defconfig_format                     util-linux.clean
flashtools.clean                                             linux.save_in_versioned_oldconfig                            zlib
FORCE                                                        lvm2                                                         zlib.clean
gpg2                                                         lvm2.clean                                                   zstd
gpg2.clean                                                   mbedtls                                                      zstd.clean

RE-EDIT: @gaspar-ilom : please cherry-pick tlaurion@af84b6a (note order of repro notes: hotp includes non-hotp board variants) :)

@tlaurion
Collaborator

tlaurion commented Mar 10, 2025

EDITED @gaspar-ilom please cherry-pick tlaurion@1e23270

* [ ]  add @MattClifton76 to t440p owners under BOARD_TESTERS.md :)

@tlaurion
Collaborator

@gaspar-ilom : do you still have w541 and can report everything kosher?

gaspar-ilom pushed a commit to gaspar-ilom/heads that referenced this pull request Mar 10, 2025
…oreboot.modify_and_save_oldconfig_in_place

Input for linuxboot#1923

Signed-off-by: Thierry Laurion <[email protected]>
@gaspar-ilom
Contributor Author

gaspar-ilom commented Mar 10, 2025

Yes, that's what I try to do with everything I do, so that commit messages always contain a "repro" section where relevant and others can arrive at the same result. Here, one would have to replicate your steps exactly to make sure that the patches you took from coreboot and the coreboot patches you put in the patch directory match. Reproducing is the job of the person who merges the patchwork; otherwise we trust blindly, which is not recommended for security projects. Patches should be bit-for-bit the same, and patched with another patch if we need to change something from where we got it. That also shows upstream what to modify if they want to replicate what was done here, and lets CircleCI arrive at the final hash, which even coreboot devs would be able to replicate. It was once suggested that Heads become a base for testing patches for boards used by real users. This is kind of what we are doing here.

Maybe it would be worth writing a short section on commit (message) guidelines about that in the heads-wiki. Mention this and maybe a few other things such as signing and signing-off commits. It is probably also one of those recurring tasks to get contributors to do this :-)
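
As a starting point for such guidelines, the sketch below shows what a signed-off commit with a "Repro" paragraph looks like. It runs in a scratch repository with a made-up identity and message, so it is an illustration of `git commit -s` rather than a proposed Heads convention.

```shell
#!/bin/sh
# Throwaway repository: shows the trailer added by `git commit -s`.
# The identity and commit message are invented for this sketch.
demo_signoff() {
    repo=$(mktemp -d) || return 1
    (
        cd "$repo" || exit 1
        git init -q .
        echo demo > file.txt
        git add file.txt
        git -c user.name="Demo User" -c user.email=demo@example.invalid \
            -c commit.gpgsign=false commit -q -s \
            -m "boards: example change" \
            -m "Repro: steps someone else can follow to get the same result."
        # Print the full commit message, including the sign-off trailer.
        git log -1 --format=%B
    )
    status=$?
    rm -rf "$repo"
    return $status
}

demo_signoff
```

The printed message ends with a `Signed-off-by:` line for the committer. Cryptographic signing is a separate step (`git commit -S`) and needs a configured key, which is why it is left out of this demo.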

EDIT:

Awesome and quick work @gaspar-ilom :) those board owners should give you a tip if you tell them where to do so! I love just guiding here, seems like you get the gist of Heads! Thank you for your collaboration!

If anyone wants to give a tip, it should be to you or the project in general. I see this as a small contribution I can make. I am using it on my boards too, after all. And tbh, this one was such a small effort compared to the porting of the T480, but you know that, @tlaurion. Anyway, I would not want to take a tip as I might just disappear from the project at some point. The plus side of all the work for the T480 under your guidance is that I am now much more familiar with the code base and the build system. So a contribution like this one takes a lot less time.

@gaspar-ilom
Contributor Author

@gaspar-ilom : do you still have w541 and can report everything kosher?

Still have it. Gonna test later.

@gaspar-ilom
Contributor Author

@gaspar-ilom cherry-pick tlaurion@e42b913

Would be nice if you gave repro instructions under 1b3cd51 to dump patches in the right place for audit/repro/future patchsets needing to be cherry-picked for future work to use this as ref

done

@gaspar-ilom
Contributor Author

gaspar-ilom commented Mar 10, 2025

@gaspar-ilom : do you still have w541 and can report everything kosher?

I have just tested with f3eb374

  • W541 works
    • but the boot is not as fast as I would have hoped. It took 16s from pressing the power button to showing the bootsplash and then a few more before the gui came up. This is way better than with the mrc.bin but does not compare to the T430 for instance where I get the boot splash almost immediately after pressing the power button. For me this is good enough.
  • T430 works.
    • Just tested this board to verify no regressions and anyway still wanted to flash after updating coreboot to 24.12.

@MattClifton76 have you measured boot time for T440p?

@tlaurion What is missing so that this can be merged? Can you do the review? If not who should we poke?

@MattClifton76

@gaspar-ilom I just timed it: 11-12 seconds to get to the splash screen. Much longer to get into Qubes because of LUKS and logging in. It's definitely much slower than my T480. Is there still some code refinement that needs to happen upstream? Will it get faster as coreboot matures?

gaspar-ilom pushed a commit to gaspar-ilom/heads that referenced this pull request Mar 10, 2025
…oreboot.modify_and_save_oldconfig_in_place

Input for linuxboot#1923

Signed-off-by: Thierry Laurion <[email protected]>
@tlaurion
Collaborator

tlaurion commented Mar 10, 2025

@MattClifton76 @gaspar-ilom

The proper tool to get coreboot boot time measurements is cbmem -t.

Console logs come from cbmem -1 (this is a one); please share the output as a log so I can have eyes here. A diff between t430 and t440p should tell us a lot here.

So

  • Heads recovery shell
  • mount-usb --mode rw
  • cbmem -t > /media/cbmem_stages_time.txt
  • cbmem -1 > /media/cbmem_console.txt
  • umount /media

Upload cbmem_stages_time.txt and cbmem_console.txt.
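
The two capture steps above can be wrapped in one small function so both files always get written together. This is a sketch: the function name is made up, while the `cbmem` options, the file names and the mount commands in the comment are the ones from the list above.

```shell
#!/bin/sh
# Save coreboot stage timestamps and the console log into a directory.
# collect_cbmem_logs is a hypothetical helper, not an existing Heads script.
collect_cbmem_logs() {
    dest=$1
    [ -d "$dest" ] || { echo "no such directory: $dest" >&2; return 1; }
    cbmem -t > "$dest/cbmem_stages_time.txt" || return 1
    cbmem -1 > "$dest/cbmem_console.txt" || return 1
    echo "saved: $dest/cbmem_stages_time.txt $dest/cbmem_console.txt"
}

# From the Heads recovery shell one would run, as in the list above:
#   mount-usb --mode rw && collect_cbmem_logs /media && umount /media
```

Checking that the destination exists first avoids silently writing nothing when the USB drive failed to mount.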

Also, coreboot config shows debug statements here and there. Nevermind.

We need a baseline before tuning things, to at least understand where time is spent, and why.


@gaspar-ilom once the 24.12 PR is merged, this PR should be rebased (--signoff) so that the commits here are only relevant to the NRI port effort. Nobody will look at all the changes coming from T480 to 24.12 bump just to check the NRI-related config changes.

I'm out of time today but this is where the fun starts. One would have to explore the changes and kconfig dependencies and what can be done here.


12s at each boot means memory is most probably retrained at each boot. High-level analysis of the upstream patchwork suggests that this should not be the case.

This is where collaboration upstream starts. Either here or in a subsequent PR.

But first, we need to understand the config options related to NRI. At first glance there are none outside of tweaking numerical values... Also, the Kconfig option says S3 suspend/resume is not working, while it is. I think it just got merged to stop it bitrotting and conflicting with the rest of the code base. The future of NRI starts here (while being aware that memory training code is the most complicated part, and really hard to reverse. So amazing work here already.)

My main concern here is to understand what happens when there is no bootsplash. Bootsplash shown on romstage. So hypothesis is that mrc cache is not reused. But that needs to be proven with logs.
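
A first pass over the console log could simply pull out the lines that mention memory init, to see whether anything hints at cached training data being reused or not. The keywords below are guesses to start from, not known coreboot log strings, and the function name is made up for this sketch.

```shell
#!/bin/sh
# Print numbered console lines that mention raminit-related keywords.
# The keyword list is an assumption; adjust it after reading a real log.
find_raminit_lines() {
    grep -n -i -e 'mrc' -e 'raminit' -e 'training' "$1"
}

# Example call on the file saved from the recovery shell:
#   find_raminit_lines /media/cbmem_console.txt
```

The line numbers in the output make it easier to jump to the surrounding context in the full log and to compare the same region between a t430 and a t440p capture.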

@MattClifton76

@MattClifton76 @gaspar-ilom

The proper tool to get coreboot boot measurements is cbmem -t

So

* Heads recovery shell

* mount-usb --mode rw

* cbmem -t > /media/cbmem_stages_time.txt

* umount /media

Upload cbmem_stages_time.txt content here

Also, coreboot config shows debug statements here and there.

We need a baseline prior of tuning things

@gaspar-ilom once the 24.12 PR is merged, this PR should be rebased (--signoff) so that the commits here are only relevant to the NRI port effort. Nobody will look at all the changes coming from T480 to 24.12 bump just to check the NRI-related config changes.

I'm out of time today but this is where the fun starts. One would have to explore the changes and kconfig dependencies and what can be done here.

12s at each boot means memory is most probably retrained at each boot. High level analysis of upstream patchwork stipulates that it should not be the case.

This is where collaboration upstream starts. Either here or subsequent PR. But first, we need to understand config options related to NRI, but at first glance there is none outside of tweaking numerical values...

Please also share cbmem -1 (this is a one) output as log so I can have eyes here. A diff between t430 and t440p should tell us a lot here.

@tlaurion here is the requested txt file.
cbmem_stages_time.txt

@tlaurion
Collaborator

@tlaurion here is the requested txt file.
cbmem_stages_time.txt

You were a bit too quick: I added cbmem -1 to my request but not to my steps... Logs would help, and the config gives timestamps there as well, so the picture will be complete and I will be able to compare with an x230 posting the same logs.

Sorry for the edit while you were doing that. See #1923 (comment) again.
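
For reference, both requested captures could be collected in one go from the Heads recovery shell. This is only a sketch: `collect_cbmem_logs` is a hypothetical helper name, the console log filename is my own choice, and it assumes `/media` is where `mount-usb --mode rw` mounts the USB drive, as in the steps quoted above.

```shell
# Hypothetical helper: save coreboot timestamps and the console log to a directory.
collect_cbmem_logs() {
    outdir="$1"
    # Boot stage timestamps
    cbmem -t > "$outdir/cbmem_stages_time.txt" || return 1
    # Console log of the last boot only (the flag is the digit one)
    cbmem -1 > "$outdir/cbmem_console.txt" || return 1
}

# Usage from the Heads recovery shell (assumes the USB drive ends up on /media):
#   mount-usb --mode rw
#   collect_cbmem_logs /media
#   umount /media
```

Keeping the two outputs in separate files makes it easier to diff the timestamps between boards without the console noise.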

@tlaurion
Collaborator

tlaurion commented Mar 23, 2025

Original question:

What are readily available options to obtain coreboot console output on machines not having serial console?

What would be recommended in 2025 that can test old and new platforms not having EHCI debug ports anymore and no serial console? Thanks!

From coreboot matrix channel, nic3-14159 answered

For EHCI debug nowadays I'd generally recommend something like an FT232H or CH347 over using USB gadget on a SBC like the BeagleBone Black. Of course, EHCI debug isn't an option on newer platforms.

The SPI flash console has been broken for a long time and is generally said to be useful only for debugging early boot issues (generally bootblock/romstage).

The SMBus console is probably one of the more universal options, as it can be accessed through things like DRAM slots or PCIe slots and should be present on basically all systems, though it might not be the most convenient and may need special hardware to break out the SMBus lines on those slots.

<@insurgo:matrix.org> Ch347 can be used for that? Source? Wiring examples upstream?

I added support for it in coreboot (I also added support for it in flashrom and flashprog). It basically just acts as a USB to UART bridge, so you just need to plug the USB cable into the EHCI debug port on the target device and then connect the TX and GND pins to anything that can receive 3.3V UART.

This seems to be the most common CH347 board: https://www.tindie.com/products/johnnywu/ch347-development-board/

tlaurion aka Insurgo [UTC-4]: By the way, I don't really have a good answer for your original question. My earlier messages here were more just me thinking out loud. The best console output to use will probably end up being device specific, as many generic interfaces either aren't easily accessible (if at all) depending on the board or require too much of a software stack to support in coreboot (like standard USB not using things like EHCI debug). I suppose the flash console would be the most generic, easily accessible console for a wide range of platforms, though that would require the bugs in it to be fixed.

Then Max Shanly said:

I messaged you about this the other day. I've bought two of the boards nic3-14159 has linked to off AliExpress and then one of each of these. https://www.aliexpress.com/item/1005006328049717.html

From the AliExpress description of the ones I just linked to: "Multi-Protocol Support: Seamlessly integrate with UART, SPI, IIC, and JTAG using this versatile CH347 development board."

A flashing multi-tool for under £11. Sounds great.

@Th3Fanbus

@tlaurion For EHCI debug, I would use a FT232H or FT2232H (must be H suffix), but I think someone added support to use the CH347 for coreboot logs via EHCI debug.

I don't know if anyone successfully used a SBC like the RPi for EHCI debug in the last few years, and I'm pretty sure it must be a RPi without a USB hub (you need a USB port from the SoC).

@gaspar-ilom
Contributor Author

gaspar-ilom commented Mar 26, 2025

@tlaurion For EHCI debug, I would use a FT232H or FT2232H (must be H suffix), but I think someone added support to use the CH347 for coreboot logs via EHCI debug.

I don't know if anyone successfully used a SBC like the RPi for EHCI debug in the last few years, and I'm pretty sure it must be a RPi without a USB hub (you need a USB port from the SoC).

Thanks @Th3Fanbus and @tlaurion I have ordered an FT232H dongle now. My plan is to plug that into the corebooted W541 and then connect the jumper wires to the UART pins of the Raspberry Pi and read the log messages from there. Will let you know once the board arrives.

EDIT: @Th3Fanbus just out of curiosity, and to get a better understanding: why isn't it possible to use something like a PL2303, plugging the USB end into the corebooted machine (W541) and connecting the other end to the UART of a Raspberry Pi?

@gaspar-ilom
Contributor Author

@MattClifton76

Please accept my apology for being MIA, I was busy at work.

@gaspar-ilom which FT232H dongle did you buy?

@tlaurion
Collaborator

tlaurion commented Mar 27, 2025

@tlaurion seems like circleci flakiness is hitting again: https://app.circleci.com/pipelines/circleci/MHXxJnuGL1oVD9jaDwEAGo/Kguuk4Rjpkd7hBpYBWhVuN/83/workflows/d0fde9d6-20cb-4fa3-910f-470cdfbf673d

@gaspar-ilom it's your PR, so give it a kick using "rerun workflow from failed".

Screenshot_20250326-220453.png

@gaspar-ilom
Contributor Author

* [x]  @gaspar-ilom cherry-pick [tlaurion@e42b913](https://github.com/tlaurion/heads/commit/e42b913a122626471a438c3c4eec72f340aa5940)

* [x]  edit board configs docs

* [ ]  understand what happens in the 10s spent under romstage, where the user is left in the dark without a bootsplash as on other boards

* [ ]  Decide if 19s of boot time before reaching Heads is good enough (vs 50s with preppy's MRC blob under master)

* [ ]  Document

* [ ]  merge

@tlaurion which documentation would you like to see added? Is this a general reminder or referring to something specific that is missing?

@gaspar-ilom
Contributor Author

gaspar-ilom commented Mar 27, 2025

Please accept my apology for being MIA, I was busy at work.

Don't worry. It's not like much has happened in this PR anyway.

@gaspar-ilom which FT232H dongle did you buy?

@MattClifton76 I have bought something like that: https://www.aliexpress.us/item/2251832631165237.html
There are different vendors/sellers, and I found one that does not ship from China to my country, but it is a little more expensive. Hopefully it will not take months to arrive, though...

If you want to buy the quality product with good documentation, you should go for the Adafruit FT232H Breakout: https://learn.adafruit.com/adafruit-ft232h-breakout/serial-uart It is the same chip, so it should not make a difference, but who knows.

Please, note that on all these PCBs you will have to solder the header pins yourself. It is not hard but you will need the equipment: https://www.youtube.com/watch?v=Z0joOKaQ43A

And you will need a different device to plug the UART pins into. As mentioned I want to use a Pi for that. If you do not own a suitable UART device, you could go with a setup like here (using a FT232H instead of the FT2232H): https://ch1p.io/coreboot-ft2232h/

@gaspar-ilom
Contributor Author

@tlaurion seems like circleci flakiness is hitting again: https://app.circleci.com/pipelines/circleci/MHXxJnuGL1oVD9jaDwEAGo/Kguuk4Rjpkd7hBpYBWhVuN/83/workflows/d0fde9d6-20cb-4fa3-910f-470cdfbf673d

@gaspar-ilom it's your PR, so give it a kick using "rerun workflow from failed".

@tlaurion I would do that if I could. The buttons are all greyed out and not clickable, even though I am logged in to circleci. So my workaround is to amend to the last commit (changing the date) and force-push, but that sucks. I already did that once yesterday but it failed again. Now I have done it another time.

@tlaurion
Collaborator

@tlaurion seems like circleci flakiness is hitting again: https://app.circleci.com/pipelines/circleci/MHXxJnuGL1oVD9jaDwEAGo/Kguuk4Rjpkd7hBpYBWhVuN/83/workflows/d0fde9d6-20cb-4fa3-910f-470cdfbf673d

@gaspar-ilom it's your PR, so give it a kick using "rerun workflow from failed".

@tlaurion I would do that if I could. The buttons are all greyed out and not clickable, even though I am logged in to circleci. So my workaround is to amend to the last commit (changing the date) and force-push, but that sucks. I already did that once yesterday but it failed again. Now I have done it another time.

Just realized that CircleCI was building from linuxboot/heads and not from your fork. You need to follow your fork under CircleCI.


ChatGPT (seems valid):

Guide: Setting Up CircleCI for GitHub Forks

This guide explains how users who fork a GitHub repository can configure CircleCI so that builds are triggered from their own CircleCI accounts when they push commits.

1. Fork the GitHub Repository

  • Navigate to the Repository: Open the original GitHub repository in your browser.
  • Fork the Repository: Click the Fork button in the upper right corner.
  • Select Your Account: Choose your GitHub account or organization where you want the forked repository to reside.

2. Enable CircleCI for the Forked Repository

  • Log in to CircleCI: Go to CircleCI and log in using your GitHub credentials.
  • Locate Your Fork:
    • In the CircleCI Projects dashboard, find your forked repository.
    • If the repository is not visible, click Add Projects and authorize GitHub access if required.
  • Set Up the Project:
    • Click Set Up Project next to your forked repository.
    • If a .circleci/config.yml file already exists in your fork, select Use an existing config.
    • Click Set Up Project to enable builds for your fork.

3. Configure Your Workflow

  • Automatic Builds: With a valid .circleci/config.yml file in your repository, CircleCI will automatically trigger builds whenever you push changes.
  • Manage Secrets:
    • If the original repository uses restricted contexts (such as for managing secrets), those secrets are not transferred to your fork.
    • Configure your own environment variables by navigating to Project Settings > Environment Variables in CircleCI.

4. Verify Webhook and GitHub Permissions

  • Automatic Webhook Setup: CircleCI typically adds a webhook to your fork that points to https://circleci.com/hooks/github.
  • Check Webhooks in GitHub:
    • Go to your forked repository's Settings > Webhooks to ensure the CircleCI webhook is present.
    • If the webhook is missing, re-enable CircleCI for your fork under Project Settings > Advanced in CircleCI and click Rerun Setup.

5. Optional: Set Up Status Checks for Pull Requests

  • Enable Status Checks:
    • Go to your forked repository's Settings > Branches on GitHub.
    • Under Branch Protection Rules, enable Require status checks to pass before merging.
  • Add CircleCI Check:
    • Add the CircleCI build status (commonly labeled as ci/circleci: build) as a required check.

Additional Notes for Repository Maintainers

  • Build Forked Pull Requests:
    • If you want to enable builds on forked pull requests in the main repository, enable the Build forked pull requests option in CircleCI > Project Settings > Advanced.
  • Private Repositories:
    • For private repositories, forked builds will not have access to the original CircleCI contexts unless explicitly allowed.

@gaspar-ilom
Contributor Author

@Th3Fanbus @tlaurion

I captured a log over USB/EHCI from the W541. Just took a brief look. It seems that the MRC cache is not used:

w541_usb_g3c0dc40.log

Should we increase the log level again to get more information? Which parameters should we tweak?

tlaurion added a commit to tlaurion/heads that referenced this pull request Apr 1, 2025
…level to debug, enable ram init debug info to console

Input for linuxboot#1923 (comment)

Signed-off-by: Thierry Laurion <[email protected]>
@tlaurion
Collaborator

tlaurion commented Apr 1, 2025

sudo rm -rf build/x86/coreboot-24.12/{w541,t440p}-maximized/ build/x86/{w541,t440p}-maximized # remove local build artifacts
echo "bogus" | sudo tee build/x86/coreboot-24.12/.canary > /dev/null # invalidate .canary so the coreboot tree is re-synced and patches are re-applied
sudo rm -rf build/x86/coreboot-24.12/src/mainboard/lenovo/sklkbl_thinkpad/variants/t480* # remove files created by the T480 patch
./docker_repro.sh make BOARD=w541-maximized # rebuild; the coreboot 24.12 crossgcc is reused since invalidating .canary does not flush it
./docker_repro.sh make BOARD=w541-maximized coreboot.modify_and_save_oldconfig_in_place # modify the coreboot config for w541 as in the pictures below

2025-04-01-103646
2025-04-01-103707
2025-04-01-103754

sudo rm -rf build/x86/coreboot-24.12/{w541,t440p}-maximized/ build/x86/{w541,t440p}-maximized
echo "bogus" | sudo tee build/x86/coreboot-24.12/.canary > /dev/null
sudo rm -rf build/x86/coreboot-24.12/src/mainboard/lenovo/sklkbl_thinkpad/variants/t480*
./docker_repro.sh make BOARD=w541-maximized 
user@heads-nri:~/heads$ git diff > diff
user@heads-nri:~/heads$ patch config/coreboot-t440p.config diff
patching file config/coreboot-t440p.config
Hunk #2 succeeded at 617 (offset 1 line).
Hunk #3 succeeded at 701 (offset 1 line).
./docker_repro.sh make BOARD=t440p-maximized coreboot.modify_and_save_oldconfig_in_place
git status | grep modified | awk -F ":" {'print $2'} | xargs git add
git branch -D gaspar-ilom_haswell-nri
git commit --signoff 
git push tlaurion-github --force

Result under tlaurion@e96f426 @gaspar-ilom
The console debug level could be raised to SPEW if something is missing.

As @Th3Fanbus said earlier, with console output alone, the human eye should see where time is spent. With timestamps on the console output, it should be clear in the logs. With the console at debug level (was info), we should see more of what happens. With additional raminit debugging enabled, we should have insight into what happens during raminit. Hopefully the logs will give the answers!

Pictures of your debugging setup with closeups would be awesome for posterity as well.

@gaspar-ilom
Contributor Author

Result under tlaurion@e96f426 @gaspar-ilom

@tlaurion @Th3Fanbus here are the logs with increased debug level and ram init debugging enabled:

w541_usb_ge96f426.log

Btw, it takes way longer to boot than with the previous commit with a lower debug level.

@gaspar-ilom
Contributor Author

gaspar-ilom commented Apr 2, 2025

Pictures of your debugging setup with closeups would be awesome for posterity as well.

@tlaurion here are some images.

Overview:
W541 USB -> FT232H -> GND+RX+TX -> Raspberry Pi 3B -> Ethernet/SSH
Note that you can ignore the SOIC clip which is still connected to the pins on the Pi because I was lazy...

3c34fa0b-0ad3-492e-9c5f-047a52b80367

FT232H
RX, TX and GND go out. The documentation is available here.
GND is connected to GND on the Pi. RX to TX on the Pi. And TX to RX on the Pi. Yes, UART means transmitting on one device is receiving on the other and vice versa.

7686cf4e-aec7-46b1-a9c5-37f2134cf665

The Pi

8c916ce3-2793-4155-9a39-8a42b4712e41

Here you can see how the three pins are connected:

2e293df5-f4ba-4ae3-a8b5-acccd083e82c

GPIO Pins
On the Pi the relevant pins are GPIO 14 and 15 and some GND pin of your choice. In my case, I picked pin numbers 6,8 and 10 as they are next to each other. The pin layout of the Pi can be checked here.

f812f438-2b5c-41fa-9c17-845981f9b492

Sidenote, I don't think it is necessary to use a Pi. Anything that handles UART properly should work.

EDIT: Another sidenote, I had to solder the header pins to the FT232H myself. So be aware that this is another step to make this work...

EDIT2:

I should probably add some information on how to set the Pi up:

# sudo apt update && sudo apt upgrade && sudo apt install minicom
sudo raspi-config
# Select 3 Interface Options
# Select I6 Serial Port
# Disable login shell when prompted
# Enable serial port hardware when prompted
# Maybe reboot
sudo minicom -D /dev/serial0 -C my_log_file.log # /dev/serial0 should work on all Pis, but it points to a different device file depending on which Pi is used
# Alternatively you can use screen:
sudo apt install screen
screen /dev/serial0 115200

With minicom you will both see the logs in the shell and they will be saved in my_log_file.log.

@MattClifton76

Thank you, I'm still waiting on mine to show up, good ol USPS.

@tlaurion
Collaborator

tlaurion commented Apr 3, 2025

Result under tlaurion@e96f426 @gaspar-ilom

@tlaurion @Th3Fanbus here are the logs with increased debug level and ram init debugging enabled:

w541_usb_ge96f426.log

Btw, it takes way longer to boot than with the previous commit with a lower debug level.

@gaspar-ilom @Th3Fanbus

From log

[DEBUG]  MRC: Checking cached data update for 'RW_MRC_CACHE'.
[DEBUG]  flash size 0x2800000 bytes
[INFO ]  SF: Detected 00 0000 with sector size 0x1000, total 0x2800000
[ERROR]  SF size 0x2800000 does not correspond to CONFIG_ROM_SIZE 0xc00000!!
[NOTE ]  MRC: no data in 'RW_MRC_CACHE'
[DEBUG]  MRC: cache data 'RW_MRC_CACHE' needs update.
[DEBUG]  SF: Successfully written 2 bytes @ 0x20000
[DEBUG]  SF: Successfully written 2 bytes @ 0x20002
[DEBUG]  SF: Successfully written 20 bytes @ 0x20020
[DEBUG]  SF: Successfully written 4092 bytes @ 0x20034
[DEBUG]  MRC: updated 'RW_MRC_CACHE'.

That's what caught my attention.
@Th3Fanbus?

@Th3Fanbus

So, w541_usb_ge96f426.log has a raminit log where I don't see anything unusual (it's just four dual-rank, reference card F DDR3 SO-DIMMs getting trained). As there aren't any timestamps during raminit, I can't tell from the log file where the long boot times may come from. Even with logging, I would expect boot times to be less than 5 seconds.

If you watch the EHCI debug console in real time, are the romstage messages getting printed unusually slowly? Messages should be going pretty fast (compare against ramstage logs), unless there's big pauses somewhere. If so, it would be great to know at which point in the log those big pauses occur (note down the messages right before each pause).
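
If watching in real time is impractical, the pauses could also be located after the fact. A minimal sketch, under the assumption (mine, not something captured in this thread yet) that the receiving host prefixed every log line with a Unix epoch timestamp as its first field; `find_pauses` is a hypothetical helper name.

```shell
# Hypothetical helper: report gaps of at least THRESHOLD seconds between
# consecutive lines of a log whose first field is an epoch timestamp.
find_pauses() {
    threshold="$1"
    logfile="$2"
    awk -v t="$threshold" '
        NR > 1 && ($1 - prev) >= t {
            printf "%.3f s pause before line %d; previous message: %s\n", $1 - prev, NR, prevline
        }
        { prev = $1; prevline = $0 }
    ' "$logfile"
}

# Usage (placeholder filename): list every pause of 2 seconds or more
#   find_pauses 2 timestamped_console.log
```

The output names the message printed right before each pause, which is the information asked for above.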

@tlaurion
Collaborator

Result under tlaurion@e96f426 @gaspar-ilom

@tlaurion @Th3Fanbus here are the logs with increased debug level and ram init debugging enabled:
w541_usb_ge96f426.log
Btw, it takes way longer to boot than with the previous commit with a lower debug level.

@gaspar-ilom @Th3Fanbus

From log

[DEBUG]  MRC: Checking cached data update for 'RW_MRC_CACHE'.
[DEBUG]  flash size 0x2800000 bytes
[INFO ]  SF: Detected 00 0000 with sector size 0x1000, total 0x2800000
[ERROR]  SF size 0x2800000 does not correspond to CONFIG_ROM_SIZE 0xc00000!!
[NOTE ]  MRC: no data in 'RW_MRC_CACHE'
[DEBUG]  MRC: cache data 'RW_MRC_CACHE' needs update.
[DEBUG]  SF: Successfully written 2 bytes @ 0x20000
[DEBUG]  SF: Successfully written 2 bytes @ 0x20002
[DEBUG]  SF: Successfully written 20 bytes @ 0x20020
[DEBUG]  SF: Successfully written 4092 bytes @ 0x20034
[DEBUG]  MRC: updated 'RW_MRC_CACHE'.

That's what caught my attention. @Th3Fanbus?

ping @Th3Fanbus I do not think this is normal, and it seems to prevent the MRC cache from being saved and reused, no?

@tlaurion
Collaborator

tlaurion commented Apr 11, 2025

So, w541_usb_ge96f426.log has a raminit log where I don't see anything unusual (it's just four dual-rank, reference card F DDR3 SO-DIMMs getting trained). As there aren't any timestamps during raminit, I can't tell from the log file where the long boot times may come from. Even with logging, I would expect boot times to be less than 5 seconds.

If you watch the EHCI debug console in real time, are the romstage messages getting printed unusually slowly? Messages should be going pretty fast (compare against ramstage logs), unless there's big pauses somewhere. If so, it would be great to know at which point in the log those big pauses occur (note down the messages right before each pause).

@Th3Fanbus @gaspar-ilom :
Can't ts be used to prepend a timestamp to every serial console line received by the app?

I find ts really useful in situations like this, where program output doesn't include timestamp where I need them. The result looks like ([...] redacted):

sudo wyng-util-qubes [...] | ts

[...]
Apr 11 10:38:56 Last updated 2025-04-11 09:57:46.894347 (-04:00)
Apr 11 10:41:50 
Apr 11 10:41:50 Preparing snapshots in '/var/lib/qubes/'...
Apr 11 10:41:50   Queuing full scan of import 'boot'
Apr 11 10:43:22 Acquiring deltas.
Apr 11 10:43:23 
Apr 11 10:43:23 Sending backup session 20250411-103854:
[...]

Which shows clearly where time was spent between each line printed on the console.
minicom | ts > serial_nri_coreboot_output.txt ?
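
If `ts` (from moreutils) is not installed on the receiving host, a plain shell loop can do roughly the same. A minimal sketch: `stamp_lines` is a hypothetical name, the timestamps only have one-second resolution, and reading the port with `cat` instead of minicom is an assumption on my side (I have not verified that minicom's interactive output behaves well in a pipe).

```shell
# Hypothetical fallback for `ts`: prefix every line read from stdin with the
# time at which it was received (one-second resolution).
stamp_lines() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%b %d %H:%M:%S')" "$line"
    done
}

# Usage on the receiving host (assumption: the port is already configured for
# 115200 baud, e.g. with `stty -F /dev/serial0 115200 raw -echo` on GNU/Linux):
#   cat /dev/serial0 | stamp_lines > serial_nri_coreboot_output.txt
```

The timestamp reflects when the host received the line, so it includes any delay added by the debug console itself.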

@Th3Fanbus

Result under tlaurion@e96f426 @gaspar-ilom

@tlaurion @Th3Fanbus here are the logs with increased debug level and ram init debugging enabled:
w541_usb_ge96f426.log
Btw, it takes way longer to boot than with the previous commit with a lower debug level.

@gaspar-ilom @Th3Fanbus
From log

[DEBUG]  MRC: Checking cached data update for 'RW_MRC_CACHE'.
[DEBUG]  flash size 0x2800000 bytes
[INFO ]  SF: Detected 00 0000 with sector size 0x1000, total 0x2800000
[ERROR]  SF size 0x2800000 does not correspond to CONFIG_ROM_SIZE 0xc00000!!
[NOTE ]  MRC: no data in 'RW_MRC_CACHE'
[DEBUG]  MRC: cache data 'RW_MRC_CACHE' needs update.
[DEBUG]  SF: Successfully written 2 bytes @ 0x20000
[DEBUG]  SF: Successfully written 2 bytes @ 0x20002
[DEBUG]  SF: Successfully written 20 bytes @ 0x20020
[DEBUG]  SF: Successfully written 4092 bytes @ 0x20034
[DEBUG]  MRC: updated 'RW_MRC_CACHE'.

That's what caught my attention. @Th3Fanbus?

ping @Th3Fanbus I do not think this is normal, and it seems to prevent the MRC cache from being saved and reused, no?

What exactly doesn't seem normal? I already said that [ERROR] SF size 0x2800000 does not correspond to CONFIG_ROM_SIZE 0xc00000!! seems wrong: it detects as if the flash chip size is much larger than CONFIG_ROM_SIZE, were any flash chips replaced?

The MRC cache itself seems to be populated properly, remember this is a first-time boot so the cache was empty.
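
To put numbers on that mismatch, the hex values from the log can be converted with shell arithmetic. A small sketch (`to_mib` is just an illustrative helper name); the two sizes are copied from the log lines quoted above, and the FMAP figures come from the fuller excerpt of the same log posted later in this thread.

```shell
# Illustrative helper: convert a byte count (hex or decimal) to whole MiB.
to_mib() {
    echo $(( $1 / 1024 / 1024 ))
}

to_mib 0x2800000   # size reported on the "SF: Detected" line: 40 MiB
to_mib 0xc00000    # CONFIG_ROM_SIZE: 12 MiB

# FMAP check: the COREBOOT region starts at 0x30200 and is 12385792 bytes long,
# so it ends exactly at CONFIG_ROM_SIZE.
[ $(( 0x30200 + 12385792 )) -eq $(( 0xc00000 )) ] && echo "COREBOOT region ends at CONFIG_ROM_SIZE"
```

So the image layout itself is consistent with a 12 MiB ROM; only the detected flash size (40 MiB) is off.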

@tlaurion
Collaborator

@Th3Fanbus

What exactly doesn't seem normal? I already said that [ERROR] SF size 0x2800000 does not correspond to CONFIG_ROM_SIZE 0xc00000!! seems wrong: it detects as if the flash chip size is much larger than CONFIG_ROM_SIZE, were any flash chips replaced?

No. That's exactly the question here.

If you look at the full oldconfig file in the PR, the ROM size fits the BIOS region. Unfortunately I cannot help more here than repeating that this seems to be the cause of the issue, and that MRC training doesn't seem to stick and is constantly redone, but logs from after the first boot are missing.
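
Once a log from a later boot exists, the relevant lines could be pulled out for a side-by-side comparison with the first boot. A minimal sketch: `mrc_lines` is a hypothetical helper, and the patterns are only the strings already seen in the first-boot log. I am not assuming what a cache hit would look like in the output.

```shell
# Hypothetical helper: print only the MRC cache and SPI flash size lines of a log.
mrc_lines() {
    grep -E 'MRC:|RW_MRC_CACHE|SF size' "$1"
}

# Usage (second_boot.log is a placeholder for a log captured on a later boot):
#   mrc_lines w541_usb_ge96f426.log > first_boot.txt
#   mrc_lines second_boot.log > second_boot.txt
#   diff first_boot.txt second_boot.txt
```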

@gaspar-ilom ping on piping the logs through | ts on host receiving the logs so we have timestamps for each line of cbmem output over serial.

Nothing much else I can say here.

@Th3Fanbus

@Th3Fanbus

What exactly doesn't seem normal? I already said that [ERROR] SF size 0x2800000 does not correspond to CONFIG_ROM_SIZE 0xc00000!! seems wrong: it detects as if the flash chip size is much larger than CONFIG_ROM_SIZE, were any flash chips replaced?

No. That's exactly the question here.

If you look at the full oldconfig file in the PR, the ROM size fits the BIOS region. Unfortunately I cannot help more here than repeating that this seems to be the cause of the issue, and that MRC training doesn't seem to stick and is constantly redone, but logs from after the first boot are missing.

@gaspar-ilom ping on piping the logs through | ts on host receiving the logs so we have timestamps for each line of cbmem output over serial.

Nothing much else I can say here.

@gaspar-ilom did you happen to replace the flash chips on this computer with bigger ones? I need to understand why the logs contain the error I quoted earlier.

@tlaurion
Collaborator

tlaurion commented May 2, 2025

@Th3Fanbus

What exactly doesn't seem normal? I already said that [ERROR] SF size 0x2800000 does not correspond to CONFIG_ROM_SIZE 0xc00000!! seems wrong: it detects as if the flash chip size is much larger than CONFIG_ROM_SIZE, were any flash chips replaced?

No. That's exactly the question here.
If you look at the full oldconfig file in the PR, the ROM size fits the BIOS region. Unfortunately I cannot help more here than repeating that this seems to be the cause of the issue, and that MRC training doesn't seem to stick and is constantly redone, but logs from after the first boot are missing.
@gaspar-ilom ping on piping the logs through | ts on host receiving the logs so we have timestamps for each line of cbmem output over serial.
Nothing much else I can say here.

@gaspar-ilom did you happen to replace the flash chips on this computer with bigger ones? I need to understand why the logs contain the error I quoted earlier.

I will try to speed this up a little. Long delays and long context switches on things I'm not knowledgeable about are not easy for me here.

So the last commit from which logs were extracted by @gaspar-ilom was 6dfe541, in follow-up comment #1923 (comment) (even if the commit id mismatches, @gaspar-ilom).

Result under tlaurion@e96f426 @gaspar-ilom

@tlaurion @Th3Fanbus here are the logs with increased debug level and ram init debugging enabled:

w541_usb_ge96f426.log

Btw, it takes way longer to boot than with the previous commit with a lower debug level.

Where my comment at #1923 (comment) pinpointed

[ERROR] SF size 0x2800000 does not correspond to CONFIG_ROM_SIZE 0xc00000!!

I checked all the comments of this PR, @Th3Fanbus, and recall your comment on ME being at fault #1923 (comment), but cannot find your comment on the size difference error. Sorry if I'm still missing it.

@gaspar-ilom uses w541 and roms built from CircleCI to test and report to this PR to make things reproducible for all board owners.

Therefore, from that commit 6dfe541's coreboot's oldconfig file

My question still stands: where is "SF size 0x2800000" obtained from? @gaspar-ilom correct me if I'm wrong, but you use ROMs from this PR and the stock SPI flash, right?

@Th3Fanbus

@tlaurion Sorry, the above does not tell me anything I didn't know already.

@tlaurion
Collaborator

tlaurion commented May 2, 2025

@tlaurion Sorry, the above does not tell me anything I didn't know already.

I just wanted to speed things up between @gaspar-ilom's answers by putting the hypothesis somewhere other than

@gaspar-ilom did you happen to replace the flash chips on this computer with bigger ones? I need to understand why the logs contain the error I quoted earlier.

and restated that I do not understand either why that error appears twice in the logs


[DEBUG]  Executing raminit task RAMINITEND
[DEBUG]  Waiting for mc_init_done acknowledgement... DONE!

[DEBUG]  ME: Requested 0MB UMA
[DEBUG]  ME: FW Partition Table      : OK
[DEBUG]  ME: Bringup Loader Failure  : NO
[DEBUG]  ME: Firmware Init Complete  : NO
[DEBUG]  ME: Manufacturing Mode      : YES
[DEBUG]  ME: Boot Options Present    : NO
[DEBUG]  ME: Update In Progress      : NO
[DEBUG]  ME: Current Working State   : Initializing
[DEBUG]  ME: Current Operation State : Bring up
[DEBUG]  ME: Current Operation Mode  : Debug
[DEBUG]  ME: Error Code              : No Error
[DEBUG]  ME: Progress Phase          : BUP Phase
[DEBUG]  ME: Power Management Event  : Pseudo-global reset
[DEBUG]  ME: Progress Phase State    : 0x4d
[DEBUG]  CBMEM:
[DEBUG]  IMD: root @ 0x7f7ff000 254 entries.
[DEBUG]  IMD: root @ 0x7f7fec00 62 entries.
[DEBUG]  FMAP: area COREBOOT found @ 30200 (12385792 bytes)
[DEBUG]  External stage cache:
[DEBUG]  IMD: root @ 0x7fbff000 254 entries.
[DEBUG]  IMD: root @ 0x7fbfec00 62 entries.
[DEBUG]  FMAP: area RW_MRC_CACHE found @ 20000 (65536 bytes)
[DEBUG]  MRC: Checking cached data update for 'RW_MRC_CACHE'.
[DEBUG]  flash size 0x2800000 bytes
[INFO ]  SF: Detected 00 0000 with sector size 0x1000, total 0x2800000
[ERROR]  SF size 0x2800000 does not correspond to CONFIG_ROM_SIZE 0xc00000!!
[NOTE ]  MRC: no data in 'RW_MRC_CACHE'
[DEBUG]  MRC: cache data 'RW_MRC_CACHE' needs update.

[INFO ]  Found TPM 1.2 ST33ZP24 (0x0000) by ST Microelectronics (0x104a)
[DEBUG]  PNP: 0c31.0 enabled
[DEBUG]  scan_bus: bus PCI: 00:00:1f.0 finished in 39 msecs
[DEBUG]  PCI: 00:00:1f.3 scanning...
[DEBUG]  scan_bus: bus PCI: 00:00:1f.3 finished in 0 msecs
[DEBUG]  scan_bus: bus DOMAIN: 00000000 finished in 364 msecs
[DEBUG]  scan_bus: bus Root Device finished in 382 msecs
[INFO ]  done
[DEBUG]  BS: BS_DEV_ENUMERATE run times (exec / console): 4 / 393 ms
[DEBUG]  BM-LOCKDOWN: Enabling boot media protection scheme 'readonly' using CTRL...
[DEBUG]  flash size 0x2800000 bytes
[INFO ]  SF: Detected 00 0000 with sector size 0x1000, total 0x2800000
[ERROR]  SF size 0x2800000 does not correspond to CONFIG_ROM_SIZE 0xc00000!!
[INFO ]  spi_flash_protect: FPR 0 is enabled for range 0x00000000-0x00bfffff
[INFO ]  BM-LOCKDOWN: Enabled bootmedia protection
[DEBUG]  BS: BS_DEV_RESOURCES entry times (exec / console): 0 / 39 ms
[INFO ]  Timestamp - device configuration: 163000581575
[DEBUG]  found VGA at PCI: 00:00:02.0
[DEBUG]  Setting up VGA for PCI: 00:00:02.0

Successfully merging this pull request may close these issues.

Switch Haswell boards to NRI (Native Ram Initialization)
4 participants