-
-
Notifications
You must be signed in to change notification settings - Fork 195
Use Native Ram Init (NRI) on haswell boards (not chromebox's preppy borrowed MRC blob) #1923
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
@gaspar-ilom done |
…oreboot.modify_and_save_oldconfig_in_place Input for linuxboot#1923 Signed-off-by: Thierry Laurion <[email protected]>
@gaspar-ilom cherry-pick tlaurion@e42b913 Would be nice if you gave repro instructions under 1b3cd51 to dump patches in the right place for audit/repro/future patchsets needing to be cherry-picked for future work to use this as ref |
Wow @gaspar-ilom ! That was fast! Edited OP for testing! |
Does this need tested? I have everything still out and setup if it doesn't work. |
@MattClifton76 : See OP for comparisons needs. Suspend/resume needs to work, and if no regression on performance are notable, this will pass the tests here. Other changes needed are docs related basically. Edit: @MattClifton76 you do not seem to be board owner for neither of the boards though. |
…oreboot.modify_and_save_oldconfig_in_place Input for linuxboot#1923 Signed-off-by: Thierry Laurion <[email protected]>
Is that what you had in mind? I could not find any references in the files under
Is that referring to heads-wiki? |
Hmm, do you think I should change the commit message or where exactly do you mean? The steps I did:
I think that's it. Add to the commit message? |
Awesome! so this works with tlaurion@e42b913, really good news! Thanks @MattClifton76
|
@gaspar-ilom : Quick searches (neglect heads-wiki, can be done later, there is no t440p/w541 disassemble guide nor anything else under Wiki: were promised by original port guys who moved to other things after the merge. One day, those will be contributed back, or not, from board owners wanting to contribute back :) )
|
Yes, that's what I try to do with everything I do so that commit messages always contains a "repro" section where relevant, so others can arrive to the same result. Here, one would have to replicate exactly your steps to make sure that patches you took from coreboot, and the coreboot patches you put in the patch directory, matches. That is the job of the person that merges the patchwork to reproduce, otherwise we trust blindly, which is not recommended for security projects. Patches should be bit by bit the same, and patched with another patch if we need to change something from where we got it. That also shows upstream what to modify if they want to replicate what was done here, and for CircleCI to arrive at the final hash which even coreboot devs would be able to replicate. It was once suggested that Heads became a base for testing patches for boards used by real users. This is kind of what we are doing here. Awesome and quick work @gaspar-ilom :) those board owners should give you a tip if you tell where to do so! I love just guiding here, seems like you get the gist of Heads here! Thank you for your collaborations! Note on your point 3, and my commit message for a206e15 |
@gaspar-ilom since @MattClifton76 confirmed things work on t440p, you could as well do Which from Makefile does
Each time I have to do something that I feel I will have to redo in the future, I add helpers. Either in global Makefile or in modules/* makefiles The current helpers are
|
RE-EDIT: @gaspar-ilom : please cherry-pick tlaurion@af84b6a (note order of repro notes: hotp includes non-hotp board variants) :) |
EDITED @gaspar-ilom please cherry-pick tlaurion@1e23270
|
@gaspar-ilom : do you still have w541 and can report everything kosher? |
dd7979f
to
f3eb374
Compare
…oreboot.modify_and_save_oldconfig_in_place Input for linuxboot#1923 Signed-off-by: Thierry Laurion <[email protected]>
Maybe it would be worth writing a short section on commit (message) guidelines about that in the heads-wiki. Mention this and maybe a few other things such as signing and signing-off commits. It is probably also one of those recurring tasks to get contributors to do this :-) EDIT:
If anyone wants to give a tip it should be to you or the project in general. I see this as a small contribution I can make. I am using it on my boards too after all. And tbh. this one was such a small effort compared to the porting of the T480, but you know that @tlaurion Anyway, I would not want to take a tip as I might just disappear from the project at some point. The plus side of all work for the T480 under your guidance is that I am now much more familiar with the code base and the build system. So a contribution like this one takes a lot less time. |
Still have it. Gonna test later. |
done |
I have just tested with f3eb374
@MattClifton76 have you measured boot time for T440p? @tlaurion What is missing so that this can be merged? Can you do the review? If not who should we poke? |
@gaspar-ilom i just timed it, 11-12 seconds to get to splash screen. Much longer to get into qubes because of LUks and logging in. It's definitely much slower than my T480. Is there still some code refinement that needs to happen up stream? Will it get faster as coreboot matures? |
f3eb374
to
7038b25
Compare
…oreboot.modify_and_save_oldconfig_in_place Input for linuxboot#1923 Signed-off-by: Thierry Laurion <[email protected]>
The proper tool to get coreboot boot time measurements is cbmem -t. Console logs are cbmem -1 (this is a one) output as log so I can have eyes here. A diff between t430 and t440p should tell us a lot here. So
Upload cbmem_stages_time.txt and cbmem_console.txt.
We need a baseline prior of tuning things, understand where time is spent at least, and why. @gaspar-ilom once 24.12 PR is merged, thus or should be rebased --signoff so that comment here are only relevant to NRI port effort. Nobody will look at all the changes coming from T480 to 24.12 bump to only check NRI related config changes. I'm out of time today but this is where the fun starts. One would have to explore the changes and kconfig dependencies and what can be done here. 12s at each boot means memory is most probably retrained at each boot. High level analysis of upstream patchwork stipulates that it should not be the case. This is where collaboration upstream starts. Either here or subsequent PR. But first, we need to understand config options related to NRI, but at first glance there is none outside of tweaking numerical values... But also Kconfig options says s3 suspend resume is not working..... While it is. I thing it got just merged to stop bitorotting and conflicting with other code base. Future of NRI start here (while be aware that memory training code is the most complicated part, and really hard to reverse. So amazing work here already.) My main concern here is to understand what happens when there is no bootsplash. Bootsplash shown on romstage. So hypothesis is that mrc cache is not reused. But that needs to be proven with logs. |
@tlaurion here is the requested txt file. |
You were a bit too quick, I added cbmem -1 but not in my steps... Logs would help, and config gives timestamps there as well, figurenwill be complete and I will be able to compare with x230 posting same logs. Sorry for the edit while you were doing that. See #1923 (comment) again. |
Original question:
From coreboot matrix channel, nic3-14159 answered
Then Max Shanly said:
|
@tlaurion For EHCI debug, I would use a FT232H or FT2232H (must be H suffix), but I think someone added support to use the CH347 for coreboot logs via EHCI debug. I don't know if anyone successfully used a SBC like the RPi for EHCI debug in the last few years, and I'm pretty sure it must be a RPi without a USB hub (you need a USB port from the SoC). |
Thanks @Th3Fanbus and @tlaurion I have ordered an FT232H dongle now. My plan is to plug that into the corebooted W541 and then connect the jumper wires to the UART pins of the Raspberry Pi and read the log messages from there. Will let you know once the board arrives. EDIT: @Th3Fanbus and just out of curiosity/ to get a better understanding, why isn't it possible to use something like a PL2303 plug the USB into the corebooted machine (W541) and connect the other end to the UART of a Raspberry Pi? |
@tlaurion seems like circleci flakiness is hitting again: https://app.circleci.com/pipelines/circleci/MHXxJnuGL1oVD9jaDwEAGo/Kguuk4Rjpkd7hBpYBWhVuN/83/workflows/d0fde9d6-20cb-4fa3-910f-470cdfbf673d |
1045b83
to
fb3cf50
Compare
Please accept my apology for being MIA, I was busy at work. @gaspar-ilom which FT232H dongle did you buy? |
@gaspar-ilom its your PR, slap it doing "rerun workflow from failed". |
@tlaurion which documentation would you like to see added? Is this a general reminder or referring to something specific that is missing? |
Don't worry. It's not like much has happened in this PR anyway.
@MattClifton76 I have bought something like that: https://www.aliexpress.us/item/2251832631165237.html If you want to buy the quality product with good documentation, you should go for the Adafruit FT232H Breakout: https://learn.adafruit.com/adafruit-ft232h-breakout/serial-uart It is the same chip, so it should not make a difference, but who knows. Please, note that on all these PCBs you will have to solder the header pins yourself. It is not hard but you will need the equipment: https://www.youtube.com/watch?v=Z0joOKaQ43A And you will need a different device to plug the UART pins into. As mentioned I want to use a Pi for that. If you do not own a suitable UART device, you could go with a setup like here (using a FT232H instead of the FT2232H): https://ch1p.io/coreboot-ft2232h/ |
Signed-off-by: gaspar-ilom <[email protected]>
fb3cf50
to
3c0dc40
Compare
@tlaurion I would do that if I could. The buttons are all greyed out and not clickable, even though I am logged in to circleci. So my workaround is to amend to the last commit (changing the date) and force-push, but that sucks. I already did that once yesterday but it failed again. Now I have done it another time. |
Just realized that CircleCI was building from linuxboot/heads and not from your fork. You need to follow your fork under CircleCI. ChatGPT (seems valid): Guide: Setting Up CircleCI for GitHub ForksThis guide explains how users who fork a GitHub repository can configure CircleCI so that builds are triggered from their own CircleCI accounts when they push commits. 1. Fork the GitHub Repository
2. Enable CircleCI for the Forked Repository
3. Configure Your Workflow
4. Verify Webhook and GitHub Permissions
5. Optional: Set Up Status Checks for Pull Requests
Additional Notes for Repository Maintainers
|
I captured a log over USB/EHCI from the W541. Just took a brief look. It seems that the MRC cache is not used: Should we increase the log level again to get more information? Which parameters should we tweak? |
…level to debug, enable ram init debug info to console Input for linuxboot#1923 (comment) Signed-off-by: Thierry Laurion <[email protected]>
Result under tlaurion@e96f426 @gaspar-ilom As @Th3Fanbus said earlier, with console output alone, human eye should see where time spent. With timestamps on console output, should be clean in logs. With console in debug (was info) we should see more of what happens. With raminit additional debug enabled, we should have introspection on what happens on raminit. Let the logs be the answers hopefully! Pictures of your debugging setup with closeups would be awesome for posterity as well. |
…level to debug, enable ram init debug info to console Input for linuxboot#1923 (comment) Signed-off-by: Thierry Laurion <[email protected]>
@tlaurion @Th3Fanbus here are the logs with increased debug level and ram init debugging enabled: Btw, it takes way longer to boot than with the previous commit with a lower debug level. |
@tlaurion here are some images. Overview: FT232H The Pi Here you can see how the three pins are connected: GPIO Pins Sidenote, I don't think it is necessary to use a Pi. Anything that handles UART properly should work. EDIT: Another sidenote, I had to solder the header pins to the FT232H myself. So be aware that this is another step to make this work... EDIT2: I should probably add some information on how to set the Pi up:
With minicom you will both see the logs in the shell and they will be saved in |
Thank you, I'm still waiting on mine to show up, good ol USPS. |
From log
That's what cought my attention. |
So, w541_usb_ge96f426.log has a raminit log where I don't see anything unusual (it's just four dual-rank, reference card F DDR3 SO-DIMMs getting trained). As there aren't any timestamps during raminit, I can't tell from the log file where the long boot times may come from. Even with logging, I would expect boot times to be less than 5 seconds. If you watch the EHCI debug console in real time, are the romstage messages getting printed unusually slowly? Messages should be going pretty fast (compare against ramstage logs), unless there's big pauses somewhere. If so, it would be great to know at which point in the log those big pauses occur (note down the messages right before each pause). |
ping @Th3Fanbus I do not think this is normal and seems to prevent MRC cache to be saved and reused no? |
@Th3Fanbus @gaspar-ilom : I find
Which show clearly where time was spent between each line printed on console. |
What exactly doesn't seem normal? I already said that The MRC cache itself seems to be populated properly, remember this is a first-time boot so the cache was empty. |
No. That's exactly the question here. If you look at the full oldconfig file in PR, romsize fits bios region. Unfortunately I cannot help more here than repeating that this seems to be the cause of issue, and that MRC training de esnt seem to stick and constantly be retrained, but post first boot logs are missing. @gaspar-ilom ping on piping the logs through | ts on host receiving the logs so we have timestamps for each line of cbmem output over serial. Nothing much else I can say here. |
@gaspar-ilom did you happen to replace the flash chips on this computer with bigger ones? I need to understand why the logs contain the error I quoted earlier. |
Will try to speed this up a little. Long delays and long context switches for things I'm not knowledgeable into are not easy for me here. So Last commit from which logs were extracted by @gaspar-ilom were 6dfe541 in follow up comment #1923 (comment) (even if commit id mismatches @gaspar-ilom )
Where my comment at #1923 (comment) pinpointed
I checked all the comments of this PR @Th3Fanbus and recall your comment on ME being at fault #1923 (comment) but cannot find your comment on the size difference error. Sorry if i'm still missing it. @gaspar-ilom uses w541 and roms built from CircleCI to test and report to this PR to make things reproducible for all board owners. Therefore, from that commit 6dfe541's coreboot's oldconfig file
My question still stands; where "SF size 0x2800000" is obtained from. @gaspar-ilom correct me if i'm wrong, but you use roms from this PR right and stock SPI |
@tlaurion Sorry, the above does not tell me anything I didn't know already. |
I just wanted to speed up between @gaspar-ilom answers, putting the hypothesis else where than
and restated I do not understand either that error happening twice in logs
|
T440p/w541 only :
Untested: only for external flashers.-> 10s spent in romstage, lot of debug info: truncates cbmem up to cbmem being 1mb big stillQuickly hacked together. Probably I still missed something, but I will let you have a look.
Takes upstream NRI patch train from https://review.coreboot.org/c/coreboot/+/64186/9 and changes Heads coreboot configs for t440p/w541 to test results on top of coreboot 24.12
@tlaurion See also #1825 (comment) and #1711 (comment)
TODOs:
Before merge:
So t440p/w541 board owners, it would be a time to compare before/after this PR roms:
Board owners:
AFTER MERGE:
EDIT: 4cb6985 should probably be merged into #1908 as a cleanup-> done