I have a little over 25,000 plots spread out over four harvesters. Each harvester is a dual Xeon server with 256GB RAM running only the harvester software.
I have moved all harvesters, plotters, and my main node over to 1.2.1 and started to plot pool plots.
Once I made the move to 1.2.1, I started seeing the following error on all of the harvesters, generally several times per day, each time taking down the chia_harvester process:
[Wed Jul 14 21:57:15 2021] chia_harvester[370192]: segfault at 7fc21c1c0d16 ip 00007fc237678862 sp 00007fc203ff1808 error 4 in chiapos.cpython-38-x86_64-linux-gnu.so[7fc237607000+bf000]
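For anyone trying to narrow this down, the kernel line above can be decoded a bit further. The snippet below is only a sketch: it assumes the usual x86-64 Linux segfault message layout (hex fields, with `error` being the page-fault error code bitmask), and the helper name `decode_segfault` is made up for illustration. It computes the offset of the faulting instruction inside the chiapos shared object and the type of access that faulted.

```python
import re

# Assumed layout of the x86-64 Linux kernel segfault message (all numbers are hex).
SEGFAULT_RE = re.compile(
    r"(?P<proc>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<module>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Hypothetical helper: split a kernel segfault line into its parts."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    err = int(m["err"], 16)
    return {
        "process": m["proc"],
        "pid": int(m["pid"]),
        "module": m["module"],
        # Instruction pointer relative to where the module is mapped.
        "offset": hex(ip - base),
        # x86 page-fault error code bits: 1 = page present, 2 = write, 4 = user mode.
        "page_present": bool(err & 1),
        "access": "write" if err & 2 else "read",
        "user_mode": bool(err & 4),
    }


line = (
    "[Wed Jul 14 21:57:15 2021] chia_harvester[370192]: segfault at 7fc21c1c0d16 "
    "ip 00007fc237678862 sp 00007fc203ff1808 error 4 in "
    "chiapos.cpython-38-x86_64-linux-gnu.so[7fc237607000+bf000]"
)
result = decode_segfault(line)
print(result)
```

For the line above this gives an offset of 0x71862 into the chiapos module and a user-mode read of an unmapped page. If the shared object was built with debug symbols, that offset could in principle be fed to something like `addr2line -e <path to the .so> 0x71862` to find the source line, but I have not confirmed that the shipped wheel includes symbols.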
Rolling everything back to 1.1.7 (which is a total PITA) resolves this issue completely. Here is what I have done to troubleshoot the problem:
1. Go from 1.1.7 on all systems (zero segfaults) to 1.2.0: segfaults start.
2. Go from 1.2.0 to 1.2.1 on all systems: segfaults continue.
3. Roll back to 1.1.7: segfaults stop (ran for three days without a single segfault).
4. Build an entirely new server: new memory, new cables, new LSI controller card, new SAS cables, new SAS expander cabinet, fresh install of Ubuntu 20.04 LTS, new Chia install:
   - 1.1.7: no segfaults at all
   - 1.2.0: segfaults start
   - 1.2.1: segfaults continue
I am currently trapping both dmesg and /var/log/syslog along with DEBUG on my chia logs. There is nothing prior to or after the segfaults in any of those logs to indicate where the problem might possibly be coming from.
I have tried everything that I can think of to locate the cause of this issue and while 1.1.7 seems to work fine, I really want the ability to plot for pooling which necessitates 1.2.x.
Any help/thoughts/suggestions/ideas would be greatly appreciated.