Description
After an instance crash (power outage, ...), a fresh start takes a very long
time because HeapReader reads the RWI database and rebuilds the indexes
(about half a day for a 300 GB RWI database), resulting in prolonged
instance downtime.
I'm not sure how the RWI database works, so please correct me if I'm wrong,
but this is what I observe:
- all the segment indices are kept in memory while the instance is running
  (which can result in memory exhaustion, as described in "RWIs fill out the whole memory space for YaCy" #731)
- during a correct shutdown, the indices are written to (.idx and .gap?) files
  and the restart is relatively fast
- if the instance crashes, no .idx & .gap files are written
A possible solution could be:
- keep the .idx & .gap files until the segment is changed
- maybe write the .idx & .gap files from time to time, when the segment
  hasn't changed for a long time
- when running into a low-memory condition, unmount (?) the oldest segments.
  Check this on start-up as well, because with an RWI database bigger than
  the available RAM the instance wouldn't start and would fail with a memory error
- speed up the RWI transfer to other instances, as suggested in "RWI: indexDistribution.minChunkSize does nothing" #724.
The RWI database on a host only grows, even with moderate crawling (~200ppm,
DHT-IN switched off), and never actually shrinks, which causes many
instances to jam.
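
To illustrate the first suggestion (keeping the .idx & .gap files until the segment changes), here is a minimal sketch of how a start-up check could decide whether a saved index dump still matches its segment file. This is not YaCy code: the class `IndexDumpCheck`, the methods `writeStamp` and `dumpIsFresh`, and the idea of a separate stamp file are all hypothetical, and I don't know how YaCy actually writes or validates its .idx & .gap files.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper (not part of YaCy): tracks whether an index dump still matches its segment. */
final class IndexDumpCheck {

    /** Fingerprint of the segment file: its size and last-modified time. */
    private static String fingerprint(Path segment) throws IOException {
        return Files.size(segment) + ":" + Files.getLastModifiedTime(segment).toMillis();
    }

    /** Call right after the index dump for this segment has been written. */
    static void writeStamp(Path segment, Path stamp) throws IOException {
        Path tmp = stamp.resolveSibling(stamp.getFileName() + ".tmp");
        Files.write(tmp, fingerprint(segment).getBytes(StandardCharsets.UTF_8));
        // Write to a temp file and rename it, so a crash in the middle of the
        // write does not leave a truncated stamp under the final name.
        Files.move(tmp, stamp, StandardCopyOption.REPLACE_EXISTING);
    }

    /** True if the dump may be reused; false means the index must be rebuilt. */
    static boolean dumpIsFresh(Path segment, Path stamp) throws IOException {
        if (!Files.exists(stamp)) {
            return false; // no dump was ever recorded for this segment
        }
        String recorded = new String(Files.readAllBytes(stamp), StandardCharsets.UTF_8);
        return recorded.equals(fingerprint(segment));
    }
}
```

With a check like this, a crash would only force a rebuild for segments that were modified after their last dump. Size and modification time are a cheap but imperfect fingerprint; a checksum would be safer but would need a full read of the segment.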