Even if --listBlocks wasn't specified, it makes sense to keep track of when
zero blocks are read or written so that they don't have to be read or written
repeatedly. The attached patch accomplishes this as follows:
* Change the non-zero block map into a zero block map, i.e., a bit in the map
is set if the corresponding block is zero, rather than if it's non-zero. This
change isn't strictly necessary, since I could have left it as a non-zero map
and checked for the opposite bit value. But what we're really interested in is
whether a block is zero, so that we don't need to read or write it, so I think
a zero map makes more logical sense and makes the code clearer.
* Create an empty zero map when initializing http_io if --listBlocks wasn't
specified.
* Set a block's bit in the zero map if we try to read the block and get ENOENT.
* Set a block's bit in the zero map if we write a zero block that wasn't
previously zero.
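
To make the bookkeeping in the list above concrete, here is a minimal sketch in C. It is not the code from the patch: the `zero_map` struct and every function name below are hypothetical, and it assumes the lower layer reports a missing block as ENOENT, as described above.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical zero-block map: bit i is set when block i is known to be zero. */
struct zero_map {
    uint32_t *bits;
    size_t num_blocks;
};

/* Start with an empty map (no block known to be zero yet). */
static int zero_map_init(struct zero_map *m, size_t num_blocks)
{
    size_t words = (num_blocks + 31) / 32;

    m->bits = calloc(words > 0 ? words : 1, sizeof(*m->bits));
    if (m->bits == NULL)
        return ENOMEM;
    m->num_blocks = num_blocks;
    return 0;
}

static void zero_map_free(struct zero_map *m)
{
    free(m->bits);
    m->bits = NULL;
}

static int zero_map_test(const struct zero_map *m, size_t block)
{
    return (int)((m->bits[block / 32] >> (block % 32)) & 1u);
}

static void zero_map_set(struct zero_map *m, size_t block)
{
    m->bits[block / 32] |= (uint32_t)1 << (block % 32);
}

static void zero_map_clear(struct zero_map *m, size_t block)
{
    m->bits[block / 32] &= ~((uint32_t)1 << (block % 32));
}

/* Return 1 if every byte of the buffer is zero. */
static int block_is_zero(const void *data, size_t size)
{
    const unsigned char *p = data;
    size_t i;

    for (i = 0; i < size; i++) {
        if (p[i] != 0)
            return 0;
    }
    return 1;
}

/* After a read attempt: ENOENT means the block does not exist, i.e., it is zero. */
static void zero_map_note_read(struct zero_map *m, size_t block, int read_errno)
{
    if (read_errno == ENOENT)
        zero_map_set(m, block);
}

/* Before a write: a zero block that is already known to be zero can be skipped. */
static int zero_map_can_skip_write(const struct zero_map *m, size_t block,
    const void *data, size_t size)
{
    return block_is_zero(data, size) && zero_map_test(m, block);
}

/* After a successful write: record whether the block is now zero. */
static void zero_map_note_written(struct zero_map *m, size_t block,
    const void *data, size_t size)
{
    if (block_is_zero(data, size))
        zero_map_set(m, block);
    else
        zero_map_clear(m, block);
}
```

In this sketch a read path would also consult `zero_map_test()` first and return zeros without touching S3 when the bit is set. Any locking the real code needs is left out.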
This is actually the first patch of five I intend to submit in this area, if
it's OK with you. They are:
1. This patch (track zero instead of non-zero blocks, and track even when
--listBlocks wasn't specified).
2. Make --listBlocks happen in the background in a separate thread after the
filesystem is mounted (this should be relatively easy to do now that I've done
patch 1).
3. When a block that we expect to exist in S3 isn't there when we try to read
it, restore it from the cache if possible.
4. When a block that we expect to exist in S3 isn't there when we do
--listBlocks, restore it from the cache if possible.
5. Add an option to rerun --listBlocks periodically in the background while
s3backer is running.
Patches 3-5 deserve some explanation. My concern is that blocks could simply
disappear from S3 without any intervention on our part. With regular S3
storage the risk is very small, but with reduced redundancy storage (RRS) it is
much larger, and over time a loss becomes likely. I'm using s3backer to store my
backups with rsync, so I'm using RRS, since all the data I'm saving exists on
my desktop as well. However, the doc for RRS says that it should only be used
for data that can be restored easily, and indeed it can in this case, since for
performance reasons, my s3backer cache is big enough to hold my entire backup
filesystem. Ergo, it makes a great deal of sense to teach s3backer how to
automatically restore dropped blocks.
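
As a rough illustration of what patch 3 would do, here is a sketch in C. Nothing in it comes from s3backer itself: `mem_store` is a hypothetical in-memory stand-in for both the S3 layer and the local cache, and `read_with_restore()` is an assumed name. The only behavior taken from the description above is that a missing block shows up as ENOENT.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 8
#define NUM_BLOCKS 4

/* Hypothetical in-memory block store, used here for both "S3" and the cache. */
struct mem_store {
    unsigned char data[NUM_BLOCKS][BLOCK_SIZE];
    int present[NUM_BLOCKS];
};

/* Read a block; returns 0 on success or ENOENT if the block does not exist. */
static int mem_read(const struct mem_store *s, size_t block, void *buf)
{
    if (!s->present[block])
        return ENOENT;
    memcpy(buf, s->data[block], BLOCK_SIZE);
    return 0;
}

/* Write a block; always succeeds in this stand-in. */
static int mem_write(struct mem_store *s, size_t block, const void *buf)
{
    memcpy(s->data[block], buf, BLOCK_SIZE);
    s->present[block] = 1;
    return 0;
}

/*
 * Read a block from S3. If S3 says ENOENT but the block is not known to be
 * zero, it was expected to exist, so try to recover it from the cache and
 * upload it again. Returns 0 on success or an errno value.
 */
static int read_with_restore(struct mem_store *s3, const struct mem_store *cache,
    int known_zero, size_t block, void *buf)
{
    int r = mem_read(s3, block, buf);

    /* Success, some other error, or a block that is legitimately absent. */
    if (r != ENOENT || known_zero)
        return r;

    /* The block was dropped; without a cached copy it cannot be restored. */
    if (mem_read(cache, block, buf) != 0)
        return ENOENT;

    /* buf now holds the cached data; put the block back into S3. */
    return mem_write(s3, block, buf);
}
```

The `known_zero` argument is where the zero map from patch 1 comes in: ENOENT for a block whose bit is set is normal, while ENOENT for any other block signals a dropped block.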
Please let me know your thoughts about this patch and my plans for the rest of
them, especially since I think I may need some guidance from you when
implementing patches 3-5 :-).
Thanks,
jik
Original issue reported on code.google.com by jikam...@gmail.com on 24 Oct 2010 at 7:45
Attachments: