Replies: 4 comments 3 replies
-
The data is compressed separately in each block, so you cannot simply
concatenate the blocks and gunzip the result to get the content. The
compressed stream may also lack some headers a standalone tool expects, e.g. with zstd.
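To illustrate the point about per-block compression, here is a minimal Python sketch. It is a toy model, not ZFS's on-disk format: it uses the standard `zlib` module as a stand-in compressor, and the helper names (`compress_records`, `naive_decompress`, `blockwise_decompress`) are made up for this example. It shows that a single decompressor fed the concatenation stops after the first record, and that the block boundaries are needed to get the rest.

```python
import zlib
from typing import List, Tuple

RECORD_SIZE = 128 * 1024  # assumption: mirrors the 128K dblk shown by zdb


def compress_records(data: bytes, record_size: int = RECORD_SIZE) -> List[bytes]:
    """Compress each fixed-size record independently (toy per-block model)."""
    return [
        zlib.compress(data[off:off + record_size])
        for off in range(0, len(data), record_size)
    ]


def naive_decompress(blob: bytes) -> Tuple[bytes, int]:
    """Feed the whole concatenation to one decompressor.

    Returns the bytes produced and the number of input bytes left unread
    once the first compressed stream ended.
    """
    d = zlib.decompressobj()
    out = d.decompress(blob)
    return out, len(d.unused_data)


def blockwise_decompress(blocks: List[bytes]) -> bytes:
    """Decompress record by record, which requires knowing the boundaries."""
    return b"".join(zlib.decompress(b) for b in blocks)
```

With four records, `naive_decompress(b"".join(blocks))` returns only the first record's data and reports the other three as unread input, while `blockwise_decompress(blocks)` recovers everything.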
…On Wed, Apr 13, 2022 at 9:29 AM Stefan de Konink ***@***.***> wrote:
At this moment it is possible to access snapshots via the .zfs/snapshot
folder on each filesystem. Filesystems that have compression enabled have an
interesting potential: they are obviously stored smaller on disk than when
accessed in practice, but also, they are *already* compressed.
Consider a fileserver that sends out content that is gzipped. The
content would already be stored gzipped in the zfs filesystem, so it would be
rather nifty to get the compressed blocks directly, for example by opening
.zfs/compressed/path/to/file.
As a practical example:
zdb -ddddd storage/compressed 387
Dataset storage/compressed [ZPL], ID 1248, cr_txg 8542160, 6.21M, 52 objects, rootbp DVA[0]=<0:d146d4000:2000> DVA[1]=<0:3c99c32000:2000> [L0 DMU objset] fletcher4 uncompressed unencrypted LE contiguous unique double size=1000L/1000P birth=8827030L/8827030P fill=52 cksum=10b7cc5d2b:2ca9214fb020:3f71118840efdd:3fbc1f4cfa9a6dee
Object lvl iblk dblk dsize dnsize lsize %full type
387 2 128K 128K 88K 512 1M 100.00 ZFS plain file
176 bonus System attributes
dnode flags: USED_BYTES USERUSED_ACCOUNTED USEROBJUSED_ACCOUNTED
dnode maxblkid: 7
path /NX-PI_01_at_oov_LINE_oov_15-596-j21_20220202.xml
uid 0
gid 0
atime Tue Mar 15 16:59:39 2022
mtime Wed Feb 2 16:21:22 2022
ctime Tue Mar 15 16:40:22 2022
crtime Tue Mar 15 16:40:22 2022
gen 8542172
mode 100644
size 959657
parent 34
links 1
pflags 840800000004
Indirect blocks:
0 L1 0:2d8b1d6000:2000 20000L/2000P F=8 B=8542172/8542172 cksum=9eee11decc:4a2e06f88402e:1155e5a4119062fb:460da27029f20a40
0 L0 0:2d84b6e000:4000 20000L/4000P F=1 B=8542172/8542172 cksum=47344b8bc64:330a790d3af518:32cee111be6853c8:bd4f07cc0b45e606
20000 L0 0:2d8b02c000:2000 20000L/2000P F=1 B=8542172/8542172 cksum=3b5fe2f2b8d:f48066b20f8a8:2870e00100ce1d6d:94ccf8fd0871d390
40000 L0 0:2d8b08a000:2000 20000L/2000P F=1 B=8542172/8542172 cksum=3afe0cbe872:f297c66f35acb:27ee32610c80ece4:28c4de9a81fdc32c
60000 L0 0:2d8b08c000:2000 20000L/2000P F=1 B=8542172/8542172 cksum=3db55e587f7:102562068b06b0:2b30d1b93873ad13:b97a1fba40c09b1c
80000 L0 0:2d8b08e000:2000 20000L/2000P F=1 B=8542172/8542172 cksum=3b48ea60028:fd5016dfc39e9:2a5f41d7a460006d:d31585d54e1f9d98
a0000 L0 0:2d8b090000:2000 20000L/2000P F=1 B=8542172/8542172 cksum=368def0b928:fcc56fd4756ac:2b45753eddbea7a6:5f98d35d35f6301c
c0000 L0 0:2d8b092000:2000 20000L/2000P F=1 B=8542172/8542172 cksum=13504a5361c:8224ad8cc331e:1bc48077a0c9e541:fde1a4ec5be32352
e0000 L0 0:2d8b094000:2000 20000L/2000P F=1 B=8542172/8542172 cksum=89c2fa8900:3fbbe5134f093:ec89da01fbad85c:adaaac8da5fd2a04
segment [0000000000000000, 0000000000100000) size 1M
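As a side note on reading the listing above: the `20000L/4000P` field on each block line gives the logical and physical size in hex. The following is a rough Python sketch for pulling those numbers out. The regular expression is an assumption that only targets the line layout shown in this post (it may not match other zdb versions), and `Block`, `parse_blocks` and `data_sizes` are hypothetical names.

```python
import re
from typing import List, NamedTuple, Tuple


class Block(NamedTuple):
    offset: int  # logical offset within the file
    level: int   # 0 = data block, 1+ = indirect block
    dva: str     # vdev:offset:asize, kept as the raw string
    lsize: int   # logical (uncompressed) size in bytes
    psize: int   # physical (compressed) size in bytes


# Assumed layout: "<hex offset> L<level> <dva> <hex>L/<hex>P ..."
LINE_RE = re.compile(
    r"^\s*([0-9a-f]+)\s+L(\d+)\s+(\S+)\s+([0-9a-f]+)L/([0-9a-f]+)P\b"
)


def parse_blocks(text: str) -> List[Block]:
    """Extract block-pointer lines from zdb-style output; skip everything else."""
    blocks = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            off, lvl, dva, lsize, psize = m.groups()
            blocks.append(
                Block(int(off, 16), int(lvl), dva, int(lsize, 16), int(psize, 16))
            )
    return blocks


def data_sizes(blocks: List[Block]) -> Tuple[int, int]:
    """Sum logical and physical sizes over the level-0 (data) blocks only."""
    data = [b for b in blocks if b.level == 0]
    return sum(b.lsize for b in data), sum(b.psize for b in data)
```

For the eight L0 lines above, this adds up to 1048576 logical bytes against 73728 physical bytes.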
#4984 gives the hint that
something like the commands below could do the trick (if all DVA parts would be
compiled). That part is not working for me yet.
zdb -R storage/compressed 0:d146d4000:2000:r >/tmp/test.bin
zstdcat /tmp/test.bin
But in general, is accessing the raw data something that could be
supported natively inside the filesystem? For example:
compressed path /NX-PI_01_at_oov_LINE_oov_15-596-j21_20220202.xml.zstd
-
This seems like it might be better implemented as generic functionality in whichever FS network sharing layer is in use, rather than by using ZFS's own copies. You'd need to implement a bunch of reading and re-wrapping of the data to make it readable with standard versions of those de/compressors; we'd need to decompress, divide up and recompress anything written anyway; and you'd get far better compression over larger regions of "whatever the FS sharing protocol is requesting" than over per-record units like we use.
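A quick way to get a feel for the last point (larger regions compress better than individual records) is to compare the two on the same input. This is only a sketch: it uses Python's `zlib` rather than any ZFS code path, the record size is arbitrary, and `compare_sizes` is a hypothetical helper.

```python
import zlib
from typing import Tuple


def compare_sizes(data: bytes, record_size: int) -> Tuple[int, int]:
    """Return (whole_stream_size, summed_per_record_size) in bytes.

    The first value compresses the input as one stream; the second
    compresses each record on its own, so every record starts without
    any history from the records before it.
    """
    whole = len(zlib.compress(data))
    per_record = sum(
        len(zlib.compress(data[off:off + record_size]))
        for off in range(0, len(data), record_size)
    )
    return whole, per_record
```

On repetitive, text-like input with small records, the per-record total comes out larger than the single stream, because each record pays for its own header and starts with an empty dictionary.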
-
You're not the first to think about this. zfs send/receive are fairly good starting points. There are a couple of places where it could be added, depending on the intended use. Bolting on something like ipfs or even bittorrent has many, many use cases. In ipfs it could be exposed as a vdev like a cache, or made available through channel programs, etc.
-
It feels like externalizing and synchronizing the internal state of
the zfs-send stream generator.
…On Sat, Jun 11, 2022 at 10:48 PM Stefan de Konink ***@***.***> wrote:
A few days ago, introducing yet another backup node in our
network got me thinking about the 'bittorrent' and 'ipfs' spirit. For example,
what about allowing zfs send to come from multiple hosts that all
share the same immutable snapshot? Those blocks should be shareable
across the network to reconstruct a filesystem. A rather ambitious
project...