I've been trying to understand some of the implementation details of this, but I think there is some contextual knowledge that I lack. I'll try to ask those questions here in the hope of finding answers and contributing some improvements, or at least code comments, for future readers of this code.
-
I am confused by why ConsumableBuffer starts reading from the last byte as opposed to the first byte:
```ts
// js-hamt-sharding/src/consumable-buffer.ts, lines 28 to 31 in 44b6e6b
constructor (value: Uint8Array) {
  this._value = value
  this._currentBytePos = value.length - 1
  this._currentBitPos = 7
}
```
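To make the consumption order concrete, here is a tiny standalone sketch (my own illustration, not library code) of what starting at `value.length - 1` implies:

```typescript
// Minimal illustration (not library code): a reader that starts at
// value.length - 1 and walks downward yields the buffer's bytes last-first.
function * consumeBytes (value: Uint8Array): Generator<number> {
  for (let pos = value.length - 1; pos >= 0; pos--) {
    yield value[pos]
  }
}

const order = [...consumeBytes(Uint8Array.from([10, 20, 30]))]
// order is [30, 20, 10]: the last byte of the buffer comes out first
```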
What is even more confusing is that while it reads bytes from the right side, it reads bits from the left side, and the way it keeps track of _currentBitPos would make you think bits are also read from the right side. But as it turns out that is deceiving, because it actually reverses the start position just before reading bits:
```ts
// js-hamt-sharding/src/consumable-buffer.ts, lines 47 to 49 in 44b6e6b
const availableBits = this._currentBitPos + 1
const taking = Math.min(availableBits, pendingBits)
const value = byteBitsToInt(byte, availableBits - taking, taking)
```
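For what it's worth, here is a standalone reimplementation of the helper (mine, inferred from how the snippet uses it, so treat the exact signature and semantics as an assumption) showing why passing `availableBits - taking` reads the top still-unread bits of the byte:

```typescript
// Assumed semantics of byteBitsToInt(byte, start, length): `start` is the
// offset of the extracted bit run from the LEAST significant bit, so passing
// availableBits - taking extracts the `taking` highest still-unread bits.
function byteBitsToInt (byte: number, start: number, length: number): number {
  const mask = (1 << length) - 1
  return (byte >> start) & mask
}

// With a fresh byte (availableBits = 8) and taking = 3:
// start = 8 - 3 = 5, so we get the top three bits of 0b10110100 → 0b101
const top3 = byteBitsToInt(0b10110100, 5, 3) // 5
```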
Perhaps most confusing about all of this is the fact that js-ipfs-unixfs reverses the hash bytes, which makes me wonder whether that would have been unnecessary if ConsumableBuffer just read bytes from left to right instead.
https://github.com/ipfs/js-ipfs-unixfs/blob/a5b9cad0cddd391e7dd93904bc4d3366f8e79202/packages/ipfs-unixfs-importer/src/options.js#L8-L16
```js
async function hamtHashFn (buf) {
  return (await murmur3128.encode(buf))
    // Murmur3 outputs 128 bit but, accidentally, IPFS Go's
    // implementation only uses the first 64, so we must do the same
    // for parity..
    .slice(0, 8)
    // Invert buffer because that's how Go impl does it
    .reverse()
}
```
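If the conjecture above is right, the `reverse()` and the read-from-the-end would cancel each other out. A quick standalone check of that equivalence (my own illustration, not library code):

```typescript
// Reading bytes from the END of a reversed buffer visits the same bytes in
// the same order as reading the ORIGINAL buffer from the start.
const original = Uint8Array.from([1, 2, 3, 4])
const reversed = original.slice().reverse() // [4, 3, 2, 1]

const fromEndOfReversed: number[] = []
for (let pos = reversed.length - 1; pos >= 0; pos--) {
  fromEndOfReversed.push(reversed[pos])
}
// fromEndOfReversed is [1, 2, 3, 4], i.e. `original` read left to right
```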
-
InfiniteHash seems to represent the hash of some seed bytes, in the following pattern:
[...hash(seed), ...hash([...seed, 1]), ...hash([...seed, 2]), ...]
If I understand its code correctly, the (default) hash will return 8 bytes, after which a new hash is generated from the seed bytes and the depth, producing another 8 bytes, etc.:
```ts
// js-hamt-sharding/src/consumable-hash.ts, lines 80 to 89 in 44b6e6b
async _produceMoreBits () {
  this._depth++

  const value = this._depth > 0 ? uint8ArrayConcat([this._value, Uint8Array.from([this._depth])]) : this._value
  const hashValue = await this._hashFn(value)
  const buffer = new ConsumableBuffer(hashValue)

  this._buffers.push(buffer)
  this._availableBits += buffer.availableBits()
}
```
I suspect things will eventually repeat (because Uint8Array.from truncates modulo 256, so Uint8Array.from([256]) would be [0] and Uint8Array.from([257]) would be [1], the same suffix as depth 1). I'm not sure if that is intentional or a bug, but it is a scenario that will not occur in practice, so it probably does not matter.
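Putting my reading of the pattern into a runnable sketch (with a stand-in `fakeHash` rather than the real configured hashFn, so the concrete byte values are illustrative only):

```typescript
// Stand-in for the configured 8-byte hash function; NOT the real murmur3.
function fakeHash (bytes: Uint8Array): Uint8Array {
  const sum = bytes.reduce((acc, b) => (acc + b) & 0xff, 0)
  return Uint8Array.from({ length: 8 }, (_, i) => (sum + i) & 0xff)
}

// The stream InfiniteHash appears to produce:
// [...hash(seed), ...hash([...seed, 1]), ...hash([...seed, 2]), ...]
function * infiniteHashBytes (seed: Uint8Array): Generator<number> {
  for (let depth = 0; ; depth++) {
    // depth 0 hashes the bare seed; deeper levels append the depth as a
    // single byte (which Uint8Array.from truncates modulo 256)
    const input = depth === 0 ? seed : Uint8Array.from([...seed, depth])
    yield * fakeHash(input)
  }
}

// take the first 24 bytes: three 8-byte blocks for depths 0, 1, 2
const it = infiniteHashBytes(Uint8Array.from([9, 9]))
const first24: number[] = []
for (let i = 0; i < 24; i++) {
  first24.push(it.next().value as number)
}
```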
Is this overall hashing behavior described somewhere? I can't seem to find anything similar in the go code base, and in fact I got the impression that go will error with an out-of-bounds error.
Then again, could it be that none of our tests run into conflicting keys, and go and js might not align when that occurs?
-
I'm under the impression that _findPlace basically traverses the trie by deriving an index from 8 bits of the key hash; if a bucket is encountered, the next 8 bits of the key hash are used to descend one level deeper, etc. And since our hashes are deterministically derived from key + depth, we will eventually encounter an index without a bucket in it:
```ts
// js-hamt-sharding/src/bucket.ts, lines 154 to 170 in 44b6e6b
async _findPlace (key: string | InfiniteHash): Promise<BucketPosition<T>> {
  const hashValue = this._options.hash(typeof key === 'string' ? uint8ArrayFromString(key) : key)
  const index = await hashValue.take(this._options.bits)

  const child = this._children.get(index)

  if (child instanceof Bucket) {
    return await child._findPlace(hashValue)
  }

  return {
    bucket: this,
    pos: index,
    hash: hashValue,
    existingChild: child
  }
}
```
Is this accurate? And if so, I feel like the whole hash-cycling behavior probably needs to be part of the unixfs spec.