
Issue84/aksh cross chunk attention#88

Open
akshsabherwal wants to merge 19 commits into develop from issue84/aksh-cross-chunk-attention

Conversation

@akshsabherwal
Collaborator

Relevant issues

Description

  • Create src/python/chatlse/cross_chunk_attention.py, which defines the ShiftedCrossChunkAttention (SCCA) class
  • Edit src/python/chatlse/crawler.py so that embed_function() incorporates the additional text processing

The basic mechanism of SCCA is as follows:

  1. Calculate embeddings of each chunk for each document (using existing utilities from embeddings.py)
  2. Shape embeddings into a tensor of required dimension
  3. Shift embeddings across chunks (e.g. if chunk 1 has embedding A, chunk 2 has embedding B, and chunk 3 has embedding C, then the shift results in chunk 1 having embedding C, chunk 2 having embedding A, and chunk 3 having embedding B)
  4. Break up each embedding into 8 different embeddings (called 'heads') and perform attention calculations on each of these heads
  5. Reshape the tensor back to its original dimension, resulting in each chunk now having an attended embedding
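
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code in cross_chunk_attention.py: it uses plain Python lists instead of tensors so that it runs without dependencies, and the names `shift_chunks` and `shifted_cross_chunk_attention` are hypothetical. It also assumes that queries come from the original embeddings, that keys and values come from the shifted embeddings, and that there are no learned projection weights; the description above does not specify any of these details.

```python
import math


def shift_chunks(embeddings, shift=1):
    """Step 3: circularly shift embeddings across chunks.

    With shift=1, chunk 1 receives the last chunk's embedding,
    chunk 2 receives chunk 1's, and so on.
    """
    n = len(embeddings)
    return [embeddings[(i - shift) % n] for i in range(n)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def shifted_cross_chunk_attention(embeddings, num_heads=8, shift=1):
    """Return one attended embedding per chunk of a single document.

    embeddings: list of chunks, each a list of floats of equal length.
    Assumption: queries are the original embeddings, keys/values are the
    shifted embeddings, and no learned projections are applied.
    """
    num_chunks = len(embeddings)
    dim = len(embeddings[0])
    if dim % num_heads != 0:
        raise ValueError("embedding dimension must be divisible by num_heads")
    head_dim = dim // num_heads

    shifted = shift_chunks(embeddings, shift)
    attended = [[0.0] * dim for _ in range(num_chunks)]

    # Step 4: split each embedding into heads and attend per head.
    for head in range(num_heads):
        lo, hi = head * head_dim, (head + 1) * head_dim
        queries = [e[lo:hi] for e in embeddings]
        keys = [e[lo:hi] for e in shifted]  # also used as values
        for i, query in enumerate(queries):
            # Scaled dot-product score of chunk i against every shifted chunk.
            scores = [
                sum(a * b for a, b in zip(query, key)) / math.sqrt(head_dim)
                for key in keys
            ]
            weights = softmax(scores)
            # Step 5: write the weighted sum back into this head's slice,
            # so the output keeps the original (num_chunks, dim) shape.
            for j in range(head_dim):
                attended[i][lo + j] = sum(w * k[j] for w, k in zip(weights, keys))
    return attended
```

For the three-chunk example in step 3, `shift_chunks(["A", "B", "C"])` returns `["C", "A", "B"]`. The attended output has the same number of chunks and the same embedding dimension as the input, so it could stand in wherever the unattended embeddings were used.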

How to test

  1. Run the crawler
  2. Run the app
  3. Ask it a question that gave poor results in v0.1 and check whether the answer has improved.

@akshsabherwal force-pushed the issue84/aksh-cross-chunk-attention branch from 52d3bfc to f61f351 on July 23, 2024 11:34

Development

Successfully merging this pull request may close these issues.

[ChatLSE] Explore Cross-Chunk attention mechanisms

2 participants