Releases: BBC-Esq/VectorDB-Plugin
v1.4.3 - models galore!
ADDED a bunch of new embedding models to choose from!
Renamed scripts and functions to prepare for expansion.
Fixed a bug that prevented the program from working on Macs and AMD GPUs (related to displaying metrics).
If anyone finds any bugs, please let me know. I will implement a custom feature you request (within reason) if you report a bug that helps make the program work better!
v1.4.2 - more metrics!
Added cpu and ram usage and percentage metrics.
Refactored code to prepare for expansion.
Roadmap:
--Introduce quantized embedding models for even faster performance and lower resource requirements.
--Add options when creating the database and interacting with the LLM; for example, controlling the chunk size or the number of results or the length of the results...to make sure it fits within the LLM's context window.
--Add a calculator that displays the total tokens of a user's prompt + the context returned so that a user can see if it fits within the LLM's context window.
--Add "poor man's vector database" search based on my other repo. Make this an option within this program.
--Add some color and other stuff to improve ease on eyes and appearance of the GUI.
--Remove the table on the left side that shows embedding models. Add a "help" or user's manual with various tables showing this information (only when requested), among other information like primers on how a vector database works, how to get the most out of it, nuances of the various models, and other helpful stuff.
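As a rough illustration of the token calculator idea on the roadmap, here is a minimal sketch. It is not code from this program: the function names, the 4-characters-per-token estimate, and the `reserve_for_answer` parameter are all assumptions made for the example. A real version would use the tokenizer that matches the LLM.

```python
from __future__ import annotations


def estimate_tokens(text: str) -> int:
    """Crude estimate: roughly 4 characters per token (an assumption, not a real tokenizer)."""
    if not text:
        return 0
    return (len(text) + 3) // 4  # round up


def fits_context_window(
    prompt: str,
    contexts: list[str],
    context_window: int,
    reserve_for_answer: int = 256,  # hypothetical headroom left for the LLM's reply
) -> tuple[int, bool]:
    """Return (estimated total tokens, whether prompt + contexts fit in the window)."""
    total = estimate_tokens(prompt) + sum(estimate_tokens(c) for c in contexts)
    return total, total + reserve_for_answer <= context_window


if __name__ == "__main__":
    total, ok = fits_context_window("What does clause 4 say?", ["chunk one", "chunk two"], 4096)
    print(f"estimated tokens: {total}, fits: {ok}")
```

If the check fails, the options mentioned above (smaller chunk size, fewer results) are the knobs that would bring the total back under the limit.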
v1.4.1 - cuda/vram/multiprocessing/threading
Properly implemented multithreading/multiprocessing to make sure the CUDA/VRAM usage display (and the GUI in general) doesn't freeze when creating or querying the vector database.
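For anyone curious about the general pattern, here is a minimal sketch of moving slow work off the main thread so an interface can keep updating. It is not this program's code: `build_database`, `run_in_background`, and `drain` are hypothetical names, and the `time.sleep` call only stands in for the real embedding work.

```python
import queue
import threading
import time


def build_database(documents, results):
    """Stand-in for the slow embedding step; runs in a worker thread."""
    for doc in documents:
        time.sleep(0.01)  # pretend to embed one document
        results.put(("progress", doc))
    results.put(("done", len(documents)))


def run_in_background(documents):
    """Start the worker and return it with the queue it reports to."""
    results = queue.Queue()
    worker = threading.Thread(
        target=build_database, args=(documents, results), daemon=True
    )
    worker.start()
    return worker, results


def drain(results):
    """Non-blocking poll, the kind of call a GUI timer could make between redraws."""
    messages = []
    while True:
        try:
            messages.append(results.get_nowait())
        except queue.Empty:
            return messages


if __name__ == "__main__":
    worker, results = run_in_background(["a.pdf", "b.docx"])
    while worker.is_alive():
        for message in drain(results):
            print(message)
        time.sleep(0.005)  # the main thread stays free for other work here
    for message in drain(results):
        print(message)
```

The main thread never blocks on the worker; it only empties the queue when it has time, which is what keeps metrics and widgets responsive.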
Updated pro tip to reflect reliable comments on Discord regarding larger LLMs being helpful for especially technical jargon.
v1.4 - BREAKING changes
Significantly revised the code, created new scripts, started using a configuration YAML file, reduced the "global variables," etc.
After much struggling, added GPU and VRAM usage metrics at the bottom of the GUI so you can see them while the program is running! However, the GUI still hangs periodically, which prevents it from updating when you need it most. This will be fixed in a patch in the next day or so.
Comments are welcome. Collaboration is appreciated.
1.3 - added file-type support!
Added support for the following file types!
pdf, docx, txt, json, enex, eml, msg, csv, xls, xlsx
Removed placeholder text on the GUI.
1.2.1 - IMPORTANT
IMPORTANT
I addressed the "pandas.core.arrays.arrow.dtype" error, which ChromaDB caused while trying to use/not use the "pandas" library - I'm not 100% sure which. Regardless, it prevented all releases of my program from working - I APOLOGIZE to people who were struggling.
Therefore, I have:
- Added "pandas==2.0.3" to the requirements.txt file. This will install pandas 2.0.3 over any other version of pandas that any other library listed in the requirements.txt tries to install.
- Confirmed the latest versions of all libraries in requirements.txt work with my program.
- Put version numbers after each library in the requirements.txt file that I know work; therefore, only versions of libraries that work with my script will be installed.
INSTALLATION instructions remain the same, but if you tried installing in the last 24 hours you'll need to reinstall everything from scratch.
Version 1.2
Added Metal/MPS and AMD GPU acceleration.
Revised readme and provided better installation instructions.
Added "check_gpu.py" to allow people to check if they've installed GPU acceleration correctly.
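To give an idea of what such a check can look like, here is a minimal sketch. It is not the contents of "check_gpu.py"; the `detect_backend` function and its return labels are made up for this example, and it assumes acceleration is provided through PyTorch. ROCm (AMD) builds of PyTorch report through `torch.cuda`, so `torch.version.hip` is used to tell them apart.

```python
def detect_backend() -> str:
    """Report which acceleration backend PyTorch can see (labels are illustrative)."""
    try:
        import torch
    except ImportError:
        return "torch-missing"

    if torch.cuda.is_available():
        # ROCm builds also answer through torch.cuda; torch.version.hip is set on them.
        return "rocm" if getattr(torch.version, "hip", None) else "cuda"

    # Metal/MPS support only exists in newer PyTorch builds, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"

    return "cpu"


if __name__ == "__main__":
    print(f"Detected backend: {detect_backend()}")
```

A result of "cpu" on a machine with a supported GPU usually means the wrong PyTorch build was installed.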
Version 1.1
Significant changes:
- Option to select and automatically download multiple embedding models.
- Automatically select HuggingFaceInstructEmbeddings or HuggingFaceEmbeddings depending on the embedding models being used.
- More useful GUI.
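The automatic selection between the two embedding classes could be sketched roughly as follows. This is only an illustration: `pick_embedding_class` is a hypothetical helper, and the rule it uses (model names containing "instructor" get the instruct class) is an assumption, not necessarily how the program decides. It returns the class name as a string so the example runs without LangChain installed.

```python
def pick_embedding_class(model_name: str) -> str:
    """Return the name of the embedding wrapper to use for a given model.

    Assumption for this sketch: instructor-style models are recognized by name.
    """
    if "instructor" in model_name.lower():
        return "HuggingFaceInstructEmbeddings"
    return "HuggingFaceEmbeddings"


if __name__ == "__main__":
    for name in ("hkunlp/instructor-large", "sentence-transformers/all-MiniLM-L6-v2"):
        print(name, "->", pick_embedding_class(name))
```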