The Semantic Data Navigator is a Jupyter Notebook for processing and analyzing textual data with natural language processing (NLP) techniques. It uses OpenAI embeddings and LangChain to load, split, and semantically search documents, and provides visualizations with t-SNE and Plotly.
- Load and preprocess textual data from a directory.
- Generate embeddings using OpenAI's `text-embedding-ada-002` model.
- Store and retrieve document embeddings using ChromaDB.
- Implement interactive visualizations with Plotly and Matplotlib.
- Facilitate semantic search and similarity-based document retrieval.
- Utilize Gradio for interactive exploration of search results.
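
The similarity-based retrieval listed above can be illustrated with a minimal sketch. It is not the notebook's implementation: in the notebook, ChromaDB stores the vectors and performs the lookup. The toy three-dimensional vectors, the file names, and the helpers `cosine_similarity` and `top_k` are all hypothetical and exist only to show the idea of ranking documents by cosine similarity to a query embedding.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k=2):
    """Return the k (doc_id, score) pairs most similar to query_vec."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in store.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy vectors; real text-embedding-ada-002 embeddings
# have 1536 dimensions and come from the OpenAI API.
store = {
    "contracts.md": [0.9, 0.1, 0.0],
    "employees.md": [0.1, 0.9, 0.1],
    "products.md": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]

results = top_k(query, store, k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

With these toy vectors, `contracts.md` ranks first because its vector points in nearly the same direction as the query. A vector store such as ChromaDB applies the same principle at scale, typically with an approximate nearest-neighbor index instead of a full scan.
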
- Load the Notebook: Open semantic-data-navigator.ipynb in Jupyter or VS Code.
- Set Up API Keys: Ensure your OpenAI API key is properly configured.
- Run the Notebook: Execute each cell sequentially to process and analyze textual data.
- Explore Visualizations: Use the provided interactive t-SNE and semantic search functionalities.
- Deploy with Gradio: Run the Gradio UI for an interactive document retrieval experience.
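
For the API key step, a small check at the top of the notebook can fail early with a clear message. This is a sketch under one assumption: that the key is supplied through the `OPENAI_API_KEY` environment variable, which the OpenAI Python client reads by default. The notebook itself may load the key another way (for example from a `.env` file), and `require_api_key` is a hypothetical helper, not part of the project.

```python
import os


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key from an environment mapping, or raise a clear error.

    `env` defaults to os.environ; passing a plain dict makes the check
    easy to exercise without touching the real environment.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; export it before running the notebook."
        )
    return key
```

Calling `require_api_key()` in the first cell stops execution before any embedding request is made if the variable is missing or empty.
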

If you'd like to contribute to the development of this project, feel free to submit a pull request or raise an issue.

This project is open-source and distributed under the MIT License.