Merge pull request #57 from amosproj/feature-branch

lyndanajjar · web-flow · commit 8522acc676f2 · 2023-12-11T23:11:27.000+01:00
UI Refinement, Efficient Testing, and Documentation
diff --git a/Queries Classification b/Queries Classification
@@ -0,0 +1,68 @@
+Testing Queries (Based on the Bot answers):
+
+We worked on the catalog queries found in the file pairs.py and classified them into four levels:
+
+Level 1: Easy - Nearly identical code.
+Level 2: Medium - More than 50% of lines are the same.
+Level 3: Hard - Less than 50% similarity but with the same functionality.
+Level 4: Undetermined - Sometimes the bot provides code (not necessarily correct), sometimes not.
+
+
+
+Easy level queries :
+
+    Q1: I would like to use RTDIP components to read from an eventhub using 'connection string' as the connection string, and 'consumer group' as the consumer group, transform using binary to string, and edge x transformer then write to delta. 
+
+    Q18: Read customer purchase history from a Parquet file, perform a customer segmentation analysis, and save the segments to a Delta Lake.
+
+
+Medium level queries :
+
+    Q2: I need to read data from Kafka using a specific bootstrap server and topic, then apply a JSON parser, and finally write the results to a Hive table.
+
+    Q9: Load sales data from an FTP (file transfer protocol) server, perform currency conversion, and append the results to an existing Parquet file.
+
+    Q14: Access weather data stored in an HDFS cluster, normalize temperature readings, and store the results in an Elasticsearch index.
+
+    Q19: Aggregate financial transaction data from a SQL database, calculate the monthly average transaction amount, and store the results in a Delta Lake.
+
+    Q20: Fetch log data from an Elasticsearch index, filter logs with error severity, and archive them in a Delta Lake.
+
+
+
+Hard level queries :
+
+    Q3: Fetch sensor data from an Azure Blob Storage in CSV format, aggregate the data on sensor ID, and save it to a SQL database.
+
+    Q4: Stream data from an MQTT broker, filter out readings below a threshold value, and store the data in Elasticsearch.
+
+    Q6: Retrieve temperature data from a REST API, normalize the data, and write it into a MongoDB collection.
+
+    Q7: Connect to a Google Cloud Storage, download logs in JSON format, conduct sentiment analysis, and then store the results in a Google BigQuery table.
+
+    Q8: Stream Twitter data using API credentials, extract hashtags from tweets, and save the data into a Cassandra database.
+
+    Q10: Connect to an IoT device using MQTT protocol, apply a low-pass filter to sensor readings, and upload the filtered data to an InfluxDB instance.
+
+    Q12: Aggregate temperature and humidity data from a CSV file stored in an Azure Data Lake, calculate average values per day, and upload to a Snowflake database.
+
+    Q13: Extract stock market data from a REST API, calculate moving averages, and save the data in an Amazon Redshift cluster.
+
+    Q16: Stream social media data from a JSON file, deduplicate the entries based on user ID, and store the results in a Delta Lake.
+
+    Q21: Read weather data from a RESTful API, convert temperature from Celsius to Fahrenheit, and store the results in a JSON file.
+
+    Q22: Connect to a MySQL database, retrieve order data, group by product category, and insert the grouped data into a new table.
+
+    Q23: Extract text data from a series of PDF files stored in an SFTP server, perform named entity recognition, and index the entities in an Apache Solr collection.
+
+
+Undetermined level queries :
+
+    Q5: Import financial data from an S3 bucket in Parquet format, apply a standard scaler transformation, and then upload it to a Redshift database.
+
+    Q11: Read customer feedback from a Google Sheets document, apply sentiment analysis, and store the results in a PostgreSQL database for further analysis.
+
+    Q15: Load sales records from a MongoDB collection, filter out records with sales below $500, and export the data to a CSV file.
+
+    Q17: Load IoT sensor data from a CSV file, apply a smoothing filter to the readings, and write to a Delta Lake for time-series analysis.
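
The four levels in the Queries Classification file above could be approximated in code roughly as follows. This is only a sketch under stated assumptions: the commit does not say how similarity was measured, so the line-based `difflib` comparison, the function names `line_similarity` and `classify`, and the 0.9 cutoff for "nearly identical" are all hypothetical. It also cannot check that Level 3 answers have "the same functionality", and it treats a single missing answer as Level 4, whereas the file describes Level 4 as inconsistent behavior across attempts.

```python
from difflib import SequenceMatcher
from typing import Optional


def line_similarity(expected: str, generated: str) -> float:
    """Return a 0.0-1.0 similarity score based on matching non-empty lines."""
    exp_lines = [ln.strip() for ln in expected.splitlines() if ln.strip()]
    gen_lines = [ln.strip() for ln in generated.splitlines() if ln.strip()]
    if not exp_lines and not gen_lines:
        return 1.0
    # SequenceMatcher accepts any sequences of hashable items, here lists of lines.
    return SequenceMatcher(None, exp_lines, gen_lines).ratio()


def classify(expected: str, generated: Optional[str], near_exact: float = 0.9) -> str:
    """Map one bot answer to a level; near_exact is an assumed cutoff."""
    if generated is None or not generated.strip():
        # No code returned in this attempt (assumed stand-in for Level 4).
        return "Level 4: Undetermined"
    score = line_similarity(expected, generated)
    if score >= near_exact:
        return "Level 1: Easy"
    if score > 0.5:
        return "Level 2: Medium"
    return "Level 3: Hard"
```

Running `classify` on each expected/generated pair gives a first-pass label; a manual review would still be needed to confirm functionality for Level 3 and to spot the inconsistent cases that belong in Level 4.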
diff --git a/README.md b/README.md
@@ -13,7 +13,7 @@ blocks to perform the data processing. Chat history and generated output should
 ## Meet the Team 
 Scrum Master: [@SaraElBrak](https://github.com/SaraElBrak)  
 Product Owners: [@AviKatziuk](https://github.com/AviKatziuk), [@ceciliabetb](https://github.com/ceciliabetb)  
-Software Developerss: [@lyndanajjar](https://github.com/lyndanajjar), [@bergzain](https://github.com/bergzain), [@Obismadi99](https://github.com/Obismadi99), [@Nahrain1](https://github.com/Nahrain1)
+Software Developers: [@lyndanajjar](https://github.com/lyndanajjar), [@bergzain](https://github.com/bergzain), [@Obismadi99](https://github.com/Obismadi99), [@Nahrain1](https://github.com/Nahrain1)
 
 The planning document of the team is found [here](https://docs.google.com/spreadsheets/d/1m1z2m_p6k0ATw0RVNXJMbDp-RrOOPxpu0c3PPCtrwBI/edit#gid=6) 
 
@@ -22,17 +22,51 @@ Ensure that you have installed:
 
 * Python version 3.11 or higher
 * Docker Desktop 
+* OpenAI API Key   
 
-To install the required dependencies for this projecdct, please create a virtual environment and run: 
+## Build Process 
+To get started with this project, follow these steps: 
+
+* Clone the GitHub repository
+```
+git clone https://github.com/amosproj/amos2023ws05-pipeline-config-chat-ai.git
+``` 
+* Install Dependencies  
+To set up the required dependencies, create a virtual environment and run the following command:
 ```
 pip install -r requirements.txt
 ```
+* Run the Application with Docker  
+Navigate to the `src` folder by using: 
 
+``` 
+cd src 
+```
+Ensure that the Docker daemon is running before proceeding. Follow these steps to run the application:  
+Step 1: Build the Docker Image  
+Execute the following command to build the Docker image. Replace `<your-image-name>` with your chosen image name:
 
+```
+docker build -t <your-image-name> .
+```
+Step 2: Run the Docker Container  
+Launch the container using the following command, specifying port 8501 for Streamlit:
 
+```
+docker run -dp 8501:8501 <your-image-name>
+```
+
+Once the container is running, open the application by clicking the link shown in the `Ports` column of Docker Desktop. The link reflects the `8501:8501` port mapping from the run command, so the application is also reachable at http://localhost:8501.
+
+_Note: Remember to replace `<your-image-name>` with the actual name you assigned to your Docker image._
 
+* Accessing the Chatbot Application  
 
+Open your web browser and navigate to the presented link. The Chatbot application will be displayed, prompting you to input your OpenAI API Key. Engage in conversations by posing RTDIP-oriented questions and explore the capabilities of the application.
 
+<div style="text-align:center">
+  <img src="https://github.com/amosproj/amos2023ws05-pipeline-config-chat-ai/raw/feature-branch/UI.png" alt="UI" width="500"/>
+</div>
 
 
 
diff --git a/UI.png b/UI.png
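
For the Docker steps added to README.md in this commit, a quick way to confirm that the container is accepting connections before opening the browser might look like the sketch below. It is not part of the repository: the helper names `is_port_open` and `wait_for_port` are hypothetical, and it only checks that something is listening on the mapped port, not that the Streamlit app itself has finished loading.

```python
import socket
import time


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


def wait_for_port(host: str, port: int, total_wait: float = 30.0, interval: float = 1.0) -> bool:
    """Poll host:port until it accepts connections or total_wait seconds pass."""
    deadline = time.monotonic() + total_wait
    while time.monotonic() < deadline:
        if is_port_open(host, port):
            return True
        time.sleep(interval)
    return False
```

After `docker run -dp 8501:8501 <your-image-name>`, calling `wait_for_port("localhost", 8501)` should return `True` once the published port is reachable.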
diff --git a/src/ChatUI_streamlit/app.py b/src/ChatUI_streamlit/app.py
@@ -5,45 +5,44 @@
 import os
 import time
 
+
+
 # App title
 if 'page_config_set' not in st.session_state:
-    st.set_page_config(page_title="RTDIP PipeLine Chatbot")
+    st.set_page_config(page_title="RTDIP Pipeline Chatbot")
     st.session_state['page_config_set'] = True
 
+# Use HTML/CSS to position the title and GitHub link on the same line
+st.markdown(
+    '''
+    <div style="display: flex; justify-content: space-between; align-items: center;">
+        <div style="margin-top: -70px; margin-left: -180px;"><h2>RTDIP Pipeline Chatbot</h2></div>
+        <div style="margin-top: -70px; "><a href="https://github.com/rtdip/core/tree/develop"><img src="https://img.shields.io/badge/GitHub-Repo-blue?logo=github"></a></div>
+    </div>
+    ''', unsafe_allow_html=True)
+
+
 # Replicate Credentials
-with st.sidebar:
-    st.title('RTDIP Pipeline Generation Chatbot')
-    openai_api_key = st.text_input('Enter OpenAI API Key:', type='password')
+api_key_container = st.empty()
+openai_api_key = api_key_container.text_input('Enter OpenAI API Key:', type='password')
 
 # Check if OpenAI API Key is entered
 if openai_api_key:
-    # Store the API key in the session state or environment variable
-    st.session_state['OPENAI_API_KEY'] = openai_api_key
-    os.environ['OPENAI_API_KEY'] = openai_api_key
-    st.success('API Key stored!')
+        # Store the API key in the session state and as an environment variable
+        st.session_state['OPENAI_API_KEY'] = openai_api_key
+        os.environ['OPENAI_API_KEY'] = openai_api_key
+        success_message = st.success('API Key stored!')
+        # Hide the success message and the input field (time.sleep(0) adds no delay)
+        time.sleep(0)
+        success_message.empty()
+        api_key_container.empty()
 else:
-    st.warning('Please enter your OpenAI API Key to proceed.')
-
-
+        st.warning('Invalid OpenAI API Key. Please enter a valid key.')
+        
 # Store LLM generated responses
 if "conversations" not in st.session_state.keys():
     st.session_state.conversations = [{"title": "Default Conversation", "messages": [{"role": "assistant", "content": "How may I assist you today?"}]}]
 
-# Chat history on the left
-st.sidebar.subheader('Chat History')
-
-# Button to load previous conversations
-if st.sidebar.button('Load Previous Conversations'):
-    st.sidebar.text('Select a conversation to open:')
-    selected_conversation = st.sidebar.selectbox('', range(len(st.session_state.conversations)), format_func=lambda x: st.session_state.conversations[x]["title"])
-
-    # Display the selected conversation
-    conversation = st.session_state.conversations[selected_conversation]
-    for message in conversation["messages"]:
-        with st.expander(conversation["title"]):
-            with st.chat_message(message["role"]):
-                st.write(message["content"])
-
 # Display or clear chat messages
 for conversation in st.session_state.conversations:
     for message in conversation["messages"]:
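
In the API key check in the hunk above, the `else` branch runs whenever the input field is empty, so the "Invalid OpenAI API Key" warning also appears before the user has typed anything. One possible way to separate "missing" from "malformed" is sketched below. The helper `check_api_key` is hypothetical and not part of this commit, and the `sk-` prefix test is only an assumption about the usual key format; only a real request to the API can confirm that a key is valid.

```python
from typing import Optional, Tuple


def check_api_key(key: Optional[str]) -> Tuple[bool, str]:
    """Return (ok, message) for a user-supplied key string.

    This is a format check only; it never contacts the OpenAI API.
    """
    cleaned = (key or "").strip()
    if not cleaned:
        # Nothing entered yet: ask for the key instead of calling it invalid.
        return False, "Please enter your OpenAI API Key to proceed."
    if not cleaned.startswith("sk-"):
        # Assumed prefix convention; adjust if the key format differs.
        return False, "Invalid OpenAI API Key. Please enter a valid key."
    return True, "API Key stored!"
```

In the app, the returned message could be passed to `st.success` when `ok` is true and to `st.warning` otherwise, keeping the Streamlit calls separate from the check itself.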
@@ -52,7 +51,7 @@
 
 def clear_chat_history():
     st.session_state.conversations = [{"title": "Default Conversation", "messages": [{"role": "assistant", "content": "How may I assist you today?"}]}]
-st.sidebar.button('Clear Chat History', on_click=clear_chat_history)
+#st.sidebar.button('Clear Chat History', on_click=clear_chat_history)
 
 
 # User-provided prompt
@@ -78,7 +77,6 @@ def clear_chat_history():
             with st.spinner("Generating..."):
                 response = RAG.run(prompt)
                 end_time = time.time()  # to calculate the time taken to generate the response
-
                 placeholder = st.empty()
                 full_response = ''
                 for item in response: