Commit 8522acc

Merge pull request #57 from amosproj/feature-branch
UI Refinement, Efficient Testing, and Documentation
2 parents 6ce553e + 0b17d52 commit 8522acc

4 files changed

Lines changed: 130 additions & 30 deletions

Queries Classification

Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
1+
Testing Queries (Based on the Bot answers):
2+
3+
We worked through the catalog queries found in the file pairs.py and classified them into four levels:
4+
5+
Level 1: Easy - Approximately the exact same code.
6+
Level 2: Medium - More than 50% of lines are the same.
7+
Level 3: Hard - Less than 50% similarity but with the same functionality.
8+
Level 4: Undetermined - Sometimes the bot provides code (not necessarily correct), sometimes not.
9+
10+
11+
12+
Easy level queries:
13+
14+
Q1: I would like to use RTDIP components to read from an eventhub using 'connection string' as the connection string, and 'consumer group' as the consumer group, transform using binary to string, and edge x transformer then write to delta.
15+
16+
Q18: Read customer purchase history from a Parquet file, perform a customer segmentation analysis, and save the segments to a Delta Lake.
17+
18+
19+
Medium level queries:
20+
21+
Q2: I need to read data from Kafka using a specific bootstrap server and topic, then apply a JSON parser, and finally write the results to a Hive table.
22+
23+
Q9: Load sales data from an FTP (file transfer protocol) server, perform currency conversion, and append the results to an existing Parquet file.
24+
25+
Q14: Access weather data stored in an HDFS cluster, normalize temperature readings, and store the results in an Elasticsearch index.
26+
27+
Q19: Aggregate financial transaction data from a SQL database, calculate the monthly average transaction amount, and store the results in a Delta Lake.
28+
29+
Q20: Fetch log data from an Elasticsearch index, filter logs with error severity, and archive them in a Delta Lake.
30+
31+
32+
33+
Hard level queries:
34+
35+
Q3: Fetch sensor data from an Azure Blob Storage in CSV format, aggregate the data on sensor ID, and save it to a SQL database.
36+
37+
Q4: Stream data from an MQTT broker, filter out readings below a threshold value, and store the data in Elasticsearch.
38+
39+
Q6: Retrieve temperature data from a REST API, normalize the data, and write it into a MongoDB collection.
40+
41+
Q7: Connect to a Google Cloud Storage, download logs in JSON format, conduct sentiment analysis, and then store the results in a Google BigQuery table.
42+
43+
Q8: Stream Twitter data using API credentials, extract hashtags from tweets, and save the data into a Cassandra database.
44+
45+
Q10: Connect to an IoT device using MQTT protocol, apply a low-pass filter to sensor readings, and upload the filtered data to an InfluxDB instance.
46+
47+
Q12: Aggregate temperature and humidity data from a CSV file stored in an Azure Data Lake, calculate average values per day, and upload to a Snowflake database.
48+
49+
Q13: Extract stock market data from a REST API, calculate moving averages, and save the data in an Amazon Redshift cluster.
50+
51+
Q16: Stream social media data from a JSON file, deduplicate the entries based on user ID, and store the results in a Delta Lake.
52+
53+
Q21: Read weather data from a RESTful API, convert temperature from Celsius to Fahrenheit, and store the results in a JSON file.
54+
55+
Q22: Connect to a MySQL database, retrieve order data, group by product category, and insert the grouped data into a new table.
56+
57+
Q23: Extract text data from a series of PDF files stored in an SFTP server, perform named entity recognition, and index the entities in an Apache Solr collection.
58+
59+
60+
Undetermined level queries:
61+
62+
Q5: Import financial data from an S3 bucket in Parquet format, apply a standard scaler transformation, and then upload it to a Redshift database.
63+
64+
Q11: Read customer feedback from a Google Sheets document, apply sentiment analysis, and store the results in a PostgreSQL database for further analysis.
65+
66+
Q15: Load sales records from a MongoDB collection, filter out records with sales below $500, and export the data to a CSV file.
67+
68+
Q17: Load IoT sensor data from a CSV file, apply a smoothing filter to the readings, and write to a Delta Lake for time-series analysis.
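
The four levels defined at the top of this file could be approximated in code. The following is a minimal sketch under stated assumptions, not the team's actual procedure: it assumes similarity is measured as the ratio of matching non-blank lines (using Python's `difflib.SequenceMatcher`), that "approximately the exact same code" means a ratio of at least 0.9, and that functional equivalence is judged by a reviewer and passed in as a flag. The names `line_similarity`, `classify_answer`, and `EASY_THRESHOLD` are hypothetical and do not appear in the repository.

```python
from difflib import SequenceMatcher
from typing import Optional

EASY_THRESHOLD = 0.9  # assumption: stands in for "approximately the exact same code"


def line_similarity(reference: str, generated: str) -> float:
    """Return a 0.0-1.0 similarity ratio over stripped, non-blank lines."""
    ref_lines = [ln.strip() for ln in reference.splitlines() if ln.strip()]
    gen_lines = [ln.strip() for ln in generated.splitlines() if ln.strip()]
    if not ref_lines and not gen_lines:
        return 1.0
    return SequenceMatcher(None, ref_lines, gen_lines).ratio()


def classify_answer(reference: str,
                    generated: Optional[str],
                    same_functionality: bool = True) -> str:
    """Map a bot answer to one of the four levels described in this file."""
    if generated is None:
        # The bot returned no code for this query.
        return "Level 4: Undetermined"
    ratio = line_similarity(reference, generated)
    if ratio >= EASY_THRESHOLD:
        return "Level 1: Easy"
    if ratio > 0.5:
        return "Level 2: Medium"
    if same_functionality:
        # A ratio of exactly 0.5 lands here; the source leaves that case open.
        return "Level 3: Hard"
    # Assumption: dissimilar code that is also not equivalent is undetermined.
    return "Level 4: Undetermined"
```

A reviewer would still need to supply `same_functionality` by hand, since line overlap alone cannot show that two pipelines behave the same way.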

README.md

Lines changed: 36 additions & 2 deletions
@@ -13,7 +13,7 @@ blocks to perform the data processing. Chat history and generated output should
1313
## Meet the Team
1414
Scrum Master: [@SaraElBrak](https://github.com/SaraElBrak)
1515
Product Owners: [@AviKatziuk](https://github.com/AviKatziuk), [@ceciliabetb](https://github.com/ceciliabetb)
16-
Software Developerss: [@lyndanajjar](https://github.com/lyndanajjar), [@bergzain](https://github.com/bergzain), [@Obismadi99](https://github.com/Obismadi99), [@Nahrain1](https://github.com/Nahrain1)
16+
Software Developers: [@lyndanajjar](https://github.com/lyndanajjar), [@bergzain](https://github.com/bergzain), [@Obismadi99](https://github.com/Obismadi99), [@Nahrain1](https://github.com/Nahrain1)
1717

1818
The planning document of the team is found [here](https://docs.google.com/spreadsheets/d/1m1z2m_p6k0ATw0RVNXJMbDp-RrOOPxpu0c3PPCtrwBI/edit#gid=6)
1919

@@ -22,17 +22,51 @@ Ensure that you have installed:
2222

2323
* Python version 3.11 or higher
2424
* Docker Desktop
25+
* OpenAI API Key
2526

26-
To install the required dependencies for this projecdct, please create a virtual environment and run:
27+
## Build Process
28+
To get started with this project, follow these steps:
29+
30+
* Clone the GitHub repository
31+
```
32+
git clone https://github.com/amosproj/amos2023ws05-pipeline-config-chat-ai.git
33+
```
34+
* Install Dependencies
35+
To set up the required dependencies, create a virtual environment and run the following command:
2736
```
2837
pip install -r requirements.txt
2938
```
39+
* Run the Application with Docker
40+
Navigate to the `src` folder by using:
3041

42+
```
43+
cd src
44+
```
45+
Ensure that the Docker daemon is running before proceeding. Follow these steps to run the application:
46+
Step 1: Build the Docker Image
47+
Execute the following command to build the Docker image. Replace `<your-image-name>` with your chosen image name:
3148

49+
```
50+
docker build -t <your-image-name> .
51+
```
52+
Step 2: Run the Docker Container
53+
Launch the container using the following command, specifying port 8501 for Streamlit:
3254

55+
```
56+
docker run -dp 8501:8501 <your-image-name>
57+
```
58+
59+
Once the container is running, open the application by clicking the link shown in the `Ports` column of Docker Desktop. The link reflects the `8501:8501` port mapping set when the container was launched, so the application is served on port 8501 of the host.
60+
61+
_Note: Remember to replace `<your-image-name>` with the actual name you assigned to your Docker image._
3362

63+
* Accessing the Chatbot Application
3464

65+
Open your web browser and navigate to the presented link. The Chatbot application will be displayed, prompting you to input your OpenAI API Key. Engage in conversations by posing RTDIP-oriented questions and explore the capabilities of the application.
3566

67+
<div style="text-align:center">
68+
<img src="https://github.com/amosproj/amos2023ws05-pipeline-config-chat-ai/raw/feature-branch/UI.png" alt="UI" width="500"/>
69+
</div>
3670

3771

3872
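
The two Docker steps in the README diff above could also be scripted. The following is a small sketch, assuming Python is available on the host: it only assembles the `docker build` and `docker run` commands exactly as the README states them and rejects the unreplaced placeholder. The helper name `docker_commands` and the image name `rtdip-chatbot` are hypothetical.

```python
import shlex
from typing import List


def docker_commands(image_name: str, port: int = 8501) -> List[List[str]]:
    """Return the README's build and run commands as argument lists."""
    if not image_name or image_name.startswith("<"):
        raise ValueError("replace <your-image-name> with a real image name")
    build = ["docker", "build", "-t", image_name, "."]
    # -d runs the container detached, -p maps host port to container port.
    run = ["docker", "run", "-dp", f"{port}:{port}", image_name]
    return [build, run]


if __name__ == "__main__":
    # Print the commands; run them from the src folder, as the README says.
    for command in docker_commands("rtdip-chatbot"):
        print(shlex.join(command))
```

Each list can be passed to `subprocess.run(command, check=True)` from inside the `src` folder if the commands should be executed rather than printed.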

UI.png

37.8 KB

src/ChatUI_streamlit/app.py

Lines changed: 26 additions & 28 deletions
@@ -5,45 +5,44 @@
55
import os
66
import time
77

8+
9+
810
# App title
911
if 'page_config_set' not in st.session_state:
10-
st.set_page_config(page_title="RTDIP PipeLine Chatbot")
12+
st.set_page_config(page_title="RTDIP Pipeline Chatbot")
1113
st.session_state['page_config_set'] = True
1214

15+
# Use HTML/CSS to position the title and GitHub link on the same line
16+
st.markdown(
17+
'''
18+
<div style="display: flex; justify-content: space-between; align-items: center;">
19+
<div style="margin-top: -70px; margin-left: -180px;"><h2>RTDIP Pipeline Chatbot</h2></div>
20+
<div style="margin-top: -70px; "><a href="https://github.com/rtdip/core/tree/develop"><img src="https://img.shields.io/badge/GitHub-Repo-blue?logo=github"></a></div>
21+
</div>
22+
''', unsafe_allow_html=True)
23+
24+
1325
# Replicate Credentials
14-
with st.sidebar:
15-
st.title('RTDIP Pipeline Generation Chatbot')
16-
openai_api_key = st.text_input('Enter OpenAI API Key:', type='password')
26+
api_key_container = st.empty()
27+
openai_api_key = api_key_container.text_input('Enter OpenAI API Key:', type='password')
1728

1829
# Check if OpenAI API Key is entered
1930
if openai_api_key:
20-
# Store the API key in the session state or environment variable
21-
st.session_state['OPENAI_API_KEY'] = openai_api_key
22-
os.environ['OPENAI_API_KEY'] = openai_api_key
23-
st.success('API Key stored!')
31+
# Store the API key in the session state
32+
st.session_state['OPENAI_API_KEY'] = openai_api_key
33+
os.environ['OPENAI_API_KEY'] = openai_api_key
34+
success_message = st.success('API Key stored!')
35+
# Hide the success message and the input field right away (the sleep duration is 0 s)
36+
time.sleep(0)
37+
success_message.empty()
38+
api_key_container.empty()
2439
else:
25-
st.warning('Please enter your OpenAI API Key to proceed.')
26-
27-
40+
st.warning('Invalid OpenAI API Key. Please enter a valid key.')
41+
2842
# Store LLM generated responses
2943
if "conversations" not in st.session_state.keys():
3044
st.session_state.conversations = [{"title": "Default Conversation", "messages": [{"role": "assistant", "content": "How may I assist you today?"}]}]
3145

32-
# Chat history on the left
33-
st.sidebar.subheader('Chat History')
34-
35-
# Button to load previous conversations
36-
if st.sidebar.button('Load Previous Conversations'):
37-
st.sidebar.text('Select a conversation to open:')
38-
selected_conversation = st.sidebar.selectbox('', range(len(st.session_state.conversations)), format_func=lambda x: st.session_state.conversations[x]["title"])
39-
40-
# Display the selected conversation
41-
conversation = st.session_state.conversations[selected_conversation]
42-
for message in conversation["messages"]:
43-
with st.expander(conversation["title"]):
44-
with st.chat_message(message["role"]):
45-
st.write(message["content"])
46-
4746
# Display or clear chat messages
4847
for conversation in st.session_state.conversations:
4948
for message in conversation["messages"]:
@@ -52,7 +51,7 @@
5251

5352
def clear_chat_history():
5453
st.session_state.conversations = [{"title": "Default Conversation", "messages": [{"role": "assistant", "content": "How may I assist you today?"}]}]
55-
st.sidebar.button('Clear Chat History', on_click=clear_chat_history)
54+
#st.sidebar.button('Clear Chat History', on_click=clear_chat_history)
5655

5756

5857
# User-provided prompt
@@ -78,7 +77,6 @@ def clear_chat_history():
7877
with st.spinner("Generating..."):
7978
response = RAG.run(prompt)
8079
end_time = time.time() # to calculate the time taken to generate the response
81-
8280
placeholder = st.empty()
8381
full_response = ''
8482
for item in response:
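
One detail of the API-key check in the app.py diff above is worth noting: `st.text_input` returns an empty string until the user types something, so the `else` branch runs on first load and shows "Invalid OpenAI API Key" before any key has been entered. The following is a minimal sketch of one way to separate the two cases, assuming the goal is only to distinguish "nothing entered" from "something entered". It does not verify the key against OpenAI. The helper name `key_input_feedback` is hypothetical; the two messages are taken from the diff.

```python
from typing import Tuple


def key_input_feedback(raw_value: str) -> Tuple[str, str]:
    """Map the text-input value to a (level, message) pair.

    The level names match Streamlit functions already used in app.py
    (st.warning, st.success).
    """
    value = raw_value.strip()
    if not value:
        # Nothing entered yet: prompt instead of calling the key invalid.
        return ("warning", "Please enter your OpenAI API Key to proceed.")
    # A non-empty value is stored as-is; real validity is unknown here.
    return ("success", "API Key stored!")
```

In the app, the result could be displayed with `level, message = key_input_feedback(openai_api_key)` followed by `getattr(st, level)(message)`.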
