feat: chat bot foundations with flow retrieval/analysis #550

Open
wants to merge 8 commits into
base: main
from

Contributor

@huntergregory huntergregory commented Jul 18, 2024

Description

Building blocks for an LLM-powered app that retrieves and analyzes network flow logs (and eventually other data sources).

Steps

  1. Retrieve Hubble flow logs from a cluster with Retina:
    a. Port-forward the hubble-relay service.
    b. Get flows over a gRPC connection.
  2. Parse/format the data.
  3. Prompt an LLM to analyze.
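
Steps 2 and 3 can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in this PR: the `Flow` struct is a hypothetical, heavily simplified stand-in for a Hubble flow record, and `ParseFlows`, `Summarize`, and `BuildPrompt` are made-up helper names. The sample records in `main` are invented for the demo.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// Flow is a hypothetical, simplified flow record (assumption).
// A real Hubble flow carries many more fields.
type Flow struct {
	Source      string `json:"source"`
	Destination string `json:"destination"`
	Verdict     string `json:"verdict"`
}

// ParseFlows decodes a JSON array of simplified flow records.
func ParseFlows(data []byte) ([]Flow, error) {
	var flows []Flow
	if err := json.Unmarshal(data, &flows); err != nil {
		return nil, fmt.Errorf("parsing flows: %w", err)
	}
	return flows, nil
}

// Summarize counts flows per (source, destination, verdict) and returns
// one line per group, sorted so the output is deterministic.
func Summarize(flows []Flow) string {
	counts := make(map[string]int)
	for _, f := range flows {
		key := fmt.Sprintf("%s -> %s [%s]", f.Source, f.Destination, f.Verdict)
		counts[key]++
	}
	keys := make([]string, 0, len(counts))
	for k := range counts {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	for _, k := range keys {
		fmt.Fprintf(&b, "%s x%d\n", k, counts[k])
	}
	return b.String()
}

// BuildPrompt wraps the summary in an instruction for the language model.
// The wording is illustrative only.
func BuildPrompt(summary string) string {
	return "Analyze this summary of network flow logs and point out connectivity issues:\n" + summary
}

func main() {
	// Invented sample data for the demo.
	raw := []byte(`[
		{"source": "pod-a", "destination": "pod-b", "verdict": "DROPPED"},
		{"source": "pod-a", "destination": "pod-b", "verdict": "DROPPED"},
		{"source": "pod-a", "destination": "pod-c", "verdict": "FORWARDED"}
	]`)
	flows, err := ParseFlows(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(BuildPrompt(Summarize(flows)))
}
```

Collapsing repeated flows into counted lines keeps the prompt short, which matters when a cluster produces thousands of near-identical records.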

Architecture

All files are in the ai/ folder.

  • main.go acts as the current entry point.
  • chat/ package will run a chat loop:
    • Get user input.
    • Choose the scenario to run via LLM call.
    • Run the scenario with parameters provided by LLM.
  • scenarios/ package holds scenarios such as "drops" and "dns". Each scenario has:
    • A definition to help decide when to choose this scenario.
    • Logic to run the scenario.
  • retrieval/ package holds logic for gathering data.
  • lm/ package provides an abstraction for a language model.

Example Output

~/retina/ai$ go run main.go
INFO[0000] starting app...
INFO[0000] retrieved kubeconfig and clientset
INFO[0000] initialized Azure OpenAI model
INFO[0000] handling drops scenario...                    component=chat scenario=drops
INFO[0000] initialized grpc client                       component=flow-retriever
INFO[0018] stopped port-forward                          component=flow-retriever
INFO[0018] saving flows to JSON                          component=flow-retriever
INFO[0019] observed flows                                component=chat scenario=drops
INFO[0028] analyzed flows                                component=chat scenario=drops
Based on the provided "summary of network flow logs," the primary issue appears to be a significant number of dropped connections involving Pods with the prefix "kapinger-bad." Here are the key observations:

1. **Dropped Connections**:
   - Multiple connections from "kapinger-bad" Pods to "kapinger-good" Pods are being dropped. Examples include:
     - `kapinger-bad-7778f55bf8-xhkr5 -> kapinger-good-8468b88556-ccjf5`
     - `kapinger-bad-7778f55bf8-rwqlw -> kapinger-good-8468b88556-8gml7`
     - `kapinger-bad-7778f55bf8-4z6bv -> kapinger-good-8468b88556-ccjf5`
     - `kapinger-bad-7778f55bf8-qdx22 -> kapinger-good-8468b88556-88l77`
     - And many more similar connections.

2. **Successful Connections**:
   - There are successful connections between "kapinger-bad" and "kapinger-good" Pods, indicating that not all traffic is being dropped. However, the presence of dropped connections suggests intermittent issues or specific network policies affecting certain flows.

5. **Network Policies**:
   - The dropped connections might be due to network policies that are selectively allowing or denying traffic between certain Pods. It would be beneficial to review the network policies applied to the "kapinger-bad" and "kapinger-good" namespaces or Pods.

6. **Pod-to-Pod Communication**:
   - Ensure that the network policies, if any, are correctly configured to allow the necessary traffic between the "kapinger-bad" and "kapinger-good" Pods.

7. **Firewall Rules**:
   - Check for any firewall rules or security groups that might be affecting the traffic between these Pods.

8. **Resource Constraints**:
   - Verify if there are any resource constraints or issues on the nodes hosting the "kapinger-bad" Pods that might be causing network drops.

In summary, the primary issue is the dropped connections between "kapinger-bad" and "kapinger-good" Pods. Reviewing and adjusting network policies, firewall rules, and resource allocations should help resolve these connectivity issues.

Related Issue

#439

Checklist

  • I have read the contributing documentation.
  • I signed and signed-off the commits (git commit -S -s ...). See this documentation on signing commits.
  • I have correctly attributed the author(s) of the code.
  • I have tested the changes locally.
  • I have followed the project's style guidelines.
  • I have updated the documentation, if necessary.
  • I have added tests, if applicable.

@huntergregory huntergregory force-pushed the huntergregory/ai branch 6 times, most recently from 5c90807 to 1aff759 on July 19, 2024 17:49
@huntergregory huntergregory changed the title from "feat: automated data retrieval and analysis" to "feat: chat bot foundations with flow retrieval/analysis" on Aug 5, 2024
@huntergregory huntergregory marked this pull request as ready for review August 5, 2024 21:37
@huntergregory huntergregory requested a review from a team as a code owner August 5, 2024 21:37

github-actions bot commented Sep 6, 2024

This PR will be closed in 7 days due to inactivity.

@github-actions github-actions bot added the meta/waiting-for-author label (Blocked and waiting on the author) Sep 6, 2024
@huntergregory huntergregory removed the meta/waiting-for-author label (Blocked and waiting on the author) Sep 6, 2024

github-actions bot commented Oct 6, 2024

This PR will be closed in 7 days due to inactivity.

@github-actions github-actions bot added the meta/waiting-for-author label (Blocked and waiting on the author) Oct 6, 2024
Labels
  • area/data-ingestion-and-visualization
  • meta/waiting-for-author (Blocked and waiting on the author)
  • type/enhancement (New feature or request)