-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathIntro
105 lines (87 loc) · 6.04 KB
/
Intro
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
I believe this project showcases commitment to technological innovation and dedication to making a positive impact on society, particularly for young minds.
Project Overview:
The project revolves around harnessing the power of artificial intelligence to create a safe and engaging bedtime story experience for children. The primary goals of this project are as follows:
- Adversarial AI Control: Developing sophisticated algorithms and machine learning models to detect and mitigate adversarial attacks on AI systems. This ensures that the content delivered to children is safe and appropriate.
- Interactive Storytelling: Creating an interactive and immersive storytelling platform that adapts to the child's preferences, age, and interests, offering a personalized and magical bedtime experience.
- Child Safety: Implementing robust safety measures to protect children from harmful or inappropriate content while maintaining a fun and enjoyable experience.
Technological Innovation:
To achieve these objectives, we are leveraging cutting-edge technologies such as natural language processing (NLP), computer vision, deep learning, and reinforcement learning. It also involves a multi-step methodology that includes data collection, adversarial AI detection, and the development of a storytelling model.
Methodology:
Data Collection and Preprocessing-
- Utilize web scraping and curated databases to collect a wide variety of children's stories.
- Clean and preprocess the data, removing any inappropriate or adversarial content.
- Anonymize and tokenize the text for model training. (Built-in tokenizer)
Adversarial AI Detection-
- Train machine learning models to detect anomalous patterns in AI-generated content integrating Isolation Forest, One-Class SVM, or autoencoders.
- Implement real-time monitoring and alert systems to respond to adversarial attempts with implementation of feedback loop to investigate flagged content.
- Develop countermeasures to neutralize adversarial AI effects.
Storytelling Model Development-
- Implement a neural network-based architecture such as RNN or transformers for generating bedtime stories.
- Fine-tune the model using the preprocessed dataset on filters and constraints with regards to ensure quality and safety.
- Continuously evaluate and improve the model's storytelling capabilities with loss functions and optimization techniques.
Diffusion processing : forward and reverse processing
This project demonstrates technical prowess and addressing complex challenges in the field of artificial intelligence which is ethics and the practice of using AI.
Features of security:
- RLHF
- prediction of outcomes (vurneability detection)
- display restrictions(data abstractions)
- find trends on dataset
- track back of a neural network, find the agent and kill it. ( variable o/p possibility ; set of entire outcomes is completetely erased when killing a neuron in the network )
- ensemble model : --
security on AI (research documentation):
- Compartmentalizing AI Process: seperating and restricitng AI workspaces
- zero trust model: verfiying system user continously, authontication and authorization
- Behaviour analysis / sentimental analysis
- machine learning for anomaly detection
Detailed explanation:
Importing clean dataset, training with carefully curated dataset
Data augumentation - small perturbation for model resilience
training on both clean and unclean language
checking input for anomalies or malicios content -
limiting the input
RLHF - reniforcement learning with human feedback
Revised explination:
AI model trained on adversarial inputs and perplexity filtering, scans excisting business AI mdoels and checks for vulnerabilities and fixes them.
all calls of inputs for the business AI goes through the excisting APi for the base model
providing security infrastructure all through out
ARCHITECTURE:
{ First layer of neural network : using FAT ( Friendly adversarial training ) architecture with filtering inputs and outliers removal.
Security Layer:
Model behaviors testing and Sentimental analysis. - mid model
Differential Privacy for adding controlled noise, encrypting input data,
Last layer : Content filtering and regenerating , restrictions based on age.}
Regulations and Standards to be covered:
- ISO / IEC 27001 : Information securoty management system
- ISO / IEC FDIS 5338, ISO / IEC / IEEE 12207, ISO / IEC/ IEEE 15288 : Software lifecycle process ( broadly can be applicable for AI with few modifications )
- SAMM : Software assurance maturity model
-
extensive thoughts:
AI rely almost on data, a rely factor on any aspect is always a vurnability. Why not AI create its own data for training.
just give out the real world parameters and situations with which it can generate all possible outcomes and set it up as its own data then the mother AI can train its own sub-models.
Controductive thought: If security is implemented to aviod bad decsions all together, doesn't some good things happen with bad decisions.
(refer note 1)
substintianal conclusion: bad desicions come with bad data
Product Furture overview:
Controlling the predictions outcomes of a AI model, similar to the security features implemented in a gen llm model.
talk with pediatrician on development of children’s mental capabilities
ARCH :
'''mermaid
graph TD;
A-->B;
B-->C;
'''
reference:
https://llm-attacks.org/zou2023universal.pdf
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3499858/
https://www.hindawi.com/journals/scn/2023/8691095/
https://towardsdatascience.com/adversarial-attacks-in-machine-learning-and-how-to-defend-against-them-a2beed95f49c
https://llm-attacks.org/
https://arxiv.org/pdf/2303.04381.pdf
https://en.wikipedia.org/wiki/Ensemble_learning
https://www.geeksforgeeks.org/top-10-python-libraries-for-cybersecurity/
https://www.bsi.bund.de/SharedDocs/Downloads/EN/BSI/KI/Security-of-AI-systems_fundamentals.pdf?__blob=publicationFile&v=4
https://lilianweng.github.io/posts/2023-10-25-adv-attack-llm/
https://arxiv.org/pdf/2203.02155
https://arxiv.org/pdf/2309.00614
https://arxiv.org/abs/2303.12712
https://github.com/niconi19/LLM-Conversation-Safety/tree/main