secure-ai-tooling/risk-map/tables/components-full.md at main · cosai-oasis/secure-ai-tooling

id	title	description	category	subcategory	edges
componentAgentInputHandling	Input Handling	An agent’s interaction with the world begins at the User or Application, which serves as the interface for collecting both explicit user instructions and passively collected contextual data from its environment. This blend of inputs creates a primary security challenge of reliably distinguishing trusted commands from the controlling user versus potentially untrusted information from other sources. An agent application processes explicit user instructions, which can be given directly (synchronously) like a typed command, or be configured to execute automatically when a specific event occurs (asynchronously). It also gathers implicit contextual inputs: data that isn’t a direct command but is passively collected from the environment, such as sensor readings, application state, tool call responses, or the content of recently opened documents. Input Handling is responsible for processing and understanding these inputs before they are sent to the agent’s reasoning core. This handoff is a critical security juncture, as the perception layer must reliably distinguish trusted user commands from untrusted data to prevent manipulation of the agent’s core logic.	componentsApplication	componentsAgent	To: componentReasoningCore From: componentAgentUserQuery componentAgentSystemInstruction componentTheModel componentApplication
componentAgentOutputHandling	Output Handling	The final step in an agent’s workflow is response rendering, the process of formatting an AI agent’s generated output for display and interaction within a user application. This stage is a critical security boundary because it involves taking dynamic content from the agent and displaying it within the trusted context of a user’s application, such as a web browser or mobile application. Flaws in this process can allow malicious content generated by a compromised agent to be executed by the application, leading to significant security breaches. Agents often produce content in a universal format like Markdown, which is then interpreted and rendered by the specific client application. If this output isn’t properly sanitized according to the content type, it can create severe vulnerabilities. For example, unsanitized output can enable data exfiltration when content-embedded resources like images are automatically loaded by the application, implicitly passing sensitive information to an attacker's server via the resource's URL. Similarly, improper sanitization can lead to cross-site scripting (XSS) attacks.	componentsApplication	componentsAgent	To: componentApplication componentTheModel From: componentReasoningCore
componentAgentSystemInstruction	Agent System Instructions	These define an agent’s capabilities, permissions, and limitations, such as the actions it can take and the tools it is allowed to use. For security, it’s critical to unambiguously separate these instructions from user data and other inputs, often using special control tokens as a defense against prompt injection attacks.	componentsApplication	componentsAgent	To: componentAgentInputHandling
componentAgentUserQuery	Agent User Query	These contain the specific details of a user’s request after being processed. The query is then combined with system instructions and other contextual data, like agent memory or external information, to create a single, structured prompt for the reasoning core to process.	componentsApplication	componentsAgent	To: componentAgentInputHandling
componentApplication	Application	The application, product, or feature that uses an AI model for functionality. These applications might be directly user-facing, as in the case of a customer service chatbot, or the “user” might be a service within an organization, querying the model to power an upstream process. If an application has the ability to execute tools on behalf of its user, it is sometimes referred to as an Agent.	componentsApplication	componentsApplicationCore	To: componentApplicationOutputHandling componentAgentInputHandling From: componentApplicationInputHandling componentAgentOutputHandling
componentApplicationInputHandling	Input Handling	Input handling components filter, sanitize, and protect against potentially malicious inputs, whether from a user or more generally from anything outside the trusted system. Input handling acts as a control against numerous risks and is an area ripe for more research and development.	componentsApplication	componentsApplicationCore	To: componentApplication From: componentTheModel
componentApplicationOutputHandling	Output Handling	Similar to input handling, output handling components filter, sanitize, and protect against unwanted, unexpected or dangerous outputs from a model. Output handling is a major line of defense against various risks and an area primed for more development.	componentsApplication	componentsApplicationCore	To: componentTheModel From: componentApplication
componentDataFilteringAndProcessing	Data Filtering and Processing	The processes of cleaning, transforming, and preparing raw data from various sources to make it suitable for training. This may include labeling data, removing duplicates or errors, and even generating new synthetic data to enhance the model's learning.	componentsInfrastructure	componentsData	To: componentTrainingData From: componentDataSources
componentDataSources	Data Sources	The original sources or repositories from which data is gathered for potential use in training an AI model. These can include databases, APIs, web scraping, or even sensor data. The quality and diversity of data sources significantly impact the model's capabilities.	componentsInfrastructure	componentsData	To: componentDataFilteringAndProcessing
componentDataStorage	Data Storage Infrastructure	Storage for training data. Training data is stored from ingestion through filtering and usage during training.	componentsInfrastructure	componentsData	To: componentModelTrainingTuning From: componentTrainingData
componentMemory	Model Memory	Memory allows a model or agent to retain context and learn facts across interactions. Memory implementations may result in additional security and data risk exposures requiring mitigation. Examples include: Persistent attacks: Malicious data stored in memory could enable ongoing attacks against a user. Data loss through improper isolation: Sensitive data may be disclosed due to inadequate memory isolation between different users and contexts. Unintended data transfer: Sensitive information may be inappropriately transferred to third parties through undesired or unexpected tool calls.	componentsModel	componentsOrchestration	To: componentOrchestrationOutputHandling From: componentOrchestrationInputHandling
componentModelEvaluation	Model Evaluation	The process of testing the model against new data to see how well it performs (evaluation). Evaluation happens in two stages: during the training process, when each checkpointed update to the model is evaluated, and after the model is trained, to assess how well it performs at its intended purpose.	componentsModel	componentsModelTraining	To: componentModelTrainingTuning From: componentTheModel
componentModelFrameworksAndCode	Model Frameworks and Code	The code and frameworks necessary to train and use a model. Model code defines the model architecture and number and types of layers in the model. Framework code implements the steps for each layer to train and evaluate the model. The framework code is generally necessary not just for training a model, but also required to run inferences (i.e., make predictions) when the model is in use. Usually framework code is shipped separately from the model itself and needs to be installed to use the model.	componentsModel	componentsModelTraining	To: componentModelTrainingTuning
componentModelServing	Model Serving Infrastructure	The systems and processes used to deploy a model in production, making it available to services and applications. Note: Many model consumers use remote models served via API. Those that serve their own models, though, should consider the same Model Serving risks that apply to model creators.	componentsInfrastructure	componentsModelDeployment	To: componentTheModel
componentModelStorage	Model Storage	Storage for the model. Model storage refers to multiple stages in the development process: Local storage during training, in which each checkpoint is stored until overwritten. Published storage, after training is completed and the model is uploaded to a model hub (a centralized model repository). Note: Many model consumers use remote models served by API. Those model consumers that store models themselves, though, should consider the same Model Storage risks that apply to model creators.	componentsInfrastructure	componentsModelDeployment	To: componentTheModel
componentModelTrainingTuning	Training and Tuning	The process of teaching a model to extract the correct patterns and inferences from data by adjusting the probability of a given outcome (training) and adjusting a smaller set of probabilities to tune a model to a specific task (tuning). Given the enormous cost of training, many model creators take a preexisting model and tune it to their needs, by focusing only on the training related to a specific type of task.	componentsModel	componentsModelTraining	To: componentTheModel From: componentModelEvaluation componentModelFrameworksAndCode componentDataStorage
componentOrchestrationInputHandling	Input Handling	Orchestration input handling is responsible for validating, sanitizing, and normalizing all data entering the system from external sources before it reaches core orchestration logic.	componentsModel	componentsOrchestration	To: componentTools componentMemory componentRAGContent From: componentTheModel componentReasoningCore
componentOrchestrationOutputHandling	Output Handling	Orchestration output handling is responsible for validating, sanitizing, and safely formatting data as it exits the system to external or downstream components. This control ensures outbound data meets defined schemas, strips sensitive information that shouldn't be exposed, prevents injection attacks by properly encoding outputs for their destination context (such as HTML encoding for web responses or parameterization for database queries), and enforces data classification policies.	componentsModel	componentsOrchestration	To: componentReasoningCore componentTheModel From: componentTools componentMemory componentRAGContent
componentRAGContent	Retrieval Augmented Generation & Content	Content for Retrieval-Augmented Generation (RAG) provides the agent with curated knowledge to ground its responses and improve accuracy. The main security risk is data poisoning, where an attacker corrupts this knowledge source to manipulate the agent's output.	componentsModel	componentsOrchestration	To: componentOrchestrationOutputHandling From: componentOrchestrationInputHandling
componentReasoningCore	Agent Reasoning Core	The core of an agent’s functionality is its ability to reason about a user’s goal and create a plan to achieve it. The reasoning core processes system instructions, user queries, and contextual information to generate a sequence of actions. The actions, or tool calls, allow the agent to affect the real world—interacting with external systems, retrieving new information, or making changes to data and resources. The reasoning core typically consists of one or more models—possibly separate models for the reasoning and then planning steps, or potentially one large model able to do both. The process of planning is often iterative, taking place in a “reasoning loop” where the plan is refined based on new information or the results of previous actions. This iterative nature, combined with the ingestion of external data, creates a vulnerability to indirect prompt injection, where adversarially crafted information can manipulate the agent's planning process. The complexity of plans determines the agent’s level of autonomy, which can range from selecting a predefined workflow to dynamically orchestrating multi-step actions. This level of autonomy directly governs the potential severity of a security failure—the more an agent can do on its own, the greater the risk from manipulation or misalignment. This risk can be, at least partially, mitigated through guardrails that constrain actions taken by an agent; for example, by making certain actions subject to mandatory user confirmation. However, this can in turn result in limitations on the agent's autonomy, thus requiring a careful tradeoff between autonomy and security.	componentsApplication	componentsAgent	To: componentAgentOutputHandling componentOrchestrationInputHandling From: componentAgentInputHandling componentOrchestrationOutputHandling
componentTheModel	The Model	A pairing of code and weights, created with data during a training process. In the CoSAI Risk Map, the model is represented as the result of the output of the Data Components being trained, stored, and served using the Infrastructure Components. A model is ultimately useful when deployed in applications, using Application Components.	componentsModel	componentsModelCore	To: componentModelEvaluation componentAgentInputHandling componentApplicationInputHandling componentOrchestrationInputHandling From: componentModelTrainingTuning componentModelServing componentModelStorage componentAgentOutputHandling componentApplicationOutputHandling componentOrchestrationOutputHandling
componentTools	External Tools and Services	Tools are the external APIs and services an agent uses to take action in the world, which must be secured with least-privilege permissions. A key risk comes from deceptive descriptions on third-party tools, which can trick the agent into performing unintended, harmful functions.	componentsModel	componentsOrchestration	To: componentOrchestrationOutputHandling From: componentOrchestrationInputHandling
componentTrainingData	Training Data	The final, curated subset of data that is fed into the AI model during the training process. This data is used to adjust the model's internal parameters, enabling it to learn patterns and make predictions or inferences.	componentsInfrastructure	componentsData	To: componentDataStorage From: componentDataFilteringAndProcessing
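
The `edges` column describes a directed graph: every `To:` entry on one row should be mirrored by a `From:` entry on the target row. The sketch below is one possible way to check that property, not tooling taken from the repository. The function names (`parse_edges`, `load_graph`, `find_mismatches`) are hypothetical, and the sample input is three rows from the table above, reduced to the `id` and `edges` columns and assumed to be tab-separated.

```python
# Rows copied from the table above, keeping only the id and edges columns.
# Assumption: columns are separated by a single tab character.
SAMPLE_ROWS = """\
componentDataSources\tTo: componentDataFilteringAndProcessing
componentDataFilteringAndProcessing\tTo: componentTrainingData From: componentDataSources
componentTrainingData\tTo: componentDataStorage From: componentDataFilteringAndProcessing
"""


def parse_edges(cell):
    """Split an edges cell such as 'To: a b From: c' into two lists of ids."""
    edges = {"To": [], "From": []}
    current = None
    for token in cell.split():
        label = token.rstrip(":")
        if token.endswith(":") and label in edges:
            current = label  # switch to collecting ids for this direction
        elif current is not None:
            edges[current].append(token)
    return edges


def load_graph(text):
    """Map each component id to its parsed To/From edge lists."""
    graph = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        component_id, cell = line.split("\t", 1)
        graph[component_id] = parse_edges(cell)
    return graph


def find_mismatches(graph):
    """Return (source, target, problem) tuples for edges declared on one side only.

    Components that are referenced but absent from the graph are skipped,
    so a partial extract of the table does not raise false alarms.
    """
    problems = []
    for node, edges in graph.items():
        for target in edges["To"]:
            if target in graph and node not in graph[target]["From"]:
                problems.append((node, target, "missing From"))
        for source in edges["From"]:
            if source in graph and node not in graph[source]["To"]:
                problems.append((source, node, "missing To"))
    return problems


if __name__ == "__main__":
    graph = load_graph(SAMPLE_ROWS)
    for source, target, problem in find_mismatches(graph):
        print(f"{source} -> {target}: {problem}")
```

Skipping unknown targets is a deliberate choice for partial extracts. When the full table is loaded, a stricter variant could also report ids that never appear in the `id` column.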
