Technical documentation for managing the Job Finding Assistant data architecture.
The system uses two JSON files with distinct responsibilities:
| File | Purpose | Modified By | Audience |
|---|---|---|---|
| job_search_knowledge_base.json | User-specific career data | Users & AI Assistants | Job seekers |
| ai_assistants_system_config.json | System behavior configuration | System administrators | AI engineers |
File: `job_search_knowledge_base.json`
Purpose: Stores personal career information that personalizes AI responses
```jsonc
{
  "metadata": {
    "name": "Job Finding Assistant Knowledge Base",
    "version": "1.2",
    "last_updated": "ISO-8601 timestamp"
  },
  "user_profile": {
    "basic_info": {},             // Stage 1: Career Coach writes
    "social_media_links": {}
  },
  "career_objectives": {},        // Stage 1: Career Coach writes
  "personal_brand": {},           // Stage 2: Personal Brand writes
  "user_personality": {},         // Stage 2: Personal Brand writes
  "go_to_market_strategy": {},    // Stage 3: Market Positioning writes
  "website_configuration": {}     // Stage 4A: Website Generator writes
}
```

File: `ai_assistants_system_config.json`
Purpose: Defines assistant behavior, workflows, and standards
```jsonc
{
  "metadata": {},
  "workflow_architecture": {
    "stages": []  // Defines 5-stage workflow
  },
  "knowledge_base_permissions": {
    // Read/write matrix per assistant
  },
  "communication_standards": {
    // Templates and guidelines
  },
  "platform_constraints": {
    // Platform-specific limits
  }
}
```

When: Section doesn't exist
Who: Authorized assistant per permission matrix
How:
```python
from datetime import datetime

def create_section(kb_data, section_name, content):
    if section_name not in kb_data:
        kb_data[section_name] = content
        kb_data['metadata']['last_updated'] = datetime.now().isoformat()
    return kb_data
```

When: Every assistant initialization
Who: All assistants (read permissions vary)
How:
```python
import json

def read_knowledge_base(file_path):
    try:
        with open(file_path, 'r') as f:
            return json.load(f)
    except FileNotFoundError:
        return None  # Trigger conversational mode
```

When: Assistant completes data gathering
Who: Only authorized assistants
How:
```python
from datetime import datetime

def update_section(kb_data, section_name, updates):
    if has_write_permission(current_assistant, section_name):
        kb_data.setdefault(section_name, {}).update(updates)
        kb_data['metadata']['last_updated'] = datetime.now().isoformat()
    return kb_data
```

Policy: No deletion, only updates
Reason: Preserve audit trail and user data
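The append-only policy can be enforced with a small guard before any section update. The sketch below treats a `None` value as an attempted deletion and refuses it; `ProtectedDeleteError` is an illustrative exception name, not part of the system config.

```python
from datetime import datetime

class ProtectedDeleteError(Exception):
    """Raised when an update would remove an existing key."""

def apply_append_only_update(kb_data, section_name, updates):
    """Apply updates to a section, rejecting any that delete existing keys."""
    section = kb_data.setdefault(section_name, {})
    # Treat None as an attempted deletion and refuse it (append-only policy).
    for key, value in updates.items():
        if value is None and key in section:
            raise ProtectedDeleteError(f"Cannot delete '{section_name}.{key}'")
    section.update(updates)
    kb_data.setdefault('metadata', {})['last_updated'] = datetime.now().isoformat()
    return kb_data
```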
| Assistant | Stage | Read Permissions | Write Permissions |
|---|---|---|---|
| Career Coach | 1 | user_profile, career_objectives | user_profile.basic_info, career_objectives |
| Personal Brand Development | 2 | All sections | personal_brand, user_personality |
| Market Positioning | 3 | All sections | go_to_market_strategy |
| Website Generator | 4A | All sections | website_configuration |
| Job Application & Interview | 4B | All sections | None (read-only) |
| Professional Networking | 4C | All sections | None (read-only) |
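One plausible encoding of this matrix under `knowledge_base_permissions` is sketched below. The key names and the `'*'` wildcard for all-section read access are assumptions, not the confirmed config schema.

```python
# Hypothetical encoding of the permission matrix (key names are assumptions).
KNOWLEDGE_BASE_PERMISSIONS = {
    'career_coach': {
        'read': ['user_profile', 'career_objectives'],
        'write': ['user_profile.basic_info', 'career_objectives'],
    },
    'personal_brand_development': {'read': ['*'], 'write': ['personal_brand', 'user_personality']},
    'market_positioning': {'read': ['*'], 'write': ['go_to_market_strategy']},
    'website_generator': {'read': ['*'], 'write': ['website_configuration']},
    'job_application_interview': {'read': ['*'], 'write': []},   # read-only
    'professional_networking': {'read': ['*'], 'write': []},     # read-only
}

def can_write(assistant_id, field):
    """Check write access against the matrix above."""
    perms = KNOWLEDGE_BASE_PERMISSIONS.get(assistant_id, {})
    return field in perms.get('write', [])
```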
```python
from datetime import datetime

def validate_knowledge_base(kb_data):
    required_fields = ['metadata', 'user_profile']
    # Check required fields
    for field in required_fields:
        if field not in kb_data:
            raise ValidationError(f"Missing required field: {field}")
    # Validate metadata
    if 'version' not in kb_data['metadata']:
        kb_data['metadata']['version'] = '1.1'
    # Validate date format
    try:
        datetime.fromisoformat(kb_data['metadata'].get('last_updated', ''))
    except (TypeError, ValueError):
        kb_data['metadata']['last_updated'] = datetime.now().isoformat()
    return kb_data
```

```python
SCHEMA = {
    'user_profile': {
        'basic_info': {
            'name': str,
            'email': str,
            'primary_location': str
        }
    },
    'career_objectives': {
        'objectives_by_category': dict,
        'timeline_constraints': dict
    },
    'website_configuration': {
        'last_updated': (str, type(None)),
        'target_platform': (str, type(None)),
        'design_preferences': dict,
        'content_sections': dict,
        'customizations': dict,
        'generated_websites': list
    }
}
```

For platforms with file access (e.g., OpenAI GPTs):
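A minimal checker that walks a schema of this shape and verifies types might look like the sketch below. It only checks fields that are present (required-field checks live in `validate_knowledge_base`), and it relies on `isinstance` accepting the `(str, type(None))` tuples used above.

```python
def check_types(data, schema, path=''):
    """Return a list of type mismatches between data and a nested schema dict."""
    errors = []
    for key, expected in schema.items():
        if key not in data:
            continue  # presence is validated elsewhere; only types here
        value = data[key]
        full_path = f"{path}.{key}" if path else key
        if isinstance(expected, dict):
            # Nested schema: recurse if the value is a dict, else report a mismatch.
            if isinstance(value, dict):
                errors.extend(check_types(value, expected, full_path))
            else:
                errors.append(f"{full_path}: expected dict, got {type(value).__name__}")
        elif not isinstance(value, expected):
            errors.append(f"{full_path}: expected {expected}, got {type(value).__name__}")
    return errors
```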
```python
# Direct file operations
kb = load_json('job_search_knowledge_base.json')
config = load_json('ai_assistants_system_config.json')
```

For platforms without file access:

```python
# Conversational state management
kb_state = request_from_user("Please paste your knowledge base")
config = load_default_config()
```

- Lock-free reads: Multiple assistants can read simultaneously
- Sequential writes: Only one assistant writes per session
- Version tracking: Use `last_updated` for conflict detection
- Merge strategy: Latest write wins with user confirmation
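The last two rules can be sketched as a timestamp check gating a latest-write-wins merge. This is illustrative only: `ConflictError` and the `confirm` callback are assumed names, and ISO-8601 timestamps are compared lexicographically, which works because they sort chronologically.

```python
class ConflictError(Exception):
    """Raised when an incoming copy is older than the stored one."""

def merge_with_conflict_check(base_kb, incoming_kb, confirm=lambda: True):
    """Latest-write-wins merge, gated on last_updated and user confirmation."""
    base_ts = base_kb.get('metadata', {}).get('last_updated', '')
    incoming_ts = incoming_kb.get('metadata', {}).get('last_updated', '')
    if incoming_ts < base_ts:
        # Incoming copy is stale: surface the conflict instead of silently losing data.
        raise ConflictError(f"Stale write: {incoming_ts} < {base_ts}")
    if not confirm():
        return base_kb  # user declined the merge; keep the stored copy
    merged = dict(base_kb)
    merged.update(incoming_kb)
    return merged
```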
Purpose: Stores website design preferences and platform selections for portfolio website generation.
Structure:
```json
{
  "website_configuration": {
    "description": "Website design preferences and platform selections",
    "last_updated": "2025-10-01T12:00:00Z",
    "target_platform": "Notion|Eleventy|Jekyll|Astro",
    "design_preferences": {
      "color_scheme": "professional|modern|creative",
      "layout_style": "minimalist|detailed|storytelling",
      "content_focus": "technical|business|balanced"
    },
    "content_sections": {
      "hero": true,
      "mission_vision": true,
      "value_proposition": true,
      "skills": true,
      "projects": true,
      "contact": true
    },
    "customizations": {
      "featured_projects": ["Project 1", "Project 2"],
      "highlighted_skills": ["Skill 1", "Skill 2"],
      "industry_focus": "Healthcare|FinTech|AI"
    },
    "generated_websites": [
      {
        "platform": "Notion",
        "generated_date": "2025-10-01T12:00:00Z",
        "url": "https://notion.site/...",
        "version": "1.0"
      }
    ]
  }
}
```

Access Control:
- Read: All assistants (especially Stage 4B/4C for including website links)
- Write: Only Website Generator (Stage 4A)
- Scope: Limited to the `website_configuration` section only; NEVER modifies `go_to_market_strategy`, `personal_brand`, or other sections
Safe Operations:
```python
from datetime import datetime

def update_website_config(kb_data, config_updates):
    """Safely update website configuration"""
    # Validate assistant has permission
    if current_assistant != 'website_generator':
        raise PermissionError("Only Website Generator can modify website_configuration")
    # Update only the website_configuration section
    kb_data['website_configuration'].update(config_updates)
    kb_data['website_configuration']['last_updated'] = datetime.now().isoformat()
    # Preserve all other sections unchanged
    return kb_data
```

Validation Rules:
- `target_platform` must be one of: "Notion", "Eleventy", "Jekyll", "Astro", or null
- `design_preferences` values must match predefined options
- `content_sections` values must be boolean
- `generated_websites` must be a list of objects with required fields
- `last_updated` must be ISO-8601 format or null
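A sketch of these rules as a checker follows. It covers the platform, boolean, and list rules; the design-preference and timestamp rules would follow the same pattern and are omitted for brevity.

```python
ALLOWED_PLATFORMS = {'Notion', 'Eleventy', 'Jekyll', 'Astro', None}

def validate_website_config(config):
    """Return a list of rule violations for a website_configuration section."""
    errors = []
    # Rule: target_platform must be an allowed value or null.
    if config.get('target_platform') not in ALLOWED_PLATFORMS:
        errors.append("target_platform must be Notion, Eleventy, Jekyll, Astro, or null")
    # Rule: every content_sections value must be a boolean.
    for section, enabled in config.get('content_sections', {}).items():
        if not isinstance(enabled, bool):
            errors.append(f"content_sections.{section} must be boolean")
    # Rule: generated_websites must be a list of objects.
    sites = config.get('generated_websites', [])
    if not isinstance(sites, list) or any(not isinstance(s, dict) for s in sites):
        errors.append("generated_websites must be a list of objects")
    return errors
```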
```python
from copy import deepcopy

SENSITIVE_FIELDS = [
    'user_profile.basic_info.email',
    'user_profile.basic_info.phone',
    'career_objectives.financial'
]

def sanitize_for_sharing(kb_data):
    """Remove sensitive data before sharing"""
    sanitized = deepcopy(kb_data)
    for field_path in SENSITIVE_FIELDS:
        remove_nested_field(sanitized, field_path)
    return sanitized
```

```python
def check_permissions(assistant_id, operation, field):
    config = load_system_config()
    permissions = config['knowledge_base_permissions'][assistant_id]
    if operation == 'read':
        return field in permissions['read']
    elif operation == 'write':
        return field in permissions['write']
    return False
```

Error handling protocols are defined in the system configuration file (`ai_assistants_system_config.json`) under the `knowledge_base_operations` section. All AI assistants reference these protocols directly from the system configuration.
Key error scenarios handled:
- File not found
- Invalid JSON format
- Permission errors
- Generic errors
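These scenarios can be mapped onto the KB error codes from the API quick reference. The sketch below is one illustrative shape; the `KB000` fallback code and the recovery comments are assumptions, not part of the system config.

```python
import json

def load_kb_with_errors(path):
    """Load the knowledge base, returning (data, error_code) per the scenarios above."""
    try:
        with open(path) as f:
            return json.load(f), None
    except FileNotFoundError:
        return None, 'KB001'  # File not found: trigger conversational mode
    except json.JSONDecodeError:
        return None, 'KB002'  # Invalid JSON: ask the user for a fresh copy
    except PermissionError:
        return None, 'KB004'  # Permission denied on the file itself
    except OSError:
        return None, 'KB000'  # Generic I/O failure (code is illustrative)
```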
All knowledge base modifications require explicit user approval as defined in the system configuration.
```
User Machine
├── job_search_knowledge_base.json (git-ignored)
├── ai_assistants_system_config.json (version controlled)
└── AI Platform Sessions (ephemeral)
```

```
Cloud Storage (User-Specific)
├── users/
│   ├── user_001/kb.json
│   ├── user_002/kb.json
│   └── ...
└── shared/
    └── system_config.json (cached globally)
```
```python
import json
from redis import Redis

class KnowledgeBaseService:
    def __init__(self, storage_backend):
        self.storage = storage_backend  # S3, Azure, GCS
        self.cache = Redis()

    async def get_user_kb(self, user_id):
        # Check cache first
        if cached := self.cache.get(f"kb:{user_id}"):
            return json.loads(cached)
        # Load from storage
        kb = await self.storage.get(f"users/{user_id}/kb.json")
        self.cache.set(f"kb:{user_id}", json.dumps(kb), ex=3600)
        return kb
```

```python
from datetime import datetime

def log_kb_operation(user_id, assistant_id, operation, field):
    log_entry = {
        'timestamp': datetime.now().isoformat(),
        'user_id': user_id,
        'assistant_id': assistant_id,
        'operation': operation,
        'field': field
    }
    append_to_audit_log(log_entry)
```

| Issue | Cause | Solution |
|---|---|---|
| Missing fields | Incomplete workflow | Run missing assistant stages |
| Permission denied | Wrong assistant | Check permission matrix |
| Validation errors | Schema mismatch | Update to latest version |
| Sync conflicts | Concurrent edits | Use timestamp-based merge |
- Version Control: Keep system config in git
- Backup Strategy: Regular snapshots of user KBs
- Migration Planning: Version upgrade paths
- Monitoring: Track usage and errors
- Atomic Updates: Write complete sections
- Validation First: Check before writing
- Graceful Degradation: Handle missing KB
- Clear Errors: User-friendly messages
- Schema Evolution: Backward compatibility
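The "atomic updates" practice is commonly implemented as write-to-temp-then-rename, so a crash mid-write can never leave a half-written KB on disk. A sketch, assuming the temp file and the target live on the same filesystem (which is what makes `os.replace` atomic):

```python
import json
import os
import tempfile

def save_knowledge_base_atomic(path, kb_data):
    """Write the KB to a temp file in the same directory, then atomically replace the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    try:
        with os.fdopen(fd, 'w') as f:
            json.dump(kb_data, f, indent=2)
        os.replace(tmp_path, path)  # atomic on the same filesystem
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```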
- Data Pipeline: ETL for analytics
- Privacy Compliance: GDPR/CCPA considerations
- Performance: Optimize for read-heavy workload
```python
load_knowledge_base(path: str) -> dict
save_knowledge_base(path: str, data: dict) -> bool
validate_schema(data: dict) -> bool
check_permissions(assistant: str, op: str, field: str) -> bool
merge_updates(base: dict, updates: dict) -> dict
```

- KB001: File not found
- KB002: Invalid JSON
- KB003: Schema validation failed
- KB004: Permission denied
- KB005: Merge conflict
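Of these signatures, `merge_updates` is the least obvious. One plausible reading is a recursive deep merge, sketched below under the assumption that nested dicts merge while all other values are simply overwritten by the update.

```python
def merge_updates(base, updates):
    """Recursively merge updates into a copy of base; update values win on conflict."""
    merged = dict(base)
    for key, value in updates.items():
        if isinstance(merged.get(key), dict) and isinstance(value, dict):
            merged[key] = merge_updates(merged[key], value)  # merge nested sections
        else:
            merged[key] = value  # scalars, lists, and new keys: update wins
    return merged
```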
For implementation examples, see the system prompts in AI_assistants/ directory.