This report contains only factual observations about the system's actual behavior, structure, and test results. No opinions or recommendations are included.
The backend is located at backend/app/ and contains the following modules:
| Module | Files | Purpose |
|---|---|---|
api/routes/ |
ask.py, graph.py, quiz.py, learning_path.py, subjects.py |
REST API endpoints |
api/ |
deps.py |
FastAPI dependency injection providers |
core/ |
settings.py, auth.py, middleware.py, rate_limit.py, exceptions.py, logging.py, subjects.py |
Configuration, auth, middleware |
kg/ |
neo4j_adapter.py, builder.py, schema.py, cypher_qa.py |
Knowledge graph operations |
nlp/ |
llm_client.py, embeddings.py, concept_extractor.py |
LLM and NLP |
rag/ |
retriever.py, kg_expansion.py, chunker.py, window_retriever.py, unified_retriever.py |
RAG pipeline |
student/ |
student_service.py, quiz_generator.py, recommendation_service.py, models.py |
Adaptive learning |
ui_payloads/ |
quiz.py, recommendations.py |
Response models |
Located at frontend/. Pages: home, chat, assessment, graph, comparison, about. Components: KnowledgeGraph (Cytoscape.js), Quiz, LearningPath, MasteryIndicator, PostQuizRecommendations, SubjectPicker, Skeleton, ErrorBoundary, Providers.
- Docker Compose at
infra/compose/compose.yamldefines: Neo4j 5.16-community, OpenSearch (latest), API (CPU/GPU profiles) - Scripts at
scripts/: pipeline scripts (fetch, parse, normalize, build-kg, index), migration scripts, seed scripts
- 4 subjects configured in
config/subjects.yaml: us_history (default), biology, economics, world_history - Settings in
backend/app/core/settings.py: 70+ configuration parameters via Pydantic BaseSettings
Docker services checked at time of observation:
| Service | Container | Status | Ports |
|---|---|---|---|
| Neo4j | adaptive-kg-neo4j | Up 8 days (healthy) | 7474, 7687 |
| OpenSearch | opensearch | Up 8 days | 9200, 9600 |
Both services were running and healthy.
Command: poetry run pytest backend/tests/ -v --tb=short
Result: 346 tests passed in 8.28 seconds. Zero failures.
| Test File | Tests | Status |
|---|---|---|
test_api_ask.py |
12 | All pass |
test_api_graph.py |
19 | All pass |
test_api_quiz.py |
11 | All pass |
test_docker.py |
5 | All pass |
test_kg_expansion.py |
16 | All pass |
test_llm_client.py |
30 | All pass |
test_logging.py |
3 | All pass |
test_main.py |
9 | All pass |
test_makefile.py |
8 | All pass |
test_neo4j_adapter.py |
34 | All pass |
test_poetry.py |
8 | All pass |
test_quiz_generator.py |
14 | All pass |
test_recommendation_service.py |
14 | All pass |
test_retriever.py |
8 | All pass |
test_settings.py |
3 | All pass |
test_streaming.py |
8 | All pass |
test_student_service.py |
25 | All pass |
Total coverage: 63% (1821 of 2893 statements covered)
| Module | Coverage | Notable Gaps |
|---|---|---|
core/exceptions.py |
100% | - |
core/settings.py |
100% | - |
core/middleware.py |
100% | - |
ui_payloads/quiz.py |
100% | - |
ui_payloads/recommendations.py |
100% | - |
kg/neo4j_adapter.py |
99% | 1 line |
student/recommendation_service.py |
97% | Factory function |
nlp/llm_client.py |
95% | Streaming hybrid edge case |
student/student_service.py |
95% | Factory function, one storage line |
student/models.py |
94% | accuracy property |
core/logging.py |
93% | One unreachable line |
student/quiz_generator.py |
88% | Factory functions |
api/routes/graph.py |
87% | Cache miss paths, error handlers |
api/routes/learning_path.py |
85% | Error handlers |
core/subjects.py |
85% | Some getter functions |
api/routes/ask.py |
79% | Window retrieval path, error handlers |
kg/schema.py |
75% | Schema validation methods |
rag/kg_expansion.py |
64% | Factory functions, get_all_concepts |
core/rate_limit.py |
60% | X-Forwarded-For path, decorators |
rag/retriever.py |
54% | Connect, create_collection, index_chunks, factory |
api/routes/quiz.py |
46% | Adaptive quiz, student profile, recommendations endpoints |
api/routes/subjects.py |
44% | Most subject endpoints |
main.py |
51% | Health ready checks, lifespan |
nlp/embeddings.py |
24% | Most embedding operations |
kg/cypher_qa.py |
24% | LangChain integration |
nlp/concept_extractor.py |
17% | Most extraction logic |
kg/builder.py |
14% | Knowledge graph building |
rag/chunker.py |
10% | Chunking operations |
core/auth.py |
0% | All auth functions |
api/deps.py |
0% | All dependency injection providers |
- All tests use mocked services (Neo4j, OpenSearch, LLM) via
conftest.py - Rate limiting is disabled in tests via
limiter.enabled = False - Graph cache is cleared between tests
autouse=Truefixturesetup_test_envsets environment variables
All endpoints are under prefix /api/v1 (configurable via settings.api_prefix).
- Returns:
{"name": str, "version": str, "status": "running", "llm_mode": str, "privacy_local_only": bool}
- Returns:
{"status": "healthy", "attribution": str} - This endpoint always returns "healthy" if the API process is running.
- Returns:
ReadinessResponsewithstatus("healthy"/"degraded"/"unhealthy"),servicesdict (neo4j, opensearch, ollama), andattribution - Each service has
status(ok/degraded/error), optionalmessage, optionallatency_ms - Neo4j and OpenSearch are critical: if either is ERROR, overall status is "unhealthy"
- Ollama ERROR alone results in "degraded" (not "unhealthy")
- Returns:
{"status": "alive"}
- Rate limit: 10/minute
- Accepts (JSON body):
question(str, required, min_length=3)subject(str|null, optional, defaults to us_history internally)use_kg_expansion(bool, default True)use_window_retrieval(bool, default True)window_size(int, default 1, ge=0, le=3)top_k(int, default 5, ge=1, le=20)
- Returns
QuestionResponse:question,answer,sources(list of dicts with text/module_title/section/score),expanded_concepts(list|null),retrieved_count,window_expanded_count(int|null),model,attribution
- Source text truncation: Source texts are truncated at 200 chars + "..."
- Error codes: 404 (no content), 503 (LLM/service error), 500 (internal error)
- Behavior: If KG expansion fails, it continues without expansion (logs warning). If no chunks retrieved, returns 404.
- Rate limit: 10/minute
- Accepts: Same body as
/ask - Returns: SSE (Server-Sent Events)
text/event-stream- First event:
{"type": "metadata", "sources": [...], "expanded_concepts": [...], "retrieved_count": int, "window_expanded_count": int|null, "model": str, "attribution": str} - Token events:
{"type": "token", "content": str} - Error events:
{"type": "error", "content": str} - Final event:
[DONE]
- First event:
- Headers:
Cache-Control: no-cache,Connection: keep-alive,X-Accel-Buffering: no - Error codes: Same as
/askfor pre-streaming errors. Errors during streaming are sent as SSE error events.
- Rate limit: 30/minute
- Accepts:
subject(query param, optional) - Returns:
{"concept_count": int, "module_count": int, "relationship_count": int} - Caching: Results cached for 5 minutes (TTL cache in
_graph_cachedict) - Error codes: 503 (Neo4j connection), 500 (query/other error)
- No rate limit
- Accepts:
limit(query param, default 20),subject(query param, optional) - Returns: list of dicts with
name,score,is_key_term,frequency - Runs direct Cypher:
MATCH (c:{concept_label}) RETURN ... ORDER BY c.importance_score DESC LIMIT $limit
- Rate limit: 30/minute
- Accepts:
limit(query param, default 100),subject(query param, optional) - Returns:
{"nodes": [...], "edges": [...]}- Nodes:
{"data": {"id": str, "label": str, "importance": float, "chapter": str|null}} - Edges:
{"data": {"id": str, "source": str, "target": str, "type": str, "label": str}}
- Nodes:
- Caching: Results cached for 5 minutes
- Formatted for Cytoscape.js visualization
- No rate limit
- Accepts (JSON body):
question(str, required)preview_only(bool, default False)
- Returns:
{"question": str, "cypher": str|null, "result": list|str|null, "answer": str|null, "error": str|null} - If
preview_only=True: generates Cypher but does not execute it - Uses LangChain's GraphCypherQAChain
- No rate limit
- Accepts (JSON body):
query(str, required)limit(int, default 10, ge=1, le=50)
- Accepts (query param):
subject(optional) - Returns: list of
{"name": str, "importance_score": float|null, "key_term": bool|null, "score": float} - Uses Neo4j fulltext index
- No rate limit
- Returns:
{"schema": str} - Uses LangChain's Neo4jGraph.schema
- Rate limit: 5/minute
- Accepts (query params):
topic(str, required)num_questions(int, default 3)subject(str, optional)
- Returns
Quiz:id,title,questions(list ofQuizQuestion),average_difficulty- Each
QuizQuestion:id,text,options(list of{"id": str, "text": str}),correct_option_id,explanation,source_chunk_id,related_concept,difficulty,difficulty_score
- Error codes: 404 (no content/value error), 503 (quiz generation failed), 500 (internal)
- Retrieves top 2 chunks for the topic, sends to LLM with structured JSON prompt
- Rate limit: 5/minute
- Accepts (query params):
topic(str, required)num_questions(int, default 3)student_id(str, default "default")subject(str, optional)
- Returns
AdaptiveQuiz(extends Quiz):- All Quiz fields plus:
student_mastery,target_difficulty,adapted(always True)
- All Quiz fields plus:
- Difficulty targeting: mastery < 0.4 -> easy, 0.4-0.7 -> medium, > 0.7 -> hard
- No rate limit
- Accepts:
student_id(query param, default "default") - Returns:
{"student_id": str, "overall_ability": float, "mastery_levels": dict[str, float], "updated_at": datetime} - Creates a new profile if student_id not found (initial mastery 0.3)
- No rate limit
- Accepts (JSON body):
concept(str, required)correct(bool, required)
- Accepts (query param):
student_id(default "default") - Returns:
{"concept": str, "previous_mastery": float, "new_mastery": float, "target_difficulty": str, "total_attempts": int} - Algorithm: correct -> +0.15 (cap 1.0), incorrect -> -0.10 (floor 0.1)
- Persists to
data/processed/student_profiles.jsonon every update
- No rate limit
- Accepts:
concept(query param, required),student_id(query param, default "default") - Returns:
{"concept": str, "mastery_level": float, "target_difficulty": str}
- No rate limit
- Accepts:
student_id(query param, default "default") - Returns: Same as
GET /student/profile - Resets mastery_map to empty, overall_ability to initial (0.3)
- No rate limit
- Accepts (JSON body):
topic(str),question_results(list of{"question_id": str, "related_concept": str, "correct": bool}),student_id(str, default "default"),subject(str|null)
- Returns
RecommendationResponse:path_type("remediation"/"advancement"/"mixed"),score_pct,remediation(list),advancement(list),summary
- Path type: score < 50% -> remediation, > 80% -> advancement, else -> mixed
- No rate limit
- Accepts:
student_id(query param, default "default") - Returns:
dict[str, "easy"|"medium"|"hard"] - Only returns difficulties for concepts already in the student's mastery map
- No rate limit
- Accepts:
concept_name(path param),max_depth(query param, default 3),subject(optional) - Returns:
{"target_concept": str, "prerequisites": list[{"id": str, "name": str, "importance": float, "chapter": str|null, "depth": int}], "total_concepts": int} - Traverses PREREQUISITE relationships in Neo4j with variable-length paths
- No rate limit
- Accepts:
concept_name(path param),depth(query param, default 2),subject(optional) - Returns:
{"concept": str, "prerequisites": list[dict], "depth": int}
- No rate limit
- Accepts:
concept_name(path param),depth(query param, default 2),subject(optional) - Returns:
{"concept": str, "dependents": list[dict], "depth": int}
- No rate limit
- Returns:
{"subjects": list[{"id": str, "name": str, "description": str, "is_default": bool}], "default_subject": str}
- Returns:
list[str]
- Returns:
{"id": str, "name": str, "description": str, "attribution": str, "opensearch_index": str, "book_count": int, "is_default": bool} - Error: 404 if subject_id not found
- Returns:
{"subject_id": str, "primary_color": str, "secondary_color": str, "accent_color": str, "chapter_colors": dict} - Error: 404 if subject_id not found
- Returns: list of book configuration dicts
- Error: 404 if subject_id not found
- The
auth.pymodule definesverify_api_keyandget_optional_api_key - When
settings.api_keyis empty string (default), all requests are allowed (development mode) - The
deps.pyfile definesRequireApiKeyandrequire_auth()dependency - No endpoint in
routes/currently uses the auth dependency. All endpoints are publicly accessible. - The auth module uses
secrets.compare_digestfor timing-safe comparison - The
X-API-Keyheader scheme is used
- Implemented via
slowapilibrary - Rate limit key is derived from
X-Forwarded-Forheader (first IP) or remote address - The
X-Forwarded-Forheader is trusted without validation - Configured limits:
/ask10/minute,/quiz/generate5/minute,/graph/statsand/graph/data30/minute - Several endpoints have NO rate limiting:
/concepts/top,/graph/query,/concepts/search,/graph/schema,/student/profile,/student/mastery,/student/target-difficulty,/student/reset,/quiz/recommendations,/student/all-difficulties, all/subjects/*endpoints, all/learning-path/*endpoints - Rate limiting is globally disabled in tests via
limiter.enabled = False
- Origins set from
CORS_ORIGINSenv var, defaults tohttp://localhost:3000,http://localhost:3001 allow_credentials=True,allow_methods=["*"],allow_headers=["*"]
questionfield:min_length=3via Pydantic Fieldtop_k:ge=1, le=20window_size:ge=0, le=3concept_search.limit:ge=1, le=50MasteryUpdate:concept(str, required),correct(bool, required) -- no min/max length onconceptstudent_idparameters: no validation (any string accepted)- The
questiontext is directly included in LLM prompts via f-string interpolation - Concept names from user input are passed directly to Cypher queries via parameterized queries ($name)
- OpenRouter:
openrouter_verify_ssldefaults toFalse - OpenSearch:
opensearch_verify_certsdefaults toFalse - OpenSearch health check uses
httpx.AsyncClient(verify=False)
| File | Size |
|---|---|
books_economics.jsonl |
3.2 MB |
books_us_history.jsonl |
2.8 MB |
books.jsonl |
2.8 MB |
chunks_economics.json |
8.4 MB |
chunks_us_history.json |
7.5 MB |
knowledge_graph_economics.json |
3.4 MB |
knowledge_graph_us_history.json |
2.9 MB |
knowledge_graph.json |
2.9 MB |
student_profiles.json |
4.7 KB |
The student_profiles.json file contains 2 profiles:
"default": 19 concepts tracked (mix of US History and Economics), mastery levels ranging from 0.2 to 0.92, overall_ability 0.521"test_student": empty mastery_map, overall_ability 0.3
- Data stored in JSON file at
data/processed/student_profiles.json - Profiles loaded into memory (
dict[str, StudentProfile]) at service initialization - Saved to disk on every mastery update, reset, or new profile creation
- No file locking mechanism
- No concurrent access protection
- On corrupt JSON file, the service logs a warning and starts with empty profiles
- Correct answer: mastery += 0.15 (capped at 1.0)
- Incorrect answer: mastery -= 0.10 (floored at 0.1)
- Initial mastery for new concepts: 0.3 (from
settings.student_initial_mastery) - Overall ability: simple arithmetic mean of all concept mastery levels
Starting from initial mastery 0.3:
- After 1 wrong: 0.2
- After 2 wrong: 0.1 (floor reached)
- All subsequent wrong answers: stays at 0.1 (never reaches 0.0)
- mastery < 0.4 -> "easy"
- mastery 0.4-0.7 -> "medium" (note: boundary 0.4 is "medium", not "easy")
- mastery > 0.7 -> "hard" (note: boundary 0.7 is "medium", not "hard")
- There is no quiz submission endpoint. The API generates quizzes and returns them with correct answers included in the response (
correct_option_idfield) - There is no server-side quiz answer validation
- Mastery is updated via the separate
POST /student/masteryendpoint - A student (or client) can submit mastery updates for any concept at any time, independent of quiz generation
- The same mastery update can be submitted multiple times -- each call increments
attemptsand adjusts mastery - There is no deduplication or idempotency for mastery updates
POST /student/masteryhas NO rate limitPOST /student/resethas NO rate limitGET /student/profilehas NO rate limit
The following services use a global singleton or per-subject registry pattern:
LLMClient: global singleton viaget_llm_client()StudentService: global singleton viaget_student_service()OpenSearchRetriever: per-subject registry viaget_retriever(subject_id)QuizGenerator: per-subject registry viaget_quiz_generator(subject_id)KGExpander: per-subject registry viaget_kg_expander(subject_id)Neo4jAdapter: per-subject registry viaget_neo4j_adapter(subject_id)RecommendationService: per-subject registry viaget_recommendation_service(subject)WindowRetriever: global singleton viaget_window_retriever()CypherQAService: global singleton viaget_cypher_qa_service()
deps.pydefines FastAPIDepends()providers for all major services- However, none of these DI providers are used by the route handlers
- Route handlers import factory functions directly:
from backend.app.rag.retriever import get_retriever - The
deps.pydependency injection providers have 0% test coverage
- Implemented via Neo4j label prefixes (e.g.,
us_history_Concept,biology_Concept) - All subjects share the same Neo4j database (
neo4j) - Each subject has its own OpenSearch index (e.g.,
textbook_chunks_us_history) - Subject configuration loaded from
config/subjects.yamland cached via@lru_cache(maxsize=1)
- Graph stats and data endpoints: in-memory dict with 5-minute TTL
- Subjects config: LRU cache (maxsize=1) -- effectively cached forever unless
clear_subjects_cache()called - Service singletons: cached in module-level globals
- Custom exception hierarchy rooted at
AdaptiveKGException - KG expansion failures are caught and silently continued (warning logged)
- Window retrieval failures are caught and silently continued (warning logged)
- LLM errors return 503
- Content-not-found returns 404
- All other unhandled errors return 500 with error message in detail
local: Ollama (default)remote: OpenRouterhybrid: tries local first, falls back to remote onLLMConnectionErrororLLMGenerationError
- Non-streaming calls: up to 3 retries with exponential backoff (1s to 10s)
- Only
LLMConnectionErrortriggers retry (notLLMGenerationError) - Streaming calls: no retry mechanism
- Answer prompts include numbered context passages
[1] text,[2] text - Quiz prompts request JSON output with structured schema
- Quiz JSON cleaned of markdown code blocks before parsing
knn: Vector-only search using OpenSearch kNNhybrid(default): BM25 text search + kNN vector search with reciprocal rank fusion (RRF, k=60)
- BM25 fields:
text^3,module_title^2,section,key_terms^2 - Fuzziness: AUTO
- RRF formula:
score(doc) = sum(1/(k + rank))where k=60
- Requires
vector_backendto be "neo4j" or "hybrid" (NOT "opensearch" which is default) - The default
vector_backendis "opensearch", meaning window retrieval is effectively disabled by default - Uses NEXT relationships in Neo4j to get surrounding chunks
- Uses
axioswith 30-second timeout - Streaming uses native
fetch()API (not axios) generateQuizsends parameters as query params (not body)askQuestionsendssubjectin request body- Error interceptor logs but re-throws all errors
healthCheckaccepts both "healthy" and "ok" as healthy statusesgetTopConceptssilently returns empty array on failure
- 4 subjects configured:
us_history,biology,economics,world_history - Data files exist for:
us_history(chunks + KG),economics(chunks + KG) - No data files exist for:
biology,world_history world_historyusessource_type: "openstax_web"(different from github_raw used by others)
privacy_local_only: True (default)llm_mode: "local" (default)embedding_device: "cuda" (default) -- this would fail on CPU-only machinesreranker_device: "cuda" (default) -- same issueretrieval_mode: "hybrid" (default)vector_backend: "opensearch" (default)student_initial_mastery: 0.3
RequestIDMiddleware: attaches UUID to every request viaX-Request-IDheader, accepts client-provided IDs
- The
OpenSearchRetriever.connect()is called immediately inget_retriever()factory - If OpenSearch is unreachable, the retriever creation fails and the error propagates
- The
get_neo4j_adapter()dependency is a generator (usesyield) for proper cleanup - Other deps are simple functions that call global singletons
- The
require_auth()function returnsDepends(verify_api_key)as a value, which is structurally unusual -- it returns a FastAPIDependsobject rather than using it as a decorator or parameter default
Untracked files:
AGENTIC_TEAM_PROMPT.mddata/processed/chunks_economics.jsondata/processed/chunks_us_history.json
Report generated by The Witness. All statements are factual observations of actual system behavior.