Releases: Codesteward/codesteward
v0.5.2
Fixed — codesteward-graph
dependency query returned empty despite edges being written.
PyProjectParser and PackageJsonParser emitted depends_on edges whose
source_id pointed at a pyproject.toml/package.json file node, but
that node was never actually written to the graph. On GraphQLite the edge
write silently dropped (source MATCH returned nothing); on Neo4j the source
had to be merged separately. Both parsers now return a tuple
(source_file_nodes, edges) and GraphBuilder persists the file nodes
alongside the edges, giving the dependency query a proper source to
match against.
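The shape of this fix can be sketched roughly as follows. This is a minimal illustration, not the project's code: the Node and Edge classes, parse_manifest, persist, and the dict-based graph are all hypothetical stand-ins. It only shows why an edge is lost when its source node was never written, and how returning the file node together with the edges avoids that.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:  # hypothetical stand-in for a graph node
    node_id: str
    node_type: str
    name: str


@dataclass(frozen=True)
class Edge:  # hypothetical stand-in for a depends_on edge
    source_id: str
    target_name: str
    edge_type: str = "depends_on"


def parse_manifest(path: str, dependencies: list[str]) -> tuple[list[Node], list[Edge]]:
    """Return the manifest's file node together with its depends_on edges."""
    file_node = Node(node_id=f"file:{path}", node_type="file", name=path)
    edges = [Edge(source_id=file_node.node_id, target_name=dep) for dep in dependencies]
    return [file_node], edges


def persist(graph: dict, nodes: list[Node], edges: list[Edge]) -> None:
    """Write nodes first, then keep only edges whose source node exists."""
    for node in nodes:
        graph["nodes"][node.node_id] = node
    for edge in edges:
        # Mimics a source MATCH: an edge with no matching source is dropped.
        if edge.source_id in graph["nodes"]:
            graph["edges"].append(edge)
```

Calling persist with the edges alone leaves the graph empty, while passing the file nodes alongside them keeps every edge.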
Improved — codesteward-graph
referential query now returns both outgoing and incoming edges for a
filter. Previously the filter matched only the edge source name, so
asking for referential filter=foo returned what foo calls but not
what calls foo, which is the most common "who depends on this?" question.
Templates now also match r.target_name (and tgt.name on Neo4j), so a
single filter surfaces both directions. Direction is always visible in
the result rows via from_name vs to_name.
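As a rough illustration of the two-sided match, the sketch below filters a list of in-memory edge rows. The referential function and the source_name/target_name keys are assumptions made for this example, and exact-name matching is assumed; the real templates are Cypher and may match differently.

```python
def referential(edges: list[dict], name_filter: str) -> list[dict]:
    """Return rows where the filter matches either end of the edge."""
    return [
        {"from_name": e["source_name"], "to_name": e["target_name"]}
        for e in edges
        # Match the source (outgoing) or the target (incoming).
        if name_filter in (e["source_name"], e["target_name"])
    ]
```

With edges foo → bar and baz → foo, filtering on foo returns both rows, and from_name/to_name show which direction each one is.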
Container image
docker pull ghcr.io/codesteward/codesteward:0.5.2
Image is signed with cosign keyless via GitHub OIDC.
Verify with:
cosign verify ghcr.io/codesteward/codesteward:0.5.2 \
  --certificate-identity-regexp 'https://github.com/Codesteward/codesteward/.*' \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com
v0.5.1
Fixed — codesteward-mcp
- stdio transport: server crashed on first tool call. structlog.configure() was
  called without an explicit logger_factory, so structlog defaulted to
  PrintLoggerFactory(file=sys.stdout). Any log.warning/log.error on a tool path
  (e.g. _make_backend when a backend was missing) wrote structlog output to stdout,
  which is the JSON-RPC channel in stdio mode. The MCP client received non-JSON and
  dropped the connection with JSON Parse error: Unable to parse JSON string. Fix:
  route structlog to sys.stderr via PrintLoggerFactory(file=sys.stderr).
Container image
docker pull ghcr.io/codesteward/codesteward:0.5.1
Image is signed with cosign keyless via GitHub OIDC.
Verify with:
cosign verify ghcr.io/codesteward/codesteward:0.5.1 \
  --certificate-identity-regexp 'https://github.com/Codesteward/codesteward/.*' \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com
v0.5.0
Added — codesteward-graph
PyProjectParser: extracts depends_on edges from pyproject.toml files, supporting
PEP 621 [project.dependencies], [project.optional-dependencies], and Poetry
[tool.poetry.dependencies]. Scans all pyproject.toml files in the repo tree (handles
uv workspaces and monorepos). Wired into GraphBuilder.build_graph() alongside the
existing PackageJsonParser. The dependency query type now returns results for Python
projects.
Fixed — codesteward-graph
- GraphQLite: dependency query returned null package names. The query template read
  pkg.name from the target node via a traversal, but GraphQLite does not resolve target
  node properties through traversal patterns (the same limitation already worked around in
  the referential query). Rewrote the template to read target_name from edge properties.
  The same fix was applied to the semantic query's sink_name/sink_file fields.
- GraphQLite: delete_file_nodes did not filter by $param in MATCH patterns. $param
  interpolation into MATCH property patterns is unreliable in GraphQLite. The
  method was inconsistent with count_nodes/delete_repo_data, which already use literal
  values via _cypher_escape. Rewrote delete_file_nodes to match. This was silently
  breaking incremental rebuilds.
- GraphQLite: named query templates used $param in MATCH patterns. Tenant/repo/filter
  isolation was unreliable in lexical, referential, semantic, and dependency
  queries. Rewrote each template as a builder function that constructs Cypher with escaped
  literal values and moves filters to WHERE clauses.
- GraphQLite: write_augment_edge created duplicate edges on re-invocation. It used
  CREATE instead of MERGE and did not dedup by edge_id. Added a delete-before-create
  pattern so re-writing an augment edge with the same edge_id is idempotent (matches the
  upsert behavior of Neo4j and JanusGraph).
- GraphQLite: semantic query used NOT r.sanitized. SQLite stores booleans as
  integers and NOT <int> semantics were not reliable through GraphQLite's Cypher
  translation. Changed to explicit r.sanitized = 0.
- GraphQLite: full rebuild duplicated all edges. write_edges uses CREATE (not MERGE)
  for relationship creation, so consecutive full rebuilds doubled the edge count with each run.
  Added delete_repo_data(tenant_id, repo_id) to the GraphBackend ABC and all three
  implementations (Neo4j, JanusGraph, GraphQLite). GraphBuilder.build_graph() now clears
  existing repo data before every full (non-incremental) rebuild.
- Cross-file CALLS resolution 0% on codebases with shared method names.
  _resolve_call_targets() marked all ambiguous names (e.g. parse, defined in 13+ parsers)
  as unresolvable. Added same-file disambiguation: when a callee name is globally ambiguous,
  resolve to the definition in the caller's own file. This recovers intra-file method calls
  that were previously left as unresolved external nodes.
- GUARDED_BY edges emitted for non-auth Python decorators. @property,
  @staticmethod, @abstractmethod, @dataclass, @pytest.fixture, and ~30 other standard
  Python decorators were incorrectly producing guarded_by edges. Added _NON_AUTH_DECORATORS
  blocklist to the Python parser; only actual auth decorators (@login_required,
  Depends(...), etc.) now emit guard edges.
- build_graph summary reported language: typescript for all codebases. The language
  parameter defaulted to "typescript" and was echoed into the summary unchanged. Added
  _detect_dominant_language() which counts file-node languages and returns the most common
  one. The summary now reflects the actual codebase language.
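The same-file disambiguation rule for CALLS targets from the list above can be sketched as below. This is an illustrative reading of the release note, not the actual _resolve_call_targets() code: the function name, the tuple and dict shapes, and the node ID strings are all assumptions.

```python
from collections import defaultdict
from typing import Optional


def resolve_call_targets(
    definitions: list[tuple[str, str, str]],  # (name, file, node_id)
    calls: list[dict],                        # {"caller_file": ..., "callee": ...}
) -> list[Optional[str]]:
    """Map each call to a node ID, or None when it stays unresolved."""
    by_name: dict[str, list[tuple[str, str]]] = defaultdict(list)
    for name, file, node_id in definitions:
        by_name[name].append((file, node_id))

    resolved: list[Optional[str]] = []
    for call in calls:
        candidates = by_name.get(call["callee"], [])
        if len(candidates) == 1:
            # Globally unique name: resolve directly.
            resolved.append(candidates[0][1])
            continue
        # Ambiguous (or unknown) name: prefer a definition in the caller's file.
        same_file = [nid for file, nid in candidates if file == call["caller_file"]]
        resolved.append(same_file[0] if len(same_file) == 1 else None)
    return resolved
```

A call to parse from a file that defines its own parse resolves to that definition, while the same call from a file with no local definition stays unresolved.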
Security / CI
- CI/CD security hardening: introduced a comprehensive CI pipeline based on the
  OpenSSF / SLSA guidance:
  - Every job now runs behind step-security/harden-runner (audit mode) and uses
    persist-credentials: false with scoped permissions.
  - New checks: Semgrep (p/python, p/security-audit, p/owasp-top-ten, p/docker),
    Hadolint, zizmor (GitHub Actions static analysis, with ref-pin policy in
    .github/zizmor.yml), CodeQL security-extended (push-to-main), pip-audit
    against uv export --frozen, Trivy container scan that gates the release,
    dependency-review on PRs, conventional commits, license headers
    (skywalking-eyes, check-only), markdown-lint, OpenSSF Scorecard, and a
    weekly scheduled scan workflow (CodeQL, pip-audit, Trivy image, gitleaks).
  - Release workflow now builds linux/amd64 locally, scans with Trivy (HIGH/CRITICAL
    gate), pushes multi-arch with SLSA provenance (provenance: mode=max) and SBOM,
    signs with cosign keyless via GitHub OIDC, and attaches trivy-report.json +
    sbom.cdx.json to the GitHub Release.
  - Added .github/CODEOWNERS, .github/dco.yml (probot/dco), SECURITY.md,
    .licenserc.yaml, renovate.json (with pinDigests: true), and
    docs/ci-security-hardening.md.
- Container base patched against Debian openssl CVE-2026-28390: final stage of
  Dockerfile.mcp now runs apt-get upgrade -y on top of python:3.12-slim to pick up
  Debian security updates that lag the upstream image rebuild cadence.
- Narrowly-scoped .trivyignore for bundled codesteward-taint binary: the upstream
  v0.1.0 binary was built with Go 1.22.12 and inherits nine Go stdlib CVEs that cannot be
  fixed from this repo. Documented each suppression with a TODO to drop once upstream ships
  a Go ≥ 1.26.2 rebuild. Operators who do not need taint analysis can pass
  --build-arg TAINT_VERSION=none to remove the binary and skip the suppressions.
- Python runtime CVEs resolved: bumped locked cryptography 46.0.5 → 46.0.7
  (CVE-2026-34073, CVE-2026-39892), pygments 2.19.2 → 2.20.0 (CVE-2026-4539), and
  pytest 9.0.2 → 9.0.3 (CVE-2025-71176) via uv lock --upgrade-package.
Container image
docker pull ghcr.io/codesteward/codesteward:0.5.0
Image is signed with cosign keyless via GitHub OIDC.
Verify with:
cosign verify ghcr.io/codesteward/codesteward:0.5.0 \
  --certificate-identity-regexp 'https://github.com/Codesteward/codesteward/.*' \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com
v0.4.2
Fixed — codesteward-graph
- .venv and Python tool directories not excluded from graph builds. _IGNORED_DIRS
  now includes .venv, venv, .env, env, .tox, .nox, .mypy_cache,
  .ruff_cache, .pytest_cache, site-packages, and .eggs. Previously these
  directories were parsed, causing recursion errors on large vendored files and polluting
  the graph with thousands of library symbols.
- Cross-file CALLS targets unresolved. Added _resolve_call_targets() post-parse pass
  to GraphBuilder.build_graph(). After all files are parsed, a fn_name → node_id map
  is built and CALLS edge target_id values are rewritten from bare callee names to proper
  node IDs. Ambiguous names (multiple definitions) are left unresolved. Typically resolves
  ~30% of all CALLS edges in a codebase.
- GraphQLite: referential query returned NULL target properties. GraphQLite's
  relationship traversal (src)-[r]->(tgt) does not resolve target node properties.
  Worked around by storing target_name and target_id as edge properties during
  write_edges, and reading them from the edge in the referential query template.
- GraphQLite: UNWIND ON CREATE SET did not persist target node properties. Replaced
  batch UNWIND target-node creation with per-node literal MERGE, consistent with the
  existing per-edge literal approach.
- GraphQLite: dependency query SQL error. MATCH with mixed $param and literal
  values in property patterns triggers a GraphQLite SQL translation bug
  (no such column: _prop__gql_default_alias_0.value). Moved node_type = 'file'
  from the MATCH pattern to the WHERE clause.
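The WHERE-clause workaround in the last item can be pictured with a small query builder. Everything here is a sketch under assumptions: cypher_literal is a hypothetical helper (not the project's _cypher_escape), and the property names and query shape are illustrative rather than the actual template. The edge label and target_name property are taken from elsewhere in these notes.

```python
def cypher_literal(value: str) -> str:
    """Quote a string as a Cypher literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def build_dependency_query(tenant_id: str, repo_id: str) -> str:
    """Build a query with all filters as literals in WHERE, not in MATCH."""
    return (
        "MATCH (f)-[r:DEPENDS_ON]->() "
        "WHERE f.node_type = 'file' "
        f"AND f.tenant_id = {cypher_literal(tenant_id)} "
        f"AND f.repo_id = {cypher_literal(repo_id)} "
        "RETURN DISTINCT r.target_name AS package"
    )
```

Keeping the MATCH pattern free of property filters means no $param or literal appears inside it, which is the combination the note says triggered the translation bug.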
Fixed — codesteward-mcp
codesteward-mcp setup wrote Claude Code MCP config to wrong file. It was writing to
~/.claude/settings.json, but Claude Code reads MCP servers from ~/.claude.json.
Also added the required "type": "stdio" field to the server config for Claude Code.
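A rough sketch of what registering a stdio server in a JSON config file can look like. The register_server helper is hypothetical, and the mcpServers top-level key is an assumption about the config layout rather than something stated in these notes; only the "type": "stdio" field and the ~/.claude.json path come from the entry above.

```python
import json
from pathlib import Path


def register_server(config_path: Path, name: str, command: str, args: list[str]) -> dict:
    """Merge one stdio server entry into a JSON config, keeping other keys."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})  # assumed key name
    servers[name] = {"type": "stdio", "command": command, "args": args}
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Because the entry is keyed by server name, running it twice with the same name overwrites the entry instead of duplicating it.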
Changed — Known issues — GraphQLite backend
- The referential query NULL target issue from 0.4.1 is now resolved via edge-stored
  target metadata.
- The dependency query SQL error from 0.4.1 is now resolved via WHERE clause workaround.
Docker image
docker pull ghcr.io/bitkaio/codesteward:v0.4.2
Full setup guide: AGENT_SETUP.md
v0.4.1
Fixed — codesteward-graph
- GraphQLite backend: SQLite threading error. graphqlite.connect() now passes
  check_same_thread=False so that asyncio.to_thread() can execute queries in worker
  threads without raising sqlite3.ProgrammingError.
- GraphQLite backend: node properties not persisted. write_nodes replaced
  SET node += n (map-merge syntax unsupported by GraphQLite) with explicit
  ON CREATE SET / ON MATCH SET for each field.
- GraphQLite backend: edges not persisted. write_edges rewritten to work around
  two GraphQLite Cypher-to-SQL translation bugs: (1) UNWIND variable references in MATCH
  property patterns match all nodes instead of filtering, and (2) $param references in
  relationship properties are silently discarded. Edges are now written individually with
  Cypher literal values; target nodes are batch-MERGEd in a separate step.
- GraphQLite backend: count_nodes always returned 0. Replaced MATCH inline
  property filter with a WHERE clause using literal values, avoiding the parameter
  binding issue.
- GraphQLite backend: write_augment_edge not persisting edge properties. Same
  workaround as write_edges: target node MERGE separated from edge CREATE, relationship
  properties written as literals.
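The threading fix in the first item above can be reproduced with the standard-library sqlite3 module, used here as a stand-in because the exact graphqlite.connect() signature is not shown in these notes. The open_db and count_rows helpers and the nodes table are hypothetical.

```python
import asyncio
import sqlite3


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open a connection that may be used from threads other than its creator."""
    # With the default check_same_thread=True, using this connection from a
    # worker thread raises sqlite3.ProgrammingError.
    conn = sqlite3.connect(path, check_same_thread=False)
    conn.execute("CREATE TABLE IF NOT EXISTS nodes (id TEXT PRIMARY KEY)")
    return conn


async def count_rows(conn: sqlite3.Connection) -> int:
    """Run a blocking query in a worker thread via asyncio.to_thread()."""
    def query() -> int:
        return conn.execute("SELECT COUNT(*) FROM nodes").fetchone()[0]

    return await asyncio.to_thread(query)
```

Disabling the same-thread check shifts responsibility for serializing access to the caller, so concurrent writers would still need their own coordination.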
Known issues — GraphQLite backend
These are upstream bugs in the graphqlite package (≤ 0.4.3) that remain unresolved:
- Target node properties (to_name, to_file, to_node_type) return NULL in
  referential query results: the (src)-[r]->(tgt) pattern match finds edges but
  tgt.* properties are inaccessible.
- dependency named query fails with SQL prepare failed: no such column. The
  RETURN DISTINCT + DEPENDS_ON edge type triggers a Cypher-to-SQL translation error.
- Node count mismatch between graph_rebuild reported total and count_nodes result:
  external reference nodes created during edge writing inflate the DB count beyond the
  parser-reported total.
Docker image
docker pull ghcr.io/bitkaio/codesteward:v0.4.1
Full setup guide: AGENT_SETUP.md
v0.4.0
Added — codesteward-graph
- Graph backend abstraction layer (engine/backends/): new GraphBackend ABC with a
  unified async interface for node/edge writes, named queries, and raw query passthrough.
  All tool functions are now backend-agnostic.
- JanusGraph backend (backends/janusgraph.py): Apache 2.0 licensed alternative to Neo4j.
  Connects via Gremlin (Apache TinkerPop gremlinpython>=3.7). Named query templates
  (lexical, referential, semantic, dependency) reimplemented in Gremlin. Raw query
  passthrough uses Gremlin instead of Cypher.
- GraphQLite backend (backends/graphqlite.py): embedded SQLite-based graph database
  (graphqlite>=0.4), no server needed, ideal for local dev via uvx. Speaks Cypher
  (same templates as Neo4j). Database defaults to ~/.codesteward/graph.db; override with
  GRAPHQLITE_DB_PATH.
- Neo4j backend extracted into backends/neo4j.py (same Cypher queries, now behind the
  GraphBackend interface).
- get_backend() factory in backends/__init__.py dispatches by GRAPH_BACKEND value.
- New optional dependency extras: janusgraph (gremlinpython) and graphqlite (graphqlite).
Added — codesteward-mcp
- GRAPH_BACKEND environment variable to select the graph backend: neo4j (default),
  janusgraph, or graphqlite.
- JANUSGRAPH_URL environment variable for the Gremlin Server WebSocket URL.
- GRAPHQLITE_DB_PATH environment variable for the SQLite database file path.
- gremlin raw query type in codebase_graph_query for JanusGraph raw Gremlin passthrough.
  Cypher/Gremlin mismatch is rejected with a clear error.
- docker-compose.janusgraph.yml: drop-in JanusGraph stack (BerkeleyDB JE + Lucene,
  single-node, no external Cassandra/HBase required).
- docker-compose.neo4j.yml: renamed from the previous docker-compose.yml for clarity.
- Docker image now installs the janusgraph extra by default.
- New optional dependency extras on codesteward-mcp: janusgraph and graphqlite
  (re-exported from codesteward-graph).
- Global setup templates: Claude Code (templates/global-claude-code/) and OpenAI Codex
  (templates/global-codex/) with CLAUDE.md, skill file, settings snippet, and AGENTS.md.
- codesteward-mcp setup subcommand: one-time global setup that auto-detects installed
  AI tools (Claude Code, Cursor, Cline, Codex CLI, Gemini CLI), registers the MCP server in
  each tool's global config, and merges workflow instructions into CLAUDE.md / AGENTS.md /
  GEMINI.md. Idempotent, so it is safe to re-run. --uninstall reverses all changes cleanly.
  --backend flag accepts graphqlite (default), neo4j, or janusgraph.
- Cline support: .clinerules template, Cline detection in setup command (cross-platform
  globalStorage path resolution), and Cline section in AGENT_SETUP.md with marketplace install
  instructions via llms-install.md.
- docs/setup/: per-tool setup guides (Claude Code, Cursor & Cline, Codex CLI, Gemini CLI,
  Windsurf / VS Code / Claude Desktop / Continue.dev, Docker + Neo4j / JanusGraph). Referenced
  from README.md Quick Start.
Changed — codesteward-mcp
- GRAPH_BACKEND default changed from neo4j to auto. The backend is auto-detected
  at startup: Neo4j if NEO4J_PASSWORD is set, JanusGraph if JANUSGRAPH_URL is
  non-default, otherwise GraphQLite. Existing deployments with explicit env vars are unaffected.
- Tool response fields renamed: neo4j_connected → backend_connected; new
  graph_backend field in graph_rebuild and graph_status responses.
- _make_async_driver() replaced by _make_backend(): returns a GraphBackend instance
  (or None for stub mode) instead of a raw Neo4j driver.
- GraphBuilder now accepts a backend parameter (the GraphBackend instance) instead of
  neo4j_driver.
- Cypher query templates moved from inline constants in tools/graph.py into each backend's
  query_named() implementation.
- Server instructions updated to describe all three backends and the gremlin query type.
- README.md Quick Start rewritten: leads with uvx codesteward-mcp setup for zero-config
  global setup; manual setup simplified with GraphQLite as default.
- llms-install.md rewritten for GraphQLite default and Cline compatibility.
- All uvx args in templates and docs fixed to use the --from pattern
  (uvx --from "codesteward-mcp[graph-all,graphqlite]" codesteward-mcp). The previous
  pattern failed on macOS where uvx cannot parse extras as a command name.
- Global setup templates (templates/global-claude-code/, templates/global-codex/) updated
  to use GraphQLite as default backend.
- License changed from BSD 3-Clause to Apache 2.0.
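The auto-detection order for GRAPH_BACKEND described at the top of this list can be sketched as a small pure function. The resolve_backend name is hypothetical, and the DEFAULT_JANUSGRAPH_URL value is an assumption (the notes say only "non-default" without giving the default); the environment variable names and the precedence come from the entry above.

```python
from typing import Mapping

# Assumed default; the release notes do not state the actual value.
DEFAULT_JANUSGRAPH_URL = "ws://localhost:8182/gremlin"


def resolve_backend(env: Mapping[str, str]) -> str:
    """Pick a backend name from environment-style settings."""
    explicit = env.get("GRAPH_BACKEND", "auto")
    if explicit != "auto":
        return explicit  # explicit choices are left untouched
    if env.get("NEO4J_PASSWORD"):
        return "neo4j"
    if env.get("JANUSGRAPH_URL", DEFAULT_JANUSGRAPH_URL) != DEFAULT_JANUSGRAPH_URL:
        return "janusgraph"
    return "graphqlite"
```

Passing os.environ as env would apply the same rules to the real process environment.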
Docker image
docker pull ghcr.io/bitkaio/codesteward:v0.4.0
Full setup guide: AGENT_SETUP.md
v0.3.0
Added — codesteward-graph
- Taint-source node and edge emission across all 12 parsers, enabling L1 taint analysis by the
  codesteward-taint binary without requiring a separate source-annotation pass:
  - Python: Flask/Django/FastAPI request.*, WSGI environ, Starlette Request
  - TypeScript/JavaScript: Express req.body/req.query/req.params/req.headers/req.cookies;
    NestJS parameter decorators (@Body, @Param, @Query, @Headers, etc.)
  - Java: Spring MVC @RequestParam, @PathVariable, @RequestBody, @RequestHeader,
    @CookieValue; Jakarta EE @QueryParam, @PathParam, @FormParam, @HeaderParam
  - Go: net/http r.URL.Query(), r.FormValue(), r.Header.Get(), r.Body;
    Gin c.Query(), c.Param(), c.PostForm(), c.GetHeader()
  - Rust: Actix-web/Axum typed extractors: web::Path<T>, web::Query<T>, web::Json<T>,
    web::Form<T>, web::Bytes, web::Multipart, extract::Path, extract::Json, etc.
  - PHP: superglobals ($_GET, $_POST, $_REQUEST, $_FILES, $_COOKIE, $_SERVER);
    Laravel $request->input()/query()/file()/etc.; Symfony property bags ($request->query,
    $request->headers, …); PSR-7 getQueryParams()/getParsedBody()/etc.;
    CodeIgniter4 getGet()/getPost()/getJSON()/etc.
  - C#: ASP.NET Core parameter attributes ([FromQuery], [FromRoute], [FromBody],
    [FromForm], [FromHeader]); HttpRequest property access (Request.Query,
    Request.Form, Request.Headers, Request.Cookies)
  - Kotlin: Spring Boot @RequestParam, @PathVariable, @RequestBody, @RequestHeader,
    @CookieValue; Ktor call.receive*(), call.parameters, call.request.queryParameters;
    Http4k request.query(), request.path(), request.bodyString()
  - Scala: Play Framework request.body.*, request.queryString, request.headers;
    Akka HTTP directives (parameters, entity, formField, headerValueByName, cookie, path)
  - C: CGI getenv() for HTTP env vars (QUERY_STRING, HTTP_COOKIE, etc.), stdin reads
    (fread/fgets/read); Mongoose mg_http_get_var/mg_http_get_header;
    libmicrohttpd MHD_lookup_connection_value
  - C++: all C patterns reused; Crow req.body/req.url_params/req.headers;
    Drogon req->getBody()/req->getParameter()/req->getHeader()/req->getCookie();
    Pistache request.query()/request.resource(); Oat++ getPathVariable()/getQueryParameter()
  - COBOL: no applicable web taint patterns; no change
- tests/test_engine/test_taint_sources.py: new test module with 50+ tests covering taint-source
  detection for C, C++, C#, Rust, PHP, Kotlin, Scala, and NestJS (TypeScript)
Added — codesteward-mcp
- taint_analysis MCP tool: invokes the codesteward-taint Go binary as an async subprocess
  and returns YAML with unsafe/sanitized path counts and a findings list. The tool is registered
  only when the binary is present on PATH (shutil.which); the server starts normally without it.
- TAINT_FLOW edges are now writable via graph_augment (added taint_flow to
  _ALLOWED_EDGE_TYPES).
- Docker image: new taint-fetcher build stage bundles the codesteward-taint binary by
  default (latest GitHub Release). Pin with --build-arg TAINT_VERSION=<version> or omit
  entirely with --build-arg TAINT_VERSION=none.
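The conditional registration of taint_analysis and its async subprocess call, as described in the first item above, might look roughly like this. The registered_tools and run_subprocess helpers are hypothetical, the base tool list is a subset assembled from names that appear in these notes, and no command-line arguments of the real codesteward-taint binary are assumed.

```python
import asyncio
import shutil


def registered_tools(binary_name: str = "codesteward-taint") -> list[str]:
    """List tool names, adding taint_analysis only if the binary is on PATH."""
    tools = ["graph_rebuild", "graph_status", "codebase_graph_query", "graph_augment"]
    if shutil.which(binary_name) is not None:
        tools.append("taint_analysis")
    return tools


async def run_subprocess(binary: str, *args: str) -> tuple[int, str]:
    """Run a binary without blocking the event loop and capture its stdout."""
    proc = await asyncio.create_subprocess_exec(
        binary,
        *args,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    stdout, _stderr = await proc.communicate()
    return proc.returncode, stdout.decode()
```

Checking PATH at registration time means a missing binary simply yields a shorter tool list, so the server can still start.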
Changed — codesteward-mcp
codebase_graph_query semantic template updated from DATA_FLOW to TAINT_FLOW: results
now return source_name, source_file, sink_name, sink_file, cwe, hops, level,
framework instead of function_name, file, line, flow_description. Returns empty
until taint_analysis has been run.
Removed — codesteward-graph
- DATA_FLOW edges are no longer emitted by any parser. Use TAINT_FLOW edges written by the
  codesteward-taint binary for data-flow analysis.
- _extract_semantic_edges() removed from TreeSitterBase (and all callers in python.py,
  typescript.py, java.py).
v0.2.2
Docker image
docker pull ghcr.io/bitkaio/codesteward:v0.2.2
Full setup guide: AGENT_SETUP.md
Full Changelog: v0.2.1...v0.2.2
v0.2.1
Docker image
docker pull ghcr.io/bitkaio/codesteward:v0.2.1
Full setup guide: AGENT_SETUP.md
Full Changelog: v0.2.0...v0.2.1
v0.2.0
Docker image
docker pull ghcr.io/bitkaio/codesteward:v0.2.0
Full setup guide: AGENT_SETUP.md
Full Changelog: v0.1.0...v0.2.0