When using tokscale to analyze Codex CLI usage, I noticed that a long-running Codex session crossing calendar months appears to be over-counted. It
looks like historical messages/context from the resumed session may be counted again instead of only counting actual per-turn token usage events.
Environment
- Tool: tokscale
- Client: Codex CLI
- Data source: ~/.codex/sessions/YYYY/MM/DD/rollout-*.jsonl
Expected behavior
Codex usage should be aggregated from actual usage events:
.type == "event_msg" and .payload.type == "token_count"
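As a rough illustration of that filter, the sketch below reads rollout lines and keeps only the token_count events. The function name iter_token_count_events is hypothetical and is not taken from tokscale; the only fields it relies on are the type and payload.type fields named above.

```python
import json
from typing import Iterable, Iterator


def iter_token_count_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield only the JSONL records that report token usage."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or partially written lines
        if not isinstance(record, dict):
            continue
        payload = record.get("payload")
        if (
            record.get("type") == "event_msg"
            and isinstance(payload, dict)
            and payload.get("type") == "token_count"
        ):
            yield record
```

Transcript entries such as response_item never pass this predicate, so their content cannot contribute to usage totals.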
For each token_count event, usage should come from payload.info.last_token_usage when available. If only payload.info.total_token_usage exists, usage
should be calculated as a delta from the previous cumulative value within the same rollout file.
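A minimal sketch of that rule follows. The helper usage_for_event and the counter names in FIELDS are assumptions made for illustration and should be checked against real rollout files. Clamping negative deltas to zero is also an assumption, intended to cover a cumulative counter that resets within a file.

```python
from typing import Optional, Tuple

# Assumed counter names; adjust to whatever the rollout files actually contain.
FIELDS = (
    "input_tokens",
    "cached_input_tokens",
    "output_tokens",
    "reasoning_output_tokens",
)


def usage_for_event(
    info: Optional[dict], previous_total: Optional[dict]
) -> Tuple[dict, Optional[dict]]:
    """Return (usage for this event, cumulative total to carry forward)."""
    info = info or {}
    last = info.get("last_token_usage")
    total = info.get("total_token_usage")
    if last is not None:
        # Preferred path: the event already reports per-turn usage.
        usage = {f: int(last.get(f, 0)) for f in FIELDS}
    elif total is not None:
        # Fallback: difference against the previous cumulative value.
        prev = previous_total or {}
        usage = {
            f: max(int(total.get(f, 0)) - int(prev.get(f, 0)), 0) for f in FIELDS
        }
    else:
        usage = {f: 0 for f in FIELDS}
    carried = total if total is not None else previous_total
    return usage, carried
```

The caller would reset previous_total to None at the start of each rollout file, so that deltas are never computed across files.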
Daily/monthly buckets should use the timestamp of each token_count event, not the session file date or session_meta.timestamp.
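To illustrate bucketing by event time, the sketch below sums tokens per month from (timestamp, tokens) pairs. It assumes each event carries an ISO 8601 timestamp string and that buckets are computed in UTC; both are assumptions, and month_key and bucket_by_month are hypothetical names. A tool that reports in local time would convert to the local zone instead.

```python
from collections import defaultdict
from datetime import datetime, timezone


def month_key(timestamp: str) -> str:
    """Map an ISO 8601 timestamp to a 'YYYY-MM' bucket in UTC."""
    # Older Python versions do not accept a trailing 'Z' in fromisoformat.
    parsed = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # treat naive values as UTC
    return parsed.astimezone(timezone.utc).strftime("%Y-%m")


def bucket_by_month(events):
    """Sum tokens per month from (timestamp, tokens) pairs."""
    totals = defaultdict(int)
    for timestamp, tokens in events:
        totals[month_key(timestamp)] += tokens
    return dict(totals)
```

With this approach, a session that starts on the last day of one month and continues into the next contributes to each month only the tokens reported by events that occurred in that month.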
Actual behavior
For a Codex session that spans multiple months or is resumed later, tokscale appears to count historical messages/context again, which duplicates
usage in the monthly totals.
Why this matters
Codex transcript files contain more than user-entered messages. Entries such as response_item with role == "user" may include injected context,
AGENTS.md instructions, resumed history, or other system/context payloads. Counting those as new usage can double-count historical context.
Suggested fix
For Codex parsing:
- Only treat event_msg entries with payload.type == "token_count" as token usage records.
- Prefer payload.info.last_token_usage.
- If last_token_usage is missing, compute the delta from payload.info.total_token_usage within the same rollout file.
- Assign usage to daily/monthly buckets by the token_count event timestamp.
- Avoid deriving usage from user/message transcript entries such as response_item.payload.role == "user".
This should prevent resumed or cross-month Codex sessions from duplicating historical context in usage totals.
Note:
This conclusion was produced by analyzing local Codex CLI session JSONL records with Codex itself. The analysis compared Codex transcript event
types and found that reliable usage accounting should be based on event_msg entries with payload.type == "token_count", rather than transcript
message entries that may contain restored history or injected context.