I was puzzled by getting Avg steps per trajectory: 17.0 after setting MAX_STEPS to 10.
This is because we record both the EnvStep and the AgentStep as a TrajectoryStep:
https://github.com/The-AI-Alliance/AgentLab2/blob/8252e484701c8b6be6b54ee94ecb084bbaddc547/src/agentlab2/core.py#L181-L184
Then we compute the number of steps by simply counting the TrajectoryStep entries:
https://github.com/The-AI-Alliance/AgentLab2/blob/8252e484701c8b6be6b54ee94ecb084bbaddc547/src/agentlab2/experiment.py#L215-L216
In RL, the number of steps usually means the number of interactions with the environment. The max_steps variable follows this meaning, but the internal code and the reported statistic count every TrajectoryStep instead, so the two numbers are not comparable.
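To make the mismatch concrete, here is a minimal sketch. The EnvStep and AgentStep classes below are simplified stand-ins, not the real AgentLab2 definitions, and the strict alternation starting from an initial observation is an assumption; the actual interleaving in the library may differ.

```python
from dataclasses import dataclass


@dataclass
class EnvStep:
    """Hypothetical stand-in: what the environment returned."""
    observation: str


@dataclass
class AgentStep:
    """Hypothetical stand-in: what the agent decided."""
    action: str


def build_trajectory(n_interactions: int) -> list:
    """Assumed layout: an initial EnvStep, then alternating AgentStep/EnvStep."""
    steps = [EnvStep(observation="obs_0")]
    for i in range(n_interactions):
        steps.append(AgentStep(action=f"act_{i}"))
        steps.append(EnvStep(observation=f"obs_{i + 1}"))
    return steps


trajectory = build_trajectory(3)

# Counting every entry mixes agent and env steps together.
total_entries = len(trajectory)

# Counting only agent actions matches the usual RL meaning of a "step".
n_interactions = sum(isinstance(s, AgentStep) for s in trajectory)

print(total_entries, n_interactions)
```

Under these assumptions, counting every entry gives roughly twice the number of interactions, which is the kind of gap that makes the reported average exceed MAX_STEPS.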
A common solution is to use the concept of a Transition (instead of TrajectoryStep), which bundles the env step data and the agent step data for one interaction.
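A rough sketch of what a Transition-based trajectory could look like is below. The Transition fields and the toy rollout loop are illustrative assumptions, not a proposal for the exact AgentLab2 API; the point is only that one list entry corresponds to one environment interaction.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Transition:
    """Hypothetical container: one full agent/environment interaction."""
    observation: Any       # what the agent saw
    action: Any            # what the agent did
    reward: float          # what the environment returned
    next_observation: Any  # what the agent sees next
    done: bool             # whether the episode ended


def rollout(max_steps: int) -> list[Transition]:
    """Toy rollout: the 'environment' is a counter and the agent always adds 1."""
    transitions = []
    obs = 0
    for _ in range(max_steps):
        action = 1
        next_obs = obs + action
        done = next_obs >= 100  # toy termination condition
        transitions.append(Transition(obs, action, 0.0, next_obs, done))
        obs = next_obs
        if done:
            break
    return transitions


transitions = rollout(max_steps=10)

# With one Transition per interaction, the count can never exceed max_steps.
num_steps = len(transitions)
print(num_steps)
```

With this layout, the length of the trajectory and max_steps share the same meaning, so a statistic like the average steps per trajectory stays bounded by the configured limit.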