Significantly reduce token use and LLM confusion: SKIP includeFileDetails!
#1078
dkamins
started this conversation in
Feature Requests
Replies: 1 comment
-
For reference, here is an overview of the code path discussed in the OP.

The main task loop init:

    private async initiateTaskLoop(userContent: UserContent): Promise<void> {
        let nextUserContent = userContent
        let includeFileDetails = true
        while (!this.abort) {
            const didEndLoop = await this.recursivelyMakeClineRequests(nextUserContent, includeFileDetails)
            includeFileDetails = false // we only need file details the first time

calls:

    async recursivelyMakeClineRequests(
        userContent: UserContent,
        includeFileDetails: boolean = false,
    ): Promise<boolean> {

which calls:

    async loadContext(userContent: UserContent, includeFileDetails: boolean = false) {

which calls:

    async getEnvironmentDetails(includeFileDetails: boolean = false) {

which includes the file listing, with a hard-coded 200-file limit:

    if (includeFileDetails) {
        details += `\n\n# Current Working Directory (${cwd.toPosix()}) Files\n`
        const isDesktop = arePathsEqual(cwd, path.join(os.homedir(), "Desktop"))
        if (isDesktop) {
            // don't want to immediately access desktop since it would show permission popup
            details += "(Desktop files not shown automatically. Use list_files to explore if needed.)"
        } else {
            const [files, didHitLimit] = await listFiles(cwd, true, 200)
            const result = formatResponse.formatFilesList(cwd, files, didHitLimit)
            details += result
        }
    }

This adds ~5KB of text, recursively listing files in the project root and, when the limit is hit, ending with a truncation notice. That notice comes from:

    formatFilesList: (absolutePath: string, files: string[], didHitLimit: boolean): string => {
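To make the effect of the 200-entry cap concrete, here is a toy sketch. `capListing` is a hypothetical stand-in, not the real `listFiles`; it assumes paths are sorted alphabetically before the cap is applied, and the directory sizes are invented for illustration.

```typescript
// Hypothetical stand-in for a capped listing: sort the paths, keep the first `limit`.
// The real listFiles may order results differently; this only illustrates the cap.
function capListing(paths: string[], limit: number): [string[], boolean] {
  const sorted = paths.slice().sort();
  return [sorted.slice(0, limit), sorted.length > limit];
}

// Count how many kept paths fall under each top-level directory.
function countByTopDir(paths: string[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const p of paths) {
    const slash = p.indexOf("/");
    const top = slash === -1 ? "." : p.substring(0, slash);
    counts[top] = (counts[top] || 0) + 1;
  }
  return counts;
}

// Build `n` invented file paths under `dir`.
function makeFiles(dir: string, n: number): string[] {
  const out: string[] = [];
  for (let i = 0; i < n; i++) {
    out.push(dir + "/file" + i + ".ts");
  }
  return out;
}

// Invented project layout: 300 files in src/, 120 in data/, 70 in dev/.
const project = makeFiles("src", 300).concat(makeFiles("data", 120), makeFiles("dev", 70));
const [kept, didHitLimit] = capListing(project, 200);
const keptCounts = countByTopDir(kept);
console.log(keptCounts, didHitLimit);
```

Under these assumptions, `data/` and `dev/` fill 190 of the 200 slots, so only 10 of the 300 `src/` files make it into the prompt.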
-
Observation
Looking at the prompts generated for all new tasks, you can see Roo is including a recursive list of (up to 200) files in the project directory. When it hits that limit, it kindly informs the LLM:
This file list may make sense for small projects to give the LLM context of what is available.
Problem
For large projects this is worse than useless because the arbitrary list of files included is almost always irrelevant to the task at hand. This means wasting tokens (cost) and filling the context window up with information distracting to the LLM.
For example, in one project, my file list is consistently populated almost entirely with files from the project's /dev directory (build and test tools) and /data directory (test and runtime data), before it even gets alphabetically to /src, where it barely scratches the surface.
I clocked this file list text at 5604 characters and 1790 (!) tokens. I'm sure some projects with longer paths or filenames will be even bigger.
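As a rough worked example of those numbers, the sketch below derives the characters-per-token ratio from the measured figures and scales it. The helper names are hypothetical, and the ratio is specific to this one listing; other tokenizers and other path styles will give different results.

```typescript
// Figures reported above for one project's file list.
const fileListChars = 5604;
const fileListTokens = 1790;

// Average characters per token for this particular listing (about 3.13).
const charsPerToken = fileListChars / fileListTokens;

// Rough token estimate for a listing of a different size,
// assuming the same characters-per-token ratio holds.
function estimateTokens(chars: number): number {
  return Math.round(chars / charsPerToken);
}

// The listing is sent once per new task, so the overhead scales with task count.
function overheadTokens(taskCount: number): number {
  return taskCount * fileListTokens;
}

console.log(charsPerToken.toFixed(2));
console.log(estimateTokens(2 * fileListChars));
console.log(overheadTokens(100));
```

With these figures, a hundred new tasks spend 179,000 input tokens on the file list alone.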
Proposal
I am proposing that either:
This would save money while making the agent faster and smarter. Win win win.
Implementation Details
(Internally, Roo does this because of setup code in initiateTaskLoop, where it sets includeFileDetails = true for the first iteration. I will follow up this post with more details of the code path to illustrate this.)