Better approach when the SWA window is exceeded: simply refill the window. This is not 100% correct, but it is good enough for fast-forward users. Disable FF or increase the window if it is not good enough.
 const int swa_pos_min = llama_memory_seq_pos_min(llama_get_memory(llama_ctx_v4), 0); //this is the furthest back we can rewind to.
 int goal_npast = ComputeSharedPrefixLength(current_context_tokens,embd_inp); //this is where we want to rewind to.
 goal_npast -= 4;
 goal_npast = goal_npast < 0 ? 0 : goal_npast;
 if (swa_pos_min < 0 || goal_npast <= swa_pos_min) {
-    triggerff = false;
+    ff_swa_retain_amount = kcpp_active_swa_size;
     if (debugmode==1 && !is_quiet)
     {
-        printf("\nNote: Context cannot be reused (Desired n_past=%d, SWA lowest n_past=%d), doing a full reprocess... to avoid this, disable SWA or increase SWA padding)\n", goal_npast, swa_pos_min);
+        printf("\nNote: SWA context cannot be reused (Desired n_past=%d, SWA lowest n_past=%d), to avoid this, disable SWA or increase SWA padding), output may degrade.\n", goal_npast, swa_pos_min);
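For readers who want to experiment with the decision this hunk makes, the following is a minimal standalone sketch, not the project's actual code. `RewindPlan`, `shared_prefix_length`, and `plan_rewind` are hypothetical names introduced here for illustration; `shared_prefix_length` is only a stand-in for `ComputeSharedPrefixLength`, and treating a retain amount of 0 as "ordinary reuse" is an assumption, since the hunk does not show how `ff_swa_retain_amount` is consumed later.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical result type for illustration; not part of the original code.
struct RewindPlan {
    int goal_npast;        // position we would like to rewind to
    int swa_retain_amount; // assumption: 0 means the cache can be reused as-is
};

// Hypothetical stand-in for ComputeSharedPrefixLength: counts leading tokens
// that are identical in both sequences.
int shared_prefix_length(const std::vector<int>& a, const std::vector<int>& b) {
    const std::size_t n = std::min(a.size(), b.size());
    std::size_t i = 0;
    while (i < n && a[i] == b[i]) {
        ++i;
    }
    return static_cast<int>(i);
}

// Mirrors the condition in the hunk: back off 4 tokens from the shared prefix,
// clamp at 0, and if the target is at or before the oldest position still held
// in the SWA cache (or that position is unknown, i.e. negative), request a
// refill of the whole window instead of giving up on fast-forward.
RewindPlan plan_rewind(const std::vector<int>& cached_tokens,
                       const std::vector<int>& incoming_tokens,
                       int swa_pos_min,
                       int swa_size) {
    const int goal = std::max(shared_prefix_length(cached_tokens, incoming_tokens) - 4, 0);
    RewindPlan plan{goal, 0};
    if (swa_pos_min < 0 || goal <= swa_pos_min) {
        plan.swa_retain_amount = swa_size;
    }
    return plan;
}
```

Calling `plan_rewind` with two token vectors, the value reported for the lowest cached position, and the window size returns the rewind target together with the amount to refill. The removed line (`triggerff = false;`) would correspond to abandoning fast-forward in the same branch.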