[Bug] Wrong status of document in poisoned queue #987

Open
@KurtP20

Description

Context / Scenario

When ingesting an invalid URL, e.g. ImportWebPageAsync("http://malformed_url"), KM moves the document to the poison queue after several failed attempts:

Microsoft.KernelMemory.Pipeline.Queue.DevTools.SimpleQueues[0] Message '20250124.114916.8130921.4d6c0b1c4b4d41ff84a0cb26ac27abe8' processing failed with exception, max attempts reached, moving to poison queue..

However, the status returned by GetDocumentStatusAsync is the same as it was before the failure. My own log message shows:

Document 416A1AABBD2B38AE93197949C710199DC83695E497F514EFA5097173535AE492 null?:False completed:False empty:False remaining steps:extract, partition, gen_embeddings, save_records ready:False

It would be nice to have an additional field, failed, in DataPipelineStatus, ideally together with a message field explaining why the pipeline failed. Since one most likely wants to delete the failed document, it would also be nice to have an optional deleteUponFailure flag on ImportWebPageAsync (and on the other Import* methods).
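Until such a field exists, a client-side deadline is one possible workaround. The following is only a minimal sketch, not Kernel Memory's API: StatusSnapshot, IngestionOutcome and IngestionWatcher are hypothetical names, and the two delegates are assumed to wrap GetDocumentStatusAsync and whatever delete call the client uses. Because the status carries no failure signal, the sketch treats "not completed before the deadline" as a failure, which can misfire on slow but healthy ingestions.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical, simplified stand-in for the DataPipelineStatus values logged above.
public sealed record StatusSnapshot(bool Completed, bool Empty, int RemainingSteps);

public enum IngestionOutcome { Completed, TimedOut }

public static class IngestionWatcher
{
    // Polls getStatus until the document reports completion or the deadline passes.
    // On timeout the document is deleted, mimicking the proposed deleteUponFailure flag.
    // Both delegates are supplied by the caller; they are assumptions of this sketch.
    public static async Task<IngestionOutcome> WaitOrDeleteAsync(
        Func<CancellationToken, Task<StatusSnapshot?>> getStatus,
        Func<CancellationToken, Task> deleteDocument,
        TimeSpan timeout,
        TimeSpan pollInterval,
        CancellationToken ct = default)
    {
        var deadline = DateTimeOffset.UtcNow + timeout;

        while (DateTimeOffset.UtcNow < deadline)
        {
            StatusSnapshot? status = await getStatus(ct);
            if (status is { Completed: true })
            {
                return IngestionOutcome.Completed;
            }

            await Task.Delay(pollInterval, ct);
        }

        // No failure signal is available, so the elapsed deadline is the only hint.
        await deleteDocument(ct);
        return IngestionOutcome.TimedOut;
    }
}
```

The timeout has to be chosen generously, since a large document that is still being processed looks identical to one sitting in the poison queue. That ambiguity is exactly what a failed field would remove.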

What happened?

The status reports that the URL is still being ingested, even though the document is already in the poison queue.

Importance

a fix would make my life easier

Platform, Language, Versions

KernelMemory 0.95
kernelmemory/service created 2025-01-20T15:41:17.539712455Z
C# / .NET 9

Relevant log output

Labels: bug, triage