Replies: 1 comment
Hi Matt,
Hi folks - firstly, I want to say RAGFlow is an awesome project. After only 24 hours of using it I already have a few ideas for things to contribute, and I am quickly seeing some great results! Thank you.
I work in EdTech, and a lot of the documents we are doing RAG over are college lecture notes and presentations that contain a lot of math formulas. I am seeing some great results where RAGFlow accurately identifies the whole formula, but I am also seeing many examples where the coordinates seem to be off, and then the detected math makes no sense. I have not experimented with many model providers; I just plugged in gpt-4o for testing. For the PDF parser I am using DeepDoc.
Here is a bad example:
You can see that the bounding box was positioned in a way that truncated the formulas.
Ideally, I would love a way to increase how often it produces a result like this (a good example):
You can see it identifies the embedded mathematical notation cleanly.
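A rough sketch of the difference between the two cases, in plain Python. All names here (`Box`, `contains`, `pad`) are hypothetical and for illustration only; this is not RAGFlow's or DeepDoc's actual code, and the coordinates are made up. It only shows what "truncated" means in terms of box coordinates, and how a crop padded by a margin (clamped to the page) could cover a formula that a tight box cuts off.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in page coordinates (hypothetical helper type)."""
    x0: float
    y0: float
    x1: float
    y1: float


def contains(outer: Box, inner: Box) -> bool:
    """Return True if `inner` lies fully inside `outer`."""
    return (
        outer.x0 <= inner.x0
        and outer.y0 <= inner.y0
        and outer.x1 >= inner.x1
        and outer.y1 >= inner.y1
    )


def pad(box: Box, margin: float, page_w: float, page_h: float) -> Box:
    """Grow `box` by `margin` on every side, clamped to the page bounds."""
    return Box(
        max(0.0, box.x0 - margin),
        max(0.0, box.y0 - margin),
        min(page_w, box.x1 + margin),
        min(page_h, box.y1 + margin),
    )


# Made-up numbers: the true extent of a formula and a detected box that is too tight.
formula = Box(100, 200, 300, 240)
detected = Box(110, 205, 290, 240)

truncated = not contains(detected, formula)              # the "bad example" case
padded = pad(detected, margin=12, page_w=612, page_h=792)
recovered = contains(padded, formula)                    # padded crop covers the formula
```

Whether a fixed margin helps in practice depends on how far off the detected coordinates are; a large margin also pulls in neighbouring text.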
In my application I am using RAG to help an Agent choose images and formulas from the students' notes when explaining concepts, so clean detection of formulas is really important to us.
Any ideas on what I could do to improve the results?
Thanks,
Matt