Update mkdocs.yml and extend-document-automation talk for clarity and consistency
- Changed the title of a talk in mkdocs.yml to better reflect its content: "Building Custom Embedding Models Per Customer" by Manav, Glean.
- Added a new talk on document automation by Eli Badgio, enhancing the navigation structure.
- Reformatted FAQs in the extend-document-automation talk for improved readability and consistency, using bold for questions.
- Deleted outdated talks on document workflows and multi-agent systems to streamline content and focus on relevant topics.
docs/talks/extend-document-automation.md
10 additions & 10 deletions
@@ -147,11 +147,11 @@ As AI capabilities continue to advance, the companies that will benefit most are
## FAQs
-## What is document automation and why is it important?
+**What is document automation and why is it important?**
Document automation uses AI to process and extract information from documents, reducing manual effort and errors. This is particularly crucial for enterprises where manual document workflows consume significant resources. For example, a single building construction project can require processing a million documents. Effective document automation can achieve 95%+ extraction accuracy, streamlining operations across industries from finance to healthcare.
-## What are the main types of document processing use cases?
+**What are the main types of document processing use cases?**
There are three broad categories of document processing:
- Ingestion: Including RAG context for agents and data mining
@@ -160,29 +160,29 @@ There are three broad categories of document processing:
The most challenging and often overlooked category is back office processing, where documents serve as critical systems of record with zero tolerance for errors.
-## What's the most common mistake when implementing document automation?
+**What's the most common mistake when implementing document automation?**
The most common failure is attempting to "one-shot" the automation—trying to replace an entire manual process with automation in a single step. This approach almost never works, even for relatively simple use cases. It sets unrealistic expectations and doesn't account for the complex change management required when transitioning from manual to automated processes.
-## How should I approach understanding my document data?
+**How should I approach understanding my document data?**
Take time to thoroughly understand both your documents and the existing manual processes around them. Many companies have enormous amounts of tacit knowledge hidden in people's heads—whether in engineering, operations, or domain experts. This knowledge is critical for successful automation.
One effective approach is to implement a human-in-the-loop flow where AI-processed documents are still routed to humans for review. This helps collect robust production data separate from your evaluation sets. Also consider creating extractors that characterize normalized attributes of your documents (supplier types, layout patterns, terminology variations) to help cluster and understand your document landscape.
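The attribute-based clustering idea above can be sketched roughly as follows. This is a minimal illustration, not the speaker's implementation: the `DocProfile` class, its field names, and the sample values are all hypothetical, and it assumes an upstream extractor has already produced the normalized attributes.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class DocProfile:
    """Normalized attributes for one document (hypothetical field names)."""
    doc_id: str
    supplier_type: str
    layout: str


def cluster_counts(profiles):
    """Count documents per (supplier_type, layout) bucket."""
    return Counter((p.supplier_type, p.layout) for p in profiles)


# Made-up sample profiles, purely for illustration.
profiles = [
    DocProfile("a1", "freight", "table"),
    DocProfile("a2", "freight", "table"),
    DocProfile("a3", "utility", "letter"),
]
counts = cluster_counts(profiles)
```

Sorting the buckets by size (for example with `counts.most_common()`) gives a quick view of which document families dominate and which rare ones may need dedicated evaluation examples.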
-## Why are tailored evaluations so important for document automation?
+**Why are tailored evaluations so important for document automation?**
Generic benchmarks don't translate well to specific document automation tasks. You need evaluations tailored to your domain, data, and processes. Invest early in creating comprehensive, fine-grained evaluation sets across a variety of document examples.
The best approach is to involve domain experts (like nurses for healthcare documents or billing specialists for financial documents) to help build robust evaluation sets and optimize extraction schemas. This creates a competitive advantage as you can quickly test and confidently roll out improvements when new models become available.
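As a rough sketch of what a fine-grained evaluation might look like, the snippet below scores extractions per field against expert-labeled values. The function name, the field names, and the exact-match comparison are assumptions made for illustration; a real evaluation set would likely need field-specific normalization and matching rules.

```python
def field_accuracy(predictions, labels):
    """Return per-field exact-match accuracy over a labeled evaluation set.

    predictions and labels are parallel lists of dicts mapping
    field name -> extracted (or expert-labeled) value.
    """
    correct = {}
    total = {}
    for pred, gold in zip(predictions, labels):
        for field, expected in gold.items():
            total[field] = total.get(field, 0) + 1
            if pred.get(field) == expected:
                correct[field] = correct.get(field, 0) + 1
    return {field: correct.get(field, 0) / total[field] for field in total}


# Hypothetical expert labels and model outputs for two invoices.
labels = [
    {"invoice_number": "INV-1", "total": "100.00"},
    {"invoice_number": "INV-2", "total": "250.00"},
]
predictions = [
    {"invoice_number": "INV-1", "total": "100.00"},
    {"invoice_number": "INV-2", "total": "25.00"},
]
scores = field_accuracy(predictions, labels)
```

Reporting accuracy per field rather than as one aggregate number makes it easier to see which parts of the schema regress when a new model is swapped in.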
-## Should I aim for 100% automation from the start?
+**Should I aim for 100% automation from the start?**
No. It's much more effective to start with partial automation and gradually increase automation rates. Focus on what's called the "true automation rate"—not just accuracy on extracted data, but how often the end-to-end pipeline works perfectly and you can catch errors 100% of the time.
For example, a system with 85% true automation rate where you can reliably identify the 15% that needs human review is far better than a system with 98% accuracy but no way to identify which 2% contains errors.
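The comparison above can be made concrete with a small sketch. The definitions here are one interpretation, not taken from the talk: a document counts as automated only if it was not flagged for review and every field was correct, and a "silent error" is an unflagged document that contains a mistake. The `Outcome` class and the two sample systems are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    """End-to-end result for one document (hypothetical structure)."""
    flagged_for_review: bool
    fully_correct: bool


def summarize(outcomes):
    """Return the share of documents automated correctly and the share with silent errors."""
    n = len(outcomes)
    automated_ok = sum(
        1 for o in outcomes if not o.flagged_for_review and o.fully_correct
    )
    silent_errors = sum(
        1 for o in outcomes if not o.flagged_for_review and not o.fully_correct
    )
    return {
        "automated_correct_rate": automated_ok / n,
        "silent_error_rate": silent_errors / n,
    }


# System A: 85 documents pass untouched, 15 are reliably routed to humans.
system_a = [Outcome(False, True)] * 85 + [Outcome(True, False)] * 15
# System B: 98 documents are correct, but the 2 bad ones are never flagged.
system_b = [Outcome(False, True)] * 98 + [Outcome(False, False)] * 2

summary_a = summarize(system_a)
summary_b = summarize(system_b)
```

Under these assumptions System A has no silent errors, while System B lets 2% of documents through with undetected mistakes, which is why its higher raw accuracy is less useful in a zero-tolerance back-office setting.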
-## How should I think about redesigning processes for automation?
+**How should I think about redesigning processes for automation?**
Rather than trying to create a one-to-one replacement of an existing manual process, rethink the process from scratch with automation in mind. This might involve:
- Breaking complex workflows into multiple steps
@@ -192,7 +192,7 @@ Rather than trying to create a one-to-one replacement of an existing manual proc
When possible, design the process around asynchronous processing rather than optimizing for latency, especially for back-office workflows where accuracy is more important than speed.
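One possible shape for an asynchronous back-office pipeline is a queue drained by a small pool of workers, sketched below with Python's standard `asyncio` module. The `extract` coroutine is a stand-in for a real extraction call, and the document IDs and concurrency level are illustrative assumptions.

```python
import asyncio


async def extract(doc_id):
    """Placeholder for a slow extraction call (hypothetical)."""
    await asyncio.sleep(0)
    return {"doc_id": doc_id, "status": "extracted"}


async def worker(queue, results):
    """Pull document IDs off the queue until cancelled."""
    while True:
        doc_id = await queue.get()
        try:
            results.append(await extract(doc_id))
        finally:
            queue.task_done()


async def process_batch(doc_ids, concurrency=2):
    """Process a batch of documents without blocking on any single one."""
    queue = asyncio.Queue()
    for doc_id in doc_ids:
        queue.put_nowait(doc_id)
    results = []
    workers = [
        asyncio.create_task(worker(queue, results)) for _ in range(concurrency)
    ]
    await queue.join()  # wait until every queued document is handled
    for task in workers:
        task.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results


results = asyncio.run(process_batch(["d1", "d2", "d3"]))
```

Because callers submit a batch and collect results later, throughput and accuracy checks can take priority over the response time of any single document.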
-## How can I reduce costs for document processing?
+**How can I reduce costs for document processing?**
Several strategies can help manage costs:
- Use model distillation to create smaller, faster models for specific tasks
@@ -202,7 +202,7 @@ Several strategies can help manage costs:
For back-office workflows, prioritize accuracy over latency and design processes that accommodate asynchronous processing.
-## How should I involve domain experts in document automation?
+**How should I involve domain experts in document automation?**
Domain experts are crucial for successful document automation. Involve them in:
- Tailoring classifications and schema design to match industry jargon
@@ -212,7 +212,7 @@ Domain experts are crucial for successful document automation. Involve them in:
Teams that effectively incorporate domain experts gain a significant competitive advantage in document automation.
-## What's the best way to start automating a document workflow?
+**What's the best way to start automating a document workflow?**
Start by identifying a single, high-priority use case rather than attempting to automate multiple workflows simultaneously. Focus on understanding the data and existing process thoroughly before designing automation.