Commit 5bf8f29

Update mkdocs.yml and extend-document-automation talk for clarity and consistency
- Changed the title of a talk in mkdocs.yml to better reflect its content: "Building Custom Embedding Models Per Customer" by Manav, Glean.
- Added a new talk on document automation by Eli Badgio, enhancing the navigation structure.
- Reformatted FAQs in the extend-document-automation talk for improved readability and consistency, using bold for questions.
- Deleted outdated talks on document workflows and multi-agent systems to streamline content and focus on relevant topics.
1 parent dc382a2 commit 5bf8f29

6 files changed, +295 -302 lines changed
File renamed without changes.

docs/talks/extend-document-automation.md

Lines changed: 10 additions & 10 deletions
@@ -147,11 +147,11 @@ As AI capabilities continue to advance, the companies that will benefit most are

 ## FAQs

-## What is document automation and why is it important?
+**What is document automation and why is it important?**

 Document automation uses AI to process and extract information from documents, reducing manual effort and errors. This is particularly crucial for enterprises where manual document workflows consume significant resources. For example, a single building construction project can require processing a million documents. Effective document automation can achieve 95%+ extraction accuracy, streamlining operations across industries from finance to healthcare.

-## What are the main types of document processing use cases?
+**What are the main types of document processing use cases?**

 There are three broad categories of document processing:
 - Ingestion: Including RAG context for agents and data mining
@@ -160,29 +160,29 @@ There are three broad categories of document processing:

 The most challenging and often overlooked category is back office processing, where documents serve as critical systems of record with zero tolerance for errors.

-## What's the most common mistake when implementing document automation?
+**What's the most common mistake when implementing document automation?**

 The most common failure is attempting to "one-shot" the automation—trying to replace an entire manual process with automation in a single step. This approach almost never works, even for relatively simple use cases. It sets unrealistic expectations and doesn't account for the complex change management required when transitioning from manual to automated processes.

-## How should I approach understanding my document data?
+**How should I approach understanding my document data?**

 Take time to thoroughly understand both your documents and the existing manual processes around them. Many companies have enormous amounts of tacit knowledge hidden in people's heads—whether in engineering, operations, or domain experts. This knowledge is critical for successful automation.

 One effective approach is to implement a human-in-the-loop flow where AI-processed documents are still routed to humans for review. This helps collect robust production data separate from your evaluation sets. Also consider creating extractors that characterize normalized attributes of your documents (supplier types, layout patterns, terminology variations) to help cluster and understand your document landscape.

-## Why are tailored evaluations so important for document automation?
+**Why are tailored evaluations so important for document automation?**

 Generic benchmarks don't translate well to specific document automation tasks. You need evaluations tailored to your domain, data, and processes. Invest early in creating comprehensive, fine-grained evaluation sets across a variety of document examples.

 The best approach is to involve domain experts (like nurses for healthcare documents or billing specialists for financial documents) to help build robust evaluation sets and optimize extraction schemas. This creates a competitive advantage as you can quickly test and confidently roll out improvements when new models become available.

-## Should I aim for 100% automation from the start?
+**Should I aim for 100% automation from the start?**

 No. It's much more effective to start with partial automation and gradually increase automation rates. Focus on what's called the "true automation rate"—not just accuracy on extracted data, but how often the end-to-end pipeline works perfectly and you can catch errors 100% of the time.

 For example, a system with 85% true automation rate where you can reliably identify the 15% that needs human review is far better than a system with 98% accuracy but no way to identify which 2% contains errors.

-## How should I think about redesigning processes for automation?
+**How should I think about redesigning processes for automation?**

 Rather than trying to create a one-to-one replacement of an existing manual process, rethink the process from scratch with automation in mind. This might involve:
 - Breaking complex workflows into multiple steps
@@ -192,7 +192,7 @@ Rather than trying to create a one-to-one replacement of an existing manual proc

 When possible, design the process around asynchronous processing rather than optimizing for latency, especially for back-office workflows where accuracy is more important than speed.

-## How can I reduce costs for document processing?
+**How can I reduce costs for document processing?**

 Several strategies can help manage costs:
 - Use model distillation to create smaller, faster models for specific tasks
@@ -202,7 +202,7 @@ Several strategies can help manage costs:

 For back-office workflows, prioritize accuracy over latency and design processes that accommodate asynchronous processing.

-## How should I involve domain experts in document automation?
+**How should I involve domain experts in document automation?**

 Domain experts are crucial for successful document automation. Involve them in:
 - Tailoring classifications and schema design to match industry jargon
@@ -212,7 +212,7 @@ Domain experts are crucial for successful document automation. Involve them in:

 Teams that effectively incorporate domain experts gain a significant competitive advantage in document automation.

-## What's the best way to start automating a document workflow?
+**What's the best way to start automating a document workflow?**

 Start by identifying a single, high-priority use case rather than attempting to automate multiple workflows simultaneously. Focus on understanding the data and existing process thoroughly before designing automation.
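
Note on the "true automation rate" FAQ answer in the diff above: the comparison between an 85% and a 98% system can be made concrete with a small calculation. The following is a minimal Python sketch under stated assumptions, not something taken from the talk. The `DocResult` record, its `correct` and `flagged` fields, and the `summarize` helper are all hypothetical names, and treating "true automation rate" as "share of documents that are both correct and not routed to review" is only one possible reading of the answer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DocResult:
    """Hypothetical per-document outcome record (illustration only)."""
    correct: bool  # end-to-end output was fully correct
    flagged: bool  # pipeline routed the document to human review


def summarize(results):
    """Return the auto-pass rate and the undetected-error rate.

    auto_pass_rate: share of documents processed without review and correct.
    undetected_error_rate: share of documents that were wrong but not flagged.
    """
    total = len(results)
    if total == 0:
        return {"auto_pass_rate": 0.0, "undetected_error_rate": 0.0}
    auto_ok = sum(1 for r in results if r.correct and not r.flagged)
    missed = sum(1 for r in results if not r.correct and not r.flagged)
    return {
        "auto_pass_rate": auto_ok / total,
        "undetected_error_rate": missed / total,
    }


# System A: 85 documents pass automatically; the other 15 are all flagged.
system_a = [DocResult(True, False)] * 85 + [DocResult(False, True)] * 15
# System B: 98 documents are correct, 2 are wrong, and nothing is flagged.
system_b = [DocResult(True, False)] * 98 + [DocResult(False, False)] * 2

summary_a = summarize(system_a)
summary_b = summarize(system_b)
print("A:", summary_a)
print("B:", summary_b)
```

Under these assumptions, System A has a lower auto-pass rate but no undetected errors, while System B lets a nonzero share of wrong documents through unflagged, which is the trade-off the FAQ answer describes.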