
Commit b1dba87

fix: memory leak on chipper processor, beam search parameters, and bbox bug (#258)

This PR intends to solve the following issues:

* Memory leak in DonutProcessor when using large images in numpy format
* Use the right settings for beam search size > 1
* Solve a bug that in very rare cases made the last element predicted by Chipper have a bbox = None
1 parent 63eecdf commit b1dba87

File tree

3 files changed: +11 additions, -8 deletions

CHANGELOG.md

Lines changed: 6 additions & 0 deletions

@@ -1,3 +1,9 @@
+## 0.7.7
+
+* Fix a memory leak in DonutProcessor when using large images in numpy format
+* Set the right settings for beam search size > 1
+* Fix a bug that in very rare cases made the last element predicted by Chipper to have a bbox = None
+
 ## 0.7.6
 
 * fix a bug where invalid zoom factor lead to exceptions; now invalid zoom factors results in no scaling of the image
Lines changed: 1 addition & 1 deletion

@@ -1 +1 @@
-__version__ = "0.7.6" # pragma: no cover
+__version__ = "0.7.7" # pragma: no cover

unstructured_inference/models/chipper.py

Lines changed: 4 additions & 7 deletions
@@ -149,10 +149,7 @@ def predict_tokens(
         with torch.no_grad():
             encoder_outputs = self.model.encoder(
                 self.processor(
-                    np.array(
-                        image,
-                        np.float32,
-                    ),
+                    image,
                     return_tensors="pt",
                 ).pixel_values.to(self.device),
             )
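The removed `np.array(image, np.float32)` call forced a full float32 copy of the input image before handing it to the processor. A rough back-of-the-envelope sketch (hypothetical image dimensions, not from the PR) of why that intermediate copy is expensive for large pages:

```python
# Hypothetical illustration: converting a large uint8 image to float32
# quadruples its memory footprint; passing the image straight to the
# processor avoids allocating that intermediate copy at all.
h, w, c = 4000, 3000, 3            # a large scanned page (made-up dimensions)
uint8_bytes = h * w * c            # 1 byte per channel in the original image
float32_bytes = h * w * c * 4      # 4 bytes per channel after np.array(image, np.float32)
print(float32_bytes // uint8_bytes)  # the float32 copy is 4x larger
```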
@@ -177,9 +174,9 @@ def predict_tokens(
                 encoder_outputs=encoder_outputs,
                 input_ids=self.input_ids,
                 logits_processor=self.logits_processor,
-                do_sample=False,
+                do_sample=True,
                 no_repeat_ngram_size=0,
-                num_beams=5,
+                num_beams=3,
                 return_dict_in_generate=True,
                 output_attentions=True,
                 output_scores=True,
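With `num_beams=3`, generation keeps the three highest-scoring partial hypotheses at every decoding step. As a toy sketch of that core idea only (hypothetical scores and a simplified loop, not the Hugging Face implementation):

```python
import math

# Toy beam search sketch: keep the `num_beams` highest-scoring partial
# sequences at each step, scored by cumulative log-probability.
def beam_search(step_log_probs, num_beams):
    beams = [((), 0.0)]  # (token sequence, cumulative log-prob)
    for log_probs in step_log_probs:  # one {token: log_prob} dict per step
        candidates = [
            (seq + (tok,), score + lp)
            for seq, score in beams
            for tok, lp in log_probs.items()
        ]
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:num_beams]  # prune to the best `num_beams` beams
    return beams

steps = [
    {"a": math.log(0.6), "b": math.log(0.4)},
    {"a": math.log(0.1), "b": math.log(0.9)},
]
best_seq, best_score = beam_search(steps, num_beams=3)[0]
print(best_seq)  # ('a', 'b'): 0.6 * 0.9 = 0.54 is the most probable path
```

Note how greedy decoding would also pick `a` first here, but beam search wins in general because it can recover paths whose first token looked locally worse.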
@@ -304,7 +301,7 @@ def postprocess(
             end = i
 
         # If exited before eos is achieved
-        if start != -1 and start < end and len(parents) > 0:
+        if start != -1 and start <= end and len(parents) > 0:
             slicing_end = end + 1
             string = self.tokenizer.decode(output_ids[start:slicing_end])
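The `start < end` to `start <= end` change matters when the final predicted element spans exactly one token, so that `start == end`. A minimal sketch with made-up token ids (hypothetical values, not from the model) showing how the old condition skipped that element, which is what produced the rare `bbox = None`:

```python
# Hypothetical off-by-one illustration: a single-token final element
# has start == end, so the old strict inequality dropped it entirely.
output_ids = [101, 7592, 102]  # toy token ids; last element spans one token
start, end = 2, 2              # single-token element at the end of the output

old_condition = start != -1 and start < end   # False: element never decoded
new_condition = start != -1 and start <= end  # True: element is decoded

print(old_condition, new_condition)            # False True
print(output_ids[start:end + 1])               # [102]: inclusive slice keeps the token
```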
