How to get the body text of a PDF #482

albertcity · 2025-01-21T12:36:07Z

albertcity
Jan 21, 2025

Is there an existing issue for this?

I have searched the existing issues

Environment

OS:
Zotero Version:
Plugin Version:

Describe the feature request

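I want to develop a script that automatically extracts the content of a paper (in PDF format) and uses ChatGPT to generate a matching summary and tags. How can I get the full body text of the current PDF from within an action script?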

Describe the solution you'd like

An API for getting the body text of a PDF.

Anything else?

No response

cs-qyzhang · 2025-02-14T06:10:03Z

cs-qyzhang
Feb 14, 2025

I have already written a script that uses an LLM to summarize the body text: #457. This is how I read the PDF:

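// Pick the item's best attachment (false if the item has none).
let pdfAttachment = await item.getBestAttachment();
// Resolve the attachment's path on disk (false if the file is missing).
let pdfPath = await pdfAttachment.getFilePath();
// Read the raw file bytes as a Uint8Array.
let fileData = await IOUtils.read(pdfPath);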

To parse the PDF file, I set up an HTTP service and send the file to that server for parsing.
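
The upload step could look roughly like the sketch below. It is not taken from the script in #457: the endpoint `http://localhost:8000/parse`, the `{ text: ... }` response shape, and the `sendPdfForParsing` helper are all hypothetical. It also assumes a global `fetch` is available in the environment running the script; if it is not, `Zotero.HTTP.request` may be an alternative.

```javascript
// Hypothetical endpoint; point this at wherever your own parsing service listens.
const PARSE_URL = "http://localhost:8000/parse";

// Send raw PDF bytes (e.g. the Uint8Array returned by IOUtils.read) to the service.
// fetchImpl is injectable so the function can be exercised without a running server.
async function sendPdfForParsing(fileData, url = PARSE_URL, fetchImpl = fetch) {
  const response = await fetchImpl(url, {
    method: "POST",
    headers: { "Content-Type": "application/pdf" },
    body: fileData,
  });
  if (!response.ok) {
    throw new Error(`Parsing service returned HTTP ${response.status}`);
  }
  // Assumed response shape: { text: "..." }
  return response.json();
}
```

In an action script this would be called as `const parsed = await sendPdfForParsing(fileData);` right after the file has been read.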


2025-03-17T02:44:11Z

github-actions[bot]
bot Mar 17, 2025

This issue is stale because it has been open for 30 days with no activity.


Loboqui · 2025-03-20T17:00:02Z

Loboqui
Mar 20, 2025

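qyzhang has offered a very good idea! Debugging inside Zotero can be difficult, so to separate the logic that interacts with Zotero items from the actual processing, a better approach is to run an HTTP service locally: Zotero hands the PDF to the service, the service extracts the PDF's body text and calls the AI, then returns the tags to Zotero, where they are recorded.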
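
The last step of that flow, recording the returned tags in Zotero, might be sketched as follows. The `recordTags` helper is hypothetical, and it assumes the service returns a plain array of strings; `item.addTag` and `item.saveTx` are the Zotero item methods it relies on.

```javascript
// item: the Zotero item the action script was triggered on.
// tags: assumed to be an array of strings returned by the local service.
async function recordTags(item, tags) {
  let added = 0;
  for (const tag of tags) {
    const name = String(tag).trim();
    if (name) {
      item.addTag(name); // skip empty entries, add everything else
      added += 1;
    }
  }
  // Persist the changes in a single transaction.
  await item.saveTx();
  return added;
}
```

Keeping this step tiny means almost everything that needs debugging lives in the external service rather than inside Zotero.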

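Seen this way, Zotero acts as a ready-made front end. If you want to do follow-up data analysis, you can even store the results in a database or a CSV file. That makes it easy to extract and aggregate the key information from each paper, which suits meta-analyses or any other scenario where you need to ask the same questions of many papers in bulk.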
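
For the CSV option, the service could serialize each paper's result as one row. This is only a sketch: the field names (`key`, `title`, `tags`, `summary`) and both helper functions are hypothetical, and quoting follows the usual convention of wrapping every field in double quotes and doubling embedded quotes.

```javascript
// Quote one CSV field: wrap it in double quotes and double any embedded quotes.
function csvField(value) {
  return `"${String(value ?? "").replace(/"/g, '""')}"`;
}

// Turn one paper's result into a single CSV line (no trailing newline).
function toCsvRow({ key, title, tags, summary }) {
  return [key, title, (tags || []).join("; "), summary].map(csvField).join(",");
}
```

Appending one such line per paper to a file gives a table that can be loaded directly into a spreadsheet or a data-analysis tool.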

0 replies

windingwind · 2025-03-20T18:34:30Z

windingwind
Mar 20, 2025
Maintainer

await Zotero.PDFWorker.getFullText(item.id)
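
A slightly fuller sketch of how that call might be used in an action script follows. It rests on two assumptions not stated in this thread: that `getFullText` expects the ID of the PDF attachment rather than of its parent item, and that the resolved value exposes the extracted text in a `text` field. The `getPdfText` helper and its `zotero` parameter are hypothetical.

```javascript
// Resolve a PDF attachment for the item, then ask Zotero's PDF worker for its text.
// zotero is injectable for testing; inside an action script the global Zotero is used.
async function getPdfText(item, zotero = Zotero) {
  // An attachment item can be used directly; a regular item needs its best attachment.
  const attachment = item.isAttachment() ? item : await item.getBestAttachment();
  if (!attachment || !attachment.isPDFAttachment()) {
    throw new Error("No PDF attachment found for this item");
  }
  const result = await zotero.PDFWorker.getFullText(attachment.id);
  // Assumption: the result object carries the extracted text in `text`.
  return result.text;
}
```

Inside an action script, `const text = await getPdfText(item);` would then yield the string to pass on to the LLM.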

