07_List All Persons (Person)

A person (Person/Pronoun) is any word in a language that refers to a person, such as a personal name or a pronoun.

Input the sentence string to analyze: "謝長廷用一句馬形容馬英九，馬沒有回應他。"

from ArticutAPI import Articut
from pprint import pprint
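username = "" # Enter the account email you use at https://api.droidtown.co. If left as an empty string, the public quota of 2,000 characters per hour is used by default.
apikey   = "" # Enter the API key you obtain after logging in at https://api.droidtown.co. If left as an empty string, the public quota of 2,000 characters per hour is used by default.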
articut = Articut(username, apikey)

inputSTR = "謝長廷用一句馬形容馬英九，馬沒有回應他。"
resultDICT = articut.parse(inputSTR)
pprint(resultDICT["result_pos"])

List all words that refer to persons, including pronouns

personLIST = articut.getPersonLIST(resultDICT)
pprint(personLIST)

The output is as follows

[[(15, 18, '謝長廷'), (177, 180, '馬英九')], [], [(16, 17, '馬'), (112, 113, '他')], []]
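The nested result can be flattened when only the matched words are needed. The following is a minimal sketch that works on the literal output shown above; it assumes each inner list corresponds to one segment of the parse and that each tuple is (start, end, text). The helper name `flatten_persons` is hypothetical and not part of ArticutAPI.

```python
# Literal copy of the output shown above (no API call is made here).
person_list = [[(15, 18, '謝長廷'), (177, 180, '馬英九')], [], [(16, 17, '馬'), (112, 113, '他')], []]

def flatten_persons(nested):
    """Return (segment_index, start, end, text) for every match.

    Assumption: each inner list belongs to one segment of the parse,
    and each tuple is (start, end, text).
    """
    flat = []
    for segment_index, segment in enumerate(nested):
        for start, end, text in segment:
            flat.append((segment_index, start, end, text))
    return flat

flat_persons = flatten_persons(person_list)
names = [text for _, _, _, text in flat_persons]
print(names)
```

Keeping the segment index alongside each match makes it possible to tell the two occurrences of a name apart if the same word appears in more than one segment.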

.getPersonLIST() includes pronouns by default, so the call above also extracts the two pronouns (16, 17, '馬') and (112, 113, '他'). If you do not need pronouns, pass includePronounBOOL=False to .getPersonLIST() to turn pronoun extraction off.

List all words that refer to persons, excluding pronouns

personLIST = articut.getPersonLIST(resultDICT, includePronounBOOL=False)
pprint(personLIST)

The output is as follows

[[(15, 18, '謝長廷'), (177, 180, '馬英九')], [], [], []]

With this setting, the pronouns (16, 17, '馬') and (112, 113, '他') no longer appear in the result.
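If only the pronouns are of interest, one possible approach is to compare the two results segment by segment. This sketch uses the two literal outputs shown on this page rather than calling the API; the helper name `pronouns_only` is hypothetical and not part of ArticutAPI.

```python
# Literal copies of the two outputs shown on this page.
with_pronouns = [[(15, 18, '謝長廷'), (177, 180, '馬英九')], [], [(16, 17, '馬'), (112, 113, '他')], []]
without_pronouns = [[(15, 18, '謝長廷'), (177, 180, '馬英九')], [], [], []]

def pronouns_only(with_pron, without_pron):
    """Keep, per segment, the spans that appear only when pronouns are included."""
    result = []
    for seg_all, seg_names in zip(with_pron, without_pron):
        name_spans = set(seg_names)
        result.append([span for span in seg_all if span not in name_spans])
    return result

pronoun_list = pronouns_only(with_pronouns, without_pronouns)
print(pronoun_list)
```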

The indices in the output correspond to the input sentence as follows:
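As a rough sketch of that correspondence: the offsets appear to be positions inside the POS-tagged segment strings (the items of resultDICT["result_pos"]) rather than inside the raw input, since 15 is exactly the length of the tag `<ENTITY_person>` and 16 the length of `<ENTITY_pronoun>`. The two tagged strings below are assumptions, showing only the presumed beginning of the first and third segments; the real result_pos strings are longer and may differ. The helper name `slice_span` is hypothetical.

```python
# Assumed beginnings of two tagged segments; not copied from real API output.
tagged_segment_0 = "<ENTITY_person>謝長廷</ENTITY_person>"
tagged_segment_2 = "<ENTITY_pronoun>馬</ENTITY_pronoun>"

def slice_span(tagged, span):
    """Return the substring of a tagged segment addressed by (start, end, text)."""
    start, end, _text = span
    return tagged[start:end]

# Each slice should reproduce the text stored in the tuple itself.
first_name = slice_span(tagged_segment_0, (15, 18, '謝長廷'))
first_pronoun = slice_span(tagged_segment_2, (16, 17, '馬'))
print(first_name, first_pronoun)
```

Under this assumption, tagged[start:end] recovers the matched word directly, which is convenient when the surrounding tags are needed for context.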

Droidtown Linguistic Tech.