Binary file added ._.git
Binary file not shown.
Binary file added ._.gitignore
Binary file not shown.
Binary file added ._LICENSE
Binary file not shown.
Binary file added ._README.md
Binary file not shown.
Binary file added ._SUMMARY.md
Binary file not shown.
Binary file added ._chapter12
Binary file not shown.
Binary file added ._chapter2
Binary file not shown.
Binary file added ._chapter8
Binary file not shown.
Binary file added ._glossary.md
Binary file not shown.
Empty file modified .gitignore
100644 → 100755
Empty file.
Empty file modified LICENSE
100644 → 100755
Empty file.
2 changes: 2 additions & 0 deletions README.md
100644 → 100755
@@ -57,6 +57,8 @@
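- 8.7.2 Rule-based methods
- 8.7.3 Part-of-speech tagging for morphologically rich languages
- [8.8 Summary](chapter8/8.8_Summary.md)
- Chapter 10: Machine Translation
- [Intro](chapter10/intro.md)
- Chapter 12: Constituency Grammars
- [Intro](chapter12/intro.md)
- [12.1 Constituency](chapter12/12.1_Constituency.md)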
Empty file modified SUMMARY.md
100644 → 100755
Empty file.
Binary file added chapter10/._intro.md
Binary file not shown.
11 changes: 11 additions & 0 deletions chapter10/intro.md
@@ -0,0 +1,11 @@
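## 10 Machine Translation and Encoder-Decoder Models
---
This chapter introduces machine translation (MT), the use of computers to translate from one language to another.

Translation in the broad sense, such as translating literature or poetry, is a difficult and fascinating task that bears the strong personal stamp of the translator, as in any other field of human creativity.

Machine translation today focuses on practical tasks. Perhaps its most common current use is information access. We might want to translate instructions found on the web, a recipe for a favorite dish, or the steps for assembling furniture. Or we might want to read a newspaper article, or get information from a website in another language, such as Wikipedia or a government site. MT for information access is probably one of the most common applications of natural language processing technology: Google Translate translates billions of words a day across more than one hundred languages.

Finally, machine translation has recently been used to meet the need for immediate human communication. This includes incremental translation, that is, translating a sentence before it is complete, much as in simultaneous interpretation. Image-centric translation translates the content of images, for example by running OCR on a phone camera photo to translate menus or street signs.

The standard algorithm for machine translation is the encoder-decoder network (also called a seq2seq network), which can be implemented with RNNs or Transformers. In earlier chapters we saw that RNN or Transformer architectures can be used for classification (for example, mapping a sentence to a positive or negative sentiment label for sentiment analysis) or for sequence labeling (for example, assigning each word of an input sequence a part-of-speech tag or a named-entity tag). In part-of-speech tagging, each output tag is directly associated with one input word, so we can model each output tag $y_t$ directly from the input word $x_t$.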
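
The one-to-one alignment between $x_t$ and $y_t$ in sequence labeling can be sketched in a few lines of Python. This is only a toy illustration: `TOY_LEXICON` and `tag_sequence` are hypothetical names, and a lookup table stands in for a trained RNN or Transformer tagger. The point it shows is that the output has exactly one tag per input token.

```python
# Toy illustration of sequence labeling: one output tag y_t per input word x_t.
# TOY_LEXICON is a hypothetical stand-in for a trained tagger, not a real model.
TOY_LEXICON = {
    "i": "PRON",
    "prefer": "VERB",
    "a": "DET",
    "morning": "NOUN",
    "flight": "NOUN",
}


def tag_sequence(tokens, lexicon=TOY_LEXICON, default="X"):
    """Return a list of tags with the same length as `tokens`."""
    return [lexicon.get(token.lower(), default) for token in tokens]


tokens = "I prefer a morning flight".split()
tags = tag_sequence(tokens)

# Each y_t lines up with the x_t at the same position.
for x_t, y_t in zip(tokens, tags):
    print(f"{x_t}\t{y_t}")
```

Because the output length is fixed by the input length, this setup cannot by itself produce a translation whose word count or word order differs from the source sentence.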
Binary file added chapter12/._12.1_Constituency.md
Binary file not shown.
Binary file added chapter12/._12.2_Context-Free-Grammars.md
Binary file not shown.
Binary file not shown.
Binary file added chapter12/._assets
Binary file not shown.
Binary file added chapter12/._intro.md
Binary file not shown.
Empty file modified chapter12/12.1_Constituency.md
100644 → 100755
Empty file.
Empty file modified chapter12/12.2_Context-Free-Grammars.md
100644 → 100755
Empty file.
14 changes: 0 additions & 14 deletions chapter12/12.3_Some-Grammar-Rules-for-English.md
100644 → 100755
@@ -1,15 +1 @@
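## 12.3 Some Grammar Rules for English

In this section we introduce a few more aspects of English phrase structure; for consistency, we continue to focus on sentences from the ATIS domain. Because of space limitations, the discussion is necessarily limited to highlights. Readers are strongly advised to consult a good reference grammar of English, such as Huddleston and Pullum (2002)[^1].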

[^1]: Huddleston, R. and G. K. Pullum. 2002. The Cambridge Grammar of the English Language. Cambridge University Press.

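12.3.1 Sentence-Level Constructions

In the small grammar $\mathscr{L}_0$ introduced earlier, we provided only one sentence-level construction, for declarative sentences such as *I prefer a morning flight*. Among the large number of constructions for English sentences, four are particularly common and important: declaratives, imperatives, yes-no questions, and wh-questions.

Sentences with **declarative** structure generally have a subject noun phrase followed by a verb phrase, as in "I prefer a morning flight". Sentences with this structure have a great many different uses, which we return to in Chapter 24. Here are some examples from the ATIS domain: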

> I want a flight from Ontario to Chicago
> The flight should be eleven a.m. tomorrow
> The return flight should leave at around seven p.m.
Binary file added chapter12/assets/._fig12.1.png
Binary file added chapter12/assets/._fig12.2.png
Binary file added chapter12/assets/._fig12.3.png
Binary file added chapter12/assets/._fig12.4.png
Empty file modified chapter12/assets/fig12.1.png
100644 → 100755
Empty file modified chapter12/assets/fig12.2.png
100644 → 100755
Empty file modified chapter12/assets/fig12.3.png
100644 → 100755
Empty file modified chapter12/assets/fig12.4.png
100644 → 100755
Empty file modified chapter12/intro.md
100644 → 100755
Empty file.
Binary file added chapter2/._2.1_Regular-Expressions.md
Binary file not shown.
Binary file added chapter2/._2.2_Words.md
Binary file not shown.
Binary file added chapter2/._2.3_Corpora.md
Binary file not shown.
Binary file added chapter2/._2.4_Text-Normalization.md
Binary file not shown.
Binary file added chapter2/._2.5_Minimum-Edit-Distance.md
Binary file not shown.
Binary file added chapter2/._2.6_Summary.md
Binary file not shown.
Binary file added chapter2/._assets
Binary file not shown.
Binary file added chapter2/._intro.md
Binary file not shown.
Empty file modified chapter2/2.1_Regular-Expressions.md
100644 → 100755
Empty file.
Empty file modified chapter2/2.2_Words.md
100644 → 100755
Empty file.
Empty file modified chapter2/2.3_Corpora.md
100644 → 100755
Empty file.
Empty file modified chapter2/2.4_Text-Normalization.md
100644 → 100755
Empty file.
Empty file modified chapter2/2.5_Minimum-Edit-Distance.md
100644 → 100755
Empty file.
Empty file modified chapter2/2.6_Summary.md
100644 → 100755
Empty file.
Binary file added chapter2/assets/._fig2.1.png
Binary file added chapter2/assets/._fig2.10.png
Binary file added chapter2/assets/._fig2.11.png
Binary file added chapter2/assets/._fig2.12.png
Binary file added chapter2/assets/._fig2.13.png
Binary file added chapter2/assets/._fig2.14.png
Binary file added chapter2/assets/._fig2.15.png
Binary file added chapter2/assets/._fig2.16.png
Binary file added chapter2/assets/._fig2.17.png
Binary file added chapter2/assets/._fig2.18.png
Binary file added chapter2/assets/._fig2.19.png
Binary file added chapter2/assets/._fig2.2.png
Binary file added chapter2/assets/._fig2.3.png
Binary file added chapter2/assets/._fig2.4.png
Binary file added chapter2/assets/._fig2.5.png
Binary file added chapter2/assets/._fig2.6.png
Binary file added chapter2/assets/._fig2.7.png
Binary file added chapter2/assets/._fig2.8.png
Binary file added chapter2/assets/._fig2.9.png
Binary file added chapter2/assets/._operator-precedence.png
Binary file added chapter2/assets/._table2.4.4.png
Empty file modified chapter2/assets/fig2.1.png
100644 → 100755
Empty file modified chapter2/assets/fig2.10.png
100644 → 100755
Empty file modified chapter2/assets/fig2.11.png
100644 → 100755
Empty file modified chapter2/assets/fig2.12.png
100644 → 100755
Empty file modified chapter2/assets/fig2.13.png
100644 → 100755
Empty file modified chapter2/assets/fig2.14.png
100644 → 100755
Empty file modified chapter2/assets/fig2.15.png
100644 → 100755
Empty file modified chapter2/assets/fig2.16.png
100644 → 100755
Empty file modified chapter2/assets/fig2.17.png
100644 → 100755
Empty file modified chapter2/assets/fig2.18.png
100644 → 100755
Empty file modified chapter2/assets/fig2.19.png
100644 → 100755
Empty file modified chapter2/assets/fig2.2.png
100644 → 100755
Empty file modified chapter2/assets/fig2.3.png
100644 → 100755
Empty file modified chapter2/assets/fig2.4.png
100644 → 100755
Empty file modified chapter2/assets/fig2.5.png
100644 → 100755
Empty file modified chapter2/assets/fig2.6.png
100644 → 100755
Empty file modified chapter2/assets/fig2.7.png
100644 → 100755
Empty file modified chapter2/assets/fig2.8.png
100644 → 100755
Empty file modified chapter2/assets/fig2.9.png
100644 → 100755
Empty file modified chapter2/assets/operator-precedence.png
100644 → 100755
Empty file modified chapter2/assets/table2.4.4.png
100644 → 100755
Empty file modified chapter2/intro.md
100644 → 100755
Empty file.
Binary file added chapter8/._8.1_Mostly-English-Word-Classes.md
Binary file not shown.
Binary file added chapter8/._8.2_Part-of-Speech-Tagging.md
Binary file not shown.
Binary file not shown.
Binary file added chapter8/._8.4_HMM-Part-of-Speech-Tagging.md
Binary file not shown.
Binary file added chapter8/._8.5_Conditional-Random-Fields.md
Binary file not shown.
Binary file not shown.
Binary file added chapter8/._8.7_Further-Details.md
Binary file not shown.
Binary file added chapter8/._8.8_Summary.md
Binary file not shown.
Binary file added chapter8/._assets
Binary file not shown.
Binary file added chapter8/._intro.md
Binary file not shown.
Empty file modified chapter8/8.1_Mostly-English-Word-Classes.md
100644 → 100755
Empty file.
Empty file modified chapter8/8.2_Part-of-Speech-Tagging.md
100644 → 100755
Empty file.
Empty file modified chapter8/8.3_Named-Entities-and-Named-Entity-Tagging.md
100644 → 100755
Empty file.
Empty file modified chapter8/8.4_HMM-Part-of-Speech-Tagging.md
100644 → 100755
Empty file.
Empty file modified chapter8/8.5_Conditional-Random-Fields.md
100644 → 100755
Empty file.
Empty file modified chapter8/8.6_Evaluation-of-Named-Entity-Recognition.md
100644 → 100755
Empty file.
Empty file modified chapter8/8.7_Further-Details.md
100644 → 100755
Empty file.
Empty file modified chapter8/8.8_Summary.md
100644 → 100755
Empty file.
Binary file added chapter8/assets/._fig8.1.png
Binary file added chapter8/assets/._fig8.10.png
Binary file added chapter8/assets/._fig8.11.png
Binary file added chapter8/assets/._fig8.12.png
Binary file added chapter8/assets/._fig8.13.png
Binary file added chapter8/assets/._fig8.14.png
Binary file added chapter8/assets/._fig8.15.png
Binary file added chapter8/assets/._fig8.16.png
Binary file added chapter8/assets/._fig8.2.png
Binary file added chapter8/assets/._fig8.3.png
Binary file added chapter8/assets/._fig8.4.png
Binary file added chapter8/assets/._fig8.5.png
Binary file added chapter8/assets/._fig8.6.png
Binary file added chapter8/assets/._fig8.7.png
Binary file added chapter8/assets/._fig8.8.png
Binary file added chapter8/assets/._fig8.9.png
Binary file added chapter8/assets/._ner_output.png
Binary file added chapter8/assets/._ner_sentence.png
Binary file added chapter8/assets/._pos-tagging-test.png
Empty file modified chapter8/assets/fig8.1.png
100644 → 100755
Empty file modified chapter8/assets/fig8.10.png
100644 → 100755
Empty file modified chapter8/assets/fig8.11.png
100644 → 100755
Empty file modified chapter8/assets/fig8.12.png
100644 → 100755
Empty file modified chapter8/assets/fig8.13.png
100644 → 100755
Empty file modified chapter8/assets/fig8.14.png
100644 → 100755
Empty file modified chapter8/assets/fig8.15.png
100644 → 100755
Empty file modified chapter8/assets/fig8.16.png
100644 → 100755
Empty file modified chapter8/assets/fig8.2.png
100644 → 100755
Empty file modified chapter8/assets/fig8.3.png
100644 → 100755
Empty file modified chapter8/assets/fig8.4.png
100644 → 100755
Empty file modified chapter8/assets/fig8.5.png
100644 → 100755
Empty file modified chapter8/assets/fig8.6.png
100644 → 100755
Empty file modified chapter8/assets/fig8.7.png
100644 → 100755
Empty file modified chapter8/assets/fig8.8.png
100644 → 100755
Empty file modified chapter8/assets/fig8.9.png
100644 → 100755
Empty file modified chapter8/assets/ner_output.png
100644 → 100755
Empty file modified chapter8/assets/ner_sentence.png
100644 → 100755
Empty file modified chapter8/assets/pos-tagging-test.png
100644 → 100755
Empty file modified chapter8/intro.md
100644 → 100755
Empty file.
Empty file modified glossary.md
100644 → 100755
Empty file.