# Notes on the ERNIE and ERNIE 2.0 Papers

### ERNIE 2.0

#### Introduction

Pre-training a language model on large amounts of unannotated data is an effective way to learn general language representations. Traditional methods usually focus on context-independent word embeddings: approaches such as Word2Vec[9] and GloVe[10] learn fixed word embeddings from word co-occurrence statistics on large corpora.

Recently, several studies centered on contextualized language representations have been proposed, and context-dependent representations have achieved state-of-the-art results on various natural language processing tasks. ELMo[1] proposes extracting context-sensitive features from a language model. OpenAI GPT[2] enhances context-sensitive embeddings by fine-tuning the Transformer[11]. BERT[3] adopts a masked language model and additionally introduces a next sentence prediction task into pre-training. XLM[12] integrates two methods to learn cross-lingual language models: an unsupervised method that relies only on monolingual data and a supervised method that leverages parallel bilingual data. MT-DNN[13] achieves better results by jointly learning several supervised tasks in GLUE[14] on top of the pre-trained model, which eventually leads to improvements on other supervised tasks not seen during the multi-task supervised fine-tuning stage. XLNet[5] uses Transformer-XL[15] and proposes a generalized autoregressive pre-training method that learns bidirectional contexts by maximizing the expected likelihood over all permutations of the factorization order.

#### Model Framework

- **Capitalization Prediction**: predict whether a word is capitalized. (Capitalized words usually carry specific semantics; a case-aware model has an advantage in tasks such as NER.)
- **Token-Document Relation Prediction**: predict whether a token appearing in one segment also appears in other segments of the same document. (Models key words, strengthening focus on topical terms.)
- **Sentence Reordering**: a paragraph is randomly split into 1 to m segments, which are then randomly shuffled; the goal is to recover the original order. (Cast as a k-way classification problem with $k=\sum_{n=1}^{m}n!$, aimed at learning relationships between sentences.)
- **Sentence Distance**: a 3-way classification problem: 0 means the two sentences are adjacent, 1 means they appear in the same document but are not adjacent, 2 means they come from different documents.
- **Discourse Relation**: training data is obtained via the unsupervised method proposed by Sileo et al.; the task is to predict the semantic or rhetorical relation between two sentences.
- **IR Relevance**: a 3-way classification problem predicting the relevance between a user query and a search-result title: 0 means strongly relevant (the user clicked), 1 means weakly relevant (shown but not clicked), 2 means irrelevant.
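To make the Sentence Reordering label space concrete, here is a minimal sketch of how a shuffled paragraph could be mapped to one of the $k=\sum_{n=1}^{m}n!$ classes. The paper does not specify this construction; the function names and the permutation-indexing scheme (all permutations of shorter segment counts ordered first) are my own illustrative assumptions.

```python
import itertools
import math
import random

def num_classes(m):
    """Total number of reordering classes: k = sum_{n=1}^{m} n!."""
    return sum(math.factorial(n) for n in range(1, m + 1))

def make_reordering_example(segments, rng=random):
    """Shuffle the segments and return (shuffled_segments, class_label).

    The label indexes the chosen permutation among all permutations of
    1..m segments, so restoring the original order becomes a k-way
    classification problem. (Hypothetical indexing scheme: permutations
    of shorter segment counts occupy the lower label values.)
    """
    n = len(segments)
    perm = list(range(n))
    rng.shuffle(perm)
    # Offset past all permutations of paragraphs with fewer segments.
    offset = sum(math.factorial(i) for i in range(1, n))
    # Position of this permutation among the n! permutations of length n.
    perms = list(itertools.permutations(range(n)))
    label = offset + perms.index(tuple(perm))
    shuffled = [segments[i] for i in perm]
    return shuffled, label
```

For example, with m = 3 there are 1! + 2! + 3! = 9 classes; a paragraph split into 3 segments always receives a label in the range 3 to 8 under this scheme, since labels 0 through 2 are reserved for 1- and 2-segment paragraphs.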