Paper Abstracts: 100 Knocks #9 - 十の並列した脳

Previous post ↓

ryosuke-okubo.hatenablog.com

Starting with this installment, I cover papers on natural language processing.

41 Word2Vec（2013）
42 GloVe（2014）
43 Doc2Vec（2014）
44 Seq2Seq（2014）
45 HRED（2015）

41 Word2Vec（2013）


Original:

Efficient Estimation of Word Representations in Vector Space

Abstract:

We propose two novel model architectures for computing continuous vector representations of words from very large data sets.

Translation:

Two new model architectures are proposed for computing continuous vector representations of words from very large datasets.

The quality of these representations is measured in a word similarity task, and the results are compared to the previously best performing techniques based on different types of neural networks.

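Translation:

The quality of these representations is measured on a word similarity task, and the results are compared with the previously best-performing methods, which are based on various types of neural networks.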

We observe large improvements in accuracy at much lower computational cost, i.e. it takes less than a day to learn high quality word vectors from a 1.6 billion words data set.

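Translation:

Accuracy improves substantially at a much lower computational cost; that is, learning high-quality word vectors from a 1.6-billion-word dataset takes less than a day.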

Furthermore, we show that these vectors provide state-of-the-art performance on our test set for measuring syntactic and semantic word similarities.

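Translation:

Furthermore, these vectors are shown to deliver state-of-the-art performance on the authors' test set for measuring syntactic and semantic word similarity.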

42 GloVe（2014）

Original:

GloVe: Global Vectors for Word Representation

Abstract:

Recent methods for learning vector space representations of words have succeeded in capturing fine-grained semantic and syntactic regularities using vector arithmetic, but the origin of these regularities has remained opaque.

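Vocabulary:

semantic

opaque

Translation:

Recent methods for learning vector space representations of words have succeeded in capturing fine-grained semantic and syntactic regularities through vector arithmetic, but where these regularities come from has remained unclear.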

We analyze and make explicit the model properties needed for such regularities to emerge in word vectors.

Vocabulary:

explicit

emerge

Translation:

We analyze the model properties required for such regularities to emerge in word vectors and make them explicit.

The result is a new global logbilinear regression model that combines the advantages of the two major model families in the literature:

global matrix factorization and local context window methods.

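Vocabulary:

logbilinear

Translation:

The result is a new global log-bilinear regression model that combines the advantages of the two major model families in the literature:

global matrix factorization and local context window methods.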

Our model efficiently leverages statistical information by training only on the nonzero elements in a word-word cooccurrence matrix, rather than on the entire sparse matrix or on individual context windows in a large corpus.

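Vocabulary:

leverages

cooccurrence matrix

Translation:

The model makes efficient use of statistical information by training only on the nonzero elements of a word-word co-occurrence matrix, rather than on the entire sparse matrix or on individual context windows in a large corpus.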
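To make "only the nonzero elements" concrete, here is a minimal sketch of building a word-word co-occurrence table as a dictionary. The abstract does not describe the counting scheme, so plain counts inside a symmetric window are an assumption, and `cooccurrence_counts`, the `window` parameter, and the toy corpus are illustrative names and data, not the authors' code.

```python
from collections import defaultdict


def cooccurrence_counts(sentences, window=2):
    """Count word-word co-occurrences inside a symmetric context window.

    Only pairs that actually occur are stored, so the dict holds exactly
    the nonzero cells of the (conceptually much larger) sparse matrix.
    Assumption: plain counts, with no distance weighting.
    """
    counts = defaultdict(float)
    for tokens in sentences:
        for i, center in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[(center, tokens[j])] += 1.0
    return dict(counts)


# Toy corpus (hypothetical data for illustration only).
corpus = [
    "ice is cold and solid".split(),
    "steam is hot and gas".split(),
]
X = cooccurrence_counts(corpus, window=2)
vocab = sorted({w for s in corpus for w in s})
print(len(X), "nonzero entries out of", len(vocab) ** 2, "cells")
```

A training loop would then iterate over `X.items()` only, never touching the cells that are zero.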

The model produces a vector space with meaningful substructure, as evidenced by its performance of 75% on a recent word analogy task.

Vocabulary:

meaningful

Translation:

The model produces a vector space with meaningful substructure, as shown by its 75% score on a recent word analogy task.
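A word analogy task asks questions of the form "a is to b as c is to ?", and a common way to answer them is vector arithmetic followed by a nearest-neighbor search. The sketch below shows one way this could be scored. The 2-D vectors are hand-made rather than trained, and `solve_analogy` and the two questions are hypothetical, so the printed accuracy says nothing about any real model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def solve_analogy(vectors, a, b, c):
    """Return the word d that best completes 'a is to b as c is to d'."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    # The three question words are excluded from the candidates.
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Hand-made toy vectors (axis 0: royalty, axis 1: gender); not trained.
toy = {
    "king": [1.0, 1.0],
    "queen": [1.0, -1.0],
    "man": [0.0, 1.0],
    "woman": [0.0, -1.0],
    "apple": [-1.0, 0.1],
}
questions = [
    ("man", "king", "woman", "queen"),
    ("king", "man", "queen", "woman"),
]
correct = sum(solve_analogy(toy, a, b, c) == d for a, b, c, d in questions)
accuracy = correct / len(questions)
print(f"analogy accuracy: {accuracy:.0%}")  # analogy accuracy: 100%
```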

It also outperforms related models on similarity tasks and named entity recognition.

Vocabulary:

named entity recognition

Translation:

It also does better than related models on similarity tasks and on named entity recognition.

43 Doc2Vec（2014）


Original:

Distributed Representations of Sentences and Documents

Abstract:

Many machine learning algorithms require the input to be represented as a fixed-length feature vector.

Translation:

Many machine learning algorithms need their input to be expressed as a fixed-length feature vector.

When it comes to texts, one of the most common fixed-length features is bag-of-words.

Translation:

For text, one of the most common fixed-length features is bag-of-words.

Despite their popularity, bag-of-words features have two major weaknesses:

they lose the ordering of the words and they also ignore semantics of the words.

Vocabulary:

Despite

Translation:

Although popular, bag-of-words features have two major weaknesses:

word order is lost, and the semantics of the words are ignored as well.

For example, "powerful," "strong" and "Paris" are equally distant.

Translation:

For example, "powerful," "strong," and "Paris" are all equally far apart.
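Both weaknesses can be checked on a tiny example. This is a minimal sketch with a hypothetical six-word vocabulary. `bag_of_words` and `squared_distance` are illustrative helpers, not part of the paper.

```python
from collections import Counter


def bag_of_words(tokens, vocab):
    """Map a token list to a fixed-length count vector over vocab."""
    counts = Counter(tokens)
    return [counts[w] for w in vocab]


def squared_distance(u, v):
    """Squared Euclidean distance between two count vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


# Hypothetical vocabulary for illustration.
vocab = ["dog", "bites", "man", "powerful", "strong", "Paris"]

# Weakness 1: word order is lost, so both sentences get the same vector.
a = bag_of_words("dog bites man".split(), vocab)
b = bag_of_words("man bites dog".split(), vocab)
print(a == b)  # True

# Weakness 2: semantics are ignored, so a near-synonym is no closer
# than an unrelated word.
powerful = bag_of_words(["powerful"], vocab)
strong = bag_of_words(["strong"], vocab)
paris = bag_of_words(["Paris"], vocab)
print(squared_distance(powerful, strong),
      squared_distance(powerful, paris))  # 2 2
```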

In this paper, we propose Paragraph Vector, an unsupervised algorithm that learns fixed-length feature representations from variable-length pieces of texts, such as sentences, paragraphs, and documents.

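Translation:

This paper proposes Paragraph Vector, an unsupervised algorithm that learns fixed-length feature representations from variable-length pieces of text such as sentences, paragraphs, and documents.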

Our algorithm represents each document by a dense vector which is trained to predict words in the document.

Translation:

The algorithm represents each document as a dense vector that is trained to predict the words in that document.

Its construction gives our algorithm the potential to overcome the weaknesses of bag-of-words models.

Translation:

This construction gives the algorithm the potential to overcome the weaknesses of bag-of-words models.

Empirical results show that Paragraph Vectors outperform bag-of-words models as well as other techniques for text representations.

Translation:

Empirical results show that Paragraph Vectors outperform bag-of-words models and other text representation techniques.

Finally, we achieve new state-of-the-art results on several text classification and sentiment analysis tasks.

Vocabulary:

sentiment

Translation:

Finally, new state-of-the-art results are achieved on several text classification and sentiment analysis tasks.

44 Seq2Seq（2014）


Original:

Sequence to Sequence Learning with Neural Networks

Abstract:

Deep Neural Networks (DNNs) are powerful models that have achieved excellent performance on difficult learning tasks.

Translation:

DNNs are powerful models that have achieved excellent performance on difficult learning tasks.

Although DNNs work well whenever large labeled training sets are available, they cannot be used to map sequences to sequences.

Vocabulary:

whenever

Translation:

DNNs work well whenever a large labeled training set is available, but they cannot be used to map sequences to sequences.

In this paper, we present a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure.

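Vocabulary:

assumptions

Translation:

This paper presents a general end-to-end approach to sequence learning that makes only minimal assumptions about the structure of the sequence.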

Our method uses a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector.

Translation:

The method uses a multilayer Long Short-Term Memory (LSTM) to map the input sequence to a vector of fixed dimensionality, then uses another deep LSTM to decode the target sequence from that vector.

Our main result is that on an English to French translation task from the WMT'14 dataset, the translations produced by the LSTM achieve a BLEU score of 34.8 on the entire test set, where the LSTM's BLEU score was penalized on out-of-vocabulary words.

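Translation:

The main result is that, on an English-to-French translation task from the WMT'14 dataset, the translations produced by the LSTM achieve a BLEU score of 34.8 on the entire test set, with the LSTM's BLEU score being penalized on out-of-vocabulary words.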

Additionally, the LSTM did not have difficulty on long sentences.

Translation:

In addition, the LSTM had no difficulty with long sentences.

For comparison, a phrase-based SMT system achieves a BLEU score of 33.3 on the same dataset.

Vocabulary:

For comparison

Translation:

By comparison, a phrase-based SMT system reaches a BLEU score of 33.3 on the same dataset.

When we used the LSTM to rerank the 1000 hypotheses produced by the aforementioned SMT system, its BLEU score increases to 36.5, which is close to the previous best result on this task.

Vocabulary:

hypotheses

Translation:

When the LSTM is used to rerank the 1000 hypotheses produced by that SMT system, the BLEU score rises to 36.5, close to the previous best result on this task.
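Reranking means rescoring a fixed list of candidate translations with a second model and reordering them. The abstract does not say how the LSTM score is combined with the SMT score, so the sketch below simply assumes a weighted average. `rerank`, `rescore`, the `weight` parameter, and all scores are hypothetical stand-ins for illustration.

```python
def rerank(hypotheses, rescore, weight=0.5):
    """Reorder n-best hypotheses by a weighted mix of two scores.

    hypotheses: list of (text, baseline_score), higher is better.
    rescore:    callable mapping text to a second model's score.
    weight:     share given to the second model (assumed, not from the paper).
    """
    def combined(item):
        text, baseline = item
        return (1 - weight) * baseline + weight * rescore(text)

    return sorted(hypotheses, key=combined, reverse=True)


# Hypothetical n-best list with made-up baseline log-scores, best first.
n_best = [
    ("the cat sat on mat", -1.0),
    ("the cat sat on the mat", -1.2),
    ("cat the sat mat", -3.0),
]
# Made-up scores standing in for a second model's log-probabilities.
toy_model_scores = {
    "the cat sat on mat": -2.5,
    "the cat sat on the mat": -0.8,
    "cat the sat mat": -6.0,
}
reranked = rerank(n_best, toy_model_scores.get)
print(reranked[0][0])  # the cat sat on the mat
```

Here the second-ranked baseline hypothesis moves to the top because the stand-in model prefers it strongly enough to outweigh the baseline gap.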

The LSTM also learned sensible phrase and sentence representations that are sensitive to word order and are relatively invariant to the active and the passive voice.

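Vocabulary:

word order

Translation:

The LSTM also learned sensible phrase and sentence representations that are sensitive to word order and relatively invariant to the active and passive voice.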

Finally, we found that reversing the order of the words in all source sentences (but not target sentences) improved the LSTM's performance markedly, because doing so introduced many short term dependencies between the source and the target sentence which made the optimization problem easier.

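Translation:

Finally, reversing the word order of every source sentence (but not the target sentences) was found to improve the LSTM's performance markedly, because it introduces many short-term dependencies between the source and target sentences and so makes the optimization problem easier.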
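The reversal trick is a preprocessing step, and its effect on dependency length can be illustrated with simple position arithmetic. The sketch below assumes, purely for illustration, that the i-th source word corresponds to the i-th target word and that the model reads the source and then emits the target as one concatenated sequence. `reverse_source` and `alignment_distances` are hypothetical helpers.

```python
def reverse_source(pairs):
    """Reverse each source token list; leave target token lists unchanged."""
    return [(list(reversed(src)), list(tgt)) for src, tgt in pairs]


def alignment_distances(length, reverse):
    """Steps between each source word and its target word.

    Assumes the i-th source word aligns with the i-th target word and
    that target tokens directly follow the source tokens.
    """
    distances = []
    for i in range(length):
        src_pos = (length - 1 - i) if reverse else i
        tgt_pos = length + i
        distances.append(tgt_pos - src_pos)
    return distances


pairs = [("I am here".split(), "je suis ici".split())]
print(reverse_source(pairs))
# [(['here', 'am', 'I'], ['je', 'suis', 'ici'])]

print(alignment_distances(3, reverse=False))  # [3, 3, 3]
print(alignment_distances(3, reverse=True))   # [1, 3, 5]
```

Under these assumptions the average distance stays the same, but the first words of the source and target end up adjacent, which is one way to picture the "short term dependencies" the abstract mentions.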

45 HRED（2015）


Original:

A Hierarchical Recurrent Encoder-Decoder For Generative Context-Aware Query Suggestion

Abstract:

Users may strive to formulate an adequate textual query for their information need.

Vocabulary:

strive

adequate

Translation:

Users may have to make an effort to formulate a textual query that adequately expresses their information need.

Search engines assist the users by presenting query suggestions.

Translation:

Search engines help users by presenting query suggestions.

To preserve the original search intent, suggestions should be context-aware and account for the previous queries issued by the user.

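Vocabulary:

context-aware

account for

Translation:

To preserve the original search intent, suggestions should be context-aware and take into account the queries the user has issued previously.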

Achieving context awareness is challenging due to data sparsity.

Translation:

Achieving context awareness is difficult because of data sparsity.

We present a probabilistic suggestion model that is able to account for sequences of previous queries of arbitrary lengths.

Vocabulary:

arbitrary

Translation:

We present a probabilistic suggestion model that can take into account sequences of previous queries of arbitrary length.

Our novel hierarchical recurrent encoder-decoder architecture allows the model to be sensitive to the order of queries in the context while avoiding data sparsity.

Translation:

The novel hierarchical recurrent encoder-decoder architecture makes the model sensitive to the order of the queries in the context while avoiding data sparsity.