論文Abstract100本ノック#4 - 十の並列した脳

今回から画像生成問題で頻用されるGANを扱っていく。

参考記事：

何をしたいかで有名どころのGANの種類、派生を整理

16 GAN（2014）
17 DCGAN（2015）
18 PGGAN（2017）
19 ACGAN（2016）
20 SAGAN（2018）

16 GAN（2014）

原文：

Generative Adversarial Networks

Abstract

We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models:

a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G.

語彙：

generative

via

adversarial

distribution

discriminative

訳：

敵対的プロセスを介して生成モデルを推定するための新しいフレームワークを提案する，このプロセスでは2つのモデルを同時に学習する：

データ分布を捉える生成モデルGと，サンプルがGではなく学習データに由来する確率を推定する識別モデルD。

The training procedure for G is to maximize the probability of D making a mistake.

語彙：

procedure

訳：

Gの学習手続きはDがミスをする確率を最大化することである。

This framework corresponds to a minimax two-player game.

語彙：

corresponds

minimax

訳：

このフレームワークは2人ミニマックスゲームに対応している。
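
参考までに，このミニマックスゲームは論文本体では次の価値関数として定式化されている。記号 $p_{\text{data}}$（学習データの分布），$p_z$（ノイズの事前分布），$V$ はAbstractには現れず，論文本体の表記に従ったものである。

```latex
\min_G \max_D V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_z(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

Dは本物と偽物を正しく見分けるほど $V$ を大きくでき，Gは $D(G(z))$ を1に近づけることで $V$ を小さくしようとする。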

In the space of arbitrary functions G and D, a unique solution exists, with G recovering the training data distribution and D equal to 1/2 everywhere.

語彙：

arbitrary

unique solution

訳：

任意の関数GとDの空間では，Gが学習データの分布を復元し，Dが至るところで1/2に等しくなる一意解が存在する。
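
「Dが1/2になる」ことを離散分布で確かめる最小限のスケッチを示す。Gを固定したときの最適な識別器が $D^*(x) = p_{\text{data}}(x) / (p_{\text{data}}(x) + p_g(x))$ になるという結果は論文本体のもので，以下の関数名と4点からなる分布の数値は説明用の仮定である。

```python
import math

def optimal_discriminator(p_data, p_g):
    """各点 x について D*(x) = p_data(x) / (p_data(x) + p_g(x)) を返す。"""
    return [pd / (pd + pg) for pd, pg in zip(p_data, p_g)]

def value_function(p_data, p_g, d):
    """V(D, G) = E_{x~p_data}[log D(x)] + E_{x~p_g}[log(1 - D(x))] を計算する。"""
    real_term = sum(pd * math.log(dx) for pd, dx in zip(p_data, d) if pd > 0)
    fake_term = sum(pg * math.log(1.0 - dx) for pg, dx in zip(p_g, d) if pg > 0)
    return real_term + fake_term

# 説明用の離散分布（仮定）
p_data = [0.1, 0.2, 0.3, 0.4]

# Gが学習データの分布を完全に復元した場合: p_g = p_data
d_star = optimal_discriminator(p_data, p_data)
v_star = value_function(p_data, p_data, d_star)

# Gがまだ分布を復元できていない場合
p_g_bad = [0.4, 0.3, 0.2, 0.1]
d_bad = optimal_discriminator(p_data, p_g_bad)
v_bad = value_function(p_data, p_g_bad, d_bad)

print(d_star)                    # すべての点で 0.5
print(v_star, -math.log(4))      # ほぼ等しい
print(v_bad > v_star)            # 分布がずれていると V は -log 4 より大きい
```

$p_g = p_{\text{data}}$ のとき価値関数は $-\log 4$ となり，分布がずれている間はDが見分けられる分だけ値が大きくなる。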

In the case where G and D are defined by multilayer perceptrons, the entire system can be trained with backpropagation.

語彙：

entire

backpropagation

訳：

GとDが多層パーセプトロンで定義されている場合，システム全体を逆伝播で学習できる。

There is no need for any Markov chains or unrolled approximate inference networks during either training or generation of samples.

語彙：

approximate inference

訳：

学習時にもサンプル生成時にも，マルコフ連鎖や展開された近似推論ネットワークは一切必要ない。

Experiments demonstrate the potential of the framework through qualitative and quantitative evaluation of the generated samples.

語彙：

qualitative

quantitative

訳：

実験では，生成されたサンプルの定性的および定量的評価を通じて，このフレームワークの可能性を示す。

17 DCGAN（2015）

原文：

Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks

Abstract

In recent years, supervised learning with convolutional networks (CNNs) has seen huge adoption in computer vision applications.

語彙：

huge

訳：

近年，畳み込みネットワーク（CNN）による教師あり学習はコンピュータービジョンアプリケーションで大規模に採用されている。

Comparatively, unsupervised learning with CNNs has received less attention.

訳：

それに比べて，CNNによる教師なし学習はあまり注目されていない。

In this work we hope to help bridge the gap between the success of CNNs for supervised learning and unsupervised learning.

訳：

本研究では，教師あり学習におけるCNNの成功と教師なし学習との間にあるギャップを埋める一助となることを目指す。

We introduce a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints, and demonstrate that they are a strong candidate for unsupervised learning.

語彙：

constraints

candidate

訳：

特定のアーキテクチャ上の制約があるdeep convolutional generative adversarial networks（DCGAN）と呼ばれるCNNのクラスを紹介し，教師なし学習の強力な候補であることを示す。
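
Abstractでは「アーキテクチャ上の制約」の中身は述べられていないが，論文本体ではプーリングの代わりにストライド付き畳み込み（generator側は転置畳み込み）を使うことなどが挙げられている。以下は転置畳み込みで特徴マップが4x4から64x64へ拡大していく様子を計算するスケッチである。カーネルサイズ4，ストライド2，パディング1という値はよく使われる実装上の設定を仮定したもので，論文の指定ではない。

```python
def transposed_conv_size(size, kernel=4, stride=2, padding=1):
    """転置畳み込み1層を通した後の一辺のサイズを返す（dilation=1，output_padding=0 を仮定）。"""
    return (size - 1) * stride - 2 * padding + kernel

def generator_sizes(start=4, num_layers=4):
    """start x start の特徴マップから始めて，各層の出力サイズを並べたリストを返す。"""
    sizes = [start]
    for _ in range(num_layers):
        sizes.append(transposed_conv_size(sizes[-1]))
    return sizes

print(generator_sizes())  # [4, 8, 16, 32, 64]
```

この設定では1層ごとに解像度がちょうど2倍になるため，プーリングやアップサンプリング層を別途挟まずに済む。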

Training on various image datasets, we show convincing evidence that our deep convolutional adversarial pair learns a hierarchy of representations from object parts to scenes in both the generator and discriminator.

語彙：

scenes

訳：

さまざまな画像データセットで学習することで，我々の深層畳み込み敵対ペアが，generatorとdiscriminatorの両方において，オブジェクトのパーツからシーンに至る表現の階層を学習するという説得力のある証拠を示す。

Additionally, we use the learned features for novel tasks - demonstrating their applicability as general image representations.

訳：

さらに，学習した特徴を新しいタスクに使用し，汎用的な画像表現としての適用性を示す。

18 PGGAN（2017）

原文：

Progressive Growing of GANs for Improved Quality, Stability, and Variation

Abstract

We describe a new training methodology for generative adversarial networks.

語彙：

methodology

訳：

GANの新しい学習方法を説明する。

The key idea is to grow both the generator and discriminator progressively:

starting from a low resolution, we add new layers that model increasingly fine details as training progresses.

語彙：

progressively

resolution

訳：

重要なアイデアはgeneratorとdiscriminatorの両方を段階的に成長させることである：

低解像度から始めて，学習が進むにつれてますます細かいディテールをモデル化する新しいレイヤーを追加する。
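
段階的な成長のイメージをつかむための簡単なスケッチを示す。開始解像度を4x4とする点と，新しいレイヤーを重み alpha で徐々に混ぜ込む（fade-in）点は論文本体の記述に基づくが，関数名や引数は説明用の仮定である。

```python
def resolution_schedule(start=4, final=1024):
    """start から final まで，解像度を2倍ずつ上げていく段階のリストを返す。"""
    resolutions = [start]
    while resolutions[-1] < final:
        resolutions.append(resolutions[-1] * 2)
    return resolutions

def fade_in(old_value, new_value, alpha):
    """既存経路の出力と新しいレイヤーの出力を alpha (0〜1) で線形に混合する。"""
    return (1.0 - alpha) * old_value + alpha * new_value

schedule = resolution_schedule()
print(schedule)        # [4, 8, 16, 32, 64, 128, 256, 512, 1024]
print(len(schedule))   # 9 段階

# alpha を 0 から 1 へ上げると，出力は既存経路から新しいレイヤーへ滑らかに移る
print([fade_in(0.0, 1.0, a) for a in (0.0, 0.5, 1.0)])  # [0.0, 0.5, 1.0]
```

alpha が0のうちは新しいレイヤーが出力に影響しないため，解像度を上げた直後でも学習済みの低解像度側が急に乱されにくい。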

This both speeds the training up and greatly stabilizes it, allowing us to produce images of unprecedented quality, e.g., CelebA images at 1024^2.

語彙：

stabilizes

unprecedented

訳：

これにより学習が高速化するとともに大幅に安定し，前例のない品質の画像（例：1024^2のCelebA画像）を生成できる。

We also propose a simple way to increase the variation in generated images, and achieve a record inception score of 8.80 in unsupervised CIFAR10.

訳：

また，生成された画像のバリエーションを増やす簡単な方法を提案し，教師なしCIFAR10で過去最高のInception score 8.80を達成する。
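
ここで出てくるinception scoreは評価指標の名前で，$\mathrm{IS} = \exp\bigl(\mathbb{E}_x\,\mathrm{KL}(p(y \mid x)\,\|\,p(y))\bigr)$ と定義される。以下は定義をそのまま計算する最小限のスケッチである。実際には $p(y \mid x)$ は学習済みInceptionネットワークの出力を使うが，ここでは手で用意した確率（仮定）を入力としている。

```python
import math

def inception_score(probs):
    """probs[i][k] = p(y=k | x_i) として exp(平均KL(p(y|x) || p(y))) を返す。"""
    n = len(probs)
    num_classes = len(probs[0])
    # 周辺分布 p(y) は全サンプルの平均
    marginal = [sum(row[k] for row in probs) / n for k in range(num_classes)]
    kl_sum = 0.0
    for row in probs:
        # p = 0 の項は 0 * log 0 = 0 として扱う
        kl_sum += sum(p * math.log(p / marginal[k]) for k, p in enumerate(row) if p > 0)
    return math.exp(kl_sum / n)

# どの画像も曖昧な予測しか得られない場合: スコアは最小値の 1
blurry = [[0.5, 0.5], [0.5, 0.5]]
# 各画像は明確に分類でき，かつクラスが偏らない場合: スコアはクラス数に近づく
sharp_and_diverse = [[1.0, 0.0], [0.0, 1.0]]

print(inception_score(blurry))
print(inception_score(sharp_and_diverse))
```

個々の画像がはっきり分類でき（質），かつ全体として多くのクラスに散らばる（多様性）ほどスコアが高くなる。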

Additionally, we describe several implementation details that are important for discouraging unhealthy competition between the generator and discriminator.

語彙：

implementation

unhealthy

訳：

さらに，generatorとdiscriminator間の不健全な競争を抑制するために重要な，いくつかの実装上の詳細について説明する。

Finally, we suggest a new metric for evaluating GAN results, both in terms of image quality and variation.

語彙：

metric

訳：

最後に，画像の質とバリエーションの両方の観点から，GANの結果を評価するための新しい指標を提案する。

As an additional contribution, we construct a higher-quality version of the CelebA dataset.

訳：

加えて，CelebAデータセットの高品質バージョンを構築する。

19 ACGAN（2016）

原文：

Conditional Image Synthesis With Auxiliary Classifier GANs

Abstract

Synthesizing high resolution photorealistic images has been a long-standing challenge in machine learning.

語彙：

Synthesizing

long-standing

訳：

高解像度の写真のようにリアルな画像を合成することは機械学習において長年の課題であった。

In this paper we introduce new methods for the improved training of generative adversarial networks (GANs) for image synthesis.

訳：

この論文では画像合成のための生成的敵対ネットワーク（GAN）の改善されたトレーニングのための新しい方法を紹介する。

We construct a variant of GANs employing label conditioning that results in 128x128 resolution image samples exhibiting global coherence.

語彙：

employing

coherence

訳：

ラベル条件付けを採用したGANの変種を構築し，大域的な一貫性を示す128x128解像度の画像サンプルを得る。
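
補足として，ACGANの目的関数は論文本体では次の2つの対数尤度で書かれている。$S$ は本物か偽物かの出所，$C$ はクラスラベルを表し，これらの記号はAbstractには現れない論文本体の表記である。

```latex
L_S = \mathbb{E}\bigl[\log P(S = \text{real} \mid X_{\text{real}})\bigr]
    + \mathbb{E}\bigl[\log P(S = \text{fake} \mid X_{\text{fake}})\bigr]

L_C = \mathbb{E}\bigl[\log P(C = c \mid X_{\text{real}})\bigr]
    + \mathbb{E}\bigl[\log P(C = c \mid X_{\text{fake}})\bigr]
```

discriminatorは $L_S + L_C$ を，generatorは $L_C - L_S$ を最大化するように学習する。つまりdiscriminatorは真偽の判定に加えて補助的にクラス分類も行う。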

We expand on previous work for image quality assessment to provide two new analyses for assessing the discriminability and diversity of samples from class-conditional image synthesis models.

語彙：

discriminability

訳：

画質評価に関する先行研究を拡張し，クラス条件付き画像合成モデルから得たサンプルの識別可能性と多様性を評価するための2つの新しい分析を提供する。

These analyses demonstrate that high resolution samples provide class information not present in low resolution samples.

訳：

これらの分析は高解像度サンプルが低解像度サンプルには存在しないクラス情報を提供することを示す。

Across 1000 ImageNet classes, 128x128 samples are more than twice as discriminable as artificially resized 32x32 samples.

語彙：

artificially

訳：

1000個のImageNetクラス全体で，128x128サンプルは人為的にサイズ変更された32x32サンプルより2倍以上識別しやすい。

In addition, 84.7% of the classes have samples exhibiting diversity comparable to real ImageNet data.

訳：

さらに，クラスの84.7％には実際のImageNetデータに匹敵する多様性を示すサンプルがある。

20 SAGAN（2018）

原文：

Self-Attention Generative Adversarial Networks

Abstract

In this paper, we propose the Self-Attention Generative Adversarial Network (SAGAN) which allows attention-driven, long-range dependency modeling for image generation tasks.

訳：

この論文では，画像生成タスクのための注意駆動型の長距離依存性モデリングを可能にするSelf-Attention Generative Adversarial Network（SAGAN）を提案する。

Traditional convolutional GANs generate high-resolution details as a function of only spatially local points in lower-resolution feature maps.

訳：

従来の畳み込みGANは，低解像度の特徴マップ上の空間的に局所的な点のみの関数として高解像度のディテールを生成する。

In SAGAN, details can be generated using cues from all feature locations.

語彙：

cues

訳：

SAGANでは，すべての特徴位置からの手がかりを使用してディテールを生成できる。
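
「すべての位置を参照する」仕組みを示すself-attentionの簡略化したスケッチを示す。論文本体では各位置の特徴に学習可能な1x1畳み込みを適用してから類似度を計算するが，ここではそれらを恒等写像とみなしている点が仮定である。出力に掛けるスカラー gamma は論文本体では0で初期化される学習パラメータである。

```python
import math

def softmax(scores):
    """数値的に安定な softmax。"""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def self_attention(features, gamma=1.0):
    """features は N 個の位置ベクトルのリスト。(出力, attention 行列) を返す。"""
    # 各位置 j について，全位置 i との類似度を softmax で正規化する
    attention = [softmax([dot(query, key) for key in features]) for query in features]
    channels = len(features[0])
    outputs = []
    for j, weights in enumerate(attention):
        # 全位置の特徴を attention の重みで加重平均する
        mixed = [sum(w * feat[c] for w, feat in zip(weights, features)) for c in range(channels)]
        # 残差接続: y_j = gamma * o_j + x_j
        outputs.append([gamma * m + x for m, x in zip(mixed, features[j])])
    return outputs, attention

# 説明用の特徴（3つの位置，2チャネル）
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, attn = self_attention(feats, gamma=1.0)
print(attn[0])  # 位置0が全3位置に割り当てる重み（合計1）
print(out[0])
```

畳み込みと違い，各位置の出力が特徴マップ上の距離に関係なく全位置の加重和になる点が要点である。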

Moreover, the discriminator can check that highly detailed features in distant portions of the image are consistent with each other.

語彙：

consistent

訳：

さらに，discriminatorは画像の離れた部分の非常に詳細な特徴が互いに整合していることを確認できる。

Furthermore, recent work has shown that generator conditioning affects GAN performance.

訳：

さらに，最近の研究ではgeneratorのコンディショニング（数値的な条件の良さ）がGANの性能に影響することが示されている。

Leveraging this insight, we apply spectral normalization to the GAN generator and find that this improves training dynamics.

語彙：

Leveraging

dynamics

訳：

この洞察を活用して，GANのgeneratorにスペクトル正規化を適用したところ，学習ダイナミクスが改善されることがわかった。
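
スペクトル正規化は，重み行列をその最大特異値（スペクトルノルム）で割る操作である。以下はべき乗法で最大特異値を近似する最小限のスケッチで，初期ベクトルを全要素1に固定している点と反復回数50は説明用の仮定である。実際の実装では乱数で初期化したベクトルを保持し，学習ステップごとに少数回だけ反復するのが一般的である。

```python
import math

def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def normalize(vector):
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

def spectral_norm(weight, num_iters=50):
    """べき乗法で weight の最大特異値を近似する。"""
    weight_t = transpose(weight)
    u = normalize([1.0] * len(weight))  # 初期ベクトル（仮定: 全要素1）
    for _ in range(num_iters):
        v = normalize(matvec(weight_t, u))
        u = normalize(matvec(weight, v))
    # sigma は u^T W v で近似される
    return sum(a * b for a, b in zip(u, matvec(weight, v)))

def spectrally_normalize(weight, num_iters=50):
    """重み行列を最大特異値で割り，スペクトルノルムをほぼ1にする。"""
    sigma = spectral_norm(weight, num_iters)
    return [[w / sigma for w in row] for row in weight]

w = [[3.0, 0.0], [0.0, 1.0]]
print(spectral_norm(w))                        # ほぼ 3
print(spectral_norm(spectrally_normalize(w)))  # ほぼ 1
```

正規化後は各層が入力を引き伸ばす倍率の上限がほぼ1に抑えられる。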

The proposed SAGAN achieves the state-of-the-art results, boosting the best published Inception score from 36.8 to 52.52 and reducing Frechet Inception distance from 27.62 to 18.65 on the challenging ImageNet dataset.

訳：

提案するSAGANは最先端の結果を達成し，難易度の高いImageNetデータセットにおいて，これまでに公表されたInception scoreの最高値を36.8から52.52に引き上げ，Frechet Inception distanceを27.62から18.65に低減した。
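
補足として，Frechet Inception distance（FID）は，Inceptionネットワークの特徴空間で実画像と生成画像の分布をそれぞれガウス分布で近似し，両者の距離を測る指標である。以下の $\mu_r, \Sigma_r$（実画像の特徴の平均と共分散）と $\mu_g, \Sigma_g$（生成画像の同じ量）という記号はAbstractには現れない，説明のための表記である。

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\bigl(\Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2}\bigr)
```

Inception scoreは高いほど良く，FIDは低いほど良い。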

Visualization of the attention layers shows that the generator leverages neighborhoods that correspond to object shapes rather than local regions of fixed shape.

語彙：

neighborhoods

rather than

訳：

attentionレイヤーを視覚化すると，generatorは固定形状のローカル領域ではなくオブジェクトの形状に対応する近傍を活用することがわかる。
