Paper Abstract 100 Knocks #6 - 十の並列した脳

Previous post ↓

ryosuke-okubo.hatenablog.com

26 CycleGAN（2017）
27 StarGAN（2017）
28 StackGAN（2016）
29 AnoGAN（2017）
30 3DGAN（2016）

26 CycleGAN（2017）

Original:

Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks

Abstract

Image-to-image translation is a class of vision and graphics problems where the goal is to learn the mapping between an input image and an output image using a training set of aligned image pairs.

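Vocabulary:

aligned

Translation:

Image-to-image translation is a class of vision and graphics problems whose goal is to learn the mapping between an input image and an output image from a training set of aligned image pairs.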

However, for many tasks, paired training data will not be available.

Translation:

For many tasks, however, paired training data is not available.

We present an approach for learning to translate an image from a source domain X to a target domain Y in the absence of paired examples.

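Translation:

We present a method for learning to translate an image from a source domain X to a target domain Y when no paired examples are available.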

Our goal is to learn a mapping G:X→Y such that the distribution of images from G(X) is indistinguishable from the distribution Y using an adversarial loss.

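Vocabulary:

mapping

indistinguishable

Translation:

The goal is to learn a mapping G:X→Y, using an adversarial loss, such that the distribution of images from G(X) cannot be distinguished from the distribution Y.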
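
A minimal sketch of what an adversarial objective measures, assuming the original GAN log-likelihood form (the abstract does not say which variant is used). The function name and the discriminator outputs below are hypothetical values chosen for illustration.

```python
import math

def adversarial_objective(d_real, d_fake):
    """Value of E[log D(y)] + E[log(1 - D(G(x)))] over two batches.

    d_real: discriminator outputs in (0, 1) for real images y from domain Y.
    d_fake: discriminator outputs in (0, 1) for translated images G(x).
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# A discriminator that separates the two sets well scores close to 0.
confident = adversarial_objective([0.9, 0.95], [0.1, 0.05])
# One that cannot tell them apart (0.5 everywhere) scores 2 * log(0.5).
fooled = adversarial_objective([0.5, 0.5], [0.5, 0.5])
```

The discriminator tries to raise this value and G tries to lower it; when G(X) really is indistinguishable from Y, the discriminator can do no better than the `fooled` case.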

Because this mapping is highly under-constrained, we couple it with an inverse mapping F:Y→X and introduce a cycle consistency loss to push F(G(X))≈X (and vice versa).

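Vocabulary:

under-constrained

and vice versa

Translation:

Because this mapping is severely under-constrained, we pair it with an inverse mapping F:Y→X and introduce a cycle consistency loss that pushes F(G(X))≈X (and, in the other direction, G(F(Y))≈Y).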
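
The cycle consistency idea can be sketched in a few lines. This assumes an L1 distance between an image and its reconstruction, which the abstract does not specify; the toy mappings `G`, `F_exact`, and `F_rough` are hypothetical stand-ins for trained generators.

```python
def l1_distance(a, b):
    """Mean absolute difference between two equally sized flat images."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)

def cycle_consistency_loss(G, F, xs, ys):
    """Average |F(G(x)) - x| over xs plus average |G(F(y)) - y| over ys."""
    forward = sum(l1_distance(F(G(x)), x) for x in xs) / len(xs)
    backward = sum(l1_distance(G(F(y)), y) for y in ys) / len(ys)
    return forward + backward

# Hypothetical toy mappings on 3-pixel "images".
G = lambda x: [2.0 * v + 1.0 for v in x]           # X -> Y
F_exact = lambda y: [(v - 1.0) / 2.0 for v in y]   # exact inverse of G
F_rough = lambda y: [v / 2.0 for v in y]           # ignores the offset

xs = [[0.0, 0.5, 1.0]]
ys = [[1.0, 2.0, 3.0]]
loss_exact = cycle_consistency_loss(G, F_exact, xs, ys)
loss_rough = cycle_consistency_loss(G, F_rough, xs, ys)
```

Only the pair whose round trip returns the original image gets zero loss, which is the extra constraint the adversarial loss alone does not provide.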

Qualitative results are presented on several tasks where paired training data does not exist, including collection style transfer, object transfiguration, season transfer, photo enhancement, etc.

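Translation:

Qualitative results are shown on several tasks for which no paired training data exists, including collection style transfer, object transfiguration, season transfer, and photo enhancement.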

Quantitative comparisons against several prior methods demonstrate the superiority of our approach.

Translation:

Quantitative comparisons with several prior methods demonstrate the superiority of our approach.

27 StarGAN（2017）

Original:

StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation

Abstract

Recent studies have shown remarkable success in image-to-image translation for two domains.

Translation:

Recent studies have achieved remarkable success in image-to-image translation between two domains.

However, existing approaches have limited scalability and robustness in handling more than two domains, since different models should be built independently for every pair of image domains.

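Vocabulary:

existing

Translation:

However, existing approaches have limited scalability and robustness when handling more than two domains, because a separate model has to be built independently for every pair of image domains.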

To address this limitation, we propose StarGAN, a novel and scalable approach that can perform image-to-image translations for multiple domains using only a single model.

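Translation:

To address this limitation, we propose StarGAN, a novel and scalable approach that can perform image-to-image translation across multiple domains using only a single model.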
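
One common way to let a single generator serve several domains is to feed it the target domain alongside the image. The sketch below assumes a one-hot domain label replicated over the image and appended as extra channels; the abstract does not describe the mechanism, so the function names and this encoding are assumptions for illustration.

```python
def one_hot(index, num_domains):
    """One-hot vector marking the target domain."""
    return [1.0 if i == index else 0.0 for i in range(num_domains)]

def condition_on_domain(image, target_domain, num_domains):
    """Append one constant channel per domain to a C x H x W image (nested lists)."""
    height, width = len(image[0]), len(image[0][0])
    label_channels = [
        [[bit] * width for _ in range(height)]
        for bit in one_hot(target_domain, num_domains)
    ]
    return image + label_channels

image = [[[0.2, 0.4], [0.6, 0.8]]]  # 1 channel, 2 x 2 pixels
generator_input = condition_on_domain(image, target_domain=2, num_domains=3)
```

With this kind of input, changing `target_domain` is enough to request a different translation from the same network, instead of training one model per domain pair.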

Such a unified model architecture of StarGAN allows simultaneous training of multiple datasets with different domains within a single network.

Translation:

This unified model architecture allows StarGAN to train simultaneously on multiple datasets with different domains within a single network.

This leads to StarGAN's superior quality of translated images compared to existing models as well as the novel capability of flexibly translating an input image to any desired target domain.

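Translation:

As a result, StarGAN produces translated images of higher quality than existing models and also gains the new ability to flexibly translate an input image to any desired target domain.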

We empirically demonstrate the effectiveness of our approach on a facial attribute transfer and a facial expression synthesis tasks.

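Vocabulary:

attribute

Translation:

We empirically demonstrate the effectiveness of our approach on facial attribute transfer and facial expression synthesis tasks.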

28 StackGAN（2016）

Original:

StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks

Abstract

Synthesizing high-quality images from text descriptions is a challenging problem in computer vision and has many practical applications.

Vocabulary:

descriptions

Translation:

Synthesizing high-quality images from text descriptions is a hard problem in computer vision and has many practical applications.

Samples generated by existing text-to-image approaches can roughly reflect the meaning of the given descriptions, but they fail to contain necessary details and vivid object parts.

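Vocabulary:

fail to

Translation:

Samples generated by existing text-to-image approaches can roughly reflect the meaning of the given descriptions, but they lack the necessary details and vivid object parts.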

In this paper, we propose Stacked Generative Adversarial Networks (StackGAN) to generate 256x256 photo-realistic images conditioned on text descriptions.

Translation:

In this paper, we propose Stacked Generative Adversarial Networks (StackGAN), which generate 256x256 photo-realistic images conditioned on text descriptions.

We decompose the hard problem into more manageable sub-problems through a sketch-refinement process.

Vocabulary:

decompose ~ into -

Translation:

We decompose the hard problem into more manageable sub-problems through a sketch-refinement process.

The Stage-I GAN sketches the primitive shape and colors of the object based on the given text description, yielding Stage-I low-resolution images.

Vocabulary:

primitive

yielding

Translation:

The Stage-I GAN sketches the primitive shape and colors of the object based on the given text description and produces low-resolution Stage-I images.

The Stage-II GAN takes Stage-I results and text descriptions as inputs, and generates high-resolution images with photo-realistic details.

Translation:

The Stage-II GAN takes the Stage-I results and the text descriptions as inputs and generates high-resolution images with photo-realistic details.

It is able to rectify defects in Stage-I results and add compelling details with the refinement process.

Vocabulary:

rectify

Translation:

It can correct defects in the Stage-I results and add compelling details through the refinement process.

To improve the diversity of the synthesized images and stabilize the training of the conditional-GAN, we introduce a novel Conditioning Augmentation technique that encourages smoothness in the latent conditioning manifold.

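Vocabulary:

stabilize

Conditioning Augmentation

manifold

Translation:

To improve the diversity of the synthesized images and stabilize the training of the conditional GAN, we introduce a novel Conditioning Augmentation technique that encourages smoothness in the latent conditioning manifold.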
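
A rough sketch of one way to smooth a conditioning space: rather than using a fixed text embedding, sample the condition from a Gaussian around it and penalize the divergence from a standard normal. The abstract does not spell out the mechanism, so the Gaussian form, the function names, and the numbers below are assumptions for illustration.

```python
import math
import random

def sample_condition(mu, log_var, rng):
    """Draw c = mu + sigma * eps with eps ~ N(0, 1) per dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) in closed form."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))

rng = random.Random(0)
# Hypothetical mean and log-variance derived from one text embedding.
mu, log_var = [0.5, -1.0], [0.0, -2.0]
c1 = sample_condition(mu, log_var, rng)
c2 = sample_condition(mu, log_var, rng)
penalty = kl_to_standard_normal(mu, log_var)
```

The same description yields slightly different conditions (`c1` and `c2`) on each draw, so the generator sees a neighborhood of the embedding rather than a single point.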

Extensive experiments and comparisons with state-of-the-arts on benchmark datasets demonstrate that the proposed method achieves significant improvements on generating photo-realistic images conditioned on text descriptions.

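Translation:

Extensive experiments and comparisons with state-of-the-art methods on benchmark datasets demonstrate that the proposed method achieves significant improvements in generating photo-realistic images conditioned on text descriptions.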

29 AnoGAN（2017）

Original:

Unsupervised Anomaly Detection with Generative Adversarial Networks to Guide Marker Discovery

Abstract

Obtaining models that capture imaging markers relevant for disease progression and treatment monitoring is challenging.

Vocabulary:

relevant

Translation:

Obtaining models that capture imaging markers relevant to disease progression and treatment monitoring is challenging.

Models are typically based on large amounts of data with annotated examples of known markers aiming at automating detection.

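Vocabulary:

large amounts of

annotated

aiming at

Translation:

Models are typically built from large amounts of data containing annotated examples of known markers, with the aim of automating detection.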

High annotation effort and the limitation to a vocabulary of known markers limit the power of such approaches.

Vocabulary:

effort

Translation:

The high annotation effort and the restriction to a vocabulary of known markers limit the power of such approaches.

Here, we perform unsupervised learning to identify anomalies in imaging data as candidates for markers.

Vocabulary:

anomalies

Translation:

Here, we perform unsupervised learning to identify anomalies in imaging data as candidates for markers.

We propose AnoGAN, a deep convolutional generative adversarial network to learn a manifold of normal anatomical variability, accompanying a novel anomaly scoring scheme based on the mapping from image space to a latent space.

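Vocabulary:

anatomical

Translation:

We propose AnoGAN, a deep convolutional generative adversarial network that learns a manifold of normal anatomical variability, together with a novel anomaly scoring scheme based on the mapping from image space to a latent space.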
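
The idea of scoring an image through a mapping back to latent space can be sketched as follows. This uses only a reconstruction residual; any additional terms in the paper's score are left out. The toy `generator`, the search settings, and the function names are hypothetical.

```python
def generator(z):
    """Hypothetical toy generator: a 1-D latent z becomes a 2-pixel 'normal' image."""
    return [z, 2.0 * z]

def squared_error(x, z):
    """Squared distance between x and the image generated from z."""
    return sum((a - b) ** 2 for a, b in zip(x, generator(z)))

def anomaly_score(x, steps=100, lr=0.05, eps=1e-4):
    """Search the latent space for the z whose generated image is closest to x,
    then return the remaining L1 residual (near 0 when x fits the learned manifold)."""
    z = 0.0
    for _ in range(steps):
        # Central-difference estimate of d(squared_error)/dz.
        grad = (squared_error(x, z + eps) - squared_error(x, z - eps)) / (2 * eps)
        z -= lr * grad
    return sum(abs(a - b) for a, b in zip(x, generator(z)))

normal_score = anomaly_score([1.0, 2.0])     # lies on the toy manifold
anomalous_score = anomaly_score([1.0, 0.0])  # does not
```

An image the generator can reproduce gets a score near zero, while one it cannot reproduce keeps a large residual and is flagged as a candidate anomaly.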

Applied to new data, the model labels anomalies, and scores image patches indicating their fit into the learned distribution.

Vocabulary:

indicating

Translation:

Applied to new data, the model labels anomalies and scores image patches according to how well they fit the learned distribution.

Results on optical coherence tomography images of the retina demonstrate that the approach correctly identifies anomalous images, such as images containing retinal fluid or hyperreflective foci.

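Vocabulary:

optical coherence tomography

retina

Translation:

Results on optical coherence tomography images of the retina show that the approach correctly identifies anomalous images, such as images containing retinal fluid or hyperreflective foci.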

30 3DGAN（2016）

Original:

Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling

Abstract

We study the problem of 3D object generation.

Translation:

We study the problem of 3D object generation.

We propose a novel framework, namely 3D Generative Adversarial Network (3D-GAN), which generates 3D objects from a probabilistic space by leveraging recent advances in volumetric convolutional networks and generative adversarial nets.

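Translation:

We propose a novel framework, the 3D Generative Adversarial Network (3D-GAN), which generates 3D objects from a probabilistic space by leveraging recent advances in volumetric convolutional networks and generative adversarial nets.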

The benefits of our model are three-fold:

first, the use of an adversarial criterion, instead of traditional heuristic criteria, enables the generator to capture object structure implicitly and to synthesize high-quality 3D objects;

second, the generator establishes a mapping from a low-dimensional probabilistic space to the space of 3D objects, so that we can sample objects without a reference image or CAD models, and explore the 3D object manifold;

third, the adversarial discriminator provides a powerful 3D shape descriptor which, learned without supervision, has wide applications in 3D object recognition.

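Vocabulary:

criterion

implicitly

establishes

Translation:

The model has three benefits:

first, using an adversarial criterion instead of traditional heuristic criteria lets the generator capture object structure implicitly and synthesize high-quality 3D objects;

second, the generator establishes a mapping from a low-dimensional probabilistic space to the space of 3D objects, so objects can be sampled without a reference image or CAD models, and the 3D object manifold can be explored;

third, the adversarial discriminator provides a powerful 3D shape descriptor that is learned without supervision and has wide applications in 3D object recognition.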
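
The second benefit, sampling objects from a low-dimensional latent space, can be illustrated with a toy example. The "generator" here is a hypothetical stand-in (one random linear layer plus a sigmoid), not the volumetric convolutional network from the paper, and the latent size and grid size are arbitrary small values chosen for readability.

```python
import math
import random

def make_toy_generator(latent_dim, grid, rng):
    """Hypothetical stand-in for a trained generator: random linear layer + sigmoid."""
    n_voxels = grid ** 3
    weights = [[rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
               for _ in range(n_voxels)]

    def generate(z):
        flat = [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, z))))
                for row in weights]
        # Reshape the flat occupancy probabilities into grid x grid x grid.
        return [[flat[(i * grid + j) * grid:(i * grid + j + 1) * grid]
                 for j in range(grid)] for i in range(grid)]

    return generate

rng = random.Random(0)
generate = make_toy_generator(latent_dim=4, grid=2, rng=rng)
# Sample a latent vector; no reference image or CAD model is involved.
z = [rng.uniform(0.0, 1.0) for _ in range(4)]
voxels = generate(z)
occupied = [[[p > 0.5 for p in row] for row in plane] for plane in voxels]
```

Drawing a different `z` gives a different occupancy grid, which is what exploring the object manifold amounts to in this simplified picture.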

Experiments demonstrate that our method generates high-quality 3D objects, and our unsupervisedly learned features achieve impressive performance on 3D object recognition, comparable with those of supervised learning methods.

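Translation:

Experiments show that our method generates high-quality 3D objects and that the features learned without supervision achieve impressive performance on 3D object recognition, comparable to that of supervised learning methods.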

Next post ↓

ryosuke-okubo.hatenablog.com