2020-01-20

Paper Abstract 100 Knocks #20

Machine learning papers

Previous post ↓

ryosuke-okubo.hatenablog.com

96 PPO (2017)

Original paper:

Proximal Policy Optimization Algorithms

Abstract:

We propose a new family of policy gradient methods for reinforcement learning, which alternate between sampling data through interaction with the environment, and optimizing a "surrogate" objective function using stochastic gradient ascent.

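Translation:

We propose a new family of policy gradient methods for reinforcement learning that alternate between sampling data through interaction with the environment and optimizing a "surrogate" objective function by stochastic gradient ascent.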

Whereas standard policy gradient methods perform one gradient update per data sample, we propose a novel objective function that enables multiple epochs of minibatch updates.

Translation:

Standard policy gradient methods perform one gradient update per data sample, whereas we propose a new objective function that allows multiple epochs of minibatch updates.

The new methods, which we call proximal policy optimization (PPO), have some of the benefits of trust region policy optimization (TRPO), but they are much simpler to implement, more general, and have better sample complexity (empirically).

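Translation:

The new methods, called proximal policy optimization (PPO), retain some of the benefits of trust region policy optimization (TRPO) while being much simpler to implement, more general, and (empirically) better in sample complexity.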

Our experiments test PPO on a collection of benchmark tasks, including simulated robotic locomotion and Atari game playing, and we show that PPO outperforms other online policy gradient methods, and overall strikes a favorable balance between sample complexity, simplicity, and wall-time.

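Translation:

Our experiments test PPO on a collection of benchmark tasks, including simulated robotic locomotion and Atari game playing, and show that PPO outperforms other online policy gradient methods and, overall, strikes a favorable balance between sample complexity, simplicity, and wall-clock time.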
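The abstract does not spell out the "surrogate" objective. A minimal sketch follows, assuming the clipped form described in the body of the PPO paper, mean of min(r·A, clip(r, 1-ε, 1+ε)·A). The function name, the list-based inputs, and the default `eps=0.2` are illustrative choices, not taken from this post.

```python
def clipped_surrogate(ratios, advantages, eps=0.2):
    """Clipped surrogate objective averaged over a batch (to be maximized).

    ratios: new-policy / old-policy action probabilities (assumed given).
    advantages: advantage estimates for the same samples (assumed given).
    """
    total = 0.0
    for r, a in zip(ratios, advantages):
        # Keep the probability ratio inside [1 - eps, 1 + eps].
        clipped_r = max(1.0 - eps, min(1.0 + eps, r))
        # Taking the minimum removes any gain from moving the ratio
        # outside the clipping range in the direction the advantage favors.
        total += min(r * a, clipped_r * a)
    return total / len(ratios)
```

Because the objective stops rewarding large policy changes, the same batch can plausibly be reused for several epochs of minibatch updates, which is the property the abstract emphasizes.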

97 ACER (2016)

Original paper:

Sample Efficient Actor-Critic with Experience Replay

Abstract:

This paper presents an actor-critic deep reinforcement learning agent with experience replay that is stable, sample efficient, and performs remarkably well on challenging environments, including the discrete 57-game Atari domain and several continuous control problems.

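Translation:

This paper presents an actor-critic deep reinforcement learning agent with experience replay that is stable and sample efficient and that performs remarkably well on challenging environments, including the discrete 57-game Atari domain and several continuous control problems.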

To achieve this, the paper introduces several innovations, including truncated importance sampling with bias correction, stochastic dueling network architectures, and a new trust region policy optimization method.

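Translation:

To achieve this, the paper introduces several innovations, including truncated importance sampling with bias correction, stochastic dueling network architectures, and a new trust region policy optimization method.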

98 UNREAL (2016)

Original paper:

Reinforcement Learning with Unsupervised Auxiliary Tasks

Abstract:

Deep reinforcement learning agents have achieved state-of-the-art results by directly maximising cumulative reward.

Translation:

Deep reinforcement learning agents have achieved state-of-the-art results by directly maximizing cumulative reward.

However, environments contain a much wider variety of possible training signals.

Translation:

Environments, however, contain a far wider variety of possible training signals.

In this paper, we introduce an agent that also maximises many other pseudo-reward functions simultaneously by reinforcement learning.

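Translation:

In this paper, we introduce an agent that, through reinforcement learning, simultaneously maximizes many other pseudo-reward functions as well.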

All of these tasks share a common representation that, like unsupervised learning, continues to develop in the absence of extrinsic rewards.

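Vocabulary:

in the absence of

Translation:

All of these tasks share a common representation that, as in unsupervised learning, continues to develop even when there are no extrinsic rewards.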

We also introduce a novel mechanism for focusing this representation upon extrinsic rewards, so that learning can rapidly adapt to the most relevant aspects of the actual task.

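Translation:

We also introduce a new mechanism for focusing this representation on extrinsic rewards, so that learning can adapt quickly to the most relevant aspects of the actual task.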

Our agent significantly outperforms the previous state-of-the-art on Atari, averaging 880% expert human performance, and a challenging suite of first-person, three-dimensional Labyrinth tasks leading to a mean speedup in learning of 10× and averaging 87% expert human performance on Labyrinth.

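Translation:

Our agent significantly outperforms the previous state of the art on Atari, averaging 880% of expert human performance, and on a challenging suite of first-person, three-dimensional Labyrinth tasks, where it learns 10× faster on average and averages 87% of expert human performance.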

99 NAC (2008)

Original paper:

Natural Actor-Critic

Abstract:

This paper investigates a novel model-free reinforcement learning architecture, the Natural Actor-Critic.

Translation:

This paper investigates a new model-free reinforcement learning architecture, the Natural Actor-Critic.

The actor updates are based on stochastic policy gradients employing Amari’s natural gradient approach, while the critic obtains both the natural policy gradient and additional parameters of a value function simultaneously by linear regression.

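Translation:

The actor updates are based on stochastic policy gradients that use Amari's natural gradient approach, while the critic obtains both the natural policy gradient and the additional parameters of a value function simultaneously by linear regression.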

We show that actor improvements with natural policy gradients are particularly appealing as these are independent of coordinate frame of the chosen policy representation, and can be estimated more efficiently than regular policy gradients.

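Translation:

We show that improving the actor with natural policy gradients is particularly appealing because these gradients do not depend on the coordinate frame of the chosen policy representation and can be estimated more efficiently than regular policy gradients.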

The critic makes use of a special basis function parameterization motivated by the policy-gradient compatible function approximation.

Translation:

The critic uses a special basis function parameterization motivated by policy-gradient-compatible function approximation.

We show that several well-known reinforcement learning methods such as the original Actor-Critic and Bradtke’s Linear Quadratic Q-Learning are in fact Natural Actor-Critic algorithms.

Translation:

We show that several well-known reinforcement learning methods, such as the original Actor-Critic and Bradtke's Linear Quadratic Q-Learning, are in fact Natural Actor-Critic algorithms.

Empirical evaluations illustrate the effectiveness of our techniques in comparison to previous methods, and also demonstrate their applicability for learning control on an anthropomorphic robot arm.

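Translation:

Empirical evaluations show the effectiveness of our techniques compared with previous methods and demonstrate that they can be applied to learning control on an anthropomorphic robot arm.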

100 AlphaStar (2019)

Original paper:

AlphaStar: An Evolutionary Computation Perspective

Abstract:

In January 2019, DeepMind revealed AlphaStar to the world - the first artificial intelligence (AI) system to beat a professional player at the game of StarCraft II - representing a milestone in the progress of AI.

Translation:

In January 2019, DeepMind revealed AlphaStar to the world: the first artificial intelligence (AI) system to beat a professional player at StarCraft II, and a milestone in the progress of AI.

AlphaStar draws on many areas of AI research, including deep learning, reinforcement learning, game theory, and evolutionary computation (EC).

Translation:

AlphaStar draws on many areas of AI research, including deep learning, reinforcement learning, game theory, and evolutionary computation (EC).

In this paper we analyze AlphaStar primarily through the lens of EC, presenting a new look at the system and relating it to many concepts in the field.

Translation:

In this paper, we analyze AlphaStar mainly through the lens of EC, offering a new view of the system and relating it to many concepts in the field.

We highlight some of its most interesting aspects - the use of Lamarckian evolution, competitive co-evolution, and quality diversity.

Translation:

We highlight some of its most interesting aspects: the use of Lamarckian evolution, competitive co-evolution, and quality diversity.

In doing so, we hope to provide a bridge between the wider EC community and one of the most significant AI systems developed in recent times.

Vocabulary:

In doing so

Translation:

In doing so, we hope to build a bridge between the wider EC community and one of the most significant AI systems developed in recent times.

(End)

2020-01-13

Paper Abstract 100 Knocks #19

Machine learning papers

Previous post ↓

ryosuke-okubo.hatenablog.com

91 Ape-X (2018)

Original paper:

Distributed Prioritized Experience Replay

Abstract:

We propose a distributed architecture for deep reinforcement learning at scale, that enables agents to learn effectively from orders of magnitude more data than previously possible.

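Vocabulary:

at scale

Translation:

We propose a distributed architecture for large-scale deep reinforcement learning that lets agents learn effectively from orders of magnitude more data than was previously possible.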

The algorithm decouples acting from learning:

the actors interact with their own instances of the environment by selecting actions according to a shared neural network, and accumulate the resulting experience in a shared experience replay memory;

the learner replays samples of experience and updates the neural network.

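Translation:

The algorithm decouples acting from learning:

the actors interact with their own instances of the environment, selecting actions according to a shared neural network, and accumulate the resulting experience in a shared experience replay memory;

the learner replays samples of experience and updates the neural network.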

The architecture relies on prioritized experience replay to focus only on the most significant data generated by the actors.

Translation:

The architecture relies on prioritized experience replay to focus only on the most significant data generated by the actors.

Our architecture substantially improves the state of the art on the Arcade Learning Environment, achieving better final performance in a fraction of the wall-clock training time.

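Vocabulary:

a fraction of

wall-clock

Translation:

Our architecture substantially improves the state of the art on the Arcade Learning Environment, achieving better final performance in a fraction of the wall-clock training time.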
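The abstract only says that the learner focuses on the most significant data through prioritized experience replay. One way to picture this is proportional prioritization, where transition i is drawn with probability p_i^α / Σ_k p_k^α. The sketch below assumes that form; the function names, `alpha=0.6`, and the flat priority list are illustrative assumptions, and a real distributed system would use a more efficient data structure.

```python
import random

def sampling_probabilities(priorities, alpha=0.6):
    """Turn positive priorities (e.g. TD-error magnitudes) into probabilities."""
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]

def sample_indices(priorities, batch_size, alpha=0.6, rng=None):
    """Draw replay indices, favoring transitions with higher priority."""
    rng = rng or random.Random(0)  # fixed seed only to keep the sketch reproducible
    probs = sampling_probabilities(priorities, alpha)
    return rng.choices(range(len(priorities)), weights=probs, k=batch_size)
```

With `alpha=0` every transition is equally likely, so the exponent acts as a dial between uniform replay and strictly greedy prioritization.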

92 R2D2 (2019)

Original paper:

RECURRENT EXPERIENCE REPLAY IN DISTRIBUTED REINFORCEMENT LEARNING

Abstract:

Building on the recent successes of distributed training of RL agents, in this paper we investigate the training of RNN-based RL agents from distributed prioritized experience replay.

Translation:

Building on the recent successes of distributed training of RL agents, this paper investigates training RNN-based RL agents from distributed prioritized experience replay.

We study the effects of parameter lag resulting in representational drift and recurrent state staleness and empirically derive an improved training strategy.

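Vocabulary:

representational

staleness

derive

Translation:

We study the effects of parameter lag, which causes representational drift and recurrent state staleness, and empirically derive an improved training strategy.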

Using a single network architecture and fixed set of hyperparameters, the resulting agent, Recurrent Replay Distributed DQN, quadruples the previous state of the art on Atari-57, and matches the state of the art on DMLab-30.

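Vocabulary:

quadruples

Translation:

Using a single network architecture and a fixed set of hyperparameters, the resulting agent, Recurrent Replay Distributed DQN, quadruples the previous state of the art on Atari-57 and matches the state of the art on DMLab-30.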

It is the first agent to exceed human-level performance in 52 of the 57 Atari games.

Translation:

It is the first agent to exceed human-level performance in 52 of the 57 Atari games.

93 A3C (2016)

Original paper:

Asynchronous Methods for Deep Reinforcement Learning

Abstract:

We propose a conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers.

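Vocabulary:

asynchronous

Translation:

We propose a conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent to optimize deep neural network controllers.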

We present asynchronous variants of four standard reinforcement learning algorithms and show that parallel actor-learners have a stabilizing effect on training allowing all four methods to successfully train neural network controllers.

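Vocabulary:

stabilizing effect

Translation:

We present asynchronous variants of four standard reinforcement learning algorithms and show that parallel actor-learners stabilize training, allowing all four methods to train neural network controllers successfully.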

The best performing method, an asynchronous variant of actor-critic, surpasses the current state-of-the-art on the Atari domain while training for half the time on a single multi-core CPU instead of a GPU.

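Translation:

The best-performing method, an asynchronous variant of actor-critic, surpasses the current state of the art on the Atari domain while training for half the time on a single multi-core CPU rather than a GPU.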

Furthermore, we show that asynchronous actor-critic succeeds on a wide variety of continuous motor control problems as well as on a new task of navigating random 3D mazes using a visual input.

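Translation:

Furthermore, we show that asynchronous actor-critic succeeds on a wide variety of continuous motor control problems as well as on a new task of navigating random 3D mazes from visual input.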

94 DDPG (2015)

Original paper:

Continuous control with deep reinforcement learning

Abstract:

We adapt the ideas underlying the success of Deep Q-Learning to the continuous action domain.

Translation:

We adapt the ideas underlying the success of Deep Q-Learning to the continuous action domain.

We present an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces.

Translation:

We present an actor-critic, model-free algorithm, based on the deterministic policy gradient, that can operate over continuous action spaces.

Using the same learning algorithm, network architecture and hyper-parameters, our algorithm robustly solves more than 20 simulated physics tasks, including classic problems such as cartpole swing-up, dexterous manipulation, legged locomotion and car driving.

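Vocabulary:

dexterous

manipulation

Translation:

Using the same learning algorithm, network architecture, and hyperparameters, our algorithm robustly solves more than 20 simulated physics tasks, including classic problems such as cartpole swing-up, dexterous manipulation, legged locomotion, and car driving.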

Our algorithm is able to find policies whose performance is competitive with those found by a planning algorithm with full access to the dynamics of the domain and its derivatives.

Translation:

Our algorithm finds policies whose performance is competitive with those found by a planning algorithm that has full access to the dynamics of the domain and its derivatives.

We further demonstrate that for many of the tasks the algorithm can learn policies end-to-end:

directly from raw pixel inputs.

Translation:

We further demonstrate that, for many of the tasks, the algorithm can learn policies end-to-end:

directly from raw pixel inputs.

95 TRPO (2015)

Original paper:

Trust Region Policy Optimization

Abstract:

We describe an iterative procedure for optimizing policies, with guaranteed monotonic improvement.

Translation:

We describe an iterative procedure for optimizing policies with guaranteed monotonic improvement.

By making several approximations to the theoretically-justified procedure, we develop a practical algorithm, called Trust Region Policy Optimization (TRPO).

Translation:

By making several approximations to the theoretically justified procedure, we develop a practical algorithm called Trust Region Policy Optimization (TRPO).

This algorithm is similar to natural policy gradient methods and is effective for optimizing large nonlinear policies such as neural networks.

Translation:

This algorithm is similar to natural policy gradient methods and is effective for optimizing large nonlinear policies such as neural networks.

Our experiments demonstrate its robust performance on a wide variety of tasks:

learning simulated robotic swimming, hopping, and walking gaits;

and playing Atari games using images of the screen as input.

Translation:

Our experiments demonstrate its robust performance on a wide variety of tasks:

learning simulated robotic swimming, hopping, and walking gaits;

and playing Atari games using images of the screen as input.

Despite its approximations that deviate from the theory, TRPO tends to give monotonic improvement, with little tuning of hyperparameters.

Translation:

Although its approximations deviate from the theory, TRPO tends to give monotonic improvement with little tuning of hyperparameters.

Next post ↓

ryosuke-okubo.hatenablog.com

2020-01-06

Paper Abstract 100 Knocks #18

Machine learning papers

Previous post ↓

ryosuke-okubo.hatenablog.com

Entries 86-100 cover reinforcement learning.

Reference:

https://qiita.com/shionhonda/items/ec05aade07b5bea78081

86 Deep Q-Network (2013)

Original paper:

Playing Atari with Deep Reinforcement Learning

Abstract:

We present the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning.

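Vocabulary:

reinforcement learning

Translation:

We present the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning.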

The model is a convolutional neural network, trained with a variant of Q-learning, whose input is raw pixels and whose output is a value function estimating future rewards.

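Translation:

The model is a convolutional neural network trained with a variant of Q-learning; its input is raw pixels and its output is a value function that estimates future rewards.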

We apply our method to seven Atari 2600 games from the Arcade Learning Environment, with no adjustment of the architecture or learning algorithm.

Translation:

We apply our method to seven Atari 2600 games from the Arcade Learning Environment without adjusting the architecture or the learning algorithm.

We find that it outperforms all previous approaches on six of the games and surpasses a human expert on three of them.

Translation:

We find that it outperforms all previous approaches on six of the games and surpasses a human expert on three of them.

87 Double Deep Q Network (2015)

Original paper:

Deep Reinforcement Learning with Double Q-learning

Abstract:

The popular Q-learning algorithm is known to overestimate action values under certain conditions.

Vocabulary:

overestimate

Translation:

The popular Q-learning algorithm is known to overestimate action values under certain conditions.

It was not previously known whether, in practice, such overestimations are common, whether they harm performance, and whether they can generally be prevented.

Translation:

It was not previously known whether, in practice, such overestimations are common, whether they harm performance, and whether they can generally be prevented.

In this paper, we answer all these questions affirmatively.

Vocabulary:

affirmatively

Translation:

In this paper, we answer all of these questions affirmatively.

In particular, we first show that the recent DQN algorithm, which combines Q-learning with a deep neural network, suffers from substantial overestimations in some games in the Atari 2600 domain.

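Vocabulary:

substantial

Translation:

In particular, we first show that the recent DQN algorithm, which combines Q-learning with a deep neural network, substantially overestimates action values in some games in the Atari 2600 domain.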

We then show that the idea behind the Double Q-learning algorithm, which was introduced in a tabular setting, can be generalized to work with large-scale function approximation.

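Translation:

We then show that the idea behind the Double Q-learning algorithm, which was introduced in a tabular setting, can be generalized to work with large-scale function approximation.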

We propose a specific adaptation to the DQN algorithm and show that the resulting algorithm not only reduces the observed overestimations, as hypothesized, but that this also leads to much better performance on several games.

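Vocabulary:

hypothesized

Translation:

We propose a specific adaptation to the DQN algorithm and show that the resulting algorithm not only reduces the observed overestimations, as hypothesized, but also performs much better on several games.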
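The "specific adaptation" is not detailed in the abstract. A minimal sketch of the usual Double DQN target follows, assuming the online network selects the next action and the target network evaluates it. The function names and the plain-list Q-values are hypothetical stand-ins for network outputs.

```python
def standard_q_target(reward, next_q_target, gamma=0.99, done=False):
    """DQN-style target: the same values both select and evaluate the action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)

def double_q_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double-Q-style target: select with online values, evaluate with target values."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]
```

For example, with online values `[1.0, 2.0]`, target values `[5.0, 3.0]`, reward 1 and `gamma=0.5`, the standard target is 3.5 while the double target is 2.5. Decoupling selection from evaluation is what damps the overestimation in this toy case.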

88 Rainbow (2017)

Original paper:

Rainbow: Combining Improvements in Deep Reinforcement Learning

Abstract:

The deep reinforcement learning community has made several independent improvements to the DQN algorithm.

Translation:

The deep reinforcement learning community has made several independent improvements to the DQN algorithm.

However, it is unclear which of these extensions are complementary and can be fruitfully combined.

Vocabulary:

fruitfully

Translation:

However, it is unclear which of these extensions are complementary and can be fruitfully combined.

This paper examines six extensions to the DQN algorithm and empirically studies their combination.

Translation:

This paper examines six extensions to the DQN algorithm and empirically studies their combination.

Our experiments show that the combination provides state-of-the-art performance on the Atari 2600 benchmark, both in terms of data efficiency and final performance.

Translation:

Our experiments show that the combination achieves state-of-the-art performance on the Atari 2600 benchmark in terms of both data efficiency and final performance.

We also provide results from a detailed ablation study that shows the contribution of each component to overall performance.

Vocabulary:

overall

Translation:

We also provide results from a detailed ablation study that shows how much each component contributes to overall performance.

89 Dueling Network (2015)

Original paper:

Dueling Network Architectures for Deep Reinforcement Learning

Abstract:

In recent years there have been many successes of using deep representations in reinforcement learning.

Translation:

In recent years, deep representations have been used in reinforcement learning with many successes.

Still, many of these applications use conventional architectures, such as convolutional networks, LSTMs, or auto-encoders.

Vocabulary:

conventional

Translation:

Still, many of these applications use conventional architectures such as convolutional networks, LSTMs, or auto-encoders.

In this paper, we present a new neural network architecture for model-free reinforcement learning.

Translation:

In this paper, we present a new neural network architecture for model-free reinforcement learning.

Our dueling network represents two separate estimators:

one for the state value function and one for the state-dependent action advantage function.

Translation:

Our dueling network represents two separate estimators:

one for the state value function and one for the state-dependent action advantage function.

The main benefit of this factoring is to generalize learning across actions without imposing any change to the underlying reinforcement learning algorithm.

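Vocabulary:

underlying

Translation:

The main benefit of this factoring is that learning generalizes across actions without any change to the underlying reinforcement learning algorithm.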

Our results show that this architecture leads to better policy evaluation in the presence of many similar-valued actions.

Vocabulary:

in the presence of

Translation:

Our results show that this architecture leads to better policy evaluation when many actions have similar values.

Moreover, the dueling architecture enables our RL agent to outperform the state-of-the-art on the Atari 2600 domain.

Translation:

Moreover, the dueling architecture enables our RL agent to outperform the state of the art on the Atari 2600 domain.
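The abstract names the two estimators but not how they are recombined into action values. The sketch below assumes the mean-subtracted aggregation, Q(s, a) = V(s) + (A(s, a) - mean_a' A(s, a')). The function name and the scalar and list inputs are illustrative stand-ins for the two network heads.

```python
def dueling_q_values(state_value, advantages):
    """Combine a state value and per-action advantages into Q-values.

    Subtracting the mean advantage pins down the split between V and A,
    which would otherwise be ambiguous (a constant could move between them).
    """
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + (a - mean_adv) for a in advantages]
```

For instance, `dueling_q_values(2.0, [1.0, 3.0])` gives `[1.0, 3.0]`, and the average of the resulting Q-values equals the state value.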

90 Gorila (2015)

Original paper:

Massively Parallel Methods for Deep Reinforcement Learning

Abstract:

We present the first massively distributed architecture for deep reinforcement learning.

Vocabulary:

massively

Translation:

We present the first massively distributed architecture for deep reinforcement learning.

This architecture uses four main components:

parallel actors that generate new behaviour;

parallel learners that are trained from stored experience;

a distributed neural network to represent the value function or behaviour policy;

and a distributed store of experience.

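Vocabulary:

behaviour

Translation:

This architecture uses four main components:

parallel actors that generate new behaviour;

parallel learners that are trained from stored experience;

a distributed neural network that represents the value function or behaviour policy;

and a distributed store of experience.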

We used our architecture to implement the Deep Q-Network algorithm (DQN).

Translation:

We used our architecture to implement the Deep Q-Network (DQN) algorithm.

Our distributed algorithm was applied to 49 games from Atari 2600 games from the Arcade Learning Environment, using identical hyperparameters.

Vocabulary:

identical

Translation:

Our distributed algorithm was applied, with identical hyperparameters, to 49 Atari 2600 games from the Arcade Learning Environment.

Our performance surpassed non-distributed DQN in 41 of the 49 games and also reduced the wall-time required to achieve these results by an order of magnitude on most games.

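Vocabulary:

wall-time

order of magnitude

Translation:

Our performance surpassed non-distributed DQN in 41 of the 49 games and, on most games, reduced the wall-time needed to reach these results by an order of magnitude.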

Next post ↓

ryosuke-okubo.hatenablog.com

2019-12-30

Paper Abstract 100 Knocks #17

Machine learning papers

Previous post ↓

ryosuke-okubo.hatenablog.com

81 Santa (2015)

Original paper:

Bridging the Gap between Stochastic Gradient MCMC and Stochastic Optimization

Abstract:

Stochastic gradient Markov chain Monte Carlo (SG-MCMC) methods are Bayesian analogs to popular stochastic optimization methods;

however, this connection is not well studied.

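Translation:

Stochastic gradient Markov chain Monte Carlo (SG-MCMC) methods are Bayesian analogs of popular stochastic optimization methods;

however, this connection has not been well studied.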

We explore this relationship by applying simulated annealing to an SGMCMC algorithm.

Vocabulary:

simulated annealing

Translation:

We explore this relationship by applying simulated annealing to an SGMCMC algorithm.

Furthermore, we extend recent SG-MCMC methods with two key components:

i) adaptive preconditioners (as in ADAgrad or RMSprop),

and ii) adaptive element-wise momentum weights.

Translation:

Furthermore, we extend recent SG-MCMC methods with two key components:

i) adaptive preconditioners (as in ADAgrad or RMSprop),

and ii) adaptive element-wise momentum weights.

The zero-temperature limit gives a novel stochastic optimization method with adaptive element-wise momentum weights, while conventional optimization methods only have a shared, static momentum weight.

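Translation:

The zero-temperature limit yields a new stochastic optimization method with adaptive element-wise momentum weights, whereas conventional optimization methods have only a shared, static momentum weight.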

Under certain assumptions, our theoretical analysis suggests the proposed simulated annealing approach converges close to the global optima.

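Vocabulary:

assumptions

Translation:

Under certain assumptions, our theoretical analysis suggests that the proposed simulated annealing approach converges close to the global optima.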

Experiments on several deep neural network models show state-of-the-art results compared to related stochastic optimization algorithms.

Translation:

Experiments on several deep neural network models show state-of-the-art results compared with related stochastic optimization algorithms.

82 GD by GD (2016)

Original paper:

Learning to learn by gradient descent by gradient descent

Abstract:

The move from hand-designed features to learned features in machine learning has been wildly successful.

Translation:

The move from hand-designed features to learned features in machine learning has been wildly successful.

In spite of this, optimization algorithms are still designed by hand.

Vocabulary:

In spite of this

Translation:

Despite this, optimization algorithms are still designed by hand.

In this paper we show how the design of an optimization algorithm can be cast as a learning problem, allowing the algorithm to learn to exploit structure in the problems of interest in an automatic way.

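Translation:

In this paper, we show how the design of an optimization algorithm can be cast as a learning problem, so that the algorithm learns to exploit structure in the problems of interest automatically.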

Our learned algorithms, implemented by LSTMs, outperform generic, hand-designed competitors on the tasks for which they are trained, and also generalize well to new tasks with similar structure.

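Translation:

Our learned algorithms, implemented by LSTMs, outperform generic, hand-designed competitors on the tasks for which they are trained and also generalize well to new tasks with similar structure.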

We demonstrate this on a number of tasks, including simple convex problems, training neural networks, and styling images with neural art.

Translation:

We demonstrate this on a number of tasks, including simple convex problems, training neural networks, and styling images with neural art.

83 AdaSecant (2017)

Original paper:

A Robust Adaptive Stochastic Gradient Method for Deep Learning

Abstract:

Stochastic gradient algorithms are the main focus of large-scale optimization problems and led to important successes in the recent advancement of the deep learning algorithms.

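Translation:

Stochastic gradient algorithms are the main focus of large-scale optimization problems and have led to important successes in the recent advancement of deep learning algorithms.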

The convergence of SGD depends on the careful choice of learning rate and the amount of the noise in stochastic estimates of the gradients.

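Translation:

The convergence of SGD depends on a careful choice of learning rate and on the amount of noise in the stochastic estimates of the gradients.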

In this paper, we propose an adaptive learning rate algorithm, which utilizes stochastic curvature information of the loss function for automatically tuning the learning rates.

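Translation:

In this paper, we propose an adaptive learning rate algorithm that uses stochastic curvature information about the loss function to tune the learning rates automatically.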

The information about the element-wise curvature of the loss function is estimated from the local statistics of the stochastic first order gradients.

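Translation:

Information about the element-wise curvature of the loss function is estimated from the local statistics of the stochastic first-order gradients.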

We further propose a new variance reduction technique to speed up the convergence.

Translation:

We further propose a new variance reduction technique to speed up convergence.

In our experiments with deep neural networks, we obtained better performance compared to the popular stochastic gradient algorithms.

Translation:

In our experiments with deep neural networks, we obtained better performance than the popular stochastic gradient algorithms.

84 AMSGrad (2019)

Original paper:

On the Convergence of Adam and Beyond

Abstract:

Several recently proposed stochastic optimization methods that have been successfully used in training deep networks such as RMSProp, Adam, Adadelta, Nadam are based on using gradient updates scaled by square roots of exponential moving averages of squared past gradients.

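Translation:

Several recently proposed stochastic optimization methods that have been used successfully to train deep networks, such as RMSProp, Adam, Adadelta, and Nadam, are based on gradient updates scaled by square roots of exponential moving averages of squared past gradients.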

In many applications, e.g. learning with large output spaces, it has been empirically observed that these algorithms fail to converge to an optimal solution (or a critical point in nonconvex settings).

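Translation:

In many applications, for example learning with large output spaces, it has been observed empirically that these algorithms fail to converge to an optimal solution (or to a critical point in nonconvex settings).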

We show that one cause for such failures is the exponential moving average used in the algorithms.

Translation:

We show that one cause of such failures is the exponential moving average used in the algorithms.

We provide an explicit example of a simple convex optimization setting where Adam does not converge to the optimal solution, and describe the precise problems with the previous analysis of Adam algorithm.

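Translation:

We give an explicit example of a simple convex optimization setting in which Adam does not converge to the optimal solution, and we describe the precise problems with the previous analysis of the Adam algorithm.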

Our analysis suggests that the convergence issues can be fixed by endowing such algorithms with "long-term memory" of past gradients, and propose new variants of the Adam algorithm which not only fix the convergence issues but often also lead to improved empirical performance.

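Translation:

Our analysis suggests that the convergence issues can be fixed by giving such algorithms a "long-term memory" of past gradients, and we propose new variants of the Adam algorithm that not only fix the convergence issues but often also improve empirical performance.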
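One way to read "long-term memory" is that the second-moment estimate used in the denominator is never allowed to shrink. The scalar sketch below assumes the AMSGrad-style rule of keeping a running maximum of the exponential moving average. The function name, the dictionary state, and the default hyperparameters are illustrative assumptions.

```python
import math

def amsgrad_step(theta, grad, state, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One scalar update; state holds 'm', 'v' and the running maximum 'v_hat'."""
    m = beta1 * state["m"] + (1.0 - beta1) * grad
    v = beta2 * state["v"] + (1.0 - beta2) * grad * grad
    # Long-term memory: the denominator uses the largest v seen so far,
    # so a run of small gradients cannot inflate the effective step size.
    v_hat = max(state["v_hat"], v)
    new_theta = theta - lr * m / (math.sqrt(v_hat) + eps)
    return new_theta, {"m": m, "v": v, "v_hat": v_hat}
```

Starting from an all-zero state, a large gradient followed by a small one leaves `v_hat` at its earlier, larger value even though `v` itself decays.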

85 AdaBound & AMSBound (2019)

Original paper:

Adaptive Gradient Methods with Dynamic Bound of Learning Rate

Abstract:

Adaptive optimization methods such as AdaGrad, RMSprop and Adam have been proposed to achieve a rapid training process with an element-wise scaling term on learning rates.

Translation:

Adaptive optimization methods such as AdaGrad, RMSprop, and Adam have been proposed to achieve rapid training by applying an element-wise scaling term to learning rates.

Though prevailing, they are observed to generalize poorly compared with SGD or even fail to converge due to unstable and extreme learning rates.

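Vocabulary:

unstable

Translation:

Although widely used, they are observed to generalize poorly compared with SGD, or even to fail to converge, because of unstable and extreme learning rates.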

Recent work has put forward some algorithms such as AMSGrad to tackle this issue but they failed to achieve considerable improvement over existing methods.

Vocabulary:

tackle

Translation:

Recent work has proposed algorithms such as AMSGrad to tackle this issue, but they have not achieved considerable improvement over existing methods.

In our paper, we demonstrate that extreme learning rates can lead to poor performance.

Translation:

In our paper, we demonstrate that extreme learning rates can lead to poor performance.

We provide new variants of Adam and AMSGrad, called AdaBound and AMSBound respectively, which employ dynamic bounds on learning rates to achieve a gradual and smooth transition from adaptive methods to SGD and give a theoretical proof of convergence.

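Translation:

We provide new variants of Adam and AMSGrad, called AdaBound and AMSBound respectively, which apply dynamic bounds to learning rates to achieve a gradual and smooth transition from adaptive methods to SGD, and we give a theoretical proof of convergence.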

We further conduct experiments on various popular tasks and models, which is often insufficient in previous work.

Vocabulary:

insufficient

Translation:

We further conduct experiments on various popular tasks and models, something that previous work has often done insufficiently.

Experimental results show that new variants can eliminate the generalization gap between adaptive methods and SGD and maintain higher learning speed early in training at the same time.

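Translation:

Experimental results show that the new variants can eliminate the generalization gap between adaptive methods and SGD while maintaining a higher learning speed early in training.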

Moreover, they can bring significant improvement over their prototypes, especially on complex deep networks.

Translation:

Moreover, they can bring significant improvement over their prototypes, especially on complex deep networks.

The implementation of the algorithm can be found at this https URL .

Translation:

The implementation of the algorithm can be found at this https URL.
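The "dynamic bounds" idea can be pictured as clipping the per-parameter step size into an interval that starts out very wide (adaptive behaviour) and narrows towards a single value (SGD-like behaviour). The sketch below is an illustration only: the particular schedule, the names `final_lr` and `gamma`, and their defaults are assumptions, not values confirmed by this post.

```python
def dynamic_bounds(step, final_lr=0.1, gamma=1e-3):
    """Lower and upper step-size bounds for step >= 1; both tend to final_lr."""
    lower = final_lr * (1.0 - 1.0 / (gamma * step + 1.0))
    upper = final_lr * (1.0 + 1.0 / (gamma * step))
    return lower, upper

def bounded_step_size(adaptive_lr, step, final_lr=0.1, gamma=1e-3):
    """Clip an Adam-style per-parameter step size into the current bounds."""
    lower, upper = dynamic_bounds(step, final_lr, gamma)
    return max(lower, min(upper, adaptive_lr))
```

Early in training the interval is so wide that the adaptive step size passes through almost unchanged. After many steps every parameter is effectively updated with `final_lr`, which is the smooth handover to SGD that the abstract describes.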

Next post ↓

ryosuke-okubo.hatenablog.com

2019-12-23

Paper Abstract 100 Knocks #16

Machine learning papers

Previous post ↓

ryosuke-okubo.hatenablog.com

Entries 76-85 cover optimization methods.

76 AdaGrad (2011)

Original paper:

Adaptive Subgradient Methods for Online Learning and Stochastic Optimization

Abstract:

We present a new family of subgradient methods that dynamically incorporate knowledge of the geometry of the data observed in earlier iterations to perform more informative gradient-based learning.

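Vocabulary:

subgradient methods

iterations

Translation:

We present a new family of subgradient methods that dynamically incorporate knowledge of the geometry of the data observed in earlier iterations in order to perform more informative gradient-based learning.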

Metaphorically, the adaptation allows us to find needles in haystacks in the form of very predictive but rarely seen features.

Vocabulary:

haystacks

Translation:

Metaphorically, the adaptation lets us find needles in haystacks, in the form of features that are very predictive but rarely seen.

Our paradigm stems from recent advances in stochastic optimization and online learning which employ proximal functions to control the gradient steps of the algorithm.

Vocabulary:

stems from

proximal functions

Translation:

Our paradigm stems from recent advances in stochastic optimization and online learning that use proximal functions to control the gradient steps of the algorithm.

We describe and analyze an apparatus for adaptively modifying the proximal function, which significantly simplifies setting a learning rate and results in regret guarantees that are provably as good as the best proximal function that can be chosen in hindsight.

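Vocabulary:

apparatus

modifying

in hindsight

Translation:

We describe and analyze an apparatus for adaptively modifying the proximal function, which significantly simplifies setting a learning rate and yields regret guarantees that are provably as good as those of the best proximal function that could be chosen in hindsight.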

We give several efficient algorithms for empirical risk minimization problems with common and important regularization functions and domain constraints.

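Translation:

We give several efficient algorithms for empirical risk minimization problems with common and important regularization functions and domain constraints.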

We experimentally study our theoretical analysis and show that adaptive subgradient methods outperform state-of-the-art, yet non-adaptive, subgradient algorithms.

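Translation:

We test our theoretical analysis experimentally and show that adaptive subgradient methods outperform state-of-the-art, yet non-adaptive, subgradient algorithms.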
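The "needles in haystacks" remark can be illustrated with the diagonal form of the update that is commonly called AdaGrad. This is a minimal sketch assuming that form: each coordinate divides its step by the square root of its own accumulated squared gradients. The function name, the list-based vectors, and the defaults are illustrative.

```python
import math

def adagrad_step(theta, grad, accum, lr=0.1, eps=1e-8):
    """One diagonal AdaGrad-style update on plain Python lists."""
    # Each coordinate remembers the sum of its own squared gradients.
    new_accum = [a + g * g for a, g in zip(accum, grad)]
    new_theta = [
        t - lr * g / (math.sqrt(a) + eps)
        for t, g, a in zip(theta, grad, new_accum)
    ]
    return new_theta, new_accum
```

On the first step, a coordinate with gradient 10 and one with gradient 0.1 both move by roughly `lr`, so rarely active or small-gradient features are not drowned out by frequent, large ones.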

77 Adam (2014)

Original paper:

Adam: A Method for Stochastic Optimization

Abstract:

We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments.

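Vocabulary:

first-order

lower-order

Translation:

We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments.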

The method is straightforward to implement, is computationally efficient, has little memory requirements, is invariant to diagonal rescaling of the gradients, and is well suited for problems that are large in terms of data and/or parameters.

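Vocabulary:

straightforward

diagonal

and/or

Translation:

The method is straightforward to implement, computationally efficient, light on memory, invariant to diagonal rescaling of the gradients, and well suited to problems that are large in terms of data and/or parameters.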

The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients.

Vocabulary:

non-stationary

Translation:

The method is also appropriate for non-stationary objectives and for problems with very noisy and/or sparse gradients.

The hyper-parameters have intuitive interpretations and typically require little tuning.

Vocabulary:

interpretations

Translation:

The hyperparameters have intuitive interpretations and typically require little tuning.

Some connections to related algorithms, on which Adam was inspired, are discussed.

Translation:

We discuss some connections to the related algorithms that inspired Adam.

We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.

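Vocabulary:

convergence properties

regret

Translation:

We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.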

Empirical results demonstrate that Adam works well in practice and compares favorably to other stochastic optimization methods.

Translation:

Empirical results demonstrate that Adam works well in practice and compares favorably with other stochastic optimization methods.

Finally, we discuss AdaMax, a variant of Adam based on the infinity norm.

訳：

最後に，無限ノルムに基づいたAdamのバリアントであるAdaMaxについて説明する。
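
A minimal sketch of how the infinity-norm variant is usually written: the squared-gradient average of Adam is replaced by an exponentially weighted maximum of absolute gradients. The scalar function `adamax_step` and the zero guard are assumptions for illustration, not something taken from the abstract.

```python
def adamax_step(theta, grad, m, u, t, lr=0.002, beta1=0.9, beta2=0.999):
    """One AdaMax update for a single scalar parameter; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad   # first-moment estimate, as in Adam
    u = max(beta2 * u, abs(grad))        # exponentially weighted infinity norm
    if u > 0.0:                          # guard against 0/0 (added for this sketch)
        theta = theta - (lr / (1 - beta1 ** t)) * m / u
    return theta, m, u
```

Because `u` is a running maximum rather than an average starting from zero, only the first moment needs bias correction here.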

78 RMSpropGraves（2013）

原文：

Generating Sequences With Recurrent Neural Networks

Abstract：

This paper shows how Long Short-term Memory recurrent neural networks can be used to generate complex sequences with long-range structure, simply by predicting one data point at a time.

語彙：

long-range

訳：

本論文では一度に1つのデータポイントを予測するだけで，LSTM RNNを使用して長距離構造の複雑なシーケンスを生成する方法を示す。

The approach is demonstrated for text (where the data are discrete) and online handwriting (where the data are real-valued).

訳：

このアプローチはテキスト（データが離散的である場合）およびオンライン手書き（データが実数値である場合）について実証されている。

It is then extended to handwriting synthesis by allowing the network to condition its predictions on a text sequence.

訳：

次に，ネットワークがテキストシーケンスを条件として予測できるようにすることで，手書き合成に拡張される。

The resulting system is able to generate highly realistic cursive handwriting in a wide variety of styles.

語彙：

cursive

訳：

結果として得られるシステムは，非常にリアルな筆記体をさまざまなスタイルで生成できる。

79 Nadam（2016）


原文：

INCORPORATING NESTEROV MOMENTUM INTO ADAM

Abstract：

This work aims to improve upon the recently proposed and rapidly popularized optimization algorithm Adam (Kingma & Ba, 2014).

訳：

本研究は，最近提案され急速に普及した最適化アルゴリズムAdam（Kingma & Ba, 2014）を改善することを目的としている。

Adam has two main components—a momentum component and an adaptive learning rate component.

訳：

Adamには2つの主要なコンポーネント，すなわちモメンタムコンポーネントと適応学習率コンポーネントがある。

However, regular momentum can be shown conceptually and empirically to be inferior to a similar algorithm known as Nesterov’s accelerated gradient (NAG).

語彙：

inferior

Nesterov’s accelerated gradient

訳：

ただし，通常のモメンタムは，Nesterovの加速勾配法（NAG）として知られる類似のアルゴリズムよりも劣っていることを，概念的にも経験的にも示すことができる。

We show how to modify Adam’s momentum component to take advantage of insights from NAG, and then we present preliminary evidence suggesting that making this substitution improves the speed of convergence and the quality of the learned models.

訳：

我々はNAGからの洞察を活用するためにAdamのモメンタムコンポーネントを変更する方法を示し，次にこの置き換えを行うことで収束速度と学習済みモデルの品質が向上することを示唆する予備的な証拠を提示する。
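
One common way to write the resulting update is sketched below, assuming a constant momentum coefficient `beta1` (the original work may use a schedule instead). The Nesterov-style change is that the step mixes the next step's bias-corrected momentum with the current bias-corrected gradient. The scalar function `nadam_step` is a hypothetical name for this example.

```python
import math

def nadam_step(theta, grad, m, v, t, lr=0.002, beta1=0.9, beta2=0.999, eps=1e-8):
    """One simplified Nadam update for a scalar parameter; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat_next = m / (1 - beta1 ** (t + 1))   # momentum corrected as if for step t + 1
    g_hat = grad / (1 - beta1 ** t)           # bias-corrected current gradient
    v_hat = v / (1 - beta2 ** t)
    lookahead = beta1 * m_hat_next + (1 - beta1) * g_hat
    theta = theta - lr * lookahead / (math.sqrt(v_hat) + eps)
    return theta, m, v
```

Compared with a plain Adam step, the only difference in this sketch is the `lookahead` term that replaces the bias-corrected momentum.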

80 Eve（2016）


原文：

Eve: A Gradient Based Optimization Method with Locally and Globally Adaptive Learning Rates

Abstract：

Adaptive gradient methods for stochastic optimization adjust the learning rate for each parameter locally.

訳：

確率的最適化のための適応勾配法は各パラメーターの学習率を局所的に調整する。

However, there is also a global learning rate which must be tuned in order to get the best performance.

語彙：

in order to

訳：

ただし，グローバルな学習率もあり，それは最高のパフォーマンスを得るために調整する必要がある。

In this paper, we present a new algorithm that adapts the learning rate locally for each parameter separately, and also globally for all parameters together.

訳：

本論文では，各パラメータに対して学習率をローカルに個別に適応させ，またすべてのパラメータを一緒にグローバルに適応させる新しいアルゴリズムを提示する。

Specifically, we modify Adam, a popular method for training deep learning models, with a coefficient that captures properties of the objective function.

訳：

具体的には，深層学習モデルの学習によく使われる手法であるAdamを，目的関数の性質を捉える係数を用いて修正する。

Empirically, we show that our method, which we call Eve, outperforms Adam and other popular methods in training deep neural networks, like convolutional neural networks for image classification, and recurrent neural networks for language tasks.

訳：

経験的に，Eveと呼ばれる我々の方法が，画像分類のためのCNNや言語タスクのためのRNNのようなDNNの学習において，Adamや他の一般的な方法よりも優れていることを示す。


2019-12-16

論文Abstract100本ノック#15

機械学習論文


71 SoundNet（2016）


原文：

SoundNet: Learning Sound Representations from Unlabeled Video

Abstract：

We learn rich natural sound representations by capitalizing on large amounts of unlabeled sound data collected in the wild.

語彙：

in the wild

訳：

実環境で収集されたラベルのない大量の音データを活用して，自然音の豊かな表現を学習する。

We leverage the natural synchronization between vision and sound to learn an acoustic representation using two-million unlabeled videos.

語彙：

synchronization

訳：

我々は視覚と音の自然な同期を活用し，200万本のラベルなしビデオを使用して音響表現を学習する。

Unlabeled video has the advantage that it can be economically acquired at massive scales, yet contains useful signals about natural sound.

訳：

ラベルのないビデオには，大規模かつ低コストで取得でき，それでいて自然音に関する有用な信号を含むという利点がある。

We propose a student-teacher training procedure which transfers discriminative visual knowledge from well established visual recognition models into the sound modality using unlabeled video as a bridge.

訳：

我々はラベル付けされていないビデオをブリッジとして使用して，確立された視覚認識モデルから識別的視覚知識を音声モダリティに転送するstudent-teacher training procedureを提案する。

Our sound representation yields significant performance improvements over the state-of-the-art results on standard benchmarks for acoustic scene/object classification.

訳：

我々の音響表現により，音響シーン/オブジェクト分類の標準ベンチマークでの最新の結果よりも大幅にパフォーマンスが向上する。

Visualizations suggest some high-level semantics automatically emerge in the sound network, even though it is trained without ground truth labels.

語彙：

emerge

訳：

視覚化により，ground truth labelsなしで学習されているにもかかわらず，サウンドネットワークにいくつかの高レベルのセマンティクスが自動的に現れることが示唆される。

72 LPCNet（2018）


原文：

LPCNet: Improving Neural Speech Synthesis Through Linear Prediction

Abstract：

Neural speech synthesis models have recently demonstrated the ability to synthesize high quality speech for text-to-speech and compression applications.

語彙：

text-to-speech

訳：

ニューラル音声合成モデルは，最近テキスト読み上げおよび圧縮アプリケーション向けに高品質の音声を合成する能力を実証している。

These new models often require powerful GPUs to achieve real-time operation, so being able to reduce their complexity would open the way for many new applications.

語彙：

open the way

訳：

これらの新しいモデルは多くの場合リアルタイム操作を実現するために強力なGPUを必要とするため，複雑さを軽減することで多くの新しいアプリケーションに突破口が開かれるだろう。

We propose LPCNet, a WaveRNN variant that combines linear prediction with recurrent neural networks to significantly improve the efficiency of speech synthesis.

訳：

我々は，線形予測とRNNを組み合わせて音声合成の効率を大幅に向上させるWaveRNNの変種であるLPCNetを提案する。
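
To make "linear prediction" concrete, here is a small sketch in which each sample is predicted as a weighted sum of the previous samples, and only the residual (excitation) is left to be modeled. My understanding is that in this kind of design the neural network handles the residual, but the abstract does not spell that out, so treat it as an assumption. The helper names and fixed coefficients are hypothetical.

```python
def lpc_predict(history, coeffs):
    """Predict the next sample; history[-1] is the most recent past sample."""
    return sum(a * history[-k] for k, a in enumerate(coeffs, start=1))

def residual(signal, coeffs):
    """Excitation e_t = s_t - prediction, with zeros assumed before the signal starts."""
    order = len(coeffs)
    padded = [0.0] * order + list(signal)
    return [padded[t + order] - lpc_predict(padded[:t + order], coeffs)
            for t in range(len(signal))]

def synthesize(excitation, coeffs):
    """Rebuild the signal sample by sample from the excitation."""
    order = len(coeffs)
    out = [0.0] * order
    for e in excitation:
        out.append(lpc_predict(out, coeffs) + e)
    return out[order:]
```

For a signal that the predictor explains well, the residual is mostly near zero, which is the intuition for why modeling it should be cheaper than modeling the raw waveform.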

We demonstrate that LPCNet can achieve significantly higher quality than WaveRNN for the same network size and that high quality LPCNet speech synthesis is achievable with a complexity under 3 GFLOPS.

訳：

LPCNetは同じネットワークサイズでWaveRNNよりも大幅に高い品質を達成できること，および3 GFLOPS未満の複雑さで高品質のLPCNet音声合成が実現可能であることを示す。

This makes it easier to deploy neural synthesis applications on lower-power devices, such as embedded systems and mobile phones.

訳：

これにより組み込みシステムや携帯電話などの低電力デバイスにニューラル合成アプリケーションを簡単に展開できる。

73 RawNet（2019）


原文：

RawNet: Fast End-to-End Neural Vocoder

Abstract：

Neural networks based vocoders have recently demonstrated the powerful ability to synthesize high quality speech.

訳：

ニューラルネットワークベースのボコーダーは，最近高品質の音声を合成する強力な能力を実証した。

These models usually generate samples by conditioning on some spectrum features, such as Mel-spectrum.

訳：

これらのモデルは通常，Mel-spectrumなどのスペクトル特徴量を条件としてサンプルを生成する。

However, these features are extracted by using speech analysis module including some processing based on the human knowledge.

訳：

ただし，これらの特徴は人間の知識に基づいた処理を含む音声分析モジュールを使用して抽出される。

In this work, we proposed RawNet, a truly end-to-end neural vocoder, which use a coder network to learn the higher representation of signal, and an autoregressive voder network to generate speech sample by sample.

訳：

本研究では，信号の高次表現を学習するコーダーネットワークと，音声を1サンプルずつ生成する自己回帰voderネットワークを使用する，真にエンドツーエンドのニューラルボコーダーであるRawNetを提案した。

The coder and voder together act like an auto-encoder network, and could be jointly trained directly on raw waveform without any human-designed features.

訳：

コーダーとvoderは合わせてオートエンコーダーネットワークのように機能し，人手で設計された特徴量を一切使用せずに，生波形で直接同時に学習できる。

The experiments on the Copy-Synthesis tasks show that RawNet can achieve the comparative synthesized speech quality with LPCNet, with a smaller model architecture and faster speech generation at the inference step.

訳：

Copy-Synthesisタスクの実験は，RawNetがより小さいモデルアーキテクチャと推論時のより高速な音声生成で，LPCNetに匹敵する合成音声品質を達成できることを示している。

74 CycleGAN-VC（2017）


原文：

Parallel-Data-Free Voice Conversion Using Cycle-Consistent Adversarial Networks

Abstract：

We propose a parallel-data-free voice-conversion (VC) method that can learn a mapping from source to target speech without relying on parallel data.

訳：

我々は並列データに依存せずにソースからターゲットのスピーチへのマッピングを学習できる，並列データフリーのvoice-conversion（VC）メソッドを提案する。

The proposed method is general purpose, high quality, and parallel-data free and works without any extra data, modules, or alignment procedure.

語彙：

general purpose

訳：

提案された方法は，汎用で，高品質，並列データフリーであり，追加のデータ，モジュール，またはアライメント手順なしで機能する。

It also avoids over-smoothing, which occurs in many conventional statistical model-based VC methods.

訳：

従来の多くの統計モデルベースのVCメソッドで発生するover-smoothingも回避する。

Our method, called CycleGAN-VC, uses a cycle-consistent adversarial network (CycleGAN) with gated convolutional neural networks (CNNs) and an identity-mapping loss.

訳：

CycleGAN-VCと呼ばれる我々の方法は，ゲート付きCNNとidentity-mapping lossを備えたCycleGANを使用する。

A CycleGAN learns forward and inverse mappings simultaneously using adversarial and cycle-consistency losses.

訳：

CycleGANはAdversarial LossとCycle Consistency Lossを使用して順方向マッピングと逆方向マッピングを同時に学習する。
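
The cycle-consistency part of this can be sketched as follows: map a sample to the other domain and back, then penalize the L1 distance to the original. The adversarial loss is omitted, and the mapping functions passed in as `g_xy` and `g_yx` are hypothetical stand-ins for the two generators.

```python
def l1(a, b):
    """Mean absolute difference between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def cycle_consistency_loss(x, y, g_xy, g_yx):
    """L1 error after a round trip X -> Y -> X plus a round trip Y -> X -> Y."""
    forward_cycle = l1(g_yx(g_xy(x)), x)
    backward_cycle = l1(g_xy(g_yx(y)), y)
    return forward_cycle + backward_cycle
```

If the two mappings are exact inverses of each other the loss is zero, which is what pushes the forward and inverse mappings toward consistent pseudo pairs.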

This makes it possible to find an optimal pseudo pair from unpaired data.

訳：

これによりペアになっていないデータから最適な擬似ペアを見つけることができる。

Furthermore, the adversarial loss contributes to reducing over-smoothing of the converted feature sequence.

訳：

さらに，adversarial lossは変換された特徴シーケンスのover-smoothingの低減に貢献する。

We configure a CycleGAN with gated CNNs and train it with an identity-mapping loss.

訳：

ゲート制御されたCNNを使用してCycleGANを構成し，identity-mapping lossで学習する。

This allows the mapping function to capture sequential and hierarchical structures while preserving linguistic information.

訳：

これにより，言語情報を保持しながら，マッピング関数で系列的および階層的な構造を捉えることができる。

We evaluated our method on a parallel-data-free VC task.

訳：

我々は並列データのないVCタスクでこの方法を評価した。

An objective evaluation showed that the converted feature sequence was near natural in terms of global variance and modulation spectra.

訳：

客観的な評価により，変換された特徴シーケンスはグローバル分散と変調スペクトルの点で自然に近いことがわかった。

A subjective evaluation showed that the quality of the converted speech was comparable to that obtained with a Gaussian mixture model-based method under advantageous conditions with parallel and twice the amount of data.

語彙：

advantageous

訳：

主観的な評価により，変換された音声の品質は，並列データを用い，かつデータ量が2倍という有利な条件下でガウス混合モデルベースの方法により得られた品質に匹敵することが示された。

75 StarGAN-VC（2018）


原文：

StarGAN-VC: Non-parallel many-to-many voice conversion with star generative adversarial networks

Abstract：

This paper proposes a method that allows non-parallel many-to-many voice conversion (VC) by using a variant of a generative adversarial network (GAN) called StarGAN.

訳：

本論文ではStarGANと呼ばれるGANのバリアントを使用して，非並列多対多音声変換を可能にする方法を提案する。

Our method, which we call StarGAN-VC, is noteworthy in that it

(1) requires no parallel utterances, transcriptions, or time alignment procedures for speech generator training,

(2) simultaneously learns many-to-many mappings across different attribute domains using a single generator network,

(3) is able to generate converted speech signals quickly enough to allow real-time implementations

and (4) requires only several minutes of training examples to generate reasonably realistic-sounding speech.

語彙：

noteworthy

utterances

simultaneously

implementations

訳：

StarGAN-VCと呼ばれる我々の方法は，次の点で注目に値する：

（1）音声生成器の学習に並列発話，書き起こし，時間アライメント手順を必要としない

（2）単一の生成ネットワークを使用して異なる属性ドメイン間で多対多のマッピングを同時に学習する

（3）変換された音声信号をリアルタイム実装を可能にするのに十分な速さで生成できる

（4）それなりに現実的に聞こえる音声を生成するのに，数分程度の学習データしか必要としない。

Subjective evaluation experiments on a non-parallel many-to-many speaker identity conversion task revealed that the proposed method obtained higher sound quality and speaker similarity than a state-of-the-art method based on variational autoencoding GANs.

訳：

非並列多対多話者同一性変換タスクの主観評価実験により，提案された方法が変分自動符号化GANに基づく最新の方法よりも高い音質と話者の類似性を得ることが明らかになった。


2019-12-09

論文Abstract100本ノック#14

機械学習論文


66 CTC（2006）


原文：

Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks

Abstract：

Many real-world sequence learning tasks require the prediction of sequences of labels from noisy, unsegmented input data.

訳：

多くの実際のシーケンス学習タスクではノイズのあるセグメント化されていない入力データからラベルのシーケンスを予測する必要がある。

In speech recognition, for example, an acoustic signal is transcribed into words or sub-word units.

語彙：

acoustic

sub-word

訳：

たとえば音声認識では，音響信号は単語または部分語単位に転写される。

Recurrent neural networks (RNNs) are powerful sequence learners that would seem well suited to such tasks.

訳：

RNNはこのようなタスクに適していると思われる強力なシーケンス学習器である。

However, because they require pre-segmented training data, and post-processing to transform their outputs into label sequences, their applicability has so far been limited.

語彙：

so far

訳：

ただし，事前にセグメント化された学習データと，出力をラベルシーケンスに変換する後処理が必要なため，これまでのところその適用性は制限されていた。

This paper presents a novel method for training RNNs to label un-segmented sequences directly, thereby solving both problems.

語彙：

thereby

訳：

本論文では，セグメント化されていないシーケンスに直接ラベル付けするようにRNNを学習させ，それによって両方の問題を解決する新しい方法を提示する。
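
The core idea behind labelling without segmentation can be sketched as follows: the network emits one symbol (or a blank) per frame, repeated symbols and blanks are collapsed, and the probability of a label is the sum over all frame-level paths that collapse to it. This is not spelled out in the abstract. The brute-force enumeration below is for illustration only (practical implementations use dynamic programming), and the `-` blank symbol and function names are assumptions.

```python
from itertools import groupby, product

BLANK = "-"

def collapse(path):
    """Merge repeated symbols, then drop blanks: 'aa-ab-' -> 'aab'."""
    merged = [sym for sym, _ in groupby(path)]
    return "".join(s for s in merged if s != BLANK)

def label_probability(frame_probs, label):
    """Sum the probabilities of every per-frame path that collapses to `label`.

    frame_probs is a list (one entry per frame) of dicts mapping symbol -> probability.
    """
    symbols = list(frame_probs[0])
    total = 0.0
    for path in product(symbols, repeat=len(frame_probs)):
        if collapse(path) == label:
            p = 1.0
            for t, s in enumerate(path):
                p *= frame_probs[t][s]
            total += p
    return total
```

Because many alignments map to the same label, no pre-segmented training data is needed: training can maximize the summed probability directly.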

An experiment on the TIMIT speech corpus demonstrates its advantages over both a baseline HMM and a hybrid HMM-RNN.

訳：

TIMITのスピーチコーパスでの実験によって，baseline HMMとhybrid HMM-RNNの両方に対する利点が示される。

67 WaveNet（2016）


原文：

WaveNet: A Generative Model for Raw Audio

Abstract：

This paper introduces WaveNet, a deep neural network for generating raw audio waveforms.

語彙：

waveforms

訳：

本論文では，未加工のオーディオ波形を生成するためのDNNであるWaveNetを紹介する。

The model is fully probabilistic and autoregressive, with the predictive distribution for each audio sample conditioned on all previous ones;

nonetheless we show that it can be efficiently trained on data with tens of thousands of samples per second of audio.

訳：

このモデルは完全に確率的で自己回帰的であり，各オーディオサンプルの予測分布は以前のすべてのサンプルを条件としている；

それにもかかわらず，オーディオ1秒あたり数万サンプルを含むデータで効率的に学習できることを示す。
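
"Fully probabilistic and autoregressive" means the joint probability of a waveform factorizes into per-sample conditionals, p(x) = Π_t p(x_t | x_1, ..., x_{t-1}). The sketch below evaluates that factorization for quantized samples. `toy_conditional` is a hypothetical stand-in for the network, not the actual model.

```python
import math

def sequence_log_likelihood(samples, conditional):
    """log p(x) = sum over t of log p(x_t | all previous samples)."""
    total = 0.0
    for t, x_t in enumerate(samples):
        probs = conditional(samples[:t])   # distribution over quantization levels
        total += math.log(probs[x_t])
    return total

def toy_conditional(history, levels=4, stay=0.7):
    """Toy conditional: uniform at the start, then favors repeating the last sample."""
    if not history:
        return [1.0 / levels] * levels
    probs = [(1.0 - stay) / (levels - 1)] * levels
    probs[history[-1]] = stay
    return probs
```

During training every conditional can be evaluated from the known past samples, whereas generation has to proceed one sample at a time.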

When applied to text-to-speech, it yields state-of-the-art performance, with human listeners rating it as significantly more natural sounding than the best parametric and concatenative systems for both English and Mandarin.

語彙：

concatenative

Mandarin

訳：

テキスト読み上げに適用すると最先端の性能が得られ，人間のリスナーは英語と中国語（Mandarin）の両方で，最良のパラメトリック方式および波形接続方式のシステムよりも有意に自然に聞こえると評価した。

A single WaveNet can capture the characteristics of many different speakers with equal fidelity, and can switch between them by conditioning on the speaker identity.

訳：

1つのWaveNetは多くの異なる話者の特性を同等の忠実度で捉え，話者IDを条件とすることで話者を切り替えることができる。

When trained to model music, we find that it generates novel and often highly realistic musical fragments.

訳：

音楽をモデル化する学習を受けたとき，それは斬新でしばしば非常に現実的な音楽断片を生成することがわかる。

We also show that it can be employed as a discriminative model, returning promising results for phoneme recognition.

語彙：

phoneme

訳：

また音素認識の有望な結果を返す識別モデルとして使用できることを示す。

68 Parallel WaveNet（2017）


原文：

Parallel WaveNet: Fast High-Fidelity Speech Synthesis

Abstract：

The recently-developed WaveNet architecture is the current state of the art in realistic speech synthesis, consistently rated as more natural sounding for many different languages than any previous system.

訳：

最近開発されたWaveNetアーキテクチャは現実的な音声合成の最新技術であり，以前のどのシステムよりも多くの異なる言語でより自然な音として一貫して評価されている。

However, because WaveNet relies on sequential generation of one audio sample at a time, it is poorly suited to today's massively parallel computers, and therefore hard to deploy in a real-time production setting.

語彙：

is ~ suited to

massively parallel computers

訳：

ただし，WaveNetは一度に1つのオーディオサンプルの順次生成に依存するため，今日の超並列計算機にはあまり適しておらず，リアルタイムのプロダクション設定で展開するのは困難である。

This paper introduces Probability Density Distillation, a new method for training a parallel feed-forward network from a trained WaveNet with no significant difference in quality.

訳：

本論文ではProbability Density Distillationを紹介する，これは学習済みWaveNetから，品質に有意な差なく並列フィードフォワードネットワークを学習するための新しい方法である。

The resulting system is capable of generating high-fidelity speech samples at more than 20 times faster than real-time, and is deployed online by Google Assistant, including serving multiple English and Japanese voices.

語彙：

capable of

high-fidelity

訳：

結果として得られるシステムは，忠実度の高い音声サンプルをリアルタイムの20倍以上の速度で生成でき，複数の英語と日本語の音声を提供するなど，Google Assistantによってオンラインで展開されている。

69 SSWS（2018）


原文：

Comprehensive evaluation of statistical speech waveform synthesis

Abstract：

Statistical TTS systems that directly predict the speech waveform have recently reported improvements in synthesis quality.

訳：

音声波形を直接予測する統計TTSシステムは最近合成品質の改善を報告している。

This investigation evaluates Amazon's statistical speech waveform synthesis (SSWS) system.

訳：

この調査ではAmazonの統計的音声波形合成（SSWS）システムを評価する。

An in-depth evaluation of SSWS is conducted across a number of domains to better understand the consistency in quality.

語彙：

in-depth

訳：

SSWSの詳細な評価は品質の一貫性をよりよく理解するため多くのドメインにわたって実施される。

The results of this evaluation are validated by repeating the procedure on a separate group of testers.

訳：

この評価の結果はテスターの別のグループで手順を繰り返すことによって検証される。

Finally, an analysis of the nature of speech errors of SSWS compared to hybrid unit selection synthesis is conducted to identify the strengths and weaknesses of SSWS.

訳：

最後に，SSWSの長所と短所を特定するためにハイブリッドユニット選択合成と比較したSSWSの音声エラーの性質の分析が行われる。

Having a deeper insight into SSWS allows us to better define the focus of future work to improve this new technology.

訳：

SSWSをより深く理解することでこの新しいテクノロジーを改善するための今後の作業の焦点をより明確にすることができる。

70 MelNet（2019）


原文：

MelNet: A Generative Model for Audio in the Frequency Domain

Abstract：

Capturing high-level structure in audio waveforms is challenging because a single second of audio spans tens of thousands of timesteps.

訳：

オーディオの1秒が何万ものタイムステップに及ぶため，オーディオ波形の高レベル構造をキャプチャすることは困難である。

While long-range dependencies are difficult to model directly in the time domain, we show that they can be more tractably modelled in two-dimensional time-frequency representations such as spectrograms.

語彙：

tractably

訳：

長距離の依存関係を時間領域で直接モデル化することは困難だが，スペクトログラムなどの2次元の時間-周波数表現でより扱いやすくモデル化できることを示す。
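
As a rough picture of a two-dimensional time-frequency representation, the sketch below cuts a waveform into frames and takes the magnitude of a discrete Fourier transform of each frame. It is a plain, unwindowed magnitude spectrogram written with the standard library only. The mel-scale warping the model name suggests is not included, and the function name and parameters are assumptions.

```python
import cmath
import math

def spectrogram(signal, frame_len, hop):
    """Return a list of frames, each a list of DFT magnitudes for bins 0..frame_len//2."""
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        mags = []
        for k in range(frame_len // 2 + 1):
            coeff = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n, x in enumerate(frame))
            mags.append(abs(coeff))
        frames.append(mags)
    return frames
```

With a hop of several samples, thousands of waveform timesteps shrink to far fewer spectrogram frames, which is the sense in which long-range structure becomes easier to model.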

By leveraging this representational advantage, in conjunction with a highly expressive probabilistic model and a multiscale generation procedure, we design a model capable of generating high-fidelity audio samples which capture structure at timescales that time-domain models have yet to achieve.

語彙：

leveraging

in conjunction

訳：

この表現上の利点を，非常に表現力の高い確率モデルとマルチスケール生成手順とともに活用することで，時間領域モデルがまだ達成していないタイムスケールで構造をキャプチャする高忠実度のオーディオサンプルを生成できるモデルを設計する。

We apply our model to a variety of audio generation tasks, including unconditional speech generation, music generation, and text-to-speech synthesis

---showing improvements over previous approaches in both density estimates and human judgments.

訳：

我々はこのモデルを無条件の音声生成，音楽生成，テキスト音声合成などさまざまな音声生成タスクに適用する，

密度推定と人間の判断の両方において以前のアプローチよりも改善されている。
