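In reinforcement learning, random exploration rarely happens to satisfy the conditions needed to obtain a reward. SIL learns to select an action sequence only when the action value estimated for a state is smaller than the return that was actually obtained, so that it repeats the sequences that worked well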
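The rule in the comment above (imitate a past action only when the realized return exceeds the current value estimate) can be sketched as a clipped loss. This is a minimal illustration, not a reference implementation: the function names `discounted_returns` and `sil_losses`, the discount `gamma`, and the 0.5 weighting of the value term are all assumptions made for the example, and the value estimate is passed in as a plain number rather than coming from a trained network.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Compute the discounted return from each step of one stored episode."""
    returns = []
    running = 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def sil_losses(log_probs, values, returns):
    """Hypothetical helper: self-imitation style losses for a batch.

    log_probs: log pi(a|s) of the actions that were actually taken
    values:    current value estimates for the same states
    returns:   discounted returns actually obtained from those states
    """
    policy_terms = []
    value_terms = []
    for logp, v, ret in zip(log_probs, values, returns):
        # Only samples whose return beat the estimate contribute.
        gap = max(ret - v, 0.0)
        # In a gradient-based setup, gap would be treated as a constant here.
        policy_terms.append(-logp * gap)
        value_terms.append(0.5 * gap ** 2)
    n = len(policy_terms)
    return sum(policy_terms) / n, sum(value_terms) / n


if __name__ == "__main__":
    rets = discounted_returns([0.0, 0.0, 1.0], gamma=0.5)
    logps = [math.log(0.5)] * 3
    print(sil_losses(logps, [0.5, 0.5, 0.5], rets))
```

With these inputs only the last step has a return above its value estimate, so it is the only one that pushes the policy toward repeating its action; the other two steps contribute zero to both losses.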

fromTw

Bookmarked by elu_18, 2018/07/10 20:20



Self-Imitation Learning

arxiv.org (https://arxiv.org/abs/1806.05635), 2018/07/10
