stibbar's Bookmarks / March 9, 2018

stibbar id:stibbar

Bookmarks for March 9, 2018 (5 items)

Deep Reinforcement Learning with Double Q-learning (Double DQN) - DeepLearningを勉強する人
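Deep Reinforcement Learning with Double Q-learning [1509.06461] Deep Reinforcement Learning with Double Q-learning. Paper summary: Q-learning is known to overestimate action values because it takes a max over them. Previously identified causes of overestimation: errors from insufficiently flexible function approximation (Thrun and Schwartz, 1993) and environment noise (van Hasselt, 2010). This paper shows more generally that overestimation can be caused by any estimation error. Because value estimates are always inaccurate during learning, this is an important problem. Double Q-learning (van Hasselt, 20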
stibbar 2018/03/09
reinforcement-learning

dqn

double-dqn
Link
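
The overestimation mentioned in the excerpt above can be illustrated with a small simulation. This is a minimal sketch and is not taken from the bookmarked article or the paper. The function name `overestimation_demo`, the number of actions, and the Gaussian noise model are all assumptions chosen for illustration. Every action has a true value of 0, so any non-zero average is pure estimation bias. Taking the max over one set of noisy estimates is compared with a double estimator that selects the action with one set and evaluates it with an independent second set.

```python
import random


def overestimation_demo(num_actions=10, num_trials=10000, noise_std=1.0, seed=0):
    """Compare the bias of a single max estimator with a double estimator.

    Assumption for illustration: every action's true value is 0 and each
    estimate is the true value plus independent Gaussian noise.
    """
    rng = random.Random(seed)
    single_total = 0.0
    double_total = 0.0
    for _ in range(num_trials):
        # Two independent sets of noisy value estimates for the same actions.
        q_a = [rng.gauss(0.0, noise_std) for _ in range(num_actions)]
        q_b = [rng.gauss(0.0, noise_std) for _ in range(num_actions)]

        # Single estimator: select and evaluate with the same estimates.
        single_total += max(q_a)

        # Double estimator: select with q_a, evaluate with q_b.
        best_action = max(range(num_actions), key=lambda a: q_a[a])
        double_total += q_b[best_action]

    return single_total / num_trials, double_total / num_trials


single_bias, double_bias = overestimation_demo()
print(f"single estimator mean: {single_bias:.3f}")
print(f"double estimator mean: {double_bias:.3f}")
```

Under these assumptions the single estimator's average comes out clearly above the true value of 0, while the double estimator's average stays close to 0, because the noise used to pick the action is independent of the noise used to evaluate it.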
Training Deep Learning to Learn and Predict Logic Circuits - Qiita
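As noted in the parameter list above, this time we will train a 4-layer network. The deep learning source code for the 4-layer network is as follows.
/* Program name: deeplearn.c */
/* Deep learning (training) */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <time.h>
/* Set the parameters (the values written here are examples, so set them as you like) */
#define NUM_LEARN 10000 /* number of training iterations */
#define NUM_SAMPLE 6 /* number of training samples (6 patterns are prepared this time) */
#define NUM_INPUT 3 /* input layer size (inputs to the logic circuit) */
#define NUM_HIDDEN_ONE 4 /* hidden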
stibbar 2018/03/09
qiita

deep-learning

logic-circuit
Link
Reinforcement Learning, Part 2
1. Reinforcement Learning, Part 2. 2017-01-26 @ Machine Learning Study Group, Cybozu Labs, NISHIO Hirokazu. List of related slides: https://github.com/nishio/reinforcement_learning (additions made 2017-02-24). 4. The draft of Sutton & Barto's new book is available to read. Part of the table of contents: Part 1: Tabular Solution Methods 6 Temporal-Difference Learning 8 Planning and Learning with Tabular Methods Part 2: Approximate Solution Methods 12 Eligibility Traces 13 Policy Gradient Methods Part 3: Looking Deeper 16 Applications and Case Studie
stibbar 2018/03/09
http://blog.brainpad.co.jp/entry/2017/02/24/121500

slideshare

reinforcement-learning
Link
What Do Economic Commentators Do with Their Own Money? I Ambushed Tsukasa Jonen and Came to Bitterly Regret My Life | 新R25 - Enjoy work and life more.
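"Mane-Totsu" (マネ凸) is a new series launched in response to the R25 generation's growing interest in asset management and cryptocurrency. It is an interview project in which editor-in-chief Watanabe digs into how money experts handle their own money. Last time we (forcibly) extracted words of wisdom about money from Fujita, the head of our parent company. For the second installment we wanted to hear from an expert in a different position, so we came up with this theme: "Economic commentators are always (pompously) talking about the economy, but what do they do with their own money?" What do you think? Now that it has been said, aren't you curious? So we went and ambushed Tsukasa Jonen, the popular economic commentator who has published numerous books and appears on TV and radio!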
stibbar 2018/03/09
economics

administration
Link
30 Seconds Drawing
About Thirty Seconds Drawing: This is a tool for practicing drawing that displays random poses at regular intervals. In 2005, posemaniacs.com was the first in the world to make it available on the web, and it has since spread to a variety of sites. How to use: Decide on the number of seconds and other settings, and press the Start button. The poses will change one after another with a countdown, so dr
stibbar 2018/03/09
drawing

illustration
Link