## Table of Contents

- $k$-armed Bandit Problem
- Action-value Methods
  - Estimating Action Values
  - Action Selection Rule: Greedy
  - Action Selection Rule: $\epsilon$-Greedy
- Crux: Nonstationary Action Value
  1. Transience
  2. Convergence
- Action Selection Rule: Optimistic Initial Values
- Action Selection Rule: Upper-Confidence-Bound Selection
- Gradient Bandit Algorithms
- Ending Remarks

To begin, w