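When training with SGD on a cross-entropy loss, continuing to train even after the training cost approaches 0 yields the L2 max-margin solution, and this implicit regularization effect leads to generalization. Even once the training cost or validation cost has stopped decreasing, stopping training is…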

fromTw

Bookmarked by elu_18, 2017/11/06 20:39



The Implicit Bias of Gradient Descent on Separable Data

https://arxiv.org/abs/1710.10345, 2017/11/06

We examine gradient descent on unregularized logistic regression problems, with homogeneous linear predictors on linearly separable datasets. We show the predictor converges to the direction of the...
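The claim in the comment and the abstract can be tried out on a tiny example. The following is a minimal sketch, not the paper's experiment: the three data points, the step size `lr=0.1`, the step count, and all function names are assumptions chosen for illustration. It uses full-batch gradient descent (rather than SGD) on an unregularized logistic loss with a homogeneous linear predictor (no bias term). For this toy data the maximum L2 margin is 1, attained in the direction (1, 0), so the normalized margin min_i y_i w·x_i / ‖w‖ can be compared against that value as training continues.

```python
import numpy as np

# Hypothetical toy data (not from the paper): three linearly separable points.
# With no bias term, the max-margin direction is (1, 0) and the max margin is 1.
X = np.array([[1.0, 0.0], [3.0, 3.0], [-3.0, -4.0]])
y = np.array([1.0, 1.0, -1.0])


def logistic_loss(w, X, y):
    """Mean of log(1 + exp(-y_i * w.x_i)), computed stably."""
    return np.mean(np.logaddexp(0.0, -y * (X @ w)))


def logistic_grad(w, X, y):
    """Gradient of logistic_loss with respect to w."""
    m = y * (X @ w)                      # per-example margins
    s = np.exp(-np.logaddexp(0.0, m))    # sigmoid(-m), computed stably
    return -(X * (y * s)[:, None]).mean(axis=0)


def normalized_margin(w, X, y):
    """Smallest margin after rescaling w to unit L2 norm."""
    return np.min(y * (X @ w)) / np.linalg.norm(w)


def run_gd(steps=20000, lr=0.1, checkpoints=(1, 10, 100, 1000, 10000, 20000)):
    """Full-batch gradient descent from w = 0; records stats at checkpoints."""
    w = np.zeros(X.shape[1])
    history = []
    for t in range(1, steps + 1):
        w -= lr * logistic_grad(w, X, y)
        if t in checkpoints:
            history.append((t,
                            logistic_loss(w, X, y),
                            np.linalg.norm(w),
                            normalized_margin(w, X, y)))
    return history


if __name__ == "__main__":
    for t, loss, norm, margin in run_gd():
        print(f"step {t:6d}  loss {loss:.2e}  |w| {norm:.3f}  margin {margin:.4f}")
```

In this sketch the loss is already small after the first few hundred steps, yet ‖w‖ keeps growing and the normalized margin keeps moving toward 1. That is the sense in which continuing to train after the cost has nearly vanished still changes the solution. The direction improves only slowly, which is why the step count is large relative to the size of the problem.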

4 users bookmarked · 1 comment
