Generative Integration Networks

25 Sept 2019, 19:27 (modified: 24 Dec 2019, 14:47)
ICLR 2020 Conference Desk Rejected Submission
Readers: Everyone

Abstract: This paper presents an unbiased exploration framework for the belief state $p(s)$ in non-cooperative, multi-agent, partially-observable environments through differentiable recurrent functions. As well as single-agent exploration via intrinsic reward and generative RNNs, severa