Exploration versus exploitation in model-based reinforcement learning: an empirical study (2022)
- Authors:
- USP affiliated authors: BARROS, LELIANE NUNES DE - IME ; MAUÁ, DENIS DERATANI - IME ; LOVATTO, ÂNGELO GREGÓRIO - IME
- Unit: IME
- DOI: 10.1007/978-3-031-21689-3_3
- Subject: OPTIMAL CONTROL
- Keywords: Reinforcement learning; Model-based reinforcement learning; Optimal control; Gradient optimization methods
- Funding agencies:
- Language: English
- Imprint:
- Source:
- Title: Proceedings
- Conference titles: Brazilian Conference on Intelligent Systems - BRACIS
- This journal is subscription-based
- This article is NOT open access
- Open access color: closed
ABNT
LOVATTO, Ângelo Gregório; BARROS, Leliane Nunes de; MAUÁ, Denis Deratani. Exploration versus exploitation in model-based reinforcement learning: an empirical study. 2022, Proceedings. Cham: Springer, 2022. Available at: https://doi.org/10.1007/978-3-031-21689-3_3. Accessed: 2 Jan. 2026.
APA
Lovatto, Â. G., Barros, L. N. de, & Mauá, D. D. (2022). Exploration versus exploitation in model-based reinforcement learning: an empirical study. In Proceedings. Cham: Springer. doi:10.1007/978-3-031-21689-3_3
NLM
Lovatto ÂG, Barros LN de, Mauá DD. Exploration versus exploitation in model-based reinforcement learning: an empirical study [Internet]. Proceedings. 2022 [cited 2026 Jan 02]. Available from: https://doi.org/10.1007/978-3-031-21689-3_3
Vancouver
Lovatto ÂG, Barros LN de, Mauá DD. Exploration versus exploitation in model-based reinforcement learning: an empirical study [Internet]. Proceedings. 2022 [cited 2026 Jan 02]. Available from: https://doi.org/10.1007/978-3-031-21689-3_3
- Decision-aware model learning for actor-critic methods: when theory does not meet practice
- Gradient estimation in model-based reinforcement learning: a study on linear quadratic environments
- When a robot reaches out for human help
- Analyzing the effect of stochastic transitions in policy gradients in deep reinforcement learning
- A contact network-based approach for online planning of containment measures for COVID-19
- Differentiable planning for optimal liquidation
- Model-based policy gradients: an empirical study on linear quadratic environments
- Deep reactive policies for planning in stochastic nonlinear domains
- Markov decision processes specified by probabilistic logic programming: representation and solution
- Modeling Markov decision processes with imprecise probabilities using probabilistic logic programming
DOI information: 10.1007/978-3-031-21689-3_3 (Source: oaDOI API)
How to cite
The citation is generated automatically and may not fully conform to the citation standards.
