Gradient estimation in model-based reinforcement learning: a study on linear quadratic environments (2021)
- Authors:
- USP affiliated authors: BARROS, LELIANE NUNES DE - IME ; LOVATTO, ÂNGELO GREGÓRIO - IME ; BUENO, THIAGO PEREIRA - IME
- Unit: IME
- DOI: 10.1007/978-3-030-91702-9_3
- Subjects: MODELS FOR STOCHASTIC PROCESSES; MACHINE LEARNING
- Language: English
- Source:
- Journal title: Proceedings
- Conference titles: Brazilian Conference on Intelligent Systems - BRACIS
- This journal is subscription-based
- This article is NOT open access
- Open access color: closed
ABNT
LOVATTO, Ângelo Gregório and BUENO, Thiago Pereira and BARROS, Leliane Nunes de. Gradient estimation in model-based reinforcement learning: a study on linear quadratic environments. 2021, Proceedings. Cham: Springer, 2021. Available at: https://doi.org/10.1007/978-3-030-91702-9_3. Accessed: 24 Apr. 2024.
APA
Lovatto, Â. G., Bueno, T. P., & Barros, L. N. de. (2021). Gradient estimation in model-based reinforcement learning: a study on linear quadratic environments. In Proceedings. Cham: Springer. doi:10.1007/978-3-030-91702-9_3
NLM
Lovatto ÂG, Bueno TP, Barros LN de. Gradient estimation in model-based reinforcement learning: a study on linear quadratic environments [Internet]. Proceedings. 2021; [cited 2024 Apr 24]. Available from: https://doi.org/10.1007/978-3-030-91702-9_3
Vancouver
Lovatto ÂG, Bueno TP, Barros LN de. Gradient estimation in model-based reinforcement learning: a study on linear quadratic environments [Internet]. Proceedings. 2021; [cited 2024 Apr 24]. Available from: https://doi.org/10.1007/978-3-030-91702-9_3
- Analyzing the effect of stochastic transitions in policy gradients in deep reinforcement learning
- Decision-aware model learning for actor-critic methods: when theory does not meet practice
- Exploration versus exploitation in model-based reinforcement learning: an empirical study
- Deep reactive policies for planning in stochastic nonlinear domains
- Planning in stochastic computation graphs: solving stochastic nonlinear problems with backpropagation
- Model-based policy gradients: an empirical study on linear quadratic environments
- On the performance of planning through backpropagation
- Real-time symbolic dynamic programming for hybrid MDPs
- A planner agent that tries its best in presence of nondeterminism
- Compilador de regras para geracao de um sistema especialista com encadeamento regressivo
Information about the DOI: 10.1007/978-3-030-91702-9_3 (Source: oaDOI API)
How to cite
The citation is generated automatically and may not fully comply with the standards