Decision-aware model learning for actor-critic methods: when theory does not meet practice (2020)
- Authors:
- USP affiliated authors: BARROS, LELIANE NUNES DE - IME ; MAUÁ, DENIS DERATANI - IME ; LOVATTO, ÂNGELO GREGÓRIO - IME ; BUENO, THIAGO PEREIRA - IME
- Unit: IME
- Subjects: MACHINE LEARNING; LEARNING MODELS
- Keywords: reinforcement learning; model-based; actor-critic; decision aware
- Language: English
- Source:
- Journal title: Proceedings
- Conference titles: Conference on Neural Information Processing Systems - NeurIPS
ABNT
LOVATTO, Ângelo Gregório et al. Decision-aware model learning for actor-critic methods: when theory does not meet practice. 2020, Proceedings. San Diego: NeurIPS, 2020. Available at: https://openreview.net/pdf?id=a9lwn6v40C4. Accessed: 24 Apr. 2024.
APA
Lovatto, Â. G., Bueno, T. P., Mauá, D. D., & Barros, L. N. de. (2020). Decision-aware model learning for actor-critic methods: when theory does not meet practice. In Proceedings. San Diego: NeurIPS. Retrieved from https://openreview.net/pdf?id=a9lwn6v40C4
NLM
Lovatto ÂG, Bueno TP, Mauá DD, Barros LN de. Decision-aware model learning for actor-critic methods: when theory does not meet practice [Internet]. Proceedings. 2020 [cited 2024 Apr 24]. Available from: https://openreview.net/pdf?id=a9lwn6v40C4
Vancouver
Lovatto ÂG, Bueno TP, Mauá DD, Barros LN de. Decision-aware model learning for actor-critic methods: when theory does not meet practice [Internet]. Proceedings. 2020 [cited 2024 Apr 24]. Available from: https://openreview.net/pdf?id=a9lwn6v40C4
- Gradient estimation in model-based reinforcement learning: a study on linear quadratic environments
- Analyzing the effect of stochastic transitions in policy gradients in deep reinforcement learning
- Exploration versus exploitation in model-based reinforcement learning: an empirical study
- Deep reactive policies for planning in stochastic nonlinear domains
- On the performance of planning through backpropagation
- When a robot reaches out for human help
- Planning in stochastic computation graphs: solving stochastic nonlinear problems with backpropagation
- Model-based policy gradients: an empirical study on linear quadratic environments
- A contact network-based approach for online planning of containment measures for COVID-19
- Markov decision processes specified by probabilistic logic programming: representation and solution
How to cite
The citation is generated automatically and may not fully conform to the standards.