Research contributors
Research department
Research date
Research year
2021
Journal
arXiv preprint arXiv:2101.01251
Publisher
arXiv
URL
https://arxiv.org/abs/2101.01251
Abstract
Imitation learning (IL) algorithms use expert demonstrations to learn a specific task. Most existing approaches assume that all expert demonstrations are reliable and trustworthy, but what if adversarial demonstrations exist in the given dataset? These can lead to poor decision-making performance. We propose a novel general framework that learns a policy directly from demonstrations while autonomously detecting adversarial demonstrations and excluding them from the dataset. At the same time, it is sample- and time-efficient and does not require a simulator. To model such adversarial demonstrations, we formulate a min-max problem that leverages the entropy of the model to assign a weight to each demonstration. This allows us to learn the behavior using only the correct demonstrations or a mixture containing correct demonstrations.
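The abstract's core idea — jointly training a policy on demonstrations while re-weighting each demonstration by how well the model explains it — can be illustrated with a toy sketch. This is not the paper's exact min-max formulation; it is a minimal, assumed reading in which the alternation is between a weighted behavior-cloning update and a soft re-weighting step that down-weights demonstrations the current model finds surprising (high per-sample loss). All names, the tabular setup, and the temperature value are hypothetical.

```python
import numpy as np

# Toy setup (hypothetical): discrete states/actions, tabular softmax policy.
# "Good" demonstrations follow a fixed expert; "adversarial" ones pick
# actions uniformly at random.
rng = np.random.default_rng(0)
n_states, n_actions = 5, 3
expert = rng.integers(0, n_actions, size=n_states)  # expert action per state

good = [(s, int(expert[s])) for s in rng.integers(0, n_states, size=40)]
bad = [(int(s), int(a)) for s, a in zip(rng.integers(0, n_states, size=10),
                                        rng.integers(0, n_actions, size=10))]
demos = good + bad

logits = np.zeros((n_states, n_actions))
weights = np.full(len(demos), 1.0 / len(demos))

def nll(logits, s, a):
    """Negative log-likelihood of action a in state s under the policy."""
    p = np.exp(logits[s] - logits[s].max())
    p /= p.sum()
    return -np.log(p[a] + 1e-12)

for _ in range(200):
    # Min step: weighted gradient descent on the behavior-cloning loss.
    grad = np.zeros_like(logits)
    for w, (s, a) in zip(weights, demos):
        p = np.exp(logits[s] - logits[s].max())
        p /= p.sum()
        g = p.copy()
        g[a] -= 1.0  # gradient of -log p(a|s) w.r.t. logits[s]
        grad[s] += w * g
    logits -= 0.5 * grad

    # Re-weight step: softmax over negative per-demo losses, so demonstrations
    # the model explains poorly (likely adversarial) get low weight.
    losses = np.array([nll(logits, s, a) for s, a in demos])
    weights = np.exp(-losses / 0.5)
    weights /= weights.sum()

avg_good = weights[:len(good)].mean()
avg_bad = weights[len(good):].mean()
print(avg_good > avg_bad)  # adversarial demos end up with lower average weight
```

In this sketch the re-weighting plays the role the abstract assigns to the entropy term: demonstrations inconsistent with the learned behavior are progressively excluded, so the final policy is fit almost entirely to the correct demonstrations.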