Tag
This paper proposes an automatic pipeline for generating a large-scale training dataset (RAINbow) for DialNav, a dialog-based vision-and-language navigation task. Combined with dual-strategy training and a localization model, the dataset yields substantial gains over the baseline.