Publication: Cross-Modal Transfer Learning for Image and Sound
Date
2022
Authors
Journal Title
Journal ISSN
Volume Title
Publisher
Abstract
© 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG. Research on transfer learning between similar domains has recently become increasingly common. However, cross-domain and cross-modal knowledge transfer is more complicated and has been studied less. We propose a new strategy for transfer learning between tasks on essentially different domains, called cross-modal transfer learning, and describe its ideas and algorithm. The key element of the cross-modal transfer pipeline is the cross-modal adapter, i.e. a neural network that transforms target-domain features into source-domain features that can be efficiently processed by a pre-trained neural network. In the experiments, the image dataset ImageNet and the audio dataset ESC-50 are chosen as the source and target domains, respectively. A fairly simple neural cross-modal adapter makes it possible to achieve high classification accuracy on the target domain using the knowledge obtained by the neural network pre-trained on the source domain. Our experiments also show that cross-modal transfer learning noticeably reduces training time compared with building the target model "from scratch".
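The adapter idea in the abstract can be illustrated with a toy sketch. This is not the authors' architecture or data: the frozen "source model" is a hypothetical fixed linear classifier (`W_SRC`) standing in for a pre-trained network, the adapter is a single linear map, and the names `train_adapter`, `predict`, `mean_loss` and `make_toy_data` are invented for illustration. Only the adapter weights are updated; gradients pass through the frozen model without changing it.

```python
import math
import random

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def softmax(logits):
    top = max(logits)
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical frozen source model: 2 classes over 3 source-domain features.
# A tuple is used so the weights cannot be modified during adapter training.
W_SRC = ((1.0, -1.0, 0.5), (-1.0, 1.0, -0.5))

def predict(adapter, x):
    z = matvec(adapter, x)            # adapter: target features -> source feature space
    return softmax(matvec(W_SRC, z))  # frozen source model classifies the adapted features

def mean_loss(adapter, data):
    # Mean cross-entropy over (features, label) pairs.
    return -sum(math.log(predict(adapter, x)[y]) for x, y in data) / len(data)

def train_adapter(data, d_src=3, epochs=200, lr=0.1, seed=0):
    # Full-batch gradient descent on the adapter weights only (assumed toy setup).
    rng = random.Random(seed)
    d_tgt = len(data[0][0])
    adapter = [[rng.uniform(-0.1, 0.1) for _ in range(d_tgt)] for _ in range(d_src)]
    for _ in range(epochs):
        grad = [[0.0] * d_tgt for _ in range(d_src)]
        for x, y in data:
            p = predict(adapter, x)
            err = [p[k] - (1.0 if k == y else 0.0) for k in range(len(p))]
            # Backpropagate through the frozen model: dL/dz = W_SRC^T (p - onehot(y))
            dz = [sum(W_SRC[k][i] * err[k] for k in range(len(err))) for i in range(d_src)]
            for i in range(d_src):
                for j in range(d_tgt):
                    grad[i][j] += dz[i] * x[j] / len(data)
        for i in range(d_src):
            for j in range(d_tgt):
                adapter[i][j] -= lr * grad[i][j]
    return adapter

def make_toy_data(n=40, seed=1):
    # Synthetic 4-dimensional "target domain" samples with a linearly separable label.
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = [rng.uniform(-1, 1) for _ in range(4)]
        data.append((x, 0 if x[0] + x[1] > 0 else 1))
    return data
```

Calling `train_adapter(make_toy_data())` returns a 3×4 adapter matrix; comparing `mean_loss` before and after training shows whether the adapter has learned to feed the frozen classifier usable features. In a realistic setting the adapter and the source model would both be deep networks, which this sketch does not attempt to reproduce.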
Description
Keywords
Citation
Soroka, A. A. Cross-Modal Transfer Learning for Image and Sound / Soroka, A.A., Trofimov, A.G. // Studies in Computational Intelligence. - 2022. - 1008 SCI. - P. 238-245. - 10.1007/978-3-030-91581-0_32