ID 115141
Title Alternative
New Classification and Generative Model for Medical Visual Question Answering
Author
Zhou, Yangyang University of Tokushima
Keywords
Classification model
generative model
medical image
transformer
visual question answering
Content Type
Journal Article
Description
Medical images play an important role in the medical domain. A mature medical visual question answering system can aid diagnosis, but so far no method solves this comprehensive problem satisfactorily. Because questions come in many different types, this paper proposes a model called CGMVQA that combines classification and answer generation capabilities, turning this complex problem into multiple simple problems. We apply data augmentation to images and tokenization to texts. We use a pre-trained ResNet152 to extract image features and add three kinds of embeddings together to represent texts. We reduce the parameters of the multi-head self-attention transformer to cut the computational cost. We adjust the masking and output layers to change the function of the model. The model establishes new state-of-the-art results on the ImageCLEF 2019 VQA-Med data set: 0.640 classification accuracy, 0.659 word matching, and 0.678 semantic similarity. These results suggest that CGMVQA is effective in medical visual question answering and can better assist doctors in clinical analysis and diagnosis.
Journal Title
IEEE Access
ISSN
2169-3536
Publisher
IEEE
Volume
8
Start Page
50626
End Page
50636
Published Date
2020-03-11
Rights
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Language
eng
TextVersion
Publisher
Departments
Science and Technology