An autoencoder-classified cluster of SARS-CoV-2 strain with two mutations in helicase
Miyake, Jun Osaka University
Yoshino, Mitsuaki Osaka University
Sato, Takaaki Osaka University
Niioka, Hirohiko Osaka University
Sakata, Yasushi Osaka University
Using an autoencoder-based analysis to classify SARS-CoV-2 genomes, we found a cluster consisting only of a specific genotype with two mutations in the helicase. This genotype, called C-type SARS-CoV-2, was prevalent almost exclusively in the United States from March to July 2020. C-type viruses, characterized by the paired C17747T (P504L) and A17858G (Y541C) mutations in the nsp13 gene, have never been highly prevalent at any other time or in any other part of the world. Within the U.S., Washington State was the center of the epidemic, and the C-type viruses, together with viruses carrying the wild-type helicase, seem to have set off the epidemic. In Washington State, the COVID-19 epidemic during its first two months, starting at the end of February 2020, was caused mainly by the C-type virus. During this period the infection spread rapidly. From May onward, viruses with the wild-type helicase outnumbered C-type viruses, and no C-type viruses have been collected since early July. The involvement of the helicase in COVID-19 is discussed.
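As an illustration of the genotype definition above, the following minimal Python sketch calls the two nsp13 sites on a single genome. It is not the autoencoder analysis described in the abstract. It assumes the sequence has already been aligned to the reference coordinates that the positions 17747 and 17858 refer to (presumably the Wuhan-Hu-1 reference, which the abstract does not state), and the function name and return labels are hypothetical.

```python
# Sketch only: a direct site check, not the authors' autoencoder method.
# Assumption: `aligned_genome` uses the same 1-based coordinates as the
# mutation names in the abstract (C17747T and A17858G).

# position (1-based) -> (reference base, C-type base), taken from the abstract
C_TYPE_SITES = {17747: ("C", "T"), 17858: ("A", "G")}


def classify_helicase_sites(aligned_genome: str) -> str:
    """Return a hypothetical label describing the two nsp13 sites."""
    states = []
    for pos, (ref, alt) in C_TYPE_SITES.items():
        if len(aligned_genome) < pos:
            return "undetermined"  # sequence does not cover the site
        base = aligned_genome[pos - 1].upper()  # convert to 0-based index
        if base == alt:
            states.append("alt")
        elif base == ref:
            states.append("ref")
        else:
            states.append("other")  # ambiguous or unexpected base
    if all(s == "alt" for s in states):
        return "C-type"
    if all(s == "ref" for s in states):
        return "reference"  # reference base at both sites only
    return "other"
```

Note that the "reference" label only says that both sites carry the reference base; it does not establish that the rest of the helicase is wild-type.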
This article is a preprint and has not been peer-reviewed. It reports new medical research that has yet to be evaluated and so should not be used to guide clinical practice.
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.