Novel method to predict emergence of worrisome coronavirus variants

A new method can predict the course of evolution of the novel coronavirus

Scientists have developed a new method to predict the course of evolution of the novel coronavirus and determine which lineages currently in circulation could spread widely in future, an advance that may help vaccine manufacturers stay one step ahead of worrisome antibody-escaping variants.

The study, posted on the preprint platform bioRxiv and yet to be peer-reviewed, assessed 311,795 genome sequences of the coronavirus spike protein and found mutations altering the amino acid building blocks that make up the protein.

According to the researchers, including Lipi Thukral from the CSIR-Institute of Genomics and Integrative Biology (CSIR-IGIB), New Delhi, the genome sequences revealed 2,584 mutations in the spike protein, which enables the virus to enter human cells.

Proteins are made of a chain of molecules called amino acids. Each amino acid in the chain is represented by a letter identifying which of the 20 amino acids commonly found on Earth it is, and a number denoting its position in the chain.
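The letter-and-number notation can be illustrated with a minimal Python sketch. The helper `parse_label` and its pattern are hypothetical, written only for this illustration and not taken from the study; the labels are the four spike positions named in this report. A trailing letter, where present, is assumed to denote the replacement amino acid.

```python
import re

# Hypothetical pattern: one letter (original amino acid), a position number,
# and an optional trailing letter (assumed to be the replacement amino acid).
LABEL = re.compile(r"^([A-Z])(\d+)([A-Z]?)$")

def parse_label(label):
    """Split a label such as 'N501' into (amino acid, position, replacement)."""
    match = LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognised label: {label!r}")
    original, position, replacement = match.groups()
    return original, int(position), replacement or None

for label in ["N501", "A222", "N439", "S477"]:
    print(label, "->", parse_label(label))
```

Here `N501` reads as the amino acid N (asparagine) at position 501 of the spike protein.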

Since January last year, Thukral and her team have been analysing the chain of amino acid molecules that make up the many different samples of the virus spike protein sequenced across the globe.

"Analysing such tremendous amounts of biological data that spans over one year, we monitored if there was a trend in how the virus is mutating. The main conclusion of the study is that there is a clear trend in the way in which the virus is evolving," Thukral told PTI.

Vaccines generally induce antibodies against specific parts of a pathogen's proteins, and if these parts undergo mutation, the antibodies can no longer effectively bind and neutralise the virus, the scientists said.

Experts believe the findings can not only help understand the course of evolution of the virus, but also aid in tweaking vaccines or in developing new ones.

Among the mutations analysed, the study specifically looked for those which co-occurred -- also known as mutation clusters.

"Mutational clusters are a set of co-occurring mutations. That is, whenever an amino acid, say X, in a protein gets mutated, another amino acid Y in the same protein also gets mutated. X and Y then form a mutational cluster," said Vigneshwar Ramakrishnan, Professor of Bioinformatics from SASTRA University, Tamil Nadu, who was not involved in the study.
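The idea of co-occurrence can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the method used in the study: the sample data are invented, and the function name `co_occurring_pairs` and its `min_fraction` threshold are hypothetical choices made for the example.

```python
from collections import Counter
from itertools import combinations

# Invented toy data: each set lists the mutated positions seen in one sample.
samples = [
    {"X", "Y"},
    {"X", "Y", "Z"},
    {"X", "Y"},
    {"Z"},
]

def co_occurring_pairs(samples, min_fraction=0.5):
    """Return pairs of mutations seen together in at least min_fraction of samples."""
    pair_counts = Counter()
    for mutations in samples:
        # Count every unordered pair of mutations present in the same sample.
        for pair in combinations(sorted(mutations), 2):
            pair_counts[pair] += 1
    threshold = min_fraction * len(samples)
    return {pair: n for pair, n in pair_counts.items() if n >= threshold}

print(co_occurring_pairs(samples))
```

In this toy data, X and Y appear together in three of the four samples, so they are the only pair that clears the threshold. A real analysis would also need to handle clusters larger than two mutations and far bigger datasets.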

"We are trying to monitor all these mutations simultaneously. When they occur together, it has been reported in the past that it is a means for the pathogen to evolve into a completely different strain," added Thukral, the lead author of the study.

The study found that four such clusters were prevalent in samples from across the globe, and that they shared common mutations.

Virologist Upasana Ray, who was not involved in the study, noted that the common positions in each of these clusters were the amino acid building blocks N501, A222, N439, and S477 of the spike protein.

According to the scientists, the amino acid molecules at these positions are more prone to mutate and have been reported in multiple samples collected from across the globe.

As early as July 2020, the scientists could predict that versions of the virus with the mutation known as N501, currently seen in the UK and South African variants of the coronavirus, could spread globally.

Now, the scientists have found that another mutation, occurring in the 222nd amino acid building block, or residue, of the spike protein, is prevalent across global samples, along with a clutch of other mutations.

"We are geographically monitoring a lot of these clusters, namely A222 with many mutation combinations. That is how we stumbled across A222," Thukral said.

"A222 is a special cluster because of the sheer number, the prevalence of its associated variants is very high," she added.

The study also found two other mutation clusters on the section of the spike protein which directly interacts with the human receptor ACE2 -- the gateway through which the virus enters cells.

Thukral feels there is a certain kind of selection pressure being applied on these critical residues, causing "more and more mutations" here that may allow the virus to adapt and escape antibodies.

This could lead to the origin of new worrisome lineages of the virus, she added.

"So, when a cluster of mutations survives longer, the resulting variant becomes a dominant one either by allowing it to infect more efficiently by increasing the binding ability with the receptor, by allowing the use of multiple receptors on the host, or enabling host cell entry by escaping from neutralising antibodies," said Ray from CSIR-Indian Institute of Chemical Biology (CSIR-IICB), Kolkata.

However, Thukral believes her team's platform can help stay ahead of the evolving virus.

"Let's say we are looking down the next six months, the same prediction platform could be repeated in a lab to understand what a new pool of mutations is doing, and understand if there are new selection pressures and what things have changed," Thukral said.

"If we can analyse the scientific data and track these mutations, and understand the precise mutation combinations to work on, then we can defeat the spread in time," she said.

Ramakrishnan said the study can be an important first step in developing better vaccine candidates.

"It is important for us to know what are the positions that undergo mutations so that we can design better vaccines against areas which do not change," he added.