GenomeBits representation of SARS-CoV-2 Delta and Omicron genome sequences

In a recent study published in PLoS ONE, researchers uncovered distinct genomic features of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) Delta and Omicron variants.

Study: GenomeBits insight into omicron and delta variants of coronavirus pathogen. Image Credit: CROCOTHERY/Shutterstock

Understanding SARS-CoV-2, the causal pathogen of the coronavirus disease 2019 (COVID-19) pandemic, is still challenging. It has been suggested that the SARS-CoV-2 genome might have formed due to the recombination of genomes close to those of bat and pangolin coronaviruses (CoVs). It is critical to investigate the origin of SARS-CoV-2 to prevent the occurrence of pandemics in the future.

SARS-CoV-2 Delta and Omicron variants feature common and unique mutations in the spike protein. Previously, the authors described GenomeBits [a statistical algorithm that maps nucleotide bases into a finite alternating sum series of distributed terms of binary values (0, 1)] and revealed distinct genomic patterns for SARS-CoV-2 Alpha, Beta, Gamma, Epsilon, and Eta variants.

The study and findings

In the present study, researchers applied the GenomeBits method to uncover distinctive patterns in SARS-CoV-2 Delta and Omicron genomic sequences. Genomic sequence data were obtained from the Global Initiative on Sharing All Influenza Data (GISAID) repository. In similarity plots generated using the Waterman-Eggert algorithm with the lalign36 alignment software, the authors observed a larger deviation of the Omicron variant (B.1.1.529) than of the Delta variant (AY.4.2) from the ancestral SARS-CoV-2 (Wuhan-Hu-1) sequence.

Delta sequences from Spain showed larger deviations when queried against Omicron sequences from Spain. Similar variations were noted when Delta sequences from the United States (US) were queried against Omicron sequences from the US. Conventional similarity methods provide limited information on the individual nucleotide bases, adenine (A), cytosine (C), thymine (T), and guanine (G), and choosing the parameters that give an optimal alignment can be difficult. Moreover, the computational resources required increase substantially with the number and length of the sequences.

In contrast, the GenomeBits method runs efficiently, with less processing time, on massive genomic data. The technique considers an alternating sum series whose terms are nucleotide variables converted to binary values (0, 1). The main difference between GenomeBits and other binary representation techniques is the alternating signs (±) of the terms in the GenomeBits sums. That is, if the term at a given nucleotide position is negative, the successive term is positive, and vice versa.
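The alternating sum described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `genomebits_sums` is hypothetical, and the sign convention (the first position carries a positive sign) is an assumption. For a chosen base, each position contributes 1 if that base is present and 0 otherwise, multiplied by a sign that flips at every position; the running total is returned for each position.

```python
from itertools import accumulate


def genomebits_sums(sequence, base):
    """Running alternating sum of binary indicators for one nucleotide.

    Hypothetical helper for illustration. Position k (0-indexed) contributes
    +1 or -1 only when sequence[k] equals `base`; the sign is assumed to be
    positive at even k and negative at odd k.
    """
    terms = []
    for k, nucleotide in enumerate(sequence.upper()):
        indicator = 1 if nucleotide == base else 0   # binary value (0, 1)
        sign = 1 if k % 2 == 0 else -1               # alternating sign
        terms.append(sign * indicator)
    # Cumulative sums give one curve value per base position.
    return list(accumulate(terms))


if __name__ == "__main__":
    print(genomebits_sums("AACAT", "A"))
```

Calling the function once per base (A, C, T, G) would give four curves per genome, which is one way to picture the per-nucleotide series the article refers to.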

In the GenomeBits representation, the authors observed that the curves of Delta sequences mirrored those of Omicron sequences. This became more prominent when both curves were averaged. Regions of null (low-noise) or constant average values were indicative of perfect mirroring. The technique illustrated an ordered (constant) to disordered (peak) transition near the non-structural protein (NSP)-5 polymerase within the open reading frame (ORF)-1a region up to part of the spike protein.
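One way to picture the averaging step is a position-by-position mean of two curves. The sketch below assumes both curves have already been computed and cover the same base positions; the function name `average_curves` is hypothetical, and the article does not say exactly how the authors averaged their curves. Where one curve mirrors the other around zero, the mean is zero.

```python
def average_curves(curve_a, curve_b):
    """Pointwise mean of two equal-length curves (illustrative helper).

    Assumes both curves are indexed by the same base positions.
    """
    if len(curve_a) != len(curve_b):
        raise ValueError("curves must have the same length")
    return [(a + b) / 2 for a, b in zip(curve_a, curve_b)]


if __name__ == "__main__":
    # Toy values only: the second curve is the mirror image of the first.
    print(average_curves([1, 0, -1], [-1, 0, 1]))
```

In this toy case the average is flat, which corresponds to the "null or constant" regions described above; stretches where the average departs from a constant would mark positions where the mirroring breaks down.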

Distinct patterns were also observed around the spike region. The disordered (peak) curves diverged rapidly, denoting dissimilarities that grew with increasing base position. The positive and negative terms partly canceled out, so the sums converged to some non-zero values. Furthermore, noise in the data could be reduced by applying sliding windows of different sizes, up to 500 bases.
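A sliding window can be read as a moving average over the curve values. The following is a minimal sketch under that assumption; the article does not specify which window statistic the authors used, and the name `sliding_mean` is hypothetical. Larger windows smooth out more of the position-to-position fluctuation at the cost of resolution.

```python
def sliding_mean(values, window):
    """Moving average over `values` with a fixed window size.

    Illustrative helper: returns one mean per full window, so the
    output has len(values) - window + 1 entries.
    """
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    total = sum(values[:window])
    means = [total / window]
    for i in range(window, len(values)):
        # Slide the window by one position: add the new value, drop the oldest.
        total += values[i] - values[i - window]
        means.append(total / window)
    return means


if __name__ == "__main__":
    print(sliding_mean([1, 0, 0, -1, -1], 2))
```

On a full genome the same call with a window of up to 500 would correspond to the window sizes mentioned in the article.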

Conclusions

The researchers observed constant and peaked transitions around the spike protein region of SARS-CoV-2 Delta and Omicron variants using the GenomeBits method. Numerical representations of genomic sequences have been instrumental in bioinformatics and could help handle enormous sequence data. GenomeBits might help with future bioinformatics surveillance of infectious diseases, and sequence-to-numeral mapping methods would likely prevail for characterizing new sequences.

Journal reference:
  • Canessa E, Tenze L. (2022). GenomeBits insight into omicron and delta variants of coronavirus pathogen. PLoS ONE. doi: https://doi.org/10.1371/journal.pone.0271039


Written by

Tarun Sai Lomte

