SARS-CoV-2 genome: Decoded

The world is witnessing an ongoing global pandemic of coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The outbreak was first identified in Wuhan, China, in December 2019 and has since spread to other countries through human-to-human transmission.

While developing a vaccine seems to be the most efficient way to control this pandemic and prepare for future outbreaks, understanding the structures of the virus, and the role each one plays in its replication, is an important step towards developing such vaccines and antivirals.

Here we analyze the structures that SARS-CoV-2 contains and their functions in enabling efficient replication in its host cells.

1. A string of RNA

The RNA strand of the novel Coronavirus.

The viral RNA genome is wrapped in a protective protein covering called the N protein, or nucleocapsid protein. Viruses must hijack living cells to replicate and spread. When the coronavirus finds a suitable cell, it injects a strand of RNA that contains the entire coronavirus genome. Scientists have identified genes for as many as 29 proteins, which carry out a range of jobs from making copies of the coronavirus to suppressing the body’s immune responses. A sequence in the RNA strand injected into the host cell then recruits ribosomes to translate the RNA genome into proteins.

A ribosome in the host cell translating the RNA

The coronavirus genome contains long sequences called ORFs (open reading frames). When the host cell’s ribosomes translate the first and largest of these, ORF1ab, the result is a chain of 16 proteins called NSPs (non-structural proteins). NSPs play a very important role in helping the virus hijack the cell’s machinery so that it works for viral replication and the spread of the infection.
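The translation step described above can be sketched in a few lines of Python. This is a toy illustration, not a model of the real ribosome: the function name `translate_first_orf` and the short example RNA are made up for this sketch, and it simply reads from the first start codon (AUG) to the first stop codon using the standard genetic code.

```python
from itertools import product

# Standard genetic code, listed in U, C, A, G order for each codon position.
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_first_orf(rna):
    """Translate the first open reading frame found in an RNA string.

    Reading starts at the first AUG and stops at the first in-frame stop
    codon (or when the sequence runs out). Hypothetical helper for this
    sketch only.
    """
    start = rna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing to translate
    protein = []
    for i in range(start, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":
            break  # stop codon: the ribosome releases the chain
        protein.append(amino_acid)
    return "".join(protein)


# Made-up toy sequence: AUG GAG AGC CUU UAA, with two extra bases on each side.
print(translate_first_orf("GGAUGGAGAGCCUUUAAGG"))  # MESL
```

In the real virus, the chain made from ORF1ab is thousands of amino acids long and is later cut into the individual NSPs; the toy above only shows the codon-by-codon reading.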

The NSPs that the virus produces in its host cell are listed below:

NSP1 (Non-Structural Protein 1)

NSP1 can be called the cell saboteur. This protein slows down the infected cell’s production of its own proteins. This sabotage forces the cell to make more virus proteins and prevents it from assembling antiviral proteins that could stop the virus.

Structure of NSP1

This protein’s role in the replication cycle is very important: by interfering with the host cell’s own mRNAs, it speeds up the production of viral proteins inside the host cell.

NSP2 (Non-Structural Protein 2)

It is known as the mystery protein because its function inside the host cell isn’t clear. It is believed to work in association with other proteins, and it has been speculated that it helps mask the virus from the host immune system.

Ribbon structure of NSP2
Surface density diagram of NSP2

NSP3 (Non-Structural Protein 3)

NSP3 is a large protein that has two important jobs. One is cutting loose other viral proteins so they can do their own tasks. When an ORF is translated, many proteins are produced joined together in a single chain; NSP3 helps cut them apart so that each can perform its own task.

It also alters many of the infected cell’s proteins. Normally, a healthy cell tags old proteins for destruction. But the coronavirus can remove those tags, changing the balance of proteins and possibly reducing the cell’s ability to fight the virus.

Structure of an NSP3 protein.

NSP4 (Non-Structural Protein 4)

It is known as the bubble maker. Combining with other proteins, NSP4 helps build fluid-filled bubbles within infected cells. Inside these bubbles, parts for new copies of the virus are constructed. As the virus continues to translate proteins and make new structures inside the host cell, these are added to the cell-like structure (bubble), which eventually develops into a new virus with all its structures arranged, ready to move out of the cell and target more host cells. The bubble, along with all the necessary structural proteins, is assembled inside the endoplasmic reticulum of the host cell.

NSP5 (Non-Structural Protein 5)

This protein makes most of the cuts that free other NSPs to carry out their own jobs. Together with NSP3, it plays an important role in cutting the proteins free and ensuring they function individually after translation.
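The cutting job can be pictured as splitting one long string into pieces at marked positions. The sketch below is a simplified illustration under stated assumptions: it treats a cut site as the two amino acids `LQ` followed by `S`, `A` or `G` (a commonly described pattern for this protease, used here as an assumption), and the function name `cleave` and the example chain are invented for the sketch.

```python
def cleave(polyprotein, before="LQ", after="SAG"):
    """Split an amino-acid chain wherever `before` is followed by one of `after`.

    The cut falls between the motif and the next residue, so each fragment
    (except the last) ends in `before`. Hypothetical helper for illustration.
    """
    fragments = []
    start = 0
    for i in range(len(before), len(polyprotein)):
        motif_matches = polyprotein[i - len(before):i] == before
        if motif_matches and polyprotein[i] in after:
            fragments.append(polyprotein[start:i])
            start = i
    fragments.append(polyprotein[start:])  # whatever is left after the last cut
    return fragments


# Made-up chain with two cut sites (LQ|S and LQ|A).
print(cleave("MKTAVLQSGFRKLQAEELK"))  # ['MKTAVLQ', 'SGFRKLQ', 'AEELK']
```

A real protease recognizes the three-dimensional shape around the cut site, not just a short run of letters, so this is only a rough picture of how one chain becomes many independent proteins.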

Structure of NSP6

NSP6 (Non-Structural Protein 6)

Works with NSP3 and NSP4 to make virus factory bubbles.

Image representing assembly of required proteins inside the bubble in endoplasmic reticulum eventually creating a copy of the virus

NSP7 & NSP8 (Non-Structural Proteins 7 and 8)

NSP7
NSP8

These two proteins help NSP12 make new copies of the RNA genome, which can ultimately end up inside new viruses.

When new virus particles are produced inside the cell, each must be given a new RNA genome, which carries the instructions for producing the necessary proteins. These NSPs play an important role in replicating that genome.

NSP9 (Non-Structural Protein 9)

NSP9 works at the heart of the cell. This protein infiltrates tiny channels in the infected cell’s nucleus, which holds our own genome. It may be able to influence the movement of molecules in and out of the nucleus, but for what purpose is still unclear.

Structure of NSP9

NSP10 (Non-Structural Protein 10)

NSP10’s main function is genetic camouflage. Human cells have antiviral proteins that find viral RNA and shred it. This protein works with NSP16 to camouflage the virus’s genes so that they don’t get attacked.

Structure of NSP10

NSP12 (Non-Structural Protein 12)

This protein assembles genetic letters into new virus genomes. Researchers have found that the antiviral Remdesivir interferes with NSP12 in other coronaviruses, and trials are now underway to see if the drug can treat Covid-19.

Structure of NSP12

NSP13 (Non-Structural Protein 13)

NSP13’s main function is unwinding RNA.

Normally, virus RNA is wound into intricate twists and turns. Scientists suspect that NSP13 unwinds it so that other proteins can read its sequence and make new copies. Reading the viral RNA is an important step in the replication process.

NSP14 (Non-Structural Protein 14)

NSP14’s main function is to act as the viral proofreader.

NSP14 checks for errors in the new copies of the RNA made by NSP12. As NSP12 copies the genome for new virus particles, it sometimes adds wrong letters, which could hinder accurate translation in later infections. NSP14 cuts these errors out so that the correct letters can be added.
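The copy-then-check idea can be sketched as two small functions: one that copies a template with occasional mistakes, and one that compares the copy with the template and repairs mismatches. This is a toy model under loud assumptions: the function names, the error rate and the example sequence are invented, the copy is modelled as a base-by-base complementary strand, and the real enzymes work very differently in detail.

```python
import random

# Base-pairing rules for RNA.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def complement(template):
    """Return the error-free complementary strand, position by position."""
    return "".join(PAIR[base] for base in template)


def error_prone_copy(template, error_rate, rng):
    """Copy a template, inserting a wrong letter with probability `error_rate`.

    Stands in for a sloppy copying enzyme; hypothetical helper.
    """
    copy = []
    for base in template:
        correct = PAIR[base]
        if rng.random() < error_rate:
            copy.append(rng.choice([b for b in "ACGU" if b != correct]))
        else:
            copy.append(correct)
    return "".join(copy)


def proofread(template, copy):
    """Replace every mismatched letter and report how many were fixed."""
    fixed = []
    corrections = 0
    for base, copied in zip(template, copy):
        correct = PAIR[base]
        if copied != correct:
            corrections += 1
        fixed.append(correct)
    return "".join(fixed), corrections


rng = random.Random(0)  # fixed seed so the run is repeatable
template = "AUGGAGAGCCUUGUCCCUGG"  # made-up sequence
noisy = error_prone_copy(template, error_rate=0.2, rng=rng)
clean, n_fixed = proofread(template, noisy)
print(noisy, clean, n_fixed)
```

Whatever errors the noisy copy contains, the proofread result always matches the error-free complement, which is the point of having a proofreader alongside the copier.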

Structure of NSP14

NSP15 (Non-Structural Protein 15)

Researchers suspect that this protein chops up leftover virus RNA as a way to hide from the infected cell’s antiviral defenses.

Structure of NSP15

NSP16 (Non-Structural Protein 16)

NSP16 adds more genetic camouflage.

NSP16 works with NSP10 to help the virus’s genes hide from proteins that chop up viral RNA.

Structure of NSP16

Spike protein (S)

The projections on the surface of the virus are the spike (S) proteins. The spike protein is one of four structural proteins that form the outer layer of the coronavirus and protect the RNA inside. Structural proteins also help assemble and release new copies of the virus. Only after these structural proteins have been assembled in the bubble is the virus released from the cell.

RBD form
Structure of the S protein

Because the spike proteins on the surface of the virus look like a crown, the virus family is referred to as coronaviruses. The spike protein plays an important role in initiating infection by binding to a receptor on human cells called ACE2.

Representation of S spike protein with the ACE2 receptor (yellow part)

Before we look at the rest of the proteins, take a look at the SARS-CoV-2 genome and the positions of the various structural proteins that will be discussed ahead.

Timeline of SARS-COV2 genome and a depiction of the skeleton model of the virus.

The non-structural proteins that we have looked at so far are encoded by ORF1a and ORF1b.

ORF3a protein (ESCAPE ARTIST)

ORF3a, like ORF1ab, is an open reading frame. Unlike ORF1ab, which yields a whole chain of proteins, it codes for a single accessory protein.

The SARS-CoV-2 genome also encodes a group of so-called “accessory proteins.” They help change the environment inside the infected cell to make it easier for the virus to replicate.

The ORF3a protein pokes a hole in the membrane of an infected cell, making it easier for new viruses to escape. It also triggers inflammation, one of the most dangerous symptoms of Covid-19. ORF3b overlaps the same RNA, but scientists aren’t sure if SARS-CoV-2 uses this gene to make proteins.

Envelope protein (E)

Envelope protein as seen on the surface of the virus

The envelope protein is a structural protein that helps form the oily bubble of the virus. It may also have jobs to do once the virus is inside the cell. Researchers have found that it latches onto proteins that help turn our own genes on and off. It’s possible that pattern changes when the E protein interferes.

Membrane protein (M)

Another structural protein that forms part of the outer coat of the virus.

Structure of the M protein

ORF6 (Open Reading Frame 6)

ORF6’s main function is as a signal blocker.

This accessory protein blocks signals that the infected cell would send out to the immune system. It also blocks some of the cell’s own virus-fighting proteins, the same ones targeted by other viruses such as polio and influenza.

ORF7a (Open Reading Frame 7a)

ORF7a is also referred to as the virus liberator.

When new viruses try to escape a cell, the cell can snare them with proteins called tetherin. Some research suggests that ORF7a cuts down an infected cell’s supply of tetherin, allowing more of the viruses to escape. Researchers have also found that the protein can trigger infected cells to commit suicide — which contributes to the damage Covid-19 causes to the lungs. ORF7b overlaps this same stretch of RNA, but it’s not clear what, if anything, the gene does.

Structure of ORF7a protein.

ORF8 (Open Reading Frame 8)

The gene for this accessory protein is dramatically different in SARS-CoV-2 than in other coronaviruses. Researchers are debating what it does.

Understanding these mysterious coronavirus proteins can add to our knowledge of how these viruses work and may reveal potential targets for the development of antivirals.

Structure of ORF8

Nucleocapsid protein (N)

The nucleocapsid protein protects the virus RNA, keeping it stable inside the virus. Many N proteins link together in a long spiral, wrapping and coiling the RNA, as described earlier.

The accessory proteins ORF9b and ORF9c overlap this same stretch of RNA. ORF9b blocks interferon, a key molecule in the defense against viruses, but it’s not clear if ORF9c is used at all.

End of the line: End of the SARS-CoV-2 genome

The coronavirus genome ends with a snippet of RNA that stops the cell’s protein-making machinery. It then trails away as a repeating sequence of aaaaaaaaaaaaa.
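As a small illustration of that trailing run of a's, the sketch below measures how long the run is at the end of a sequence string. The function name `poly_a_length` and the example sequence are made up, and the count is only a rough estimate, since any a's that belong to the genome proper but sit directly before the tail are counted too.

```python
def poly_a_length(sequence):
    """Count the unbroken run of 'a' (or 'A') letters at the end of a sequence."""
    trimmed = sequence.rstrip("aA")  # drop every trailing a
    return len(sequence) - len(trimmed)


# Made-up ending: four other letters followed by a tail of thirteen a's.
print(poly_a_length("cacg" + "a" * 13))  # 13
```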

Understanding the genetic makeup of the virus is the first step towards the efficient development of antivirals and vaccines.
