Research

Gene expression is intricately regulated by RNAs and RNA-binding proteins (RBPs). The overarching goal of our research is to understand how non-coding RNAs and RBPs contribute to gene regulation in human health and disease.

TSS-miRNA biogenesis

MicroRNAs (miRNAs) affect diverse cellular pathways critical to human development and disease, underscoring the need to elucidate the mechanisms by which their levels are regulated. Canonically, miRNAs are produced from primary transcripts that are cleaved by the nuclear Microprocessor complex; the resulting precursor (pre-)miRNA hairpins are exported from the nucleus by Exportin-5 and further processed by cytoplasmic Dicer. In 2018, we discovered an alternative biogenesis pathway that produces 7-methylguanosine-capped pre-miRNAs with 5ʹ extensions. We named the miRNAs produced by this pathway transcription start site (TSS-) miRNAs. Our results delineated an unusual pathway that differs from canonical miRNA biogenesis in pre-miRNA synthesis, nuclear-cytoplasmic transport, Dicer processing and guide strand selection. It therefore bears on a broad spectrum of biological research areas, from RNA Pol II transcription and RNA processing and export to studies of miRNA function and RNAi vector development.

miRNA biogenesis pathways: The canonical miRNA biogenesis pathway (top panel) is compared to the TSS-miRNA biogenesis pathway (bottom panel).

TSS-miRNA function

Because TSS-miRNAs are produced independently of Drosha processing, their levels are elevated relative to most miRNAs in cancer cells with low Drosha expression. To probe the function of TSS-miRNAs, we adopted an AGO-CLASH (crosslinking, ligation and sequencing of hybrids) protocol. CLASH is a modified CLIP-seq (crosslinking and immunoprecipitation followed by RNA sequencing) method that physically ligates AGO-bound miRNAs to their target mRNAs, allowing for high-confidence identification of the miRNA targetome. By performing CLASH in colorectal cancer HCT116 cells lacking Drosha, we identified the targetome of TSS-miRNAs. We also discovered that the most abundant TSS-miRNA, miR-320a, down-regulates expression of the endoplasmic reticulum chaperone calnexin and activates expression of the cell stress-inducible transcription factor ATF4. Therefore, miR-320a activates both the unfolded protein response and the downstream integrated stress response, pathways that contribute to oncogenesis.

AGO-CLASH workflow: The AGO-irCLASH (infrared Crosslinking, Ligation and Sequencing of Hybrids) protocol biochemically identifies high-confidence target-directed miRNA degradation (TDMD) elements in different cell types. Briefly, in AGO-irCLASH experiments, cells of interest are crosslinked with 254 nm UV. Because 254 nm UV crosslinks only RNAs and proteins that are within a few angstroms of each other, this treatment preserves native AGO-RNA complexes in the cells. The cells are then lysed, and the crosslinked AGO-RNA complexes are isolated by immunoprecipitation with AGO antibodies. Immunoprecipitation in CLASH is performed under stringent washing conditions so that only the target protein and directly crosslinked RNAs are recovered. To capture the crosslinked target RNA/miRNA pairs, RNAs are partially trimmed by RNase A, ligated intermolecularly and then to ir-adaptors, size-selected on a nitrocellulose membrane, and finally converted to cDNA for Illumina sequencing. Since RNase A digestion leaves a 3ʹ phosphate on the target RNA, the intermolecular ligation mostly occurs between the 3ʹ end of the miRNA, which is usually spared from RNase digestion, and the 5ʹ end of the digested target RNA, which is phosphorylated by T4 PNK.
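Because the intermolecular ligation joins the miRNA 3ʹ end to the target 5ʹ end, a hybrid read is expected to begin with the miRNA and continue into the target fragment. The following is a minimal Python sketch of how such a read could be split computationally. It is an illustration only, not the lab's analysis pipeline: the function name `split_hybrid`, the length thresholds, and the toy sequences are all hypothetical, and real pipelines align both parts to reference databases and tolerate mismatches.

```python
from typing import Dict, Optional, Tuple

# Toy, made-up sequences for illustration only (not real miRNAs or targets).
TOY_MIRNAS = {"toy-miR": "ACGUUGCAAGGCUUAACCGGUA"}
TOY_TARGET = "GGCATCGATTTACGCCATGA"
TOY_READ = "ACGTTGCAAGGCTTAACCGGTA" + TOY_TARGET


def split_hybrid(
    read: str,
    mirnas: Dict[str, str],
    min_prefix: int = 18,   # assumed minimum exact match to call the miRNA part
    min_target: int = 15,   # assumed minimum length of the remaining target part
) -> Optional[Tuple[str, str, str]]:
    """Split a 5'-miRNA/target-3' hybrid read into (miRNA name, miRNA part, target part).

    Returns None when no annotated miRNA matches the start of the read or
    when the remaining target fragment is too short.
    """
    read = read.upper().replace("U", "T")  # work in the DNA alphabet of cDNA reads
    best = None  # (miRNA name, number of read bases assigned to the miRNA)
    for name, seq in mirnas.items():
        seq = seq.upper().replace("U", "T")
        if len(seq) < min_prefix or not read.startswith(seq[:min_prefix]):
            continue
        # Extend the exact match up to the annotated miRNA 3' end.
        k = min_prefix
        while k < len(seq) and k < len(read) and read[k] == seq[k]:
            k += 1
        if len(read) - k >= min_target and (best is None or k > best[1]):
            best = (name, k)
    if best is None:
        return None
    name, k = best
    return name, read[:k], read[k:]
```

Calling `split_hybrid(TOY_READ, TOY_MIRNAS)` returns the miRNA name together with the two halves of the read. The miRNA part is capped at the annotated miRNA length, so any extra 3ʹ nucleotides end up in the target part in this simplified version.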

Target-directed miRNA degradation

Starting in 2020, we investigated a mechanism called target RNA-directed miRNA degradation (TDMD), in which the levels of specific miRNAs are controlled by targets that form extensive base-pairing interactions with them. First discovered with viral and artificial transcripts, TDMD had been demonstrated for only three cellular RNA transcripts before 2021. With CLASH established, we pursued TDMD as a new research direction. The premise for transcriptome-wide identification of TDMD triggers in CLASH data is that target RNAs and miRNAs are physically ligated, so we can capture the miRNA 3ʹ end A/U extensions that often occur during TDMD. From CLASH data sets obtained from six human and mouse cell lines, we predicted eighteen high-confidence TDMD triggers and experimentally validated eight, substantially expanding the inventory of endogenous TDMD transcripts. Among these was BCL2L11, which encodes the pro-apoptotic BIM protein. By degrading anti-apoptotic miR-221 and miR-222, the BCL2L11 TDMD trigger could enhance BIM-induced apoptosis. This finding supports a new gene-regulatory model in which TDMD trigger sequences within mRNAs cooperate with the encoded proteins to control critical cellular functions. The next key steps are to gain a full functional understanding of bona fide TDMD triggers and to explore their potential therapeutic value.

TDMD: A. In addition to regular miRNA target identification, AGO-irCLASH allows a TDMD pattern search based on three criteria: (1) the seed region and the 3ʹ half of the miRNA are extensively base-paired with the target RNA; (2) the candidate TDMD target RNA/miRNA hybrids are more likely to contain 3ʹ A/U extensions; (3) the putative TDMD trigger resides in an evolutionarily conserved region. B. Model of cooperative apoptosis induction by the BCL2L11 mRNA.
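The second criterion, enrichment of 3ʹ A/U extensions in hybrids with a candidate trigger, lends itself to a simple computational check. The Python sketch below is a hedged illustration, not the published analysis: the function names `au_tail` and `tailed_fraction` and the input format (pairs of target ID and observed miRNA sequence) are assumptions. It also ignores 3ʹ trimming and templated extensions, which a real analysis would need to handle.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def au_tail(observed: str, annotated: str) -> str:
    """Return the 3' tail if `observed` is the annotated miRNA plus A/U-only extra bases.

    Returns an empty string when there is no tail, when the tail contains
    other nucleotides, or when the observed sequence does not start with
    the annotated miRNA.
    """
    obs = observed.upper().replace("T", "U")
    ref = annotated.upper().replace("T", "U")
    if not obs.startswith(ref):
        return ""
    tail = obs[len(ref):]
    return tail if tail and set(tail) <= {"A", "U"} else ""


def tailed_fraction(
    hybrids: Iterable[Tuple[str, str]], annotated: str
) -> Dict[str, float]:
    """Compute, per target ID, the fraction of hybrids whose miRNA carries an A/U tail.

    `hybrids` yields (target_id, observed_mirna_sequence) pairs for one miRNA.
    """
    total: Dict[str, int] = defaultdict(int)
    tailed: Dict[str, int] = defaultdict(int)
    for target_id, observed in hybrids:
        total[target_id] += 1
        if au_tail(observed, annotated):
            tailed[target_id] += 1
    return {target_id: tailed[target_id] / total[target_id] for target_id in total}
```

A candidate trigger site would then be one whose tailed fraction is clearly higher than that of the same miRNA's other target sites. Deciding what counts as "clearly higher" requires a statistical test and sufficient read depth, neither of which is shown here.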

m6A modification of 7SK snRNA

In 2023, we discovered that 7SK small nuclear RNA contains high levels of N6-methyladenosine (m6A) in non-small cell lung cancer (NSCLC) cells and that m6A-7SK is essential for Pol II transcriptional control. The m6A writer METTL3 and the m6A eraser ALKBH5 dynamically regulate 7SK methylation at eight different sites. Importantly, specific removal of m6A from 7SK using a dCasRx-ALKBH5 system dampens global Pol II transcription and inhibits NSCLC cell growth. Mechanistically, removal of the m6A modification induces a 7SK conformational change that sequesters P-TEFb, leading to significant Pol II pausing and reduced transcriptional output. Our results reveal a new layer of transcriptional regulation by m6A modification of a non-coding RNA. This discovery bears on a broad spectrum of biological research areas, from Pol II transcription, m6A RNA modification, RNA structure and RNA-protein interaction to studies of NSCLC biology and therapeutic development.

This is a two-panel scientific schematic comparing two cellular states side by side. The figure explains how a modified RNA molecule called 7SK may affect RNA polymerase II transcription and NSCLC cell growth. The two panels are separated vertically:

Left panel: a light pink/tan background
Right panel: a light blue-gray background

Left panel: high m6A-7SK in NSCLC cells
At the top left, the title reads:

“NSCLC cells”
“high m6A-7SK”

Below the title is a large, curly, black line drawing representing an RNA molecule. Several small red circles are attached to this RNA. Each red circle is labeled “m6”, indicating multiple m6A modifications along the RNA.
A pink rounded rectangle labeled “hnRNPs” overlaps part of the RNA, suggesting that hnRNP proteins bind to this highly modified RNA.
In the center of the figure, between the two panels, are two gray arrows pointing in opposite directions:

The upper arrow points left and is labeled “METTL3”
The lower arrow points right and is labeled “ALKBH5”

These arrows imply that METTL3 and ALKBH5 regulate the m6A state of 7SK.
Below the RNA on the left side is a green rounded rectangle labeled “P-TEFb”. A black downward arrow points from the RNA region toward this P-TEFb box, suggesting release or recruitment of P-TEFb.
Further down is a simplified transcription complex:

A horizontal line represents DNA
A dark box on the left side of the line says “Pol II pro”
A large pale blue oval centered on the DNA line is labeled “Transcribing Pol II”
Near the top of this complex are labels “S2P” and “S5P”
Small orange dots are scattered near the polymerase and along a thin blue tail-like line
Two smaller blue circles beneath the main polymerase are labeled “NELF” and “DSIF”

At the bottom of the left panel, the caption reads:

“High level of Pol II transcription”

This side visually communicates an active transcription state.

Right panel: low m6A-7SK
At the top right, the title reads:

“low m6A-7SK”

Below that is another black RNA line drawing, but this RNA has no visible red m6-labeled circles. The RNA is associated with two overlapping colored boxes:

A green rounded rectangle labeled “P-TEFb”
A pink rounded rectangle labeled “HEXIM”

These overlapping boxes sit directly on the RNA, suggesting that P-TEFb is retained in a complex with HEXIM when m6A is low.
Lower in the right panel is another transcription complex:

A horizontal DNA line
A dark box on the left labeled “Pol II pro”
A large pale blue oval labeled “Paused Pol II”
A label “S5P” appears near the top, but S2P is absent
Smaller blue circles below are again labeled “NELF” and “DSIF”

At the bottom right, the caption reads:

“Reduced Pol II transcription”
“Inhibited NSCLC cell growth”

This side visually communicates a less active, paused transcription state associated with reduced cancer cell growth.
Model of m6A-7SK-mediated Pol II transcription activation