Multiplex genomic recorders
A technology for recording transcription events directly in DNA
The more things in biology we discover today, the faster we can discover things tomorrow. Biologists are the new engineers. But their tools look a lot different than any we’ve seen before. Sequencing is the microscope of tomorrow. And sequencing was built by biological tools.
- Laura Deming
Biology feels different right now. New broadly enabling technologies and tools are driving rapid progress in nearly every field. The large-scale adoption and application of a powerful set of common tools has created a virtuous cycle of further technology refinement and engineering. The rate of iteration is increasing, and previously intractable problems are now within reach.
A wave of exciting new developments in genome engineering and molecular technology has swept onto bioRxiv in recent weeks. Reflecting on some of the new results, Zack Chiang cited an idea from an essay by Laura Deming: that there is a Flywheel Effect at work in biology and biotechnology at large. The central idea is that one of the most distinctive things about biotechnology is that it is discovered, not invented.
At the core of tools such as PCR, next-generation sequencing, and CRISPR, the most important machinery is biological in nature. The more we discover and understand, the more incredible new tools we have to compose with what is already in our growing molecular toolbox.
Recent work has made it feel as if these efforts are beginning to compound, and that progress in biology is in the midst of a period of rapid acceleration. I’m going to highlight the preprint entitled “Multiplex genomic recording of enhancer and signal transduction activity in mammalian cells” which introduces a beautiful new molecular tool that leverages the new prime editing technology to record molecular events into DNA. This work was led by Wei Chen and Junhong Choi from the Shendure Lab at the University of Washington.
Across the tree of life, both development and disease are mediated by a core set of signaling pathways and gene regulatory modules. Developmental biologists aim to deeply understand how these pathways and modules orchestrate the astonishing programmed complexity of single cells transforming into three-dimensional organisms of all shapes and sizes. Disease researchers study the multitude of states in which homeostasis is lost in these core modules. Geneticists try to understand how allelic variation propagates through these pathways to produce the tremendous phenotypic variation we observe.
In all cases, there is a fundamental question: how do we measure these core signaling pathways and gene regulatory modules? In modern molecular biology and genomics, sequencing-based technologies represent the current gold standard. RNA-seq enables genome-wide profiling of transcriptional activity. When trying to determine the importance of enhancer sequences, which are DNA sequences that exert regulatory control over gene expression, one of the most powerful and high-throughput approaches is the massively parallel reporter assay (MPRA).
While there are many technical variations, the core idea of all MPRA experiments is the same. First, a library of plasmids is introduced, each with a unique DNA barcode, a cis-regulatory element (CRE) [1], a minimal promoter, and a reporter gene (such as luciferase, which produces light when expressed):
As shown in the schematic above, the expression driven by the different CREs can be read out as a ratio of the amount of RNA barcodes present relative to the DNA barcodes using next-generation sequencing.
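To make the barcode ratio concrete, here is a minimal sketch of how an MPRA activity score could be computed from sequencing reads. The function name, the pseudocount, and the depth normalization are my own illustrative assumptions, not the pipeline of any particular MPRA paper.

```python
import math
from collections import Counter


def mpra_activity(dna_reads, rna_reads, pseudocount=1.0):
    """Return log2(RNA fraction / DNA fraction) for every barcode seen in the DNA library.

    dna_reads and rna_reads are lists of barcode strings, one per sequencing read.
    The pseudocount (an assumption of this sketch) avoids division by zero for
    barcodes that are present in the plasmid pool but never transcribed.
    """
    dna = Counter(dna_reads)
    rna = Counter(rna_reads)
    n_barcodes = len(dna)
    dna_total = sum(dna.values()) + pseudocount * n_barcodes
    rna_total = sum(rna[barcode] for barcode in dna) + pseudocount * n_barcodes

    activity = {}
    for barcode, dna_count in dna.items():
        dna_frac = (dna_count + pseudocount) / dna_total
        rna_frac = (rna[barcode] + pseudocount) / rna_total
        activity[barcode] = math.log2(rna_frac / dna_frac)
    return activity


# Toy data: both barcodes are equally abundant as DNA,
# but the CRE linked to "AAA" drives three times more RNA.
example = mpra_activity(
    dna_reads=["AAA"] * 10 + ["CCC"] * 10,
    rna_reads=["AAA"] * 30 + ["CCC"] * 10,
)
```

A barcode with a positive score is over-represented in RNA relative to its share of the plasmid pool, which is read as the linked CRE driving expression.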
While RNA-seq and MPRAs are both valuable approaches, they come with some limitations. Fundamentally, each measurement is a single static slice of a dynamic process, and the process itself can only be inferred by piecing the slices together. The quality of the reconstruction is limited by sampling density. What if we could measure these systems continually, as events occur, in a way that didn’t require destructive sampling?
Here, the fundamental idea is that “DNA is the natural medium for biological information storage, and is easily ‘read’ through sequencing.” This forms the basis for this new technology: ENGRAM (ENhancer-driven Genomic Recording of transcriptional Activity in Multiplex). The workflow of this technique is very similar to that of the MPRA introduced above, but with an important twist. Instead of destroying the cell and sequencing a ratio of barcodes, the transcription event is recorded by the insertion of a barcode into a locus of DNA in the cell via prime editing.
This is accomplished by some really clever molecular biology. Prime editing is a powerful new genome editing technique from the Liu Lab at Harvard that makes both the location of the edit and the edit itself programmable. Guide RNAs, including prime-editing guide RNAs (pegRNAs), are normally made by RNA polymerase III, whereas enhancer-driven transcription runs through RNA polymerase II. To bridge that gap, the authors used a CRISPR endoribonuclease called Csy4 to enzymatically release the pegRNA from the enhancer-driven transcript, which is what enables recording.
While there are certainly molecular details that take time to fully grok, the core idea is powerful and simple: dynamic cellular events are recorded into DNA non-destructively as they happen, and the events are read using sequencing after the fact.
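Here is a rough sketch of what the "read it out after the fact" step could look like computationally: sequence the recording locus, pull out whatever was inserted between two fixed flanking sequences, and tally the fraction of reads carrying each barcode. The flank sequences, the barcodes, and the barcode-to-signal mapping below are all made up for illustration and are not taken from the preprint.

```python
from collections import Counter

# Hypothetical sequences flanking the insertion site on the DNA tape.
LEFT_FLANK = "ACGTAC"
RIGHT_FLANK = "GGATCC"

# Hypothetical barcode-to-recorder assignment.
BARCODE_TO_SIGNAL = {"TTCAG": "Wnt", "CAGTA": "TNFa", "GTCAA": "Dox"}


def extract_insertion(read):
    """Return the sequence between the two flanks, or None if a flank is missing."""
    start = read.find(LEFT_FLANK)
    if start == -1:
        return None
    start += len(LEFT_FLANK)
    end = read.find(RIGHT_FLANK, start)
    if end == -1:
        return None
    return read[start:end]


def recording_fractions(reads):
    """Fraction of usable reads that are unedited or carry each signal's barcode."""
    counts = Counter()
    for read in reads:
        insertion = extract_insertion(read)
        if insertion is None:
            continue  # read does not cover the recording site
        if insertion == "":
            counts["unedited"] += 1
        else:
            counts[BARCODE_TO_SIGNAL.get(insertion, "unknown")] += 1
    usable = sum(counts.values())
    return {label: n / usable for label, n in counts.items()} if usable else {}


reads = [
    LEFT_FLANK + RIGHT_FLANK,            # blank tape
    LEFT_FLANK + "TTCAG" + RIGHT_FLANK,  # Wnt barcode inserted
    LEFT_FLANK + "TTCAG" + RIGHT_FLANK,
    "NNNNNNNN",                          # off-target read, ignored
]
fractions = recording_fractions(reads)
```

In this toy model, the fraction of tapes carrying a given barcode stands in for how much activity that recorder saw over the lifetime of the cell population.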
As a technology paper, some of the primary results presented are the validation and optimization of ENGRAM. Several different designs were evaluated, and ultimately the team arrived at what they called ENGRAM 2.0:
In this design, they encode Csy4 into the same transcript as the pegRNA that it will enzymatically release, which they demonstrated reduced background compared to their original design (ENGRAM 1.0) where they relied on constitutive expression of Csy4.
I’m not going to go blow-by-blow through the validation and optimization results in this post, but instead highlight some of the results demonstrating how this technology can be used to measure signaling pathways. They set up recorders of three different signaling pathways (doxycycline response, TNFα, Wnt) by using a regulatory DNA element that responds to each respectively:
They measured a beautiful, “strikingly sigmoidal” concentration-dependent response for each stimulus:
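One way to build intuition for a sigmoidal dose-response is the Hill equation, a standard model for this kind of curve. I am assuming it here purely for illustration; the post does not say which model, if any, the authors fit, and every parameter value below is invented.

```python
def hill_response(conc, max_signal, ec50, hill_coeff=1.0, background=0.0):
    """Recorded signal as a function of stimulus concentration (Hill equation).

    ec50 is the concentration giving a half-maximal response, and hill_coeff
    controls how steep the curve is. Plotted against log(concentration),
    the curve is sigmoidal.
    """
    if conc <= 0:
        return background
    scaled = conc ** hill_coeff
    return background + (max_signal - background) * scaled / (ec50 ** hill_coeff + scaled)


# Hypothetical dose series spanning four orders of magnitude.
doses = [0.01, 0.1, 1.0, 10.0, 100.0]
curve = [hill_response(d, max_signal=0.4, ec50=1.0, hill_coeff=2.0) for d in doses]
```

With these made-up parameters the response stays near background at low doses, rises steeply around the EC50, and plateaus near the maximum at high doses.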
They went further and showed that they could effectively multiplex this technique by reading out all three signals in response to stimulants in a single population of cells. Even more, they showed a proof-of-concept for reading out the order in which events occurred:
Here they have order-specific pegRNAs. For example, A will only make an edit to a blank DNA tape, whereas A’ will only edit a tape that has already been edited by B (the same goes for B and B’). They can then read out the order in which the transcriptional events took place.
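The order-specific logic can be sketched as a tiny state machine. This is a toy model of a single tape with perfectly efficient editing, which real cells will not match; the function names and the tuple representation of the tape are my own.

```python
def apply_signal(tape, signal):
    """Return the tape after one signal fires.

    Each signal expresses two pegRNAs: the plain one (A or B) can only write to
    a blank tape, and the primed one (A' or B') can only write to a tape that
    was already edited by the other signal.
    """
    if signal not in ("A", "B"):
        raise ValueError(f"unknown signal: {signal!r}")
    other = "B" if signal == "A" else "A"
    if tape == ():
        return (signal,)
    if tape == (other,):
        return tape + (signal + "'",)
    return tape  # no pegRNA matches this tape state, so nothing is written


def record(signals):
    """Run a sequence of signals against an initially blank tape."""
    tape = ()
    for signal in signals:
        tape = apply_signal(tape, signal)
    return tape


def decode_order(tape):
    """Translate the final tape state back into an order of events."""
    orders = {
        (): "no events",
        ("A",): "A only",
        ("B",): "B only",
        ("A", "B'"): "A then B",
        ("B", "A'"): "B then A",
    }
    return orders[tape]


first = decode_order(record(["A", "B"]))
second = decode_order(record(["B", "A"]))
```

Because the primed edits are conditional on the earlier edit, the two orders leave distinguishable tapes even though the same two signals fired in both cases.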
The discovery and application of CRISPR led to a revolution in the life sciences. In the short amount of time since, there has been a Cambrian explosion of new and refined genome editing and engineering technologies, including prime editing. It hasn’t taken long for talented molecular technology developers to mix and match these tools, and combine them in new and exciting ways with our existing tools like high-throughput DNA sequencing and synthesis.
A beautiful example of this is ENGRAM. The results in this paper represent a proof-of-concept for a technology which holds the promise of maturing into a technique like the massively parallel reporter assay, except that cellular events are recorded non-destructively into DNA as they happen. To me, this fundamental direction seems like a big deal. I think that as this type of technology develops, we may see far more standard use of molecular recorders like this in the future.
That concludes my highlight of “Multiplex genomic recording of enhancer and signal transduction activity in mammalian cells” from the Shendure Lab. However, there is more to the story. In the conclusion, the authors write:
Of note, in parallel to this work, we developed a different strategy for “pseudo-processive” genome editing called DNA Ticker Tape. In principle, ENGRAM and DNA Ticker Tape are compatible. For the goal of multiplex, temporally resolved recording of core signaling pathway activity over extended periods of time, the combination of ENGRAM and DNA Ticker Tape may be more powerful than the ENGRAM variant described here.
Curious to learn more? Stay tuned for my next post where I’ll highlight the exciting preprint describing DNA Ticker Tape that was posted alongside this one!
Until next time! 🧬
[1] An enhancer is a type of cis-regulatory element. This is in contrast with trans-regulatory elements, which are the DNA sequences that encode the trans-acting factors that interact with the CREs. Gene regulation is complex!