What is gene editing?
Gene editing (or genome engineering) is the process of changing the DNA of a cell and altering how it functions. This can be to repair a faulty gene, to change how the gene is expressed or to stop it working at all.
Why would we want to do that? Read on to find out!
Why knock out genes?
The genome sequence and its annotation give us a list of parts that make up the cell or body, but no insight into how these parts work and interact to make the cell function.
Take a look at the picture above; that's all of the parts coded for by the 'car genome'. Could you build a working engine from a subset of those parts? Would you know which bits did what and how they worked with other parts?
With no prior knowledge of how a car works, one way to find out what the parts do is to break them on a fully assembled and functional car and see what happens. If you cut a wire and the indicators stop working, then you can conclude that the wire has something to do with the indicating system. If you remove the discs and the brakes no longer work, you can surmise that they are involved in slowing the car down - the 'mutant' car displays a 'braking system phenotype'.
Scientists do the same thing with genes. By systematically knocking them out we can build a picture of how everything works together to produce a functioning organism.
Of all the 20,000 genes in the human genome, we still have very little idea of what ~6000 of them actually do!
Considering coding regions account for only 2% of the genome, it will be many years before we get a complete understanding of our biology.
Gene targeting using stem cells
Until just a few years ago, the main method used in mice for knocking out specific genes was to alter the genome of cultured embryonic stem (ES) cells. This was achieved by cloning painstakingly engineered fragments of DNA, introducing them into the cells, and taking advantage of the cell's natural repair mechanism of homologous recombination. The altered ES cells are then implanted into a developing mouse embryo.
The adult mouse is a patchwork (chimera) of the original cells and the altered ones. By using altered cells from a black mouse and an albino embryo host, a researcher can instantly tell how much of the mouse is made up of the introduced cells.
If the cells go on to develop the germ line (the cells that make eggs and sperm) of the adult, then the mutation will be passed on to the next generation and be present in every cell.
After several rounds of breeding to produce homozygous animals, the mice can be studied (phenotyped) to see what the knocked out gene does.
For human research, use of embryonic stem cells is highly controversial as it involves the destruction of an embryo, albeit one usually left over from IVF treatments.
An alternative approach of reprogramming adult fibroblast cells into induced pluripotent stem cells (iPSCs), which can then form any other type of cell in the body, was first demonstrated in mice in 2006 and in human cells in 2007.
iPSCs are used today for many purposes, from knocking out genes in basic research to tissue repair and organ synthesis.
The International Mouse Phenotyping Consortium (IMPC) aims to knock out and study the effects of every protein-coding gene in the mouse over the next five years, having already phenotyped over 3200 genes in the previous five.
Limitations of gene targeting using ES cells
Although gene targeting using embryonic stem cells is a very powerful way to create mutations, it is currently limited to just a few mammalian species.
Mouse ES cells were first isolated and cultured in the laboratory back in 1981 and mutated by gene targeting in 1987, but it wasn't until 1998 that the same technology was successfully demonstrated in human cells, and not until 2008 in rats.
Gene targeting is also a very intricate and involved process; the initial part of creating a mouse knockout can take at least 6 months from the design of the mutant gene to injecting the resulting ES cells into blastocysts.
If only a system existed that could easily target any region of the genome without all of these prerequisites, like a programmable pair of scissors!
In 2007, the Nobel Prize in Physiology or Medicine was awarded to Mario R. Capecchi, Martin J. Evans and Oliver Smithies for their discoveries of "principles for introducing specific gene modifications in mice by the use of embryonic stem cells".
There's more than one way to edit the genome!
One way to mutate a gene is to cut the DNA at a defined point and let the cell's natural repair machinery try to fix it. In addition to knocking out a gene to stop it working, scientists can also introduce new sequences to alter how a gene works or correct a mutation that causes a disease.
Shao Y, Guan Y, Wang L, et al (2014) CRISPR/Cas-mediated genome editing in the rat via direct injection of one-cell embryos. Nat Protoc 9:2493–2512.
The image above shows the kind of things you can do to a genome once you've managed to cut it where you want. They take advantage of two systems that the cell uses to repair damage: Non-homologous End Joining (NHEJ), and Homology Directed Repair (HDR). Let's look at those in more detail...
NHEJ basically glues the DNA back together again, but doesn't always get it totally right, deleting or adding a small number of bases.
If this happens in an exon, the part of the gene that codes for a protein, it can result in a frame-shift mutation and stop the protein from being made and functioning correctly.
If two pairs of scissors are used, larger regions of the genome can be removed to ensure that the gene is knocked out.
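To see why losing a single base in an exon can be so damaging, here is a minimal Python sketch of a frame-shift. The DNA sequence is made up for illustration (it is not a real gene), the function names are our own, and the codon table is the standard genetic code. Deleting one base scrambles every codon after the cut, whereas deleting three bases only removes one amino acid.

```python
from itertools import product

# Standard genetic code, with codons listed in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Read the DNA three bases at a time until a stop codon ('*') is reached."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def delete_bases(dna, start, count=1):
    """Mimic an imperfect NHEJ repair that loses `count` bases at `start`."""
    return dna[:start] + dna[start + count:]


# Hypothetical exon, invented for this example.
exon = "ATGGCATTGACCGAATTCTGA"

print(translate(exon))                      # MALTEF
print(translate(delete_bases(exon, 4)))     # MD (frame-shift hits an early stop)
print(translate(delete_bases(exon, 3, 3)))  # MLTEF (reading frame preserved)
```

Real repairs are messier than this, of course: the size and position of the insertion or deletion varies from cell to cell, which is why edited animals have to be checked by sequencing.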
Instead of simply gluing back the DNA, HDR uses a template (normally this would be the sister chromatid) to make a precise and error-free repair.
If a short single-stranded template is added in addition to the scissors, the cell sometimes uses this instead. The template can contain small changes, such as a single base change to model a disease or fix a mutated gene.
To insert larger blocks of sequence, for example if you want the gene to make a fluorescent protein so you can see where it is expressed, double-stranded DNA can be added. This is usually in the form of DNA cloned into a plasmid vector, and it is incorporated into the genome by homologous recombination.
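The idea of a repair template can be sketched in a few lines of Python. This is a toy model under simplifying assumptions: the sequences are invented, the function name `hdr_repair` is hypothetical, and the matching stretches at each end of the template (often called homology arms) are assumed to match the genome exactly. In a real cell, HDR is only one possible outcome and competes with NHEJ.

```python
def hdr_repair(genome, template, arm_len):
    """Swap in `template` wherever both of its ends match the genome.

    The first and last `arm_len` bases of the template must be found in
    the genome, in that order. Whatever lies between them in the genome is
    replaced by whatever lies between them in the template.
    """
    left_arm, right_arm = template[:arm_len], template[-arm_len:]
    start = genome.find(left_arm)
    if start == -1:
        return genome  # no match: template is ignored
    end = genome.find(right_arm, start + arm_len)
    if end == -1:
        return genome
    return genome[:start] + template + genome[end + arm_len:]


# Invented sequences for illustration only.
genome = "CCGATTACAGGCTAAGCTTGACCATGG"
left, right = "GATTACAG", "AAGCTTGA"

# A single base change (GCT -> GTT) between the matching ends.
point_edit = hdr_repair(genome, left + "GTT" + right, arm_len=8)

# A larger insertion: extra sequence placed between the matching ends.
insertion = hdr_repair(genome, left + "GCT" + "ATGGTGAGC" + right, arm_len=8)

print(genome)
print(point_edit)
print(insertion)
```

The same function handles both cases from the text: a short template carrying a one-base change, and a longer one carrying a whole new block of sequence.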
But amongst the 3.4 billion bases in the genome, how do you cut the DNA exactly where you want to? Several technologies have been developed over the years to try and achieve this, but the most popular are ZFNs, TALENs and, very recently, CRISPR.
In addition to knocking out genes and modelling diseases, genome editing can also be used to control populations of insects such as mosquitoes, which carry malaria and dengue fever. The process is called gene drive.
Editing genomes with TALENs and ZFNs
All of the systems described below have a major advantage over traditional gene targeting, in that mRNA containing the instructions to build the protein can be injected directly into a developing embryo.
This negates the need to use embryonic stem cells, which means that the technology can be used for a variety of different species.
Zinc finger nucleases
Zinc Fingers are proteins that bind specifically to groups of three bases of DNA. They can be engineered together like building blocks to recognise a DNA sequence of interest.
ZFNs act in pairs like handles on a pair of scissors, with a special enzyme called Fok I acting as the blade in the middle. The Fok I cuts both strands of the DNA, causing a double-stranded break (DSB).
ZFNs were first reported to successfully edit genes in fruit flies (Drosophila) and zebrafish in 2008.
Zinc Finger Nucleases are not without their drawbacks, however. They are very difficult to produce as the blocks can interfere with each other, and can be very expensive at several thousand dollars each. They also suffer a lot from 'off target effects', where the scissors cut the genome in places other than the one intended.
TALENs are similar to ZFNs in that they use proteins fused to Fok I to bind to specific sequences of DNA and then cut it at a defined point.
The proteins used are called TALEs, and bind specifically to one base pair instead of three.
They are much cheaper than ZFNs, costing a few hundred dollars compared to several thousand.
TALENs were first described in 2009, but were modified to work more easily in the chains needed for genome editing in 2011.
ZFNs and TALENs have recently been superseded by an even easier to use and cheaper technology - a system borrowed from nature, called CRISPR.
CRISPR is an acronym for Clustered Regularly Interspaced Short Palindromic Repeats, and is pronounced 'crisper'.
CRISPR and Cas in nature - a bacterial immune system
Like us, bacteria get attacked by viruses and have to fight off infection. To do this, they create a memory of the infection by incorporating small parts of the viral DNA into their own genome! They achieve this by using a special family of enzymes, called Cas (CRISPR-associated) proteins. The process consists of three parts and is described below:
A new strain of virus infects a cell.
Cas1 proteins bind to the viral DNA, creating a 'spacer' sequence specific to the invader.
The spacer DNA is then incorporated into the growing CRISPR array, which forms a memory of past infections.
This array is transcribed and processed into small CRISPR RNAs (crRNA).
The crRNA forms a complex with a different type of Cas protein, creating a programmed pair of scissors.
Upon a re-infection with a known virus, the Cas complex recognises the viral DNA from the crRNA sequence and binds to it.
The Cas complex cuts the viral DNA, stopping it from functioning.
To prevent it from cutting the CRISPR region itself, Cas9 will only cut if the viral DNA sequence immediately after the spacer sequence is 'NGG' (where N is any base). This is called the protospacer-adjacent motif (PAM) site, and it differs between species of bacteria.
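The whole cycle of remembering a virus and then recognising it again, without attacking the bacterium's own memory bank, can be mimicked with a toy Python model. Everything here is a simplifying assumption: the viral sequence and the repeat sequence are invented placeholders, the function names are hypothetical, and the spacer is simply taken from the first 20 bases that sit next to an NGG. It is a sketch of the logic, not of the real biochemistry.

```python
import re

SPACER_LEN = 20
REPEAT = "GTTTTAGAGCTA"  # placeholder repeat, not a real CRISPR repeat


def pick_spacer(viral_dna):
    """Return the first 20-base stretch that is directly followed by NGG."""
    match = re.search(rf"(?=([ACGT]{{{SPACER_LEN}}})[ACGT]GG)", viral_dna)
    return match.group(1) if match else None


def build_array(spacers):
    """Lay the spacers out between repeats, like the CRISPR array."""
    return REPEAT + "".join(spacer + REPEAT for spacer in spacers)


def is_cut(dna, spacers):
    """Cut only where a remembered spacer is directly followed by NGG."""
    for spacer in spacers:
        pos = dna.find(spacer)
        while pos != -1:
            pam = dna[pos + SPACER_LEN:pos + SPACER_LEN + 3]
            if len(pam) == 3 and pam[1:] == "GG":
                return True
            pos = dna.find(spacer, pos + 1)
    return False


# Invented viral genome: 20 bases of interest followed by a TGG PAM.
virus = "TTA" + "ACGTTGCAAGCTTACGATCC" + "TGG" + "CATAC"

memory = [pick_spacer(virus)]        # first infection: store a spacer
crispr_array = build_array(memory)   # the bacterium's own copy

print(is_cut(virus, memory))         # True: same virus is recognised and cut
print(is_cut(crispr_array, memory))  # False: no NGG after the spacer
```

The last line is the point of the PAM: the spacer sits in the bacterium's own genome too, but there it is followed by the repeat rather than by NGG, so it is left alone.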
This limits the capacity of the system to roughly one binding site every 8bp, but in a mammalian genome of 3.4bn bases, that's over 400 million potential editing sites!
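Where do those numbers come from? Here is the back-of-the-envelope sum as a short Python sketch. It assumes, unrealistically, that all four bases are equally common and occur independently of one another, and it counts NGG sites on both strands of the DNA. The variable names are our own.

```python
from fractions import Fraction

GENOME_SIZE = 3_400_000_000  # bases, the figure used in the text

# 'N' can be anything, so only the two Gs matter: 1/4 x 1/4.
p_ngg_one_strand = Fraction(1, 4) ** 2

# DNA is double stranded, and a PAM on either strand will do.
p_ngg_either_strand = 2 * p_ngg_one_strand

average_spacing = 1 / p_ngg_either_strand
expected_sites = GENOME_SIZE * p_ngg_either_strand

print(f"Chance of NGG at a given position: {p_ngg_either_strand}")
print(f"Average spacing between sites: {average_spacing} bases")
print(f"Expected sites in the genome: {int(expected_sites):,}")
```

Real genomes are not random strings of letters, so the true number of usable sites differs, but the estimate shows why the PAM requirement is rarely a serious obstacle.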
Instead of using crRNAs derived from viruses, researchers discovered that any sequence can be used to program the interference Cas proteins to cut DNA. They now had the near-perfect tool for editing the genome exactly where they wanted!
There are many forms of Cas proteins which are divided into two classes and five types. Cas9, the one used in gene editing in the laboratory, was isolated from the bacterium Streptococcus pyogenes.
Engineering CRISPR/Cas9 for genome editing
So how does the CRISPR/Cas9 system work? The picture below shows how it all fits together.
The guide RNA (gRNA) consists of 20bp of the desired target sequence (where in the genome you want to edit) and a standard scaffold that allows it to attach to the Cas9 protein.
Cas9 (either in the form of messenger RNA or actual protein) and gRNA are introduced into the cell or developing embryo by an electric current or injection, respectively.
The Cas9/gRNA complex binds to the genomic DNA in the cell, unzipping it and checking the sequence against the gRNA sequence like a locksmith trying out keys in a lock.
If the DNA matches the gRNA, and the next three bases are 'NGG', the Cas9 enzyme cuts the DNA, causing a double-stranded break.
The cell's repair machinery then tries to fix the break, resulting in small mutations to cause a frame-shift and stop the gene from working, or allowing a template with altered DNA to be incorporated.
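The matching rule in the steps above (20 bases that match the guide, immediately followed by 'NGG') is simple enough to sketch in Python. This is a minimal illustration rather than a real guide design tool: the sequence is invented, the function names are hypothetical, and it ignores everything that matters in practice beyond an exact match, such as off-target sites that almost match the guide.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the sequence of the opposite strand, read in its own direction."""
    return dna.translate(COMPLEMENT)[::-1]


def find_target_sites(dna, guide_len=20):
    """List every stretch of `guide_len` bases followed by NGG, on both strands.

    Each site is (strand, start position on that strand, target, PAM).
    """
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    sites = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for match in pattern.finditer(seq):
            sites.append((strand, match.start(), match.group(1), match.group(2)))
    return sites


def cas9_would_cut(dna, guide):
    """True if the guide matches a site that also has an NGG next to it."""
    return any(target == guide for _, _, target, _ in find_target_sites(dna, len(guide)))


# Invented example: a 20-base target followed by an AGG PAM.
guide = "ACGTTGCAAGCTTACGATCC"
with_pam = "GATC" + guide + "AGG" + "TTACA"
without_pam = "GATC" + guide + "ATT" + "TTACA"

print(find_target_sites(with_pam))
print(cas9_would_cut(with_pam, guide))     # True
print(cas9_would_cut(without_pam, guide))  # False: sequence matches, but no NGG
```

Checking several guides against the same sequence is just a loop over `cas9_would_cut`, which mirrors how more than one gene can be targeted in the same experiment.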
Multiple genes can be knocked out at once by simply injecting more than one guide RNA at the same time.
Where ZFNs and TALENs require careful construction of the modular proteins, all the CRISPR/Cas9 system needs is an easy-to-make guide RNA of 20bp unique sequence. Instead of taking months and costing hundreds or thousands of dollars, a CRISPR experiment takes weeks and costs very little.
This allows the technology to be used in a very high-throughput manner, allowing experiments in cells and model organisms at a scale not previously possible.
There is currently a big patent battle over who owns CRISPR editing technology between the Feng Zhang lab at the Broad Institute, and Jennifer Doudna / Emmanuelle Charpentier at UC Berkeley / Umeå University. Hundreds of millions of dollars are at stake!
All parties agree, however, that the use of CRISPR in basic research should not be impeded.
CRISPR - a timeline of discovery
Scientific breakthroughs do not happen in isolation - they all build on existing research. The story of CRISPR goes back to 1987, but it wasn't until 2012 that the system was demonstrated to work in gene editing.
In 2013, CRISPR/Cas9 was shown to work in human and mouse cells, which triggered an explosion in research using the technology. In just three years, it has become the 'go-to' technology for genome engineering for many animals and plants.
To use an analogy, it has allowed scientists to jump from the horse and straight into a Ferrari!
A new interference Cas protein, called Cpf1, was recently reported to work in mammalian cells and directly in mouse embryos. It cuts the DNA leaving a jagged edge, rather than the straight cut of Cas9, which may increase its efficiency of homology-directed repair.
It's not just a pair of scissors! Other applications for CRISPR/Cas9
An example of gene editing - unlocking and treating Duchenne muscular dystrophy