Piercing the dark matter: bioinformatics of long-range sequencing and mapping

Several new genomics technologies now offer long-read sequencing or long-range mapping with higher throughput and higher resolution than ever before.
These long-range technologies are rapidly advancing the field through improved reference genomes, more comprehensive variant identification, and more complete views of transcriptomes and epigenomes.
However, they also require new bioinformatics approaches that take full advantage of their unique characteristics while overcoming their complex errors and modalities.
Here, we discuss several of the most important applications of these new technologies, focusing on both the currently available bioinformatics tools and opportunities for future research.

In the late 1980s and 1990s, the development of first-generation sequencing technologies was critical for sequencing the first microbial, plant and animal genomes, including the initial sequencing of the human genome.
The most important technology of this generation was the automated Sanger sequencer, which could sequence hundreds of DNA molecules at a time.
Several supporting biotechnologies were also developed for these early projects, including mate-pair sequencing, bacterial artificial chromosomes (BACs), optical mapping and other assays to augment the relatively limited sequence that could be generated.
In the mid to late 2000s, high-throughput second-generation sequencing technologies quickly replaced first-generation sequencing, largely because they greatly reduced the cost of whole-genome sequencing.
High-throughput short-read sequencing was the main development of this generation, supplemented by several related biotechnologies, such as paired-end sequencing, pooled fosmids, and improved optical mapping.
These second-generation technologies have enabled the sequencing of many new genomes and extensive resequencing efforts to analyze genomic diversity and pathogenic variants, as well as extensive studies of transcription, gene regulation, and epigenetics across many species.
However, although second-generation sequencing technologies have enabled population-scale analysis of many animal and plant species, they also have important limitations, especially poor or ambiguous mapping to repetitive elements, a limited ability to span indels or structural variants (SVs), and amplification artifacts introduced during library construction;
therefore, the limitations of short-read sequencing have left large portions of most genomes inaccessible and have hidden their true complexity.

Origin blog.csdn.net/u010608296/article/details/113073866