During the initial data processing stage, we use standard, widely known approaches such as mapping to a reference genome or searching for similar sequences in public databases. We think it is important to follow two rules here. First, carefully separate meaningful data from garbage to reduce side effects and uncertainty in the subsequent stages. Second, focus on the entire genome rather than on the coding regions only, because at least half of the information is hidden in the “dark matter” of the genome.
Processing of next-generation sequencing (NGS) data may include the following steps:
- Preliminary filtering and quality control
- Mapping to a reference genome
- Building gene expression tables
- Assembling transcripts or the entire genome
- Searching for differences between groups of samples using statistical methods
- Searching for SNPs (single nucleotide polymorphisms) and structural variations
- Annotation using open and proprietary databases as well as information from specialized consortiums
- Prediction of molecular interactions
- Searching the literature for annotations and relations, both with automated algorithms and manually
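
As an illustration of the first step in the list above (preliminary filtering and quality control), here is a minimal Python sketch. It is not our production pipeline. It assumes four-line FASTQ records with Phred+33 quality encoding, and the function names (`parse_fastq`, `mean_phred`, `filter_reads`) as well as the thresholds (mean quality of at least 20, length of at least 30 bases) are hypothetical choices made only for this example.

```python
from typing import Iterable, Iterator, NamedTuple


class FastqRecord(NamedTuple):
    name: str
    sequence: str
    quality: str


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield records from four-line FASTQ text (assumes no wrapped lines)."""
    it = iter(lines)
    for header in it:
        header = header.rstrip()
        if not header:
            continue  # tolerate blank lines between records
        sequence = next(it, "").rstrip()
        separator = next(it, "")
        quality = next(it, "").rstrip()
        if (not header.startswith("@")
                or not separator.startswith("+")
                or len(sequence) != len(quality)):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string (Phred+33 is an assumption)."""
    if not quality:
        return 0.0
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def filter_reads(
    records: Iterable[FastqRecord],
    min_mean_quality: float = 20.0,  # illustrative threshold
    min_length: int = 30,            # illustrative threshold
) -> Iterator[FastqRecord]:
    """Keep reads that are long enough and have sufficient mean quality."""
    for record in records:
        if (len(record.sequence) >= min_length
                and mean_phred(record.quality) >= min_mean_quality):
            yield record


if __name__ == "__main__":
    example = [
        "@read1", "ACGT" * 10, "+", "I" * 40,  # long, high quality
        "@read2", "ACGT" * 10, "+", "#" * 40,  # long, low quality
        "@read3", "ACGT", "+", "IIII",         # high quality, too short
    ]
    kept = [r.name for r in filter_reads(parse_fastq(example))]
    print(kept)
```

In practice this step usually also covers adapter trimming and per-base quality trimming, which the sketch leaves out, and the thresholds depend on the sequencing platform and the downstream analysis.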
When processing sensitive data, we take extra care to follow the appropriate security protocols. We have our own computational equipment or, in some cases, we use resources provided by our customers.
We have broad experience with different platforms and various software products. We are part of a consortium that possesses a unique single-molecule NGS technology: high-throughput sequencing without molecule amplification.