miRNAseq – Bioinformatics Core

Counting miRNA Expression Levels in High-Throughput Sequence Data.

A typical method for processing miRNA short-read sequence data is to first clip the platform-specific 3′ adapter and then map the resulting sequence to a database of either mature or precursor sequences. However, we have found a different method to be simpler and more robust. We used a custom mapping procedure that exploits the fact that a miRNA is significantly shorter than the read length of current sequencing technologies. Because microRNAs are between 20 and 25 bp long, most sequencing reads will read through into the 3′ adapter. The result is a chimeric sequence that is part microRNA and part adapter. We created a custom genome file containing every microRNA sequence from miRBase with the technology-specific adapter sequence appended, and then built a BWA index from it for alignment. The sequenced reads were mapped to this target database using BWA with default options. The resulting SAM file was then processed to find full-length hits, and we counted the number of reads that mapped to each microRNA/adapter hybrid in the genome file.
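
The two custom steps described above, building the microRNA/adapter reference and counting full-length hits in the SAM output, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the pipeline actually used: the function names are hypothetical, the adapter is whatever 3′ adapter your platform uses, and a "full-length hit" is assumed here to mean a forward-strand alignment that starts at position 1 of the reference and matches, without clipping or indels, across at least the whole microRNA portion. miRBase distributes mature sequences as RNA, so the sketch converts U to T before appending the adapter.

```python
import re
from collections import Counter


def parse_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first token of the header as the record name.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def build_chimeric_reference(fasta_lines, adapter):
    """Return ({name: miRNA + adapter}, {name: miRNA length}).

    Hypothetical helper: sequences are upper-cased and converted from
    RNA (U) to DNA (T) so that they are comparable with sequencing reads.
    """
    reference, mirna_len = {}, {}
    for name, seq in parse_fasta(fasta_lines):
        dna = seq.upper().replace("U", "T")
        reference[name] = dna + adapter
        mirna_len[name] = len(dna)
    return reference, mirna_len


def count_full_length_hits(sam_lines, mirna_len):
    """Count reads per reference that cover the entire miRNA portion.

    Assumed definition of a full-length hit (not taken from the source):
    mapped, forward strand, primary alignment, starting at position 1,
    with a CIGAR of a single match run at least as long as the miRNA.
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # SAM header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        flag, rname, pos, cigar = int(fields[1]), fields[2], int(fields[3]), fields[5]
        # 0x4 unmapped, 0x10 reverse strand, 0x100 secondary, 0x800 supplementary
        if flag & (0x4 | 0x10 | 0x100 | 0x800):
            continue
        if rname not in mirna_len:
            continue
        match = re.fullmatch(r"(\d+)M", cigar)
        if pos == 1 and match and int(match.group(1)) >= mirna_len[rname]:
            counts[rname] += 1
    return counts
```

In practice the dictionary returned by `build_chimeric_reference` would be written out as a FASTA file and indexed with BWA, and `count_full_length_hits` would be given the lines of the SAM file that BWA produces. The strictness of the full-length test (for example, whether to tolerate soft clipping) is a design choice that should be adjusted to the data.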