Improvements in scaling of Bowtie2 alignment software and implications for RNA-Seq pipelines

This is a guest post by Adam Marko, an IT and Biotech Professional with 10+ years of experience across diverse industries

What is Bowtie2?

Bowtie2 is a widely used, open-source, fast, and memory-efficient aligner that forms part of many Next Generation Sequencing (NGS) workflows. It aligns sequencing reads, the genomic data output of an NGS device such as an Illumina HiSeq sequencer, to a reference genome. Aligners like Bowtie2 are the first step in pipelines such as those for variant determination and for RNA-Seq, an area of continuously growing research interest.

What is RNA-Seq?

RNA Sequencing (RNA-Seq) is a type of NGS that seeks to identify the presence and quantity of RNA in a sample at a given point in time. This can be used to quantify changes in gene expression, which can be a result of time, external stimuli, healthy or diseased states, and other factors. Through this quantification, researchers can obtain a unique snapshot of the genomic status of the organism to identify genomic information previously undetectable with other technologies.

There is considerable research effort being put into RNA-Seq, and the number of publications has grown steadily since its first use in 2009.

Plot of the number of RNA-Seq research publications accepted each year

Figure 1. RNA-Seq research publications published per year as of April 2019. Note the continuous growth. At the current rate, 2019 will see 60% more publications than 2018. Source: NCBI PubMed

RNA-Seq is being applied to many research areas and diseases. A few notable examples include:

  • Oral Cancer: Researchers used an RNA-Seq approach to identify differences in gene expression between oral cancer and normal tissue samples.
  • Alzheimer’s Disease: Researchers compared gene expression in different brain lobes of deceased Alzheimer’s Disease patients with that in the brains of healthy individuals. They were able to identify genomic differences between the diseased and unaffected individuals.
  • Diabetes: Researchers identified novel gene expression information from pancreatic beta-cells, which are cells critical for glycemic control.

Compute Infrastructure for aligning with Bowtie2

Designing a compute resource to meet the sequence analysis needs of Bioinformatics researchers can be a daunting task for IT staff. Limited information is available about multithreading and performance increases in the diverse portfolio of software related to NGS analysis. To further complicate things, processors are now available in a variety of models, with a large range of core counts and clock speeds, from both AMD and Intel. See, for example, the latest Intel Xeon “Cascade Lake” CPUs: Intel Xeon Scalable “Cascade Lake SP” Processor Review

Though many sequence analysis tools have multithreading options, their ability to scale is often limited and rarely linear. In some cases, performance can decrease as more threads are added. Multithreading an application does not guarantee a performance improvement.

Threads    Run Time (seconds)
8          620
16         340
32         260
48         385
64         530

Table 1. Research data showing how the previous version of Bowtie2 scaled with thread count. Run time increased (performance decreased) above 32 threads.

Plot of Bowtie2 run time as the number of threads increases

Figure 2. Thread scaling of the previous version of Bowtie2. Performance decreases beyond 32 threads due to a variety of factors. Non-linear scaling and performance decreases at higher core counts have been shown in other scientific applications as well.
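To put numbers on the scaling in Table 1, the short sketch below computes speedup and parallel efficiency relative to the 8-thread run. Using 8 threads as the baseline is an assumption made here because it is the smallest thread count in the table; the run times themselves are copied from Table 1.

```python
# Run times copied from Table 1 (threads -> seconds), previous Bowtie2 version.
run_times = {8: 620, 16: 340, 32: 260, 48: 385, 64: 530}

BASELINE_THREADS = 8  # assumption: smallest thread count in Table 1 is the reference


def scaling_report(times, baseline):
    """Return (threads, speedup, efficiency) tuples relative to the baseline run."""
    base_time = times[baseline]
    report = []
    for threads in sorted(times):
        speedup = base_time / times[threads]  # >1 means faster than the baseline
        ideal = threads / baseline            # speedup under perfectly linear scaling
        report.append((threads, speedup, speedup / ideal))
    return report


if __name__ == "__main__":
    for threads, speedup, efficiency in scaling_report(run_times, BASELINE_THREADS):
        print(f"{threads:>3} threads: {speedup:.2f}x speedup, {efficiency:.0%} efficiency")
```

By this measure, the 32-thread run is roughly 2.4x faster than the 8-thread run (about 60% of the ideal 4x), while the 64-thread run is only about 1.2x faster despite using eight times as many threads.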

However, researchers have recently made large improvements to the thread scaling of Bowtie2. Earlier versions of the tool did not scale linearly and showed reduced performance when using more than 32 threads. Aware of these problems, the Bowtie2 developers have implemented better multithreaded scaling in the application. Depending on processor type, their results show:

  • Removal of performance decreases above 32 threads
  • An increase in read throughput of up to 44%
  • Reduced memory usage with thread scaling
  • Up to a 4-hour reduction in the time to align a 40x-coverage human genome

This new version of the software is open-source and available for download.

Right Sizing your NGS Cluster

With the recent release of Intel’s Cascade Lake-AP Xeons providing up to 112 threads per socket, as well as high density AMD EPYC processors, it can be tempting to assume that more cores will result in more performance for NGS applications. However, this is not always the case, and some applications will show reduced performance with higher thread count.
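As a rough illustration, the sketch below reuses the Table 1 run times (previous Bowtie2 version) to estimate how many runs per hour a single node could complete when it is filled with equal-sized jobs. The 64-thread node size is hypothetical, and the estimate assumes that concurrent jobs do not slow each other down through contention for memory, cache, or I/O. That assumption is optimistic, so treat the output as an upper bound rather than a measurement.

```python
# Run times copied from Table 1 (threads per job -> seconds per job),
# measured with the previous Bowtie2 version.
run_times = {8: 620, 16: 340, 32: 260, 48: 385, 64: 530}

NODE_THREADS = 64  # hypothetical node size, matching the largest row in Table 1


def runs_per_hour(threads_per_job, times, node_threads):
    """Estimate node throughput when the node is filled with equal-sized jobs.

    Assumes concurrent jobs do not slow each other down, which is optimistic.
    """
    concurrent_jobs = node_threads // threads_per_job
    return concurrent_jobs * 3600 / times[threads_per_job]


if __name__ == "__main__":
    for threads in sorted(run_times):
        if NODE_THREADS % threads == 0:  # skip job sizes that do not pack evenly
            rate = runs_per_hour(threads, run_times, NODE_THREADS)
            jobs = NODE_THREADS // threads
            print(f"{jobs} x {threads}-thread jobs: {rate:.1f} runs/hour")
```

Under these assumptions, one 64-thread job yields about 6.8 runs per hour, while two concurrent 32-thread jobs yield about 27.7. Real workloads will fall short of the idealized figures, which is why measuring on the target hardware matters.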

When selecting compute systems for NGS analysis, researchers and IT staff need to evaluate which software products will be used and how they scale with thread count. Depending on the use case, more nodes with fewer, faster threads could provide better performance than nodes with high thread density. Unfortunately, there is no “one size fits all” solution, and applications are in constant development, so research into the most recent versions of analysis software is always required.
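One way to evaluate how an aligner scales on a given system is to time the same alignment at several thread counts. The sketch below is a minimal harness for doing that with Bowtie2, using its `-p` (threads), `-x` (index prefix), `-U` (unpaired reads), and `-S` (SAM output) options. The function names, file paths, and output naming scheme are hypothetical and chosen only for illustration; a real benchmark should also repeat each run and control for disk caching.

```python
import subprocess
import time


def bowtie2_command(index_prefix, reads_fastq, threads, sam_out):
    """Build a Bowtie2 command line for unpaired reads using `threads` threads."""
    return [
        "bowtie2",
        "-p", str(threads),  # number of alignment threads
        "-x", index_prefix,  # prefix of the Bowtie2 index files
        "-U", reads_fastq,   # unpaired reads (FASTQ)
        "-S", sam_out,       # SAM output file
    ]


def time_command(cmd):
    """Run a command to completion and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def benchmark(index_prefix, reads_fastq, thread_counts, runner=time_command):
    """Return {threads: seconds} for each requested thread count."""
    results = {}
    for threads in thread_counts:
        # Hypothetical naming scheme: one SAM file per thread count.
        cmd = bowtie2_command(index_prefix, reads_fastq, threads, f"out_p{threads}.sam")
        results[threads] = runner(cmd)
    return results


# Example (placeholder paths; requires bowtie2 on PATH):
# timings = benchmark("ref_index", "reads.fastq", [8, 16, 32, 48, 64])
# for threads, seconds in timings.items():
#     print(f"{threads:>3} threads: {seconds:.1f} s")
```

The `runner` parameter exists so the timing step can be swapped out, for example to submit each run through a job scheduler instead of running it locally.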



If you are interested in testing your NGS workloads on the latest Intel and AMD HPC systems, please consider our free HPC Test Drive. We provide bare-metal benchmarking access to HPC and Deep Learning systems.

Adam Marko (for Microway)

About Adam Marko (for Microway)

Adam Marko is an IT and Biotech Professional with 10+ years of experience across diverse industries, including molecular diagnostics, drug discovery, agricultural genomics, and protein structure prediction.