SPAdes Genome Assembler

Introduction

SPAdes (St. Petersburg genome assembler) is intended for assembling small genomes, such as those of bacteria. It comes with error correction utilities for preprocessing reads, which can be used with SPAdes itself or with other assemblers.

Restrictions / License Information

The SPAdes manual gives suggested citations. The developers also appreciate contributions to their list of publications describing research in which SPAdes was used.

Running SPAdes on Breezy

The SPAdes 3.10.1 Python scripts and binaries, compiled from source code with the GCC 4.9 beta compilers, are in

/global/software/spades/spades3101/bin

To set up your environment to use SPAdes 3.10.1, use

module load spades/3.10.1

A similar environment module is available for version 3.9.1. See the other subdirectories in /global/software/spades for older versions.

A researcher reported a bug in the corrector program shipped with the SPAdes 3.5.0 release and obtained updated source code from the developers that had not yet been incorporated into an official release. That version, built from source code rather than taken from the pre-compiled binaries, is in

/global/software/spades/spades350_patched_corrector/bin

To use the 3.7.1, 3.6.2 or 3.6.1 official releases or the patched 3.5.0 version, initialize your environment with

module load gcc/gcc-4.9-20140406-beta

before running any of the SPAdes programs.

The SPAdes manual and sample data are in

/global/software/spades/spades3101/share/spades
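A quick way to confirm that your environment is set up is to list that directory and run the built-in self-test. The following is only a sketch: it assumes the spades/3.10.1 module described above and uses the --test option of spades.py, which runs the assembler on its bundled toy data set (by default the results should land in a spades_test directory under the current directory). Each step is guarded so that the script does nothing harmful on a host where the module or directory is not available.

```shell
#!/bin/bash
# Location of the manual and sample data, as given above.
SPADES_SHARE=/global/software/spades/spades3101/share/spades

# Load the environment module when the module command is available.
if type module >/dev/null 2>&1; then
    module load spades/3.10.1
fi

# List the installed documentation and sample data, if visible from this host.
if [ -d "${SPADES_SHARE}" ]; then
    ls "${SPADES_SHARE}"
fi

# Run the self-test only when spades.py is actually on the PATH.
if command -v spades.py >/dev/null 2>&1; then
    spades.py --test
fi
```

The self-test is small, but it is still better run inside a short batch or interactive job than on a login node.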

SPAdes can use multiple threads but does not support parallel processing across multiple nodes. Note that the SPAdes manual indicates that the -t (threads) command line option defaults to 16. For batch jobs on Breezy, you should use -t ${PBS_NUM_PPN}. The batch job system sets the PBS_NUM_PPN variable to the number of cores (ppn) you request. You can request up to 24 cores on Breezy, for example, using:

qsub -l nodes=1:ppn=24,mem=250gb,walltime=03:00:00 your_batch_script.pbs

The manual also says that by default SPAdes will use up to 250 GB of memory. As it happens, 250 GB is the maximum usable amount of memory on a Breezy node. If your jobs don't require the full resources of a Breezy node, for example, if you are using 6 cores and only 20 GB of memory:

qsub -l nodes=1:ppn=6,mem=20gb,walltime=12:00:00 your_batch_script.pbs

then you should use the -m argument (a memory limit in GB, for example -m 20) on the spades.py command line to limit the amount of memory that the program will use, so that it matches the amount you specified on the qsub command line.
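Putting the thread and memory advice together, a batch script for the 6-core, 20 GB case might look like the sketch below. The read file names (reads_1.fastq, reads_2.fastq), the output directory (spades_output) and the build_spades_cmd helper are hypothetical placeholders for illustration; the -1, -2 and -o options are the usual spades.py options for paired-end input files and the output directory. The script assumes the SPAdes environment module has already been loaded, and it only runs the assembler if spades.py is found.

```shell
#!/bin/bash
#PBS -l nodes=1:ppn=6,mem=20gb,walltime=12:00:00

# Print the spades.py command line for a given thread count and memory limit (GB).
# Input and output names are hypothetical placeholders.
build_spades_cmd() {
    local threads="$1"
    local mem_gb="$2"
    echo "spades.py -t ${threads} -m ${mem_gb} -1 reads_1.fastq -2 reads_2.fastq -o spades_output"
}

# Inside a batch job PBS_NUM_PPN holds the requested ppn value; fall back to 1 elsewhere.
THREADS="${PBS_NUM_PPN:-1}"
# Keep this equal to the mem= value requested above (in GB).
MEM_GB=20

SPADES_CMD="$(build_spades_cmd "${THREADS}" "${MEM_GB}")"
echo "${SPADES_CMD}"

# Run from the submission directory, and only where SPAdes is available.
if command -v spades.py >/dev/null 2>&1; then
    cd "${PBS_O_WORKDIR:-.}" && ${SPADES_CMD}
fi
```

Because the thread count comes from PBS_NUM_PPN, only the MEM_GB value needs to be edited by hand when the resource request changes.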

For More Information

Updated:

2014-02-14 - Page created.
2014-06-04 - Updated for SPAdes 3.1.0.
2014-11-18 - Updated for SPAdes 3.1.1.
2015-03-04 - Updated for SPAdes 3.5.0.
2015-06-16 - Updated for patched version of SPAdes 3.5.0.
2015-11-03 - Updated for SPAdes 3.6.1.
2016-02-13 - Updated for SPAdes 3.6.2.
2016-05-06 - Updated for SPAdes 3.7.1.
2016-12-14 - Updated for SPAdes 3.9.1.
2017-04-12 - Updated for SPAdes 3.10.1.

System     Breezy     Hungabee
Version    3.10.1     3.10.1