DiScRIBinATE
Introduction
DiScRIBinATE is a similarity-based binning method. The user first performs a similarity search of the input metagenomic sequences (reads) against the nr protein database using BLASTx. The resulting BLASTx output is then used as the input to the DiScRIBinATE program.
Restrictions / License Information
All files are copyrighted, but a license is hereby granted STRICTLY for academic and non-profit use.
Running Instructions
SYNTAX:
Run the program with no parameters to get usage messages describing the
available parameters. To run DiScRIBinATE, navigate to the folder that contains
the DiScRIBinATE executable, and then execute the program using the following
command:
./DiScRIBinATE -i <INPUT_FILE> -min <MIN_BIN_SIZE> -l <MINIMUM_BIT_SCORE>
These parameters are explained below.
INPUT PARAMETERS (to be passed to the perl program during run time):
Argument 1 : INPUT_FILE
Name of the input file. (The output generated by a blastx search
of the metagenomic sequences against the nr database is taken
as input for this program.)
Argument 2 : MIN_BIN_SIZE (Range: 1 to the total number of reads in the input file. Default: 2)
Minimum number of reads required to create a bin.
Argument 3 : MINIMUM_BIT_SCORE (Default: 35)
BLASTx hits with a bit score lower than the given value are ignored by the
DiScRIBinATE program.
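The command line described above can be assembled from a script, which is convenient when many BLASTx outputs have to be binned with the same settings. The following is only a sketch: the flags -i, -min and -l and the defaults 2 and 35 come from the text above, while the helper names build_command and run_discribinate and the file name sample.blastx are hypothetical.

```python
import subprocess
from pathlib import Path


def build_command(input_file, min_bin_size=2, min_bit_score=35,
                  executable="./DiScRIBinATE"):
    """Return the argument list for one DiScRIBinATE run.

    Defaults mirror the documented ones (MIN_BIN_SIZE 2, MINIMUM_BIT_SCORE 35).
    """
    if min_bin_size < 1:
        raise ValueError("MIN_BIN_SIZE must be at least 1")
    return [
        executable,
        "-i", str(input_file),
        "-min", str(min_bin_size),
        "-l", str(min_bit_score),
    ]


def run_discribinate(input_file, **options):
    """Run the program; call this from the folder holding the executable."""
    if not Path(input_file).is_file():
        raise FileNotFoundError(input_file)
    # check=True raises CalledProcessError on a non-zero exit status.
    subprocess.run(build_command(input_file, **options), check=True)


# "sample.blastx" is a placeholder file name, not one shipped with the tool.
print(build_command("sample.blastx"))
```

Passing the arguments as a list, rather than as one shell string, avoids quoting problems when the input file name contains spaces.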
OUTPUT FORMAT:
Each time the program is executed, two files are generated as output:
a. InputFileName.bins : The format of this file is as follows:
column1 : Taxid of the organism/taxa.
column2 : Name of the organism/taxa.
column3 : The total number of reads in the input file
which are categorised under that organism/taxa.
column4 : A comma separated list of all the reads in the input
file which are categorised under that particular bin.
The last two lines in this file are the following:
NHBin - Number of reads in the input file which have no BLASTx Hits.
UnAss - Number of reads in the input file classified as 'unassigned'
due to insignificant alignment parameters.
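A .bins file with the four columns listed above could be read along the following lines. This is a sketch under one loud assumption: the columns are taken to be tab-separated, which the description above does not state. The layout of the closing NHBin and UnAss lines is not specified either, so they are returned unparsed. The function name parse_bins and the sample rows are invented for illustration.

```python
def parse_bins(lines):
    """Split .bins lines into per-taxon bins and unparsed trailing lines.

    ASSUMPTION: columns are tab-separated; adjust the delimiter if not.
    """
    bins, trailing = [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) == 4 and fields[2].strip().isdigit():
            taxid, name, count, reads = fields
            bins.append({
                "taxid": taxid.strip(),
                "name": name.strip(),
                "count": int(count),
                # column4 is a comma-separated list of read identifiers
                "reads": [r.strip() for r in reads.split(",") if r.strip()],
            })
        else:
            # NHBin / UnAss summary lines (layout not assumed) land here.
            trailing.append(line)
    return bins, trailing


# Invented sample rows, for illustration only.
sample = [
    "562\tEscherichia coli\t2\tread_1,read_7\n",
    "NHBin\t5\n",
]
bins, trailing = parse_bins(sample)
print(bins[0]["name"], bins[0]["reads"], trailing)
```

With real data, pass an open file handle instead of the sample list, for example parse_bins(open("InputFileName.bins")).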
b. InputFileName.bin_stats : The format of this file is as follows:
column1 : Taxid of the organism/taxa
column2 : Name of the organism/taxa
column3 : The total number of reads in the input file which
are categorised as this organism/taxa
The last four lines in this file are the following:
NHBin - Number of reads in the input file which have no BLASTx Hits.
UnAss - Number of reads in the input file classified as 'unassigned'
due to insignificant alignment parameters.
TAss   - Total number of assignments.
TReads - Total number of reads in the input file.
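Because .bin_stats holds one read count per taxon, it lends itself to a quick ranking of the most populated bins. The sketch below again assumes tab-separated columns, which is not stated above, and skips any line that begins with one of the four summary labels. The function name top_taxa and the sample rows are invented for illustration.

```python
SUMMARY_LABELS = {"NHBin", "UnAss", "TAss", "TReads"}


def top_taxa(lines, n=5):
    """Return up to n (taxid, name, read_count) rows, largest count first.

    ASSUMPTION: columns are tab-separated; adjust the delimiter if not.
    """
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        tokens = line.split(None, 1)
        if not tokens:
            continue  # blank line
        # Skip the closing summary lines, whatever their exact layout is.
        if tokens[0].rstrip("-") in SUMMARY_LABELS:
            continue
        fields = line.split("\t")
        if len(fields) == 3 and fields[2].strip().isdigit():
            rows.append((fields[0].strip(), fields[1].strip(), int(fields[2])))
    # sorted() is stable, so taxa with equal counts keep their file order.
    return sorted(rows, key=lambda row: row[2], reverse=True)[:n]


# Invented sample rows, for illustration only.
sample = [
    "562\tEscherichia coli\t12\n",
    "1423\tBacillus subtilis\t30\n",
    "TAss\t-\t42\n",
]
for taxid, name, count in top_taxa(sample, n=2):
    print(taxid, name, count)
```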
System | |
---|---|
Version | Mar21-2013 |