Sargasso

Sargasso disambiguates mixed-species high-throughput sequencing data.

Usage reference

The main species_separator script accepts a number of command-line options that control how reads are mapped and assigned to species:

species_separator <data-type>
    [--log-level=<log-level>]
    [--reads-base-dir=<reads-base-dir>] [--num-threads=<num-threads>]
    [--mismatch-threshold=<mismatch-threshold>]
    [--minmatch-threshold=<minmatch-threshold>]
    [--multimap-threshold=<multimap-threshold>]
    [--reject-multimaps] 
    [--best] [--conservative] [--recall] [--permissive]
    [--run-separation]
    [--delete-intermediate]
    [--mapper-executable=<mapper-executable>]
    [--mapper-index-executable=<mapper-index-executable>]
    [--sambamba-sort-tmp-dir=<sambamba-sort-tmp-dir>]
    <samples-file> <output-dir>
    (<species> <species-info>)
    (<species> <species-info>)
    ...

The parameters are described below for reference, grouped by function.
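As a rough illustration of how the synopsis above maps onto a concrete call, the sketch below assembles one possible invocation and prints it without running it. Only the option names come from the usage line. The data type `rnaseq`, the file names, the species labels, and the index paths are placeholders assumed for the example, and each `<species-info>` argument is assumed here to be a path to a prebuilt mapper index.

```shell
#!/bin/sh
# Hypothetical locations; replace with your own data.
READS_DIR=/data/reads
SAMPLES_FILE=samples.tsv
OUTPUT_DIR=sargasso_out

# Build the command as positional parameters so each argument stays one word.
# Options come first, then <samples-file> <output-dir>, then one
# (<species> <species-info>) pair per species.
set -- species_separator rnaseq \
    --reads-base-dir="$READS_DIR" \
    --num-threads=4 \
    --conservative \
    "$SAMPLES_FILE" "$OUTPUT_DIR" \
    mouse /data/genomes/mouse/STAR_index \
    rat /data/genomes/rat/STAR_index

# Dry run: print the command instead of executing it.
# To run it for real, replace this line with: "$@"
echo "$@"
```

Printing the command first makes it easy to check the argument order before committing to a long mapping run.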

Core

These parameters are the minimum required to run the pipeline.

Mapping

These parameters control the mapping of mixed-species RNA-seq reads to reference genomes.

Assignment criteria and optimisation

These parameters set the criteria by which reads are assigned to each species. Criteria can be set either by giving an individual value for each threshold parameter or by selecting a pre-configured assignment profile.
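The two approaches can be contrasted with a small dry-run sketch. The flag names are taken from the usage synopsis. The threshold values shown are purely illustrative, not recommended settings, and the data type, file names, and index paths are hypothetical placeholders.

```shell
#!/bin/sh
# Arguments shared by both variants (file names and index paths are hypothetical).
COMMON="samples.tsv sargasso_out mouse /data/genomes/mouse/STAR_index rat /data/genomes/rat/STAR_index"

# Variant 1: select a pre-configured assignment profile.
PROFILE_CMD="species_separator rnaseq --conservative $COMMON"

# Variant 2: set each criterion individually (values are illustrative only).
MANUAL_CMD="species_separator rnaseq --mismatch-threshold=0 --minmatch-threshold=0 --multimap-threshold=1 $COMMON"

# Dry run: print both commands for inspection rather than executing them.
echo "$PROFILE_CMD"
echo "$MANUAL_CMD"
```

Consult the individual parameter descriptions for the units and valid ranges of each threshold before choosing values.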

Performance

These optional parameters control how the pipeline is run.
