
The filter subcommand

Guanliang MENG edited this page Jun 22, 2023 · 1 revision

Use this subcommand to filter your raw FASTQ data and to subsample part of the resulting clean data (via the --data_size_for_mt_assembly option).

$ mitoz filter -h
usage: mitoz filter [-h] --fq1 <file> [--fq2 <file>] [--phred64] [--outprefix <str>]
                    [--fastq_read_length <INT>] [--data_size_for_mt_assembly <float1>,<float2>]
                    [--filter_other_para <str>] [--thread_number <int>] [--workdir <directory>]
                    [--workdir_done <directory>] [--workdir_log <directory>]

Filter input fastq reads.

optional arguments:
  -h, --help            show this help message and exit
  --fq1 <file>          Fastq1 file
  --fq2 <file>          Fastq2 file
  --phred64             Are the fastq phred64 encoded? [False]
  --outprefix <str>     output prefix [out]
  --fastq_read_length <INT>
                        read length of fastq reads, used to split clean fastq files. [150]
  --data_size_for_mt_assembly <float1>,<float2>
                        Data size (Gbp) used for mitochondrial genome assembly, usually between 3~8 Gbp is
                        enough. The float1 means the size (Gbp) of raw data to be subsampled, while the float2
                        means the size of clean data should be >= float2 Gbp, otherwise MitoZ will stop to run.
                        When only float1 is set, float2 is assumed to be 0. Set float1 to be 0 if you want to
                        use ALL raw data. [5,0]
  --filter_other_para <str>
                        other parar. []
  --thread_number <int>
                        thread number [4]
  --workdir <directory>
                        working directory [./]
  --workdir_done <directory>
                        done directory [./done]
  --workdir_log <directory>
                        log directory [./log]
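
As an illustration of how the options above fit together, here is a minimal sketch of a paired-end invocation. The input file names, the output prefix, and the chosen values are hypothetical placeholders; only the flags themselves come from the help text above. The script prints the assembled command and runs it only when mitoz and both input files are actually present.

```bash
#!/usr/bin/env bash
# Hypothetical input files and output prefix; replace with your own.
fq1="sample_R1.fq.gz"
fq2="sample_R2.fq.gz"
prefix="sample"

# Illustrative values: subsample 3 Gbp of raw data (float1) and
# require at least 1 Gbp of clean data (float2).
cmd=(mitoz filter
     --fq1 "$fq1"
     --fq2 "$fq2"
     --outprefix "$prefix"
     --fastq_read_length 150
     --data_size_for_mt_assembly 3,1
     --thread_number 8)

# Print the command for inspection.
printf '%s ' "${cmd[@]}"; echo

# Run it only if mitoz is on PATH and both input files exist.
if command -v mitoz >/dev/null 2>&1 && [ -f "$fq1" ] && [ -f "$fq2" ]; then
    "${cmd[@]}"
fi
```

According to the help text, passing a single value sets float2 to 0, and a float1 of 0 uses all raw data, so `--data_size_for_mt_assembly 0` should skip subsampling entirely.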