1. Demultiplex FASTQ by Barcode Combination

The demux command demultiplexes paired-end FASTQ files based on barcode combinations and generates per-combination FASTQ outputs for downstream alignment.

The command accepts barcode files in multiple formats (FASTA, TSV, CSV, Excel) and detects the format automatically. After demultiplexing, it removes empty outputs and merges symmetric barcode combinations (e.g., A-B and B-A) into a single output named in canonical {CRF1}-{CRF2} order.

Backend Selection by File Size

The demultiplexing backend is chosen automatically based on input file size:

Condition	Backend	Notes
R1 file ≤ 1 GB	cutadapt	Fast, requires cutadapt to be installed
R1 file > 1 GB	Built-in streaming demultiplexer	Memory-efficient, no external dependency

Both backends produce identically formatted output files and go through the same symmetric-merge step.
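
To anticipate which backend a given run will use, the size rule in the table can be sketched in shell. This is only an illustration: the helper names `backend_for_size` and `predict_backend` are hypothetical, and it assumes "1 GB" means 1024^3 bytes, which the tool may define differently.

```bash
# Assumption: "1 GB" is taken as 1024^3 bytes; the tool's exact threshold may differ.
ONE_GB=$((1024 * 1024 * 1024))

# Hypothetical helper: map a size in bytes to the backend named in the table.
backend_for_size() {
  if [ "$1" -le "$ONE_GB" ]; then
    echo "cutadapt"
  else
    echo "built-in streaming demultiplexer"
  fi
}

# Hypothetical helper: predict the backend for an R1 file on disk.
predict_backend() {
  # Arithmetic expansion strips any padding that wc adds on some systems.
  size=$(( $(wc -c < "$1") ))
  backend_for_size "$size"
}

backend_for_size 500000000   # prints: cutadapt
```

Calling `predict_backend raw.R1.fastq.gz` before a run shows whether cutadapt needs to be installed for that input.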

Parameters

Argument	Type	Default	Description	Example
`-1, --r1`	character	—	Input R1 FASTQ file (`.fastq.gz`).	`-1 raw.R1.fastq.gz`
`-2, --r2`	character	—	Input R2 FASTQ file (`.fastq.gz`).	`-2 raw.R2.fastq.gz`
`-b, --barcode`	character	—	Barcode file used for both R1 and R2 matching. Supported formats: `.fasta`, `.fa`, `.tsv`, `.csv`, `.xlsx`, `.xls`. Format is detected automatically.	`-b barcodes.fasta`
`-o, --output`	character	—	Output directory for demultiplexed FASTQ files.	`-o ./demux_out`
`-e, --error-rate`	numeric	`0`	Maximum allowed barcode mismatch rate (fraction of barcode length). For example, `0.1` allows 1 mismatch in a 10-base barcode.	`-e 0.1`
`-j, --threads`	integer	auto-detect	Number of threads (used by cutadapt for small files).	`-j 16`
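
The error-rate arithmetic can be checked with a small shell helper before choosing a value for `-e`. This sketch assumes the product of rate and barcode length is rounded down, which is how cutadapt treats its own error rate; the built-in backend may round differently. The function name `max_mismatches` is hypothetical.

```bash
# Hypothetical helper: mismatches allowed for a given error rate and barcode length.
# Assumes the product is rounded down, as cutadapt does for its error rate.
max_mismatches() {
  awk -v rate="$1" -v len="$2" 'BEGIN { print int(rate * len) }'
}

max_mismatches 0.1 10   # prints 1
max_mismatches 0.1 8    # prints 0
max_mismatches 0 10     # prints 0 (the default: exact matches only)
```

Under this assumption, a rate of `0.1` has no effect on barcodes shorter than 10 bases.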

Output Files

The command writes the following files to the output directory given by `-o` (shown here as OUT_DIR):

OUT_DIR/{name1}-{name2}_R1.fastq.gz
OUT_DIR/{name1}-{name2}_R2.fastq.gz

Each pair corresponds to a detected barcode combination. Empty combinations (0 reads) are automatically removed. Symmetric pairs (e.g., A-B and B-A) are merged into a single canonical output.
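
To predict which file name a symmetric pair ends up under, the canonical ordering can be sketched as follows. The source does not state how the canonical order is chosen, so this example assumes the two barcode names are simply sorted; `canonical_pair` is a hypothetical helper, not part of the tool.

```bash
# Hypothetical helper: join two barcode names in sorted order.
# Assumption: "canonical" means byte-wise sorted names; the tool may use another rule.
canonical_pair() {
  printf '%s\n%s\n' "$1" "$2" | LC_ALL=C sort | paste -s -d '-' -
}

canonical_pair CRF2 CRF1   # prints CRF1-CRF2
canonical_pair CRF1 CRF2   # prints CRF1-CRF2
```

Under this assumption, reads assigned to CRF2-CRF1 and CRF1-CRF2 would both be found in `CRF1-CRF2_R1.fastq.gz` and `CRF1-CRF2_R2.fastq.gz`.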

Example Usage

BARCODE="./barcodes.fasta"
IN_DIR="./fastq"
OUT_DIR="./demultiplex"

# Test Data
for d in "$IN_DIR"/*; do
  [[ -d "$d" ]] || continue
  sample="$(basename "$d")"

  multiEpiPrep demux \
    -1 "${d}/raw_R1.fastq.gz" \
    -2 "${d}/raw_R2.fastq.gz" \
    -b "$BARCODE" \
    -o "${OUT_DIR}/${sample}"
done
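
After the loop finishes, a quick sanity check is to confirm that each R1/R2 output pair holds the same number of reads. The sketch below is not part of multiEpiPrep: `count_reads` is a hypothetical helper, and it assumes standard four-line FASTQ records and the `OUT_DIR/{sample}/` layout produced by the loop above.

```bash
# Hypothetical helper: number of reads in a gzipped FASTQ (4 lines per record).
count_reads() {
  gzip -cd "$1" | awk 'END { print NR / 4 }'
}

OUT_DIR="./demultiplex"

# Report read counts for every R1/R2 pair and flag pairs whose counts differ.
for r1 in "$OUT_DIR"/*/*_R1.fastq.gz; do
  [ -e "$r1" ] || continue            # glob matched nothing
  r2="${r1%_R1.fastq.gz}_R2.fastq.gz"
  n1="$(count_reads "$r1")"
  n2="$(count_reads "$r2")"
  status="ok"
  [ "$n1" = "$n2" ] || status="MISMATCH"
  printf '%s\t%s\t%s\t%s\n' "$r1" "$n1" "$n2" "$status"
done
```

Each output line lists the R1 path, the R1 and R2 read counts, and a status column, so mismatched pairs are easy to find with `grep MISMATCH`.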