1. QC by Percentile

The qc command performs read-level quality control at the BAM level and generates the input tables required for downstream visualization with qc_visualization().

This step is implemented entirely in bash and is designed to be fast, reproducible, and independent of R. It computes sequencing depth for each CRF–CRF pair and applies a percentile-based filtering strategy to remove low-coverage pairs prior to visualization or modeling.

What this script does:

Iterates over all BAM files in a user-specified directory.
Uses samtools idxstats to compute total mapped read counts for each BAM file.
Writes a complete read count table (all_read_count.tsv) with columns:
- pair: CRF–CRF pair name (derived from BAM filename)
- read_count: total mapped reads
Calculates a percentile-based threshold on read counts.
Filters out CRF pairs below the specified percentile, producing filtered_read_count.tsv.
Ensures BAM index files are present before processing to avoid silent failures.

Parameters

Argument	Type	Default	Description	Example
`-i, –input`	character	—	Directory containing input BAM files (one per CRF pair)	`-i ./bam`
`-o, –output`	character	—	Output directory for QC tables	`-o ./qc`
`-p, –percentile`	numeric	`0.25`	Percentile threshold used to filter low-read-count CRF pairs	`-p 0.1`

Output Files

The command generates the following output files in the specified out_dir:

All read counts TSV - all_read_count.tsv
- Contains read counts for all input files
- Two columns: file prefix and read count

	pair	read_count
	<chr>	<int>
1	H3K27ac-H3K4me3	60573
2	H3K4me3-H3K4me3	940240
3	H3K4me1-H3K4me3	432292
4	H3K4me3-H3K9me3	415540
5	H3K27me3-H3K4me3	574643
6	…
7	H3K9me2-H3K9me3	201788

Filtered read counts CSV - filtered_read_count.tsv
- Contains read counts for files passing the percentile filter
- Excludes samples in the lowest percentile based on filtered_percentile parameter
- Two columns: file name and read count

Example Usage

# Test Data
for d in ./bam/*/; do
  sample=$(basename "$d")
  multiEpiPrep qc \
    -i "$d" \
    -o "./qc/${sample}" \
    -p 0.25
done