The frag command converts paired-end BAM files into
fragment-level BED files, providing a standardized interval
representation of sequenced DNA fragments for downstream coverage
construction and signal-based analyses.
This step bridges alignment-level data (BAM) and interval-level representations (BED). It is implemented entirely in bash and focuses on correctness, determinism, and compatibility with downstream tools such as coverage generation and peak calling.
What this script does:
- Converts the paired-end alignments in each input BAM file to fragment records with `bedtools bamtobed -bedpe`.

The resulting BED files represent inferred DNA fragments derived from paired-end alignments and serve as the canonical input for subsequent coverage and signal track generation steps.

| Argument | Type | Default | Description | Example |
|---|---|---|---|---|
| `-i`, `--input` | character | — | Directory containing input BAM files. Each BAM file is processed independently and is assumed to contain paired-end alignments. | `-i ./bam` |
| `-o`, `--output` | character | — | Output directory used to store fragment-level BED files. The directory is recreated at runtime to ensure a clean and deterministic output state. | `-o ./bed` |

The command generates the following output files in the specified `out_dir`:

- `<pair>.bed`
  - `<pair>` corresponds to the BAM filename without extension.
  - Columns are `chrom`, `chromStart`, and `chromEnd`, representing the inferred genomic span of each paired-end fragment.
  - Suitable as input to `bedtools genomecov`.

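The step from BEDPE records to fragment intervals can be sketched as follows. This is a minimal illustration under stated assumptions, not the script's actual code: the function name `bedpe_to_fragments` and the filtering rules (both mates mapped, same chromosome) are hypothetical. It relies only on the standard `bedtools bamtobed -bedpe` column layout, in which columns 1 to 3 and 4 to 6 hold the intervals of the two mates.

```shell
#!/usr/bin/env bash
# Hypothetical helper: collapse BEDPE records (as produced by
# `bedtools bamtobed -bedpe`) into one interval per fragment.
# Reads BEDPE on stdin, writes three-column BED on stdout.
bedpe_to_fragments() {
  awk 'BEGIN { FS = OFS = "\t" }
       # Keep pairs whose mates map to the same chromosome; bedtools reports
       # an unmapped mate as chrom "." with coordinates -1.
       $1 == $4 && $1 != "." && $2 >= 0 && $5 >= 0 {
         start = ($2 < $5) ? $2 : $5   # leftmost start of the two mates
         end   = ($3 > $6) ? $3 : $6   # rightmost end of the two mates
         print $1, start, end
       }'
}

# Example usage (paths are placeholders; the BAM must be grouped by read name):
#   bedtools bamtobed -bedpe -i sample.bam | bedpe_to_fragments \
#     | LC_ALL=C sort -k1,1 -k2,2n > sample.bed
```

Taking the minimum start and maximum end makes the result independent of which mate is listed first in the BEDPE record.
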
# General Usage (Extract result from qc)
# Test Data
for d in ./bam/*/; do
sample=$(basename "$d")
multiEpiPrep frag \
-i "$d" \
-o "./bed/${sample}"
done
The cov command converts BED interval files into
genome-wide coverage tracks in bedGraph format, providing a lightweight
and reproducible representation of read or fragment density for
downstream peak calling and signal-based analyses.
This step is implemented entirely in bash and serves as a standardized preprocessing layer between interval-level BED data and signal-level analyses. Each input BED file is processed independently, producing one bedGraph file with continuous coverage blocks.
What this script does:
- Resolves chromosome sizes for the selected reference genome (`hg38` or `mm10`).
- Runs `bedtools genomecov -bg` to compute genome-wide coverage tracks from BED intervals.
- Writes one `.bedGraph` file per input BED file into the specified output directory.

The generated bedGraph files are directly compatible with downstream signal-based steps, including peak calling, background modeling, and visualization.

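The per-file loop described above might look roughly like this. It is a sketch only: the function name `bed_dir_to_bedgraph` and the idea of passing a chromosome sizes file as the third argument are assumptions for illustration, and how the real command resolves `hg38` or `mm10` to a sizes file is not shown here. The `bedtools genomecov` options used (`-bg`, `-i`, `-g`) are standard.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the per-file coverage step.
# Arguments: input BED directory, output directory, chromosome sizes file.
bed_dir_to_bedgraph() {
  local in_dir=$1 out_dir=$2 chrom_sizes=$3
  local bed pair
  mkdir -p "$out_dir"                    # create the output directory if missing
  for bed in "$in_dir"/*.bed; do
    [ -e "$bed" ] || continue            # skip when the glob matches nothing
    pair=$(basename "$bed" .bed)         # <pair> = BED filename without extension
    # genomecov expects input grouped by chromosome, so sort first.
    LC_ALL=C sort -k1,1 -k2,2n "$bed" \
      | bedtools genomecov -bg -i stdin -g "$chrom_sizes" \
      > "$out_dir/$pair.bedGraph"
  done
}

# Example usage (the sizes file path is a placeholder):
#   bed_dir_to_bedgraph ./bed/sampleA ./bedgraph/sampleA ./hg38.chrom.sizes
```

Sorting under `LC_ALL=C` keeps the chromosome order byte-wise and therefore identical across machines, which supports reproducible output.
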
| Argument | Type | Default | Description | Example |
|---|---|---|---|---|
| `-i`, `--input` | character | — | Directory containing input BED files. Each BED file is treated as an independent sample (typically one CRF–CRF pair). | `-i ./bed` |
| `-o`, `--output` | character | — | Output directory used to store generated bedGraph files. The directory is created recursively if it does not already exist. | `-o ./bedgraph` |
| `-g`, `--genome` | character | — | Reference genome used to determine chromosome sizes. Supported values are `hg38` and `mm10`. | `-g hg38` |

The command generates the following output files in the specified `out_dir`:

- `<pair>.bedGraph`
  - `<pair>` corresponds to the BED filename without extension.
  - Columns are `chrom`, `chromStart`, `chromEnd`, and `value`.

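A quick way to confirm that a generated file follows this four-column layout is sketched below. The helper name `check_bedgraph` is hypothetical and not part of multiEpiPrep. It only checks the column count, that the start is smaller than the end, and that the value is numeric.

```shell
#!/usr/bin/env bash
# Hypothetical check: succeed only if every line of the file has four
# tab-separated columns, start < end, and a numeric value column.
check_bedgraph() {
  awk 'BEGIN { FS = "\t" }
       NF != 4 || $2 + 0 >= $3 + 0 ||
       $4 !~ /^-?[0-9]+(\.[0-9]+)?([eE][-+]?[0-9]+)?$/ { bad = 1; exit }
       END { exit bad + 0 }' "$1"
}

# Example usage (the path is a placeholder):
#   check_bedgraph ./bedgraph/sampleA/pair.bedGraph && echo "looks well-formed"
```
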
# Test Data
for d in ./bed/*/; do
sample=$(basename "$d")
multiEpiPrep cov \
-i "$d" \
-o "./bedgraph/${sample}" \
-g hg38
done
The track command converts BED interval files into
bigWig signal tracks, producing compact, indexed coverage
representations suitable for genome browser visualization and downstream
signal-based analyses.
This step is implemented entirely in bash and standardizes the transition from fragment-level intervals (BED) to continuous signal tracks (bigWig). Each input BED file is processed independently, and the command supports optional RPM normalization to make tracks comparable across CRF–CRF pairs with different sequencing depths.
What this script does:
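- Resolves chromosome sizes for the selected reference genome (`hg38` or `mm10`).
- When normalization is enabled, computes an RPM scale factor of `1e6 / total_reads`.
- Runs `bedtools genomecov -bg` to generate a bedGraph coverage track.
- Sorts the bedGraph by (`chrom`, `start`) using a stable locale setting (`LC_ALL=C`).
- Converts the sorted bedGraph to bigWig with `bedGraphToBigWig`.

The resulting bigWig tracks can be loaded directly into genome browsers (e.g., IGV/UCSC) and serve as the canonical signal representation for visualization and interpretation.
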
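The RPM scaling and sorting steps can be illustrated with two small helpers. This is a sketch under assumptions, not the command's actual implementation: the function names are hypothetical, and `total_reads` is assumed here to be the number of records in the input BED file.

```shell
#!/usr/bin/env bash
# Hypothetical helpers illustrating RPM scaling. total_reads is assumed to be
# the number of records in the input BED file.

# Print the RPM scale factor (1e6 / total_reads); fail on an empty file.
rpm_scale_factor() {
  awk 'END { if (NR == 0) exit 1; printf "%.6f\n", 1e6 / NR }' "$1"
}

# Multiply the value column of a bedGraph (stdin) by the given factor, then
# sort by (chrom, start) in the C locale, the order bedGraphToBigWig expects.
scale_and_sort_bedgraph() {
  awk -v s="$1" 'BEGIN { FS = OFS = "\t" }
                 { $4 = sprintf("%.6f", $4 * s); print }' \
    | LC_ALL=C sort -k1,1 -k2,2n
}

# Example usage (file names are placeholders):
#   scale=$(rpm_scale_factor sample.bed)
#   bedtools genomecov -bg -i sample.bed -g hg38.chrom.sizes \
#     | scale_and_sort_bedgraph "$scale" > sample.bedGraph
#   bedGraphToBigWig sample.bedGraph hg38.chrom.sizes sample.bw
```

`bedtools genomecov` also provides a `-scale` option that applies a constant factor while computing coverage, which would avoid the separate `awk` pass. The explicit version is shown here only because it makes the arithmetic visible.
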
| Argument | Type | Default | Description | Example |
|---|---|---|---|---|
| `-i`, `--input` | character | — | Directory containing input BED files. Each BED file is treated as an independent sample (typically one CRF–CRF pair). | `-i ./bed` |
| `-o`, `--output` | character | — | Output directory used to store generated bigWig tracks (`.bw`). Created recursively if it does not exist. | `-o ./tracks` |
| `-g`, `--genome` | character | — | Reference genome used to determine chromosome sizes for coverage calculation and bigWig indexing. Supported values are `hg38` and `mm10`. | `-g hg38` |
| `-n`, `--normalized` / `--no-normalized` | logical | `true` | Whether to apply RPM normalization. If enabled, coverage is scaled by `1e6 / total_reads` prior to export. Use `--no-normalized` to export raw (unscaled) coverage tracks. | `-n` or `--no-normalized` |

The command generates the following output files in the specified `out_dir`:

- `<pair>.bw`
  - `<pair>` corresponds to the BED filename without extension.
  - Indexed against the reference genome's `chrom.sizes`, enabling fast random access for genome browsers.

# Test Data
for d in ./bed/*/; do
sample=$(basename "$d")
multiEpiPrep track \
-i "$d" \
-o "./bigwig/${sample}" \
-g hg38
done
…