The frag command converts paired-end BAM files into
fragment-level BED files, providing a standardized interval
representation of sequenced DNA fragments for downstream coverage
construction and signal-based analyses.
This step bridges alignment-level data (BAM) and interval-level representations (BED). It is implemented entirely in bash and focuses on correctness, determinism, and compatibility with downstream tools such as coverage generation and peak calling.
What this script does:

- Converts paired-end alignments into fragment records using `bedtools bamtobed -bedpe`.

The resulting BED files represent inferred DNA fragments derived from paired-end alignments and serve as the canonical input for subsequent coverage and signal track generation steps.
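How `frag` turns BEDPE records into fragment intervals is not spelled out here. One common approach, shown as a minimal sketch and not as the script's actual code, takes the leftmost start and the rightmost end of the two mates. The function name `bedpe_to_fragments` and the toy records are hypothetical; the column layout is the 10-column BEDPE written by `bedtools bamtobed -bedpe`.

```shell
# Hypothetical helper: collapse BEDPE mate pairs into fragment intervals.
# BEDPE columns: chrom1 start1 end1 chrom2 start2 end2 name score strand1 strand2
bedpe_to_fragments() {
  awk 'BEGIN { FS = OFS = "\t" }
       # skip pairs split across chromosomes or with an unmapped mate (".")
       $1 == $4 && $1 != "." {
         start = ($2 < $5) ? $2 : $5   # leftmost mate start
         end   = ($3 > $6) ? $3 : $6   # rightmost mate end
         print $1, start, end
       }' "$@"
}

# Toy input: one proper pair and one pair split across chromosomes.
fragments=$(printf 'chr1\t100\t150\tchr1\t220\t270\tr1\t60\t+\t-\nchr1\t500\t550\tchr2\t10\t60\tr2\t60\t+\t-\n' | bedpe_to_fragments)
printf '%s\n' "$fragments"
```

Only the first toy pair yields a fragment (`chr1 100 270`); the second is dropped because its mates map to different chromosomes. Note that `bedtools bamtobed -bedpe` expects the BAM to be grouped by read name so that mates are adjacent.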
| Argument | Type | Default | Description | Example |
|---|---|---|---|---|
| `-i`, `--input` | character | none | Directory containing input BAM files. Each BAM file is processed independently and is assumed to contain paired-end alignments. | `-i ./bam` |
| `-o`, `--output` | character | none | Output directory used to store fragment-level BED files. The directory is recreated at runtime to ensure a clean and deterministic output state. | `-o ./bed` |
The command generates the following output files in the specified `out_dir`:

- `<pair>.bed`, where `<pair>` corresponds to the BAM filename without extension.
- Each file contains the columns `chrom`, `chromStart`, and `chromEnd`, representing the inferred genomic span of each paired-end fragment.
- The files can be used directly as input to `bedtools genomecov`.
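Before passing fragment files downstream, a quick structural check can catch truncated or malformed output. The following is a minimal sketch; `bed_is_valid` is a hypothetical helper, not part of `multiEpiPrep`, and it only tests the three-column layout described above.

```shell
# Hypothetical check: every line has at least 3 tab-separated fields,
# non-negative integer coordinates, and chromStart < chromEnd.
bed_is_valid() {
  awk 'BEGIN { FS = "\t"; bad = 0 }
       NF < 3 || $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ || $2 + 0 >= $3 + 0 { bad = 1; exit }
       END { exit bad }' "$1"
}

# Toy files: one well-formed, one with start > end.
good=$(mktemp); bad=$(mktemp)
printf 'chr1\t100\t270\nchr2\t0\t50\n' > "$good"
printf 'chr1\t300\t200\n' > "$bad"

bed_is_valid "$good" && echo "good: ok"
bed_is_valid "$bad"  || echo "bad: rejected"
```

The function returns a non-zero exit status on the first offending line, so it can be used as a guard in a loop over `<pair>.bed` files.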
```shell
# Test Data
BAM_DIR="./bam"
BED_DIR="./bed"

# Process each sample subdirectory of BAM_DIR independently
for d in "$BAM_DIR"/*/; do
  [ -d "$d" ] || continue
  sample=$(basename "$d")
  echo "======================"
  echo "$sample"
  echo "======================"
  out="${BED_DIR}/${sample}"
  multiEpiPrep frag -i "$d" -o "$out"
done
```
The cov command converts BED interval files into
genome-wide coverage tracks in bedGraph format, providing a lightweight
and reproducible representation of read or fragment density for
downstream peak calling and signal-based analyses.
This step is implemented entirely in bash and serves as a standardized preprocessing layer between interval-level BED data and signal-level analyses. Each input BED file is processed independently, producing one bedGraph file with continuous coverage blocks.
What this script does:

- Resolves chromosome sizes for the selected reference genome (`hg38` or `mm10`).
- Runs `bedtools genomecov -bg` to compute genome-wide coverage tracks from BED intervals.
- Writes one `.bedGraph` file per input BED file into the specified output directory.

The generated bedGraph files are directly compatible with downstream signal-based steps, including peak calling, background modeling, and visualization.
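The steps above can be sketched as a dry run that prints, rather than executes, one `bedtools genomecov` call per BED file. This is an illustration under assumptions, not the script's code: the helper names `chrom_sizes_for` and `cov_commands`, the `GENOME_DIR` variable, the `<genome>.chrom.sizes` file layout, and the `pairA.bed` demo file are all hypothetical.

```shell
# Hypothetical helper: map a genome label to a chrom.sizes file.
# The layout GENOME_DIR/<genome>.chrom.sizes is an assumption.
chrom_sizes_for() {
  case "$1" in
    hg38|mm10) printf '%s/%s.chrom.sizes\n' "${GENOME_DIR:-./genomes}" "$1" ;;
    *) echo "unsupported genome: $1" >&2; return 1 ;;
  esac
}

# Print (do not run) one genomecov command per BED file in a directory.
# The body runs in a subshell so its variables stay local.
cov_commands() (
  in_dir=$1; out_dir=$2; genome=$3
  sizes=$(chrom_sizes_for "$genome") || exit 1
  for bed in "$in_dir"/*.bed; do
    [ -e "$bed" ] || continue
    pair=$(basename "$bed" .bed)
    printf 'bedtools genomecov -bg -i %s -g %s > %s/%s.bedGraph\n' \
      "$bed" "$sizes" "$out_dir" "$pair"
  done
)

# Demo with an empty placeholder BED file.
demo=$(mktemp -d)
: > "$demo/pairA.bed"
cov_commands "$demo" ./bedgraph hg38
```

Rejecting unknown genome labels before the loop keeps a typo in `-g` from producing a partial output directory. When running the printed commands for real, keep in mind that `bedtools genomecov` expects BED input grouped by chromosome.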
| Argument | Type | Default | Description | Example |
|---|---|---|---|---|
| `-i`, `--input` | character | none | Directory containing input BED files. Each BED file is treated as an independent sample (typically one CRF–CRF pair). | `-i ./bed` |
| `-o`, `--output` | character | none | Output directory used to store generated bedGraph files. The directory is created recursively if it does not already exist. | `-o ./bedgraph` |
| `-g`, `--genome` | character | none | Reference genome used to determine chromosome sizes. Supported values are `hg38` and `mm10`. | `-g hg38` |
The command generates the following output files in the specified `out_dir`:

- `<pair>.bedGraph`, where `<pair>` corresponds to the BED filename without extension.
- Each file has four columns: `chrom`, `chromStart`, `chromEnd`, and `value`.
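One way to sanity-check a bedGraph against the BED file it came from is to compare covered bases: for unscaled coverage, the sum of `(chromEnd - chromStart) * value` over all bedGraph blocks should equal the summed length of the BED intervals. The sketch below uses hand-written toy data and assumes all intervals lie within the chromosome bounds; the variable names are hypothetical.

```shell
# Toy data: two overlapping fragments and the matching raw coverage blocks.
bed=$(mktemp); bg=$(mktemp)
printf 'chr1\t100\t200\nchr1\t150\t250\n' > "$bed"
printf 'chr1\t100\t150\t1\nchr1\t150\t200\t2\nchr1\t200\t250\t1\n' > "$bg"

# Total bases spanned by the BED intervals.
bed_bases=$(awk -F'\t' '{ s += $3 - $2 } END { printf "%.0f\n", s }' "$bed")
# Total bases implied by the bedGraph: block width times depth.
bg_bases=$(awk -F'\t' '{ s += ($3 - $2) * $4 } END { printf "%.0f\n", s }' "$bg")

if [ "$bed_bases" = "$bg_bases" ]; then
  echo "coverage is consistent: $bed_bases bases"
else
  echo "mismatch: BED=$bed_bases bedGraph=$bg_bases" >&2
fi
```

The check does not apply to normalized tracks, where `value` has been rescaled.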
```shell
# Test Data
BED_DIR="./bed"
BEDGRAPH_DIR="./bed"

# Process each sample subdirectory of BED_DIR independently
for d in "$BED_DIR"/*/; do
  [ -d "$d" ] || continue
  sample=$(basename "$d")
  echo "======================"
  echo "$sample"
  echo "======================"
  out="${BEDGRAPH_DIR}/${sample}"
  multiEpiPrep cov -i "$d" -o "$out" -g hg38
done
```
The track command converts BED interval files into
bigWig signal tracks, producing compact, indexed coverage
representations suitable for genome browser visualization and downstream
signal-based analyses.
This step is implemented entirely in bash and standardizes the transition from fragment-level intervals (BED) to continuous signal tracks (bigWig). Each input BED file is processed independently, and the command supports optional RPM normalization to make tracks comparable across CRF-CRF pairs with different sequencing depths.
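What this script does:

- Resolves chromosome sizes for the selected reference genome (`hg38` or `mm10`).
- If normalization is enabled, computes the RPM scaling factor `1e6 / total_reads`.
- Runs `bedtools genomecov -bg` to generate a bedGraph coverage track.
- Sorts the bedGraph by (`chrom`, `start`) using a stable locale setting (`LC_ALL=C`).
- Converts the sorted bedGraph to bigWig with `bedGraphToBigWig`.

The resulting bigWig tracks can be loaded directly into genome browsers (e.g., IGV/UCSC) and serve as the canonical signal representation for visualization and interpretation.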
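The normalization and sorting steps can be illustrated on toy data without any external tools. This is a sketch under stated assumptions, not the script's implementation: it assumes `total_reads` is the number of lines in the input BED file, and it applies the factor with `awk` after coverage has been computed. The file contents and variable names are hypothetical.

```shell
# Toy data: four fragments and their raw coverage blocks, deliberately unsorted.
bed=$(mktemp); raw=$(mktemp)
printf 'chr1\t100\t200\nchr1\t150\t250\nchr2\t0\t50\nchr2\t10\t60\n' > "$bed"
printf 'chr2\t10\t50\t2\nchr1\t150\t200\t2\nchr2\t0\t10\t1\nchr1\t100\t150\t1\nchr1\t200\t250\t1\nchr2\t50\t60\t1\n' > "$raw"

# Assumption: total_reads is the number of lines (fragments) in the BED file.
total_reads=$(awk 'END { print NR }' "$bed")

# Scale column 4 by 1e6 / total_reads, then sort by (chrom, start) in the C locale.
rpm=$(awk -v n="$total_reads" 'BEGIN { FS = "\t"; scale = 1e6 / n }
          { printf "%s\t%s\t%s\t%.4f\n", $1, $2, $3, $4 * scale }' "$raw" |
      LC_ALL=C sort -k1,1 -k2,2n)
printf '%s\n' "$rpm"

# Final step, not run here (needs the UCSC tool and a chrom.sizes file):
#   bedGraphToBigWig sorted.bedGraph hg38.chrom.sizes out.bw
```

With four fragments the factor is 250000, so a raw depth of 2 becomes 500000 RPM. Fixing `LC_ALL=C` makes the chromosome ordering byte-wise and therefore identical across machines, which is the ordering `bedGraphToBigWig` expects from its input.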
| Argument | Type | Default | Description | Example |
|---|---|---|---|---|
| `-i`, `--input` | character | none | Directory containing input BED files. Each BED file is treated as an independent sample (typically one CRF–CRF pair). | `-i ./bed` |
| `-o`, `--output` | character | none | Output directory used to store generated bigWig tracks (`.bw`). Created recursively if it does not exist. | `-o ./tracks` |
| `-g`, `--genome` | character | none | Reference genome used to determine chromosome sizes for coverage calculation and bigWig indexing. Supported values are `hg38` and `mm10`. | `-g hg38` |
| `-n`, `--normalized` / `--no-normalized` | logical | `true` | Whether to apply RPM normalization. If enabled, coverage is scaled by `1e6 / total_reads` prior to exporting. Use `--no-normalized` to export raw (unscaled) coverage tracks. | `-n` / `--no-normalized` |
The command generates the following output files in the specified `out_dir`:

- `<pair>.bw`, where `<pair>` corresponds to the BED filename without extension.
- Each track is indexed against the selected genome's `chrom.sizes`, enabling fast random access for genome browsers.
```shell
# Test Data
BED_DIR="./bed"
BIGWIG_DIR="./bed"

# Process each sample subdirectory of BED_DIR independently
for d in "$BED_DIR"/*/; do
  [ -d "$d" ] || continue
  sample=$(basename "$d")
  echo "======================"
  echo "$sample"
  echo "======================"
  out="${BIGWIG_DIR}/${sample}"
  multiEpiPrep track -i "$d" -o "$out" -g hg38
done
```