### 2) Concatenate all 18 chrIX sequences into one FASTA file, and index it as pggb requires
Before merging the files, we need to change the sequence IDs, because they are all identical (`chrIX`).
We rename the sequence IDs following the PanSN-spec naming pattern (https://github.com/pangenome/PanSN-spec). The PanSN-spec naming scheme is `[sample_name][delim][haplotype_id][delim][contig_or_scaffold_name]`, so we will rename each ID to `Strain_ID#1#chrIX`, where `1` is the haplotype ID and `#` is the delimiter.
<details><summary>(Click to show possible answers)</summary>
```bash
# start from an empty output file
: > ./chroms/ScRAP_chrIX.fa
for chrom_file in ./chroms/*.chrIX.fa
do
    filename=$(basename "${chrom_file}")
    # e.g. Strain_ID.chrIX.fa -> Strain_ID#1#chrIX
    new_id=${filename/.chrIX.fa/#1#chrIX}
    echo "${new_id}"
    # replace the shared header ID with the PanSN-style ID
    sed "s/^>chrIX/>${new_id}/" "${chrom_file}" \
        >> ./chroms/ScRAP_chrIX.fa
done
# pggb needs the FASTA file to be indexed with samtools faidx or seqkit faidx
seqkit faidx ./chroms/ScRAP_chrIX.fa
```
</details>
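Before running pggb, it can be worth checking that every header in the merged file is unique and has the three `#`-separated PanSN fields. The snippet below is a minimal sketch of such a check on toy data: the strain names (`StrainA`, `StrainB`, `StrainC`), the temporary directory, and the file name `merged_chrIX.fa` are hypothetical stand-ins for the real files, and requiring a numeric haplotype ID is an assumption made for this tutorial.

```bash
set -euo pipefail

# Toy input standing in for ./chroms/*.chrIX.fa (strain names are hypothetical)
workdir=$(mktemp -d)
trap 'rm -rf "${workdir}"' EXIT
for strain in StrainA StrainB StrainC; do
    printf '>chrIX\nACGTACGT\n' > "${workdir}/${strain}.chrIX.fa"
done

# Same renaming as above, applied to the toy files
merged="${workdir}/merged_chrIX.fa"
: > "${merged}"
for chrom_file in "${workdir}"/*.chrIX.fa; do
    filename=$(basename "${chrom_file}")
    new_id=${filename/.chrIX.fa/#1#chrIX}
    sed "s/^>chrIX/>${new_id}/" "${chrom_file}" >> "${merged}"
done

# Count headers, PanSN-style headers (sample#haplotype#contig), and unique IDs
summary=$(awk '
    /^>/ {
        total++
        id = substr($1, 2)                      # drop the leading ">"
        if (split(id, parts, "#") == 3 && parts[1] != "" \
            && parts[2] ~ /^[0-9]+$/ && parts[3] != "") pansn++
        if (!(id in seen)) { seen[id] = 1; unique++ }
    }
    END { printf "headers=%d pansn=%d unique=%d", total, pansn, unique }
' "${merged}")
echo "${summary}"
```

If the three counts are equal, every sequence has a distinct, well-formed ID. To run the same check on the real data, point the `awk` command at `./chroms/ScRAP_chrIX.fa`.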
- ### Step2: We are now ready to run pggb
<details><summary>(Click to show possible answers)</summary>
```bash
mkdir pggb
# here we set -n 18, which is equal to the number of haplotypes
```
</details>
- ### Step4: Compare pggb results across different nucleotide identity thresholds
The default minimum average nucleotide identity for segments in pggb is 90%, which is suitable for genomes with a divergence of at most 10%. Let's try different thresholds to see how they affect the pangenome graph.
<details><summary>(Click to show possible answers)</summary>
```bash
# use a lower percent identity of 80 to allow more divergent alignments
```
</details>
- ### Step6: Calculate pangenome openness with panacus (https://github.com/marschall-lab/panacus)
panacus is a tool for calculating statistics for GFA files. It supports the following calculations:
- coverage histogram
- pangenome growth statistics
- path-/group-resolved coverage table
Here we mainly use it to calculate the pangenome growth statistics.
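To make the term "coverage histogram" concrete, here is a minimal sketch that counts, for a tiny hand-written GFA, how many paths traverse each node and then tallies nodes by that count. It is only an illustration with `awk`, not how panacus is implemented; the toy graph, the path names, and the temporary file are hypothetical.

```bash
set -euo pipefail

# Toy GFA: 4 nodes (S lines) and 3 paths (P lines); all names are hypothetical
gfa=$(mktemp)
trap 'rm -f "${gfa}"' EXIT
printf 'H\tVN:Z:1.0\n' > "${gfa}"
printf 'S\t%s\t%s\n' 1 ACG 2 T 3 GGA 4 C >> "${gfa}"
printf 'P\t%s\t%s\t*\n' \
    'StrainA#1#chrIX' '1+,2+,3+' \
    'StrainB#1#chrIX' '1+,3+,4+' \
    'StrainC#1#chrIX' '1+,2+,3+,4+' >> "${gfa}"

# For each node, count the distinct paths that visit it (its coverage),
# then count how many nodes have coverage 1, 2, ..., number of paths
hist=$(awk -F'\t' '
    $1 == "P" {
        npaths++
        n = split($3, steps, ",")
        for (i = 1; i <= n; i++) {
            node = steps[i]
            sub(/[+-]$/, "", node)              # strip the orientation sign
            key = node SUBSEP $2
            if (!(key in seen)) { seen[key] = 1; cov[node]++ }
        }
    }
    END {
        for (node in cov) hist[cov[node]]++
        for (c = 1; c <= npaths; c++) printf "%d %d\n", c, hist[c] + 0
    }
' "${gfa}")
echo "${hist}"
```

Each output line is a coverage value followed by the number of nodes with that coverage. In this toy graph, nodes 1 and 3 are shared by all three paths and nodes 2 and 4 by two paths each, so no node is private to a single path.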
<details><summary>(Click to show possible answers)</summary>
```bash
# 1) run panacus histgrowth to calculate coverage and pangenome growth for nodes (default) with coverage/quorum thresholds 1/0, 2/0, 1/1, 1/0.5, and 1/0.1