Blandine Trouche October 2023
The accession information for the external MAGs used can be found in Supplementary Table S3. We used a total of 35 external references:
- 4 MAGs from the Mariana Trench (Zhong et al., 2020)
- 22 MAGs from the Mariana Trench and adjacent abyssal plains (Zhou et al., 2022)
- 9 MAGs from the North Atlantic abyssal plain (Kerou et al., 2021)
Example commands to download the data with the NCBI Datasets command-line tool (shown here for a single accession):
conda activate ncbi_datasets
cd NON_REDUNDANT_BINS
mkdir -p 01_EXT_FASTA
cd 01_EXT_FASTA
datasets download genome accession GCA_012928605.1 --include genome
unzip ncbi_dataset.zip
mv ncbi_dataset/data/*/*.fna .
rm -r ncbi_dataset*
# compress all downloaded fasta files
gzip *.fna
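The commands above fetch one accession at a time. As a minimal sketch, assuming the 35 accessions have been saved one per line in a text file (the file name accessions.txt and the function name fetch_genomes are hypothetical and not part of the original workflow), the same steps could be wrapped in a function that uses the --inputfile option of the datasets tool:

```shell
# Sketch only: download every accession listed in a file (one per line),
# then move and compress the FASTA files as in the single-accession example.
fetch_genomes() {
    accession_list="$1"   # hypothetical file, e.g. accessions.txt
    # --inputfile reads the accessions from a file instead of the command line
    datasets download genome accession --inputfile "$accession_list" \
        --include genome --filename ncbi_dataset.zip || return 1
    unzip -q -o ncbi_dataset.zip || return 1
    # each genome is unpacked into its own sub-directory of ncbi_dataset/data/
    mv ncbi_dataset/data/*/*.fna . || return 1
    rm -r ncbi_dataset ncbi_dataset.zip
    # compress all downloaded fasta files
    gzip ./*.fna
}
```

It would be called from inside 01_EXT_FASTA as `fetch_genomes accessions.txt`. Like the original commands, it compresses every .fna file present in the current directory.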
Creating the tab-delimited text file that Anvi’o needs to locate the FASTA files: ext_Zhou_Zhong_Kerou.txt
name path
B89T1L10 01_EXT_FASTA/GCA_022561135.1_ASM2256113v1_genomic.fna.gz
B7T1B11 01_EXT_FASTA/GCA_022561815.1_ASM2256181v1_genomic.fna.gz
B2D1T2 01_EXT_FASTA/GCA_022567895.1_ASM2256789v1_genomic.fna.gz
B1T1B5 01_EXT_FASTA/GCA_022572285.1_ASM2257228v1_genomic.fna.gz
B17T3L8 01_EXT_FASTA/GCA_022572765.1_ASM2257276v1_genomic.fna.gz
B16T1B3 01_EXT_FASTA/GCA_022573065.1_ASM2257306v1_genomic.fna.gz
B19T1B10 01_EXT_FASTA/GCA_022572565.1_ASM2257256v1_genomic.fna.gz
B26D1T2 01_EXT_FASTA/GCA_022571195.1_ASM2257119v1_genomic.fna.gz
B44T3L14 01_EXT_FASTA/GCA_022565935.1_ASM2256593v1_genomic.fna.gz
B56T1B5 01_EXT_FASTA/GCA_022564175.1_ASM2256417v1_genomic.fna.gz
B51T1B5 01_EXT_FASTA/GCA_022564515.1_ASM2256451v1_genomic.fna.gz
B52T3L11 01_EXT_FASTA/GCA_022564385.1_ASM2256438v1_genomic.fna.gz
B49T1B8 01_EXT_FASTA/GCA_022564815.1_ASM2256481v1_genomic.fna.gz
B5T1L6 01_EXT_FASTA/GCA_022563695.1_ASM2256369v1_genomic.fna.gz
B6T1L6 01_EXT_FASTA/GCA_022562615.1_ASM2256261v1_genomic.fna.gz
B10T1B5 01_EXT_FASTA/GCA_022574355.1_ASM2257435v1_genomic.fna.gz
B10T1B11 01_EXT_FASTA/GCA_022574415.1_ASM2257441v1_genomic.fna.gz
B12T1B11 01_EXT_FASTA/GCA_022573895.1_ASM2257389v1_genomic.fna.gz
B15D1T2 01_EXT_FASTA/GCA_022573365.1_ASM2257336v1_genomic.fna.gz
B15MC02 01_EXT_FASTA/GCA_022573335.1_ASM2257333v1_genomic.fna.gz
B10D1T1 01_EXT_FASTA/GCA_022545335.1_ASM2254533v1_genomic.fna.gz
B12D1T1 01_EXT_FASTA/GCA_022545345.1_ASM2254534v1_genomic.fna.gz
MTA1 01_EXT_FASTA/GCA_012928605.1_ASM1292860v1_genomic.fna.gz
MTA4 01_EXT_FASTA/GCA_012928615.1_ASM1292861v1_genomic.fna.gz
MTA5 01_EXT_FASTA/GCA_012928585.1_ASM1292858v1_genomic.fna.gz
MTA6 01_EXT_FASTA/GCA_012928565.1_ASM1292856v1_genomic.fna.gz
NPMR_NP_delta_1 01_EXT_FASTA/GCA_016276965.1_ASM1627696v1_genomic.fna.gz
NPMR_NP_theta_3 01_EXT_FASTA/GCA_016838785.1_ASM1683878v1_genomic.fna.gz
NPMR_NP_delta_2 01_EXT_FASTA/GCA_016838725.1_ASM1683872v1_genomic.fna.gz
NPMR_NP_iota_1 01_EXT_FASTA/GCA_016838825.1_ASM1683882v1_genomic.fna.gz
NPMR_NP_theta_2 01_EXT_FASTA/GCA_016838795.1_ASM1683879v1_genomic.fna.gz
NPMR_NP_theta_5 01_EXT_FASTA/GCA_016838745.1_ASM1683874v1_genomic.fna.gz
NPMR_NP_theta_4 01_EXT_FASTA/GCA_016838765.1_ASM1683876v1_genomic.fna.gz
NPMR_NP_delta_3 01_EXT_FASTA/GCA_016838865.1_ASM1683886v1_genomic.fna.gz
NPMR_NP_theta_1 01_EXT_FASTA/GCA_016838845.1_ASM1683884v1_genomic.fna.gz
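Because the workflow reads this file as two TAB-separated columns (name and path), a quick check before launching it can catch columns separated by spaces or paths that do not exist. The following is a minimal sketch under that assumption; the function name check_fasta_txt is hypothetical and the check is not part of Anvi’o itself:

```shell
# Sketch only: sanity-check a fasta.txt file before running the workflow.
check_fasta_txt() {
    fasta_txt="$1"
    tab=$(printf '\t')
    status=0
    {
        # first line: the two column headers, separated by a single TAB
        IFS= read -r header
        if [ "$header" != "name${tab}path" ]; then
            echo "header is not name<TAB>path" >&2
            status=1
        fi
        # remaining lines: two non-empty TAB-separated fields, and the path must exist
        while IFS="$tab" read -r name path extra || [ -n "$name" ]; do
            if [ -z "$name" ] || [ -z "$path" ] || [ -n "$extra" ]; then
                echo "malformed line: $name $path $extra" >&2
                status=1
            elif [ ! -f "$path" ]; then
                echo "missing file for $name: $path" >&2
                status=1
            fi
        done
    } < "$fasta_txt"
    return "$status"
}
```

Since the paths in the file are relative, it would be run from NON_REDUNDANT_BINS as `check_fasta_txt ext_Zhou_Zhong_Kerou.txt`. A non-zero exit status means at least one line needs attention.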
conda activate anvio-7.1
cd NON_REDUNDANT_BINS
# checking the steps that will be run by the workflow
anvi-run-workflow -w contigs \
-c config_contigs_ext.json \
--save-workflow-graph
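# Run the actual workflow
# (optional: add "--directory your_working_directory" right after --additional-params;
# it is kept out of the command because a commented line would cut the command short)
anvi-run-workflow -w contigs \
-c config_contigs_ext.json \
--additional-params \
--jobs 36 \
--keep-going --rerun-incomplete >> workflow_log_contigs_ext.txt 2>&1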