Aerith Omics

Omics in Soil & Gut & Microbiome & Plant

This tutorial shows how to run a stable isotope probing (SIP) proteomics search with Sipros on DDA mass spectrometry data from 13C-labeled E. coli. The workflow runs on WSL Ubuntu 20.04 under Windows 11 and on CentOS 7.

Install environment

conda create -n py2 scikit-learn python=2.7
conda create -n mono -c conda-forge mono
conda create -n r -c conda-forge -c bioconda r-base r-stringr r-tidyr bioconductor-biostrings

Make folder for the workflow

mkdir fasta raw ft regular sip configs bin

Download raw file

cd raw
# Download raw file with 1% 13C
wget ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2023/04/PXD041414/Pan_062822_X1iso5.raw
# Download raw file with 50% 13C
wget ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2023/04/PXD041414/Pan_052322_X13.raw
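
Since FTP transfers of large raw files can fail partway, it may help to confirm that both files arrived before starting the search. The following is a minimal sketch, not part of the original workflow: `check_raw_files` is a hypothetical helper name, and it only tests that each file exists and is non-empty, which does not prove the download is complete.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report which expected raw files are present
# and non-empty in a folder. Returns 1 if any file is missing or empty.
# Usage: check_raw_files DIR FILE...
check_raw_files() {
    local dir=$1
    shift
    local status=0
    local name
    for name in "$@"; do
        if [ -s "$dir/$name" ]; then
            echo "ok $name"
        else
            echo "missing $name"
            status=1
        fi
    done
    return "$status"
}
```

Run from inside the raw folder, a call could look like `check_raw_files . Pan_062822_X1iso5.raw Pan_052322_X13.raw`.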

Install metawrap

# Express installation
conda create --name metawrap --channel ursky metawrap-mg=1.3.2
# Activate the environment first so the fixes below apply to it
conda activate metawrap

# Fix the error:
# Can't locate Bio/Root/Version.pm in
# @INC (you may need to install the Bio::Root::Version module)
cd ~/miniconda3/envs/metawrap
ln -s lib/perl5/site_perl/5.22.0/ perl5

# Locate the active config-metawrap, then copy the version kept in the home folder over it
which config-metawrap
cp ~/config-metawrap ~/miniconda3/envs/metawrap/bin/config-metawrap

# Fix "bowtie2-build-s: symbol lookup error, undefined symbol"
conda install tbb=2020.2

Install BLAST DB

Aspera download link

# List the archive names in an interactive FTP session
ftp
open ftp.ncbi.nlm.nih.gov
# user
anonymous
cd /blast/db/
passive
mls nt.*.tar.gz download.list.txt
cd /blast/db/v4/
mls nt_v4.*.tar.gz downloadV4.list.txt
bye

# Download the archives named in download.list.txt, one at a time
nohup cat download.list.txt | \
xargs -n 1 -P 1 \
bash -c '~/.aspera/connect/bin/ascp -v -k 1 -T -l 1000m \
-i ~/asperaweb_id_dsa.openssh \
anonftp@ftp.ncbi.nlm.nih.gov:/blast/db/$0 ./' \
>> downloadlog.txt 2>&1 &

# Download the v4 archives the same way
nohup cat downloadV4.list.txt | \
xargs -n 1 -P 1 \
bash -c '~/.aspera/connect/bin/ascp -v -k 1 -T -l 1000m \
-i ~/asperaweb_id_dsa.openssh \
anonftp@ftp.ncbi.nlm.nih.gov:/blast/db/v4/$0 ./' \
>> downloadlog.txt 2>&1 &

# Unpack every downloaded archive
for a in nt*.tar.gz; do tar xzf "$a"; done

# Edit the database paths in the metaWRAP config
vim ~/miniconda3/envs/metawrap/bin/config-metawrap

# Download and unpack the NCBI taxonomy dump
~/.aspera/connect/bin/ascp -v -k 1 -T -l 1000m \
-i ~/asperaweb_id_dsa.openssh \
anonftp@ftp.ncbi.nlm.nih.gov:/pub/taxonomy/taxdump.tar.gz ./
tar -xvf taxdump.tar.gz
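
Because the background downloads run unattended, some archives may be missing when they finish. The sketch below compares a download list against a folder and prints the names that are absent or empty, so they can be fetched again. `missing_archives` is a hypothetical helper name, and the sketch assumes the list holds one bare file name per line, as written by the `mls` commands above. A non-empty file is not necessarily a complete one, so the download log is still worth checking.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every archive named in LIST that is
# absent or empty in DIR.
# Usage: missing_archives LIST DIR
missing_archives() {
    local list=$1 dir=$2 name
    # The "|| [ -n ... ]" clause keeps a last line that lacks a newline
    while IFS= read -r name || [ -n "$name" ]; do
        name=${name%$'\r'}          # tolerate CRLF line endings
        [ -n "$name" ] || continue  # skip blank lines
        [ -s "$dir/$name" ] || printf '%s\n' "$name"
    done < "$list"
}
```

For example, `missing_archives download.list.txt . > retry.list.txt` would write a list of archives to fetch again; `retry.list.txt` is only an illustrative name.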

Import fastq to QIIME 2

Illumina single-end

Split fastq files

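# Read the QC summary, skipping its original header row
QCstat <- read.table("01.CleanData/QCstat.xls", skip = 1)
colnames(QCstat) <-
  c(
    'Sample Name',
    'Raw PE(#)',
    'Combined(#)',
    'Qualified(#)',
    'Nochime(#)',
    'Base(nt)',
    'AvgLen(nt)',
    'Q20',
    'Q30',
    'GC%',
    'Effective%'
  )
sampleName <- QCstat$`Sample Name`
# Build the two-column table: sample ID and absolute path to its fastq file
df <- data.frame(
  `sample-id` = sampleName,
  `absolute-filepath` = paste0(
    "/mnt/e/xiongyi/Chile/01.CleanData/",
    sampleName,
    "/",
    sampleName,
    ".fastq"
  ),
  stringsAsFactors = FALSE
)
# data.frame() alters the hyphenated names, so set them again
colnames(df) <- c("sample-id", "absolute-filepath")
write.table(
  df,
  "metadata.tsv",
  quote = FALSE,
  row.names = FALSE,
  sep = "\t"
)

# Shell: run fastp on one sample in the qiime2 environment as a quality test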
source ~/miniconda3/etc/profile.d/conda.sh
conda activate qiime2
mkdir qualityTest
cd qualityTest
fastp -i ../01.CleanData/D1/D1.fastq -o D1.filterd.fq
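
Before importing, it can be worth checking that every path in the table written by the R script points at a real file. The following is a rough sketch under stated assumptions: `check_manifest` is a hypothetical helper name, and the file is assumed to have one header row followed by tab-separated `sample-id` and `absolute-filepath` columns, as produced above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the sample ID and path of every row whose
# fastq file is missing or empty. Expects one header row, then
# sample-id <TAB> absolute-filepath on each line.
# Usage: check_manifest MANIFEST
check_manifest() {
    local manifest=$1 id path
    # tail -n +2 skips the header row
    tail -n +2 "$manifest" | while IFS=$'\t' read -r id path; do
        [ -n "$id" ] || continue
        [ -s "$path" ] || printf '%s\t%s\n' "$id" "$path"
    done
}
```

From inside qualityTest the call would be `check_manifest ../metadata.tsv`, assuming the R script was run one folder up. No output means every listed file was found. A clean table with these two column names should then be usable as a single-end fastq manifest for `qiime tools import` (for Phred33 data, the `SingleEndFastqManifestPhred33V2` input format).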

Introduction to 16S and ITS rRNA Sequencing

16S and Internal Transcribed Spacer (ITS) ribosomal RNA (rRNA) sequencing are common amplicon sequencing methods used to identify and compare bacteria or fungi present within a given sample. Both ITS and 16S rRNA gene sequencing are well-established methods for comparing sample phylogeny and taxonomy from complex microbiomes or environments that are difficult or impossible to study.

Register an NCBI account
