Commit d8275ce4, authored 3 years ago by Alexis CRISCUOLO
0.7
parent 02a13909
Showing 2 changed files with 14 additions and 10 deletions:
README.md: 4 additions, 4 deletions
wgetGenBankWGS.sh: 10 additions, 6 deletions
README.md (+4 −4)
@@ -30,11 +30,11 @@ Run _wgetGenBankWGS_ with the following command line model:
     Run _wgetGenBankWGS_ without option to read the following documentation:
     ```
    -wgetGenBankWGS Copyright (C) 2019-2021 Institut Pasteur
    +wgetGenBankWGS Copyright (C) 2019-2021 Institut Pasteur
    -Downloading sequence files corresponding to selected entries from genome assembly report files:
    -GenBank: ftp://ftp.ncbi.nlm.nih.gov/genomes/ASSEMBLY_REPORTS/assembly_summary_genbank.txt
    -RefSeq: ftp://ftp.ncbi.nlm.nih.gov/genomes/ASSEMBLY_REPORTS/assembly_summary_refseq.txt
    +Downloading sequence files corresponding to selected entries from genome assembly report files (option -d):
    +GenBank: ftp.ncbi.nlm.nih.gov/genomes/ASSEMBLY_REPORTS/assembly_summary_genbank.txt
    +RefSeq: ftp.ncbi.nlm.nih.gov/genomes/ASSEMBLY_REPORTS/assembly_summary_refseq.txt
     Selected entries (options -e and -v) can be restricted to a specific phylum using option -p:
     -p A archaea
wgetGenBankWGS.sh (+10 −6)
@@ -33,7 +33,10 @@
     # = VERSIONS = #
     # ============ #
     # #
    -VERSION=0.6.201018ac #
    +VERSION=0.7.211026ac #
    +# + takes into account the new protocol https in field ftp_path of the genome assembly report files #
    +# #
    +# VERSION=0.6.211018ac #
     # + takes into account the last field 'asm_not_live_date' in genome assembly report files #
     # + adding option -p to select a specific phylum #
     # #
@@ -71,11 +74,11 @@ if [ "$1" = "-?" ] || [ "$1" = "-h" ] || [ $# -le 1 ]
     then
     #
     cat <<EOF
    -wgetGenBankWGS v.$VERSION Copyright (C) 2019-2021 Institut Pasteur
    +wgetGenBankWGS v.$VERSION Copyright (C) 2019-2021 Institut Pasteur
    -Downloading sequence files corresponding to selected entries from genome assembly report files:
    -GenBank: ftp://ftp.ncbi.nlm.nih.gov/genomes/ASSEMBLY_REPORTS/assembly_summary_genbank.txt
    -RefSeq: ftp://ftp.ncbi.nlm.nih.gov/genomes/ASSEMBLY_REPORTS/assembly_summary_refseq.txt
    +Downloading sequence files corresponding to selected entries from genome assembly report files (option -d):
    +GenBank: ftp.ncbi.nlm.nih.gov/genomes/ASSEMBLY_REPORTS/assembly_summary_genbank.txt
    +RefSeq: ftp.ncbi.nlm.nih.gov/genomes/ASSEMBLY_REPORTS/assembly_summary_refseq.txt
     Selected entries (options -e and -v) can be restricted to a specific phylum using option -p:
     -p A archaea
@@ -151,6 +154,7 @@ fi
     # =============== #
     # #
     # = PROTOCOL can be either "ftp:" or "https"; however, "https:" is generally faster ====================== #
    +# = however since Sep. 2021, the default protocol is now "https" ====================== #
     # #
     PROTOCOL="https:";
     # #
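
The comments in this hunk say that the scheme may be either "ftp:" or "https:" and that PROTOCOL holds the one to use. As a rough illustration of how a URL taken from the ftp_path field could be normalized to that scheme, here is a minimal sketch. The helper `set_protocol` and the sample path are hypothetical and do not come from wgetGenBankWGS.sh.

```shell
# Hypothetical helper (not part of wgetGenBankWGS.sh): rewrites the scheme of a
# URL so that it matches $PROTOCOL, whatever scheme the input carried.
PROTOCOL="https:";

set_protocol() {
  # $1: URL such as "ftp://host/path" or "https://host/path";
  # ${1#*://} strips everything up to and including the first "://"
  echo "$PROTOCOL//${1#*://}";
}

# Made-up path, for illustration only
set_protocol "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/EXAMPLE";
```

Stripping the scheme with a parameter expansion rather than a fixed-length cut means the same call works whether the report lists an ftp or an https path.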
@@ -314,7 +318,7 @@ if [ "$EXCLUDE_PATTERN" != "^#" ]; then echo "exclusion criterion: $EXCLUDE_PATT
     tmp=$(randomfile $SUMMARY);
     mv $SUMMARY $tmp;
     sed -n '2p' $tmp > $SUMMARY;
    -sed '1,2d' $tmp | grep -E "$INCLUDE_PATTERN" | grep -v -E "$EXCLUDE_PATTERN" | grep -F "ftp://ftp.ncbi.nlm.nih.gov" >> $SUMMARY;
    +sed '1,2d' $tmp | grep -E "$INCLUDE_PATTERN" | grep -v -E "$EXCLUDE_PATTERN" | grep -F "ftp.ncbi.nlm.nih.gov" >> $SUMMARY;
     rm $tmp;
     n=$(grep -v -c "^#" $SUMMARY);
     echo "$REPOSITORY: $n entries";
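
The changed line drops the `ftp://` prefix from the fixed-string filter, so entries are kept whichever scheme their ftp_path uses. The effect can be sketched on a few made-up rows. The two-column records below are illustrative stand-ins, not real assembly report lines, and the variable names are assumptions.

```shell
# Made-up rows standing in for assembly report entries: one ftp path,
# one https path, and one row without a download path.
rows=$(printf '%s\n' \
  'A1 ftp://ftp.ncbi.nlm.nih.gov/genomes/all/A1' \
  'B2 https://ftp.ncbi.nlm.nih.gov/genomes/all/B2' \
  'C3 na');

# Old filter: the fixed string includes the scheme, so only the ftp row matches.
old=$(echo "$rows" | grep -c -F "ftp://ftp.ncbi.nlm.nih.gov");
# New filter: the host name alone matches both schemes and still drops the row without a path.
new=$(echo "$rows" | grep -c -F "ftp.ncbi.nlm.nih.gov");

echo "old filter: $old entries";   # old filter: 1 entries
echo "new filter: $new entries";   # new filter: 2 entries
```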