As the doctor gone rogue

February 20, 2014

Download files from a webpage with wget.

Filed under: bash — Tags: , , — hypotheses @ 9:15 pm

Recently PhyloTree released another update of its mitochondrial phylogenetic tree (19 Feb 2014). Besides the updated tree, a very nice feature of the http://www.phylotree.org website is its well-curated collection of mitochondrial sequences, publicly available for download. You can check it out here: http://www.phylotree.org/mtDNA_seqs.htm

If you have a curious mind, you may want to download all the sequences and construct your own trees, or use the data for something else creative.

In this case wget might be your best friend, although you could also write a Python script to do something similar.

 wget -r --accept "*.ext" --level 2 http://www.website.com/pagewithLink.html
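
For a concrete starting point, the command above can be wrapped in a small shell function. This is only a sketch: `fetch_linked_files` and the `DRY_RUN` switch are names made up for illustration, and `--no-parent` and `--wait 1` are extra wget options added here to keep the crawl below the starting directory and to pause between requests.

```shell
# Hypothetical helper (not part of wget).
#   $1 = page URL, $2 = file extension to keep, $3 = recursion depth (default 2)
# Set DRY_RUN=1 to print the wget command instead of running it.
fetch_linked_files() {
    ${DRY_RUN:+echo} wget -r --level "${3:-2}" --no-parent --wait 1 \
        --accept "*.$2" "$1"
}

# Example (replace "ext" with the extension the linked files actually use):
#   fetch_linked_files "http://www.website.com/pagewithLink.html" ext
```

Running it once with DRY_RUN=1 is a cheap way to check the options before starting a long download.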
 

As a reminder, if you are behind a proxy firewall, take a look at my previous post: https://bhoom.wordpress.com/2013/07/26/how-to-wget-with-proxy-authentication/

 


July 26, 2013

How to wget with proxy authentication?

Filed under: bash, NGS — Tags: , , , — hypotheses @ 2:25 am

Once again, I have a problem with proxy server authentication on my university network, this time while trying to install the new KGGSeq software for next-generation sequencing data analysis.

As a quick fix, with cygwin, here is what I did.

1. Tell bash that we are using a proxy server.

## Add these to ~/.bashrc for my bash start up shell



proxy="http://user:password@proxy-server.university:8080"
export http_proxy=$proxy
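
Many tools, wget included, also look at https_proxy and no_proxy, so it may help to set those in the same place. A minimal sketch, reusing the placeholder host and credentials from above; the no_proxy list is an assumption and should be adjusted for your own network.

```shell
## Possible additions to ~/.bashrc (placeholder values, adjust as needed)
proxy="http://user:password@proxy-server.university:8080"
export http_proxy="$proxy"             # plain HTTP requests
export https_proxy="$proxy"            # HTTPS requests go through the same proxy
export no_proxy="localhost,127.0.0.1"  # assumed list of hosts that bypass the proxy
```

Quoting "$proxy" keeps the value intact even if the password contains characters the shell would otherwise interpret.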

2. Tell wget which username and password to use with the proxy server.

As an example to download KGGSeq through cygwin, here’s what I did.

wget --proxy-user "bhoom" --proxy-password "bhoom_password" "http://statgenpro.psychiatry.hku.hk/limx/kggseq/download.php?file=kggseq.zip"
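
A password typed on the command line ends up in the shell history and is visible in the process list, so an alternative is to keep the proxy settings in a wgetrc file. The sketch below is not what I actually ran: `write_proxy_wgetrc` is a made-up helper name, and the host and credentials are the same placeholders as above. The `use_proxy`, `http_proxy`, `proxy_user` and `proxy_password` settings are standard wgetrc commands.

```shell
# Hypothetical helper: append proxy settings to a wgetrc file.
#   $1 = file to append to (default: ~/.wgetrc)
write_proxy_wgetrc() {
    rc="${1:-$HOME/.wgetrc}"
    cat >> "$rc" <<'EOF'
use_proxy = on
http_proxy = http://proxy-server.university:8080
proxy_user = bhoom
proxy_password = bhoom_password
EOF
    # The file now holds a password, so make it readable by the owner only.
    chmod 600 "$rc"
}
```

After calling write_proxy_wgetrc once, wget should pick up the proxy credentials without the --proxy-user and --proxy-password options.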

Wget – ArchWiki.

I’m still not quite sure why the university still uses proxy authentication. There seem to be several other enterprise authentication systems, but perhaps they are all pricey? Even so, does the price justify the trouble we all have with slow connections to every website, with the many bioinformatics tools that cannot connect through a proxy server, and so on?

July 14, 2010

Using wget to download files from secured website

Filed under: bash — Tags: , — hypotheses @ 4:56 pm

Our collaborator just uploaded a bunch of files to their website today. One easy way to get the data is to download everything with wget, which can mirror the whole directory structure of the remote website.

Normally you can run:


wget -r http://fly.srk.fer.hr --user=bhoom --password=bhoom_Password

-r : for recursive

You can also limit how many levels deep wget recurses (--level), cap the total download size (--quota), etc.
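
As a rough sketch of how those limits could be combined with the login options: `mirror_protected_site` and `DRY_RUN` are illustrative names, and the default depth of 3 and quota of 500m are arbitrary assumptions. `--ask-password` makes wget prompt for the password instead of taking it on the command line.

```shell
# Hypothetical helper (not part of wget).
#   $1 = site URL, $2 = user name,
#   $3 = recursion depth (default 3), $4 = download quota (default 500m)
# Set DRY_RUN=1 to print the wget command instead of running it.
mirror_protected_site() {
    ${DRY_RUN:+echo} wget -r --level "${3:-3}" --quota "${4:-500m}" --no-parent \
        --user "$2" --ask-password "$1"
}

# Example:
#   mirror_protected_site http://fly.srk.fer.hr bhoom 2 100m
```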
