As the doctor gone rogue

June 7, 2010

Import HTML table to R

Filed under: R — hypotheses @ 11:29 pm

Remember the days when you had to copy a table from the web, paste it into Excel, spend even more time fixing it into the format you wanted, and still couldn't do anything with it besides look at an Excel table?
Over the past couple of days I have been looking around and saw that this functionality is already available in R, through the XML package. I also read about it on R-bloggers today: http://bit.ly/9UawIE. The process of importing a table seems pretty simple. Here's how.
library(XML)
# URL for the Google Data
u = "http://www.google.com/adplanner/static/top1000/"
tables = readHTMLTable(u)
my.table=tables[[2]] # The first element of the list is empty
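
Since readHTMLTable returns a list with one element per table it finds, it helps to look at that list before picking an index. Here is a minimal sketch of that step, assuming a small made-up page parsed from a string instead of the Google URL (the site names and visitor counts are invented for illustration):

```r
library(XML)

# A tiny inline page standing in for a real URL (hypothetical data).
html <- paste0(
  "<html><body><table>",
  "<tr><th>Site</th><th>Visitors</th></tr>",
  "<tr><td>example.com</td><td>1000</td></tr>",
  "<tr><td>example.org</td><td>250</td></tr>",
  "</table></body></html>"
)

# Parse the string, then extract every table in the document.
doc <- htmlParse(html, asText = TRUE)
tables <- readHTMLTable(doc, stringsAsFactors = FALSE)

# Check how many tables came back before choosing one.
length(tables)
my.table <- tables[[1]]

# Cells are read as text, so numeric columns need converting.
my.table$Visitors <- as.numeric(my.table$Visitors)
str(my.table)
```

On a real page the index you want depends on how many tables the page contains, which is why the Google example above uses the second element.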

Compare this with the other method, which imports the table cells into a list; you then need to convert that list into the appropriate table yourself. http://r.789695.n4.nabble.com/Read-HTML-table-td840241.html

xpathApply(htmlTreeParse("http://blabla", useInternalNodes = TRUE), "//td", xmlValue)
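
The conversion step that this second method leaves to you could look roughly like the sketch below. It assumes a single, regular table whose first row uses th cells for the column names, and it again uses a made-up inline page in place of a real URL:

```r
library(XML)

# A tiny inline page standing in for a real URL (hypothetical data).
html <- paste0(
  "<html><body><table>",
  "<tr><th>Site</th><th>Visitors</th></tr>",
  "<tr><td>example.com</td><td>1000</td></tr>",
  "<tr><td>example.org</td><td>250</td></tr>",
  "</table></body></html>"
)
doc <- htmlParse(html, asText = TRUE)

# xpathSApply simplifies the result to a flat character vector.
cells <- xpathSApply(doc, "//td", xmlValue)
headers <- xpathSApply(doc, "//th", xmlValue)

# Fold the flat vector back into rows, one column per header.
cell.matrix <- matrix(cells, ncol = length(headers), byrow = TRUE)
my.table <- as.data.frame(cell.matrix, stringsAsFactors = FALSE)
names(my.table) <- headers
my.table
```

This only works when every row has the same number of cells and the page holds one table. With several tables, "//td" mixes their cells together, which is one reason readHTMLTable is the more convenient route.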


2 Comments »

  1. Yeah, I discovered this 2 weeks ago on that flowdata blogsite. It's pretty cool. I hope I remember this function exists when I do need to use it.

    Comment by Beverley — June 8, 2010 @ 2:20 am

  2. The first method works well only when you have a dedicated page for the table.

    Comment by Bhoom — June 11, 2010 @ 1:44 pm

