Using R with Excel


Quite often that I get a dataset sent to me in an Excel format, that I want to use R to perform additional analysis afterward.  My solution is always exporting the excel file to CSV format (to ease some of the problem with missing cell value). I recently discovered a new solution to bypass this step and read-in the data from Excel to R directly.

With the help of “RODBC” package, reading the data from excel spreadsheet seems quite simple. I believe, though, that this is still limited to “xls” file format, and not the newer “xlsx” file format.

1. You need to establish a connection channel between R and your data base file “DBF”


> ch <- odbcConnectExcel("datafile.xls")

If you are using the windows GUI for R, you can ask for a dialog box to enter the file directly using


> ch <- odbcConnectExcel()

2. You can then view the spreadsheets within the file using


> sqlTables(ch)

3. To fetch the data into R bypassing “read.table”


> excelData <- sqlFetch(ch,"spreadsheet",as.is=TRUE) ## avoid string2factor conversion

> excelData_withspace <- sqlFetch(ch,'spreadsheet name$' ) ## notice the additional $ sign and single quote


If the spreadsheet name contains a space, you will need to enclose the name within a single quote.

4. Once you finish working on the spreadsheet, you should close the connection.


> close(ch)

 

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

w

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.