Quite often that I get a dataset sent to me in an Excel format, that I want to use R to perform additional analysis afterward. My solution is always exporting the excel file to CSV format (to ease some of the problem with missing cell value). I recently discovered a new solution to bypass this step and read-in the data from Excel to R directly.
With the help of “RODBC” package, reading the data from excel spreadsheet seems quite simple. I believe, though, that this is still limited to “xls” file format, and not the newer “xlsx” file format.
1. You need to establish a connection channel between R and your data base file “DBF”
> ch <- odbcConnectExcel("datafile.xls")
If you are using the windows GUI for R, you can ask for a dialog box to enter the file directly using
> ch <- odbcConnectExcel()
2. You can then view the spreadsheets within the file using
3. To fetch the data into R bypassing “read.table”
> excelData <- sqlFetch(ch,"spreadsheet",as.is=TRUE) ## avoid string2factor conversion > excelData_withspace <- sqlFetch(ch,'spreadsheet name$' ) ## notice the additional $ sign and single quote
If the spreadsheet name contains a space, you will need to enclose the name within a single quote.
4. Once you finish working on the spreadsheet, you should close the connection.