There are several ways to import data into R.
The standard way, at least as it used to be, is to read a text file with the read.table() function.
For Excel files, Excel being the most famous spreadsheet software in the world, several libraries can import data from an .xls file.
In the past, the problem was the .xlsx format, which was not supported yet.
Recently, I discovered that read.xls() can now import .xlsx files as well. This skips the step I normally had to take: saving the Excel file as a text file and then doing the regular file import.
library(gdata)  # assumption: read.xls() here is the one from the gdata package
data <- read.xls(xls = "myData.xlsx", sheet = 1, header = TRUE, as.is = TRUE)
Life is not that simple. I figured out how to create a Docker machine with limits on disk usage, CPU, and memory through docker-machine running on an Ubuntu host. However, there seems to be a problem that prevents the host from connecting directly to the guest docker-machine.
bhoom@mg0:~$ docker-machine create -d virtualbox --virtualbox-disk-size "100000" --virtualbox-memory "32000" --virtualbox-cpu-count "16" fireDock0
Running pre-create checks...
(fireDock0) Copying /home/bhoom/.docker/machine/cache/boot2docker.iso to /home/bhoom/.docker/machine/machines/fireDock0/boot2docker.iso...
(fireDock0) Creating VirtualBox VM...
(fireDock0) Creating SSH key...
(fireDock0) Starting the VM...
(fireDock0) Check network to re-create if needed...
(fireDock0) Waiting for an IP...
Waiting for machine to be running, this may take a few minutes...
Detecting operating system of created instance...
Waiting for SSH to be available...
Detecting the provisioner...
Provisioning with boot2docker...
Copying certs to the local machine directory...
Copying certs to the remote machine...
Setting Docker configuration on the remote daemon...
This machine has been allocated an IP address, but Docker Machine could not
reach it successfully.
SSH for the machine should still work, but connecting to exposed ports, such as
the Docker daemon port (usually <ip>:2376), may not work properly.
You may need to add the route manually, or use another related workaround.
This could be due to a VPN, proxy, or host file configuration issue.
You also might want to clear any VirtualBox host only interfaces you are not using.
Checking connection to Docker...
Error creating machine: Error checking the host: Error checking and/or regenerating the certs: There was an error validating certificates for host "192.168.99.100:2376": dial tcp 192.168.99.100:2376: i/o timeout
You can attempt to regenerate them using 'docker-machine regenerate-certs [name]'.
Be advised that this will trigger a Docker daemon restart which will stop running containers.
If this is your case, and you have a proxy server set up for your general internet connection, try
If you are lucky, you should be able to connect to your docker-machine (locally). Otherwise, life is not that simple.
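One workaround often suggested when a proxy gets in the way, offered here as an assumption rather than a confirmed fix, is to exclude the machine's address from proxying through the NO_PROXY variable. The sketch below uses a hypothetical helper, add_no_proxy, and the IP address from the error message above.

```shell
# Hypothetical helper: append a host to NO_PROXY/no_proxy unless it is already listed.
add_no_proxy() {
    host="$1"
    case ",${NO_PROXY:-}," in
        *",${host},"*) ;;                                   # already present, nothing to do
        *) NO_PROXY="${NO_PROXY:+${NO_PROXY},}${host}" ;;   # append with a comma if needed
    esac
    no_proxy="$NO_PROXY"                                    # some tools read the lowercase name
    export NO_PROXY no_proxy
}

# Example with the address reported in the error message
add_no_proxy 192.168.99.100
# If the machine's IP can be queried, it could come from docker-machine instead:
#   add_no_proxy "$(docker-machine ip fireDock0)"
```

After this, running docker-machine regenerate-certs as the log suggests may be worth another try.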
There are generally 4 steps to create an analytics environment on your server that is separate from the rest of the system. Running your analysis within a container might reduce the risk of crashing the server, because an analysis that uses up all the resources could otherwise cause the whole server to freeze up.
virtualbox (to create a docker machine)
docker-machine (this will be the machine that runs your container)
Map your command to run in the container. Following the
sudo apt-get install virtualbox-qt
curl -L https://github.com/docker/machine/releases/download/v0.8.0/docker-machine-`uname -s`-`uname -m` > /usr/local/bin/docker-machine && \
chmod +x /usr/local/bin/docker-machine
create a docker-machine
This will create a separate docker machine called docker2.
docker-machine create -d virtualbox --virtualbox-disk-size "100000" --virtualbox-cpu-count "8" --virtualbox-memory "32092" docker2
docker-machine start docker2
We then need to specify the new destination where docker containers will run, i.e. on docker2.
eval $(docker-machine env docker2)
See <https://docs.docker.com/machine/install-machine/> for more info
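The eval line works by exporting environment variables into the current shell. A minimal sketch for checking which daemon the shell currently targets, assuming the DOCKER_MACHINE_NAME and DOCKER_HOST variables that docker-machine env exports; docker_target is a hypothetical helper name.

```shell
# Hypothetical helper: report where docker commands in this shell will be sent.
docker_target() {
    if [ -n "${DOCKER_MACHINE_NAME:-}" ]; then
        # Variables exported by: eval $(docker-machine env <name>)
        echo "machine ${DOCKER_MACHINE_NAME} at ${DOCKER_HOST:-unknown address}"
    else
        echo "local Docker daemon"
    fi
}

docker_target
```

Because the variables live only in the current shell, a new terminal needs the eval line again before it targets docker2.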
You’re now at a point where you can run stuff in the container. Here’s an extra step that will make it super easy: put these lines in your .bashrc file (or the Windows equivalent)
docker run -v $PWD:/tmp/working -w=/tmp/working --rm -it kaggle/python python "$@"
docker run -v $PWD:/tmp/working -w=/tmp/working --rm -it kaggle/python ipython
(sleep 3 && open "http://$(docker-machine ip docker2):8888")&
docker run -v $PWD:/tmp/working -w=/tmp/working -p 8888:8888 --rm -it kaggle/python jupyter notebook --no-browser --ip="*" --notebook-dir=/tmp/working
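Pasted bare into .bashrc, these commands would run every time a shell starts, and the "$@" only makes sense inside a function. A minimal sketch that wraps them in shell functions; the names kpython, ikpython and kjupyter are hypothetical, and only the docker commands themselves come from the lines above.

```shell
# Hypothetical wrapper names; the docker commands are the ones listed above.
kpython() {
    # Run a Python script from the current directory inside the container
    docker run -v "$PWD":/tmp/working -w=/tmp/working --rm -it kaggle/python python "$@"
}

ikpython() {
    # Start an interactive IPython session inside the container
    docker run -v "$PWD":/tmp/working -w=/tmp/working --rm -it kaggle/python ipython
}

kjupyter() {
    # 'open' is the macOS launcher; on Ubuntu, xdg-open is the likely substitute (assumption)
    (sleep 3 && open "http://$(docker-machine ip docker2):8888") &
    docker run -v "$PWD":/tmp/working -w=/tmp/working -p 8888:8888 --rm -it kaggle/python jupyter notebook --no-browser --ip="*" --notebook-dir=/tmp/working
}
```

With these in place, something like kpython my_script.py would run a script from the current directory in the container, where my_script.py stands for any file of your own.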
Life just got a lot easier if you want to install vcftools on macOS.
Once homebrew is installed (see https://bhoom.wordpress.com/tag/brew/), you can simply install vcftools in one line.
brew install homebrew/science/vcftools
- apt-get is installed
- You have
# Add the repository from webupd8team, install Oracle Java, and set it as the default
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-set-default
If you have other Java versions installed previously, you might want to completely remove them first.
sudo apt-get purge openjdk-\*
There's a long description of how to install Oracle Java on WikiHow if you want to install it from scratch.
One task that comes up relatively often is finding the location of some specific files. You may know that those files are somewhere under your current working directory (the “.” used in the command below), but have no clue which sub-folder they are in.
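A minimal sketch of such a search, assuming GNU or BSD find: it walks every sub-folder below a starting directory and prints ls -l details for each match. The helper name find_details and the '*.vcf' pattern are only illustrative.

```shell
# Hypothetical helper: list files below a directory whose names match a pattern, with details.
find_details() {
    dir="$1"
    pattern="$2"
    # -type f          : regular files only
    # -name "$pattern" : shell-style pattern matched against the file name
    # -exec ls -l {} + : run ls -l on the matches, batched into as few calls as possible
    find "$dir" -type f -name "$pattern" -exec ls -l {} +
}

# Example: every .vcf file anywhere below the current directory
find_details . '*.vcf'
```

Quoting the pattern matters: without the quotes the shell would expand it against the current directory before find ever sees it.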