There are generally four steps to creating an analytics environment on your server that is separate from the rest of the system. Running your analysis inside a container reduces the risk of crashing the server by using up all of its resources and causing it to freeze.
- Install virtualbox (to create a docker machine)
- Install docker-machine
- Create a docker-machine: this will be the machine to run your container
- Map your command to run in the container, following the kaggle/python tutorial
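One prerequisite the steps below take for granted: the Docker client itself should already be installed on the host, since docker-machine only provisions the VM that runs the Docker daemon. A quick check:

docker --version   # any working Docker client is fine; install one first if this fails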
Install virtualbox-qt
sudo apt-get install virtualbox-qt
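You can confirm VirtualBox installed correctly with VBoxManage, which ships with it:

VBoxManage --version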
Install docker-machine
curl -L https://github.com/docker/machine/releases/download/v0.8.0/docker-machine-`uname -s`-`uname -m` > /usr/local/bin/docker-machine && \
chmod +x /usr/local/bin/docker-machine
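A quick check that the binary is in place and executable:

docker-machine version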
Create a docker-machine
This will create a separate docker machine called docker2
docker-machine create -d virtualbox --virtualbox-disk-size "100000" --virtualbox-cpu-count "8" --virtualbox-memory "32092" docker2
docker-machine start docker2
We then need to specify the new destination where Docker containers will run, i.e. on docker2:
eval $(docker-machine env docker2)
See <https://docs.docker.com/machine/install-machine/> for more info
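To confirm the new machine is running and that your shell is now pointed at it:

docker-machine ls            # docker2 should be Running, with * in the ACTIVE column
docker-machine ip docker2    # the VM's IP address (used later for the Jupyter URL)
docker info                  # should now report the daemon inside docker2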
Run kaggle/python
You’re now at a point where you can run stuff in the container. Here’s an extra step that will make it super easy: put these lines in your .bashrc file (or the Windows equivalent)
kpython() {
  docker run -v $PWD:/tmp/working -w=/tmp/working --rm -it kaggle/python python "$@"
}

ikpython() {
  docker run -v $PWD:/tmp/working -w=/tmp/working --rm -it kaggle/python ipython
}

kjupyter() {
  (sleep 3 && open "http://$(docker-machine ip docker2):8888")&
  docker run -v $PWD:/tmp/working -w=/tmp/working -p 8888:8888 --rm -it kaggle/python jupyter notebook --no-browser --ip="*" --notebook-dir=/tmp/working
}
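After reloading your shell (source ~/.bashrc), usage looks like this; the script name is just a placeholder. Note that open in kjupyter is the macOS command; on Linux, xdg-open is the usual substitute.

kpython my_analysis.py   # run a script with the container's Python (placeholder filename)
ikpython                 # interactive IPython session inside the container
kjupyter                 # Jupyter notebook server on port 8888, opened against docker2's IP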
Reference
<http://blog.kaggle.com/2016/02/05/how-to-get-started-with-data-science-in-containers/>