Jupyter Docker Stacks with a custom user

Jupyter lets you set a custom user instead of jovyan, which is the default for all containers of the Jupyter Docker Stacks. You need to change this user, or its UID and GID, to get the permissions right when you mount a volume from the host into the Jupyter container. The following steps are required:

  1. Create an unprivileged user and an associated group on the host. Here we call both the user and the group docker_worker.
  2. Add your host user to the group. This gives you permission to read and modify the files on the host as well, which is useful if your working directory on the host is under source code control (e.g. git).
  3. Launch the container with the settings that change the user inside the container.

Note that during launch the container needs root privileges in order to change the ownership settings in the mounted host volume and inside the container. Once the permissions have been changed, the process drops its root privileges and runs as your new user. Make sure to secure your Docker service, because the permissions inside the container also apply to the host.

Prepare an unprivileged user on the host

1. sudo groupadd -g 1020 docker_worker
2. sudo useradd -s /bin/false -u 1010 -g 1020 docker_worker
3. Add your user to the group: sudo usermod -a -G docker_worker stefan
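
Before launching the container, it can help to confirm that the host account exists with the IDs you expect. The following is a minimal sketch using Python's standard pwd and grp modules. The helper name describe_account is hypothetical, and the expected IDs 1010 and 1020 are the ones used in the docker-compose file in this post.

```python
import grp
import pwd


def describe_account(user_name, group_name):
    """Return the UID, primary GID and group GID, or None if either is missing.

    describe_account is a hypothetical helper for illustration only.
    """
    try:
        user = pwd.getpwnam(user_name)
        group = grp.getgrnam(group_name)
    except KeyError:
        # getpwnam/getgrnam raise KeyError when the name is unknown.
        return None
    return {
        "uid": user.pw_uid,
        "primary_gid": user.pw_gid,
        "group_gid": group.gr_gid,
        "group_members": list(group.gr_mem),
    }


if __name__ == "__main__":
    info = describe_account("docker_worker", "docker_worker")
    if info is None:
        print("docker_worker user or group not found on this host")
    else:
        # For the setup in this post we would expect uid 1010 and gid 1020.
        print(info)
```

If your own host user was added to the group, its name should appear in group_members after you log in again.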

Docker-compose Caveats

Note that docker-compose accepts either an array or a dictionary for environment variables (docs). In the file below we use a dictionary, so YAML parses any quotes before the values reach the container. If you use an array instead and quote the values (for example CHOWN_HOME_OPTS='-R'), the quotes are passed along literally to the Jupyter start script. You would then see this error message:

/usr/local/bin/start-notebook.sh: ignoring /usr/local/bin/start-notebook.d/*
Set username to: docker_worker
Changing ownership of /home/docker_worker to 1010:1020
chown: invalid user: ‘'-R'’
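
The literal quotes in the chown error can be reproduced outside of Docker. The sketch below is an illustration under assumptions, not Compose's actual implementation: the hypothetical helper split_env_entries mimics how an array-style entry such as CHOWN_HOME_OPTS='-R' is split at the first equals sign, with the remainder kept verbatim, quotes included.

```python
def split_env_entries(entries):
    """Split KEY=VALUE strings at the first '=' and keep the value verbatim.

    Hypothetical helper: it only imitates the array-style behaviour
    and is not taken from docker-compose itself.
    """
    env = {}
    for entry in entries:
        key, _, value = entry.partition("=")
        env[key] = value  # no quote stripping happens here
    return env


quoted = split_env_entries(["NB_USER='docker_worker'", "CHOWN_HOME_OPTS='-R'"])
unquoted = split_env_entries(["NB_USER=docker_worker", "CHOWN_HOME_OPTS=-R"])

print(repr(quoted["CHOWN_HOME_OPTS"]))
print(repr(unquoted["CHOWN_HOME_OPTS"]))
```

With the quoted entries, the value that reaches the start script still contains the quote characters, which chown cannot interpret as the -R option.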

The docker-compose file

version: '2'
services:
    datascience-notebook:
        image: jupyter/base-notebook:latest
        volumes:
            - /tmp/jupyter_test_dir:/home/docker_worker/work            
        ports:
            - 8891:8888
        command: "start-notebook.sh"
        user: root
        environment:
          NB_USER: 'docker_worker'
          NB_UID: 1010
          NB_GID: 1020
          CHOWN_HOME: 'yes'
          CHOWN_HOME_OPTS: -R

Here you can see that we set the variables that cause the container to ditch jovyan in favor of docker_worker.

NB_USER: 'docker_worker'
NB_UID: 1010
NB_GID: 1020
CHOWN_HOME: 'yes'
CHOWN_HOME_OPTS: -R
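
Once the container is running, a quick way to confirm that the switch worked is to inspect the process identity from a notebook cell. This is a minimal sketch using only the standard library. The helper name current_identity is hypothetical, and the expected values (docker_worker, 1010, 1020) assume the compose file above.

```python
import os
import pwd


def current_identity():
    """Return (user name, UID, GID) of the current process.

    Hypothetical helper; the name is None if the UID has no passwd entry.
    """
    uid = os.getuid()
    try:
        name = pwd.getpwuid(uid).pw_name
    except KeyError:
        name = None
    return name, uid, os.getgid()


print(current_identity())
```

In the container from this post, the result should name docker_worker rather than jovyan. os.stat on the mounted work directory can be used in the same way to check file ownership.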

This makes it easy to keep Jupyter's working directory under version control. I also added the snippet to my GitHub Jupyter template.



Switching Kernels: Using Python 2.7 and Python 3.5 in Jupyter Notebooks

Jupyter Notebooks are a great way to work with Python interactively. Integrating Python code into documents is very useful for reports and for writing executable documentation of algorithms and functions. The text can be structured and exported in various formats. With Python's ever-increasing popularity, driven by the data science hype, more and more libraries are available. Although Python 3 is considered the future of Python, no consensus has yet been reached on the question of Python 2.7 versus Python 3.5. There are quite a few differences, and Python 3 is not backwards compatible, so the same code often cannot be executed with both versions without modification. When you install Jupyter Notebooks via Anaconda, Python 3 is recommended, but Python 2.7 packages also exist.
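
One well-known difference is integer division: 7 / 2 evaluates to 3 on Python 2.7 but to 3.5 on Python 3. The sketch below shows one way to write both operations so that they give the same result on either version. The function names are illustrative only.

```python
def true_half(n):
    """Divide by a float so both Python 2.7 and 3.x return a float."""
    return n / 2.0


def floor_half(n):
    """Use // so both versions perform floor division."""
    return n // 2


print(true_half(7), floor_half(7))
```

The print call itself also behaves differently: Python 3 prints the two values, whereas Python 2.7 without from __future__ import print_function treats the parentheses as a tuple and prints that.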

Because a large number of libraries have not yet been ported to Python 3, it can be useful to switch between the language versions within a Jupyter Notebook. The following example assumes that you already have both Python versions installed.

Installing a new Kernel

In Jupyter Notebooks, the kernel is responsible for executing Python code. When you install the Anaconda distribution for Python 3, this version also becomes the default for the notebooks. To enable Python 2.7 in your notebooks, you need to install a new kernel like this:

sudo python2 -m pip install --upgrade ipykernel
sudo python2 -m ipykernel install

Restart Jupyter to activate the new Python 2.7 kernel.

Switching Kernels

After restarting Jupyter, you can select the kernel, and thereby the Python version that runs your code, from the menu.
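
To check which interpreter the selected kernel actually uses, you can run a small cell like the following. It is a minimal sketch that relies only on the standard library and runs unchanged on Python 2.7 and Python 3. The helper name kernel_python_version is hypothetical.

```python
import platform
import sys


def kernel_python_version():
    """Return the running interpreter's version as 'major.minor'."""
    return "{}.{}".format(sys.version_info[0], sys.version_info[1])


# Also show the implementation and the interpreter path behind the kernel.
print(kernel_python_version(), platform.python_implementation(), sys.executable)
```

Running the cell again after switching kernels should report the other version and a different interpreter path.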

 

 
