Diff and Merge Tools for ipynb Files
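For data-analysis and presentation documents in Python, the current solutions fall roughly into two categories:
The first uses an Rmd-like format. Its advantages are good readability and git friendliness. Its drawbacks are weaker interactivity and, because no output is stored, the file cannot serve as explanatory documentation on its own (the way the md file produced by converting an Rmd document with knitr can). Examples: ipymd uses markdown instead of Jupyter notebook's ipynb format as the source storage format, and nbstripout removes the outputs from ipynb files to keep the text clean.
The second works on ipynb files directly to solve the diff and merge problem. Its advantage is that the notebook's interactivity and displayed results are preserved. Its drawback is that output data and images are mixed into the repository (in particular into the source files themselves, unlike Rmd, which puts output in separate files), so it is less "pure". An example is nbdime. I installed it this morning with the commands below, and first impressions are good: once git integration is enabled, git diff is handled by nbdime (configured in ~/.gitconfig), so the workflow does not change, and the comparison of both inputs and outputs is fairly reliable.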
. activate anaconda
conda install -c conda-forge nbdime
nbdime config-git --enable --global
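To make the two approaches concrete, here is a minimal sketch using only the Python standard library. It assumes the nbformat 4 JSON layout (a top-level cells list whose code cells carry outputs and execution_count). The helpers strip_outputs and diff_sources are hypothetical names for illustration, not part of nbstripout or nbdime, and the real tools handle far more cases.

```python
import copy
import difflib


def strip_outputs(notebook):
    """Return a copy of a notebook dict with code-cell outputs removed.

    Illustrates the first approach (output stripping) in spirit only.
    """
    cleaned = copy.deepcopy(notebook)
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


def cell_source(cell):
    """A cell's source may be a string or a list of strings; return a string."""
    source = cell.get("source", "")
    return "".join(source) if isinstance(source, list) else source


def diff_sources(old_nb, new_nb):
    """Line diff over cell inputs only, ignoring outputs entirely."""
    old_lines = [ln for c in old_nb["cells"] for ln in cell_source(c).splitlines()]
    new_lines = [ln for c in new_nb["cells"] for ln in cell_source(c).splitlines()]
    return list(difflib.unified_diff(old_lines, new_lines, "old", "new", lineterm=""))


# A tiny hand-written notebook (illustrative data, not from a real file).
old = {
    "nbformat": 4, "nbformat_minor": 2, "metadata": {},
    "cells": [{
        "cell_type": "code", "metadata": {},
        "source": ["x = 1\n", "print(x)"],
        "outputs": [{"output_type": "stream", "name": "stdout", "text": ["1\n"]}],
        "execution_count": 1,
    }],
}
new = copy.deepcopy(old)
new["cells"][0]["source"] = ["x = 2\n", "print(x)"]

stripped = strip_outputs(old)
changes = diff_sources(old, new)
print("\n".join(changes))
```

The diff only sees the changed input line, which is roughly why a notebook-aware tool gives cleaner results than a plain text diff of the raw JSON.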
Jupyterhub
Verified on Ubuntu 16.04.
To use conda environments as Jupyter notebook kernels, you need nb_conda.
To install and start the server, run the following commands in conda's root env:
conda install jupyter
conda install nb_conda
conda install -c conda-forge jupyterhub
conda install notebook
jupyterhub
Access
Note:
The following steps should be unnecessary. In a conda env install nb_conda_kernels:
. activate <your-env>
conda install -c conda-forge nb_conda_kernels
conda install ipykernel
Run service as a daemon
Use the following script to make jupyterhub a systemd service, which starts automatically at boot and restarts automatically when the service goes down:
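# Write the file through tee: with "sudo cat ... > file" the redirection runs as the normal user and fails.
sudo tee /etc/systemd/system/jupyterhub.service > /dev/null << 'EOF'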
[Unit]
Description=Jupyterhub
After=syslog.target
After=network.target
[Service]
Type=simple
User=leo
Group=leo
WorkingDirectory=/home/leo/temp
ExecStart=/bin/bash -c 'PATH=/home/leo/apps/miniconda3/bin:$PATH /home/leo/apps/miniconda3/bin/jupyterhub --port 8282'
Restart=always
Environment=USER=leo HOME=/home/leo
[Install]
WantedBy=multi-user.target
EOF
sudo systemctl enable jupyterhub
sudo systemctl start jupyterhub
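The unit file hard-codes a few values that differ per machine: the user, the conda bin directory, the working directory, and the port. The sketch below renders the same unit from those values. render_unit is a hypothetical helper, and the template simply mirrors the unit shown above; it is not an official jupyterhub tool.

```python
UNIT_TEMPLATE = """\
[Unit]
Description=Jupyterhub
After=syslog.target
After=network.target

[Service]
Type=simple
User={user}
Group={user}
WorkingDirectory={workdir}
ExecStart=/bin/bash -c 'PATH={conda_bin}:$PATH {conda_bin}/jupyterhub --port {port}'
Restart=always
Environment=USER={user} HOME={home}

[Install]
WantedBy=multi-user.target
"""


def render_unit(user, conda_bin, port, workdir=None, home=None):
    """Fill the template; home defaults to /home/<user> (an assumption)."""
    home = home or "/home/" + user
    workdir = workdir or home
    return UNIT_TEMPLATE.format(
        user=user, conda_bin=conda_bin, port=port, workdir=workdir, home=home
    )


unit = render_unit(
    user="leo",
    conda_bin="/home/leo/apps/miniconda3/bin",
    port=8282,
    workdir="/home/leo/temp",
)
print(unit)
```

The printed text can be reviewed and then written to /etc/systemd/system/jupyterhub.service with root privileges.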
Note 1:
Watch the log with tail -f /var/log/syslog or sudo journalctl -u jupyterhub.
See the service status with sudo systemctl status jupyterhub.service.
Note 2:
If you modified the jupyterhub.service file manually, run sudo systemctl daemon-reload to reload it, then sudo systemctl restart jupyterhub.service to restart the service with the new settings.
Note 3:
jupyterhub depends on configurable-http-proxy. If its directory /home/leo/apps/miniconda3/bin is not added to PATH in ExecStart, the jupyterhub service will fail to start.
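A quick way to check this before starting the service is to look up both executables on exactly the PATH the unit will use. The sketch below relies on shutil.which from the standard library; find_on_path is a hypothetical helper name, and the directory is the one used in this note.

```python
import os
import shutil


def find_on_path(command, path_dirs):
    """Return the full path of command if it is executable in one of path_dirs, else None."""
    return shutil.which(command, path=os.pathsep.join(path_dirs))


conda_bin = "/home/leo/apps/miniconda3/bin"
for tool in ("jupyterhub", "configurable-http-proxy"):
    location = find_on_path(tool, [conda_bin])
    print(tool, "->", location or "NOT FOUND on this PATH")
```

If configurable-http-proxy is reported as not found, the service would most likely fail in the way described in Note 3.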
Written in 2016:
Start the Jupyter notebook server with jupyter notebook. It opens a new window in the existing browser session at the URL "http://localhost:8888/tree".
Click [New -> Python3] at the right side of the page to create a new notebook session. In the new page, use [File -> Rename] to give it a name.
When you click a notebook file *.ipynb (here chap01ex.ipynb) in the browser, it opens in a new tab at the URL "http://localhost:8888/notebooks/chap01ex.ipynb".
Jupyter notebook supports vi-style key shortcuts, which conflict with the Chrome extension Vimium. So click the Vimium icon at the right side of the address bar and add a rule:
Patterns: https?://localhost:8888/notebooks/*
Leave the "Keys" textbox blank, which disables all Vimium keys on pages matching this pattern.
For Firefox, in [Tools -> Add-ons -> Extensions -> VimFx -> Blacklist], add http://localhost:8888/* to disable VimFx key shortcuts on Jupyter web pages.
Now you can use jupyter key shortcuts freely.
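To see which pages the Vimium rule above covers, here is a small sketch. It assumes the pattern is treated as a regular expression in which * matches any run of characters and the whole URL must match; that reading of Vimium's exclusion rules is an assumption, and pattern_to_regex is a hypothetical helper.

```python
import re


def pattern_to_regex(pattern):
    # Assumption: '*' is a wildcard, everything else is regex syntax,
    # so 'https?' matches both http and https.
    return re.compile("^" + pattern.replace("*", ".*") + "$")


rule = pattern_to_regex("https?://localhost:8888/notebooks/*")

for url in (
    "http://localhost:8888/notebooks/chap01ex.ipynb",
    "http://localhost:8888/tree",
):
    print(url, "->", "Vimium disabled" if rule.match(url) else "Vimium active")
```

Under that assumption the rule disables Vimium inside notebooks but leaves it active on the file tree page.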
Press h to list all available shortcuts. Note that the shortcuts listed there are shown as capital letters, but you should actually press the corresponding lowercase letter.
Use j/k to select the active cell, <Enter> to edit it, c to copy, v to paste, x to delete it, and a/b to insert a cell above/below the active cell.
Use m to turn a cell into a markdown cell and y to turn it into a code cell.
For markdown cell, use
Use Alt-Enter to evaluate the current cell and insert a blank cell below.
Use s to save the current notebook and set up a "checkpoint"; use [File -> Revert to Checkpoint] to discard all changes made after that checkpoint.
Use o to toggle the output of the current cell and Shift-o to toggle output scrolling.
Use [File -> Download as -> Python(.py)] to create a runnable Python script from the current jupyter notebook.
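Conceptually, the export keeps the code cells and drops the outputs. The sketch below does a simplified version of that with plain Python, assuming the nbformat 4 cell layout. notebook_to_script is a hypothetical helper, not the exporter Jupyter uses, and it does not handle IPython magics such as lines starting with % or !.

```python
def notebook_to_script(notebook):
    """Concatenate code cells; markdown cells become comment lines."""
    chunks = []
    for cell in notebook.get("cells", []):
        source = cell.get("source", "")
        text = "".join(source) if isinstance(source, list) else source
        if cell.get("cell_type") == "code":
            chunks.append(text)
        elif cell.get("cell_type") == "markdown":
            chunks.append("\n".join("# " + line for line in text.splitlines()))
    return "\n\n".join(chunks) + "\n"


# Hand-written example notebook (illustrative data).
example = {
    "nbformat": 4, "nbformat_minor": 2, "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["Chapter 1"]},
        {"cell_type": "code", "metadata": {}, "execution_count": 1,
         "source": ["x = 1\n", "print(x)"],
         "outputs": [{"output_type": "stream", "name": "stdout", "text": ["1\n"]}]},
    ],
}

script = notebook_to_script(example)
print(script)
```

For a real file, the notebook dict can be loaded with json.load from the .ipynb file before calling the helper.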
Setup Jupyter Server
Set up Anaconda on a CentOS server (IP: 192.168.12.233):
bash Anaconda3-4.3.1-Linux-x86_64.sh # target path: $HOME/apps/anaconda3
export PATH=$HOME/apps/anaconda3/bin:$PATH
jupyter notebook --generate-config
vi ~/.jupyter/jupyter_notebook_config.py # see notes below to customize it
jupyter notebook
Generate password:
export PATH=$HOME/apps/anaconda3/bin:$PATH
ipython
In [1]: from notebook.auth import passwd
In [2]: passwd()
Out[2]: 'sha1:xxx:xxx'
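The value of Out[2] is the 'password' entry of the jupyter config file. Note that the u before the string must be added, or the password doesn't work. In Python 3 the u prefix has no effect (u'...' and '...' are the same type), so this can only matter under Python 2.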
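The Out[2] value has three colon-separated fields: algorithm, salt, and hex digest. The sketch below reproduces that layout with the standard library. It assumes the digest is the hash of the passphrase bytes followed by the salt, which should be treated as an assumption about the notebook package rather than a documented guarantee. make_hash and check_hash are hypothetical names, not the notebook.auth API.

```python
import hashlib
import hmac
import secrets


def make_hash(passphrase, algorithm="sha1", salt=None):
    """Build an 'algorithm:salt:digest' string (layout as in Out[2])."""
    salt = salt or secrets.token_hex(6)  # 12 hex characters
    digest = hashlib.new(algorithm, (passphrase + salt).encode("utf-8")).hexdigest()
    return ":".join((algorithm, salt, digest))


def check_hash(hashed, passphrase):
    """Return True if passphrase matches the stored 'algorithm:salt:digest' value."""
    try:
        algorithm, salt, digest = hashed.split(":", 2)
        expected = hashlib.new(algorithm, (passphrase + salt).encode("utf-8")).hexdigest()
    except ValueError:  # wrong number of fields or unknown algorithm
        return False
    return hmac.compare_digest(expected, digest)


stored = make_hash("my secret passphrase")
print(stored)
print(check_hash(stored, "my secret passphrase"))
print(check_hash(stored, "a wrong guess"))
```

Because the salt is random, the stored string differs on every run even for the same passphrase. For the real config file, use the value produced by passwd() itself.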
Setup server properties (file ~/.jupyter/jupyter_notebook_config.py):
c.NotebookApp.ip = '*'
c.NotebookApp.open_browser = False
c.NotebookApp.password = u'sha1:39bfa5b30456:33907b4fb0ecdaa77e772399565096d85bd7dd7d'
c.NotebookApp.port = 7654
If the server is used only by yourself, add c.NotebookApp.password_required = False to the config file to log in with a token (printed in the server console).
See Running a notebook server for details.
Use Ctrl-z, bg, and disown to turn it into a background process detached from the shell. There's no elegant way to reattach to this process: you have to find its PID with pgrep jupyter, kill it, and then start the server again.
Now on the client, open 'http://192.168.12.233:7654/' in browser. Type the password and login. Click 'New -> Python3' on the right side to create a new notebook.
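If the page does not load, it helps to first check from the client whether the port is reachable at all (firewall rules on the server are a common cause). A minimal sketch with the standard library follows; is_port_open is a hypothetical helper, and the host and port are the ones used in this note.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


if __name__ == "__main__":
    print(is_port_open("192.168.12.233", 7654))
```

A result of False means the problem is at the network level (server not running, wrong IP or port, or a firewall), not in the browser or the password.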
Custom Notebook Color Themes
Customize the notebook color theme with dunovank/jupyter-themes:
conda install jupyterthemes
jt -h # print help info
jt -l # list available themes
jt -t monokai # use theme "monokai"
jt -r # reset to default theme
It installs the color theme files into the folder $HOME/.jupyter/custom, so the color theme works regardless of whether the Jupyter server and this extension are in the same conda env.
Ref:
How to change the theme in Jupyter Notebook?