Updated with debugging techniques used when analyzing the logistic regression script from "Data Science from Scratch", chapter 16, in PyCharm 2017.3. See details in wiznote/diary/2017.12.27.
Original post:
Now I want to debug a script named filteringdata.py.
PyCharm Community Edition
Its editor (with the IdeaVim plugin), linter and debugger are the most powerful and detailed of the 3 candidates, but debugging a remote script with it is complicated.
Add the following lines into ~/.ideavimrc:
set nocompatible
set clipboard=unnamedplus
nnoremap ; :
nnoremap : ;
vnoremap ; :
vnoremap : ;
The code navigation shortcut "Ctrl-Alt-Left/Right" of the default keymap "Default for XWin" (in File -> Settings) conflicts with the "Switch workspace" shortcut of Ubuntu Unity. So change the keymap to "Default for GNOME", whose code navigation key is "Alt-Shift-Left/Right".
In a debugging session, once paused at a breakpoint set beforehand, you can evaluate multi-line scripts in the window opened by the Evaluate Expression (Alt-F8) button in the Debugger window.
Editor Setup
Set the editor font size: [Font -> Primary font -> Size: 16]
Set the editor background to pure black: [General -> Text -> Default text], click the "Background" color icon and set R,G,B to 0, then click the "Foreground" color icon and set R,G,B to 255.
Set the right margin: [File -> Settings -> Editor -> Code Style -> General: Right margin (columns)], set its value to 80 instead of the default 120.
Jump between editor and terminal: modify it to Alt-K (default: Alt-F12) [File -> Settings -> Keymap]: search "terminal", double click "Terminal", choose "Add Keyboard Shortcut".
Some Shortcuts
Toggle Project View: Alt-1
Split window: Ctrl-Shift-A, input "split", select "Split vertical/horizontal"
Jump between files: Ctrl-Shift-N
Jump between editor tabs and split window: Ctrl-Tab
Full screen: [View -> Enter Full Screen]
Clipboard history: Ctrl-Shift-v
Jump to previous/next function: Alt-Up/Down
Discussion
You can switch easily between "Console" and "Debugger" panels. The Console panel distinguishes stdout and stderr with different colors, which is particularly useful when debugging some programs with both stdout and stderr outputs.
You can also evaluate an expression at runtime in the "Watches" window.
pudb
Compared with ipdb, pudb doesn't require adding a stub to the source script.
Installation: conda install -c conda-forge pudb.
Note: installing with conda instead of pip works better inside a conda environment.
Debug a Python script: pudb filteringdata.py, or with additional arguments: pudb uploadES.py fairs.json production Fair.
Use ? to list all available commands.
Focus the code window with C (it is the window focused by default once you are in pudb). There, use n to step over, s to step into, f to finish the current function (like step out), c to continue, and o to see the console output (very useful!).
Use m (then the module name) to open a module; this is useful when you want to set a breakpoint in another module. The module name of the main script is __main__.
Focus the variable window with V. In this window, use w to toggle line wrap, [ / ] to grow/shrink the relative size of the window, = / - to grow/shrink the sidebar, and \ to expand/collapse a variable's result.
Toggle focus on the command line with Ctrl-x.
To resize the console window, press Ctrl-x to focus it, then press the Right arrow key to focus < Clear > at the bottom-right corner, then use the =, -, + and _ keys to resize the window.
ipdb
without stubs
In the IPython REPL, debug any function on the spot with %debug -b mypackage.py:23 myfunc(inp). Here the -b option adds a breakpoint at the place where you want to stop.
Note that the code after %debug can only be a function call. You can't write it as %debug res = myfunc(inp) (which is a statement).
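If you just want to activate the debugger AFTER an exception has fired, without having to type '%pdb on' and rerun your code, use the %debug magic with no arguments (post-mortem mode). With an argument, as in %debug myfunc(inp), the call is instead run under the debugger from its start.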
Debug a script mymodule.py with %run -d -b 26 mymodule.py. Here -d means debug, and -b 26 adds a breakpoint at line 26.
Activate the debugger AFTER an exception has fired:
%pdb on
run mymodule.py
If you want to add a breakpoint in the ipdb session, run b 141 (141 is a line number in the current script) or b myscript.py:141.
Use b with no arguments to list all breakpoints.
To add a conditional breakpoint, for example one that only stops at line 40 when j equals 50, run b 40, j == 50.
See "Become a pdb power-user" for more details.
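The effect of a conditional breakpoint can be reproduced without an interactive session. The following is a minimal sketch, assuming a hypothetical three-line loop script written to a temporary file. It uses the standard library's bdb module (the base class under pdb, which ipdb builds on) rather than ipdb itself; the class and function names are made up for illustration.

```python
import bdb
import os
import tempfile

# Hypothetical script to debug: the loop body sits on line 3.
SOURCE = "total = 0\nfor j in range(100):\n    total += j\n"


class ConditionalBreakRecorder(bdb.Bdb):
    """Non-interactive debugger that records state at each breakpoint stop."""

    def __init__(self):
        super().__init__()
        self.hits = []

    def user_line(self, frame):
        # bdb also stops once on the first executed line (step mode),
        # so keep only stops on lines that actually carry a breakpoint.
        if frame.f_lineno in self.get_file_breaks(frame.f_code.co_filename):
            self.hits.append(
                (frame.f_lineno, frame.f_locals["j"], frame.f_locals["total"])
            )
        self.set_continue()  # equivalent of typing "c"


def run_with_conditional_break(source, lineno, condition):
    """Run source with a breakpoint at lineno that fires only if condition holds."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "loop_demo.py")
        with open(path, "w") as handle:
            handle.write(source)
        debugger = ConditionalBreakRecorder()
        # Same meaning as "b 3, j == 50" typed at the debugger prompt.
        error = debugger.set_break(path, lineno, cond=condition)
        if error:
            raise ValueError(error)
        debugger.run(compile(source, path, "exec"), {"__name__": "__main__"})
        debugger.clear_all_breaks()
        return debugger.hits


hits = run_with_conditional_break(SOURCE, 3, "j == 50")
print(hits)
```

The loop body runs 100 times, but the recorder stops only on the iteration where the condition is true, which is what makes conditional breakpoints practical inside long loops.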
To clear breakpoint #3, run cl 3 in the ipdb REPL.
Clear all breakpoints with cl or clear.
If your app script imports other libraries, you can add breakpoints in the imported library files with %run -d -b mylib.py:26 my_app.py.
There's no need to add a set_trace() call in this mode.
with stubs
Debugging a Python script with the ipdb module takes 3 steps: install, insert stubs in the source code, and debug.
First install with sudo pip install ipdb (in Anaconda this is installed by default).
Then add from ipdb import set_trace at the top of the script, and add set_trace() at the first place you want the debugger to stop.
Finally start a shell and run:
$ ipython
...
IPython 4.0.1 ...
...
In [1]: %run filteringdata.py
> /home/leo/docs/playground/pg2dm-python/ch2/filteringdata.py(53)recommend()
     52     # first find nearest neighbor
---> 53     nearest = computeNearestNeighbor(username, users)[0][1]
     54
ipdb> h
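For reference, a stubbed script might look like the sketch below. This is not the original filteringdata.py: compute_nearest_neighbor is a hypothetical stand-in using Manhattan distance, the sample ratings are made up, and the DEBUG switch is an assumption added so the file also runs unattended. The fallback to the standard library's pdb is likewise an assumption for machines without ipdb.

```python
DEBUG = False  # hypothetical switch; set to True to stop in the debugger


def compute_nearest_neighbor(username, users):
    """Hypothetical stand-in: (distance, name) pairs sorted by Manhattan distance."""
    distances = []
    for other, ratings in users.items():
        if other == username:
            continue
        # Compare only the items both users have rated.
        shared = set(ratings) & set(users[username])
        distance = sum(abs(ratings[k] - users[username][k]) for k in shared)
        distances.append((distance, other))
    return sorted(distances)


def recommend(username, users):
    if DEBUG:
        # The stub: import set_trace and call it where execution should pause.
        try:
            from ipdb import set_trace
        except ImportError:  # assumption: fall back to the stdlib debugger
            from pdb import set_trace
        set_trace()
    # first find nearest neighbor
    nearest = compute_nearest_neighbor(username, users)[0][1]
    return nearest


# Made-up sample data for illustration only.
users = {
    "a": {"x": 1, "y": 2},
    "b": {"x": 1, "y": 3},
    "c": {"x": 5, "y": 5},
}
print(recommend("a", users))
```

With DEBUG set to True, running the file through %run drops into the debugger prompt at the line following set_trace().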
debug test case
To debug a unit test written with unittest, for example the following test in file test_app.py:
class TrainerTest(TestCase):
    def setUp(self):
        self.raw = pd.read_csv('ycz6502.csv', usecols=[0, 4, 6],
                               index_col=0).dropna()

    def test_trainer_03(self):
        logging.info('Thresholds between statuses:\n%s' % train_thresholds(self.raw, 0.3))
where the core functionality is implemented in the function train_thresholds in file corr_calc.py, start the debugging process with:
from unittest import TestCase, defaultTestLoader, TextTestRunner
%load -r 1:153 test_app.py
test_trainer = defaultTestLoader.loadTestsFromName(
    'test_app.TrainerTest.test_trainer_08')
%debug -b corr_calc.py:138 TextTestRunner(verbosity=1).run(test_trainer)
Or add a launcher like the following to test_app.py:
if __name__ == '__main__':
    test_trainer = defaultTestLoader.loadTestsFromName(
        'test_app.TrainerTest.test_trainer_08')
    TextTestRunner(verbosity=1).run(test_trainer)
Then in the IPython console, debug with %run -d -b corr_calc.py:130 test_app.py.
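The idea of running exactly one test method, so that only its code path passes through the debugger, can be tried in a single file. The sketch below is self-contained and makes several assumptions: train_thresholds is a hypothetical stand-in (not the real function in corr_calc.py), the data is made up, and the suite is built from the class directly instead of with loadTestsFromName because everything lives in one file.

```python
import io
import unittest


def train_thresholds(values, ratio):
    """Hypothetical stand-in for the function under test: scale each value."""
    return [value * ratio for value in values]


class TrainerTest(unittest.TestCase):
    def setUp(self):
        self.raw = [10, 20, 30]  # made-up data instead of a CSV file

    def test_trainer_scaled(self):
        self.assertEqual(train_thresholds(self.raw, 0.5), [5.0, 10.0, 15.0])

    def test_trainer_empty(self):
        self.assertEqual(train_thresholds([], 0.5), [])


def run_single_test(case_class, method_name):
    """Build a suite holding one test method and run it, returning the result."""
    suite = unittest.TestSuite([case_class(method_name)])
    stream = io.StringIO()  # keep the runner's report out of the console
    return unittest.TextTestRunner(stream=stream, verbosity=1).run(suite)


result = run_single_test(TrainerTest, "test_trainer_scaled")
print(result.testsRun, result.wasSuccessful())
```

A call such as run_single_test(TrainerTest, "test_trainer_scaled") is then a single expression that can be handed to %debug together with a -b breakpoint, in the same way as the TextTestRunner call above.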
Watch variables
In the ipdb REPL, use the display command to watch variables of interest: use display var1, display var2, etc. to add variables, display to list all watched variables, and undisplay to remove watched variables.
Now if the watched variables change during c or n, they will be displayed at the console.
Use alias to shorten a long command. For example, after alias ds display, ds will list all watched variables. Use unalias to remove the alias.
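The behaviour of display (show a watched expression only when its value has changed since the last stop) can be modelled in a few lines. This is a toy sketch for intuition only, not how ipdb or pdb implement it; the WatchList class and the sample state dictionary are hypothetical.

```python
class WatchList:
    """Toy model of display/undisplay: report watched expressions that changed."""

    def __init__(self):
        self._last = {}  # expression -> value seen at the previous check

    def display(self, expression, namespace):
        """Start watching an expression, remembering its current value."""
        self._last[expression] = eval(expression, {}, namespace)

    def undisplay(self, expression):
        """Stop watching an expression."""
        self._last.pop(expression, None)

    def check(self, namespace):
        """Return {expression: (old, new)} for every watched value that changed."""
        changed = {}
        for expression, old in list(self._last.items()):
            new = eval(expression, {}, namespace)
            if new != old:
                changed[expression] = (old, new)
                self._last[expression] = new
        return changed


# Simulate three debugger stops over a made-up program state.
state = {"total": 0, "j": 0}
watches = WatchList()
watches.display("total", state)
watches.display("j * 2", state)

history = []
for j in range(1, 4):
    state["j"] = j
    if j == 2:
        state["total"] += 10
    history.append(watches.check(state))

for step in history:
    print(step)
```

Each entry of history plays the role of what the console would show after one n or c: only the expressions whose values moved are listed.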
Frequently used commands include:
display: add/list watched variables;
a: print args of current function;
b: set or list breakpoints; use cl to clear breakpoints;
c: continue to the next breakpoint;
n: next;
s: step;
p & pp: evaluate and print value of an expression;
pp locals(): pretty-print all local variables;
w: print the stack, useful during multiple function calls;
q: quit;
Use h a to see the help information about command a.
Watching with display in ipdb is more convenient than the watch window in pudb or PyCharm, because in the REPL you can see a clear history of how the variables of interest changed.
When debugging recursive functions, you have to use the without-stubs style, because a set_trace() call in the body of the debugged function will clear the information about higher-level function calls in the stack.
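To get a feel for what the stack of a recursive function looks like (the information the w command prints), the frames can also be counted from plain Python. This is a minimal sketch with a hypothetical factorial function; it uses the standard library's inspect module and says nothing about how set_trace() itself treats the stack.

```python
import inspect


def factorial(n, depths):
    """Hypothetical recursive function that records its own live frame count."""
    # Count the frames on the current call stack that belong to this function.
    live = sum(1 for info in inspect.stack() if info.function == "factorial")
    depths.append(live)
    if n <= 1:
        return 1
    return n * factorial(n - 1, depths)


depths = []
result = factorial(4, depths)
print(result, depths)
```

Every nested call adds one more frame, so the deepest call sees all of its callers; that chain of higher-level calls is what you want to keep visible while stepping through recursion.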