Updated with debugging techniques used when analyzing the logistic regression script from "Data Science from Scratch", chapter 16, in PyCharm 2017.3. See details in wiznote/diary/2017.12.27.

Original post:

Now I want to debug a script named filteringdata.py.

PyCharm Community Edition

Its editor (with the IdeaVim plugin), linter and debugger are the most powerful and detailed of the three candidates. But it is complicated to set up for debugging a remote script.

Add the following lines into ~/.ideavimrc:

set nocompatible
set clipboard=unnamedplus
nnoremap ; :
nnoremap : ;
vnoremap ; :
vnoremap : ;

The code navigation shortcut "Ctrl-Alt-Left/Right" of the default keymap "Default for XWin" (in File -> Settings) conflicts with the "Switch workspace" shortcut of Ubuntu Unity. So change the keymap to "Default for GNOME", whose code navigation shortcut is "Alt-Shift-Left/Right".

In a debugging session, once execution pauses at a breakpoint set beforehand, you can evaluate multi-line scripts in the window opened by the Evaluate Expression (Alt-F8) button in the Debugger window.

Editor Setup

Set the editor font size: [Font -> Primary font -> Size: 16]

Set the editor background to pure black: [General -> Text -> Default text], click the "Background" color icon and set R, G, B to 0, then click the "Foreground" color icon and set R, G, B to 255.

Set the right margin: [File -> Settings -> Editor -> Code Style -> General: Right margin (columns)], and set its value to 80 instead of the default 120.

Jump between editor and terminal: change the shortcut to Alt-K (default: Alt-F12) in [File -> Settings -> Keymap]: search for "terminal", double-click "Terminal", and choose "Add Keyboard Shortcut".

Some Shortcuts

Toggle Project View: Alt-1

Split window: Ctrl-Shift-A, input "split", select "Split vertical/horizontal"

Jump between files: Ctrl-Shift-N

Jump between editor tabs and split window: Ctrl-Tab

Full screen: [View -> Enter Full Screen]

Clipboard history: Ctrl-Shift-V

Jump to previous/next function: Alt-Up/Down

Discussion

You can switch easily between the "Console" and "Debugger" panels. The Console panel shows stdout and stderr in different colors, which is particularly useful when debugging programs that write to both streams.

Meanwhile, you can evaluate an expression at runtime in the "Watches" window.
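To see the two colors in the Console panel, it helps to have a script that writes to both streams. A minimal sketch (the function name and the sample values are made up for illustration):

```python
import sys


def report(values):
    """Sum the non-negative values; complain about the rest on stderr."""
    total = 0
    for v in values:
        if v < 0:
            # Diagnostics go to stderr, which the Console panel colors differently.
            print(f"warning: skipping negative value {v}", file=sys.stderr)
            continue
        total += v
    # Normal results go to stdout.
    print(f"total = {total}")
    return total


result = report([3, -1, 4])
```

Setting a breakpoint inside the loop and adding total to the Watches window is a quick way to try both features at once.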

pudb

Compared with ipdb, pudb does not require adding a stub to the source script.

Installation: conda install -c conda-forge pudb. Note: installing with conda instead of pip works better inside a conda environment.

Debug Python script: pudb filteringdata.py, or with additional arguments: pudb uploadES.py fairs.json production Fair.
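Arguments placed after the script name reach the script as its own command-line arguments. How uploadES.py reads them is not shown here, so the parser below is only a hypothetical sketch with invented argument names:

```python
import argparse


def parse_args(argv):
    # Hypothetical interface: three positional arguments, names invented here.
    parser = argparse.ArgumentParser(description="Upload documents (sketch)")
    parser.add_argument("source", help="input JSON file")
    parser.add_argument("environment", help="target environment name")
    parser.add_argument("doc_type", help="document type")
    return parser.parse_args(argv)


# In a real script this would be parse_args(sys.argv[1:]).
args = parse_args(["fairs.json", "production", "Fair"])
print(args.source, args.environment, args.doc_type)
```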

Use ? to list all available commands.

Focus the code window with C (it is the window focused by default once you are in pudb). Use n to step over, s to step into, f to finish the current function (like step out), c to continue, and o to see the console output (very useful!).

Use m (then the module name) to open a module. This is useful when you want to set a breakpoint in another module. The module name of the main script is __main__.

Focus the variable window with V. In this window, use w to toggle line wrap, [/] to grow/shrink the relative size of the window, =/- to grow/shrink the sidebar, and \ to expand/collapse a variable's value.

Toggle focus on command line with 'Ctrl-x'.

To resize the console window, press 'Ctrl-x' to focus it, then press the Right arrow key to focus < Clear > at the bottom-right corner, then use the =, -, + and _ keys to resize the window.

ipdb

without stubs

In the IPython REPL, debug any function on the spot with %debug -b mypackage.py:23 myfunc(inp). Here the -b option adds a breakpoint at the place where you want to stop. Note that the command after %debug must be a function call. You can't write it as %debug res = myfunc(inp) (which is a statement).

If you just want to activate the debugger AFTER an exception has fired, without having to type ‘%pdb on’ and rerun your code, run the %debug magic with no arguments right after the exception. By contrast, %debug myfunc(inp) reruns the call under the debugger from the start.

Debug a script mymodule.py with %run -d -b26 mymodule.py. Here -d means debug, and -b26 adds a breakpoint at line 26.

Activate the debugger automatically whenever an exception fires:

%pdb on
run mymodule.py

To add a breakpoint within the ipdb session, run b 141 (141 is a line number in the current script) or b myscript.py:141. Use b with no arguments to list all breakpoints. To add a conditional breakpoint, for example one that stops at line 40 only when j equals 50, run b 40, j == 50. See "Become a pdb power-user" for more details.

To clear breakpoint #3, run cl 3 in the ipdb REPL. Clear all breakpoints with cl or clear.

If your app script imports other libraries, you can add breakpoints in the imported library files with %run -d -b mylib.py:26 my_app.py

There's no need to add a set_trace() call in this mode.
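A conditional breakpoint pays off inside long loops. The script below is a made-up practice target, not part of the original examples: set the breakpoint on the line marked in the comment, using whatever line number it has in your saved file, with a condition such as j == 50.

```python
def running_total(n):
    """Return the sum 0 + 1 + ... + (n - 1), one step at a time."""
    total = 0
    for j in range(n):
        total += j  # practice target: b <this line number>, j == 50
    return total


result = running_total(100)
print(result)
```

With the condition in place, c runs straight through the first 50 iterations and stops only when j is 50.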

with stubs

There are three steps to debugging a Python script with the ipdb module: install it, insert stubs in the source code, and debug.

First, install it with sudo pip install ipdb (in Anaconda this is installed by default).

Then add from ipdb import set_trace at the top of the script, and add set_trace() at the first place where you want the debugger to stop.

Finally, start a shell and run:

$ ipython
...
IPython 4.0.1 ...
...
In [1]: %run filteringdata.py
> /home/leo/docs/playground/pg2dm-python/ch2/filteringdata.py(53)recommend()
     52     # first find nearest neighbor
---> 53     nearest = computeNearestNeighbor(username, users)[0][1]
     54

ipdb> h
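The stub step can be sketched as follows. The function and the DEBUG_STUB environment variable are invented for illustration, and the import is done lazily (a variation on the top-of-file import described above) so the script still runs normally when the switch is off or ipdb is not installed.

```python
import os

# Hypothetical switch: run with DEBUG_STUB=1 to drop into ipdb.
DEBUG = os.environ.get("DEBUG_STUB") == "1"


def nearest(target, candidates):
    """Return the candidate closest to target."""
    if DEBUG:
        from ipdb import set_trace  # the stub: import ...
        set_trace()                 # ... and stop here
    return min(candidates, key=lambda c: abs(c - target))


result = nearest(5, [1, 4, 9])
print(result)
```

Guarding the stub behind a switch also makes it harder to leave a stray set_trace() active in code that other people run.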

debug test case

To debug a unit test written with unittest, take for example the following test in the file test_app.py:

import logging
from unittest import TestCase
import pandas as pd
from corr_calc import train_thresholds

class TrainerTest(TestCase):
    def setUp(self):
        self.raw = pd.read_csv('ycz6502.csv', usecols=[0, 4, 6],
                               index_col=0).dropna()

    def test_trainer_03(self):
        logging.info('Thresholds between statuses:\n%s' % train_thresholds(self.raw, 0.3))

Here the core functionality is implemented in the function train_thresholds in the file corr_calc.py. Start the debugging session in IPython with:

from unittest import TestCase, defaultTestLoader, TextTestRunner
%load -r 1:153 test_app.py
test_trainer = defaultTestLoader.loadTestsFromName(
        'test_app.TrainerTest.test_trainer_08')
%debug -b corr_calc.py:138 TextTestRunner(verbosity=1).run(test_trainer)

Or add a start-up block like the following to test_app.py:

if __name__ == '__main__':
    test_trainer = defaultTestLoader.loadTestsFromName(
            'test_app.TrainerTest.test_trainer_08')
    TextTestRunner(verbosity=1).run(test_trainer)

Then, in the IPython console, debug with %run -d -b corr_calc.py:130 test_app.py
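Stripped of the project-specific parts, running one test method on its own looks roughly like this. The test class and the function under test are invented stand-ins for TrainerTest and train_thresholds, and the suite is built by instantiating the test case directly instead of calling loadTestsFromName, so the sketch does not depend on a module name.

```python
import unittest


def threshold(values, fraction):
    """Stand-in for the function under test: a simple cut-off value."""
    ordered = sorted(values)
    return ordered[int(len(ordered) * fraction)]


class ThresholdTest(unittest.TestCase):
    def setUp(self):
        self.raw = [5, 1, 4, 2, 3]

    def test_threshold(self):
        self.assertEqual(threshold(self.raw, 0.5), 3)


# Build a suite holding just one test method and run it.
suite = unittest.TestSuite([ThresholdTest("test_threshold")])
outcome = unittest.TextTestRunner(verbosity=1).run(suite)
```

In IPython, the last line is the kind of call you would hand to %debug together with a -b breakpoint inside the function under test.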

Watch variables

In the ipdb REPL, watch variables of interest with the display command: use display var1, display var2, etc. to add variables, display with no arguments to list all watched variables, and undisplay to remove watched variables.

Now, if the watched variables change during c or n, their new values are displayed in the console.

Use alias to shorten long commands. For example, after alias ds display, ds will list all watched variables. Use unalias to remove the alias.
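Variables that change on every iteration are good candidates for display. A made-up practice target: in the binary search below, display lo and display hi show the search window shrinking as you step with n.

```python
def binary_search(items, wanted):
    """Return the index of wanted in the sorted list items, or -1."""
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if items[mid] == wanted:
            return mid
        if items[mid] < wanted:
            lo = mid + 1  # window shrinks from the left
        else:
            hi = mid - 1  # window shrinks from the right
    return -1


index = binary_search([2, 3, 5, 7, 11, 13], 11)
print(index)
```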

Frequently used commands include:

display: add/list watched variables;
a: print the arguments of the current function;
b: set or list breakpoints (use cl to clear breakpoints);
c: continue to the next breakpoint;
n: next (step over);
s: step (step into);
p & pp: evaluate and print the value of an expression;
pp locals(): pretty-print all local variables;
w: print the stack, useful with nested function calls;
q: quit.

Use h a to see the help information about command a.

Watching with display in ipdb is more convenient than the watch window in pudb or PyCharm, because in the REPL you can see a clear history of how the variables of interest changed.

When debugging recursive functions, you have to use the "without stubs" style, because a set_trace() call in the body of the debugged function clears the information about higher-level function calls in the stack.
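A small recursive function to try this on (invented for illustration): start it under the debugger from the IPython prompt with a breakpoint on the recursive line, then use w at each stop to see the growing chain of calls.

```python
def depth_sum(nested):
    """Sum all numbers in an arbitrarily nested list."""
    total = 0
    for item in nested:
        if isinstance(item, list):
            total += depth_sum(item)  # recursive call: a good breakpoint line
        else:
            total += item
    return total


result = depth_sum([1, [2, [3, 4]], 5])
print(result)
```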
