Debugging in Jupyter notebooks presents unique challenges compared to traditional integrated development environments. The interactive, cell-based execution model that makes notebooks powerful for exploration can also obscure bugs, create confusing state dependencies, and complicate systematic debugging. Many data scientists resort to scattered print statements and trial-and-error approaches that waste time and leave underlying issues unresolved. However, Jupyter provides sophisticated debugging capabilities that, when mastered, enable efficient problem diagnosis and resolution without sacrificing the notebook’s interactive advantages. This comprehensive guide reveals professional debugging strategies specifically designed for Jupyter notebooks, covering built-in debugging tools, systematic troubleshooting approaches, state management techniques, and advanced debugging patterns that transform frustrating bug hunts into methodical problem-solving sessions.
Understanding Common Notebook-Specific Bugs
Before diving into debugging techniques, understanding bug patterns unique to notebooks helps you recognize and prevent issues proactively. Notebook bugs often stem from characteristics that distinguish them from traditional scripts.
Execution Order Dependencies create the most insidious notebook bugs. Because cells can execute in any order, the notebook’s visual sequence might not reflect actual execution history. Consider this scenario:
# Cell 1
data = [1, 2, 3, 4, 5]
# Cell 2
data.append(6)
print(data) # [1, 2, 3, 4, 5, 6]
If you run Cell 1, then Cell 2, then Cell 2 again without rerunning Cell 1, data becomes [1, 2, 3, 4, 5, 6, 6]. The second execution of Cell 2 operates on modified data, producing unexpected results. Visual inspection of the notebook provides no clue about this state—you must check execution numbers in square brackets next to cells. Non-sequential numbers ([1], [5], [3], [7]) indicate out-of-order execution that might cause bugs.
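When the bracket numbers look suspicious, IPython's %history magic reconstructs exactly what ran and in what order:
%history -n          # List this session's inputs with their execution numbers
%history -n 1-10     # Or just a specific range of inputs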
Stale Variables and Hidden State occur when you delete or modify cells but their effects persist in the kernel’s memory. You might create a variable in a cell, use it elsewhere, then delete the original cell. The variable still exists in memory even though no visible cell shows where it came from. Later, you restart development on the notebook and encounter confusing references to undefined variables—except they’re defined in the hidden kernel state from previous sessions.
Mutable Object Mutations cause subtle bugs when multiple cells reference the same mutable object:
# Cell 1
original_data = {'values': [1, 2, 3]}
processed_data = original_data # This creates a reference, not a copy!
# Cell 2
processed_data['values'].append(4)
# Cell 3
print(original_data) # {'values': [1, 2, 3, 4]} - unexpectedly modified!
This reference-versus-copy issue confuses many developers. The solution requires explicit copying: processed_data = original_data.copy() for shallow copies or import copy; processed_data = copy.deepcopy(original_data) for nested structures.
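A minimal sketch of the difference: a shallow copy duplicates the outer dict but still shares the nested list, while deepcopy duplicates everything.
import copy

original = {'values': [1, 2, 3]}
shallow = original.copy()
deep = copy.deepcopy(original)

shallow['values'].append(4)  # The inner list is still shared with original
print(original['values'])    # [1, 2, 3, 4] - the shallow copy didn't protect it
print(deep['values'])        # [1, 2, 3] - the deep copy is fully independent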
Import and Dependency Issues arise when notebooks import modules that you modify during development. Python caches imported modules, so changes to imported code don’t appear in the notebook without kernel restart or explicit module reloading. This creates situations where you fix a bug in a module file but the notebook continues exhibiting the old behavior because it’s using the cached version.
The IPython Debugger: Your Primary Debugging Tool
Jupyter notebooks include a powerful built-in debugger accessed through IPython magic commands. This debugger provides traditional debugging capabilities—breakpoints, step-through execution, variable inspection—adapted for the notebook environment.
Basic Debugger Invocation uses the %debug magic command. When a cell raises an exception, immediately typing %debug in the next cell launches an interactive debugger at the point of failure:
# Cell with error
def calculate_average(numbers):
    total = sum(numbers)
    count = len(numbers)
    return total / count

result = calculate_average([])  # Raises ZeroDivisionError
# Next cell
%debug
This opens the debugger at the exact line that failed, letting you inspect variable values, examine the call stack, and understand what went wrong. The debugger provides a prompt where you can execute commands:
- p variable_name – Print variable value
- pp variable_name – Pretty-print variable value
- l – List source code around current line
- u – Move up the call stack
- d – Move down the call stack
- c – Continue execution
- q – Quit debugger
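For the calculate_average failure above, a session might look like this (count is 0 because the input list was empty):
ipdb> p numbers
[]
ipdb> p count
0
ipdb> u        # move up to the calling frame to see how [] got passed in
ipdb> q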
Setting Explicit Breakpoints with the breakpoint() function (Python 3.7+) or import pdb; pdb.set_trace() (earlier versions) pauses execution at specific points:
def process_customer_data(customers):
    results = []
    for customer in customers:
        breakpoint()  # Execution pauses here
        # Inspect customer, results, etc.
        score = calculate_risk_score(customer)
        results.append(score)
    return results
When execution reaches the breakpoint, the debugger activates, letting you step through the loop iteration-by-iteration, examining how variables change with each customer.
Step-Through Debugging Commands enable fine-grained execution control:
- n (next) – Execute current line, move to the next line
- s (step) – Step into function calls
- r (return) – Continue until the current function returns
- c (continue) – Continue execution until the next breakpoint
These commands let you watch your code execute line-by-line, observing exactly when and how values change—far more effective than guessing with print statements.
Conditional Breakpoints pause execution only when specific conditions occur:
def process_large_dataset(data):
    for i, row in enumerate(data):
        if i == 500 and row['status'] == 'error':
            breakpoint()  # Only pause at row 500 if status is error
        process_row(row)
This targeted debugging avoids manually stepping through hundreds of iterations to reach the problematic case.
Essential IPython Debugger Commands
Command          Action
------------------------------------------------------
p / pp <name>    Print / pretty-print a variable
l                List source around the current line
n                Execute the current line (next)
s                Step into a function call
r                Run until the current function returns
u / d            Move up / down the call stack
c                Continue to the next breakpoint
q                Quit the debugger
Advanced Magic Commands for Debugging
Beyond the basic debugger, IPython magic commands provide powerful tools for understanding code behavior, profiling performance, and diagnosing issues.
The %pdb Magic for Automatic Debugging automatically invokes the debugger whenever exceptions occur:
%pdb on  # Enable automatic debugging on exceptions

# Now any exception automatically launches the debugger
def buggy_function(data):
    return data[10]  # IndexError automatically enters debugger

buggy_function([1, 2, 3])
This saves typing %debug after every error during intensive debugging sessions. Disable with %pdb off when you no longer need automatic debugging.
Variable Inspection with %whos and %who displays all variables in the namespace:
%whos # Detailed information: name, type, and representation
%who # Simple list of variable names
%who_ls # Returns list of variable names as Python list
This helps identify stale variables cluttering your namespace and reveals hidden state causing unexpected behavior. The detailed %whos output shows variable types, helping diagnose type-related bugs:
Variable   Type        Data/Info
------------------------------------------------
data       list        [1, 2, 3, 4, 5]
df         DataFrame   (100, 5) DataFrame
model      object      RandomForestClassifier object
Timing Code with %time and %timeit identifies performance bottlenecks:
# Single execution timing
%time result = slow_function(large_dataset)
# Multiple executions for accurate measurement
%timeit fast_function(data)
When code runs unexpectedly slowly, these commands pinpoint which operations consume time, guiding optimization efforts toward actual bottlenecks rather than premature optimization of fast code.
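The cell-magic variants %%time and %%timeit, placed on the first line of a cell, time the entire cell rather than a single statement; slow_function, fast_function, and large_dataset reuse the placeholder names from the snippet above:
%%time
# Cell magic: reports wall-clock and CPU time for the whole cell
result = slow_function(large_dataset)
summary = fast_function(result)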
Line-by-Line Profiling with %lprun (requires line_profiler extension) shows exactly which lines consume time:
%load_ext line_profiler

def process_data(data):
    cleaned = clean_data(data)              # How long does this take?
    transformed = transform_data(cleaned)   # And this?
    validated = validate_data(transformed)  # And this?
    return validated

%lprun -f process_data process_data(my_data)
This generates detailed timing for every line, revealing, for example, that transform_data consumes 90% of execution time while the other operations are fast, guiding you to focus optimization there.
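If installing line_profiler is not an option, IPython's built-in %prun magic provides function-level (rather than line-level) statistics via cProfile for the same call:
%prun process_data(my_data)  # cProfile output: calls, total time, time per function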
Memory Profiling with %memit (requires memory_profiler) identifies memory-intensive operations:
%load_ext memory_profiler
%memit data_copy = large_dataframe.copy()
This helps diagnose out-of-memory errors by showing which operations allocate excessive memory.
Systematic Debugging Workflow
Professional debugging follows systematic approaches rather than random experimentation. A structured workflow increases debugging efficiency and prevents overlooking root causes.
The Scientific Method Applied to Debugging treats bugs as experiments requiring hypothesis testing:
- Observe the symptom – What exactly is going wrong? Document unexpected behavior precisely.
- Form hypotheses – What could cause this behavior? Generate multiple possible explanations.
- Design tests – How can you confirm or reject each hypothesis?
- Execute tests – Run experiments to gather evidence.
- Analyze results – Does evidence support or contradict hypotheses?
- Iterate – Refine hypotheses based on results and repeat.
For example, if a model produces poor accuracy:
- Hypothesis 1: Data quality issues (missing values, outliers)
- Test: Check data statistics, visualize distributions
- Hypothesis 2: Incorrect feature engineering
- Test: Inspect engineered features, verify transformations
- Hypothesis 3: Wrong model parameters
- Test: Try default parameters, compare with baseline models
Isolating Problems Through Binary Search efficiently narrows down error locations in long notebooks. If a notebook with 50 cells fails, rather than checking each cell sequentially:
- Run first 25 cells – does the problem occur?
- If yes, problem is in first 25; if no, it’s in last 25
- Divide the problematic section in half and repeat
- Continue until you’ve isolated the specific cell
This binary search approach finds bugs in log(n) steps rather than n steps—finding a bug in a 64-cell notebook takes 6 tests instead of potentially 64.
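To make each bisection step cheap, it can help to drop hypothetical "checkpoint" cells at the halfway points that assert expected state; df and expected_rows below are illustrative names standing in for your own variables:
# Checkpoint cell placed at the notebook's halfway point
assert 'df' in globals(), "df was never created - the bug is in the first half"
assert len(df) == expected_rows, f"Row count drifted to {len(df)}"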
Creating Minimal Reproducible Examples clarifies bugs by stripping away irrelevant code. When you encounter mysterious behavior:
- Copy the problematic code to a fresh notebook
- Remove unrelated code
- Replace complex data with simple test cases
- Simplify logic while preserving the bug
- Continue until you have the minimum code exhibiting the problem
This process often reveals the bug through simplification. The act of creating minimal examples forces clear thinking about what code actually does versus what you think it does.
Rubber Duck Debugging leverages explanation as a debugging tool. Explain your code line-by-line to an inanimate object (traditionally a rubber duck). The act of verbalizing logic often reveals faulty assumptions or overlooked steps. In notebooks, write markdown cells explaining your code in detail—this documentation helps you and serves as permanent explanation for others.
Managing Notebook State for Reliable Debugging
Many notebook bugs stem from confused state management. Disciplined practices prevent these issues and simplify debugging when they occur.
The Golden Rule: Restart and Run All should be your frequent reality check. Click “Kernel → Restart & Run All” regularly to verify your notebook executes correctly from a clean state. This test reveals hidden dependencies on out-of-order execution or deleted cells. If “Restart & Run All” fails, you have state management issues requiring resolution before proceeding.
Make this a habit:
- Before committing notebooks to version control
- Before sharing notebooks with colleagues
- After significant refactoring
- When debugging mysterious behavior that might stem from stale state
Clearing Outputs Periodically removes visual clutter and reduces notebook file size, but more importantly, forces you to re-execute cells to see results, ensuring code still works:
# Clear all outputs: Cell → All Output → Clear
This practice prevents relying on cached outputs that no longer match current code—a common source of confusion during debugging.
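Outputs can also be cleared from the command line before committing; recent versions of nbconvert ship a --clear-output flag (analysis.ipynb stands in for your notebook):
jupyter nbconvert --clear-output --inplace analysis.ipynb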
Explicit Variable Cleanup removes variables you no longer need:
# Delete specific variables
del temporary_data, intermediate_results
# Clear everything (use cautiously)
%reset # Prompts for confirmation
%reset -f # Force reset without confirmation
This prevents accidental reference to old variables and keeps your namespace clean, making debugging easier by eliminating false leads from stale data.
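A middle ground between del and a full %reset is %reset_selective, which removes only the names matching a regular expression:
%reset_selective -f temp_  # Force-remove variables whose names match 'temp_'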
Defensive Copying prevents mutable object mutations:
# Wrong - creates reference
working_data = original_data
# Right - creates independent copy
import copy
working_data = copy.deepcopy(original_data)
# For pandas DataFrames
working_df = original_df.copy()
This simple practice prevents entire categories of subtle bugs where modifications to one variable unexpectedly affect another.
Module Reloading ensures imported code stays current:
import importlib
import my_module
# After modifying my_module.py
importlib.reload(my_module)
Or enable automatic reloading at the start of notebooks:
%load_ext autoreload
%autoreload 2 # Reload all modules before executing code
This eliminates confusion when module changes don’t appear in the notebook due to Python’s module caching.
Strategic Print Statement Debugging
Despite sophisticated debugging tools, strategic print statements remain valuable for understanding program flow and state evolution, especially in complex data processing pipelines.
Structured Print Debugging provides more information than simple print statements:
def process_customer(customer_id, data):
    print(f"[DEBUG] Processing customer: {customer_id}")
    print(f"[DEBUG] Input data shape: {data.shape}")
    print(f"[DEBUG] Input data columns: {data.columns.tolist()}")
    result = complex_transformation(data)
    print(f"[DEBUG] Output shape: {result.shape}")
    print(f"[DEBUG] Output sample:\n{result.head()}")
    return result
The [DEBUG] prefix makes these statements easy to find and remove later. Include context like function names, variable names, and data characteristics rather than just values.
Assertion-Based Debugging catches problems early:
def calculate_discount(price, discount_rate):
    assert price > 0, f"Price must be positive, got {price}"
    assert 0 <= discount_rate <= 1, f"Discount rate must be 0-1, got {discount_rate}"
    discount = price * discount_rate
    final_price = price - discount
    assert final_price >= 0, f"Final price cannot be negative: {final_price}"
    return final_price
Assertions document assumptions and fail loudly when violated, preventing silent corruption that manifests as mysterious bugs far from the root cause.
Logging Over Print Statements for production-like notebooks:
import logging

logging.basicConfig(level=logging.DEBUG,
                    format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)

def process_data(data):
    logger.info(f"Processing {len(data)} records")
    logger.debug(f"Data columns: {data.columns.tolist()}")
    try:
        result = transform_data(data)
        logger.info("Transformation successful")
        return result
    except Exception as e:
        logger.error(f"Transformation failed: {str(e)}")
        raise
Logging provides better control than print statements—you can adjust verbosity levels, redirect output to files, and add structured information. The DEBUG, INFO, WARNING, ERROR hierarchy lets you control detail level without modifying code.
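As one sketch of that control, the handlers below send full DEBUG detail to a file ('debug.log' is an arbitrary name) while keeping notebook output at WARNING and above:
import logging

logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)

# Full detail is captured in a file for later inspection
file_handler = logging.FileHandler('debug.log')
file_handler.setLevel(logging.DEBUG)
logger.addHandler(file_handler)

# The notebook itself only shows warnings and errors
console_handler = logging.StreamHandler()
console_handler.setLevel(logging.WARNING)
logger.addHandler(console_handler)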
Debugging Checklist
- Check the execution numbers in brackets for out-of-order cells
- Run "Kernel → Restart & Run All" to rule out hidden state
- Inspect the namespace with %whos for stale variables
- Reproduce the failure, then launch %debug (or enable %pdb on)
- Isolate the failing cell with binary search
- Reduce the problem to a minimal reproducible example
- Verify data quality before blaming the modeling code
Debugging Data-Specific Issues
Data science debugging often involves data-related problems distinct from traditional software bugs. These issues require specialized approaches.
Detecting Data Quality Problems systematically:
import numpy as np  # Needed for np.number and np.isinf below

def diagnose_data_quality(df):
    """Comprehensive data quality report."""
    print("=== Data Quality Report ===\n")
    print(f"Shape: {df.shape}")
    print(f"Memory usage: {df.memory_usage(deep=True).sum() / 1024**2:.2f} MB\n")

    print("Missing values:")
    missing = df.isnull().sum()
    print(missing[missing > 0])
    print()

    print("Data types:")
    print(df.dtypes.value_counts())
    print()

    print("Duplicate rows:", df.duplicated().sum())
    print()

    print("Numerical column statistics:")
    print(df.describe())

    # Check for infinite values
    numeric_cols = df.select_dtypes(include=[np.number]).columns
    for col in numeric_cols:
        inf_count = np.isinf(df[col]).sum()
        if inf_count > 0:
            print(f"WARNING: {col} contains {inf_count} infinite values")
Run this diagnostic function when unexpected model behavior or transformation errors occur. Data quality issues cause the majority of data science bugs.
Visualizing Data Distributions reveals subtle problems:
import matplotlib.pyplot as plt
import numpy as np

def visualize_distributions(df):
    """Plot distributions for all numeric columns."""
    numeric_cols = df.select_dtypes(include=[np.number]).columns
    # squeeze=False keeps axes 2-D even when there is a single numeric column
    fig, axes = plt.subplots(len(numeric_cols), 2,
                             figsize=(12, 4 * len(numeric_cols)), squeeze=False)
    for i, col in enumerate(numeric_cols):
        # Histogram
        axes[i, 0].hist(df[col].dropna(), bins=50, edgecolor='black')
        axes[i, 0].set_title(f'{col} - Distribution')
        axes[i, 0].set_xlabel(col)
        # Box plot
        axes[i, 1].boxplot(df[col].dropna())
        axes[i, 1].set_title(f'{col} - Outliers')
        axes[i, 1].set_ylabel(col)
    plt.tight_layout()
    plt.show()
Unexpected distributions—extreme skewness, unexpected outliers, bimodal patterns—often explain model failures or transformation bugs.
Tracking Data Through Transformations documents how data changes:
import numpy as np

def track_transformation(df, operation_name):
    """Print data characteristics before/after transformations."""
    print(f"\n=== {operation_name} ===")
    print(f"Shape: {df.shape}")
    print(f"Missing values: {df.isnull().sum().sum()}")
    print(f"Numeric columns: {len(df.select_dtypes(include=[np.number]).columns)}")
    return df

# Usage in pipeline
df = load_data()
df = track_transformation(df, "Initial Load")
df = remove_duplicates(df)
df = track_transformation(df, "After Deduplication")
df = handle_missing_values(df)
df = track_transformation(df, "After Missing Value Handling")
This tracking identifies exactly where unexpected data changes occur, pinpointing problematic transformation steps.
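Because track_transformation returns the DataFrame unchanged, it also drops neatly into pandas' DataFrame.pipe, which keeps the same pipeline readable as a single chain:
df = (load_data()
      .pipe(track_transformation, "Initial Load")
      .pipe(remove_duplicates)
      .pipe(track_transformation, "After Deduplication"))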
Conclusion
Mastering debugging in Jupyter notebooks requires understanding both universal debugging principles and notebook-specific considerations. The tools and techniques covered—from the IPython debugger and magic commands through systematic debugging workflows and state management practices—transform debugging from frustrating trial-and-error into methodical problem-solving. Professional debugging in notebooks combines these technical tools with disciplined practices like frequent “Restart & Run All” testing, creating minimal reproducible examples, and maintaining clean namespace state.
The most important shift in becoming a proficient notebook debugger is moving from reactive debugging—scrambling to fix broken code—to proactive debugging that prevents issues through good practices, catches problems early through assertions and validation, and resolves issues efficiently through systematic approaches when they occur. Invest time learning the IPython debugger thoroughly, make “Restart & Run All” a habit, and approach debugging scientifically with hypotheses and tests rather than random code changes. These practices compound over time, making you exponentially more effective at maintaining high-quality, reliable notebook-based analyses that deliver insights rather than frustration.