Pandas is a powerful Python library used for data manipulation and analysis. One of its key features is the ability to access and manipulate data efficiently using indexers like iloc. The iloc indexer lets you access rows and columns by integer position, which is useful whenever you care about where a row or column sits rather than what it is labeled, or when label-based access is not practical.
In this guide, we’ll explore how to use iloc with pandas DataFrames through worked examples. We’ll walk through selecting specific rows and columns, slicing, filtering, updating values, and comparing iloc to its counterpart loc.
What is iloc in Pandas?
iloc stands for “integer location” and is used to access rows and columns in a DataFrame by their numerical index positions.
Syntax of iloc
df.iloc[row_position, column_position]
row_position: Integer, list of integers, slice, or boolean array
column_position: Integer, list of integers, slice, or boolean array
Both positions are zero-based, meaning counting starts from 0. Negative integers count from the end, so -1 refers to the last row or column.
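As a minimal sketch of how positions are counted, the snippet below uses a small hypothetical frame (the name demo and its columns are made up for this illustration, not part of the guide's sample data):

```python
import pandas as pd

# Hypothetical three-row frame used only for this illustration.
demo = pd.DataFrame({"letter": ["a", "b", "c"], "number": [10, 20, 30]})

first_row = demo.iloc[0]       # position 0 is the first row
last_row = demo.iloc[-1]       # negative positions count from the end
last_cell = demo.iloc[-1, -1]  # last row, last column

print(first_row["letter"])  # a
print(last_row["letter"])   # c
print(last_cell)            # 30
```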
Creating a Sample DataFrame
Let’s start by creating a sample DataFrame to use in the examples throughout this guide.
import pandas as pd
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
    'Age': [25, 30, 35, 40, 45],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix']
}
df = pd.DataFrame(data)
The resulting DataFrame looks like:
      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
3    David   40      Houston
4      Eva   45      Phoenix
Selecting Rows Using iloc
Selecting a Single Row
To select the first row (index 0):
df.iloc[0]
Output:
Name       Alice
Age           25
City    New York
Name: 0, dtype: object
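The output above is a Series because a single integer was passed. As a small sketch of the difference, wrapping the same position in a list keeps the result as a one-row DataFrame. The variable names as_series and as_frame are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eva"],
    "Age": [25, 30, 35, 40, 45],
    "City": ["New York", "Los Angeles", "Chicago", "Houston", "Phoenix"],
})

as_series = df.iloc[0]    # single integer: Series indexed by column names
as_frame = df.iloc[[0]]   # one-element list: DataFrame with one row

print(type(as_series).__name__)  # Series
print(type(as_frame).__name__)   # DataFrame
print(as_frame.shape)            # (1, 3)
```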
Selecting Multiple Rows
You can select multiple rows by passing a list of indices:
df.iloc[[0, 2, 4]]
This returns rows at positions 0, 2, and 4.
Slicing Rows
Slicing works similarly to Python’s list slicing:
df.iloc[1:4]
This selects rows 1 through 3 (excluding 4).
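Because iloc follows Python's slicing rules, steps and negative bounds should behave the same way as they do for lists. A short sketch on a two-column version of the sample data (the variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eva"],
    "Age": [25, 30, 35, 40, 45],
})

middle = df.iloc[1:4]       # positions 1, 2, 3 (the stop position is excluded)
every_other = df.iloc[::2]  # positions 0, 2, 4
last_two = df.iloc[-2:]     # positions 3, 4

print(middle["Name"].tolist())       # ['Bob', 'Charlie', 'David']
print(every_other["Name"].tolist())  # ['Alice', 'Charlie', 'Eva']
print(last_two["Name"].tolist())     # ['David', 'Eva']
```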
Selecting Columns Using iloc
Selecting a Single Column
To select the first column (index 0) for all rows:
df.iloc[:, 0]
This returns the ‘Name’ column.
Selecting Multiple Columns
df.iloc[:, [0, 2]]
This selects the ‘Name’ and ‘City’ columns.
Slicing Columns
df.iloc[:, 1:]
This selects all columns starting from the second column (‘Age’ and ‘City’).
Selecting Rows and Columns Together
Selecting Specific Row and Column
To select the value in the second row and third column:
df.iloc[1, 2]
Returns:
'Los Angeles'
Selecting Multiple Rows and Columns
df.iloc[[1, 3], [0, 2]]
This selects the ‘Name’ and ‘City’ of rows 1 and 3.
Using Slices for Rows and Columns
df.iloc[0:3, 1:3]
Selects rows 0 to 2 and columns 1 to 2 (‘Age’ and ‘City’).
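If you know a column's name but want to slice by position, one option is to look the position up first. The sketch below assumes unique column names, in which case Index.get_loc returns a single integer; age_pos and subset are illustrative names:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eva"],
    "Age": [25, 30, 35, 40, 45],
    "City": ["New York", "Los Angeles", "Chicago", "Houston", "Phoenix"],
})

# Translate a column label into its integer position.
age_pos = df.columns.get_loc("Age")

# Rows 0 to 2, and every column from 'Age' onward.
subset = df.iloc[0:3, age_pos:]

print(age_pos)                  # 1
print(subset.columns.tolist())  # ['Age', 'City']
print(subset.shape)             # (3, 2)
```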
Using Boolean Arrays with iloc
iloc does not accept a boolean Series such as df['Age'] > 30 directly, but it does accept a plain boolean array (for example, the Series converted with .to_numpy()) or the integer positions returned by np.where().
Example using NumPy:
import numpy as np
indices = np.where(df['Age'] > 30)[0]
df.iloc[indices]
This selects rows where ‘Age’ is greater than 30 using positional indices.
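As a sketch of the boolean-array route, converting the condition to a NumPy array lets iloc use it as a mask, and the result should match the np.where() approach. The names mask, via_mask, and via_positions are illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eva"],
    "Age": [25, 30, 35, 40, 45],
})

# A plain NumPy boolean array, one entry per row.
mask = (df["Age"] > 30).to_numpy()

via_mask = df.iloc[mask]                    # boolean array as a row mask
via_positions = df.iloc[np.where(mask)[0]]  # same rows via integer positions

print(via_mask["Name"].tolist())       # ['Charlie', 'David', 'Eva']
print(via_mask.equals(via_positions))  # True
```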
Updating Values with iloc
Updating a Single Value
df.iloc[0, 1] = 26
This updates Alice’s age from 25 to 26.
Updating a Row
df.iloc[2] = ['Charles', 36, 'San Diego']
This updates row 2 with new values.
Updating a Column
df.iloc[:, 1] = [20, 25, 30, 35, 40]
This updates the ‘Age’ column for all rows.
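Assignments through iloc modify the DataFrame in place. If you want to keep the original intact, one approach is to work on a copy, as in this sketch. The name updated and the value 'Unknown' are made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eva"],
    "Age": [25, 30, 35, 40, 45],
    "City": ["New York", "Los Angeles", "Chicago", "Houston", "Phoenix"],
})

updated = df.copy()               # independent copy; df stays untouched
updated.iloc[0, 1] = 26           # single cell: row 0, 'Age'
updated.iloc[1:3, 2] = "Unknown"  # scalar applied to rows 1 and 2 of 'City'

print(df.iloc[0, 1])            # 25
print(updated.iloc[0, 1])       # 26
print(updated["City"].tolist()) # ['New York', 'Unknown', 'Unknown', 'Houston', 'Phoenix']
```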
Adding New Columns or Rows
You can’t use iloc directly to add a new row or column, but you can add them first and then access them with iloc.
Adding a Column
df['Senior'] = df['Age'] > 35
Now you can use:
df.iloc[:, 3]
Adding a Row
To add a new row (this relies on the default integer index, where len(df) is the next unused label):
df.loc[len(df)] = ['Frank', 50, 'Seattle', True]
Then access it:
df.iloc[-1]
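When the index is not the default 0, 1, 2, ... range, len(df) may not be a free label. An alternative sketch is to build the new row as its own one-row DataFrame and join it with pd.concat. This is a swapped-in technique rather than the approach used above, and new_row and extended are illustrative names:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eva"],
    "Age": [25, 30, 35, 40, 45],
    "City": ["New York", "Los Angeles", "Chicago", "Houston", "Phoenix"],
})

# One-row frame with the same column names as df.
new_row = pd.DataFrame([{"Name": "Frank", "Age": 50, "City": "Seattle"}])

# ignore_index=True renumbers the result 0..n-1.
extended = pd.concat([df, new_row], ignore_index=True)

print(len(extended))              # 6
print(extended.iloc[-1]["Name"])  # Frank
```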
Comparison: iloc vs loc
| Feature | iloc | loc |
|---|---|---|
| Based on | Integer index positions | Label-based access |
| Slicing | Excludes stop index | Includes stop index |
| Use case | Numerical index access | Named index or label access |
| Example | df.iloc[0:2, 1] | df.loc['row1':'row2', 'col1'] |
Use iloc when you know the numerical position of your rows and columns. Use loc when you want to access data by name or label.
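The slicing difference in the table can be checked on a small frame. This sketch assumes a hypothetical DataFrame with string labels row1 to row3 and columns col1 and col2, mirroring the names in the table's example row:

```python
import pandas as pd

df = pd.DataFrame(
    {"col1": [10, 20, 30], "col2": ["x", "y", "z"]},
    index=["row1", "row2", "row3"],
)

by_position = df.iloc[0:2, 0]             # positions 0 and 1; stop 2 is excluded
by_label = df.loc["row1":"row2", "col1"]  # labels row1 to row2; stop is included

print(by_position.tolist())          # [10, 20]
print(by_label.tolist())             # [10, 20]
print(by_position.equals(by_label))  # True
```

Both calls return the same two rows, even though the iloc stop is one position past the last row selected and the loc stop is the last row itself.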
Common Errors and Troubleshooting
IndexError
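Passing an integer or a list of integers that is out of range raises an IndexError. Slices that run past the end do not raise; they are simply truncated.
Solution: Use df.shape to check the valid index range.
print(df.shape) # (6, 4) after the column and row added above
So valid positions are 0 to 5 for rows, and 0 to 3 for columns.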
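A short sketch of this check, using a fresh copy of the original five-row sample so the numbers are self-contained. The names n_rows, raised, and beyond are illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eva"],
    "Age": [25, 30, 35, 40, 45],
    "City": ["New York", "Los Angeles", "Chicago", "Houston", "Phoenix"],
})

n_rows, n_cols = df.shape  # 5 rows, 3 columns for this frame

try:
    df.iloc[n_rows]  # one past the last valid row position
    raised = False
except IndexError:
    raised = True

beyond = df.iloc[n_rows:]  # an out-of-range slice returns an empty frame

print(raised)       # True
print(len(beyond))  # 0
```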
Boolean Filtering with iloc
You might try something like:
df.iloc[df['Age'] > 30]
This will throw an error because iloc does not accept a boolean Series. Use label-based indexing (loc) for boolean masks, pass the mask as a NumPy array with .to_numpy(), or convert it to positional indices.
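As a sketch of two working alternatives, the same filter can be written with loc and column labels, or with iloc, a NumPy mask, and column positions. The names by_label and by_position are illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eva"],
    "Age": [25, 30, 35, 40, 45],
    "City": ["New York", "Los Angeles", "Chicago", "Houston", "Phoenix"],
})

mask = df["Age"] > 30  # boolean Series aligned to df's index

by_label = df.loc[mask, ["Name", "City"]]       # loc takes the Series as-is
by_position = df.iloc[mask.to_numpy(), [0, 2]]  # iloc needs a plain array

print(by_label["Name"].tolist())     # ['Charlie', 'David', 'Eva']
print(by_label.equals(by_position))  # True
```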
Practical Examples
Selecting First 3 Rows and Last 2 Columns
df.iloc[0:3, -2:]
Updating the First Row’s Age and City
df.iloc[0, 1] = 28
df.iloc[0, 2] = 'Boston'
Dropping a Row Using Index
To drop the third row (position 2), look up its label with df.index and pass that label to drop:
df.drop(df.index[2], inplace=True)
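Dropping a row leaves a gap in the index labels, so positions and labels no longer line up. The sketch below uses the non-in-place form of drop and then renumbers with reset_index; trimmed and renumbered are illustrative names:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eva"],
    "Age": [25, 30, 35, 40, 45],
})

trimmed = df.drop(df.index[2])  # returns a new frame; df itself is unchanged

print(trimmed.index.tolist())   # [0, 1, 3, 4]
print(trimmed.iloc[2]["Name"])  # David (position 2 now holds label 3)

renumbered = trimmed.reset_index(drop=True)  # labels become 0..3 again
print(renumbered.index.tolist())             # [0, 1, 2, 3]
```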
Conclusion
The iloc indexer in pandas is an essential tool for anyone working with tabular data in Python. It allows for fast and precise access to rows and columns by integer position, which is especially helpful when the position of the data matters more than its labels.
By mastering iloc, you’ll gain fine-grained control over your data, enabling efficient slicing, updating, and extraction. Practice the examples in this guide, and you’ll be equipped to handle most data manipulation tasks using iloc in pandas DataFrames.
Whether you’re cleaning data, analyzing trends, or preparing datasets for machine learning, iloc is a reliable and powerful part of your pandas toolkit.