How to Use loc in Pandas DataFrame

Pandas is one of the most widely used libraries in Python for data analysis and manipulation. It provides powerful tools to handle structured data efficiently. Among these tools, the .loc[] indexer is essential for accessing and modifying specific parts of a DataFrame.

In this article, we’ll explore how to use loc in pandas DataFrame for row and column selection, slicing, filtering, updating values, and more. Whether you’re a beginner or intermediate user, mastering loc can significantly enhance your data manipulation skills in pandas.

What is loc in Pandas?

loc is a label-based indexer: a property used with square brackets rather than a function call. It selects rows and columns by label or by boolean array, whereas its counterpart iloc selects by integer position.

Here is the basic syntax:

df.loc[<row_label>, <column_label>]
  • <row_label> can be a single label, list of labels, slice object, or boolean array
  • <column_label> accepts the same types as <row_label> and may be omitted to select all columns
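As a quick illustration of the four selector types listed above, here is a minimal sketch. The frame `demo` and its labels 'x', 'y', 'z' are made up for this example and are not the tutorial's DataFrame.

```python
import pandas as pd

# Hypothetical three-row frame used only for this sketch.
demo = pd.DataFrame({'n': [1, 2, 3], 's': ['a', 'b', 'c']}, index=['x', 'y', 'z'])

single = demo.loc['x', 'n']                  # single labels -> one scalar value
listed = demo.loc[['x', 'z'], ['n']]         # lists of labels -> DataFrame
sliced = demo.loc['x':'y', 'n':'s']          # label slices, both ends included
masked = demo.loc[[True, False, True], 's']  # boolean list, one flag per row
```

Each form can be used on the row side, the column side, or both at once.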

Let’s look at examples for each of these use cases.

Setting Up the Example DataFrame

Before diving into examples, let’s create a simple pandas DataFrame that we’ll use throughout this tutorial.

import pandas as pd

data = {
'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
'Age': [25, 30, 35, 40, 45],
'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix']
}

df = pd.DataFrame(data, index=['A', 'B', 'C', 'D', 'E'])

This DataFrame looks like:

      Name  Age         City
A    Alice   25     New York
B      Bob   30  Los Angeles
C  Charlie   35      Chicago
D    David   40      Houston
E      Eva   45      Phoenix

Selecting Rows with loc

Selecting a Single Row

To select a single row by its label:

df.loc['A']

This will return:

Name       Alice
Age           25
City    New York
Name: A, dtype: object

Selecting Multiple Rows

You can select multiple rows using a list of labels:

df.loc[['A', 'C', 'E']]

This will return a DataFrame with rows A, C, and E.

Slicing Rows

Use a slice of index labels to select a range of rows:

df.loc['B':'D']

Unlike regular Python slicing, loc includes both the start and end labels in the output.
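To make the inclusive endpoint concrete, the sketch below compares a label slice with the equivalent-looking position slice. It rebuilds a cut-down version of the example frame so it can run on its own; the variable names `by_label` and `by_position` are just illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {'Age': [25, 30, 35, 40, 45]},
    index=['A', 'B', 'C', 'D', 'E'],
)

by_label = df.loc['B':'D']    # rows B, C and D: the end label is included
by_position = df.iloc[1:3]    # rows B and C: the end position is excluded

print(list(by_label.index))     # ['B', 'C', 'D']
print(list(by_position.index))  # ['B', 'C']
```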

Selecting Columns with loc

Selecting a Single Column

df.loc[:, 'Name']

This selects all rows but only the ‘Name’ column.

Selecting Multiple Columns

df.loc[:, ['Name', 'Age']]

This returns a DataFrame with only the ‘Name’ and ‘Age’ columns.

Selecting Specific Rows and Columns

To select specific rows and specific columns:

df.loc[['A', 'C'], ['Name', 'City']]

This returns the ‘Name’ and ‘City’ for rows A and C.

Using Boolean Conditions with loc

One of the most powerful uses of loc is filtering rows based on conditions.

Filter Rows Based on Column Value

df.loc[df['Age'] > 30]

This returns all rows where the ‘Age’ is greater than 30.

Filter and Select Specific Columns

df.loc[df['Age'] > 30, ['Name', 'City']]

This filters the rows and returns only the ‘Name’ and ‘City’ columns.

Combine Multiple Conditions

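Use the bitwise operators & (and), | (or), and ~ (not), and wrap each condition in parentheses. The parentheses are required because these operators bind more tightly than comparisons such as > and !=.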

df.loc[(df['Age'] > 30) & (df['City'] != 'Chicago')]

This filters rows where age is over 30 and city is not Chicago.
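The other two operators follow the same pattern. This is a small sketch using the tutorial's data, rebuilt here so the block is self-contained; the result names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
        'Age': [25, 30, 35, 40, 45],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix'],
    },
    index=['A', 'B', 'C', 'D', 'E'],
)

# OR: keep rows that satisfy either condition (rows A and E here).
young_or_phoenix = df.loc[(df['Age'] < 30) | (df['City'] == 'Phoenix')]

# NOT: invert a condition (rows A and B, whose Age is not above 30).
not_over_30 = df.loc[~(df['Age'] > 30)]
```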

Updating Values with loc

You can also modify values in a DataFrame using loc.

Update a Single Value

df.loc['A', 'Age'] = 26

This changes Alice’s age to 26.

Update an Entire Row

df.loc['B'] = ['Bobby', 31, 'San Francisco']

This replaces every value in row B. The list must supply one value per column, in column order.

Update Multiple Rows Conditionally

df.loc[df['Age'] > 40, 'City'] = 'Unknown'

This sets the ‘City’ to ‘Unknown’ for everyone older than 40.
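One way to keep conditional updates readable is to store the condition in a variable and reuse it. The sketch below assumes the original, unmodified example data; the name `older` is illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
        'Age': [25, 30, 35, 40, 45],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix'],
    },
    index=['A', 'B', 'C', 'D', 'E'],
)

older = df['Age'] > 40                       # boolean Series, True only for Eva

df.loc[older, 'City'] = 'Unknown'            # overwrite City on matching rows
df.loc[older, 'Age'] = df.loc[older, 'Age'] + 1  # compute new values from old ones
```

Rows where the condition is False are left untouched.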

Adding New Columns with loc

You can add a new column using loc as well.

df.loc[:, 'Senior'] = df['Age'] > 35

This adds a ‘Senior’ column with boolean values.
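A related pattern, sketched here as an aside, is creating a column while assigning to only some rows. pandas creates the column and leaves the other rows as missing values. The column name 'Group' is hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
        'Age': [25, 30, 35, 40, 45],
    },
    index=['A', 'B', 'C', 'D', 'E'],
)

# 'Group' does not exist yet; loc creates it and fills only the matching rows.
df.loc[df['Age'] > 35, 'Group'] = 'Senior'

# Rows that did not match hold a missing value until they are filled.
n_missing = df['Group'].isna().sum()
```

A follow-up such as df['Group'].fillna('Junior') could fill the remaining rows if a value is needed everywhere.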

Using loc with Index Reset

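If your DataFrame doesn’t have custom index labels, you can still use loc. The default integer index values (0, 1, 2, ...) are themselves the labels, so df.loc[0] looks up the row labeled 0, which is not necessarily the first row.

You can also call reset_index() to move the custom labels into a regular column and get a fresh integer index: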

df_reset = df.reset_index()
df_reset.loc[0]
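With an integer index, loc still works by label rather than position. The sketch below shows this under the assumption of a default RangeIndex; the variable names are illustrative.

```python
import pandas as pd

# No index argument, so the labels are the integers 0..4.
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
    'Age': [25, 30, 35, 40, 45],
})

first_three = df.loc[0:2]       # labels 0, 1 and 2: the end label is included

older = df.loc[df['Age'] > 30]  # filtering keeps the original labels 2, 3, 4

by_label = older.loc[2, 'Name']     # label 2 is still Charlie
by_position = older.iloc[0, 0]      # position 0 is also Charlie
has_label_zero = 0 in older.index   # False: older.loc[0] would raise KeyError
```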

Differences Between loc and iloc

Feature          loc                              iloc
Based on         Labels                           Integer positions
Slice behavior   Inclusive of both start and end  Exclusive of end index
Usage            df.loc['A']                      df.iloc[0]

Choose loc when you want to refer to rows and columns by their labels, and iloc when you want to refer to them by position regardless of the labels.

Common Errors and How to Avoid Them

KeyError

If the row or column label doesn’t exist, you’ll get a KeyError.

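Fix: Check that the label is present in df.index or df.columns before selecting. For columns only, df.get() returns a default value (None unless you pass another) instead of raising.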
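A few defensive options are sketched below. Which one fits depends on whether a missing label is an error in your pipeline; the label 'Z' and the column 'Salary' are hypothetical and deliberately absent from the data.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Alice', 'Bob'], 'Age': [25, 30]},
    index=['A', 'B'],
)

# Option 1: test membership before selecting.
label = 'Z'
row = df.loc[label] if label in df.index else None

# Option 2: catch the KeyError.
try:
    df.loc['Z']
    caught = False
except KeyError:
    caught = True

# Option 3 (columns only): get() returns None when the column is missing.
missing_col = df.get('Salary')

# Option 4: reindex() returns rows for every requested label,
# filling the ones that do not exist with NaN.
safe = df.reindex(['A', 'Z'])
```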

Mixing iloc with loc

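loc expects labels and iloc expects integer positions, so passing a position to loc (or a label to iloc) either fails or selects the wrong data. When you need one of each, convert first: df.index[i] turns a row position into a label, and df.columns.get_loc(name) turns a column label into a position.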
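When a selection needs a row position and a column label together, one approach is to convert so that a single indexer handles both. This is a minimal sketch on the tutorial's data; `first_label` and `col_pos` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['New York', 'Los Angeles', 'Chicago'],
    },
    index=['A', 'B', 'C'],
)

# Position -> label, then use loc for both axes.
first_label = df.index[0]                 # 'A'
via_loc = df.loc[first_label, 'City']

# Label -> position, then use iloc for both axes.
col_pos = df.columns.get_loc('City')      # 2
via_iloc = df.iloc[0, col_pos]
```

Both routes return the same cell, so the choice is mostly about which reads more clearly in context.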

Best Practices for Using loc

  • Use clear and descriptive index labels when creating DataFrames
  • Use slicing with loc for readability and maintainability
  • Combine filtering and column selection in a single loc call for efficiency
  • Avoid chained indexing like df['col'][cond] = value; use df.loc[cond, 'col'] = value instead
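The last point can be sketched as follows. The chained form is shown only as a comment because, depending on the pandas version, it may emit a warning or leave the DataFrame unchanged.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['New York', 'Los Angeles', 'Chicago'],
    },
    index=['A', 'B', 'C'],
)

in_chicago = df['City'] == 'Chicago'

# Chained indexing (avoid): two separate indexing steps, so the assignment
# may land on a temporary object instead of df.
# df['Age'][in_chicago] = 36

# Single loc call: row filter and column are resolved together on df itself.
df.loc[in_chicago, 'Age'] = 36
```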

Conclusion

The loc indexer in pandas is an essential tool for label-based access and assignment in DataFrames. With it, you can select, filter, and update your data in a single, explicit expression. Mastering loc helps you write cleaner code and avoid the pitfalls of chained indexing.

If you’re working with pandas regularly, practicing different loc patterns will not only save time but also prevent common bugs in your data pipelines. Happy coding!
