Kaggle is one of the most popular platforms for data science and machine learning enthusiasts. It offers a vast collection of datasets that can be used for research, projects, and competitions. If you’re new to Kaggle, you might be wondering how to download a dataset from it.
Downloading datasets from Kaggle is straightforward, and there are multiple ways to do it, including using the Kaggle website and the Kaggle API. In this guide, we’ll walk you through both methods step by step.
Method 1: Downloading a Dataset from the Kaggle Website (Manual Download)
This is the simplest way to download datasets from Kaggle. Follow these steps:
Step 1: Create a Kaggle Account
If you don’t already have a Kaggle account, you need to create one:
- Go to Kaggle.
- Click on Sign Up and create an account using Google, Facebook, or email.
Step 2: Find the Dataset
- After logging in, click on Datasets in the top navigation bar.
- Use the search bar to find a dataset related to your project.
- Click on the dataset you want to download.
Step 3: Download the Dataset
- Click on the Download button on the right-hand side.
- The dataset will be downloaded as a ZIP file.
- Extract the ZIP file to access the dataset in CSV, JSON, or other formats.
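If you would rather unpack the archive from Python than from a file manager, the standard-library zipfile module can do it. The following is a minimal sketch: the function name extract_dataset and the paths in the usage comment are placeholders chosen for illustration, not names that Kaggle uses.

```python
import zipfile
from pathlib import Path


def extract_dataset(zip_path, dest_dir):
    """Extract every file in zip_path into dest_dir and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()


# Example usage (hypothetical paths):
# files = extract_dataset("archive.zip", "data")
# print(files)
```

The returned list is a quick way to see which CSV, JSON, or other files the archive contained.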
✅ Best for: Beginners who want to download datasets quickly without using code.
Method 2: Downloading a Dataset Using Kaggle API (Recommended for Automation)
If you frequently work with Kaggle datasets, the Kaggle API is a better option. It allows you to download datasets directly from the command line or within a script.
Step 1: Install Kaggle API
Before using the API, install the Kaggle Python package:
pip install kaggle
Step 2: Get Your Kaggle API Key
- Go to your Kaggle account settings (https://www.kaggle.com/account).
- Scroll down to the API section and click on Create New API Token.
- A kaggle.json file will be downloaded. This file contains your API credentials.
Step 3: Move the API Key to the Correct Location
Move the kaggle.json file to the following directory:
- Windows: C:\Users\your-username\.kaggle\
- Mac/Linux: ~/.kaggle/
On Mac/Linux, restrict the file’s permissions so that other users cannot read your credentials:
chmod 600 ~/.kaggle/kaggle.json
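The move and the permission change can also be done from Python, which is handy when setting up several machines. This is a sketch under assumptions: install_kaggle_token is a hypothetical helper name, and the config_dir parameter exists only so the function can be tried against a scratch folder before touching your real home directory.

```python
import os
import shutil
import stat
from pathlib import Path


def install_kaggle_token(downloaded_token, config_dir=None):
    """Copy kaggle.json into the config directory and make it owner-only."""
    config_dir = Path(config_dir) if config_dir else Path.home() / ".kaggle"
    config_dir.mkdir(parents=True, exist_ok=True)
    target = config_dir / "kaggle.json"
    shutil.copyfile(downloaded_token, target)
    # Equivalent of `chmod 600`: read/write for the owner only.
    # On Windows this call only affects the read-only flag.
    os.chmod(target, stat.S_IRUSR | stat.S_IWUSR)
    return target


# Example usage (hypothetical download location):
# install_kaggle_token(Path.home() / "Downloads" / "kaggle.json")
```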
Step 4: Find the Dataset Name
- Go to the dataset page on Kaggle.
- Copy the dataset’s URL. For example, if the URL is:
https://www.kaggle.com/datasets/username/dataset-name
The dataset identifier is username/dataset-name.
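If you collect dataset URLs in a script, the identifier can be pulled out of the URL path rather than copied by hand. The sketch below assumes the URL follows the /datasets/owner/dataset-name shape shown above; dataset_identifier is a hypothetical helper, not part of the Kaggle package.

```python
from urllib.parse import urlparse


def dataset_identifier(url):
    """Return 'owner/dataset-name' from a Kaggle dataset page URL."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    # Expected path shape: /datasets/<owner>/<dataset-name>[/...]
    if len(parts) < 3 or parts[0] != "datasets":
        raise ValueError(f"Not a Kaggle dataset URL: {url}")
    return f"{parts[1]}/{parts[2]}"


print(dataset_identifier("https://www.kaggle.com/datasets/username/dataset-name"))
# prints: username/dataset-name
```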
Step 5: Download the Dataset Using the API
Run the following command:
kaggle datasets download -d username/dataset-name
This will download a ZIP file. Extract it using:
unzip dataset-name.zip
✅ Best for: Developers, data scientists, and automation workflows.
Bonus: Downloading Kaggle Datasets Directly into Jupyter Notebook
If you are using a Jupyter Notebook or Google Colab, you can download datasets directly within Python:
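import subprocess
# check=True raises CalledProcessError if the download command fails
subprocess.run(["kaggle", "datasets", "download", "-d", "username/dataset-name"], check=True)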
Or using the Kaggle API within Python:
from kaggle.api.kaggle_api_extended import KaggleApi
api = KaggleApi()
api.authenticate()
api.dataset_download_files('username/dataset-name', path='.', unzip=True)
This downloads and extracts the dataset directly into your working directory.
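In hosted notebooks such as Google Colab there is usually no ~/.kaggle folder set up for you. One option is to supply the credentials through the KAGGLE_USERNAME and KAGGLE_KEY environment variables, which the kaggle package reads when it authenticates. The sketch below assumes you have uploaded kaggle.json to the notebook; load_kaggle_credentials and the path in the usage comment are hypothetical names for illustration.

```python
import json
import os


def load_kaggle_credentials(token_path):
    """Read kaggle.json and expose its values as environment variables."""
    with open(token_path) as f:
        token = json.load(f)
    # kaggle.json stores the credentials under the "username" and "key" fields.
    os.environ["KAGGLE_USERNAME"] = token["username"]
    os.environ["KAGGLE_KEY"] = token["key"]
    return token["username"]


# Example usage (hypothetical upload location in a notebook):
# load_kaggle_credentials("/content/kaggle.json")
# from kaggle.api.kaggle_api_extended import KaggleApi  # import after the variables are set
```

Setting the variables before importing the kaggle package matters, because the package may try to authenticate as soon as it is imported.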
Frequently Asked Questions (FAQs)
1. Do I need a Kaggle account to download datasets?
Yes, you must create a Kaggle account and agree to the dataset’s terms before downloading.
2. Can I download private datasets from Kaggle?
Only if you own the dataset or the owner has shared it with you. Otherwise, you need permission from the dataset owner.
3. What file formats are Kaggle datasets available in?
CSV is the most common format. JSON and Excel files also appear, and downloads are packaged as a ZIP archive.
4. Can I download an entire Kaggle competition dataset?
Yes, but you must accept the competition rules before using the API to download competition data:
kaggle competitions download -c competition-name
5. How can I automate dataset downloads for daily updates?
Use a cron job (Linux/Mac) or Task Scheduler (Windows) to run the Kaggle API command on a schedule.
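A small script is easier to schedule than a bare command, because it can report failures through its exit code. The following is a sketch, not a definitive setup: build_download_command and refresh are hypothetical helper names, and the dataset identifier and data folder are placeholders. It uses the CLI's -p (destination folder), --unzip, and --force options.

```python
import subprocess


def build_download_command(dataset, dest_dir):
    """Build the CLI call: download into dest_dir, unzip, and overwrite old copies."""
    return [
        "kaggle", "datasets", "download",
        "-d", dataset,
        "-p", dest_dir,
        "--unzip",
        "--force",
    ]


def refresh(dataset, dest_dir):
    """Run the download and return the exit code so the scheduler can see failures."""
    result = subprocess.run(build_download_command(dataset, dest_dir))
    return result.returncode


# Example usage (placeholder identifier and folder):
# import sys
# sys.exit(refresh("username/dataset-name", "data"))
```

With the usage lines uncommented and the file saved as, say, refresh_dataset.py (a hypothetical name), a crontab entry such as `0 6 * * * python3 /path/to/refresh_dataset.py` would run it every day at 06:00.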
Conclusion
Downloading datasets from Kaggle is easy, whether you use the manual download method or the Kaggle API. If you’re just getting started, the website method is the simplest way to download datasets. However, for automation and large-scale projects, the Kaggle API is the best option.
Now that you know how to download datasets from Kaggle, you can start exploring and working on real-world data science projects. Need help with a specific dataset? Let us know in the comments!