dask Archives - ML Journey

How to Use Dask for Scaling Pandas Workflows

September 8, 2025 | July 22, 2025 by Peter Song

Pandas has become the go-to library for data manipulation and analysis in Python, but as datasets grow beyond what can fit comfortably in memory, performance bottlenecks emerge. This is where Dask comes in – a flexible parallel computing library that extends the familiar Pandas API to work with larger-than-memory datasets across multiple cores or even …

Polars vs. Dask for Large-Scale Data Processing in Python

July 4, 2025 | November 19, 2024 by Peter Song

Efficiently processing large datasets is a cornerstone of modern data science and analytics. Python, being a popular language in these domains, offers several tools for handling big data, with Polars and Dask standing out as prominent libraries. While both serve similar purposes, they cater to different needs based on their architecture, performance, and scalability. In …