Exploratory Data Analysis
Perform comprehensive exploratory data analysis (EDA) on scientific data files across multiple domains. This skill provides automated file t…
Maintainer K-Dense Inc. · Last updated March 31, 2026
Dask is a Python library for parallel and distributed computing that enables three critical capabilities:
- **Larger-than-memory execution** on single machines for data exceeding available RAM
- **Parallel processing** for improved computational speed across multiple cores
- **Distributed computation** supporting terabyte-scale datasets across multiple machines

Dask scales from laptops (processing ~100 GiB) to clust…
Original source
https://github.com/K-Dense-AI/claude-scientific-skills/tree/main/scientific-skills/dask
Skill Snapshot
Source Doc
This skill should be used when:
Dask provides five main components, each suited to different use cases:
Purpose: Scale pandas operations to larger datasets through parallel processing.
When to Use:
Reference Documentation: For comprehensive guidance on Dask DataFrames, refer to references/dataframes.md which includes:
- `map_partitions`

Quick Example:
import dask.dataframe as dd