Vaex

Maintainer K-Dense Inc. · Last updated March 31, 2026

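Vaex is a high-performance Python library designed for lazy, out-of-core DataFrames to process and visualize tabular datasets that are too large to fit into RAM.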

Claude Code · OpenClaw · NanoClaw · Analysis · Writing · vaex · data-analysis · package · data analysis & visualization

Original source

K-Dense-AI/claude-scientific-skills

https://github.com/K-Dense-AI/claude-scientific-skills/tree/main/scientific-skills/vaex

Maintainer: K-Dense Inc.
License: MIT license
Last updated: March 31, 2026

Skill Snapshot

Key Details From SKILL.md

Key Notes

  • Vaex is a high-performance Python library designed for lazy, out-of-core DataFrames to process and visualize tabular datasets that are too large to fit into RAM. Vaex can process over a billion rows per second, enabling interactive data exploration and analysis on datasets with billions of rows.
  • df = vaex.open('large_file.hdf5')  # or .csv, .arrow, .parquet

Source Doc

Excerpt From SKILL.md

When to Use This Skill

Use Vaex when:

  • Processing tabular datasets larger than available RAM (gigabytes to terabytes)
  • Performing fast statistical aggregations on massive datasets
  • Creating visualizations and heatmaps of large datasets
  • Building machine learning pipelines on big data
  • Converting between data formats (CSV, HDF5, Arrow, Parquet)
  • Needing lazy evaluation and virtual columns to avoid memory overhead
  • Working with astronomical data, financial time series, or other large-scale scientific datasets

Core Capabilities

Vaex provides six primary capability areas, each documented in detail in the references directory:

1. DataFrames and Data Loading

Load and create Vaex DataFrames from various sources, including files (HDF5, CSV, Arrow, Parquet), pandas DataFrames, NumPy arrays, and dictionaries. See references/core_dataframes.md for:

  • Opening large files efficiently
  • Converting from pandas/NumPy/Arrow
  • Working with example datasets
  • Understanding DataFrame structure

Use cases

  • Processing tabular datasets larger than available RAM (gigabytes to terabytes).
  • Performing fast statistical aggregations on massive datasets.
  • Creating visualizations and heatmaps of large datasets.

Not for

  • Do not rely on this catalog entry alone for installation or maintenance details.

Related skills


Dask

Dask is a Python library for parallel and distributed computing that enables three critical capabilities: - **Larger-than-memory execution**…


Exploratory Data Analysis

Perform comprehensive exploratory data analysis (EDA) on scientific data files across multiple domains. This skill provides automated file t…


GeoPandas

GeoPandas extends pandas to enable spatial operations on geometric types. It combines the capabilities of pandas and shapely for geospatial…


NetworkX

NetworkX is a Python package for creating, manipulating, and analy.
