
Vaex

Maintainer K-Dense Inc. · Last updated April 1, 2026

Vaex is a high-performance Python library designed for lazy, out-of-core DataFrames.

Claude Code · OpenClaw · NanoClaw · Analysis · Reproduction · vaex · data-analysis · package · data analysis & visualization

Original source

K-Dense-AI/claude-scientific-skills

https://github.com/K-Dense-AI/claude-scientific-skills/tree/main/scientific-skills/vaex

Maintainer: K-Dense Inc.
License: MIT license
Last updated: April 1, 2026

Skill Snapshot

Key Details From SKILL.md

Key Notes

  • Vaex is a high-performance Python library designed for lazy, out-of-core DataFrames to process and visualize tabular datasets that are too large to fit into RAM. Vaex can process over a billion rows per second, enabling interactive data exploration and analysis on datasets with billions of rows.
  • df = vaex.open('large_file.hdf5')  # or .csv, .arrow, .parquet

Source Doc

Excerpt From SKILL.md

When to Use This Skill

Use Vaex when:

  • Processing tabular datasets larger than available RAM (gigabytes to terabytes)
  • Performing fast statistical aggregations on massive datasets
  • Creating visualizations and heatmaps of large datasets
  • Building machine learning pipelines on big data
  • Converting between data formats (CSV, HDF5, Arrow, Parquet)
  • Needing lazy evaluation and virtual columns to avoid memory overhead
  • Working with astronomical data, financial time series, or other large-scale scientific datasets

Core Capabilities

Vaex provides six primary capability areas, each documented in detail in the references directory:

1. DataFrames and Data Loading

Load and create Vaex DataFrames from various sources, including files (HDF5, CSV, Arrow, Parquet), pandas DataFrames, NumPy arrays, and dictionaries. See references/core_dataframes.md for:

  • Opening large files efficiently
  • Converting from pandas/NumPy/Arrow
  • Working with example datasets
  • Understanding DataFrame structure

Use cases

  • Processing tabular datasets larger than available RAM (gigabytes to terabytes).
  • Performing fast statistical aggregations on massive datasets.
  • Creating visualizations and heatmaps of large datasets.

Not for

  • Do not rely on this catalog entry alone for installation or maintenance details.
