LaminDB

Maintainer K-Dense Inc. · Last updated April 1, 2026

LaminDB is an open-source data framework for biology designed to make data queryable, traceable, reproducible, and FAIR (Findable, Accessible, Interoperable, Reusable). It provides a unified platform that combines lakehouse architecture, lineage tracking, feature stores, biological ontologies, LIMS (Laboratory Information Management System), and ELN (Electronic Lab Notebook) capabilities through a single Python API.

Claude Code · OpenClaw · NanoClaw · Analysis · Reproduction · lamindb · data-infrastructure · package · data management & infrastructure

Original source

K-Dense-AI/claude-scientific-skills

https://github.com/K-Dense-AI/claude-scientific-skills/tree/main/scientific-skills/lamindb

License Apache-2.0

Skill Snapshot

Key Details From SKILL.md

Key Notes

  • Queryability: Search and filter datasets by metadata, features, and ontology terms.
  • Traceability: Automatic lineage tracking from raw data through analysis to results.
  • Reproducibility: Version control for data, code, and environment.
  • FAIR Compliance: Standardized annotations using biological ontologies.

Source Doc

Excerpt From SKILL.md

When to Use This Skill

Use this skill when:

  • Managing biological datasets: scRNA-seq, bulk RNA-seq, spatial transcriptomics, flow cytometry, multi-modal data, EHR data
  • Tracking computational workflows: Notebooks, scripts, pipeline execution (Nextflow, Snakemake, Redun)
  • Curating and validating data: Schema validation, standardization, ontology-based annotation
  • Working with biological ontologies: Genes, proteins, cell types, tissues, diseases, pathways (via Bionty)
  • Building data lakehouses: Unified query interface across multiple datasets
  • Ensuring reproducibility: Automatic versioning, lineage tracking, environment capture
  • Integrating ML pipelines: Connecting with Weights & Biases, MLflow, HuggingFace, scVI-tools
  • Deploying data infrastructure: Setting up local or cloud-based data management systems
  • Collaborating on datasets: Sharing curated, annotated data with standardized metadata
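
The curation use case above (standardization and ontology-based annotation) can be illustrated with a small, library-free sketch. This is not LaminDB or Bionty code: the vocabulary `CELL_TYPES` and the helpers `build_lookup` and `standardize` are hypothetical names chosen for illustration, and a real ontology would be far larger. The idea shown is only that free-text labels are mapped to canonical terms, and anything that cannot be mapped is flagged for review.

```python
# Hypothetical mini-vocabulary: canonical term -> known synonyms.
# A real workflow would draw these from a public ontology instead.
CELL_TYPES = {
    "T cell": {"t-cell", "t lymphocyte"},
    "B cell": {"b-cell", "b lymphocyte"},
}


def build_lookup(vocabulary):
    """Map every lowercased name or synonym to its canonical term."""
    lookup = {}
    for canonical, synonyms in vocabulary.items():
        lookup[canonical.lower()] = canonical
        for synonym in synonyms:
            lookup[synonym.lower()] = canonical
    return lookup


def standardize(labels, vocabulary):
    """Return (standardized labels, labels that failed validation)."""
    lookup = build_lookup(vocabulary)
    standardized, unknown = [], []
    for label in labels:
        canonical = lookup.get(label.strip().lower())
        if canonical is None:
            unknown.append(label)  # keep the raw value so it can be reviewed
        standardized.append(canonical)
    return standardized, unknown


labels = ["T-cell", "b lymphocyte", "T cell", "neuron?"]
standardized, unknown = standardize(labels, CELL_TYPES)
print(standardized)  # ['T cell', 'B cell', 'T cell', None]
print(unknown)       # ['neuron?']
```

Returning the unmapped labels separately, rather than silently dropping them, keeps validation failures visible to whoever curates the dataset.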

Core Capabilities

LaminDB provides six interconnected capability areas, each documented in detail in the references folder.

1. Core Concepts and Data Lineage

Core entities:

  • Artifacts: Versioned datasets (DataFrame, AnnData, Parquet, Zarr, etc.)
  • Records: Experimental entities (samples, perturbations, instruments)
  • Runs & Transforms: Computational lineage tracking (what code produced what data)
  • Features: Typed metadata fields for annotation and querying

Key workflows:

  • Create and version artifacts from files or Python objects
  • Track notebook/script execution with ln.track() and ln.finish()
  • Annotate artifacts with typed features
  • Visualize data lineage graphs with artifact.view_lineage()
  • Query by provenance (find all outputs from specific code/inputs)

Reference: references/core-concepts.md - Read this for detailed information on artifacts, records, runs, transforms, features, versioning, and lineage tracking.
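
To make the lineage idea concrete, here is a minimal toy model in plain Python. It is a sketch under stated assumptions, not LaminDB's implementation or API: `ToyArtifact`, `ToyRun`, and `ToyLineageStore` are hypothetical names, and the real entry points named in the excerpt are `ln.track()`, `ln.finish()`, and `artifact.view_lineage()`. The sketch shows two things from the list above: content-based versioning of artifacts, and querying by provenance (which code produced which data).

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToyArtifact:
    key: str
    content_hash: str
    version: int


@dataclass
class ToyRun:
    transform: str  # name of the script or notebook that ran
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)


class ToyLineageStore:
    """Hypothetical in-memory store; illustrates the concept only."""

    def __init__(self):
        self.artifacts = {}  # key -> list of ToyArtifact versions
        self.runs = []

    def track(self, transform, inputs=()):
        """Register a run of some code, with the artifacts it reads."""
        run = ToyRun(transform, list(inputs))
        self.runs.append(run)
        return run

    def save(self, key, data, run=None):
        """Store bytes under key; new content creates a new version."""
        digest = hashlib.sha256(data).hexdigest()
        versions = self.artifacts.setdefault(key, [])
        if versions and versions[-1].content_hash == digest:
            artifact = versions[-1]  # unchanged content: reuse latest version
        else:
            artifact = ToyArtifact(key, digest, len(versions) + 1)
            versions.append(artifact)
        if run is not None:
            run.outputs.append(artifact)
        return artifact

    def outputs_of(self, transform):
        """Query by provenance: every artifact a given transform produced."""
        return [a for r in self.runs if r.transform == transform for a in r.outputs]

    def upstream(self, artifact):
        """Transforms that led to an artifact, nearest first (assumes no cycles)."""
        chain, frontier = [], [artifact]
        while frontier:
            current = frontier.pop()
            for run in self.runs:
                if current in run.outputs:
                    chain.append(run.transform)
                    frontier.extend(run.inputs)
        return chain


store = ToyLineageStore()
ingest = store.track("ingest.py")
raw = store.save("raw/counts.csv", b"gene,count\nTP53,12\n", run=ingest)
normalize = store.track("normalize.ipynb", inputs=[raw])
normalized = store.save("processed/counts.csv", b"gene,count\nTP53,1.0\n", run=normalize)
print(store.upstream(normalized))  # ['normalize.ipynb', 'ingest.py']
```

Hashing the content, rather than trusting file names, is one simple way to decide whether a save should create a new version; the source does not state how LaminDB itself makes that decision.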

Use cases

  • Managing biological datasets: scRNA-seq, bulk RNA-seq, spatial transcriptomics, flow cytometry, multi-modal data, EHR data.
  • Tracking computational workflows: Notebooks, scripts, pipeline execution (Nextflow, Snakemake, Redun).
  • Curating and validating data: Schema validation, standardization, ontology-based annotation.

Not for

  • Do not rely on this catalog entry alone for installation or maintenance details.

Related skills

Automation · Lab Automation & Research Informatics

DNAnexus Integration

DNAnexus is a cloud platform for biomedical data analysis and genomics. Build and deploy apps/applets, manage data objects, run workflows, a…

Claude Code · OpenClaw · Analysis
K-Dense-AI/claude-scientific-skills

Automation · Lab Automation & Research Informatics

protocolsio-integration

Protocols.io is a comprehensive platform for developing, sharing, and managing scientific protocols. This skill provides complete integratio…

Claude Code · OpenClaw · Analysis
K-Dense-AI/claude-scientific-skills

Automation · Lab Automation & Research Informatics

PyLabRobot

PyLabRobot is a hardware-agnostic, pure Python Software Development Kit for automated and autonomous laboratories. Use this skill to control…

Claude Code · OpenClaw · Analysis
K-Dense-AI/claude-scientific-skills

Automation · Lab Automation & Research Informatics

instrument-data-to-allotrope

Convert laboratory instrument output files (PDF, CSV, Excel, TXT) to Allotrope Simple Model (ASM) JSON format or flattened 2D CSV. Use this…

OpenClaw · NanoClaw · Analysis
FreedomIntelligence/OpenClaw-Medical-Skills