Arboreto

Maintainer K-Dense Inc. · Last updated April 1, 2026

Arboreto is a computational library for inferring gene regulatory networks (GRNs) from gene expression data using parallelized algorithms.

Claude Code · OpenClaw · NanoClaw · Analysis · Reproduction · arboreto · bioinformatics · package · bioinformatics & genomics

Original source

K-Dense-AI/claude-scientific-skills

https://github.com/K-Dense-AI/claude-scientific-skills/tree/main/scientific-skills/arboreto

License
BSD-3-Clause license

Skill Snapshot

Key Details From SKILL.md

Key Notes

  • Arboreto is a computational library for inferring gene regulatory networks (GRNs) from gene expression data using parallelized algorithms that scale from single machines to multi-node clusters.
  • Core capability: Identify which transcription factors (TFs) regulate which target genes based on expression patterns across observations (cells, samples, conditions).
  • Typical call: network_grnboost = grnboost2(expression_data=matrix), where matrix holds observations as rows and genes as columns.

Source Doc

Excerpt From SKILL.md

Quick Start

Install arboreto (for example, with pip install arboreto), then run a basic GRN inference from a Python script.

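Critical: Always wrap the entry point in an if __name__ == '__main__': guard. Dask spawns new processes, and each one can re-import the main script, so unguarded top-level code would run again in every worker.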
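The guard described above can be sketched as follows. This is a minimal illustration, not the script shipped with the skill: the command-line arguments, the tab-separated input layout (header row of gene names, one observation per row, no index column), the default output name, and the helper names usage and main are all assumptions. Only grnboost2 and its expression_data parameter come from the source.

```python
import sys


def usage(program):
    """Return the usage line for this hypothetical script."""
    return f"usage: {program} EXPRESSION_TSV [OUTPUT_TSV]"


def main(argv):
    """Run GRNBoost2 on a tab-separated expression file (assumed layout)."""
    if len(argv) < 2:
        print(usage(argv[0] if argv else "script.py"))
        return 1

    # Third-party imports are kept inside main() so the usage message
    # still works when pandas or arboreto is not installed.
    import pandas as pd
    from arboreto.algo import grnboost2

    # Assumption: header row holds gene names, each following row is one
    # observation (cell or sample), and there is no index column.
    expression = pd.read_csv(argv[1], sep="\t")

    # Inference starts Dask workers, so it must only be reached
    # through the __main__ guard below.
    network = grnboost2(expression_data=expression)

    output_path = argv[2] if len(argv) > 2 else "network.tsv"
    network.to_csv(output_path, sep="\t", index=False)
    return 0


if __name__ == "__main__":
    # Without this guard, processes that re-import this file
    # would each try to start the inference again.
    main(sys.argv)
```

Keeping all work inside main() means that importing the file has no side effects, which is the property the guard is meant to protect.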

1. Basic GRN Inference

For standard GRN inference workflows including:

  • Input data preparation (Pandas DataFrame or NumPy array)
  • Running inference with GRNBoost2 or GENIE3
  • Filtering by transcription factors
  • Output format and interpretation

See: references/basic_inference.md
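As a rough sketch of the transcription factor filtering and output interpretation steps listed above: the helper names (read_tf_list, tfs_in_data, infer_with_tfs) and the one-name-per-line TF file layout are hypothetical. The tf_names argument and the TF, target, and importance output columns follow arboreto's documented interface as far as I know; check references/basic_inference.md for the authoritative workflow.

```python
def read_tf_list(lines):
    """Parse TF names from an iterable of text lines, one name per line."""
    names = []
    for line in lines:
        name = line.strip()
        if name:  # skip blank lines
            names.append(name)
    return names


def tfs_in_data(tf_names, gene_names):
    """Keep only TFs that appear among the expression matrix's genes."""
    genes = set(gene_names)
    return [tf for tf in tf_names if tf in genes]


def infer_with_tfs(expression, tf_names, top_n=10):
    """Run GRNBoost2 with a restricted regulator set; return the top edges.

    `expression` is assumed to be a pandas DataFrame with observations
    as rows and gene names as column labels.
    """
    from arboreto.algo import grnboost2  # deferred: requires arboreto

    usable = tfs_in_data(tf_names, expression.columns)
    network = grnboost2(expression_data=expression, tf_names=usable)

    # Each row is one putative TF -> target link; a higher importance
    # score indicates a stronger inferred regulatory relationship.
    return network.sort_values("importance", ascending=False).head(top_n)
```

Call infer_with_tfs from inside an if __name__ == '__main__': guard, since it starts Dask workers. Checking the length of the tfs_in_data result first is a cheap way to catch naming mismatches (for example, gene symbols versus Ensembl IDs) before a long run.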

For standard inference tasks, use the ready-to-run script scripts/basic_grn_inference.py.

2. Algorithm Selection

Arboreto provides two algorithms:

GRNBoost2 (Recommended):

  • Fast gradient boosting-based inference
  • Optimized for large datasets (10k+ observations)
  • Default choice for most analyses

GENIE3:

  • Random Forest-based inference
  • Original multiple regression approach
  • Use for comparison or validation

Quick comparison:

from arboreto.algo import grnboost2, genie3
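Building on the import above, one possible way to use GENIE3 for comparison or validation is to check how many top-ranked edges the two algorithms share. This is only a sketch: the Jaccard overlap metric, the cutoff n, the seed value, and the helper names are my assumptions, not something the source prescribes.

```python
def top_edges(network, n):
    """Return the n highest-importance (TF, target) pairs as a set.

    `network` is any iterable of (tf, target, importance) triples.
    """
    ranked = sorted(network, key=lambda edge: edge[2], reverse=True)
    return {(tf, target) for tf, target, _ in ranked[:n]}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def compare_algorithms(expression, n=1000, seed=777):
    """Run both algorithms on the same data and compare their top-n edges."""
    from arboreto.algo import grnboost2, genie3  # deferred: requires arboreto

    net_gb = grnboost2(expression_data=expression, seed=seed)
    net_g3 = genie3(expression_data=expression, seed=seed)

    # Convert each result DataFrame into plain (TF, target, importance) tuples.
    cols = ["TF", "target", "importance"]
    edges_gb = top_edges(net_gb[cols].itertuples(index=False, name=None), n)
    edges_g3 = top_edges(net_g3[cols].itertuples(index=False, name=None), n)
    return jaccard(edges_gb, edges_g3)
```

Importance scores from the two algorithms are on different scales, so comparing ranked edge sets is safer than comparing raw scores. As with any arboreto call, compare_algorithms should be invoked from inside an if __name__ == '__main__': guard.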

Use cases

  • Use when analyzing transcriptomics data (bulk RNA-seq, single-cell RNA-seq) to identify transcription factor-target gene relationships and regulatory interactions.

Not for

  • Do not rely on this catalog entry alone for installation or maintenance details.

Related skills

  • AnnData (K-Dense-AI/claude-scientific-skills): AnnData is a Python package for handling annotated data matrices, storing experimental measurements (X) alongside observation metadata (obs)…
  • bio-imaging-mass-cytometry-cell-segmentation (FreedomIntelligence/OpenClaw-Medical-Skills): Cell segmentation from multiplexed tissue images. Covers deep learning (Cellpose, Mesmer) and classical approaches for nuclear and whole-cel…
  • bio-read-qc-umi-processing (FreedomIntelligence/OpenClaw-Medical-Skills): Extract, process, and deduplicate reads using Unique Molecular Identifiers (UMIs) with umi_tools. Use when library prep includes UMIs and ac…
  • bio-single-cell-batch-integration (FreedomIntelligence/OpenClaw-Medical-Skills): Integrate multiple scRNA-seq samples/batches using Harmony, scVI, Seurat anchors, and fastMNN. Remove technical variation while preserving b…