
geniml

Maintainer: K-Dense Inc. · Last updated: April 1, 2026

Geniml is a Python package for building machine learning models on genomic interval data from BED files. It provides unsupervised methods for learning embeddings of genomic regions, single cells, and metadata labels, enabling similarity searches, clustering, and downstream ML tasks.

Tags: Claude Code, OpenClaw, NanoClaw, analysis and processing, experiment reproduction, geniml, bioinformatics, package, bioinformatics & genomics

Original source

K-Dense-AI/claude-scientific-skills

https://github.com/K-Dense-AI/claude-scientific-skills/tree/main/scientific-skills/geniml

Maintainer: K-Dense Inc.
License: BSD-2-Clause
Last updated: April 1, 2026

Skill summary

Key information from SKILL.md

Core notes

  • Geniml is a Python package for building machine learning models on genomic interval data from BED files. It provides unsupervised methods for learning embeddings of genomic regions, single cells, and metadata labels, enabling similarity searches, clustering, and downstream ML tasks.
  • Tokenization example: hard_tokenization(src_folder='bed_files/', dst_folder='tokens/', universe_file='universe.bed', p_value_threshold=1e-9)

Original documentation

SKILL.md excerpt

Installation

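Install geniml using uv. The ML dependencies (PyTorch, etc.) are a separate optional install, and a development version is available from GitHub; the exact commands are given in the source SKILL.md.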

Core Capabilities

Geniml provides five primary capabilities, each detailed in dedicated reference files:

1. Region2Vec: Genomic Region Embeddings

Train unsupervised embeddings of genomic regions using word2vec-style learning.

Use for: Dimensionality reduction of BED files, region similarity analysis, feature vectors for downstream ML.
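
Region similarity analysis on learned embeddings usually comes down to comparing vectors. The following is a minimal sketch in plain Python, assuming the embeddings are already available as lists of floats keyed by region name. The helper names (cosine_similarity, most_similar) and the toy vectors are hypothetical illustrations and are not part of geniml's API.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(
    query: str, embeddings: Dict[str, List[float]], top_k: int = 2
) -> List[Tuple[str, float]]:
    """Rank all other regions by cosine similarity to the query region."""
    q = embeddings[query]
    scored = [
        (name, cosine_similarity(q, vec))
        for name, vec in embeddings.items()
        if name != query
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional vectors, invented for illustration (not real embeddings).
embeddings = {
    "chr1:100-200": [1.0, 0.0, 0.0],
    "chr1:300-400": [0.9, 0.1, 0.0],
    "chr2:50-150": [0.0, 1.0, 0.0],
}

neighbors = most_similar("chr1:100-200", embeddings, top_k=1)
print(neighbors)
```

With real embeddings, the same ranking step would typically be handed to a vector index once the number of regions grows large; the brute-force loop here is only meant to show the comparison itself.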

Workflow:

  1. Tokenize BED files using a universe reference
  2. Train Region2Vec model on tokens
  3. Generate embeddings for regions

Reference: See references/region2vec.md for detailed workflow, parameters, and examples.
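
The first two workflow steps can be pictured with a small plain-Python sketch. It assumes that tokenization replaces each BED region with the universe region(s) it overlaps, and that word2vec-style training consumes (target, context) token pairs drawn from a sliding window. Both helpers (tokenize, context_pairs) are hypothetical illustrations rather than geniml functions, and the p_value_threshold parameter of hard_tokenization is not modeled here.

```python
from typing import List, Tuple

# (chrom, start, end) with half-open coordinates, as in BED files.
Region = Tuple[str, int, int]


def tokenize(regions: List[Region], universe: List[Region]) -> List[str]:
    """Map each region to the universe regions it overlaps (hypothetical helper)."""
    tokens = []
    for chrom, start, end in regions:
        for u_chrom, u_start, u_end in universe:
            # Two half-open intervals overlap when each starts before the other ends.
            if chrom == u_chrom and start < u_end and u_start < end:
                tokens.append(f"{u_chrom}:{u_start}-{u_end}")
    return tokens


def context_pairs(tokens: List[str], window: int = 1) -> List[Tuple[str, str]]:
    """Build skip-gram style (target, context) pairs within a fixed window."""
    pairs = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs


# Toy universe and BED regions, invented for illustration.
universe = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 50, 150)]
bed_regions = [
    ("chr1", 150, 180),  # overlaps chr1:100-200
    ("chr1", 390, 450),  # overlaps chr1:300-400
    ("chr2", 200, 250),  # overlaps nothing in the universe, so it is dropped
    ("chr2", 100, 120),  # overlaps chr2:50-150
]

tokens = tokenize(bed_regions, universe)
pairs = context_pairs(tokens, window=1)
print(tokens)
print(pairs)
```

Regions that overlap nothing in the universe produce no token in this sketch, which is one reason the choice of universe matters for the downstream embeddings.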

When to use

  • Use for training region embeddings (Region2Vec, BEDspace), single-cell ATAC-seq analysis (scEmbed), building consensus peaks (universes), or any ML-based analysis of genomic regions.

When not to use

  • Do not rely on this catalog entry alone for installation or maintenance details.

Related skills

AnnData

AnnData is a Python package for handling annotated data matrices, storing experimental measurements (X) alongside observ…

Tags: Claude Code, OpenClaw, analysis and processing
Source: K-Dense-AI/claude-scientific-skills

Arboreto

Arboreto is a computational library for inferring gene regulatory networks (GRNs) from gene expression data using parall…

Tags: Claude Code, OpenClaw, analysis and processing
Source: K-Dense-AI/claude-scientific-skills

bio-imaging-mass-cytometry-cell-segmentation

Cell segmentation from multiplexed tissue images. Covers deep learning (Cellpose, Mesmer) and classical approaches for n…

Tags: OpenClaw, NanoClaw, analysis and processing
Source: FreedomIntelligence/OpenClaw-Medical-Skills

bio-read-qc-umi-processing

Extract, process, and deduplicate reads using Unique Molecular Identifiers (UMIs) with umi_tools. Use when library prep…

Tags: OpenClaw, NanoClaw, analysis and processing
Source: FreedomIntelligence/OpenClaw-Medical-Skills