bio-proteomics-data-import

Maintainer: FreedomIntelligence · Last updated: April 1, 2026

Load and parse mass spectrometry data formats including mzML, mzXML, and quantification tool outputs like MaxQuant proteinGroups.txt. Use when starting a proteomics analysis with raw or processed MS data. Handles contaminant filtering and missing value assessment.

Tags: OpenClaw · NanoClaw · Analysis & processing · Reproducible experiments · bio-proteomics-data-import · 🧬 bioinformatics (gptomics bio-* suite) · bioinformatics - proteomics & metabolomics · load

Original source

FreedomIntelligence/OpenClaw-Medical-Skills

https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills/tree/main/skills/bio-proteomics-data-import

Maintainer: FreedomIntelligence
License: MIT
Last updated: April 1, 2026

Skill summary

Key information from SKILL.md

Core instructions

  • Python: pyopenms.MzMLFile().load() for raw spectra; pandas.read_csv() for search engine outputs.
  • R: MSnbase::readMSData() for raw data; read.delim() for MaxQuant/Proteome Discoverer output.
  • "Load my mass spec data into Python" → Parse mzML/mzXML raw files or MaxQuant proteinGroups.txt into data structures for programmatic access and downstream analysis.
  • MaxQuant filtering (drop rows flagged '+' as contaminant, reverse hit, or site-only identification):
      contam_col = 'Potential contaminant' if 'Potential contaminant' in protein_groups.columns else 'Contaminant'
      protein_groups = protein_groups[
          (protein_groups.get(contam_col, '') != '+')
          & (protein_groups.get('Reverse', '') != '+')
          & (protein_groups.get('Only identified by site', '') != '+')
      ]

Original documentation

SKILL.md excerpt

Loading mzML/mzXML Files with pyOpenMS

Goal: Parse raw mass spectrometry data files into memory for programmatic access.

Approach: Load mzML/mzXML into an MSExperiment object, then iterate spectra by MS level to access peaks and precursor info.
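The approach above can be sketched roughly as follows. This is a minimal illustration, not the code from SKILL.md: the helper names `load_experiment` and `summarize_spectra` are hypothetical, and choosing the reader from the file extension is an assumption. The pyOpenMS import is deferred to inside the loader so the summary helper can be used with any objects that expose the same accessor methods.

```python
from collections import Counter


def load_experiment(path):
    """Load an mzML or mzXML file into a pyOpenMS MSExperiment (hypothetical helper)."""
    import pyopenms as oms  # imported lazily; only needed when reading real files

    exp = oms.MSExperiment()
    # Assumption: the file extension tells us which reader to use.
    if path.lower().endswith(".mzxml"):
        oms.MzXMLFile().load(path, exp)
    else:
        oms.MzMLFile().load(path, exp)
    return exp


def summarize_spectra(spectra):
    """Count spectra per MS level and collect precursor info for MS2 scans."""
    level_counts = Counter()
    precursors = []
    for spec in spectra:
        level = spec.getMSLevel()
        level_counts[level] += 1
        if level == 2:
            mz, intensity = spec.get_peaks()  # parallel m/z and intensity arrays
            for prec in spec.getPrecursors():
                precursors.append({
                    "rt": spec.getRT(),
                    "precursor_mz": prec.getMZ(),
                    "charge": prec.getCharge(),
                    "n_peaks": len(mz),
                })
    return dict(level_counts), precursors
```

With pyOpenMS installed, a call such as `counts, precursors = summarize_spectra(load_experiment("sample.mzML"))` (the file name is a placeholder) would return the number of spectra at each MS level and one record per MS2 precursor.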

Loading MaxQuant Output

Goal: Import MaxQuant proteinGroups.txt with contaminant and decoy filtering.

Approach: Read the TSV file, remove reverse hits, contaminants, and site-only identifications, then extract intensity columns.

import pandas as pd

# Read the tab-separated MaxQuant output
protein_groups = pd.read_csv('proteinGroups.txt', sep='\t', low_memory=False)

# Extract intensity columns (LFQ or iBAQ)
intensity_cols = [c for c in protein_groups.columns if c.startswith('LFQ intensity') or c.startswith('iBAQ ')]
if not intensity_cols:
    # Fall back to plain 'Intensity <sample>' columns, excluding any column containing 'Intensity L'
    intensity_cols = [c for c in protein_groups.columns if c.startswith('Intensity ') and 'Intensity L' not in c]
intensities = protein_groups[['Protein IDs', 'Gene names'] + intensity_cols]
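The excerpt above does not show the filtering step or the missing value assessment mentioned in the skill description. The following is a minimal sketch of both, operating on an already loaded DataFrame. The function names `filter_protein_groups` and `missing_value_summary` are hypothetical. It assumes that flagged rows carry a '+' in the flag columns and that a zero intensity means "not quantified"; verify both against your own MaxQuant output.

```python
import numpy as np
import pandas as pd


def filter_protein_groups(df):
    """Drop rows flagged '+' as contaminant, reverse (decoy) hit, or site-only."""
    contam_col = ("Potential contaminant"
                  if "Potential contaminant" in df.columns else "Contaminant")
    keep = pd.Series(True, index=df.index)
    for col in (contam_col, "Reverse", "Only identified by site"):
        if col in df.columns:  # skip flag columns that are absent from this file
            keep &= df[col].ne("+")
    return df[keep]


def missing_value_summary(df, intensity_cols):
    """Fraction of missing values per sample column.

    Assumption: an intensity of 0 is treated as missing, alongside NaN.
    """
    values = df[intensity_cols].replace(0, np.nan)
    return values.isna().mean()
```

Building the mask column by column avoids indexing the DataFrame with a bare scalar when none of the flag columns are present, which is the edge case in the one-line `.get(...)` form of the filter.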

When to use

  • Use when starting a proteomics analysis with raw or processed MS data.

When not to use

  • Do not rely on this catalog entry alone for installation or maintenance details.

Related upstream skills

  • quantification - Process imported data for quantification
  • peptide-identification - Identify peptides from raw spectra
  • expression-matrix/counts-ingest - Similar data loading patterns
