bio-proteomics-data-import
Maintainer: FreedomIntelligence · Last updated: April 1, 2026
Load and parse mass spectrometry data formats including mzML, mzXML, and quantification tool outputs like MaxQuant proteinGroups.txt. Use when starting a proteomics analysis with raw or processed MS data. Handles contaminant filtering and missing value assessment.
Original source
FreedomIntelligence/OpenClaw-Medical-Skills
https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills/tree/main/skills/bio-proteomics-data-import
- Maintainer: FreedomIntelligence
- License: MIT
- Last updated: April 1, 2026
Skill summary
Key information from SKILL.md
Core instructions
- Python: pyopenms.MzMLFile().load() for raw spectra; pandas.read_csv() for search engine outputs.
- R: MSnbase::readMSData() for raw spectra; read.delim() for MaxQuant/Proteome Discoverer output.
- "Load my mass spec data into Python" → Parse mzML/mzXML raw files or MaxQuant proteinGroups.txt into data structures for programmatic access and downstream analysis.
- Filter out contaminants, reverse hits, and site-only identifications:
  contam_col = 'Potential contaminant' if 'Potential contaminant' in protein_groups.columns else 'Contaminant'
  protein_groups = protein_groups[
      (protein_groups.get(contam_col, '') != '+') &
      (protein_groups.get('Reverse', '') != '+') &
      (protein_groups.get('Only identified by site', '') != '+')
  ]
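The filter above relies on protein_groups.get(col, '') falling back to an empty string when a column is absent. If all three flag columns were missing, the combined mask would reduce to a plain True rather than a boolean Series, which is not a valid row mask. A slightly more defensive variant is sketched below. The names drop_flagged_rows and FLAG_COLUMNS are hypothetical, and the demo table is invented for illustration; only the column names and the '+' marker come from the text above.

```python
import pandas as pd

# Flag columns that mark rows with '+'; older outputs name the first one 'Contaminant'.
FLAG_COLUMNS = ("Potential contaminant", "Contaminant", "Reverse", "Only identified by site")


def drop_flagged_rows(protein_groups: pd.DataFrame) -> pd.DataFrame:
    """Return rows not marked '+' in any flag column that is present (hypothetical helper)."""
    keep = pd.Series(True, index=protein_groups.index)
    for col in FLAG_COLUMNS:
        if col in protein_groups.columns:
            # Unflagged cells are empty and read back as NaN, so normalise before comparing.
            keep &= protein_groups[col].fillna("").astype(str).str.strip() != "+"
    return protein_groups[keep].reset_index(drop=True)


# Toy table standing in for proteinGroups.txt (values invented for illustration).
demo = pd.DataFrame({
    "Protein IDs": ["P1", "CON__P2", "REV__P3", "P4"],
    "Potential contaminant": [None, "+", None, None],
    "Reverse": [None, None, "+", None],
})
filtered = drop_flagged_rows(demo)
print(filtered["Protein IDs"].tolist())
```

Building the mask column by column keeps it a boolean Series even when some or all flag columns are absent, so the function degrades to a no-op instead of failing.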
Original documentation
SKILL.md excerpt
Loading mzML/mzXML Files with pyOpenMS
Goal: Parse raw mass spectrometry data files into memory for programmatic access.
Approach: Load mzML/mzXML into an MSExperiment object, then iterate spectra by MS level to access peaks and precursor info.
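The approach above might look roughly like the following sketch. It assumes pyopenms is installed. The helper names load_experiment and summarize_spectra and the file name sample.mzML are hypothetical. Choosing the reader by file extension is an assumption of this sketch, not something the SKILL.md excerpt specifies.

```python
from collections import Counter


def load_experiment(path):
    """Load an mzML or mzXML file into a pyopenms MSExperiment (hypothetical helper)."""
    import pyopenms as oms  # imported lazily so summarize_spectra works without it

    exp = oms.MSExperiment()
    # Assumption: pick the reader from the file extension.
    if path.lower().endswith(".mzxml"):
        oms.MzXMLFile().load(path, exp)
    else:
        oms.MzMLFile().load(path, exp)
    return exp


def summarize_spectra(spectra):
    """Count spectra per MS level and collect precursor info for MS2 scans."""
    level_counts = Counter()
    precursors = []
    for spec in spectra:
        level = spec.getMSLevel()
        level_counts[level] += 1
        mz, intensity = spec.get_peaks()  # parallel arrays of m/z and intensity
        if level == 2 and spec.getPrecursors():
            prec = spec.getPrecursors()[0]
            precursors.append({
                "rt": spec.getRT(),
                "precursor_mz": prec.getMZ(),
                "charge": prec.getCharge(),
                "n_peaks": len(mz),
            })
    return dict(level_counts), precursors


# Usage with a real file (the path is a placeholder):
# exp = load_experiment("sample.mzML")
# counts, precursors = summarize_spectra(exp.getSpectra())
```

Keeping the file loading separate from the iteration means the summary logic can be exercised on any sequence of spectrum-like objects, without a raw file on disk.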
Loading MaxQuant Output
Goal: Import MaxQuant proteinGroups.txt with contaminant and decoy filtering.
Approach: Read the TSV file, remove reverse hits, contaminants, and site-only identifications, then extract intensity columns.
import pandas as pd

# proteinGroups.txt is tab-separated; low_memory=False reads it in one pass to avoid mixed-dtype columns
protein_groups = pd.read_csv('proteinGroups.txt', sep='\t', low_memory=False)

# Extract intensity columns (LFQ or iBAQ)
intensity_cols = [c for c in protein_groups.columns if c.startswith('LFQ intensity') or c.startswith('iBAQ ')]
# Fall back to per-sample raw intensities when neither LFQ nor iBAQ columns are present
if not intensity_cols:
    intensity_cols = [c for c in protein_groups.columns if c.startswith('Intensity ') and 'Intensity L' not in c]
intensities = protein_groups[['Protein IDs', 'Gene names'] + intensity_cols]
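The skill description also mentions missing value assessment, which the excerpt does not show. One possible sketch follows, assuming that an intensity of 0 denotes a protein that was not quantified in that sample. The helper name missing_value_report and the demo values are hypothetical.

```python
import numpy as np
import pandas as pd


def missing_value_report(intensities: pd.DataFrame, intensity_cols: list) -> pd.DataFrame:
    """Per-sample count and fraction of missing intensities (hypothetical helper)."""
    values = intensities[intensity_cols].apply(pd.to_numeric, errors="coerce")
    # Assumption: a zero intensity means "not quantified", so treat it as missing.
    values = values.replace(0, np.nan)
    return pd.DataFrame({
        "n_missing": values.isna().sum(),
        "frac_missing": values.isna().mean(),
    })


# Toy intensity table (values invented for illustration).
demo = pd.DataFrame({
    "Protein IDs": ["P1", "P2", "P3", "P4"],
    "LFQ intensity A": [1.0e6, 0, 2.5e6, 0],
    "LFQ intensity B": [3.0e6, 4.0e6, 0, 5.0e6],
})
cols = [c for c in demo.columns if c.startswith("LFQ intensity")]
report = missing_value_report(demo, cols)
print(report)
```

A report like this, indexed by sample column, gives a quick view of which samples are sparse before any imputation or filtering decisions are made downstream.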
When to use
- Use when starting a proteomics analysis with raw or processed MS data.
When not to use
- Do not rely on this catalog entry alone for installation or maintenance details.
Related upstream skills
- quantification - Process imported data for quantification
- peptide-identification - Identify peptides from raw spectra
- expression-matrix/counts-ingest - Similar data loading patterns