
PufferLib

Maintainer: K-Dense Inc. · Last updated: April 1, 2026

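PufferLib is a high-performance reinforcement learning library designed for fast parallel environment simulation and training. It achieves training at millions of steps per second through optimized vectorization, native multi-agent support, and an efficient PPO implementation (PuffeRL).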

Claude Code · Training orchestration · Evaluation and comparison · pufferlib · machine-learning · package · machine learning & deep learning

Original source

K-Dense-AI/claude-scientific-skills

https://github.com/K-Dense-AI/claude-scientific-skills/tree/main/scientific-skills/pufferlib

Maintainer: K-Dense Inc.
License: MIT
Last updated: April 1, 2026

Skill summary

Key information from SKILL.md


Key points

  • PufferLib is a high-performance reinforcement learning library designed for fast parallel environment simulation and training. It achieves training at millions of steps per second through optimized vectorization, native multi-agent support, and an efficient PPO implementation (PuffeRL). The library provides the Ocean suite of 20+ environments and seamless integration with Gymnasium, PettingZoo, and specialized RL frameworks.
  • Example training command: puffer train procgen-coinrun --train.device cuda --train.learning-rate 3e-4
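The command above passes configuration overrides as dotted flags. As a minimal sketch for scripting such invocations, the helper below assembles the same argument list from a dictionary. The function name `build_puffer_command` is hypothetical, and only the environment name and the two flags shown in the summary are used; no other flags are assumed to exist.

```python
import shlex


def build_puffer_command(env_name, overrides):
    """Assemble a `puffer train` argument list from dotted config overrides.

    Hypothetical helper for illustration; it only formats strings and does
    not check that a flag is actually accepted by the CLI.
    """
    args = ["puffer", "train", env_name]
    for key, value in overrides.items():
        args += [f"--{key}", str(value)]
    return args


# Values are passed as strings so they appear exactly as typed (e.g. "3e-4").
cmd = build_puffer_command(
    "procgen-coinrun",
    {"train.device": "cuda", "train.learning-rate": "3e-4"},
)
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd)` on a machine where the `puffer` CLI is installed.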

Original documentation

SKILL.md excerpt

When to Use This Skill

Use this skill when:

  • Training RL agents with PPO on any environment (single or multi-agent)
  • Creating custom environments using the PufferEnv API
  • Optimizing performance for parallel environment simulation (vectorization)
  • Integrating existing environments from Gymnasium, PettingZoo, Atari, Procgen, etc.
  • Developing policies with CNN, LSTM, or custom architectures
  • Scaling RL to millions of steps per second for faster experimentation
  • Multi-agent RL with native multi-agent environment support
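To make the idea of parallel environment simulation concrete, here is a minimal sketch of a synchronous vectorized wrapper in plain Python. It is not PufferLib's API or implementation: `CountingEnv` and `SyncVectorEnv` are hypothetical names, and the toy environment uses a simplified `step` return of (observation, reward, done). It only shows the basic pattern of stepping several environment copies in lockstep and resetting the ones that finish.

```python
class CountingEnv:
    """Hypothetical toy environment: the observation is a step counter."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        reward = float(action)  # reward equals the action, for illustration only
        done = self.t >= self.horizon
        return self.t, reward, done


class SyncVectorEnv:
    """Steps N environment copies in lockstep and auto-resets finished ones."""

    def __init__(self, make_env, num_envs):
        self.envs = [make_env() for _ in range(num_envs)]

    def reset(self):
        return [env.reset() for env in self.envs]

    def step(self, actions):
        obs, rewards, dones = [], [], []
        for env, action in zip(self.envs, actions):
            o, r, d = env.step(action)
            if d:
                o = env.reset()  # start a new episode immediately
            obs.append(o)
            rewards.append(r)
            dones.append(d)
        return obs, rewards, dones


vec = SyncVectorEnv(lambda: CountingEnv(horizon=3), num_envs=4)
obs = vec.reset()
total_steps = 0
for _ in range(6):
    actions = [1] * len(obs)
    obs, rewards, dones = vec.step(actions)
    total_steps += len(obs)  # each call advances every environment by one step
print(total_steps)
```

A loop like this is bounded by the slowest environment and by Python overhead, which is the kind of bottleneck an optimized vectorization layer is meant to remove.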

1. High-Performance Training (PuffeRL)

PuffeRL is PufferLib's optimized PPO+LSTM training algorithm achieving 1M-4M steps/second.
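For readers new to PPO, the sketch below computes the standard clipped surrogate objective, mean of min(r·A, clip(r, 1−ε, 1+ε)·A), where r is the probability ratio between the new and old policy and A is the advantage. This is the generic textbook form in plain Python, not PuffeRL's code; the function name and the default ε = 0.2 are assumptions chosen for illustration.

```python
def ppo_clip_objective(ratios, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective (a quantity to maximize).

    ratios:     per-sample pi_new(a|s) / pi_old(a|s)
    advantages: per-sample advantage estimates
    """
    total = 0.0
    for r, a in zip(ratios, advantages):
        clipped = max(1.0 - clip_eps, min(r, 1.0 + clip_eps))
        # Taking the minimum removes the incentive to push the ratio
        # outside the [1 - eps, 1 + eps] interval.
        total += min(r * a, clipped * a)
    return total / len(ratios)


value = ppo_clip_objective([1.5, 0.5, 1.0], [1.0, -1.0, 2.0])
print(value)
```

In the example, the first sample is clipped from 1.5 to 1.2 and the second from 0.5 to 0.8, so large policy changes contribute no more than moderate ones.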

Quick start training:


# Distributed training
torchrun --nproc_per_node=4 train.py

import pufferlib
from pufferlib import PuffeRL
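The torchrun command in the excerpt starts four worker processes on one node and exports `RANK`, `LOCAL_RANK`, and `WORLD_SIZE` to each of them. The excerpt does not show the contents of `train.py`, so the following is only a sketch of how a training script might read that context; the function name `read_distributed_context` is hypothetical.

```python
import os


def read_distributed_context(environ=os.environ):
    """Read the rank information that torchrun exports to each worker.

    Falls back to single-process defaults when the variables are absent,
    e.g. when the script is started with plain `python`.
    """
    return {
        "rank": int(environ.get("RANK", 0)),
        "local_rank": int(environ.get("LOCAL_RANK", 0)),
        "world_size": int(environ.get("WORLD_SIZE", 1)),
    }


ctx = read_distributed_context()
print(ctx)
```

A script would typically use `local_rank` to pick its GPU and `world_size` to split work across processes.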

When to use

  • **Training RL agents** with PPO on any environment (single or multi-agent).
  • **Creating custom environments** using the PufferEnv API.

When not to use

  • Do not rely on this catalog entry alone for installation or maintenance details.

Related skills

bio-epitranscriptomics-m6anet-analysis

Nanopore direct RNA m6A detection with m6Anet deep learning.

OpenClaw · NanoClaw · Training orchestration
FreedomIntelligence/OpenClaw-Medical-Skills

bio-imaging-mass-cytometry-interactive-annotation

Interactive cell type annotation for IMC data. Covers napari-based ann…

OpenClaw · NanoClaw · Training orchestration
FreedomIntelligence/OpenClaw-Medical-Skills

bio-immunoinformatics-tcr-epitope-binding

Predict TCR-epitope specificity using ERGO-II and deep learning models for T-cell receptor antig…

OpenClaw · NanoClaw · Training orchestration
FreedomIntelligence/OpenClaw-Medical-Skills

cirq

Cirq is Google Quantum AI's open-source framework for designing, simulating, and running quantum circuits on quantum computers…

Claude Code · Training orchestration
K-Dense-AI/claude-scientific-skills