ar0cket1

User

Online RL for Hermes Agent — self-improving LoRA adapters from human feedback using MIS-PO

2 indexed · 0 Featured · 37 stars · avg score 79

Indexed Skills (2)

The bio shown is the repository description of the top-scored skill, used as a fallback; real GitHub bios will arrive in a future update.