O Open Source Frameworks medium

OpenRLHF

by Community

An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Async RL)

Added 1 June 2026

#large-language-models #proximal-policy-optimization #raylib #reinforcement-learning #reinforcement-learning-from-human-feedback #transformers #visual-language-models #vllm

Overview

OpenRLHF is an open-source framework for agentic reinforcement learning with language and vision-language models. It is built on Ray for distributed scaling and supports multiple RL algorithms including PPO, DAPO, and REINFORCE++. The framework integrates with vLLM for efficient inference and enables asynchronous RL training.

Best for

Developers building large-scale RL training systems for language and vision-language models

Use cases

Training LLMs with reinforcement learning from human feedback (RLHF) at scale
Implementing agentic RL workflows that require distributed compute and async execution
Experimenting with policy gradient methods like PPO or REINFORCE++ on multimodal models
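
For readers new to the policy gradient methods named above, the following is a minimal sketch of the standard PPO clipped surrogate objective in plain Python. It is not taken from OpenRLHF and does not use its API; the function name `ppo_clipped_objective`, its parameters, and the default `clip_eps=0.2` are assumptions chosen for illustration only.

```python
import math


def ppo_clipped_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (to be maximized).

    Hypothetical helper for illustration; not part of OpenRLHF.
    logp_new / logp_old: log-probabilities of the sampled actions under
    the current policy and the policy that generated the samples.
    advantages: advantage estimates for the same samples.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        # Probability ratio between the current and the sampling policy.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# Example call with made-up numbers: three sampled actions.
value = ppo_clipped_objective(
    logp_new=[-1.0, -0.5, -2.0],
    logp_old=[-1.2, -0.5, -1.5],
    advantages=[1.0, -0.5, 2.0],
)
```

A training framework would compute this over token-level log-probabilities on tensors and negate it to obtain a loss; the scalar loop here only shows the arithmetic.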

Notes

9,583 stars on GitHub. Last updated 2026-05-28. Licensed Apache-2.0.

Pros

Uses Ray to distribute training across multi-node clusters
Supports a broad range of modern RL algorithms out of the box
Integrates with vLLM for fast LLM inference during training

Cons

Requires familiarity with Ray and distributed system concepts
Community-maintained, so support and documentation may be limited
Steep learning curve for developers new to RL frameworks

Indexed from awesome-llm and enriched against its public facts.

Pairs with

Other entries in the index that connect to this one.

Uses (1 entry)

O OSS Framework medium

vLLM

Community

A high-throughput and memory-efficient inference and serving engine for LLMs

★ 81,619 updated 1mo ago

Alternative to (2 entries)

O OSS Framework medium

veRL

Community

verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework

★ 21,691 updated 1mo ago

O OSS Framework medium

open-r1

Community

Fully open reproduction of DeepSeek-R1

★ 26,029 updated 3mo ago

Pairs with (1 entry)

O OSS Framework medium

Awesome LLM Human Preference Datasets

Community

A curated list of Human Preference Datasets for LLM fine-tuning, RLHF, and eval.

★ 391 updated 2y ago

Alternatives (3 entries)

O OSS Framework medium

ROLL

Community

An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models

★ 3,193 updated 1mo ago

O OSS Framework medium

TRL

Community

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

O OSS Framework medium

veRL

Community

verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework

★ 21,691 updated 1mo ago
