Research R0040 — RLHF Alternatives
Run 2026-03-28
Query Q001
Search S02
Result S02-R04

Comprehensive survey of RL methods for LLMs.

Summary

Title: Reinforcement Learning Meets Large Language Models: A Survey of Advancements and Applications Across the LLM Lifecycle
URL: https://arxiv.org/html/2509.16679v1
Date accessed: 2026-03-28
Publication date: 2025-09
Author(s): Multiple authors
Publication: arXiv

Selection Decision

Included in evidence base: No

Rationale: The survey is comprehensive but largely redundant with the individual primary sources already selected (the DPO, CAI, GRPO, and KTO papers). It was used for cross-referencing only and adds no unique evidence.