Tag: reinforcement-learning

All the articles with the tag "reinforcement-learning".

Training Composer 2: How Cursor Builds a Coding Agent Model

27 May, 2026

A structured walkthrough of Sasha Rush's Training Composer 2 workshop: why Cursor chose Kimi K2.5, how continued pretraining and long-horizon RL fit together, what CursorBench measures, and where Composer is headed.

Kimi K2.5: Joint Text–Vision Training and the Agent Swarm

19 May, 2026

A walkthrough of two ideas behind Kimi K2.5: how joint text–vision pre-training and RL make each modality help the other, and how Agent Swarm replaces sequential tool use with a learned parallel orchestrator.