Add blog post: Async RL from Scratch on TPUs #10
Draft
AlienKevin wants to merge 8 commits into
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fix grammatical errors and typos in the intro paragraph (learns→learn, missing "is", its/it's, notoriety, October 2025, etc.) and add a collapsible dropdown summarizing prior JAX RL repos we evaluated and why none fit our needs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add markdown="1" so Kramdown renders markdown inside <details>
- Bold and enlarge the dropdown summary text with more whitespace
- Remove GitHub star counts from all library entries
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove AMA inline notes, fix typos, reframe speculative claims as hypotheses, remove non-RL resource contention bullet, add stable run evidence after vLLM fix, expand evaluation lessons with estimator details, and differentiate lesson 3 from lesson 1.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Our internal deep dives highlight the Marin team’s amazing progress. The broader community can also benefit from the insights and lessons we uncover. Blog posts are a great way to share that knowledge publicly and create lasting reference points for community discussion. They are also relatively quick to produce, since much of the content already exists in the slides prepared for the deep dives.
To help kick off this effort, I wrote a blog post based on my async RL deep dive. Thanks to Ahmed for suggesting the RL blog post idea. In addition to the post itself, I added a few improvements to make our blogs easier to discover, including surfacing recent posts on the home page and adding a blog link to the navigation bar.