Thoughts on Scaling Taste
Subjective capabilities require more than creating pseudo-verification to plug back into RLVR approaches that worked for math/coding. It requires deciding what/when/how to introduce human supervision, creative ways to generate/insert it, and learning good representations of human data to maximally utilize it.
Read on Notion →