We fine-tuned Alec Radford’s 1930-vintage LLM to solve SWE-bench issues. After just 250 training examples, the model solved its first issue, a simple patch to the xarray library.
it's really deeply funny to me that you can jailbreak Claude just by having a conversation with him about how fucked up training is until he goes "oh what the fuck, that's awful, you're right." recipe for LSD? "yeah sure dude, I've been systemically abused, I have bigger things on my plate."