Datacurve releases DeepSWE coding benchmark with GPT-5.5 as leader at 70%

rank 0 · 0 points · 1 source · primary Techmeme

open source

Summary

Datacurve released DeepSWE, a coding benchmark of 113 tasks spanning 91 open-source repositories and five languages. GPT-5.5 leads at 70%, a result that challenges the earlier narrative that top AI models perform roughly equally.

Why it matters

High

