MiniMax just released M3, and the timing could not be more deliberate.
The open-source community has been watching frontier models pull further ahead on coding and multimodal tasks. Every major release from Anthropic, Google, and OpenAI widens that gap a little more. M3 is MiniMax’s pushback.
And it is a serious one.
M3 is the first open-weight model to combine three capabilities that, until now, had appeared together only in closed-source frontier models: frontier-level coding, a 1M-token context window, and native multimodal support covering image, video, and computer use.
That combination changes what open-source developers can work with.
On SWE-Bench Pro, M3 scores 59.0%, surpassing both GPT-5.5 and Gemini 3.1 Pro and sitting just below Claude Opus 4.7. On Terminal-Bench 2.1, it hits 66.0%. On Claw-Eval, an end-to-end framework for autonomous agents, it scores highest among all tested models.
For developers who have been watching open-source models inch closer to frontier performance, M3 is the clearest signal yet that the gap is closing.
Let me walk you through what changed, what the architecture looks like, and where M3 holds up.