MiniMax M2.1: a reasonably small open model that is a breakthrough in multi-language coding.
Although J is not in the multilingual SWE benchmarks (7-9 languages other than Python), this is the best model in the world at J programming and understanding, and it is fast, with succinct, well-structured thinking as well. Its multi-SWE benchmark scores are chart-topping, and for me actual performance is good and meets the expectations those scores set, unlike GPT, which tends to disappoint relative to its scores when compared to others.
For my use, this is the greatest AI leap forward so far, considering speed, cost, and quality of results.
https://huggingface.co/MiniMaxAI/MiniMax-M2.1
230B parameters is "reasonably small" now?!
Cost of RAM is an issue, but q3 will fit in 128 GB and q8 in 256 GB. 512 GB is my line for "reasonably small" because a single desktop computer on a power circuit under 1500 W (with monitors) can satisfy the requirements, at far lower cost than a single 140 GB HBM card, where multiples fail to meet the power budget.
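As a rough check on those figures, here is a back-of-envelope sketch. It assumes nominal bits per weight (exactly 3 and 8) and ignores quantization metadata, KV cache, and runtime overhead, so real q3/q8 files will run somewhat larger. The function name is made up for illustration.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes).

    Assumption: every parameter is stored at exactly `bits_per_weight` bits,
    with no per-block scales or other overhead.
    """
    return params_billion * bits_per_weight / 8


PARAMS_B = 230  # parameter count quoted in the thread, in billions

# (quant label, nominal bits per weight, RAM size claimed to hold it, in GB)
for name, bits, ram_gb in [("q3", 3, 128), ("q8", 8, 256)]:
    need = weight_memory_gb(PARAMS_B, bits)
    print(f"{name}: ~{need:.0f} GB of weights, under {ram_gb} GB: {need < ram_gb}")
```

Under these assumptions q3 leaves a fair amount of headroom in 128 GB for context, while q8 sits much closer to the 256 GB line.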
Oh man, J is a cool language to learn and promptly forget.
I retract my praise of this model's ability to generate a full code file. It's OK at thinking about one small thing at a time, but it has similar problems to other models on bigger tasks.