Chinese AI Startup MiniMax Pursues Major Speed Gains with Sparse Attention Mechanism for M3 Model

MiniMax, a Chinese AI company, is developing a new sparse attention mechanism for its upcoming M3 model, which yields up to 15.6 times faster decoding. The company has also released a technical report on its M2 series of language models, which posted top benchmark results among open-source AI models.