May 1, 2026
Fast, cheap, and fighting for respect
LFM2-24B-A2B: Scaling Up the LFM2 Architecture
A new AI just dropped, and the internet is split between “shockingly fast” and “still not Gemma”
TL;DR: Liquid AI released a new open AI model designed to run fast on ordinary machines, even some laptops, and early users say it’s surprisingly speedy. But the comments quickly turned into a showdown over whether speed matters if rivals like Gemma and Qwen still give better, steadier answers.
Liquid AI unveiled LFM2-24B-A2B, its biggest model yet: an open AI system that’s supposed to be cheap to run, light enough for some everyday laptops, and fast even without a fancy graphics card. On paper, that’s a big deal. In the comments, though, the real action was less “wow, groundbreaking” and more “okay, but how good is it actually?”
The cheer squad showed up fast. One user said it was among the first models to hit around 20 tokens per second on a laptop, which in normal-person language means it spits out answers surprisingly quickly. Another praised it as a great option if you don’t have a GPU, basically crowning it the people’s champ for the underpowered laptop crowd. There was even some excited tinkering energy, with one commenter already dreaming about plugging it into local coding tools and seeing what chaos happens.
But the skeptics were not having a quiet day. One blunt hot take said that if you do have even a modest graphics card, there are simply better models, name-dropping Gemma and Qwen as the obvious favorites. Another commenter threw a harsher punch, saying past LFM models had “serious coherence issues,” which is community-speak for: sure, it’s fast, but does it stay sensible for long? And then came the classic buzzkill: one user pointed out this is still an early checkpoint, basically asking whether everyone is hyping a trailer instead of the full movie.
So the vibe is deliciously messy: speed freaks are impressed, laptop users are thrilled, and quality snobs are side-eyeing the whole thing. The community verdict so far? Cool architecture, fun local toy, but the comments section is still holding the real launch party.
Key Points
- Liquid AI released an early open-weight checkpoint of LFM2-24B-A2B, a sparse MoE model with 24B total parameters and about 2.3B active parameters per forward pass.
- The model extends the LFM2 family from 350M to 24B-A2B and is described as fitting in 32GB RAM for deployment on cloud, edge, and consumer systems.
- LFM2-24B-A2B scales up from LFM2-8B-A1B by increasing depth from 24 to 40 layers and experts from 32 to 64 per MoE block, while keeping top-4 routing and a lean active path.
- The article says benchmark quality improves approximately log-linearly across the LFM2 family on GPQA Diamond, MMLU-Pro, IFEval, IFBench, GSM8K, and MATH-500.
- The model has day-zero support in llama.cpp, vLLM, and SGLang, with multiple GGUF quantization formats and reported comparisons against Qwen3-30B-A3B-Instruct-2507 and gpt-oss-20b on AMD Ryzen AI Max+ 395.
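For readers wondering what “top-4 routing with a lean active path” actually means, here is a minimal sketch of a sparse MoE forward pass. The shapes, the router, and the softmax-over-selected-experts gating are illustrative assumptions about top-k MoE layers in general, not LFM2’s actual implementation:

```python
import numpy as np

def moe_forward(x, gate_w, experts, top_k=4):
    """Sparse MoE layer: route one token to its top-k experts.

    x: (d,) token activation; gate_w: (num_experts, d) router weights;
    experts: list of callables, one per expert.
    Hypothetical shapes -- a generic top-k router, not LFM2's exact layer.
    """
    logits = gate_w @ x                    # one router score per expert
    top = np.argsort(logits)[-top_k:]      # indices of the top-k experts
    weights = np.exp(logits[top])
    weights /= weights.sum()               # softmax over the selected experts only
    # Only top_k experts actually run -- this is the "lean active path":
    # most parameters sit idle on any given token.
    return sum(w * experts[i](x) for i, w in zip(top, weights))

rng = np.random.default_rng(0)
d, num_experts = 8, 64                     # 64 experts per block, as in the model card
gate_w = rng.standard_normal((num_experts, d))
mats = [rng.standard_normal((d, d)) for _ in range(num_experts)]
experts = [lambda x, W=W: W @ x for W in mats]
y = moe_forward(rng.standard_normal(d), gate_w, experts)
print(y.shape)  # (8,)
```

This is how a 24B-parameter model can behave like a ~2B one at inference time: per token, only 4 of the 64 expert networks in each MoE block do any work, so compute and memory bandwidth track the active parameters, not the total count.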