So, you want to chunk really fast?

Blazing-fast text splitter drops — community splits over speed vs sense

TLDR: Dev team unveils memchunk, a super‑fast way to split text for AI by cutting at sentence breaks. Commenters are split: speed fans cheer, skeptics ask for accuracy, language coverage, and real‑world wins, while others press to merge it back into Chonkie and prove it at massive scale.

A team behind Chonkie just unveiled a stripped‑down, go‑faster way to slice text for AI called memchunk, and the crowd immediately did what it does best: argue, joke, and demand receipts. The devs say they dove into low‑level code to find the speed limits of “chunking” — chopping huge documents into sentence‑sized bites for retrieval‑augmented generation (RAG), so AIs can find stuff faster. They claim that simple sentence breaks (like periods and question marks) beat fancier methods when the implementation runs “near the metal.”

Cue the split opinions. One camp is hyped on raw speed. Another, led by SkyPuncher, is rolling eyes: who cares about microseconds if accuracy still stumbles? Meanwhile smlacy throws a culture bomb: what about languages without clean sentence marks — is this secretly English‑only? The maintainers jump in to say they need speed for constant auto‑updates to research reports — not a one‑and‑done job — while brene asks the practical question: merge this rocket booster back into Chonkie or keep it as a sidecar?

Best meme energy comes from vjerancrnjak, who basically asks if we can chunk the entire English Wikipedia in under a second and jokes about blasting through 100 GB/s with a simple model. The vibe: Fast is cool, but show us it’s smart, fair to all languages, and actually ships. Until then, the room stays deliciously divided.

Key Points

  • The authors transitioned from high-level chunking approaches to a low-level implementation, resulting in a new library called memchunk.
  • They assert that delimiter-based chunking (splitting on characters like '.', '?', and '\n') is enough in most cases to avoid cutting mid-sentence and degrading retrieval quality.
  • Performance relies on fast byte search; the memchr crate provides optimized search via SWAR fallback and SIMD (SSE2/AVX2) fast paths.
  • SWAR enables processing 8 bytes at a time and detects matches using the has_zero_byte bit-manipulation technique without branches.
  • memchr exposes memchr, memchr2, and memchr3, which search for one, two, or three target bytes respectively; the crate supports at most three needles.
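To make the delimiter-based idea concrete, here is a minimal sketch of a chunker that splits on '.', '?', and '\n' and keeps each delimiter with its chunk. This is an illustration of the technique, not memchunk's actual API; the function name `chunk` is made up for this example.

```rust
// Minimal sketch of delimiter-based chunking (not memchunk's real API):
// split on '.', '?', and '\n', keeping the delimiter with its chunk.
fn chunk(text: &str) -> Vec<&str> {
    let mut chunks = Vec::new();
    let mut start = 0;
    for (i, b) in text.bytes().enumerate() {
        if matches!(b, b'.' | b'?' | b'\n') {
            // ASCII delimiters always fall on UTF-8 char boundaries,
            // so byte-indexed slicing here is safe.
            chunks.push(&text[start..=i]);
            start = i + 1;
        }
    }
    if start < text.len() {
        chunks.push(&text[start..]); // trailing text without a delimiter
    }
    chunks
}

fn main() {
    let chunks = chunk("Fast? Yes. No frills");
    assert_eq!(chunks, vec!["Fast?", " Yes.", " No frills"]);
    println!("{chunks:?}");
}
```

A real implementation would replace the per-byte loop with a vectorized search such as memchr3, which is exactly where the speed claims come from.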
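The SWAR trick in the bullets above can be sketched in plain Rust. This is the classic "has_zero_byte" bit-manipulation pattern, not memchr's actual source: XOR each 8-byte word with the needle repeated in every lane, so matching lanes become zero, then detect a zero lane without branching per byte. The helper names (`zero_byte_mask`, `find_byte`) are invented for this illustration.

```rust
const LO: u64 = 0x0101_0101_0101_0101;
const HI: u64 = 0x8080_8080_8080_8080;

/// Classic has_zero_byte trick: the result has the high bit set in every
/// lane of `w` that is zero. Lanes *above* the first zero can be false
/// positives (borrow propagation), but the lowest set lane is always a
/// true zero, so scanning with trailing_zeros is exact.
fn zero_byte_mask(w: u64) -> u64 {
    w.wrapping_sub(LO) & !w & HI
}

/// Find the first occurrence of `needle` in `haystack`, 8 bytes at a time.
fn find_byte(haystack: &[u8], needle: u8) -> Option<usize> {
    let pattern = u64::from(needle).wrapping_mul(LO); // needle in all 8 lanes
    let mut i = 0;
    while i + 8 <= haystack.len() {
        let w = u64::from_le_bytes(haystack[i..i + 8].try_into().unwrap());
        let mask = zero_byte_mask(w ^ pattern);
        if mask != 0 {
            // High bit of lane k sits at bit 8k + 7, so
            // trailing_zeros / 8 gives the first matching lane.
            return Some(i + (mask.trailing_zeros() / 8) as usize);
        }
        i += 8;
    }
    // Scalar tail for the last < 8 bytes.
    haystack[i..].iter().position(|&b| b == needle).map(|p| i + p)
}

fn main() {
    let text = b"Chunking is fast. Really fast? Yes.";
    assert_eq!(find_byte(text, b'.'), Some(16));
    assert_eq!(find_byte(text, b'?'), Some(29));
    println!("ok");
}
```

The memchr crate uses this as its portable fallback and swaps in SSE2/AVX2 paths where available; the SWAR version alone already checks 8 bytes per iteration with no per-byte branch.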

Hottest takes

"reliability and accuracy are almost always my bottlenecks" — SkyPuncher
"Is this 'English only'?" — smlacy
"whole english wikipedia in <1 second (~20GB compressed)?" — vjerancrnjak
Made with <3 by @siedrix and @shesho from CDMX. Powered by Forge&Hive.