Data Processing Benchmark Featuring Rust, Go, Swift, Zig, Julia etc.

Julia sprints ahead, Java cries foul, D fans demand respect

TLDR: A benchmark finding related posts by shared tags crowned Julia (and D in multicore) as speed champs. Comments erupted: Java users say the test was unfair, D fans demand recognition, Julia vs Python memes flew, and R devotees wonder why they weren’t invited—because speed politics matter.

A spicy new benchmark dropped comparing how fast different languages crunch a simple task: given a giant list of posts, find each post’s top 5 related ones by shared tags. The scoreboard? Julia (HO1) blasts ahead, with D, Rust, and C/C-like tools close behind, while Python and Ruby eat dust. Then the comments set themselves on fire.

Java folks showed up with receipts: user pron says the Java run used the serial garbage collector (GC, the memory cleanup system), which is meant for tiny machines, and calls the setup unfair. Cue the meme: “GC = Grievance Collector.” Meanwhile, Julia fans are flexing. One commenter cheered, “Julia is a beast compared to python,” and even dropped a shiny visualization of the results.

The real underdog storyline? D stans. Multiple voices declared D “criminally underrated,” claiming it fixes C++ headaches and deserves the crown—especially with D’s multicore results blazing past C++, C#, Rust, and Go. One popular quip: “Learn D if speed’s your thing,” sparking a Rust vs Zig vs D cage match in the replies.

And then came the “who wasn’t invited” drama: R users stormed in asking why the data science darling didn’t even get a cameo. The vibe: victory laps for Julia and D, rebuttals from Java, and a thousand Python memes pretending it’s totally fine. It’s speed wars with feelings, and everyone’s timing everyone else’s hot takes.

Key Points

  • The benchmark computes top-5 related posts by shared tags, using a tag-to-posts index and per-post shared-tag counting, then outputs results to JSON.
  • Execution is standardized via run scripts for Unix/Windows and Docker, with reproducibility from a GitHub workflow.
  • Rules forbid FFI, unsafe code, disabling runtime checks, SIMD (in single-threaded entries), hardware-specific targeting, and caching; solutions must parse JSON at runtime, support UTF-8, handle up to 100k posts and 100 tags, and use less than 8 GB of RAM.
  • Tests ran on an AWS EC2 c7a.xlarge (4 vCPU, 8 GB RAM) Ubuntu 22.04 VM; results are reported for 5k, 20k, and 60k posts.
  • Single-threaded leaders include Julia HO1 (129.13 ms total) and D HO1 (176.11 ms); the multicore leader is D Concurrent (v2) at 388.83 ms total, followed by the C# and C++ concurrent implementations.
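
For the curious, the task in the first bullet can be sketched in a few lines of Python. This is a rough illustration, not the benchmark's reference code: the field names `_id`, `tags`, and `related`, the tie-breaking rule (earlier post wins), and both function names are assumptions made up for the example.

```python
import json
from collections import defaultdict


def related_posts(posts, top_n=5):
    """Return, for each post, the ids of the top_n posts sharing the most tags.

    Field names ("_id", "tags", "related") are hypothetical, chosen for this sketch.
    """
    # Step 1: tag-to-posts index (tag -> list of post positions).
    tag_index = defaultdict(list)
    for i, post in enumerate(posts):
        for tag in post["tags"]:
            tag_index[tag].append(i)

    results = []
    for i, post in enumerate(posts):
        # Step 2: count shared tags between this post and every other post.
        shared = [0] * len(posts)
        for tag in post["tags"]:
            for j in tag_index[tag]:
                shared[j] += 1
        shared[i] = 0  # a post is never related to itself

        # Step 3: keep the top_n posts by shared-tag count.
        # Ties go to the earlier post (an assumption of this sketch).
        candidates = [j for j, count in enumerate(shared) if count > 0]
        candidates.sort(key=lambda j: (-shared[j], j))
        results.append({
            "_id": post["_id"],
            "tags": post["tags"],
            "related": [posts[j]["_id"] for j in candidates[:top_n]],
        })
    return results


def related_posts_json(raw):
    """Parse the input JSON at runtime and return the results as a JSON string."""
    return json.dumps(related_posts(json.loads(raw)))
```

Sorting every candidate is the simple route; faster entries would more likely track only the five best counts per post while scanning, which skips the sort entirely.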

Hottest takes

"the Java code is run with `-XX:+UseSerialGC`, which is the slowest GC" — pron
"Julia is a beast compared to python" — Imustaskforhelp
"D gets no respect" — jhack