November 4, 2025
Choose your GPU fighter
Optimizing Datalog for the GPU
Datalog hits the GPU gym: speed hype vs vendor-lock drama
TLDR: Researchers pushed Datalog onto GPUs with a new index to speed up rule processing. Comments exploded over vendor tools (CUDA/HIP) versus open SPIR‑V, asked who actually uses Datalog (Datomic, Clingo), and dreamed of repurposing old GPUs for formal methods—potentially a big shift for verification work.
A team at ASPLOS’25 just taught the classic logic language Datalog to rip on graphics chips (GPUs), promising faster rule-crunching with a new “hash-indexed sorted array” and a smarter way to avoid redoing work. The paper pits their GPU system, GPULog, against a popular CPU tool (Soufflé), and the crowd perked up at the speed charts. But the comments? Pure fireworks.
First punch thrown: CUDA/HIP vs SPIR‑V. One camp roasted the choice of Nvidia/AMD toolchains as “vendor lock-in cosplay,” yelling that the open standard SPIR‑V is right there. Others shrugged: use what actually runs fast today. Meanwhile, a practical chorus asked, “Who even uses Datalog?” and turned the thread into a shopping list, name-dropping Datomic and Clingo. The optimists went full pep rally, dreaming of formal methods finally finding their “GPU moment,” and joking about rescuing retired A100s from AI farms to crunch proofs instead of pixels. There were memes about the “Same Generation” rule being “family reunion logic,” and quips that the real bottleneck is memory bandwidth, not math muscles. Techies debated portability to FPGAs and clusters, while skeptics asked for real-world workloads. Verdict from the crowd: ambitious, spicy, and maybe the start of a new tool war.
Key Points
- The paper presents GPU optimization techniques for Datalog, focusing on rule evaluation as SQL joins.
- Semi-naïve evaluation partitions relations into new, delta, and full, avoiding full–full joins to reduce redundant work.
- A hash-indexed sorted array is introduced, combining a dense data array, sorted index pointers, and an open-addressed hash table.
- Join execution scans A’s sorted index, uses B’s hash table to find first matches, and iterates contiguous matches in B’s index for coherent memory access.
- Results compare the GPU approach (GPULog) to the CPU system Soufflé, with a HIP port of GPULog run on the same Nvidia GPU.
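
To make the key points concrete, here is a rough, sequential Python sketch of the two ideas: a toy hash-indexed sorted array and a semi-naïve loop that only joins the delta against the edge relation. It is not GPULog's code and runs on the CPU; the names `Hisa`, `extend_paths`, and `reachable`, the two-column tuples, and the linear-probing scheme are all assumptions made for illustration.

```python
class Hisa:
    """Toy hash-indexed sorted array over 2-column tuples, keyed on column 0.
    Hypothetical illustration only; not the paper's implementation."""

    def __init__(self, tuples):
        self.data = list(tuples)  # dense data array, kept in insertion order
        # Sorted index: positions into `data`, ordered by tuple (join key first).
        self.sorted_idx = sorted(range(len(self.data)), key=lambda i: self.data[i])
        # Open-addressed hash table: join key -> first position in sorted_idx.
        # Capacity exceeds the tuple count, so an empty slot always exists.
        self.cap = max(2, 2 * len(self.data))
        self.slots = [None] * self.cap
        prev_key = object()  # sentinel that equals no real key
        for pos, i in enumerate(self.sorted_idx):
            key = self.data[i][0]
            if key != prev_key:  # record only the first entry of each key run
                self._insert(key, pos)
                prev_key = key

    def _insert(self, key, pos):
        s = hash(key) % self.cap
        while self.slots[s] is not None:  # linear probing
            s = (s + 1) % self.cap
        self.slots[s] = (key, pos)

    def first_match(self, key):
        s = hash(key) % self.cap
        while self.slots[s] is not None:
            if self.slots[s][0] == key:
                return self.slots[s][1]
            s = (s + 1) % self.cap
        return None  # hit an empty slot: key is absent

    def matches(self, key):
        """Yield every tuple with this key by walking the contiguous run."""
        pos = self.first_match(key)
        if pos is None:
            return
        while pos < len(self.sorted_idx):
            t = self.data[self.sorted_idx[pos]]
            if t[0] != key:
                break
            yield t
            pos += 1


def extend_paths(delta, edge):
    """Join path(x, y) from `delta` with edge(y, z): scan delta's sorted
    index, probe edge's hash table, then iterate the contiguous matches."""
    out = set()
    for i in delta.sorted_idx:
        x, y = delta.data[i]
        for _, z in edge.matches(y):
            out.add((x, z))
    return out


def reachable(edges):
    """Semi-naive transitive closure: only the delta is joined each round,
    so the full relation is never joined against itself."""
    edge = Hisa(edges)
    full = set(edges)
    delta = set(edges)
    while delta:
        new = extend_paths(Hisa(list(delta)), edge)
        delta = new - full  # keep only facts not derived before
        full |= delta
    return full
```

Calling `reachable([(1, 2), (2, 3), (3, 4)])` returns the three input edges plus `(1, 3)`, `(2, 4)`, and `(1, 4)`. On a GPU the outer scan over the sorted index would presumably be spread across threads, which is where the contiguous runs help with coherent memory access; this sketch only shows the data layout and the control flow.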