thunderbolt-ibverbs: We have InfiniBand at home

TL;DR

A developer built a Linux kernel module that makes Thunderbolt ports emulate InfiniBand devices, enabling high-speed RDMA communication between consumer mini PCs. This breakthrough could democratize AI training and inference at home.

The module targets the Thunderbolt 4 and USB4 ports found on AMD mini PCs. If it holds up, consumer hardware could take on AI training and inference workloads that normally depend on enterprise networking.

The project is an experimental RDMA-over-USB4 link between two AMD mini PCs, specifically 128GB Strix Halo models. The developer reports an aggregate bidirectional transfer rate of approximately 95 Gb/s with around 7 microseconds of latency. According to the developer, this is enough to support tensor-parallel inference and Fully Sharded Data Parallel (FSDP) workloads, such as a MiniMax-M2.7 inference run that exceeds the capacity of a single machine, and a Gemma 3 27B LoRA FSDP step that took just over two minutes over the new link, down from over 21 minutes over Ethernet.

This was achieved with a custom Linux kernel module that makes Thunderbolt ports appear as InfiniBand devices, so that RDMA (Remote Direct Memory Access) can be used for data exchange between the machines. The setup reportedly sustains around 48 Gb/s in each direction, about 95 Gb/s in aggregate, well above both standard Ethernet and soft-RoCE running over Thunderbolt networking. One-way latency is reported at about 7 microseconds, versus 28 to 65 microseconds in the other setups.
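
To put the reported figures in perspective, the sketch below uses a simple latency-plus-bandwidth cost model. This is a common back-of-the-envelope approximation, not something the developer published, and the function name and parameters are hypothetical. It plugs in the roughly 7 microsecond latency and roughly 48 Gb/s per-direction rate quoted above.

```python
def transfer_time_us(payload_bytes: int, latency_us: float, bandwidth_gbps: float) -> float:
    """Estimate one-way transfer time in microseconds.

    Assumed model (illustrative only): time = fixed latency + size / bandwidth.
    """
    if bandwidth_gbps <= 0:
        raise ValueError("bandwidth_gbps must be positive")
    bits = payload_bytes * 8
    # 1 Gb/s is 1e9 bits per second, i.e. 1e3 bits per microsecond.
    serialization_us = bits / (bandwidth_gbps * 1e3)
    return latency_us + serialization_us


if __name__ == "__main__":
    # Figures reported in the article: ~7 us latency, ~48 Gb/s per direction.
    for size in (4 * 1024, 1024 * 1024, 64 * 1024 * 1024):
        estimate = transfer_time_us(size, latency_us=7.0, bandwidth_gbps=48.0)
        print(f"{size:>10} bytes: ~{estimate:.1f} us")
```

Under this model a 4 KiB message spends well under a microsecond on the wire at 48 Gb/s, so the fixed latency dominates. That is why the gap between 7 microseconds and 28 to 65 microseconds matters for workloads that exchange many small messages.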

Why It Matters

This breakthrough demonstrates that high-performance, low-latency RDMA communication can be achieved on consumer hardware using Thunderbolt ports, potentially democratizing access to AI training and inference capabilities without costly enterprise networking gear. If scalable and stable, this approach could enable hobbyists and small labs to perform distributed AI workloads at home, reducing reliance on cloud services and expensive data center infrastructure.

Background

Traditionally, high-speed RDMA networks like InfiniBand have been confined to enterprise data centers and supercomputers because they need specialized hardware and complex setup. RDMA over Converged Ethernet (RoCE) and its software implementation, soft-RoCE, bring RDMA to Ethernet, but soft-RoCE in particular is limited in throughput and latency. The developer’s work builds on ongoing efforts to make RDMA more accessible, using the USB4 and Thunderbolt interfaces common on consumer PCs. The project is experimental: the developer describes it as research code that may contain false assumptions and sharp edges, and says it is not intended for production use.

“This is experimental research code, most of it AI-generated, and it loads experimental kernel modules on machines I was willing to crash repeatedly.”

— the developer behind the project

“We built experimental RDMA-over-USB4 for 128GB Strix Halo mini PCs. It lets two consumer boxes talk fast enough to run tensor-parallel inference and FSDP workloads across both machines.”

— the developer

What Remains Unclear

It remains unclear how stable and scalable this solution is for long-term or production use, and whether it can be widely adopted across different hardware configurations. Further testing and development are needed to determine its practical viability.

What’s Next

The next steps involve refining the kernel modules for stability, testing across more hardware setups, and exploring potential integration into consumer operating systems. Broader community engagement and peer review are likely to follow to assess feasibility for wider use.

Key Questions

Can this setup be used for commercial or production AI workloads?

Currently, no. The project is experimental and not intended for production use. Stability and scalability are still under evaluation.

What hardware is required to replicate this setup?

At minimum, two AMD mini PCs with Thunderbolt 4 or USB4 ports, and the developer’s custom Linux kernel modules. Hardware compatibility and driver support are still being tested.
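
Since the module is described as making Thunderbolt ports appear as InfiniBand devices, one way to check whether a machine exposes any such device is to look at the Linux RDMA sysfs directory, `/sys/class/infiniband`. The sketch below is a minimal check under that assumption. The helper name is hypothetical, and the article does not say what device name the module registers, so the code only lists whatever is present.

```python
from pathlib import Path


def list_rdma_devices(sysfs_root: str = "/sys/class/infiniband") -> list[str]:
    """Return the names of RDMA devices registered with the Linux RDMA subsystem.

    Returns an empty list if the directory is missing, for example when no
    RDMA driver is loaded or when running on a non-Linux system.
    """
    root = Path(sysfs_root)
    if not root.is_dir():
        return []
    return sorted(entry.name for entry in root.iterdir())


if __name__ == "__main__":
    devices = list_rdma_devices()
    if devices:
        print("RDMA devices found:", ", ".join(devices))
    else:
        print("No RDMA devices visible; the module may not be loaded.")
```

An empty result only means no RDMA device is registered at that moment. It says nothing about whether the hardware itself is compatible.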

Does this mean consumers can build their own high-speed networks at home?

Potentially, yes, but current implementations are experimental. Widespread adoption will require further development and stability improvements.

How does this compare to traditional Ethernet or Wi-Fi for AI workloads?

According to the developer, RDMA over Thunderbolt offers significantly lower latency (~7 microseconds) and higher throughput (~95 Gb/s bidirectional) compared to Ethernet or Wi-Fi, which are typically slower and have higher latency.

Source: Hacker News
