Hardware & chips
media r/LocalLLaMA · 4h ago

MINISFORUM DEG1 Oculink eGPU Dock Refurbished Available for $59

A refurbished MINISFORUM DEG1 Oculink eGPU dock is currently available for $59. The product listing highlights its robust build quality, noting that the device has sufficient heft to securely hold a graphics card. Unlike some lower-cost alternatives, this dock includes redrivers to ensure signal integrity. A user who purchased a unit last year reported positive experiences with its performance and stability. The item can be purchased directly from the manufacturer's refurbished product page.

media r/LocalLLaMA · 4h ago

Query on Clustering Nvidia DGX Spark and AMD Ryzen AI Max 395 for Unified Memory Inference

A user asked whether an Nvidia DGX Spark can be clustered with an AMD Ryzen AI Max 395 to run a single large language model. Both devices have 128GB of unified memory, for a potential combined capacity of roughly 256GB minus operating system overhead. The DGX Spark has a 200Gbit network interface, whereas the AMD Strix system currently has only 5Gbit Ethernet but includes a PCIe Gen 4 x4 slot. The user noted that DeepSeek v4 Flash can fit on two DGX Sparks and wondered whether the Strix could serve as an alternative node. To improve connectivity, they proposed adding a Mellanox ConnectX-6 QSFP+28 to the AMD system to get higher bandwidth over the link.
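A rough way to size the proposed link is to take the minimum of the interfaces on the path. The sketch below is back-of-the-envelope arithmetic, not a benchmark. It assumes the slot runs at the PCIe 4.0 line rate of 16 GT/s per lane with 128b/130b encoding, ignores protocol overhead, and assumes the add-in card's port is at least as fast as the slot. The function names and the 10 GB payload are hypothetical, chosen only for illustration.

```python
# Back-of-the-envelope link estimate. Figures marked "from the post" come
# from the summary above; everything else is a labeled assumption.

def pcie_usable_gbit(gt_per_s_per_lane: float, lanes: int,
                     encoding: float = 128 / 130) -> float:
    """Line rate times encoding efficiency, in Gbit/s (no protocol overhead)."""
    return gt_per_s_per_lane * lanes * encoding

def transfer_seconds(gigabytes: float, gbit_per_s: float) -> float:
    """Time to move a payload of the given size at the given link rate."""
    return gigabytes * 8 / gbit_per_s

PCIE4_GT_PER_LANE = 16.0            # PCIe 4.0 signaling rate per lane
SPARK_NIC_GBIT = 200.0              # from the post
STRIX_ETHERNET_GBIT = 5.0           # from the post

pcie4_x4_gbit = pcie_usable_gbit(PCIE4_GT_PER_LANE, lanes=4)

# The node-to-node link can run no faster than its slowest element.
# Assumption: the add-in NIC's port is not itself the bottleneck.
link_with_addin_nic = min(SPARK_NIC_GBIT, pcie4_x4_gbit)
link_stock = min(SPARK_NIC_GBIT, STRIX_ETHERNET_GBIT)

# Hypothetical payload size, for comparison only.
PAYLOAD_GB = 10.0
t_addin = transfer_seconds(PAYLOAD_GB, link_with_addin_nic)
t_stock = transfer_seconds(PAYLOAD_GB, link_stock)
```

Under these assumptions the x4 slot, not the 200Gbit port on the DGX Spark, caps the upgraded link at roughly 63 Gbit/s, which is still more than ten times the stock 5Gbit Ethernet.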

media r/LocalLLaMA · 1d ago

7 Chinese companies shipping H100/H200-class AI chips, most IPO'd in last 6 months

At least seven Chinese companies are now shipping H100/H200-class AI accelerators, and most of them went public within the last six months. Huawei alone shipped 812,000 AI cards last year, accounting for 49% of China's domestic supply, and its Ascend 950 is reportedly targeted at H200-class performance. Several of these firms were founded by former NVIDIA and AMD GPU leaders, including MetaX, which saw revenue grow 3,800x in three years. Alibaba, meanwhile, launched a server with 1.5TB of VRAM for on-premises frontier model deployment.

arxiv arXiv cs.LG · 6d ago

Quantum Ring All-Reduce: Communication and Privacy Advantages for Distributed Learning

A quantum version of ring all-reduce reduces per-link communication by a factor of two using entanglement and superdense coding, without altering model or gradient computations. It achieves information-theoretically secure aggregation via verified entanglement, with a 2x overhead in GHZ copies, and provides exponential communication advantages in gradient conflict detection for specific auditing tasks.
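For context on the baseline being halved, the sketch below simulates the standard classical ring all-reduce (reduce-scatter followed by all-gather) and counts how many elements each worker pushes over its outgoing link. It does not model the quantum protocol or anything specific to the paper. The function name and the in-memory "links" are assumptions made for illustration.

```python
def ring_all_reduce(vectors):
    """Simulate a classical sum ring all-reduce over n workers.

    Returns (results, sent): results[i] is worker i's final vector and
    sent[i] counts the elements worker i sent to worker (i + 1) % n.
    """
    n = len(vectors)
    length = len(vectors[0])
    assert length % n == 0, "sketch assumes the vector splits evenly"
    chunk = length // n
    data = [list(v) for v in vectors]
    sent = [0] * n

    def span(c):
        return slice(c * chunk, (c + 1) * chunk)

    # Phase 1, reduce-scatter: after n - 1 steps worker i holds the
    # fully summed chunk (i + 1) % n.
    for step in range(n - 1):
        # Snapshot payloads first so all sends in a step are simultaneous.
        outgoing = [((i - step) % n, data[i][span((i - step) % n)])
                    for i in range(n)]
        for i, (c, payload) in enumerate(outgoing):
            dst = (i + 1) % n
            data[dst][span(c)] = [a + b for a, b in zip(data[dst][span(c)], payload)]
            sent[i] += chunk

    # Phase 2, all-gather: circulate the finished chunks around the ring.
    for step in range(n - 1):
        outgoing = [((i + 1 - step) % n, data[i][span((i + 1 - step) % n)])
                    for i in range(n)]
        for i, (c, payload) in enumerate(outgoing):
            dst = (i + 1) % n
            data[dst][span(c)] = payload
            sent[i] += chunk

    return data, sent


# Four workers, eight gradient entries each (toy values).
grads = [[float(w + 1)] * 8 for w in range(4)]
results, sent = ring_all_reduce(grads)
```

In this classical scheme each link carries 2(n - 1)/n times the vector length, which is the per-link cost the summary says the quantum variant cuts by a factor of two.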