The Legendary Chip Architect Is Trying to Design “Affordable GPUs” Without HBM

Industry News

Jim Keller, the legendary chip designer and current CEO of Tenstorrent, an American AI chip design startup, is attempting to design chips that are more efficient than Nvidia’s GPUs, thereby lowering the cost of AI applications and capturing a portion of Nvidia’s market share.


As AI expands into smartphones, electric vehicles, and cloud services, many companies are looking for cheaper solutions, and many small businesses are unwilling to pay US$20,000 for Nvidia’s high-end GPUs. To target the market segments Nvidia has not yet reached, Keller is trying to design chips that are more affordable and efficient than Nvidia’s.


Tenstorrent is preparing to launch its second-generation multi-purpose AI chip by the end of 2024. According to the company, the chip surpasses Nvidia’s AI GPUs in energy and processing efficiency in some areas; it claims its Galaxy system is three times more efficient than Nvidia’s DGX series of AI servers and 33% cheaper.


Key to Reducing Power Consumption and Price: Abandoning HBM


High Bandwidth Memory (HBM) is an advanced memory chip capable of moving large amounts of data quickly. It is a crucial component of generative AI chips and has played a significant role in the success of Nvidia’s products. However, HBM is also one of the main reasons AI chips are power-hungry and expensive: for each task it processes, a GPU generally has to send data to and from memory, which requires HBM’s high-speed transfer capability.


Tenstorrent, by contrast, has designed its chips to drastically reduce the number of data transfers and to do without HBM. Each Tenstorrent chip contains hundreds of cores, each with a small CPU that can independently decide which data to prioritize and which unnecessary tasks to drop, improving overall efficiency. Keller believes this approach could allow Tenstorrent chips to replace GPUs and HBM in some areas of AI research, and the company also intends to keep improving the cost-effectiveness of its products.
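
The article does not describe how this per-core scheduling actually works, so the Python below is a purely illustrative sketch, not Tenstorrent’s real architecture or API: it shows the general idea of many small cores, each with local memory and its own scheduler that prioritizes data and drops unneeded work, so that trips to external HBM-style memory are minimized. All names here (Core, Task, local_capacity) are hypothetical.

```python
# Conceptual sketch only: NOT Tenstorrent's actual design or programming model.
# Illustrates many small cores, each scheduling its own work locally so that
# most tasks never touch external (HBM-style) memory.

import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Task:
    priority: int                              # lower value = more urgent (hypothetical convention)
    name: str = field(compare=False)
    useful: bool = field(default=True, compare=False)


class Core:
    """One of many small cores, each with local memory and its own scheduler."""

    def __init__(self, core_id: int, local_capacity: int = 4):
        self.core_id = core_id
        self.local_capacity = local_capacity   # slots of on-core memory (illustrative)
        self.queue: list[Task] = []            # per-core priority queue
        self.external_transfers = 0            # counts trips to off-chip memory

    def submit(self, task: Task) -> None:
        # The core decides locally whether a task is worth keeping at all.
        if not task.useful:
            return                             # drop unnecessary work outright
        heapq.heappush(self.queue, task)

    def run(self) -> list[str]:
        done = []
        local_used = 0
        while self.queue:
            task = heapq.heappop(self.queue)   # highest-priority data first
            if local_used < self.local_capacity:
                local_used += 1                # served from on-core memory
            else:
                self.external_transfers += 1   # spill to external memory only when forced
            done.append(task.name)
        return done


# Toy usage: a handful of cores, each scheduling its own work independently.
cores = [Core(i) for i in range(4)]
for i, core in enumerate(cores):
    core.submit(Task(priority=2, name=f"matmul-{i}"))
    core.submit(Task(priority=1, name=f"attention-{i}"))
    core.submit(Task(priority=9, name=f"debug-dump-{i}", useful=False))  # dropped locally

for core in cores:
    print(core.core_id, core.run(), "external transfers:", core.external_transfers)
```

In this toy model, the savings come from each core keeping its decisions local: unnecessary tasks never generate memory traffic, and only overflow work spills to external memory, which is the broad intuition behind designing out the dependence on HBM.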


Since each core is relatively independent, the Tenstorrent chip can be adapted to a wide range of applications by combining more or fewer of them. For example, a small number would be sufficient for a smartphone or wearable device, while 100 could be combined for use in AI data centers.


Keller admits that it may take years to disrupt the large, established HBM industry. Rather than a single company replacing Nvidia, he predicts that more emerging companies will enter the parts of the AI market that Nvidia currently cannot serve.

Related article: AI Chip Shortage Continues, But There is a Glimmer of Hope