Stock Watch: Nvidia Corporation (NVDA.US)


Nvidia Corporation (NVDA.US) CEO Jensen Huang recently stated that generative AI represents a new “iPhone moment.” While the long-term impact of generative AI remains to be seen, it already appears to be altering, perhaps irrevocably, the approach to cloud infrastructure. As Microsoft’s exclusive hardware acceleration provider for generative AI services, Nvidia should be the provider of choice for cloud AI acceleration globally. Here is why.

Since OpenAI launched ChatGPT in November 2022, it has become the fastest-growing consumer app, with over 100 million users in January. It has also caused cloud service providers to re-evaluate what AI can do. For example, Microsoft Corporation (MSFT.US) has embraced generative AI, the term for AI similar to ChatGPT, and hosts all OpenAI services.

Nvidia CEO Jensen Huang

The drop in data center spending may reflect the impact of ChatGPT

In the first quarter of this year, the major cloud service providers Microsoft (MSFT.US), Alphabet Inc./Google (GOOG) and Amazon.com, Inc. (AMZN) all increased revenue from cloud services substantially. Microsoft’s Azure Intelligent Cloud segment increased revenue by 16% year-over-year. Google Cloud increased revenue by 28% year-over-year. Amazon Web Services (“AWS”) also increased revenue by 16% year-over-year.

Despite this growth, there has been a decline in spending on cloud services hardware, most notably at Intel Corporation (INTC.US), where data center revenue was down 39% year-over-year. In addition, Intel’s second-quarter outlook implied a larger, though as yet unspecified, year-over-year decline in data center revenue. Things were better at Advanced Micro Devices, Inc. (AMD.US), where data center revenue was flat year-over-year, but the company predicted a year-over-year decline in data center revenue in the second quarter.

How can continued cloud growth be reconciled with the apparent decline in cloud infrastructure spending?

An investor could always chalk it up to caution against macroeconomic headwinds, perhaps combined with costs and supply chain issues for hardware production.

But what we really have ahead of us is a disruptive technological change, something much more fundamental, that has accelerated with the advent of generative AI. It is the shift in focus from traditional CPUs (AMD or Intel processors) as the main computational engine in the data center to a combination of CPUs and data accelerators, which are mostly based on GPUs (graphics processors, or graphics cards) from Nvidia.

This is an approach that Nvidia has advocated for years, which is natural since it is their business, arguing that GPU acceleration is inherently more energy efficient and cost-effective than CPUs alone. The argument against this has been that CPUs are more versatile, while GPUs are restricted to certain tasks that benefit from massive GPU parallelism.

But the range of tasks that benefit from the GPU has been increasing. Energy-efficient supercomputing is now the almost exclusive domain of GPU (or, for that matter, Nvidia) acceleration. In commercial cloud services, GPUs accelerate everything from game streaming to the metaverse. And, of course, AI.

In this sense, AMD, with a substantial GPU portfolio, is better positioned than Intel, and this may partly explain AMD’s better Q1 data center results. However, the advent of generative AI has upended the AI acceleration market.

GPT, which is short for Generative Pre-trained Transformer, has probably made conventional GPU acceleration obsolete. When Nvidia introduced the H100 “Hopper” data center accelerator in April 2022, it included a “Transformer Engine” to accelerate generative AI workloads. The Transformer Engine builds on Nvidia’s Tensor Cores to provide a 6X speed improvement in training transformers.

In the white paper published along with the previous quarter’s data, Nvidia explained the motivation for the Transformer Engine. Without going into the details and the tremendously complex technical terms: in 2022, Nvidia’s Transformer Engine seemed like just an interesting technology. It was an appropriate innovation, considering that Nvidia wanted to remain relevant to the AI research community. I had no idea at the time how critical it would become to cloud providers like Microsoft that want to make GPT available to the general public.

How Microsoft leaped forward with the OpenAI collaboration

When Microsoft first unveiled its generative-AI-powered browser and search engine in February, it seemed like the company was far ahead of Google in integrating GPT into its products. And now, while Microsoft offers its AI “co-pilots” as standard features, Google’s competing “Bard” is still experimental.

It is clear that Microsoft has taken a big step in AI with the help of its collaboration with OpenAI. All of OpenAI’s hosting is provided by Microsoft’s Azure cloud service, including ChatGPT and more advanced generative AIs. It is also clear that Microsoft has had access to OpenAI’s generative AI technology at the code level (a critical point) and has incorporated it into the various AI “co-pilots” the company offers today.

But wait, how did Microsoft start such a close and seemingly exclusive relationship with a non-profit research institution?

As it turns out, OpenAI isn’t exactly a non-profit organization. In 2019, OpenAI created OpenAI LP as a wholly owned, for-profit subsidiary. This appears to have been done with the sole intention of providing a recipient for a $1 billion investment from Microsoft as if it were a donation.

Subsequently, in January 2023, Microsoft invested another $10 billion in OpenAI LP, as reported by Bloomberg:

The new support, building on the $1 billion Microsoft invested in OpenAI in 2019 and another round in 2021, aims to give Microsoft access to some of the most popular and advanced AI systems. Microsoft is competing with Alphabet Inc., Amazon.com Inc. and Meta Platforms Inc. to dominate the fast-growing technology that generates text, images and other media in response to a short prompt.

At the same time, OpenAI needs funding and cloud computing power from Microsoft to process large volumes of data and run the increasingly complex models that allow programs like DALL-E to generate realistic images based on a handful of words, and ChatGPT to create amazingly human-like conversational text.

$10 billion is a lot of money to pay for what amounts to a lot of code, but it’s code that no one outside of OpenAI and Microsoft has access to.

Microsoft’s huge investment in generative AI

To leap ahead of Google, which had invented the generative AI approach, Microsoft had to make a massive investment not only in OpenAI software but also in hardware, primarily Nvidia hardware. Just how massive, we are only starting to learn through some Microsoft blog posts.

In 2019, Microsoft and OpenAI began a partnership, which expanded this year, to collaborate on new Azure AI supercomputing technologies that accelerate advances in AI, deliver on the promise of large language models, and help ensure that the benefits of AI are widely shared.

The two companies began working closely together to create supercomputing resources on Azure that were designed and dedicated to enable OpenAI to train an expanding set of increasingly powerful AI models. This infrastructure included thousands of AI-optimized NVIDIA GPUs connected together in a high-performance, low-latency network based on NVIDIA Quantum InfiniBand communications for high-performance computing.

The scale of the cloud computing infrastructure that OpenAI needed to train its models was unprecedented: exponentially larger pools of networked GPUs than anyone in the industry had ever tried to build, said Phil Waymouth, Microsoft’s senior director in charge of strategic partnerships, who helped negotiate the deal with OpenAI.

Despite the huge infrastructure investment, the customer base at this point was limited largely to researchers at OpenAI and within Microsoft. The shift from the research program to commercial cloud services has been called “the industrialization of AI” by Nvidia CEO Jensen Huang and others.

During his keynote address at the GPU Technology Conference in March 2023, CEO Huang referred to the necessary infrastructure as “AI factories”. In designing the AI factory that OpenAI needed, Microsoft certainly had a great opportunity to evaluate various hardware alternatives before settling on Nvidia.

Nvidia as an almost exclusive supplier to Microsoft

Construction of these AI factories is still underway and will continue as demand for generative AI services expands. An idea of the business opportunity for Nvidia can be derived from another blog post by Matt Vegas, Senior Product Manager for Azure HPC. In the post, he announces that Microsoft is starting to offer virtual machine instances with a minimum of 8 Nvidia Hopper H100 GPUs. These can scale to “thousands” of H100s for AI use.

Essentially, this is a system in which multiple interconnected GPUs function as a single unified GPU. The systems connect to each other via Nvidia InfiniBand fiber optic links with a total data bandwidth of 3.2 Tb/sec.

It’s hard to tell how many actual H100 accelerators Microsoft has bought, but it appears to be in the thousands. We can get an idea of how many thousands by estimating the number of H100s needed to host OpenAI services.

OpenAI has been very secretive about the details of ChatGPT, so it is difficult to determine the hardware resources that a single instance of ChatGPT requires. Tom’s Hardware, a website specializing in technical hardware reviews, has an interesting article in which the author ran a less capable GPT on a PC equipped with an RTX 4090 GPU (the high end of Nvidia’s consumer graphics cards) and drew the following conclusions:

If a single GPU with 24 GB of VRAM can run the smaller GPT, then I would estimate that a single H100 with 80 GB of VRAM would be enough to run one instance of ChatGPT. This is only an estimate. In fact, a ChatGPT instance can have its processing distributed across multiple H100s and access over 80 GB of VRAM, depending on the workload. More advanced GPTs, either from OpenAI or Microsoft, may require even more.

According to one report, OpenAI receives 55 million unique visitors per day, with an average visit time of 8 minutes. Assuming that each visitor gets exclusive use of one H100 during the visit, this implies that there must be around 300,000 H100s in the Azure cloud service to handle the load. That would equate to 37,500 H100 DGX systems, worth about $3.75 billion in revenue, likely spread over multiple quarters.
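The arithmetic behind this estimate can be sketched in a few lines of Python. This is only a rough reproduction under stated assumptions: visits are taken to be spread evenly over a 24-hour day, and each DGX system is taken to hold 8 H100s (the ratio implied by the 300,000 and 37,500 figures above). The per-system price is not a quoted list price; it is simply what the $3.75 billion figure implies.

```python
# Back-of-the-envelope reproduction of the H100 estimate in the text.
# Assumptions (not stated outright in the article): visits are spread
# evenly across the day, and each DGX system contains 8 H100 GPUs.

VISITORS_PER_DAY = 55_000_000      # unique daily visitors, per the cited report
AVG_VISIT_MINUTES = 8              # average visit length, per the cited report
MINUTES_PER_DAY = 24 * 60
GPUS_PER_DGX = 8                   # assumption: 8 H100s per DGX system
ROUNDED_H100_COUNT = 300_000       # the article's rounded figure
QUOTED_REVENUE_USD = 3_750_000_000 # the article's revenue figure


def concurrent_gpus(visitors_per_day: int, avg_visit_minutes: float) -> float:
    """Average number of simultaneous visitors, one H100 per visitor."""
    gpu_minutes_per_day = visitors_per_day * avg_visit_minutes
    return gpu_minutes_per_day / MINUTES_PER_DAY


raw_h100_count = concurrent_gpus(VISITORS_PER_DAY, AVG_VISIT_MINUTES)
dgx_systems = ROUNDED_H100_COUNT // GPUS_PER_DGX
implied_price_per_dgx = QUOTED_REVENUE_USD / dgx_systems

print(f"H100s needed (unrounded): {raw_h100_count:,.0f}")
print(f"DGX systems at {GPUS_PER_DGX} GPUs each: {dgx_systems:,}")
print(f"Implied revenue per DGX system: ${implied_price_per_dgx:,.0f}")
```

The unrounded result is slightly above 300,000, which is consistent with the article’s “around 300,000.” Because the calculation assumes evenly spread traffic, peak-hour demand would push the requirement higher, while sharing one GPU among several visitors would push it lower.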

As of Nvidia’s fourth quarter of fiscal 2023, most of this infrastructure was likely already accounted for in revenue from Nvidia’s data center business segment. However, the potential for expansion of GPT-like AI services on Azure means there is much more to come. Microsoft’s H100-based service, called ND H100 V5, is currently only offered as a preview. This is probably to ensure that the available hardware is not overloaded.

Are we facing the next wave/cycle of innovation? Source: Edelson Institute


Everything seems to indicate that Nvidia has every chance of getting there first, and with the most hardware, in generative AI.

Almost every hardware vendor in the cloud space has noticed that generative AI represents a huge opportunity. And they are right, but this market is unlikely to be evenly distributed.

The saying that battles are won by the one who gets there first and with the most resources is true for Nvidia’s AI acceleration business. While competitors are only talking about future opportunities, Nvidia is cornering the current market for generative AI acceleration.

Nvidia (NVDA.US) shares have surprised investors, standing out even against other big tech companies like Microsoft and Alphabet/Google, while the S&P 500 has lagged behind the biggest tech stocks. Source: Bloomberg

The key to Nvidia’s success is that, beyond holding its position in its current segments, it has almost prophetically anticipated future needs. Nvidia has been there since the beginning of OpenAI, when Jensen Huang hand-delivered the first DGX system to OpenAI back in 2016, seven years ago now.

NVDA.US, D1. source: xStation

Nvidia Corporation will report its fiscal 2024 first-quarter results this week (on Wednesday the 24th, after the US market close), in a technical context where it no longer faces barriers to recovering the all-time highs reached in November 2021.

Darío García, EFA
XTB Spain
