NVIDIA - Enabler of the Impossible

A concerning leak from the X account MEGAsizeGPU: https://x.com/Zed__Wang

According to the account, Nvidia’s 5090 shortage will ease and flip into a ridiculously high inventory situation within a month, due to lower-than-expected demand for data center products. As is well known, Nvidia makes only a fraction of the profit on gaming graphics cards compared to data center GPUs, so the fewer of its latest gaming cards Nvidia has to sell, the better.

However, one must also consider the mechanism highlighted by Jarnis, where chips that are not suitable for data centers can be utilized in gaming graphics cards.

The leaker has traditionally been considered reliable. Based on a quick Google search, tech sites have covered MEGAsizeGPU’s tweets.

Muropaketti’s article on the topic:

6 Likes

The same node is used, but the chips are different. You cannot take already-manufactured datacenter Blackwell wafers and somehow turn them into gaming cards. What is possible is that allocation changes have been made, i.e., more wafers for gaming-card use and fewer for datacenter use. If this has been done, there are surely good reasons for it, and I don’t see it as particularly alarming.

My personal opinion is that this rumor is actually a hoax, put into circulation only so that the “rebellious” gaming-card hunters stay calm for the month or so it takes for production to catch up with the worst of the demand peak and for more cards to become available. NVIDIA has suffered significant reputational damage from the botched gaming-card launch: for a month now, cards have been selling out to the point that market prices are +50% … +100% over the cards’ “MSRP”, and everyone is out of stock.

As an anecdote, I myself have been actively hunting for an RTX 5090 for the past month. Before today, I had briefly seen the “buy” button three different times but never managed to get the product into the shopping cart. Today I finally got far enough to buy one, but I was a second or two too slow for the day’s massive batch of 6 units (2 different models) to be enough. I was left in a bit of limbo: the order went in and I paid, but I was put on a waiting list for the next batch. It’s ridiculous that in 4 weeks an estimated <50 units of these cards have been brought into this country, and you still practically cannot even place an order. Retailers don’t dare take orders unless they are absolutely certain of availability, because they got burned in previous launches when market prices soared and cards pre-sold at the recommended price turned into losses…

The RTX 5080 situation, although better, is not massively better by any means; perhaps ten times that amount has been brought into this country. I have managed to buy one and order another (delivery perhaps next month) for workplace use, and even those haven’t been available anywhere for more than a few minutes.

Also, it must be remembered that the bottleneck for datacenter Blackwells is not TSMC’s wafers but packaging capacity, where the datacenter model chip is “glued” together with HBM memory on the same substrate. There isn’t enough of this capacity in the world.

10 Likes

MEGAsizeGPU’s post reeks of anti-scalper campaigning. Let’s hope there isn’t weakness in data center sales.

I carelessly read your previous message about graphics cards, and Muropaketti managed to confuse things further:

“Nvidia plans to utilize unused GB200 data center chips and use them as GB202 chips in Nvidia’s GeForce RTX 5090 graphics cards”

Vs. MEGAsizeGPU’s tweet, which talks about yields.

“Imagine you are Nvidia and have purchased shit loads of TSMC yields for B200, but now the market doesn’t want that much B200, and RTX40 is retired……The only solution is to make as much RTX50 as possible to cover the unused yield of B200”

Apparently, the message should have been interpreted as an allocation change. Quite difficult for a layman; fortunately, there’s Jarnis.

If one absolutely wants to worry, then it ultimately doesn’t matter where the manufacturing bottleneck is (packaging, memory, wafers, etc.).

Yeah, there’s a bit of confusion in the air. GB200 and B200 chips have HBM memory controllers, so they can only be used in data center products. GB202 (“consumer Blackwell”) has GDDR7 memory controllers and can go into gaming cards. The only choice NVIDIA can make is between gaming RTX 5090 cards and the soon-to-be-released professional model (RTX xxxx Blackwell), because both use the GB202 chip; the only difference is that the professional card apparently has 48GB of video memory, and perhaps a 96GB model will also come. The RTX 5090 gets the GB202 chips that are, so to speak, slightly flawed (not all units are active), whereas the professional cards get the fully intact ones.

What NVIDIA could do is put “fully intact” GB202 chips into gaming cards if manufacturing has yielded a higher percentage of fully intact chips than expected and enough have already been stockpiled for the upcoming professional card model. I don’t see this as a big drama either - the manufacturing of each chip costs the same regardless of whether it’s fully intact or a piece only suitable for an RTX 5090 card, and as long as there are enough fully successful chips for professional cards (note, not datacenter, but professional workstation PCs), the rest doesn’t really matter. Professional cards won’t suddenly sell in multiple quantities just because there are more chips - for them, it’s enough to have enough to meet demand, and the remaining chips can go into gaming card use.

Perhaps that tweet could theoretically be understood as the person confusing professional workstation cards with datacenter cards: the change NVIDIA has made would then be that it no longer hoards all fully intact chips for future professional-card use but instead puts more into RTX 5090 availability. And the reason could simply be that TSMC has exceeded expectations on the percentage of GB202 chips that come out fully intact. The plan may initially have been to put only the slightly flawed ones into gaming cards, but that yielded too few gaming cards, and now the knob is being turned to a different setting. However, this would have nothing to do with datacenter cards, whose chip is a different model (different memory controllers).

Or it may simply be about where the purchased silicon wafers are allocated: more GB202 wafers ordered, taken out of the GB200/B200 allocation. TSMC is still making the same number of wafers for NVIDIA as ordered, but the mix is shifting slightly.

4 Likes

Groq (and Cerebras etc. inference players) are indeed quite interesting challengers. In addition to price, these are interesting because:

  • Production in the USA and geopolitical twists
  • CUDA’s moat (at least with current innovations) is smaller in inference

But based on SemiAnalysis’s (Dylan Patel) analyses, Nvidia can easily beat the inference chips on the most important metric: performance / total cost of ownership. This situation may change, however, if/when Groq eventually reaches modern process nodes in its production (4 nm production was seen as possible as early as this year, now that they have received money from investors).

Groq’s biggest bottleneck is precisely the thing that brought the price down, i.e., the lack of HBM. This is why situations arise where a couple of hundred LPUs are needed for the same serving task that a single Nvidia H- or B-series GPU can handle. And precisely for this reason, perf/TCO is on Nvidia’s side.
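As a back-of-envelope sketch of the scale (my own illustration, not from the post; the per-chip figures are commonly cited approximations):

```python
# Rough sketch of why an SRAM-only design needs so many chips.
# Assumed figures (publicly cited, rounded): ~230 MB on-chip SRAM per
# Groq LPU, ~80 GB HBM per Nvidia H100; model weights at 8-bit precision.

model_bytes = 70e9 * 1        # a 70B-parameter model, 1 byte per weight
lpu_sram    = 230e6           # ~230 MB SRAM per LPU
h100_hbm    = 80e9            # ~80 GB HBM per H100

lpus_needed  = model_bytes / lpu_sram     # ≈ 304 chips just to hold weights
h100s_needed = model_bytes / h100_hbm     # ≈ 0.9, i.e. a single GPU

print(f"LPUs needed:  {lpus_needed:.0f}")
print(f"H100s needed: {h100s_needed:.1f}")
```

The weights alone land in the “couple of hundred LPUs per model” range, before any KV cache or activations are even counted.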

And as models become more complex and test-time compute / CoT etc. increase context length and thus also memory requirements, inference optimization is in a challenging spot. To some extent, even the new LPUs are optimized for NLP SOTA models that are 1-2 years old, which shows up, for example, as limitations on new non-NLP tasks (where model development seems to be heading).

10 Likes

Additionally, it should be noted that NVIDIA is not sleeping. If an inference-optimized chip has a real long-term advantage, NVIDIA will release one at some point and start selling two different products for two different purposes. Development cycles here are measured in years, though, so a fast-moving competitor has gotten to market first.

3 Likes

So Ross doesn’t see Nvidia as a competitor to Groq, but you do? :face_with_monocle:

1 Like

Exactly – just as I wrote. But that advantage is indeed also a bottleneck, because memory is the first limitation encountered in AI.

Here, instead of a superficial AI-generated analysis, I recommend reading an expert analysis, especially since it’s available almost for free:

This surface-level analysis by AI is simply wrong – or at least a very small part of the truth.

For example, in CoT, breaking the problem down into parts (at least under the current model paradigm) massively increases the size of the KV cache and thus the demands on memory. This means that the LPU’s memory capacity limitation is so fundamental that it places significant constraints on what kinds of models, and how deep a CoT, can be used for inference. Ultimately, it also limits the quality you get out of model inference on an LPU vs. a GPU: you might get cheaper $/token execution with an LPU, but the result squeezed from the model is worse than what you get with a GPU.
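To put rough numbers on the KV-cache point, here is a minimal sketch using the standard cache-size formula; the model dimensions are assumed Llama-2-70B-like values (GQA), not figures from the post:

```python
# KV-cache size per sequence: 2 (K and V) x layers x KV heads x head dim
# x tokens x bytes per element. Dimensions below are an assumed example.

def kv_cache_bytes(layers, kv_heads, head_dim, tokens, bytes_per_elem=2):
    return 2 * layers * kv_heads * head_dim * tokens * bytes_per_elem

short = kv_cache_bytes(80, 8, 128, 2_000)    # a short, direct answer
cot   = kv_cache_bytes(80, 8, 128, 32_000)   # a long chain of thought

print(f"2k tokens:  {short / 1e9:.2f} GB")   # ~0.66 GB
print(f"32k tokens: {cot / 1e9:.2f} GB")     # ~10.5 GB per concurrent user
```

A deep CoT multiplies the token count, and the cache grows linearly with it; against a few hundred megabytes of on-chip SRAM that is the fundamental constraint, while the same cache still fits in a single GPU’s HBM.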

If you want to understand this better, I recommend, for example, this Lex Fridman mega-episode featuring Dylan Patel and Nathan Lambert. The discussion related to memory and inference begins around the 2-hour mark and continues for about 60 minutes:

So, the LPU is certainly super good for inference use cases where it can be assumed that current LLMs are “good enough”. For example, I could imagine that in an MS D365 Copilot world, this could be really cost-effective for use cases where AI processes, say, sales orders from purchase orders. The output remains constant, and the input is universal enough that improvements to the underlying LLMs do not bring any practical functional improvements. No chains of thought or other frills are needed.

But since most AI use is still considerably “more dynamic” for now, this indeed poses a massive challenge for all hardware-level optimization for the time being.

11 Likes

Without consulting the model, but as an industry insider, I see MoE models like DeepSeek (V3, R1) as an interesting wildcard. They require relatively large amounts of memory (challenging for low-memory LPUs like Groq’s), but their memory-bandwidth utilization is limited, since only a fraction of the weights is read per token. Because of this, they are better suited to APU/GPU or even specialized CPU (with a robust memory bus) use.

If AMD could get its software side in order, I believe they would truly have something to contribute in that sector. Unfortunately, so far they have invested only in hardware. Still, their hardware is used by some big players who can afford their own software development on top of the company’s own poor software.

Cerebras and Groq are interesting companies, but I don’t believe they will scale sufficiently relative to their current valuations (Cerebras is likely to go public this year). Of course, I’ve been wrong before.
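A rough sketch of that capacity-vs-bandwidth split, using DeepSeek-V3’s published parameter counts (671B total, ~37B active per token); the 8-bit weight assumption is mine:

```python
# MoE trade-off: the whole model must fit in memory (capacity), but only
# the active experts are read per token (bandwidth).

total_params  = 671e9    # DeepSeek-V3 total parameters (published)
active_params = 37e9     # parameters activated per token (published)
bytes_per     = 1        # assumed 8-bit weights

capacity_gb  = total_params * bytes_per / 1e9    # must be resident: ~671 GB
read_per_tok = active_params * bytes_per / 1e9   # read per token: ~37 GB

print(f"Memory capacity needed:  {capacity_gb:.0f} GB")
print(f"Weights read per token:  {read_per_tok:.0f} GB")
print(f"Fraction touched/token:  {active_params / total_params:.1%}")  # ~5.5%
```

Only about a twentieth of the weights move per token, which is exactly why a big-but-not-extreme memory bus serves these models better than a tiny ultra-fast SRAM.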

4 Likes

I/O Fund has reduced its position to 10%. Reports from the supply chain indicate that the ramp-up of the largest NVLink GB200 systems is slower than anticipated.

2 Likes

The 50-series is shaping up to be Nvidia’s worst new GPU launch since the 20-series. As if the huge disappointment in efficiency and price weren’t enough, quality control also seems to have failed completely:


https://videocardz.com/newz/first-geforce-rtx-5080-discovered-with-104-rops-8-missing-in-action


https://www.notebookcheck.net/GeForce-RTX-5090-drops-below-RTX-4090-in-high-end-graphics-card-benchmark-chart.966347.0.html

My personal Nvidia graphics card ownership history goes 6600 GT → 8800 GT → 1080 Ti → 3090, so I believe I can, as a buyer, assess when Nvidia releases a top-tier product and when it’s complete garbage. If I had to rate it by gut feeling, I’d put the 50-series on a tier list at C or D-level. I honestly don’t believe any rational or informed person would buy this generation’s Nvidia graphics card if there are alternatives :smiley:

Fortunately, the sales of gaming graphics cards have practically no impact on the company’s stock price, as all the money and future growth come from the AI side.

4 Likes

That “benchmark chart” mainly tells you about the poor quality of said benchmark for measuring really fast graphics cards. In actual tests, the 5090 is still about 30% faster than the 4090…

But the other things you mentioned are valid points. NVIDIA has badly botched this launch. On the other hand, gaming graphics cards are a bit of a side hobby; they don’t affect the company’s stock price at all.

8 Likes

An excellent, very clear video from io-tech’s Sampsa Kurri about the design peculiarities of Nvidia’s GeForce RTX 40 and 50 series graphics card chips, which may have contributed to power connector melting incidents.

3 Likes

image

12 Likes

https://x.com/LiveSquawk/status/1894860864488136969


Most importantly, at the end: the dividend stream continues strong :upside_down_face: - the dividend party approves this.

As I at least expected, the biggest pressure is on margins, but there’s a bit of room to compromise on those margin percentages…

22 Likes

A strong earnings report that mostly exceeded market expectations on both revenue and EPS. The company is the clear market leader in AI processors, and its new chips are widely used in cloud services and AI models. Strong profitability is based on high margins, long-term contracts, and the growing role of software and AI services in the business. In other words, the outlook, according to the company, is still good, and this quarter’s figures looked quite good.

On the other hand, some noted that other tech giants are developing their own solutions, which could weaken Nvidia’s market position; then again, this quarter and the surrounding discussion were convincing regarding Nvidia.

Going forward, potential changes in AI demand, weaker-than-expected earnings development, or supply-chain issues could cause strong swings in the stock price, but as investors have noted, the company still has strong cash flow, a good long-term growth outlook, etc.

https://x.com/wallstengine/status/1894860626268430378

17 Likes

I’ll also put these here, which illustrate Nvidia’s vibe very well. :slight_smile:

https://x.com/StockMarketNerd/status/1894865186642628634

https://x.com/finchat_io/status/1894863266452853233

https://x.com/Quartr_App/status/1894864316500722014


EDIT:

I also added this, which allows for a good comparison of what went well and what went less well. :slight_smile:

https://x.com/ConsensusGurus/status/1894862310218383512

8 Likes

I divested my Nvidia shares on Tuesday. My confidence began to erode in January after reading this article, which I had already linked here once:

Below are the more detailed reasons for the sale. I still have Nvidia exposure through an ETF, so I naturally hope for the company’s continued success.

1. Dependence on key customers and one market segment

Amazon, Meta, Microsoft, and Alphabet are estimated to account for about 40% of Nvidia’s sales:

At the same time, 78% of sales come from the data center segment:

https://www.visualcapitalist.com/nvidia-revenue-by-product-line/

Nvidia stands on a significantly shakier foundation than the aforementioned quartet, whose revenue growth does not depend on the purchasing decisions of a few customers and the success of a single sector. Last spring I was willing to take this risk, but not in the current situation.

2. The need for computing power is taking a backseat

This risk materialized in January with the release of DeepSeek. As Nvidia’s customers optimize their AI models using the lessons learned from DeepSeek, the need for new investment may temporarily collapse, or at least decrease significantly. What would happen if each of Nvidia’s four most important customers announced cuts to their new AI investments? Microsoft has already announced it is terminating data center leasing agreements:

3. Investors are moving their funds elsewhere

Nvidia’s stock has been practically flat for the past 8 months, meaning demand for the shares relative to supply has decreased. At the same time, the market’s FOMO seems to have shifted to companies representing the next wave of AI, such as Palantir, and who knows where next.

4. Size becomes a limiting factor

Nvidia is now a company worth over 3 trillion. Cisco was a 500-billion company in 2000, and Apple is now about a 3.7-trillion company, meaning the value of the largest listed company has grown roughly 7.5-fold in 25 years. If we correct Cisco’s 2000 P/E from 200 down to 40, we get a 37-fold increase instead, which means the stock market’s “ceiling” has risen by about 15.5% per year. I would not expect such figures from Nvidia even if everything went perfectly for the next 25 years: at a 15.5% annual growth rate, Nvidia would be a 110-trillion company in 2050, which is about 5.5 times the value of the world’s current gold reserves.
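A quick sketch verifying that arithmetic (my own check; the inputs are the figures from the text above):

```python
# Checking the "ceiling raising" arithmetic.

cisco_2000  = 0.5e12 * (40 / 200)       # Cisco at a corrected P/E: ~$100B
apple_now   = 3.7e12
growth_x    = apple_now / cisco_2000    # ~37x over 25 years
cagr        = growth_x ** (1 / 25) - 1  # ~15.5% per year

nvidia_now  = 3.0e12
nvidia_2050 = nvidia_now * (1 + cagr) ** 25   # ~$111 trillion

print(f"Growth multiple: {growth_x:.0f}x, CAGR: {cagr:.1%}")
print(f"Nvidia in 2050 at that pace: ${nvidia_2050 / 1e12:.0f}T")
```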

The risk-reward ratio appears weak in the long term compared, for example, to the Nasdaq 100 index, which has returned 14% per year over the past 20 years. If Nvidia were an animal, it would probably be a Tyrannosaurus rex. Sufficiently large chunks of meat will run out on this planet sooner or later.

15 Likes

You shouldn’t put all your eggs in one blog. I recommend listening to yesterday’s earnings call or reading its transcript. Many of the concerns mentioned above were addressed in analysts’ questions.

A few excerpts from Jensen’s answers. Everyone can make their own interpretations, but so far, the comments have been more bullish than bullshit.

"With resenting AI, we’re observing another scaling law, inference time or test time scaling, more computation. The more the model thinks the smarter the answer. Models like OpenAI, Grok 3, DeepSeek-R1 are reasoning models that apply inference time scaling. Reasoning models can consume 100x more compute.

Future reasoning models can consume much more compute."

"We’re at the beginning of reasoning AI and inference time scaling. But we’re just at the start of the age of AI, multimodal AIs, enterprise AI sovereign AI and physical AI are right around the corner. We will grow strongly in 2025. Going forward, data centers will dedicate most of capex to accelerated computing and AI.

Data centers will increasingly become AI factories, and every company will have one, either rented or self-operated."

9 Likes

110,000 billion is roughly the annual value of today’s entire global GDP in dollars.

Assuming that the global economy grows nominally by 5% a year in dollars until 2050, GDP would then be 370,000 billion.

So, NVIDIA’s market value would be one-third of the global GDP. For comparison, today it is less than 3% of global GDP, and Apple is just over 3%.

GDP can be understood as the sum of all wages and capital income. Or the sum of all expenditures and investments.

If NVIDIA were to trade at, say, P/E 25x, a market value of 110,000 billion would mean a net profit of 4400 billion.

Thus, 1.2% of the entire global GDP would go to NVIDIA’s profits.
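A quick check of those figures (my own sketch; inputs as stated above):

```python
# Checking the GDP-share arithmetic.

gdp_now  = 110e12               # roughly today's global GDP in dollars
gdp_2050 = gdp_now * 1.05**25   # 5% nominal growth a year: ~$372T

nvda_mcap = 110e12              # the hypothetical 2050 market value
profit    = nvda_mcap / 25      # at P/E 25x: ~$4,400B net profit

print(f"Global GDP in 2050: ${gdp_2050 / 1e12:.0f}T")
print(f"Market-cap share:   {nvda_mcap / gdp_2050:.0%}")  # ~30%, one-third
print(f"Profit share:       {profit / gdp_2050:.1%}")     # ~1.2%
```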

Cf. today, Apple’s net profit of about 100 billion is 0.1% of global GDP, and even now, the profits of mega-tech companies sometimes draw criticism here and there, even though their combined profits are less than one percent of global GDP.

So, that annual growth rate of just over 15% would indeed be challenging by current standards, unless global GDP miraculously grows much faster.

It doesn’t matter what competitive advantage there is right now in inference etc.; the world is a pair of pants into which NVIDIA’s waist, at the pace expected by investors, will soon no longer fit. :smiley: That is, if expectations really are that wild. A more realistic figure might be 5-6% a year in the long run, provided the company’s competitive advantages don’t disappear for decades.

25 Likes