The NVIDIA GTC 2024 Keynote Live Blog (Starts at 1:00pm PT/20:00 UTC)

AnandTech Live Blog: The newest updates are at the top. This page will auto-update, there’s no need to manually refresh your browser.

05:02PM EDT – AWS, Google/GCP, Microsoft/Azure, Oracle are all on board

05:00PM EDT – NVIDIA started with two customers and has many now

04:59PM EDT – “This ability is super, super important”

04:58PM EDT – And this is where FP4 and NVLink switch come in

04:57PM EDT – Now looking at Blackwell vs Hopper

04:54PM EDT – Throughput is everything.

04:52PM EDT – Inference of LLMs is a challenge due to their size. They don’t fit on one GPU

04:51PM EDT – GB200 NVL72 can do the same job with 2,000 GPUs and 4MW of power

04:51PM EDT – Training GPT-MoE-1.8T would take 90 days on an 8,000 GPU GH100 system consuming 15MW
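
For context, the implied energy budgets of those two 90-day runs (my arithmetic, from the keynote's power figures):

```python
# Energy for the two 90-day GPT-MoE-1.8T training runs cited above.
hours = 90 * 24                           # 90 days of around-the-clock training

hopper_gwh = 15 * hours / 1000            # 8,000x GH100 at 15MW -> ~32.4 GWh
blackwell_gwh = 4 * hours / 1000          # 2,000-GPU GB200 NVL72 setup at 4MW -> ~8.6 GWh
print(f"Hopper: {hopper_gwh:.1f} GWh vs. Blackwell: {blackwell_gwh:.1f} GWh")
```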

04:50PM EDT – Weight: 3,000 pounds

04:48PM EDT – That saved 20kW to be spent on computation

04:48PM EDT – And those are all copper cables. No optical transceivers needed

04:48PM EDT – 5000 NVLink cables. 2 miles of cables

04:47PM EDT – 1 EFLOPS of FP4 in one rack

04:47PM EDT – This gives the GB200 NVL72 720 PFLOPS (sparse) of FP8 throughput

04:46PM EDT – 9 NVSwitch trays

04:46PM EDT – 18 trays of GB200 nodes, each node with 2 GB200 superchips
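
Running the numbers on that configuration (the per-GPU and FP4 figures below are derived, not quoted):

```python
# GB200 NVL72 rack math. Tray counts and the 720 PFLOPS FP8 figure are
# from the keynote; everything derived from them is my own arithmetic.
trays = 18                                     # GB200 compute trays per rack
gpus = trays * 2 * 2                           # 2 superchips/tray, 2 GPUs each = 72
rack_fp8_pflops = 720                          # sparse FP8 throughput per rack
per_gpu_pflops = rack_fp8_pflops / gpus        # 10 PFLOPS per GPU (sparse)
rack_fp4_eflops = 2 * rack_fp8_pflops / 1000   # FP4 runs at 2x the FP8 rate
print(gpus, per_gpu_pflops, rack_fp4_eflops)   # 72 10.0 1.44
```

That ~1.44 EFLOPS of FP4 is comfortably past the "1 EFLOPS of FP4 in one rack" figure above.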

04:46PM EDT – Which helps NVIDIA build the DGX GB200 NVL72

04:45PM EDT – Connecting GPUs to behave as one giant GPU

04:44PM EDT – Meanwhile there’s also the new NVLink Switch chip. 50B transistors, built on TSMC 4NP

04:44PM EDT – But why stop there?

04:43PM EDT – 5x the inference/token generation ability of Hopper

04:42PM EDT – “The future is generative”

04:41PM EDT – FP4 gains are even greater, since Hopper can’t process FP4 natively

04:41PM EDT – Meanwhile FP8 performance is 2.5x that of Hopper

04:40PM EDT – New: FP4 support. FP6 as well
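
For a feel for just how coarse FP4 is, here's a minimal sketch of rounding values onto a 4-bit float grid (an e2m1-style format; this illustrates the concept only, not NVIDIA's implementation):

```python
import numpy as np

# The 8 non-negative values representable in an e2m1-style FP4 format
# (1 sign bit, 2 exponent bits, 1 mantissa bit). Illustrative only.
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_fp4(x: np.ndarray) -> np.ndarray:
    """Round each value to the nearest representable FP4 value."""
    signs = np.where(x < 0, -1.0, 1.0)
    idx = np.abs(np.abs(x)[:, None] - FP4_GRID[None, :]).argmin(axis=1)
    return signs * FP4_GRID[idx]

print(quantize_fp4(np.array([0.2, -1.3, 3.7, 5.1])))  # [ 0.  -1.5  4.   6. ]
```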

04:40PM EDT – Blackwell also adds a dedicated decompression engine that can sustain 800GB/sec of transfers

04:39PM EDT – Security remains a popular subject as well. Protecting the results of a model training, protecting the model itself. Which means supporting full speed encryption throughout

04:38PM EDT – So NVIDIA has added a RAS engine that can do a full, in-system self-test of a node

04:38PM EDT – Jensen is also pushing the importance of reliability. No massive cluster is going to stay up 100% of the time, especially running weeks on end

04:36PM EDT – 5th generation NVLink

04:36PM EDT – Support down to FP4

04:36PM EDT – AI is about probability. Need to figure out when you can use lower precisions and when you can’t

04:35PM EDT – Automatically and dynamically recast numerical formats to a lower precision when possible

04:35PM EDT – Second generation transformer engine
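
As a toy illustration of that dynamic recasting idea (the policy and thresholds below are invented for illustration; the real transformer engine tracks tensor statistics in hardware and rescales values rather than just bucketing them):

```python
import numpy as np

# Pick a storage format per tensor based on its dynamic range.
FP4_E2M1_MAX = 6.0      # largest finite value in an e2m1 FP4 format
FP8_E4M3_MAX = 448.0    # largest finite value in the e4m3 FP8 format

def pick_format(tensor: np.ndarray) -> str:
    peak = float(np.abs(tensor).max())
    if peak <= FP4_E2M1_MAX:
        return "fp4"
    if peak <= FP8_E4M3_MAX:
        return "fp8"
    return "fp16"       # out of low-precision range; stay at higher precision

print(pick_format(np.array([0.3, -2.5])))     # fp4
print(pick_format(np.array([120.0, -87.0])))  # fp8
```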

04:35PM EDT – “We need a whole lot of new features”

04:34PM EDT – “But there’s more!”

04:34PM EDT – And that’s Grace Blackwell

04:34PM EDT – NVLink on top, PCIe Gen 6 on the bottom

04:34PM EDT – “This is a miracle”

04:33PM EDT – GB200 is memory coherent

04:33PM EDT – GB200. 1 Grace CPU + 2 Blackwell GPUs (4 GPU dies)

04:33PM EDT – Jensen showing off Blackwell boards… and trying not to drop them

04:32PM EDT – The chip goes into two types of systems. B100, which is designed to be drop-in compatible with H100/H200 HGX systems

04:31PM EDT – No memory locality issues or cache issues. CUDA sees it as a single GPU

04:31PM EDT – 10TBps link between the dies

04:30PM EDT – “You’re very good. Good girl”

04:30PM EDT – “It’s okay, Hopper”

04:30PM EDT – Holding up Blackwell next to a GH100 GPU

04:29PM EDT – “Blackwell is not a chip. It’s the name of a platform”

04:29PM EDT – NVLink 5 scales up to 576 GPUs

04:29PM EDT – And NVIDIA is building a rack-scale offering using GB200 and the new NVLink switches: GB200 NVL72

04:28PM EDT – NVLink 5. Which comes with a new switch chip

04:28PM EDT – Available as an accelerator and as a Grace Blackwell Superchip

04:27PM EDT – 1.8TB/sec NVLink bandwidth per chip

04:27PM EDT – 192GB HBM3E@8Gbps
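
That spec implies roughly 8TB/sec of memory bandwidth (my arithmetic, assuming eight HBM3E stacks with standard 1024-bit interfaces):

```python
# Implied bandwidth of "192GB HBM3E @ 8Gbps". The stack count and bus
# width are assumptions based on standard HBM configurations.
stacks = 8                 # 8 x 24GB stacks = 192GB
bus_bits = 1024            # bits per HBM stack interface
pin_gbps = 8               # data rate per pin
gb_per_s = stacks * bus_bits * pin_gbps / 8   # bits -> bytes: 8,192 GB/s
print(f"{gb_per_s / 1000:.1f} TB/s")          # ~8.2 TB/s
```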

04:27PM EDT – 208B transistors

04:27PM EDT – Two dies on one package, full cache coherency

04:27PM EDT – Now rolling a video

04:27PM EDT – Named after David Blackwell, the mathematician and game theorist

04:26PM EDT – And here’s Blackwell. “A very, very big GPU”

04:26PM EDT – “We’re going to have to build even bigger GPUs”

04:24PM EDT – “We need even larger models”

04:23PM EDT – To help the world build these bigger systems, NVIDIA has to build them first

04:23PM EDT – As well as developing technologies like NVLink and tensor cores

04:22PM EDT – The answer is to put a whole bunch of GPUs together

04:22PM EDT – “What we need are bigger GPUs” “Much much bigger GPUs”

04:22PM EDT – Even a PetaFLOP GPU would take 30 billion seconds to train that model. That’s 1000 years

04:21PM EDT – 1.8T parameters is the current largest model. This requires several trillion tokens to train
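
Checking that 1,000-year figure (assuming the common ~6 FLOPs per parameter per token estimate, and taking "several trillion" as 3 trillion tokens, both my assumptions):

```python
# Back-of-the-envelope check on the 1,000-year claim. The 6-FLOPs-per-
# parameter-per-token rule of thumb and the 3T token count are my
# assumptions, not keynote figures.
params = 1.8e12                     # 1.8T-parameter model
tokens = 3e12                       # "several trillion" tokens, assumed 3T
total_flops = 6 * params * tokens   # ~3.2e25 FLOPs for the full run
gpu_flops = 1e15                    # a 1 PFLOPS GPU
seconds = total_flops / gpu_flops   # ~3.2e10 seconds
print(f"{seconds:.1e} s ≈ {seconds / (365*24*3600):,.0f} years")  # ~1,000 years
```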

04:21PM EDT – Doubling the parameter count requires increasing the token count

04:20PM EDT – Now recapping the history of large language models, and the hardware driving them

04:20PM EDT – Omniverse will be the fundamental operating system for digital twins

04:20PM EDT – Going to connect Cadence’s digital twin platform to Omniverse

04:19PM EDT – Cadence, the EDA tool maker, is joining Club GPU as well

04:18PM EDT – TSMC is announcing today that they’re going into production with cuLitho

04:18PM EDT – Accelerating computational lithography

04:18PM EDT – Synopsys. NVIDIA’s very first software partner. Literally

04:17PM EDT – Ansys

04:17PM EDT – Jensen will be announcing several important partnerships

04:17PM EDT – NVIDIA will have partners joining them today

04:16PM EDT – Jensen would like to simulate everything they do in digital twin virtual environments

04:16PM EDT – It’s not about driving down the cost, it’s about driving up the scale

04:15PM EDT – Need new ways to keep growing computing. Keep consuming computing

04:15PM EDT – “General purpose computing has run out of steam”

04:15PM EDT – NVIDIA’s GPUs are only worth using because of the software written for them. So NVIDIA has always made it a point to showcase what the software development community has been up to

04:14PM EDT – (I’m told the capacity of the SAP Center for concerts is 18,500 people. With this floor layout, there’s probably more than that here)

04:13PM EDT – Warp, PhysX Flow, Photoshop, and more

04:13PM EDT – Rolling a demo reel of GPU-accelerated applications

04:12PM EDT – “Everything is homemade”

04:12PM EDT – “It’s being animated with robotics. It’s being animated with artificial intelligence”

04:11PM EDT – Everything we’ll be shown today is a simulation, not an animation

04:11PM EDT – As well as software and applications in this industry. And how to start preparing for what’s next

04:11PM EDT – “We’re going to talk about how we’re doing computing next”

04:10PM EDT – Comparing generative AI to the industrial revolution and the age of energy

04:09PM EDT – “The software never existed before, it is a brand new category”

04:09PM EDT – 2023: Generative AI emerged, and a new industry begins

04:09PM EDT – CUDA became a success… eventually. A bit later than Jensen would have liked

04:07PM EDT – How did we get here? Jensen drew a comic flow chart

04:07PM EDT – “The computer is the single most important instrument in society today”

04:06PM EDT – Showing off a list of exhibitors. It’s a very big list of names

04:05PM EDT – Jensen is recapping the many technologies to be seen here. “Even artificial intelligence”

04:04PM EDT – “I sense a very heavy weight in the room all of a sudden”

04:04PM EDT – “This is not a concert. You have arrived at a developers conference”

04:04PM EDT – So without further ado, here’s Jensen

04:04PM EDT – This is Ryan. Apologies for the late start here folks, it’s been an interesting time getting everyone down to the floor and settled

We’re here in sunny San Jose, California for the return of an event that’s been a long time coming: NVIDIA’s in-person GTC. The Spring 2024 show, NVIDIA’s marquee event for the year, promises to be a big one, as the company is due to deliver updates on its all-important datacenter accelerator products – the successor to the GH100 GPU and its Hopper architecture – along with NVIDIA’s other professional/enterprise hardware, networking gear, and, of course, a slew of software stack updates.

In the 5 years since NVIDIA was last able to hold a Spring GTC in person, a great deal has changed for the company. They’re now the third biggest company in the world, thanks to explosive sales growth (and expectations of even further growth) driven in large part by the combination of GPT-3/4 and other transformer models, and NVIDIA’s transformer-optimized H100 accelerator. As a result, NVIDIA is riding high in Silicon Valley, but to keep doing so, they will need to deliver the next big thing to push the envelope on performance and keep a number of hungry competitors off their turf.

Headlining today’s keynote is, of course, NVIDIA CEO Jensen Huang, whose kick-off address has finally outgrown the San Jose Convention Center. As a result, Huang is filling up the local SAP Center arena instead. Suffice it to say, it’s a bigger venue for a bigger audience for a *much* bigger company.

So come join the AnandTech crew for our live blog coverage of NVIDIA’s biggest enterprise keynote in years. The presentation kicks off at 1pm Pacific, 4pm Eastern, 20:00 UTC.

