彼は"Token"という言葉でこれから来るAI経済のスケールを表していました。"Tokens per watt"(1ワット当たりの生成トークン数)が単位の経済です。電力を効率よくトークンに変える工場がファクトリーである。そのファクトリーこそがこれからの経済の主たる製造業になる。彼はAIデータセンターという言葉をもう使わずに、AI Factory(AIファクトリー)という言葉を使って説明します。次世代のAIデータセンター=AIファクトリーは最高の効率で電力をトークンに変える工場である...という捉え方です。
The source of revenue: Huang states that "an AI factory's revenue equals its tokens per watt." A company whose infrastructure produces tokens efficiently (Rubin or DSX) can supply "intelligence" to the market more cheaply and in greater volume than its rivals, and can therefore secure higher margins. (Imaizumi's note: here Jensen Huang is opening a new front against AMD and the Chinese competitors chasing NVIDIA in AI silicon. By making tokens per watt the axis of competition, he frames a contest that rivals can never win against NVIDIA's vertically integrated architecture.)
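To make the tokens-per-watt arithmetic concrete, here is a minimal sketch with invented numbers; none of these figures come from the keynote, and the variable names are purely illustrative.

```python
# Minimal sketch of the "tokens per watt" economics with made-up numbers.

RACK_POWER_W = 120_000          # hypothetical power draw of one rack (W)
TOKENS_PER_SECOND = 1_000_000   # hypothetical aggregate inference throughput
PRICE_PER_KWH = 0.08            # hypothetical electricity price (USD/kWh)

tokens_per_watt_second = TOKENS_PER_SECOND / RACK_POWER_W

# Energy needed per token, in kWh, then the power bill per million tokens.
kwh_per_token = (RACK_POWER_W / 1000) / (TOKENS_PER_SECOND * 3600)
energy_cost_per_million_tokens = kwh_per_token * PRICE_PER_KWH * 1_000_000

print(f"tokens per watt-second: {tokens_per_watt_second:.2f}")
print(f"energy cost per 1M tokens: ${energy_cost_per_million_tokens:.4f}")
```

Under these assumptions, doubling tokens per watt halves the energy cost of every token sold, which is exactly why Huang treats the ratio as the factory's margin lever.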
..Last year at this time, I said that, where I stood at that moment in time, we saw about $500 billion of very high-confidence demand and purchase orders for Blackwell and Rubin through 2026. I said that last year. Now, I don't know if you guys feel the same way, but $500 billion is an enormous amount of revenue. No one's impressed? I know why you're not impressed: because all of you had record years. Well, I'm here to tell you that right now, where I stand, a few short months after GTC DC, one year after last GTC, right here where I stand, I see through 2027 at least $1 trillion. Now, does it make any sense? That's what I'm going to spend the rest of the time talking about. In fact, we are going to be short; I am certain computing demand will be much higher than that, and there's a reason for that. So the first thing is, we did a lot of work in the last year. Of course, as you know, 2025 was NVIDIA's year of inference. We wanted to make sure that not only were we good at training and post-training, but that we were incredibly good at every single phase of AI, so that the investments made in our infrastructure could scale out for as long as they would like to use it, and the useful life of NVIDIA's infrastructure would be long, and therefore the cost would be incredibly low. The longer you could use it, the lower the cost. There's no question in my mind that NVIDIA systems are the lowest-cost infrastructure you could get for AI in the world.
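The "longer you use it, the lower the cost" claim is simple capex amortization. A minimal sketch, again with hypothetical figures that are not from the keynote:

```python
# Illustration of how a longer useful life dilutes capex per token.
# The capex and throughput numbers here are invented.

CAPEX_USD = 3_000_000           # hypothetical cost of one GPU rack
TOKENS_PER_SECOND = 1_000_000   # hypothetical sustained inference throughput
SECONDS_PER_YEAR = 365 * 24 * 3600

for useful_life_years in (3, 5, 7):
    lifetime_tokens = TOKENS_PER_SECOND * SECONDS_PER_YEAR * useful_life_years
    capex_per_million_tokens = CAPEX_USD / lifetime_tokens * 1_000_000
    print(f"{useful_life_years} yr life: "
          f"${capex_per_million_tokens:.4f} per 1M tokens")
```

The hardware cost embedded in each token falls in inverse proportion to the years the infrastructure stays useful, which is the economic argument behind keeping one architecture viable across every phase of AI.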
And so the first part: last year was all about AI for inference, and it drove this inflection point. Simultaneously, we were very pleased last year that Anthropic has come to NVIDIA, that MSL, Meta Superintelligence Labs, has chosen NVIDIA. And meanwhile, as a collection, as a group, this represents one-third of the world's AI compute. Open-source models have reached near the frontier, and it is literally everywhere. And NVIDIA, as you know, today we're the only platform in the world that runs every single domain of AI across every single one of these AI models: language and biology, computer graphics, computer vision and speech, proteins and chemicals, robotics and otherwise, edge or cloud, any language. NVIDIA's architecture is fungible for all of that, and we're incredible for all of that. That allows us to be the lowest-cost, highest-confidence platform. Because when you're building these systems, as I mentioned, a trillion dollars is an enormous amount of infrastructure. You have to have complete confidence that the trillion dollars you're putting down will be utilized, will be performant, will be incredibly cost-effective, and will have useful life for as long as you could see. That infrastructure investment you could make on NVIDIA, you could make with complete confidence. We have now proven that it is the only infrastructure in the world that you could build anywhere in the world with complete confidence. You want to put it in any of the clouds? We're delighted by that. You want to put it on-prem? We're happy about that. You want to put it in any country, anywhere? We're delighted to support you. We are now a computing platform that runs all of AI.
Now, our business is already starting to show that. 60% of our business is hyperscalers, the top five hyperscalers. However, even within that top five, some of it is internal AI consumption. The internal AI consumption is really important work: recsys is moving from recommender systems of tables, collaborative filtering, and content filtering towards deep learning and large language models; search is moving to deep learning and large language models. Almost all of these different hyperscale workloads are now shifting towards workloads that NVIDIA GPUs are incredibly good at. But on top of that, because we work with every AI lab, because we accelerate every AI model, and because we have a large ecosystem of AI natives that we work with, that we can bring to the clouds, that investment, no matter how large, no matter how quick, that compute will be consumed. And that represents 60% of our business. The other 40% is just everywhere: regional clouds, sovereign clouds, enterprise, industrial, robotics, edge, big systems, supercomputing systems, small servers, enterprise servers. The number of systems is incredible. The diversity of AI is also its resilience; the span of reach of AI is its resilience. There is no question this is not a one-app technology. This is now fundamental. This is absolutely a new computing platform shift.
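The recommender shift Huang describes is worth a concrete picture. A common deep-learning pattern is the two-tower model, sketched below in plain numpy; this illustrates the general technique, not any specific NVIDIA or hyperscaler system, and all shapes and data are invented.

```python
# Minimal two-tower recommender sketch. Classic collaborative filtering keeps
# a sparse user-item table; the two-tower model instead learns dense user and
# item embeddings and scores pairs with a dot product, turning recommendation
# into dense linear algebra that maps naturally onto GPUs.
import numpy as np

rng = np.random.default_rng(0)
N_USERS, N_ITEMS, DIM = 1000, 5000, 64

# Stand-ins for trained embedding tables (random here for illustration).
user_emb = rng.normal(size=(N_USERS, DIM)).astype(np.float32)
item_emb = rng.normal(size=(N_ITEMS, DIM)).astype(np.float32)

def recommend(user_id: int, k: int = 5) -> np.ndarray:
    """Return the indices of the top-k scoring items for one user."""
    scores = item_emb @ user_emb[user_id]      # one dense matvec
    return np.argpartition(scores, -k)[-k:]

print(recommend(user_id=42))
```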
Well, our job is to continue to advance the technology, and one of the most important things I mentioned last year was that last year was our year of inference. We dedicated everything. We took a giant chance and reinvented while Hopper was at its prime, and it was just cooking. We decided that the Hopper architecture, the NVLink 8, had to be taken to the next level. We completely rearchitected the system, disaggregated the computing system altogether, and created NVLink 72. The way that it's built, the way it's manufactured, the way it's programmed completely changed. Grace Blackwell NVLink 72 was a giant bet, and it wasn't easy for anybody, and many of my partners are here in the room. I want to thank all of you for the hard work that you guys did. Thank you. NVLink 72, NVFP4. Not just FP4 precision; NVFP4 is a whole different type of tensor core and computational unit. We've demonstrated now that we can inference NVFP4 without loss of precision but with a gigantic boost in performance and energy efficiency. We've also been able to use NVFP4 for training. So NVLink 72, NVFP4, the invention of Dynamo, TensorRT-LLM, a whole bunch of new algorithms. We even built a supercomputer to help us optimize kernels and help us optimize our complete stack; we call it DGX Cloud. We invested billions of dollars of supercomputing capability to help us create the kernels, the software, that made inference possible.
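To see what 4-bit floating point means in practice, here is a rough simulation of FP4 (E2M1: two exponent bits, one mantissa bit) quantization with per-block scaling, in the spirit of NVFP4. This is an illustrative sketch only, not NVIDIA's actual NVFP4 implementation, which also uses FP8 block scales and dedicated tensor-core hardware; the block size and test data here are assumptions.

```python
# Simulated FP4 (E2M1) quantize/dequantize round trip with per-block scaling.
import numpy as np

# The positive magnitudes representable in E2M1: 0, 0.5, 1, 1.5, 2, 3, 4, 6.
E2M1_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0], dtype=np.float32)
BLOCK = 16  # quantize in small blocks, each with its own scale factor

def fp4_round_trip(x: np.ndarray) -> np.ndarray:
    """Round each block of x to the nearest scaled E2M1 value and back."""
    out = np.empty_like(x)
    for i in range(0, len(x), BLOCK):
        block = x[i:i + BLOCK]
        scale = np.abs(block).max() / E2M1_GRID[-1]  # map block max to 6.0
        if scale == 0.0:
            scale = 1.0
        scaled = np.abs(block) / scale
        # Nearest-value rounding onto the FP4 grid; sign restored afterwards.
        idx = np.abs(scaled[:, None] - E2M1_GRID[None, :]).argmin(axis=1)
        out[i:i + BLOCK] = np.sign(block) * E2M1_GRID[idx] * scale
    return out

x = np.random.default_rng(0).normal(size=256).astype(np.float32)
err = np.abs(x - fp4_round_trip(x)).mean()
print(f"mean absolute quantization error: {err:.4f}")
```

The per-block scale is what keeps an 8-value grid usable: each small group of weights or activations gets its own dynamic range, so 4 bits per value can still track tensors whose magnitudes vary widely, which is the premise behind the efficiency gains Huang claims.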