One small discussion making the rounds → AMD’s processor efficiency (showcased in its fastest supercomputer and already visible in the current 6000 series) carries over into the upcoming processors, making them an attractive product for an energy-starved world right now.
The TDP values at the bottom right need to be noted. Everyone is eagerly waiting for the performance numbers in Q1 2023.
GPU/AI:
While AMD is putting resources into improving their software (they never had the money to do this before), we are now seeing large companies with money working on software that makes it easy to switch away from CUDA.
Software has become a key battleground for chipmakers seeking to build up an ecosystem of developers to use their chips. Nvidia’s CUDA platform has been the most popular so far for artificial intelligence work.
However, once developers tailor their code for Nvidia chips, it is difficult to run it on graphics processing units, or GPUs, from Nvidia competitors like AMD. Meta said the software is designed to easily swap between chips without being locked in.
It is quite clear that the market leaders are going to get hit. It is unclear how the underdogs will fare. AMD’s Lisa Su has maintained that there will be no significant short-term impact on their business from these regulations, most likely because they are only catching up now. AMD CEO Lisa Su says US chip export ban won't hurt her company in the short term | Fortune (paywall, could not find another link… will add here when I find it).
Future TAM will definitely be affected, and it will need some tweaking by every CPU/GPU company.
Edit: Removing a chart from here. The YouTuber has removed it, citing a bad software setting used in the Intel test.
The latest Raptor Lake (retail) CPUs are comparable to, or better than, current AMD processors.
Though the retail market is not hot and AMD is focusing on datacenter and laptops, they have delivered a good product overall. They are launching an armada of laptop processors next year. There is not much expectation, since some OEMs in the laptop market are tightly controlled by Intel, not to mention the lack of demand. Folks like Dell just will not put non-Intel chips in their high-end laptops.
People are moving to cloud aggressively (Example: FedEx).
Chiplets - As of now, only AMD can move capacity like this. Their CPUs/GPUs use a mix of nodes (5nm and 6nm), so when they see the market slow in one sector, they can move part of the capacity on one of those nodes to cater to another sector where demand is present. This is unique to AMD as of now.
Genoa marks a big shift in TCO that makes it sensible to replace aging servers. A 2S (socket) Genoa server offers 4x the general-purpose performance at significantly better TCO versus a 2S Skylake/Cascade Lake SP server. Initial capital expenditures for Genoa-based servers are considerably higher due to the higher costs of the CPU, DDR5, and PCIe 5.0. Despite this big cost jump, Genoa- and Bergamo-based servers will pay for themselves many times over versus keeping already depreciated servers deployed.
Under this oversimplified model, upgrading to one 2-socket Genoa-based server from 4 existing 2-socket Skylake/Cascade Lake-based servers (2 CPUs vs 8 CPUs) is a net-present-value-positive transaction. The payback period for the capex spent is roughly 18 months. The payback period for a Rome/Milan server upgrade would still be ~4 years. The improvements are even more significant when you start considering new features related to security, CXL, and AVX-512.
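The payback logic above can be sketched in a few lines. All numbers below (capex, monthly opex) are illustrative assumptions I picked so the arithmetic is easy to follow - the actual model behind the article also includes NPV discounting and depreciation:

```python
# Hedged, oversimplified payback sketch: replace four depreciated 2S
# Skylake servers with one 2S Genoa server of equal total throughput,
# and ask when the opex (power/space) savings repay the upfront capex.
# All dollar figures are made up for illustration only.

def payback_months(new_server_capex, old_opex_per_month, new_opex_per_month):
    # Months until cumulative monthly savings cover the purchase price.
    monthly_savings = old_opex_per_month - new_opex_per_month
    return new_server_capex / monthly_savings

genoa_capex = 40_000      # $, one 2S Genoa server (CPU + DDR5 + PCIe 5.0)
old_opex = 4 * 700        # $/month, power + space for 4 old 2S servers
new_opex = 550            # $/month, one Genoa server

print(f"payback ~{payback_months(genoa_capex, old_opex, new_opex):.0f} months")
```

With these (hypothetical) inputs the answer lands near the ~18 months the model above quotes; the point is that the result is dominated by how much opex the old fleet burns.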
We see that the DC CPU story is progressing as was predicted back in 2020.
Update on GPU
AMD has been targeting the next behemoth, Nvidia, since last year. Performance-wise, AMD hardware was able to match Nvidia hardware, as we saw last gen. Looking at the overclocking being done with the 6800 XT, I think AMD could beat Nvidia's Titan cards, but they do not want that fight… not just yet.
But what happened this gen? You will usually read that Nvidia still has the top-performance cards. But that is coming at a cost.
Their GPU takes massive wafer space - 608/628mm² for the 4090, against a reticle limit of 800mm². This reduces yields compared to smaller chips (see Die Yield Calculator - iSine). Note that this includes their graphics and IO IP.
The larger chip also limits Nvidia's ability to pack in more shaders (say, compute units); they can pack only so many before they hit the reticle limit. Compared to this, AMD's top GCD is at ~300mm², so they can always double their shader count by making a larger GCD (graphics compute die). They have chiplets… so they keep IO out of the GCD and put it on a cheaper, older node (6nm for the 7xxx series) while the GCD is on a newer node (5nm for the 7xxx series). But there is no room left for Nvidia except to hope for a node shrink. Next gen is when we will clearly see Nvidia facing the death of Moore's law that their CEO is talking about: https://venturebeat.com/games/jensen-huang-qa-why-moores-law-is-dead-and-smart-design-is-replacing-it/. AMD is not as affected, since they went chiplets.
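To make the die-size argument concrete, here is a minimal sketch using the classic Poisson yield model (Y = e^(-D·A)) and a standard dies-per-wafer approximation. The defect density of 0.1/cm² and the approximations themselves are assumptions for illustration, not TSMC data:

```python
import math

def dies_per_wafer(die_area_mm2, wafer_diameter_mm=300):
    # Rough candidate-die count: wafer area over die area, minus a
    # classic correction term for partial dies lost at the wafer edge.
    r = wafer_diameter_mm / 2
    wafer_area = math.pi * r**2
    return int(wafer_area / die_area_mm2
               - math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2))

def poisson_yield(die_area_mm2, defect_density_per_cm2=0.1):
    # Poisson yield model: fraction of dies with zero defects.
    return math.exp(-defect_density_per_cm2 * die_area_mm2 / 100.0)

# ~608 mm^2 monolithic (4090-class) vs ~300 mm^2 GCD (7xxx-class)
for area in (608, 300):
    n, y = dies_per_wafer(area), poisson_yield(area)
    print(f"{area} mm^2: ~{n} dies/wafer, yield ~{y:.0%}, good dies ~{n * y:.0f}")
```

Even with a generously low defect density, the smaller die gets both more candidates per wafer and a higher fraction of good ones - which is exactly the cost lever chiplets pull.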
For Nvidia, the desperation for the “fastest GPU” mindshare is visible in how hard they are pushing the power requirements.
AMD refused to chase the prices set by Nvidia, or the performance set by Nvidia. They could have easily pushed clocks (and hence power) and matched Nvidia's performance, but they instead chose to cut their own path. They kept GPU prices at $1000/$900 (compared to Nvidia's $1600/$1300) and still come close to Nvidia's raster performance (ray tracing is another story). Raster is the majority of the market. That is their claim; we will see when the cards and third-party reviews arrive in December.
That is the current status in hardware: Nvidia pushing power to keep mindshare, and AMD refusing to play that game as of now. So unless Nvidia's next gen (Blackwell) is chiplets, Nvidia is going to face in hardware what happened to Intel. Of course, customers prefer Nvidia for their software platform - a huge advantage there. So the job is cut out for both teams: Nvidia needs to figure out how to do chiplets, and AMD has to figure out how to do great software. On that front, the following info is noteworthy:
AMD is using 6,000 unique system configurations for graphics driver testing, 1,500 more than NVIDIA.
Regarding the Alibaba ARM cloud, I am eagerly waiting for AMD's Bergamo with 128 Zen 4c cores tailored for cloud. We are also seeing efficiency cores from Intel, which will very likely make it to cloud customers. So we need to wait for these cores to arrive and compete, and then, only then, see how ARM does against legitimate competition in the area that ARM server processor designers are targeting.
Mosesmann - I used to cover ARM back when they were public. I recall an executive at ARM had said that, all things being equal, if we had an ARM processor at the same process node, because of the efficiency of the architecture we could use 1/3 of the transistors that an x86 processor would use. I don't know if you can comment on that, because it is more of an x86 question and not your area historically, but…
Peng - Yeah… well, Hans, you know what I would say is, okay, I guess… you know, I started my career as a microprocessor designer. My first program was on a VAX minicomputer from Digital, so that shows you my age ( ). I have done VAXes, I have done MIPS (VP of engineering), SGI; we have done multiple generations of ARM - at Xilinx we do ARM SoCs - and now I am with a company that is x86. So what I would say is that that is technically not accurate (he is laughing a bit here). Modern architectures have a lot of commonality. I am not saying there aren't some differences between these instruction sets and architectures, but that claim of factors like that is simply not true, right… When you target certain things, like the ultimate in single-threaded performance, that leads you to certain architectural choices. If you are not targeting the ultimate in single-threaded performance and you are targeting something else - say, mobile handsets or something where you care a lot about power - you have different architectural choices. I think it is much more about architecture and implementation choices as opposed to what is inherent in the instruction set architecture… I have done a lot of architecture in my time, but I think that is tremendously exaggerated.
Jim Keller had something similar to say about ARM vs x86. His talk is now available on YouTube.
All I will say is that ARM is making incursions, but we do not know how much of that is due to the goof-ups Intel has made. It could very well be that new server entrants will learn some old lessons from Intel. Having your own fab is a different thing altogether. Another factor coming in is geopolitics… so… we can only observe how ARM progresses as of now and set our expectations at the right level, because it is only now, in 2023, that AMD and Intel (maybe) will ship cores targeting cloud customers specifically.
EPYC Genoa Launch Event
So AMD got MSFT/VMware/semi/HP/Dell/Lenovo/Supermicro/Google Cloud to talk about the benefits of the Genoa/EPYC platforms in terms of power/energy savings/performance.
→ together we advance_data centers
They really doubled down on capex and opex. There was stress on TCO across these clients.
New Platform: Very likely a 5-year plan. MSFT said you get upgraded to Genoa-X (the 3D V-Cache version) when it arrives. Zen 5/Zen 6 are going to be drop-in replacements; the platform stays. Intel has nothing close to this.
→ Migration is easier said than done, though, because it is not just the processor you change when moving away from Intel; there are other peripheral costs added in. But this platform gives a compelling reason to ask "why not" (as one customer mentioned in the presentation). Example: “what can be done with Intel can be done with 1/3rd the servers at 50% less power, which combines to a 40% capex and 61% opex reduction per year”.
The generational uplift from Milan to Genoa was incredible across the wide range of server and HPC benchmarks I’ve carried out. I am now left to daydream about what Genoa-X will look like next year, knowing there is still even more potential to squeeze out of Zen 4 on the server side, as well as next year’s Bergamo CPUs with up to 128 cores focused on cloud computing workloads.
Overall though, on a raw performance basis, Genoa is a clean kill. Nothing comes close, that 50% core count increase is simply crushing to anything out there and on that basis is nearly as massive a generational change as the Naples to Rome jump. The competition has no answer for years.
I stumbled on this article - a view into the scale of datacenters.
Compass Data Centers, Prince William County: In June, Compass Datacenters filed plans to build up to 10.5 million square feet of data center capacity in the Prince William Digital Gateway, a proposed 2,100-acre technology corridor in Manassas, Virginia which could accommodate up to 27 million square feet of data center development. Compass seeks to rezone 825 acres of land for its project. The Digital Gateway is controversial because it is adjacent to one of the Manassas Civil War battlefields and a state forest, but last week passed a key milestone when the Prince William Planning Commission recommended the project be approved by the Prince William Board of Supervisors, which is expected to review it next month.
Wonderful interview with a former GlobalFoundries (a fab company like TSMC) VP about the current status of Intel/AMD/NVIDIA - the near past, present, and near future.
note: He details how he underestimated benefits of chiplets and how the economics of chiplets is a big advantage.
The results from Larabel show that the EPYC 9374F came very close to matching the Intel Xeon Platinum 8380 2P in single-core tests. In multi-core workloads, however, the results shine for AMD. This is made even better by the fact that the single 32-core chip on the 1P platform was running against two Xeon Platinum 8380 chips with a combined total of 80 cores and 160 threads. Power consumption was just as striking: the EPYC 9374F peaked at a fantastic 327.56W, while the Intel Xeon 8380 2P maxed out at 583.63W - roughly 1.8x in favor of the single AMD EPYC 9374F over the two Ice Lake Xeon 8380 processors.
Wonderful read on future datacenter architectures and how AMD is at the intersection of AI and Datacenter that is coming our way. The author is holding AMD shares.
Advantages of chiplets explained perfectly, with real examples, by AMD Fellow Sam Naffziger.
New information for me:
Porting IO IPs to a new node is engineering-intensive - so with chiplets, they can avoid moving IO to a new node at all. Why does this not hurt? Point 2 below.
Logic (CPU/GPU) scales well with node shrinks, but other IPs do not. So with chiplets, they have the option to move only the CPU to the new node while keeping IO on a separate die in the older node.
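A tiny numeric sketch of those two points. The scaling factors and die areas below are illustrative assumptions (logic really does shrink far better than analog/IO on modern nodes, but the exact ratios vary node to node):

```python
# Assumed, illustrative area-scaling factors when moving to a new node:
LOGIC_SCALING = 0.6   # logic shrinks well on the new node
IO_SCALING = 0.95     # analog/IO barely shrinks at all

def monolithic_new_node_area(logic_mm2, io_mm2):
    # Monolithic design: everything, including IO, sits on the
    # expensive new node - and the IO ports barely got smaller.
    return logic_mm2 * LOGIC_SCALING + io_mm2 * IO_SCALING

def chiplet_areas(logic_mm2, io_mm2):
    # Chiplet design: only logic moves to the new node; the IO die
    # stays on the cheap, already-ported older node.
    return logic_mm2 * LOGIC_SCALING, io_mm2

logic, io = 200.0, 150.0  # mm^2 on the old node (illustrative)
mono = monolithic_new_node_area(logic, io)
ccd, iod = chiplet_areas(logic, io)
print(f"monolithic: {mono:.1f} mm^2, all on the expensive new node")
print(f"chiplet:    {ccd:.1f} mm^2 new node + {iod:.1f} mm^2 cheap old node")
```

The chiplet split buys less area on the expensive node and skips the IO porting effort entirely - the two advantages named above.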