One of Intel’s more interesting initiatives over the past few years has been XPU: the idea of using a variety of compute architectures to best meet the execution needs of a single workload. In practice, this has led to Intel developing everything from CPUs and GPUs to more specialized hardware like FPGAs and VPUs. All of this hardware, in turn, is overseen at the software level by Intel’s oneAPI software stack, which is designed to abstract away many of the hardware differences to allow easier multi-architecture development.
Intel has always indicated that their XPU initiative was just a beginning, and as part of today’s annual investor meeting, Intel is finally disclosing the next step in the evolution of the XPU concept with a new project codenamed Falcon Shores. Aimed at the supercomputing/HPC market, Falcon Shores is a new processor architecture that will combine x86 CPU and Xe GPU hardware into a single Xeon socket chip. And when it’s released in 2024, Intel is expecting it to offer better than 5x the performance-per-watt and 5x the memory capacity of their current platforms.
At a very high level, Falcon Shores appears to be an HPC-grade APU/SoC/XPU for servers. While Intel is offering only the barest of details at this time, the company is being upfront that they’re combining x86 CPU and Xe GPU hardware into a single chip, with an eye on leveraging the synergy between the two. And, given the mention of advanced packaging technologies, it’s a safe bet that Intel has something more complex than a monolithic die planned, be it separate CPU/GPU tiles, HBM memory (e.g. Sapphire Rapids), or something else entirely.
Diving a bit deeper, while integrating discrete components often pays benefits over the long run, the nature of the announcement strongly indicates that there’s more to Intel’s plan here than just integrating a CPU and GPU into a single chip (something they already do today in consumer parts). Rather, the presentation from Raja Koduri, Intel’s SVP and GM of the Accelerated Computing Systems and Graphics (AXG) Group, makes it clear that Intel is looking to go after the market for HPC users with absolutely massive datasets, the kind that can’t easily fit into the relatively limited memory capacity of a discrete GPU.
A single chip, in comparison, would be much better prepared to work from large pools of DDR memory without having to (relatively) slowly shuffle data in and out of VRAM, which remains a drawback of discrete GPUs today. In those cases, even with high-speed interfaces like NVLink and AMD’s Infinity Fabric, the latency and bandwidth penalties of going between the CPU and GPU remain quite high compared to the speed at which HPC-class processors can actually manipulate data, so making that link as fast as physically possible can potentially offer performance and energy savings.
Meanwhile, Intel is also touting Falcon Shores as offering a flexible ratio between x86 and Xe cores. The devil is in the details here, but at a high level it sounds like the company is looking at offering multiple SKUs with different numbers of cores, likely enabled by varying the number of x86 and Xe tiles.
From a hardware perspective then, Intel seems to be planning to throw most of their next-generation technologies at Falcon Shores, which is fitting for its supercomputing target market. The chip is slated to be built on an “angstrom era process”, which given the 2024 date is likely Intel’s 20A process. And along with future x86/Xe cores, it will also incorporate what Intel is calling “extreme bandwidth shared memory”.
With all of that tech underpinning Falcon Shores, Intel is currently projecting a 5x increase over their current-generation products in several metrics. This includes a 5x increase in performance-per-watt, a 5x increase in compute density for a single (Xeon) socket, a 5x increase in memory capacity, and a 5x increase in memory bandwidth. In short, the company has high expectations for the performance of Falcon Shores, which is fitting given the highly competitive HPC market it’s slated for.
And perhaps most interestingly of all, to get that performance Intel isn’t just tackling things from the raw hardware throughput side. The Falcon Shores announcement also mentions that developers will have access to a “vastly simplified GPU programming model” for the chip, indicating that Intel isn’t just slapping some Xe cores into the chip and calling it a day. Just what this entails remains to be seen, but simplifying GPU programming remains a major goal in the GPU computing industry, especially for heterogeneous processors that combine CPU and GPU processing. Making it easier to program these high-throughput chips not only makes them more accessible to developers, but reducing or eliminating synchronization and data preparation requirements can also go a long way towards improving performance.
Like everything else being announced as part of today’s investor meeting, this announcement is more of a teaser for Intel. So expect to hear a lot more about Falcon Shores over the next couple of years as Intel continues their work to bring it to market.