Chiplets on mobile processors don't sound exactly enticing. Why does Intel feel the need to use 3 different types of chiplets from 2 foundries on a power-sensitive mobile chip?Reply
It's not the same chiplet strategy as AMD is using for Zen; it's more in line with AMD's MI accelerators. Not as cheap as the Zen strategy, but much more power efficient for power-sensitive mobile chips.Reply
Do you have proof it is more power efficient? Intel right now is vastly less power efficient than AMD, and that is regardless of whether the dies are chiplet or monolithic.Reply
First page says 0.15-0.3 pJ/bit for Foveros; AMD has previously stated Infinity Fabric uses "<2 pJ/bit", which presumably isn't very far under 2. This puts Intel at 5-10x the efficiency for this very specific part of data transmissionReply
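As a rough back-of-the-envelope check of what those per-bit figures imply (the 1 GB transfer size is just an illustrative assumption; only the pJ/bit numbers come from the comment above):

```python
# Energy to move 1 GB across the die-to-die link at the quoted pJ/bit figures.
bits = 1e9 * 8  # 1 GB in bits

for name, pj_per_bit in [("Foveros (0.15 pJ/bit)", 0.15),
                         ("Foveros (0.30 pJ/bit)", 0.30),
                         ("Infinity Fabric (~2 pJ/bit)", 2.0)]:
    joules = bits * pj_per_bit * 1e-12  # pJ -> J
    print(f"{name:28s}: {joules * 1000:.1f} mJ per GB moved")

# Taking ~2 pJ/bit over 0.15-0.3 pJ/bit gives a ratio of roughly 7-13x,
# in the same ballpark as the 5-10x figure cited above.
```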
The first generations of Infinity Fabric were also less power hungry, until they noticed what was needed to get things faster, and there you go: lots of bandwidth and Infinity Fabric changes, and the power usage rose with them... Let's see what Intel is capable of doing first; it's marketing all over the place...Reply
Intel is behind AMD on power efficiency purely due to process node. Architecturally they are quite competitive. This is evident in how AMD's performance scales when underclocked compared to Intel's.
Anandtech recently did an investigation into this and came to the conclusion that, while the architectures are vastly different, they have potentially similar performance once you determine the ideal wattage for the chip. At the moment Intel is pushing high TDPs into extremely inefficient territory to have something competitive with AMD, due to AMD being on a superior node. Basically, AMD can deliver X performance at 56w while Intel can deliver X performance at 72w, but Intel can deliver Y performance at 100w while AMD delivers Y performance at 95w.
While impossible to absolutely prove, various factors suggest this differential has more to do with manufacturing superiority than design superiority.
Intel going for tiles here is a clear attempt to close the gap on this.Reply
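To make that crossover argument concrete, here is the same example worked as performance-per-watt; the X and Y performance values are placeholders, only the wattages come from the comment above:

```python
# Hypothetical crossover: same performance X at 56 W (AMD) vs 72 W (Intel),
# same performance Y at 95 W (AMD) vs 100 W (Intel). Perf scores are arbitrary units.
X, Y = 100, 130  # placeholder scores

print(f"At X: AMD {X/56:.2f} pts/W vs Intel {X/72:.2f} pts/W (AMD ~{(72/56 - 1)*100:.0f}% more efficient)")
print(f"At Y: AMD {Y/95:.2f} pts/W vs Intel {Y/100:.2f} pts/W (AMD ~{(100/95 - 1)*100:.0f}% more efficient)")
# The efficiency lead shrinks from ~29% to ~5% as both parts are pushed up the
# voltage/frequency curve, consistent with a process/operating-point gap rather
# than an architectural one.
```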
I think I'd second-guess anything Anandtech does these days. They had one person with an advanced degree who departed years ago, bought out by Qualcomm back when Qualcomm owned Killer NICs, and they have since had trouble publishing articles without obvious typos and "in-a-hurry" oversights.
On the other hand, if that data is supported by a more credible publisher that has decent measuring equipment and can afford to purchase its own test hardware rather than relying solely on free samples - well then we should sit up and take notice. At this point though, an Anandtech exclusive is just a reason to raise the citation needed flag and THEN further analyze the sources for their motives.Reply
They’ve always had typos that slip through, or grammar mistakes, all the way back to Anand’s time. I don’t like it, but I know what was intended when I see one. There still isn’t another website with the same focus on the segments they cover. I miss the deep dives they used to do into mobile chips and phones, investigating what wasn’t publicly released, but for most people that’s a niche interest; the more consumer-focused review coverage is handled extensively by other sites and easily found elsewhere. Not really their target market.Reply
My personal guess, from the moment this was announced, has been that they want every single cm² of silicon going through their Intel 4/EUV capacity going to the compute tile. They're lagging quite a bit behind TSMC and Samsung in terms of EUV capacity, so anything that doesn't stand to benefit much from being designed from the ground up to be made on their own nodes is worth offloading to TSMC.
SoC and IO tiles are really not process-limited currently, and their Arc GPUs are in general manufactured elsewhere anyway. But their CPU design process has always been fully in house every step of the way, and they don't want to change that (at least not yet). So everything but compute would be "wasted" Intel 4 capacity.Reply
Totally agree. At the same time, instead of doing a full Intel 4 Meteor Lake chip, shrinking it down to the compute tile only also reduces the size of the silicon and improves yield. Later next year, Intel will also need EUV capacity for Sierra Forest and Granite Rapids. Those chips will be much larger than a mobile compute tile and will yield considerably worse... Intel will need every ounce of EUV capacity they have.Reply
Probably to get as much compute as they can out of the Intel 4 capacity that they have. Their substrate also takes much less power connecting the tiles than current AMD parts, and it allows the best node to be used for each part, i.e. if Intel's node wasn't as ideal for the GPU tile as for the CPU tile, etc. Reply
I have the same question. At the same time, I was curious about Intel's EUV capacity. Since Intel is a latecomer to EUV and over 50% of EUV machines are at TSMC, does Intel really have the capacity to manufacture a full-chip Intel 4 Meteor Lake? Not to mention the upcoming Sierra Forest and later Granite Rapids will all use EUV capacity. I think the reasonable way is indeed to use EUV only on the most critical part of Meteor Lake, the compute tile, and outsource the rest.Reply
To avoid the issues they had with the rollout of 14 nm (BDW) and then 10 nm (CNL), I guess, when they were held back by the yield of particular parts of the chip, specifically the GPU.Reply
A17 Pro just beat all Intel CPUs except the 13900KS in ST Geekbench6. A17 Pro uses less than 3w to achieve this - with typical load significantly below 3w. Meanwhile, 13900KS uses as much as 250w or more.
Intel's Meteor Lake needs to improve by 10x over Raptor Lake just to match what M3 will be able to do. Reply
The 13900ks uses 250 watts on a single core? Got a link for that?
I think you'll find that single-core workloads use far, far less. Also remember that benchmarks across ISAs are sketchy at best and outright made up at worst. I mean, just look at how bad games or software can be when ported from one ISA to another; it all really comes down to how well you've made the software run on each architecture.Reply
No, the A17 performance core is only clocked at 3.6/3.7GHz, compared to the x86 designs that are clocked up to 2GHz+ higher. So this is not some ESPN-like fanatic statement: since the A14/Firestorm core, Apple's instruction decoder has been at least 8 decoders wide, backed up by loads of execution ports. Apple's P cores have been a very wide-order superscalar design since the A14/Firestorm was released!
And the Apple P cores are high-IPC at low clocks, compared to the x86 designs that have 4/6 instruction decoders and so need the higher clocks to make up the IPC deficiency, since single-threaded performance is calculated as IPC multiplied by average sustained clocks.
The lower clocks are where Apple's power savings come from and how the longer battery life is obtained. That, and the A17 Pro/earlier A-series SOCs have loads of specialized heterogeneous compute to offload workloads onto instead of using the CPU cores or GPU cores, so more power can be saved there for all sorts of specialized workloads. The x86 processors/SOCs are just now getting the same sorts of specialized heterogeneous compute IP blocks, but that's relatively immature compared to Apple's SOCs and other ARM-based SOC ecosystems that have been using that specialized heterogeneous compute IP for years now. Reply
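A quick sketch of the relationship described above (single-thread performance ≈ IPC × sustained clock); the IPC values here are purely illustrative assumptions, not measured figures for any real core:

```python
# Single-thread throughput ~ IPC x sustained clock. The IPC numbers below are
# invented for illustration; only the clock-speed contrast mirrors the discussion above.
cores = {
    "wide decode, low clock":    {"ipc": 6.0, "ghz": 3.7},
    "narrow decode, high clock": {"ipc": 4.0, "ghz": 5.5},
}
for name, c in cores.items():
    print(f"{name:26s}: {c['ipc'] * c['ghz']:.1f} (arbitrary perf units)")
# 6.0 * 3.7 = 22.2 vs 4.0 * 5.5 = 22.0 -- similar single-thread results from very
# different IPC/clock trade-offs, which is the point being made above.
```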
Well, it would be interesting to see Intel or AMD make a fixed-width ISA design and how that then stacks up against the stuff of Apple. Really, x86 is at a disadvantage because of the variable-width instructions but still has done a fantastic job. Or, I'd like to see Apple design an x86 CPU and see how that holds up against Zen and the rest.Reply
There's no logical reason for Apple to go CISC, as the x86 instruction decoder requires many times the transistors of a RISC ISA instruction decoder! So it was easy to fit 8 instruction decoders on the front of the A14/Firestorm processor core (RISC ISA based). It's easier to go wider if one has relatively few instructions, all of a fixed length, to implement in an instruction decoder design. That makes it easy to produce a custom, very wide-order superscalar core design that targets high IPC at a lower clock rate, with the SOC's CPU cores clocked well inside their performance/watt sweet spot, and still have that A14 match or get close to the x86 cores in single-threaded performance, against x86 core designs that are clocked 2GHz+ higher.
The x86 ISA is too bloated with legacy instructions, and it's not going to be easy to refactor that without taking years in the process. The ARM ISA ecosystem is RISC from the ground up, and even though the x86 designers have a RISC-like back end to break those CISC instructions down into more RISC-like instructions, that hardware engine takes more transistors to implement and thus uses more power getting that done. The vast majority of ARM ISA instructions translate 1 to 1 into single, and some into a few, micro-ops; how hard is that to decode compared to x86 ISA instructions that mostly have multiple micro-ops generated to get all that complex work done? And there's a valid power-usage reason that x86 never made any inroads into the wider tablet/smartphone market.
The thing about ARM/RISC core designs is that they can scale from phones to server/HPC, whereas the CISC designs cannot scale down to as low power as RISC designs! Intel has done a good job at getting close there, but a little too late to matter to the OEMs that really did not want to remain beholden to Intel and x86. And the same can be said now for RISC-V compared to an ARM Holdings that's maybe leaning more towards an x86-like business model, where RISC-V represents total end-user ISA freedom, within reason, as the RISC-V ISA is totally open, with no royalty/encumberments required/enforced. Reply
I agree with most of what you're saying. What I was trying to get at is that there seems to be a belief that Apple has superior engineering ingenuity to Intel and AMD, when really, it is the difference between fixed- and variable-length instruction sets and all that entails. What I'd like to see is all of them on the same playing field and where each then stands, from a CPU point of view. Quite likely, there won't be much of a difference, because good design principles are always the same. It's trying to be out of the ordinary that leads to Pentium 4s and Bulldozers.Reply
The thing is, ARM is almost fully ready on the Windows side of the coin. Windows on ARM appears to be working well, x64 emulation is up and running, increasingly more programs are getting ARM compiles, and Microsoft's VS and compilers now have ARM on an equal footing with x64. So, if Intel or AMD decided to make an ARM CPU, people could go over quite easily, similar to the early days of x64.Reply
Edit: royalist/encumberments to royalty/encumberments!
And Firefox's Spell Checker is so bad that The Mozilla Foundation should be stripped of their Tax Exempt status until they fully comply and fix that. Reply
Intel has proposed the X86-S ISA, to get rid of all the legacy code and boot directly into 64-bit (the proposal is available on their website). But I don't know if this is enough to allow them to build wider decoders to improve single-thread performance.Reply
I took a look at x86-S and it certainly would be welcome, getting rid of unnecessary legacy features. From my understanding, I don't think it would help to build wider decoders. The problem in x86 is that the length of each instruction varies and is not known beforehand. At execution time, length has got to be worked out in predecode, and I imagine this constrains how much can be sent through the decoders, as well as taking up a great deal of power. In a fixed-width ISA, it is trivial to know where each instruction starts and send them off to the decoders en masse. A bit like comparing a linked list with an array.Reply
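The linked-list vs array analogy can be made concrete with a toy sketch (illustrative only; real x86 predecoders use boundary markers and other tricks to parallelize this):

```python
# Toy illustration: finding instruction start addresses in a fixed-width ISA is like
# indexing an array, while a variable-width ISA forces a serial walk, like a linked list.

def fixed_width_starts(code: bytes, width: int = 4) -> list:
    # Every start is known immediately: i * width.
    return list(range(0, len(code), width))

def variable_width_starts(code: bytes, length_of) -> list:
    # Each start depends on having decoded the length of every previous instruction.
    starts, pc = [], 0
    while pc < len(code):
        starts.append(pc)
        pc += length_of(code, pc)
    return starts

# Made-up length rule (1-4 bytes) standing in for x86's 1-15 byte instructions.
fake_length = lambda code, pc: 1 + (code[pc] % 4)
blob = bytes(range(32))
print(fixed_width_starts(blob))
print(variable_width_starts(blob, fake_length))
```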
He may overstate the power, but don't diss his remark by only focusing on that error, as the mobile processor is running at much lower frequencies.Reply
It does seem that the N3B process isn’t yielding great efficiency improvements. Any chance Apple will be paying a visit to Intel Foundries in the near future? A19 or A20 on Intel 20A?Reply
That's weird because the lower power A17 draws up to 14W on app launch, and those numbers are not remotely true for M1. It's impressive compared to high power x86 CPUs, but those numbers just aren't right. Reply
You need to rethink. This is a stupid comparison and a hilarious conclusion. First of all, Geekbench cannot fully reflect x86 performance; you should compare R23 on the M2 Max and the 13900H, where the full-load efficiency is actually similar. Second, you exaggerate the efficiency gap by comparing a mobile phone SoC designed for low power with a desktop chip; Intel has low-power 15W chips like the i7-1335U with great single-thread performance as well. At full load there is an efficiency gap, but it is not big; at light load, Apple leads by probably 1.5x.Reply
Nice trolling lemur! You landed like an entire page of nerd rage this time. You're a credit to your profession and if I could give you an award for whipping dead website readers into a frenzy (including regulars who have seen you do this for years now) I would. Congrats! 10/10 would enjoy again.Reply
Intel does not need to do anything about its architecture to match or surpass the M3. It just needs to build its CPUs on a similar node. Which is not happening anytime soon, thus perpetuating the illusion of efficiency of Apple CPUs.
Two things more. First, it is hilarious to compare the prowess of Intel at designing CPUs to that of Apple. Apple spent a long time "building" machines like a glorified Dell, borrowing CPUs from IBM or Intel, and only recently understood the scale and effort needed to design your own silicon by improving on ARM designs.
Secondly, it is misguided to say that a CPU that needs 10 times more wattage on the same node to achieve 2 or 3 times the performance is less efficient. This is not how physics works. If Intel built their CPUs on TSMC's N3 they would be 2 or 3 times faster, best-case scenario. Wattage does not scale linearly with performance. This is the same as saying that a car with 10 times the power would be 10 times faster. Lololol.
Apple has designed good CPUs recently, but all the hype about their efficiency is just hype. Even if we assume the design is totally coming from Apple, which it does not, being a very good modification at best, it does not even build its own nodes. By and large its efficiency is TSMC efficiency. If it were not for TSMC, Apple would be nonexistent on the performance charts.Reply
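A rough illustration of the non-linear point above, using the common approximation that dynamic power scales with frequency times voltage squared and that voltage has to rise as frequency is pushed; the exponents here are simplifying assumptions, not measured silicon behaviour:

```python
# Crude model: dynamic power ~ C * V^2 * f, with V assumed to rise roughly as sqrt(f)
# near the top of the curve (a simplification chosen only for illustration).
def relative_power(f_rel: float, v_exponent: float = 0.5) -> float:
    v_rel = f_rel ** v_exponent
    return (v_rel ** 2) * f_rel  # P ~ V^2 * f

for f in (1.0, 1.2, 1.5, 2.0):
    print(f"{f:.1f}x clock -> ~{relative_power(f):.2f}x power")
# In this toy model ~2x the clock already costs ~4x the power, so a chip burning
# 10x the watts is nowhere near 10x faster, which is the argument above.
```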
TLDR:
- Intel 4 < TSMC N6
- To not be late, Intel 3 must arrive within 3 months, which is highly doubtful, since Intel 4 isn't even shipping yet
- I assume Intel 3 < TSMC N6, otherwise why bother enriching the competition?
- Parts of the new tech stack look promising, but Intel refrains from any real performance claims, or any comparison with offerings from AMD or Apple.
- Did Intel announce another architecture for desktop computers, probably more similar to that of AMD, e.g. perhaps many performance tiles plus one cache tile?Reply
Maybe. Or maybe TSMC6 is cheaper, and Intel doesn't need the power savings or area savings of I4 over TSMC6 for what the non-compute tiles need to accomplish. It's not exactly uncommon to see the SoC / IO tile on a lower node, doesn't AMD do the same thing?Reply
Intel 4 and 3 are basically the same, with the same device density, as Intel 3 is an enhanced Intel 4. I assume it has a slightly higher density than TSMC 5nm and slightly better performance. Let's see.Reply
Intel should've either implemented TB5 in Meteor Lake or waited until after Meteor Lake shipped to announce TB5. Because as cool and impressive as Meteor Lake seems, for some of us it's already obsolete: it makes no sense to buy a TB4 laptop/PC rather than wait for TB5 silicon to hit the market.Reply
Why use TB4 or USB4/40Gbs and have to deal with the extra latency and bandwidth-robbing overhead, compared to PCI-SIG's OCuLink, which is just pure PCIe signalling delivered over an external OCuLink cable? OCuLink and PCIe require no extra protocol encapsulation and encoding/decoding steps at the PCIe link stage, so that's lower latency compared to TB4/USB4 and later generations, which have to do extra encoding/decoding of any PCIe protocol packets to send them out over TB4/USB4. And for external GPUs, 4 lanes of PCIe 4.0 connectivity can provide up to 64Gbs of bandwidth over an OCuLink port/cable, and OCuLink ports can be 8 PCIe lanes or wider.
One can obtain an M.2/NVMe slot to OCuLink adapter and get an external OCuLink connection of up to 64Gbs, as long as the M.2 slot is 4 PCIe 4.0 lanes wide, and no specialized controller chip is required on the MB to drive that. And GPD, on their handhelds, offers a dedicated OCuLink port and an external portable eGPU that supports OCuLink or USB4/40Gbs-TB interfacing. TB5 and USB4-V2 will take years to be adopted, whereas OCuLink is just PCIe 3.0/4.0 delivered over an external cable. Reply
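A quick check of where the 64Gbs figure comes from, using the standard PCIe per-lane rates and 128b/130b encoding (no OCuLink-specific overhead is assumed here):

```python
# Usable line rate of a PCIe link: per-lane GT/s * lanes * 128b/130b encoding efficiency.
def pcie_link_gbps(gts_per_lane: float, lanes: int) -> float:
    return gts_per_lane * lanes * (128 / 130)

print(f"PCIe 3.0 x4: ~{pcie_link_gbps(8, 4):.1f} Gb/s")    # ~31.5 Gb/s (Steam Deck class M.2)
print(f"PCIe 4.0 x4: ~{pcie_link_gbps(16, 4):.1f} Gb/s")   # ~63 Gb/s, the '64Gbs-class' link
print(f"PCIe 4.0 x8: ~{pcie_link_gbps(16, 8):.1f} Gb/s")   # a wider OCuLink cable
# Compare the 40 Gb/s total of a TB4/USB4 link, of which only part is available
# to the PCIe tunnel.
```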
Unlike Thunderbolt, OCuLink doesn't have hotplugging, meaning your device must be connected at cold boot. Not so good for external storage needs.Reply
I'm more focused on the eGPU usage for OCuLink, so I'm not stating that TB4/USB4 connectivity does not have its usage model for your use case. But pure PCIe is the lowest latency for eGPU usage and can be easily adopted by more OEMs than just GPD for their handhelds, as OCuLink will work with any maker's GPU as long as one is using an OCuLink-capable eGPU adapter or enclosure.
And ETA Prime has extensively tested OCuLink adapters with plenty of mini desktop PCs and even the Steam Deck (its M.2 slot is only PCIe 3.0 capable). It's the 64Gbs on any PCIe 4.0/x4 connection (M.2/NVMe or other) that's good for eGPUs via OCuLink, relative to the current 40Gbs bandwidth of TB4/USB4. Reply
I’ve seen those videos and the performance advantages for eGPUs. But most of the eGPUs on the market use Alpine Ridge, a chipset known to reserve bandwidth for DP and have less available for PCIe (22 Gbps). Perhaps there may be one or two based on Titan Ridge with slightly more PCIe bandwidth. It’s hard to say how Barlow Ridge will perform in terms of the amount of PCIe bandwidth made available to peripherals. But a 64 Gbps PCIe connection will not saturate the 80 Gbps link, so hopefully we can have most of the available 64 Gbps of PCIe bandwidth. Another problem with OCuLink is that there’s no power delivery, so you need a separate wire for power.
So Barlow Ridge TB5 has the potential to be a one-cable solution: power up to 240W, PCIe up to 64 Gbps, and it will also tunnel DisplayPort. OCuLink is cool, but Thunderbolt tunnels more capabilities over the wire. Reply
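A rough budget of the Barlow Ridge figures mentioned above (the numbers are the ones quoted in this thread, not measured results):

```python
# TB5 figures cited above: 80 Gb/s link per direction, PCIe tunnelling up to 64 Gb/s.
link_gbps, pcie_tunnel_gbps = 80, 64
print(f"Headroom left beside a saturated PCIe tunnel: ~{link_gbps - pcie_tunnel_gbps} Gb/s for DP/USB")
# Compare TB4: a 40 Gb/s link, with anywhere from ~22 Gb/s (Alpine Ridge, per the
# comment above) to ~32 Gb/s typically usable by the PCIe tunnel, which is why a
# 64 Gb/s-class eGPU link is out of reach for TB4 regardless of the GPU attached.
```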
OCuLink is lower latency, as was stated in the earlier posts! TB4/TB# or USB4/USB# will not be able to beat pure PCIe connectivity for low latency, and latency is the bigger factor for gaming workloads. TB tunneling-protocol encapsulation of PCIe (or any other protocol) will add latency as a result of having to do the extra encoding/encapsulation and decoding/de-encapsulation steps there and back, whereas OCuLink is just unadulterated PCIe passed over an external cable.
More device makers need to be adding OCuLink capability to their systems, as that's simple to do and requires no TB#/USB4-V# controller chip to be hung off of MB PCIe lanes; the OCuLink port is just passing PCIe signals outside of the device. And TB5/USB4-V2 is more than 64Gbs, but that will require more PCIe lanes attached to the respective TB5/USB4-V2 controller and use more overhead to do that, whereas if one has the same number of PCIe lanes connected via OCuLink then that's always going to be lower overhead, with more available/usable bandwidth and lower latency for OCuLink.
Most likely the PCIe lane counts will remain at 4 lanes max, and that will just go from PCIe 4.0 to PCIe 5.0 instead to support TB5 and USB4-V2 bandwidth, but whatever PCIe standard is utilized, OCuLink will always have lower overhead and lower latency than TB/whatever or USB4/whatever, as OCuLink skips the extra tunneling protocol steps required.
Plus, by extension, with any OCuLink ports being pure PCIe protocol based, that opens up the possibility of OCuLink to TB/USB/whatever adapters being utilized for maximum flexibility for other use cases as well. Reply
OCuLink has merit for sure, but again, it is clunky. Unlike Thunderbolt, it doesn't tunnel DisplayPort or provide power delivery. It also doesn't support hotplugging. That is why it will most likely remain a niche offering. Also, you're saying OCuLink is lower latency, but by how much? Where is the test data to prove that?
And does it really matter? Operating systems can be run directly off of Thunderbolt NVMe storage; the latency is low enough for a smooth experience. And even if OCuLink is technically faster, a GPU such as a 4080 or 4090 or 7900XTX in a PCIe4x4 or even PCIe5x4 eGPU Thunderbolt 5 enclosure will be much faster than the iGPU or even internal graphics. And if the eGPU enclosure is Thunderbolt enabled, it can power the laptop or host device and probably act as a dock, providing additional downstream Thunderbolt ports and possibly USB as well. Thunderbolt provides flexibility that OCuLink does not. Both standards have merit.
But I have a feeling Thunderbolt 5, if implemented properly in terms of bug-free firmware NVMs from Intel, will gain mass-market appeal. The mass market is hungry for the additional bandwidth. ASMedia will probably do extremely well too with its USB4 and upcoming USB4v2 offerings. Reply
Don't waste your time, Trampoline is an OCuLink shill who will ignore any criticism of his beloved zuckertech. The idea that most people don't want to disassemble a laptop to use a dock is totally alien to him. Reply
LOL, OCuLink's creator PCI-SIG is a not-for-profit standards organization that's responsible for the PCIe standards, so it's not like they are a business interest with a fiduciary responsibility to any investors.
OCuLink is just a port/electrical PCIe extension cabling standard that was in fact originally intended to be used in consumer products, but Intel, a member of PCI-SIG along with other industry members, had a vested interest in the Intel/Apple co-developed Thunderbolt IP, because of TB controllers and the sales interests related to TB controllers.
And TB4/Later and USB4/Later will never have as low latency owing to the fact that any PCIe signalling will have to be intercepted and encapsulated by the TB/USB/Whatever protocol controller in order to be sent down the TB cabling whereas over the OCuLink ports/cabling that's just the PCIe signalling/packets there and no extra delays there related to any extra tunneling protocol encoding/encapsulation and decoding/de-encapsulating steps required.
So OCuLink represents the maximum flexibility, as that's the better, lowest-latency solution for eGPUs, being just pure unadulterated PCIe signaling. And because it's just PCIe, that opens up the possibility of all sorts of external adapters that take in PCIe and can convert that to DisplayPort/HDMI/USB/TB/whatever the end users need, because all motherboard external I/O, for the most part, is in the form of PCIe, and OCuLink just brings that PCIe directly out of devices via ports/external cables.
And to be so dogmatically opposed to OCuLink is the same as being opposed to PCIe! And does any rational person think that that's logical? OCuLink is external PCIe and that's all there is to that, and it's the lowest-latency method to interface with GPUs, via any PCIe slot or externally via an OCuLink connection (PCIe is PCIe).
Give me a Laptop with at least One OCuLink PCIe X4/4.0 port and with that I can interface to an eGPU at 64Gbs bandwidth/lowest latency possible! And there can and will be adapters that can be plugged into that One OCulink port that can do what any other ports on the laptop can do because those ports are all just connected to some MB PCIe lanes in the first place. Reply
The main advantage of TB4 is that the form factor is USB-C, which can be configured for various other IO. This is highly desirable in a portable form factor like laptops or tablets. Performance is 'good enough' for external GPU usage. OCuLink may be faster but doesn't have the flexibility that TB4 over the USB-C connector does. OCuLink has its niche, but a mainstream consumer IO solution is not one of them.Reply
OCuLink is just externally routed PCIe lanes, and really there can be one OCuLink port on every laptop specifically for the best and lowest-latency eGPU interfacing, and even OCuLink to HDMI/DisplayPort/whatever adapters that can make the OCuLink port into any other port at the end user's discretion. So for eGPUs/enclosures that have OCuLink ports, that's 64Gbs and the lowest latency, and for any legacy TB4/USB-only external eGPU devices, just get an OCuLink to TB4/USB4 adapter in the interim and live with the lower bandwidth and higher latency.
GPD already has a line of handheld gaming devices that utilize a dedicated OCuLink port, and a portable eGPU that supports both OCuLink interfacing and TB4/USB4 interfacing. And I do hope that GPD branches out into the regular laptop market, as GPD's external portable eGPU works with other makers' products, and even products that have an M.2/NVMe-capable slot available, via an M.2/NVMe to OCuLink adapter! LOL, only vested interests would object to OCuLink in the consumer market space, specifically those vested interests with business models that do not like any competition. Reply
No one is forcing you to do that, and for others that's an option, albeit an inconvenient one. But really the adapters are not meant for laptops in the first place, and even for mini desktop PCs it's not an easy task, but still more manageable than doing that with a laptop. It would just be better if more mini desktop PC/laptop OEMs would adopt an OCuLink PCIe 4.0/x4 port for eGPU usage, like GPD has done with their line of handheld gaming devices. And with mass adoption of OCuLink there could also be adapters to support all the other standards, as OCuLink, being PCIe based, will by extension support that as well. Reply
Source or market research, please? I have the feeling that many enthusiasts will not be interested. Because of missing TB5. And also because of its IPC improvements (or lack thereof) vs Raptor Lake.
Meteor Lake certainly is impressive. But it seems to be less about raw performance and more about the process improvement. Foveros. Chiplets. EUV. New manufacturing abilities. AI engine. Power efficiency. Newish GPU.
But from a generational uplift perspective, from raw CPU performance to the Thunderbolt IO, it's not much of an upgrade for enthusiasts. Intel should've just launched MTL in December and then announced TB5 in January. What was the reason to announce TB5 before the MTL reveal?
I guess we will have to wait on Arrow Lake mobile (if that's a thing) or Lunar Lake for TB5 on laptops. Reply
You need market research to tell you TB4 bandwidth is sufficient for the majority of users? 40Gb/s can easily drive gigabit internet and multiple monitors. Most jobs do not require more. At the Fortune 500 I manage IT for, we still haven't even switched to Thunderbolt, as 3.1 docks are more than sufficient.
There's market research on TB4 purchase trends, which I'm not going to pay for, so we'll just have to settle on "Intel's market research determined that delaying their next-gen product line for this 1 feature, potentially causing delays across OEMs' 2024 product lines in the process, was not worth it"Reply
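For a sense of scale on the "40Gb/s is enough" point, here is an approximate bandwidth budget for a typical docked setup (display figures are uncompressed active-pixel data rates; blanking and DSC are ignored, so these are assumptions for illustration):

```python
# Approximate bandwidth budget for a common TB4/USB4 dock workload.
def display_gbps(w: int, h: int, hz: int, bpp: int = 24) -> float:
    return w * h * hz * bpp / 1e9  # active pixel data only

budget = 40.0  # Gb/s TB4 link
use = {
    "2x 4K60 monitors": 2 * display_gbps(3840, 2160, 60),
    "Gigabit internet": 1.0,
    "10 Gb/s USB SSD":  10.0,
}
for item, gbps in use.items():
    print(f"{item:18s} ~{gbps:5.1f} Gb/s")
print(f"Total              ~{sum(use.values()):5.1f} of {budget} Gb/s")
```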
While that segment might be outspoken, the percentage of the overall market is tiny, and the percentage that cares among that fraction is even smaller. Basement-dweller computer nerds and the e-sports people they idolize don't buy the hundreds of thousands of units that a computer manufacturer purchases. Sure, they get a minor head nod from the company to keep them from slobbering and raving about being ignored, but that's done because it's cheap to coddle them with marketing speak and make them believe features are targeted at them, so their ego balloons aren't popped, and sites like this have a bone or two to throw them once in a while. But ultimately, no one cares what they want as long as they fanboy-argue in favor of their preferred brand with other nerds that like the competition.Reply
Exactly. TB5 is exciting and Meteor Lake is mostly DoA without it. Who would invest thousands into a machine that can't make use of newer functionality? Reply
Was this just written by having an AI interpret the slides? And then OCR failed? "This means that higher Out-of-Service (OoS) work is allocated to P-cores for more demanding and intensive workloads, while lower Quality-of-Service (QoS) workloads are directed to E-cores, primarily to save power"Reply
Thank you for the explanation. The problem is, I caught at least three more mistakes like this, where a wrong assumption is made about what the text on a slide actually means. In which case (knowing that I'm not an expert), how can I be certain that there aren't many more mistakes that I haven't spotted? We do come to Anandtech for in-depth analysis, which requires that trust.Reply
The blunt answer is that we're imperfect (to err is human). We've made mistakes in the past and will continue to do so in the future. But we always own up to those mistakes, and will correct anything if we catch it (or if it gets pointed out).Reply
Wow! Intel have some revolutionary ideas here!! Their chiplet approach will change the industry. That's what I'd have said if they'd presented this 6 years ago. My response today is...meh.Reply
Oh no, that's bad news, as Apple appears to have gone even wider with the A17 P core's decode resources than even the A14/Firestorm, if the Apple promotional material is correct!
Maybe Chipsandcheese will look at the A17's P core design, with some micro-benchmarks as well. Reply
Yeah, that's too bad. It looks like the E-cores got a bigger bump than the P-cores, but they didn't really advertise it, given how strangely they mentioned it. Reply
The slide from Intel on its Crestmont E-core design (block diagram) does not look that much different from Gracemont's block diagram, and the Redwood Cove (block diagram) core design still appears to be a 6-wide instruction decoder design, so similar to Golden Cove. But there needs to be more info concerning micro-op issue rates and other parts of Redwood Cove's core design. Reply
Intel confirmed to me Redwood Cove would have IPC gains over Raptor Cove. When I get back (been sat at a PC a lot the last few days), I'll grab it for you. Reply
There's so many changes in MTL, it would make sense to just save a new P core uArch for next gen. Especially when clockspeed/watt is going up a decent amount, so it's not like perf/watt is stagnating. Reply
I think they've been following the old tick-tock system, Sunny and Golden Cove being the tocks, and Willow and Raptor the ticks. So, it's possible that Redwood would bring some proper changes.Reply
Nice! I might finally get a desktop CPU without having to pay for an expensive built-in GPU that I don't want. (If you think $25 off for an F model is the same thing, you're delusional.)
On an unrelated note, I'm curious which of these tiles represent a minimum viable system. Are the LP E-cores on the low-voltage island of the SoC die sufficient? Can we get by without the CPU or GPU dies? That might make a really nice media player, as it would have all the display-driving and video-decoding hardware and a couple of LP E-cores to manage housekeeping and maybe draw a GUI if necessary.
What about a simple headless system: can just the SoC die be enough? In either of these cases you'd need the I/O die (maybe even a harvested one where some parts don't work, but those parts serve the dies that aren't used.....)Reply
Intel has ruined the small form factor DIY market that needs socket-packaged processors and not BGA-packaged processors/SOCs. So no chance to build an ASRock Desk Mini that's STX MB form factor based and supports socket-packaged Intel and AMD SOCs/APUs with powerful iGPUs.
And really AMD has intentionally delayed any Ryzen 7000G (socket packaged) desktop APU release in favor of BGA-only OEM SKUs in Minisforum and Beelink mini desktop PC systems, where there are now Ryzen 7040/BGA-packaged processor based systems allotted 70w cTDPs, and so 5 more watts than the Ryzen 5700G (65W) desktop APU, which was the last generation usable for the ASRock X300 Desk Mini line.
And there's the InWin Chopin, a DIY-friendly very small form factor build that takes a Mini-ITX MB but lacks the room for any dGPU to be slotted in, as the Chopin's form factor is just too small, and AMD's Ryzen 5000G APUs were a popular choice for DIY-friendly small form factor Chopin system builds. And AMD's desktop Ryzen 7000 series offers RDNA2/2CU integrated graphics, but that's not APU class or marketed by AMD as APU class.
I had hoped that Intel would have at least released a 65W socket-packaged Meteor Lake SKU so folks could possibly have some ASRock Desk Mini DIY-friendly option on a socket-based STX MB form factor. And I was even more hoping that some Meteor Lake S (65W-80W) socket-packaged variant would force AMD's hand there and make them release some Ryzen 7000G socket-packaged desktop APU for the DIY market! But now, sans any Intel competition in that product segment, AMD may just not release any Ryzen 7000G for a good long while, and DIY small form factor will go neglected in favor of BGA-only and OEM-only as well. Reply
More intimidation here, and doxxing is doxxing! You are using intimidation tactics that should get a moderation response before any legal response is required! Reply
Look at you here, trying to intimidate and adding nothing to any discourse! This is not the kind of posting that should be allowed at Anandtech! Reply
That would be more of a DIY-friendly very small form factor enthusiast/end user there! And with a reasonable expectation that the vibrant DIY small form factor devices (mini desktop PCs) market continue to be offered socket-packaged processors newer than the Ryzen 5000G/Zen-3 with its Vega 8CU iGPU-based graphics IP, and ditto for any Intel based options as well.
So it's wrong to expect any further Ryzen G series desktop [socket packaged] APUs from AMD, because that's not good for the OEMs there and their business models, which are not so DIY friendly for processor upgrades if the processor comes BGA-wedded to the motherboard! And OEM products are not so good for eWaste reduction, because if the processor goes it can not be easily replaced/upgraded by the end user (DIY sorts of folks).
There needs to be a Right to Processor Upgrade just as much as a Right to Repair and with Socket packaged processors those rights go hand in hand there along with any environmental eWaste concerns. But we must not trample upon those Business Models as that's just not good for OEM Profits there, consumers be damned!
And for the InWin Chopin or ASRock Desk Mini, socket-packaged APUs/SOCs are the best option, as that's by definition DIY friendly.
I'll expect no Complaints from you if the entire PC market goes BGA Packaged Processors only and you'll have to buy the Processor Attached to the Motherboard, take it or leave it! Reply
Seems like a lot of silicon for what's essentially the job of a dirt cheap ARM SoC. And it's a questionable fit for a headless system unless it's something like a Stable Diffusion/transcoding host.
It *does* seem like an interesting fit for a smart TV chip, maybe with a small GPU die, as they would actually use the NPU for their internal video filtering.Reply
Intel 4 will not be shipping any products to customers until mid-December. This after stating it was in production in December 2022. 12 months from production start to PCs out is not good. And I had better be able to buy a Meteor Lake notebook on Dec 14th 2023, or this is exactly like old Intel ("launch" means we may have sold some parts to someone somewhere). Claiming a node is done when it's production ready, when you ship nothing, is problematic. FYI, Meteor Lake is 2x the cost of Raptor Lake in 2024. Intel 4 is not a cost reduction. The product might be great, but it is expensive.Reply
Intel 4 was in production in December 2022? No way! It should have started not long ago.
Usually the first real product silicon would be taped out one year ahead of the release date. And that silicon would be very buggy and need several steppings to have the bugs fixed.Reply
So Xe-LPG is still Intel UHD graphics (13th/14th gen now?) with a top EU count of 128, up from 96. Fine.
Maybe about 3TF FP32, not quite Xbox Series S level.
They added RT support, which is good I guess, but will it ever be used in a GPU that is really PS4-level performance? Or maybe there are non-gaming applications.
The product sounds good; just wait for the numbering schemeReply
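A quick sanity check of the "about 3TF" estimate for a 128 EU Xe part, assuming 8 FP32 lanes per EU and 2 ops per FMA; the clock values are assumptions, since final GPU clocks weren't quoted:

```python
# FP32 throughput estimate: EUs * FP32 lanes per EU * 2 (FMA) * clock.
eus, lanes, fma_ops = 128, 8, 2
for clock_ghz in (1.3, 1.5, 2.0):  # assumed clocks, for illustration only
    tflops = eus * lanes * fma_ops * clock_ghz * 1e9 / 1e12
    print(f"{clock_ghz:.1f} GHz -> ~{tflops:.1f} TFLOPS FP32")
# Around 1.5 GHz this lands right at the ~3 TF figure mentioned above.
```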
"An example of how applications pool together the various tiles include those through WinML, which has been part of Microsoft's operating systems since Windows 10, typically runs workloads with the MLAS library through the CPU, while those going through DirectML are utilized by both the CPU and GPU."
This sentence is really a mess. Editor: please take note. Is "example" the subject of "include"? That would make "includes" the necessary form of the verb. What is the subject of "runs"? I'm guessing WinML. Maybe it should be "WinML...which typically runs" but the long parenthetical expression about Windows 10 support makes it hard to bridge the gap. Maybe parentheses would be more clear instead of commas to keep the meaning on track. I'm still not sure what was meant.Reply
This is what I was hoping to see Intel pull off in the late 14 nm/early 10 nm days when their foundries were having difficulties. Intel should have pivoted in this direction at the first sign of trouble, as the packaging side of this, while cutting edge back then, could have been pulled off. Better late than never.
However, with Meteor Lake around the corner, it is shaping up to be a pretty good design. Both the CPU and GPU sides can scale and evolve independently from the central SoC. The GPU portion that was moved onto the SoC makes sense, as the codecs and display logic are not going to change over the next few generations. I would quibble with the point made that putting them next to the NPU is more advantageous than next to the GPU cores. There certainly is a benefit for AI upscaling of movies, but my presumption is that it'd be lower power/lower latency to have the encoders next to the GPU cache which houses the final rendered frame for encoding and transmission. The tasks that benefit here would be game streaming or remote access. Both things can be true, hence why it is a quibble, as it'll matter to individual use cases which approach is superior.
My initial presumption for the IO die was that it was to house various analog circuits that would then be leveraged by the SoC die. This is a clever means of process optimization as analog circuitry does not scale at the same rate as logic. Similarly this would permit a cheaper die to extend the number of area intensive IO pads.
The last thing missing is the L4 cache die that was hinted at in earlier Linux patches. That'll probably come along with the Lunar Lake generation.Reply
The Adamantine cache is supposedly in the base tile that all other tiles are connected to. It wasn't mentioned here; maybe if it does exist on MTL we will see information about it when the CPUs officially come out.Reply
So now we will have 3 types of cores for the OS to schedule ... I hope Intel is working with OS vendors to properly implement this or it will be a nightmare ... we already saw the 12/13gen issues on Windows 10 and 11 with wrong E to P scheduling ....Reply
Pat Gelsinger does a very good job, and I'm speaking concretely because I use series 12 and 13 in the manufacture of graphics stations, ultra PCs, and standard PCs, and they all work PERFECTLY with maximum benchmark results and infinite Tau (K series only)... I'm looking forward to series 14, especially for its benchmark results...Reply
This is indeed similar to the MI250's 2.5D fabric. However, MI300X is a full 3D fabric.Reply
He is an Apple fanboy. Source: trust me bro!Reply
And yes, I'd like to see RISC-V winning in the end, rather than ARM.Reply
It sounds like you carried forward 3W from 2008. The A17 Pro draws more power than ever: https://www.youtube.com/watch?v=TX_RQpMUNx0 Reply
He is nothing but a liar.Reply
Please explain why. I have detected at least some "bias", but why do you say that he is a liar?Reply
I'd say there's a real chance, as they expect a density crossover in 2025.Reply
My M1 draws 0.2w - 5w during Geekbench ST. A17 Pro is going to draw considerably less.Reply
Good to see the RDF is still alive and strong. Guess we are getting closer to Halloween.Reply
It's true. Anyone with an M1 can prove it. Run GB5 and open powermetrics in the terminal. It's easy to verify.Reply
Have multiple sources verified that peak power draw?Reply
Off-topic. Just go away please.Reply
You need to rethink.This is stupid comparion and hilarious conclusion.First of all, Geekbench cannot fully reflect x86 performance, you should compare R23 with m2 max and 13900H, the full load efficiency is actually similar.
Second, you exaggerate the efficiency gap by comparing a low powered designed mobile phone soc with desktop chip, intel has low power 15W chips like i7 1335u has great single thread performance as well.
At full load, there is efficiency gap but the gap is not big at full load but at light load, apple leads probably 1.5x. Reply
PeachNCream - Thursday, September 21, 2023 - link
Nice trolling, lemur! You landed like an entire page of nerd rage this time. You're a credit to your profession, and if I could give you an award for whipping dead-website readers into a frenzy (including regulars who have seen you do this for years now), I would. Congrats! 10/10, would enjoy again. Reply
IUU - Thursday, September 28, 2023 - link
Intel does not need to do anything about its architecture to match or surpass the M3. It just needs to build its CPUs on a similar node. Which is not happening anytime soon, thus perpetuating the illusion of efficiency of Apple's CPUs. Two things more. First, it is hilarious to compare Intel's prowess at designing CPUs to Apple's. Apple spent a long time "building" machines like a glorified Dell, borrowing CPUs from IBM or Intel, and only recently understood the scale and effort needed to design your own silicon by improving on ARM designs.
Secondly, it is misguided to say that a CPU which needs 10 times more wattage on the same node to achieve 2 or 3 times the performance is less efficient. This is not how physics works. If Intel built their CPUs on TSMC's N3, they would be 2 or 3 times faster, best-case scenario. Wattage does not scale linearly with performance. It is the same as saying that a car with 10 times the power would be 10 times faster. Lololol.
Apple has designed good CPUs recently, but all the hype about its efficiency is just hype. Even if we assume the design comes entirely from Apple, which it does not, being a very good modification at best, Apple does not even build its nodes. By and large its efficiency is TSMC's efficiency. If it were not for TSMC, Apple would be nonexistent on the performance charts. Reply
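A small illustrative calculation of the non-linear power/performance point above, using the common approximation that dynamic power scales roughly with frequency times voltage squared; the voltage/frequency points below are made up for illustration, not measured values:

```python
# Toy model: dynamic power ~ C * V^2 * f, and voltage must rise roughly
# with frequency near the top of the curve. Numbers are illustrative only.
def dynamic_power(freq_ghz: float, volts: float, c: float = 10.0) -> float:
    return c * volts**2 * freq_ghz   # arbitrary units

operating_points = [
    # (frequency GHz, core voltage) - invented, but shaped like a real V/f curve
    (3.0, 0.80),
    (4.0, 0.95),
    (5.0, 1.20),
    (5.5, 1.40),
]

base_f, base_v = operating_points[0]
base_p = dynamic_power(base_f, base_v)
for f, v in operating_points:
    p = dynamic_power(f, v)
    # Frequency ratio is an upper bound on the performance ratio.
    print(f"{f:.1f} GHz: {f/base_f:.2f}x perf (at best) for {p/base_p:.2f}x power")
```

At the top of this made-up curve, roughly 1.8x the frequency costs over 5x the power, which is why equal-performance comparisons at very different wattages say little about architectural efficiency.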
Silma - Tuesday, September 19, 2023 - link
TLDR:
- Intel 4 < TSMC N6
- To not be late, Intel 3 must arrive within 3 months, which is highly doubtful, since Intel 4 isn't even shipping yet
- I assume Intel 3 < TSMC N6, otherwise, why bother enriching the competition?
- Parts of the new tech stack look promising, but Intel refrains from any real performance claims, or any comparison with offerings from AMD or Apple.
- Did Intel announce another architecture for desktop computers, probably more similar to that of AMD, e.g. perhaps many performance tiles plus one cache tile? Reply
Drumsticks - Tuesday, September 19, 2023 - link
Maybe. Or maybe TSMC6 is cheaper, and Intel doesn't need the power savings or area savings of I4 over TSMC6 for what the non-compute tiles need to accomplish. It's not exactly uncommon to see the SoC / IO tile on a lower node; doesn't AMD do the same thing? Reply
Roy2002 - Tuesday, September 19, 2023 - link
Intel 4 and 3 are basically the same, with the same device density, as 3 is an enhanced 4. I assume it has a slightly higher density than TSMC 5nm and performance is slightly better. Let's see. Reply
kwohlt - Tuesday, September 19, 2023 - link
Intel 4 is not library complete. It can't be used for the SoC tile. Reply
sutamatamasu - Tuesday, September 19, 2023 - link
I wonder, if the current processor has a dedicated NPU, then what the heck happened to GNA? Is it still in there, or did they remove it? Reply
Exotica - Tuesday, September 19, 2023 - link
Intel should've either implemented TB5 in Meteor Lake or waited until after Meteor Lake shipped to announce TB5. Because as cool and impressive as Meteor Lake seems, for some of us it's already obsolete, in that it makes no sense to buy a TB4 laptop/PC rather than wait for TB5 silicon to hit the market. Reply
FWhitTrampoline - Tuesday, September 19, 2023 - link
Why use TB4 or USB4/40Gbs and have to deal with the extra latency and bandwidth-robbing overhead, compared to PCI-SIG's OCuLink, which is just pure PCIe signalling delivered over an external OCuLink cable? OCuLink/PCIe requires no extra protocol encapsulation and encoding/decoding steps at the PCIe link stage, so that's lower latency compared to USB4/TB4 and later generations, which have to do extra encoding/decoding of any PCIe protocol packets to send them out over TB4/USB4. And for external GPUs, 4 lanes of PCIe 4.0 connectivity can provide up to 64Gbs of bandwidth over an OCuLink port/cable, and OCuLink ports can be 8 PCIe lanes and wider. One can obtain an M.2/NVMe slot to OCuLink adapter and get an external OCuLink connection of up to 64Gbs, as long as the M.2 slot is 4 PCIe 4.0 lanes wide, and no specialized controller chip is required on the MB to drive that. And GPD, on their handhelds, offers a dedicated OCuLink port and an external portable eGPU that supports OCuLink or USB4/40Gbs-TB interfacing. TB5 and USB4-V2 will take years to be adopted, whereas OCuLink is just PCIe 3.0/4.0 delivered over an external cable. Reply
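A back-of-the-envelope check of the 64Gbs figure above, assuming PCIe 4.0's 16 GT/s per lane with 128b/130b encoding; link-layer packet overhead is ignored, and the ~32 Gb/s TB4 PCIe-tunnel ceiling is treated as an assumption rather than a measured value:

```python
# Back-of-the-envelope link bandwidth comparison (simplified; real links
# lose a few more percent to packet and link-layer overhead).

def pcie_gbps(lanes: int, gtps: float = 16.0, encoding: float = 128 / 130) -> float:
    """Raw PCIe data rate in Gb/s: lanes x transfer rate x encoding efficiency."""
    return lanes * gtps * encoding

oculink_x4_gen4 = pcie_gbps(lanes=4)   # ~63 Gb/s, the "64Gbs" figure
oculink_x8_gen4 = pcie_gbps(lanes=8)   # a wider OCuLink connector
tb4_link = 40.0                        # TB4/USB4 link rate
tb4_pcie_tunnel = 32.0                 # assumed usable PCIe tunnel ceiling on TB4

print(f"OCuLink PCIe 4.0 x4: {oculink_x4_gen4:.1f} Gb/s")
print(f"OCuLink PCIe 4.0 x8: {oculink_x8_gen4:.1f} Gb/s")
print(f"TB4 link: {tb4_link:.1f} Gb/s, of which ~{tb4_pcie_tunnel:.0f} Gb/s for PCIe data")
```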
Exotica - Tuesday, September 19, 2023 - link
Unlike Thunderbolt, OCuLink doesn't have hotplugging, meaning your device must be connected at cold boot. Not so good for external storage needs. Reply
FWhitTrampoline - Wednesday, September 20, 2023 - link
I'm more focused on the eGPU usage for OCuLink, so I'm not stating that TB4/USB4 connectivity does not have its usage model for your use case. But pure PCIe is the lowest latency for eGPU usage and can be easily adopted by more OEMs than just GPD for their handhelds, as OCuLink will work with any maker's GPUs as long as one is using an OCuLink-capable eGPU adapter or enclosure. And ETA Prime has extensively tested OCuLink adapters with plenty of mini desktop PCs and even the Steam Deck (whose M.2 slot is only PCIe 3.0 capable). It's the 64Gbs on any PCIe 4.0 x4 connection (M.2/NVMe or other) that's good for eGPUs via OCuLink, relative to the current 40Gbs bandwidth of TB4/USB4. Reply
Exotica - Wednesday, September 20, 2023 - link
I've seen those videos and the performance advantages for eGPUs. But most of the eGPUs on the market use Alpine Ridge, a chipset known to reserve bandwidth for DP and have less available for PCIe (22 Gbps). Perhaps there may be one or two based on Titan Ridge with slightly more PCIe bandwidth. It's hard to say how Barlow Ridge will perform in terms of the amount of PCIe bandwidth made available to peripherals. But a 64 Gbps PCIe connection will not saturate the 80 Gbps link, so hopefully we can have most of the available 64 Gbps of PCIe bandwidth. Another problem with OCuLink is that there's no power delivery, so you need a separate wire for power. So Barlow Ridge TB5 has the potential to be a one-cable solution: power up to 240W, PCIe up to 64 Gbps, and it will also tunnel DisplayPort. OCuLink is cool, but Thunderbolt tunnels more capabilities over the wire. Reply
FWhitTrampoline - Wednesday, September 20, 2023 - link
OCuLink is lower latency, as was stated in the earlier posts! And TB4/TB# or USB4/USB# will not be able to beat pure PCIe connectivity for low latency, and latency is the bigger factor for gaming workloads. TB tunneling-protocol encapsulation of PCIe (or any other protocol) will add latency as a result of having to do the extra encoding/encapsulation and decoding/de-encapsulation steps there and back, whereas OCuLink is just unadulterated PCIe passed over an external cable. More device makers need to be adding OCuLink capability to their systems, as that's simple to do and requires no TB#/USB4-V# controller chip hung off of MB PCIe lanes; the OCuLink port is just passing PCIe signals outside of the device. And TB5/USB4-V2 is more than 64Gbs, but that will require more PCIe lanes attached to the respective TB5/USB4-V2 controller and use more overhead to do it, whereas if one has the same number of PCIe lanes connected via OCuLink, that's always going to be lower overhead, with more available/usable bandwidth and lower latency.
Most likely the PCIe lane counts will remain at 4 lanes max, and they will just go from PCIe 4.0 to PCIe 5.0 to support TB5 and USB4-V2 bandwidth, but whatever PCIe standard is utilized, OCuLink will always have lower overhead and lower latency than TB/whatever or USB4/whatever, as OCuLink skips the extra tunneling-protocol steps required.
Plus, by extension, with any OCuLink ports being purely PCIe-protocol based, that opens up the possibility of OCuLink to TB/USB/whatever adapters being utilized for maximum flexibility in other use cases as well. Reply
Exotica - Wednesday, September 20, 2023 - link
OCuLink has merit for sure, but again, it is clunky. Unlike Thunderbolt, it doesn't tunnel DisplayPort or provide power delivery. It also doesn't support hotplugging. That is why it will most likely remain a niche offering. Also, you're saying OCuLink is lower latency, but by how much? Where is the test data to prove that? And does it really matter? Operating systems can be run directly off of Thunderbolt NVMe storage; the latency is low enough for a smooth experience. And even if OCuLink is technically faster, a GPU such as a 4080 or 4090 or 7900XTX in a PCIe4x4 or even PCIe5x4 eGPU Thunderbolt 5 enclosure will be much faster than the iGPU or other internal graphics. And if the eGPU enclosure is Thunderbolt enabled, it can power the laptop or host device and probably act as a dock, providing additional downstream Thunderbolt ports and possibly USB as well. Thunderbolt provides flexibility that OCuLink does not. Both standards have merit.
But I have a feeling Thunderbolt 5, if implemented properly in terms of bug-free firmware NVMs from Intel, will gain mass-market appeal. The mass market is hungry for the additional bandwidth. ASMedia will probably do extremely well too with its USB4 and upcoming USB4v2 offerings. Reply
TheinsanegamerN - Thursday, September 21, 2023 - link
Don't waste your time, Trampoline is an OCuLink shill who will ignore any criticism of his beloved zuckertech. The idea that most people don't want to disassemble a laptop to use a dock is totally alien to him. Reply
FWhitTrampoline - Thursday, September 21, 2023 - link
LOL, OCuLink's creator PCI-SIG is a not-for-profit standards organization that's responsible for the PCIe standards, so it's not like they are any business interest with a fiduciary responsibility to any investors. OCuLink is just a port/electrical PCIe extension cabling standard that was in fact originally intended to be used in consumer products, but Intel, a member of PCI-SIG along with other industry members, had a vested interest in the Intel/Apple co-developed Thunderbolt IP, because of TB controllers and TB controller sales related interests.
And TB4/later and USB4/later will never have as low latency, owing to the fact that any PCIe signalling has to be intercepted and encapsulated by the TB/USB/whatever protocol controller in order to be sent down the TB cabling, whereas over the OCuLink ports/cabling that's just the PCIe signalling/packets, with no extra delays from tunneling-protocol encoding/encapsulation and decoding/de-encapsulating steps.
So OCuLink represents the maximum flexibility, as that's the better, lowest-latency solution for eGPUs, being just pure unadulterated PCIe signaling. And because it's just PCIe, that opens up the possibility of all sorts of external adapters that take in PCIe and can convert that to DisplayPort/HDMI/USB/TB/whatever the end users need, because all motherboard external I/O, for the most part, is in the form of PCIe, and OCuLink just brings that PCIe directly out of devices via ports/external cables.
And to be so dogmatically opposed to OCuLink is the same as being opposed to PCIe! And does any rational person think that that's logical! OCuLink is external PCIe and that's all there is to that, and it's the lowest-latency method to interface with GPUs, via any PCIe slot or externally via an OCuLink connection (PCIe is PCIe).
Give me a laptop with at least one OCuLink PCIe 4.0 x4 port and with that I can interface to an eGPU at 64Gbs bandwidth and the lowest latency possible! And there can and will be adapters that can be plugged into that one OCuLink port that can do what any other ports on the laptop can do, because those ports are all just connected to some MB PCIe lanes in the first place. Reply
Kevin G - Wednesday, September 20, 2023 - link
The main advantage of TB4 is that the form factor is USB-C, which can be configured for various other IO. This is highly desirable in a portable form factor like laptops or tablets. Performance is 'good enough' for external GPU usage. OCuLink may be faster, but it doesn't have the flexibility that TB4 over the USB-C connector does. OCuLink has its niche, but a mainstream consumer IO solution is not one of them. Reply
FWhitTrampoline - Thursday, September 21, 2023 - link
OCuLink is just externally routed PCIe lanes, and really there can be one OCuLink port on every laptop specifically for the best and lowest-latency eGPU interfacing, and even OCuLink to HDMI/DisplayPort/whatever adapters that can make the OCuLink port into any other port at the end user's discretion. So for eGPUs/enclosures that have OCuLink ports that's 64Gbs and the lowest latency, and for any legacy TB4/USB-only external eGPU devices, just get an OCuLink to TB4/USB4 adapter in the interim and live with the lower bandwidth and higher latency. GPD already has a line of handheld gaming devices that utilize a dedicated OCuLink port, and a portable eGPU that supports both OCuLink interfacing and TB4/USB4 interfacing. And I do hope that GPD branches out into the regular laptop market, as GPD's external portable eGPU works with other makers' products, and even products that have M.2/NVMe-capable slots available, via an M.2/NVMe to OCuLink adapter! LOL, only vested interests would object to OCuLink in the consumer market space, specifically those vested interests with business models that do not like any competition. Reply
TheinsanegamerN - Thursday, September 21, 2023 - link
Because most people don't want to disassemble their laptop to plug in an M.2 adapter, you knucklehead. Reply
FWhitTrampoline - Thursday, September 21, 2023 - link
No one is forcing you to do that, and for others that's an option, albeit an inconvenient one. But really the adapters are not meant for laptops in the first place, and even for mini desktop PCs it's not an easy task, though still more manageable than doing it with a laptop. It would just be better if more mini desktop PC and laptop OEMs would adopt an OCuLink PCIe 4.0 x4 port for eGPU usage, like GPD has done with their line of handheld gaming devices. And with mass adoption of OCuLink there could also be adapters to support all the other standards, as OCuLink, being PCIe based, will by extension support that as well. Reply
kwohlt - Tuesday, September 19, 2023 - link
The market for people who find TB4 to be insufficient is too small to delay MTL for them. Reply
Exotica - Wednesday, September 20, 2023 - link
Source or market research, please? I have the feeling that many enthusiasts will not be interested, because of missing TB5, and also because of its IPC improvements (or lack thereof) vs Raptor Lake. Meteor Lake certainly is impressive. But it seems to be less about raw performance and more about the process improvement. Foveros. Chiplets. EUV. New manufacturing abilities. AI engine. Power efficiency. Newish GPU.
But from a generational uplift perspective, from raw CPU performance to the Thunderbolt IO, it's not much of an upgrade for enthusiasts. Intel should've just launched MTL in Dec and then announced TB5 in January. What was the reason to announce TB5 before the MTL reveal?
I guess we will have to wait on Arrow Lake mobile (if that's a thing) or Lunar Lake for TB5 on laptops. Reply
kwohlt - Wednesday, September 20, 2023 - link
You need market research to tell you TB4 bandwidth is sufficient for the majority of users? 40Gb/s can easily drive gigabit internet and multiple monitors. Most jobs do not require more. At the Fortune 500 I manage IT for, we still haven't even switched to Thunderbolt, as 3.1 docks are more than sufficient. There's market research on TB4 purchase trends that I'm not going to pay for, so we'll just have to settle on "Intel's market research determined that delaying their next-gen product line for this 1 feature, potentially causing delays across OEMs' 2024 product lines in the process, was not worth it" Reply
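A rough budget behind the claim above, with assumed per-item figures (uncompressed 4K60 8-bit RGB is on the order of 12-13 Gb/s before blanking or compression; all numbers here are approximations for illustration, not measurements):

```python
# Rough TB4 bandwidth budget for "gigabit internet + multiple monitors".
# Per-item figures are assumed round numbers, not measured values.

tb4_link_gbps = 40.0

loads_gbps = {
    "4K60 8-bit display #1 (uncompressed, approx.)": 12.5,
    "4K60 8-bit display #2 (uncompressed, approx.)": 12.5,
    "Gigabit Ethernet": 1.0,
    "USB peripherals (webcam, audio, drives)": 5.0,
}

used = sum(loads_gbps.values())
for name, gbps in loads_gbps.items():
    print(f"{name}: {gbps:.1f} Gb/s")
print(f"Total: {used:.1f} of {tb4_link_gbps:.1f} Gb/s "
      f"({tb4_link_gbps - used:.1f} Gb/s headroom before protocol overhead)")
```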
PeachNCream - Thursday, September 21, 2023 - link
"...many enthusiasts..."While that segment might be outspoken, the percentage of the overall market is tiny and the percentage that cares among that fraction is even smaller. Basement dweller computer nerds and the e-sports people they idolize don't buy the hundreds of thousands of units that a computer manufacturer purchases. Sure, they get a minor head nod from the company to keep them from slobbering and raving about being ignored, but that's done because it's cheap to coddle them with marketing speak and make them believe features are targeted at them so their ego balloons aren't popped and sites like this have a bone or two to throw them once in a while, but ultimately, no one cares what they want as long as they fanboy argue in favor of their preferred brand with other nerds that like the competition. Reply
TheinsanegamerN - Thursday, September 21, 2023 - link
Exactly. TB5 is exciting and Meteor Lake is mostly DoA without it. Who would invest thousands into a machine that can't make use of newer functionality? Reply
KaarlisK - Tuesday, September 19, 2023 - link
Was this just written by having an AI interpret the slides? And then OCR failed? "This means that higher Out-of-Service (OoS) work is allocated to P-cores for more demanding and intensive workloads, while lower Quality-of-Service (QoS) workloads are directed to E-cores, primarily to save power" Reply
Ryan Smith - Tuesday, September 19, 2023 - link
No, it was done by a sleep-deprived human. Reply
KaarlisK - Tuesday, September 19, 2023 - link
Thank you for the explanation. The problem is, I caught at least three more mistakes like this, where a wrong assumption is made about what the text on a slide actually means. In which case (knowing that I'm not an expert), how can I be certain that there aren't many more mistakes that I haven't spotted?
We do come to Anandtech for in-depth analysis, which requires that trust. Reply
Ryan Smith - Tuesday, September 19, 2023 - link
The blunt answer is that we're imperfect (to err is human). We've made mistakes in the past and will continue to do so in the future. But we always own up to those mistakes, and will correct anything if we catch it (or if it gets pointed out). Reply
DannyH246 - Tuesday, September 19, 2023 - link
Wow! Intel have some revolutionary ideas here!! Their chiplet approach will change the industry.
Would be what I'd have said if they'd presented this 6 years ago. My response today is...meh. Reply
tipoo - Tuesday, September 19, 2023 - link
Is anyone at AT planning on deep-diving the A17 Pro? Reply
Ryan Smith - Tuesday, September 19, 2023 - link
At the moment, no. I do not have a mobile editor to work on such projects. Reply
FWhitTrampoline - Tuesday, September 19, 2023 - link
Oh no, that's bad news, as Apple appears to have gone even wider with the A17 P-core's decode resources than the A14/Firestorm, if the Apple promotional material is correct! Maybe Chips and Cheese will look at the A17's P-core design, with some micro-benchmarks as well. Reply
tipoo - Tuesday, September 19, 2023 - link
Yeah, that's too bad. It looks like the E-cores got a bigger bump than the P-cores, but they didn't really advertise it, given how strangely they mentioned it. Reply
FWhitTrampoline - Wednesday, September 20, 2023 - link
The slide from Intel on its Crestmont E-core design (block diagram) does not look that much different from Gracemont's block diagram, and the Redwood Cove (block diagram) core design still appears to be a 6-wide instruction decoder design, and so similar to Golden Cove, but there needs to be more info concerning micro-op issue rates and other parts of Redwood Cove's core design. Reply
GeoffreyA - Thursday, September 21, 2023 - link
It's hard to see the instruction decoders being increased all that much, because of their power burden in x86. Reply
ikjadoon - Tuesday, September 19, 2023 - link
>As expected, Meteor Lake brings generational IPC gains through the new Redwood Cove cores.
Redwood Cove does not have any IPC gain, I believe. Is there a citation or slide regarding this?
This will now be the third Intel CPU generation with 0% to 1% IPC gains in their P-cores. Reply
Gavin Bonshor - Tuesday, September 19, 2023 - link
Intel confirmed to me that Redwood Cove would have IPC gains over Raptor Cove. When I get back (been sat at a PC a lot the last few days), I'll grab it for you. Reply
kwohlt - Tuesday, September 19, 2023 - link
There are so many changes in MTL, it would make sense to just save a new P-core uArch for next gen. Especially when clockspeed/watt is going up a decent amount, so it's not like perf/watt is stagnating. Reply
GeoffreyA - Thursday, September 21, 2023 - link
I think they've been following the old tick-tock system, Sunny and Golden Cove being the tocks, and Willow and Raptor the ticks. So, it's possible that Redwood would bring some proper changes. Reply
dwillmore - Tuesday, September 19, 2023 - link
Nice! Finally might get a desktop CPU without having to pay for an expensive built-in GPU that I don't want. (If you think $25 off for an F model is the same thing, you're delusional.)
On an unrelated note, I'm curious which of these tiles represents a minimum viable system. Are the LP E-cores on the low-voltage island of the SoC die sufficient? Can we get by without the CPU or GPU dies? That might make a really nice media player, as it would have all the display driving and video decoding hardware and a couple of LP E-cores to manage housekeeping and maybe draw a GUI if necessary.
What about for a simple headless system, could just the SoC die be enough? In either of these cases you'd need the I/O die (maybe even a harvested one where some parts don't work, but only parts serving tiles that aren't used...). Reply
Gavin Bonshor - Tuesday, September 19, 2023 - link
As it stands, there are no plans to bring MTL to desktop. As for next year, that remains to be seen. Reply
FWhitTrampoline - Tuesday, September 19, 2023 - link
Intel has ruined the small form factor DIY market that needs socket-packaged processors and not BGA-packaged processors/SoCs. So no chance to build an ASRock Desk Mini that's STX MB form factor based and supports socket-packaged Intel and AMD SoCs/APUs with powerful iGPUs. And really AMD has intentionally delayed any Ryzen 7000G (socket packaged) desktop APU release in favor of BGA-only OEM SKUs in Minisforum and Beelink mini desktop PC systems, where there are now Ryzen 7040/BGA-packaged processor based systems allotted 70W cTDPs, and so 5 more watts than the Ryzen 5700G (65W) desktop APUs, the last generation usable for the ASRock X300 Desk Mini line.
And the InWin Chopin is a DIY-friendly very small form factor build that takes a Mini-ITX MB but lacks the room for any dGPU to be slotted in, as the Chopin's form factor is just too small, and AMD's Ryzen 5000G APUs were a popular choice for DIY-friendly small form factor Chopin system builds. And AMD's desktop Ryzen 7000 series offers RDNA2/2CU integrated graphics, but that's not APU class or marketed by AMD as APU class.
I had hoped that Intel would have at least released a 65W socket-packaged Meteor Lake SKU so folks could possibly have some ASRock Desk Mini DIY-friendly option on a socket-based STX MB form factor. And I was even more hoping that some Meteor Lake S (65W-80W) socket-packaged variant would force AMD's hand and make them release some Ryzen 7000G socket-packaged desktop APU for the DIY market! But now, sans any Intel competition in that product segment, AMD may just not release any Ryzen 7000G for a good long while, and DIY small form factor will go deprecated in favor of BGA-only and OEM-only as well. Reply
kwohlt - Tuesday, September 19, 2023 - link
Oh hey, you're that guy from WCCF. Reply
FWhitTrampoline - Wednesday, September 20, 2023 - link
Stop trying to DOXX People here, and I hope the MODS see this. Reply
kwohlt - Wednesday, September 20, 2023 - link
It's not doxxing to point out that someone on another forum copy-pastes the same comment all the time. There's no personal info here at all. Reply
FWhitTrampoline - Thursday, September 21, 2023 - link
More intimidation here, and Doxxing is Doxxing! You are using intimidation tactics that should get a moderation response before any legal response is required! Reply
TheinsanegamerN - Thursday, September 21, 2023 - link
Learn how to spell "DOX", and go back to WCCFtech to get dunked on. Reply
FWhitTrampoline - Thursday, September 21, 2023 - link
Look at you here, trying to intimidate and adding nothing to any discourse! This is not the kind of posting that should be allowed at Anandtech! Reply
TheinsanegamerN - Thursday, September 21, 2023 - link
Oh, I know this guy! His obsession with the InWin Chopin is almost Chris Chan-like. Reply
FWhitTrampoline - Thursday, September 21, 2023 - link
That would be more of a DIY-friendly very small form factor enthusiast/end user there! And with a reasonable expectation that the vibrant DIY small form factor devices (mini desktop PCs) market continues to be offered socket-packaged processors newer than the Ryzen 5000G/Zen 3 with its Vega 8CU iGPU-based graphics IP, and ditto for any Intel based options as well.
So it's wrong to expect any further Ryzen G series desktop (socket packaged) APUs from AMD, because that's not good for the OEMs there and their business models, which are not so DIY friendly for processor upgrades if the processor comes BGA-wedded to the motherboard! And OEM products that are not so good for eWaste reduction, because if the processor goes it can not be easily replaced/upgraded by the end user (DIY sorts of folks).
There needs to be a Right to Processor Upgrade just as much as a Right to Repair and with Socket packaged processors those rights go hand in hand there along with any environmental eWaste concerns. But we must not trample upon those Business Models as that's just not good for OEM Profits there, consumers be damned!
And InWin Chopin or ASRock Desk Mini, Socket Packaged APUs/SOCs are the best option as that's by definition DIY friendly there.
I'll expect no Complaints from you if the entire PC market goes BGA Packaged Processors only and you'll have to buy the Processor Attached to the Motherboard, take it or leave it! Reply
brucethemoose - Wednesday, September 20, 2023 - link
> That might make a really nice media player
Seems like a lot of silicon for what's essentially the job of a dirt-cheap ARM SoC. And it's a questionable fit for a headless system unless it's something like a Stable Diffusion/transcoding host.
It *does* seem like an interesting fit for a smart TV chip, maybe with a small GPU die, as they would actually use the NPU for their internal video filtering. Reply
emvonline - Tuesday, September 19, 2023 - link
Intel 4 will not be shipping any products to customers until mid-December. This after stating it was in production in December 2022. 12 months from production start to PCs out is not good. And I had better be able to buy a Meteor Lake notebook on Dec 14th 2023, or this is exactly like old Intel (launch means we may have sold some parts to someone somewhere). Claiming a node is done when it's production ready, while you ship nothing, is problematic. FYI, Meteor Lake is 2x the cost of Raptor Lake in 2024. Intel 4 is not a cost reduction. The product might be great, but it is expensive. Reply
Roy2002 - Tuesday, September 19, 2023 - link
Intel 4 was in production in December 2022? No way! It should have started not long ago. Usually the first real product silicon is taped out one year ahead of the release date, and that silicon is very buggy and needs several steppings to have bugs fixed. Reply
Roy2002 - Tuesday, September 19, 2023 - link
So December 2022 is the initial project tape-in date, and silicon debug follows. Reply
ChrisGar15 - Tuesday, September 19, 2023 - link
Probably called "manufacturing ready." Reply
xol - Wednesday, September 20, 2023 - link
So Xe-LPG is still Intel UHD graphics (13/14th gen now?) with a top EU count of 128, up from 96. Fine. Maybe about 3TF FP32, not quite Xbox Series S level.
They added RT support, which is good I guess, but will it ever be used in a GPU that is really PS4-level performance? Or maybe the applications are not gaming ones.
The product sounds good; just wait for the numbering scheme. Reply
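A quick sanity check of the ~3TF estimate above, assuming 8 FP32 lanes per Xe vector engine, 2 ops per FMA, and a placeholder clock of about 1.5 GHz (shipping clocks are not given in this discussion):

```python
# Rough FP32 throughput estimate for the iGPU; the clock is a placeholder
# assumption, since shipping frequencies aren't stated here.
xe_cores = 8
vector_engines_per_core = 16     # 128 total / 8 Xe cores
fp32_lanes_per_engine = 8        # assumed SIMD width per vector engine
ops_per_fma = 2                  # multiply + add
clock_ghz = 1.5                  # placeholder clock

tflops = (xe_cores * vector_engines_per_core * fp32_lanes_per_engine
          * ops_per_fma * clock_ghz) / 1000
print(f"~{tflops:.1f} TFLOPS FP32 at {clock_ghz} GHz")  # ~3.1 TFLOPS
```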
JBCIII - Wednesday, September 20, 2023 - link
"An example of how applications pool together the various tiles include those through WinML, which has been part of Microsoft's operating systems since Windows 10, typically runs workloads with the MLAS library through the CPU, while those going through DirectML are utilized by both the CPU and GPU."This sentence is really a mess. Editor: please take note. Is "example" the subject of "include"? That would make "includes" the necessary form of the verb. What is the subject of "runs"? I'm guessing WinML. Maybe it should be "WinML...which typically runs" but the long parenthetical expression about Windows 10 support makes it hard to bridge the gap. Maybe parentheses would be more clear instead of commas to keep the meaning on track. I'm still not sure what was meant. Reply
GeoffreyA - Thursday, September 21, 2023 - link
WinML is the higher-level abstraction, and DirectML, the lower-level one. Reply
Kevin G - Wednesday, September 20, 2023 - link
This is what I was hoping to see Intel pull off in the late 14 nm/early 10 nm days when their foundries were having difficulties. Intel should have pivoted in this direction at the first sign of trouble, as the packaging side of this, while cutting edge back then, could have been pulled off. Better late than never.
However, with Meteor Lake around the corner, it is shaping up to be a pretty good design. Both the CPU and GPU sides can scale and evolve independently from the central SoC. The GPU portion that was moved onto the SoC makes sense, as the codecs and display logic are not going to change over the next few generations. I would quibble about the point made that putting them next to the NPU is more advantageous than next to the GPU cores. There certainly is a benefit for AI upscaling of movies, but my presumption is that it'd be lower power/lower latency to have the encoders next to the GPU cache, which houses the final rendered frame for encoding and transmission. The tasks that benefit here would be game streaming or remote access. Both things can be true, hence why it is a quibble: which approach is superior will depend on the individual use case.
My initial presumption for the IO die was that it was to house various analog circuits that would then be leveraged by the SoC die. This is a clever means of process optimization as analog circuitry does not scale at the same rate as logic. Similarly this would permit a cheaper die to extend the number of area intensive IO pads.
The last thing missing is the L4 cache die that was hinted at in earlier Linux patches. That'll probably come along with the Lunar Lake generation. Reply
Eliadbu - Wednesday, September 20, 2023 - link
The adamantine cache is supposedly in the base tile that all the other tiles are connected to. It wasn't mentioned here; maybe, if it does exist on MTL, we will see information about it when the CPUs officially come out. Reply
haplo602 - Thursday, September 21, 2023 - link
So now we will have 3 types of cores for the OS to schedule... I hope Intel is working with OS vendors to properly implement this or it will be a nightmare... we already saw the 12/13th gen issues on Windows 10 and 11 with wrong E-to-P scheduling. Reply
GeoffreyA - Thursday, September 21, 2023 - link
Complexity is good, says Intel. Reply
Vink - Friday, September 22, 2023 - link
Pat Gelsinger does a very good job, and I'm speaking concretely because I use series 12 and 13 in the manufacture of graphics stations, ultra PCs, and standard PCs, and they all work PERFECTLY, with maximum benchmark results and infinite Tau (K series only)... I'm looking forward to series 14, especially for its benchmark results... Reply
JayNor - Friday, September 22, 2023 - link
"...8 x Xe graphics cores with 128 vector engines (12 per Xe core)"
16 per Xe core. Reply
James5mith - Friday, September 29, 2023 - link
"This is because it's the first client processor to be made using chiplets instead of a monolithic design."Don't tell AMD. Reply
dicobalt - Saturday, October 14, 2023 - link
Any indicators of future CAMM sockets integrated directly onto the CPU package? Reply