11 Comments
Threska - Thursday, October 12, 2023 - link
One wonders if optical links would be better at speeding up the links between memory and CPU.
Reflex - Thursday, October 12, 2023 - link
Unlikely for performance: electricity over copper also moves at very close to the speed of light. It could help with power consumption, but that depends on how far you want the memory from the CPU. If it's close, then you probably don't gain anything due to conversion losses (the signals still need to be converted back to electricity at either end). If it's further away, then it may provide a benefit despite the conversion.
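To put rough numbers on the comment above, here is a minimal back-of-the-envelope sketch in Python. The 5 cm distance, the ~0.7c and ~0.67c signal-speed factors, and the 1 ns conversion figure are illustrative assumptions rather than measured values; the point is only that time-of-flight at package scale is tiny compared with any fixed electrical/optical conversion overhead.

```python
# Back-of-the-envelope comparison of copper vs optical at package-scale distance.
# All figures below are illustrative assumptions, not measured values.

C = 299_792_458  # speed of light in vacuum, m/s

def flight_time_ns(distance_m: float, speed_fraction: float) -> float:
    """Propagation delay in nanoseconds for a signal travelling at a fraction of c."""
    return distance_m / (speed_fraction * C) * 1e9

distance_m = 0.05                               # assumed 5 cm CPU-to-memory distance
copper_ns = flight_time_ns(distance_m, 0.70)    # assumed ~0.7c on a copper trace
fiber_ns = flight_time_ns(distance_m, 0.67)     # assumed ~0.67c in glass fiber
conversion_ns = 1.0                             # assumed E/O + O/E overhead (illustrative only)

print(f"copper flight time:  {copper_ns:.3f} ns")
print(f"fiber flight time:   {fiber_ns:.3f} ns")
print(f"fiber + conversion:  {fiber_ns + conversion_ns:.3f} ns")
```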
shing3232 - Friday, October 13, 2023 - link
You still need to convert it into a light signal, though, and there is a delay for that.
saratoga4 - Friday, October 13, 2023 - link
Optical links are mostly used for longer range. Once you're within centimeters, they don't make a lot of sense.
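Continuing the illustrative assumptions from the sketch above (a fixed 1 ns conversion overhead, signals at ~0.67c in fiber), a quick calculation shows roughly where optical starts to pay off: only once the link is on the order of metres does the conversion overhead stop dominating the one-way latency.

```python
# Find the link length at which an assumed, purely illustrative 1 ns
# electrical<->optical conversion overhead drops below 10% of total one-way latency.

C = 299_792_458           # speed of light in vacuum, m/s
SPEED_FRACTION = 0.67     # assumed signal speed in fiber as a fraction of c
CONVERSION_S = 1e-9       # assumed E/O + O/E overhead (illustrative only)
OVERHEAD_BUDGET = 0.10    # conversion allowed to be at most 10% of total latency

# conversion <= budget * (conversion + distance / (SPEED_FRACTION * C))
# => distance >= conversion * (1/budget - 1) * SPEED_FRACTION * C
break_even_m = CONVERSION_S * (1 / OVERHEAD_BUDGET - 1) * SPEED_FRACTION * C
print(f"conversion overhead falls below 10% beyond ~{break_even_m:.2f} m")
```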
Kevin G - Thursday, October 12, 2023 - link
It'd be nice to see some more exotic memory techniques appear in HBM. For example, separate read and write buses to remove all turnaround-time latencies. Similarly, a dedicated command bus could be created. These three buses (read, write, command) could be defined at different widths and clocks in the spec, and each bus could be clocked independently for power savings. These changes would increase the complexity of the memory controller, but that is the price of exotic memory technologies like HBM in the first place.
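A purely hypothetical sketch of what independent read/write/command buses might look like: each bus gets its own width and clock, and per-bus bandwidth falls out directly. None of the widths, clocks, or data-rate multipliers below come from any JEDEC document; they are placeholders to illustrate the idea.

```python
# Hypothetical model of an HBM-like interface with separate read, write, and
# command buses, each with its own width and clock. All numbers are placeholders.

from dataclasses import dataclass

@dataclass
class Bus:
    name: str
    width_bits: int       # bus width in bits
    clock_hz: float       # bus clock frequency
    data_rate_mult: int   # e.g. 2 for DDR-style signalling

    def bandwidth_gbps(self) -> float:
        """Peak bandwidth in GB/s for this bus alone."""
        return self.width_bits / 8 * self.clock_hz * self.data_rate_mult / 1e9

# Hypothetical configuration: wide read bus, narrower write bus clocked lower
# for power savings, and a small dedicated command bus.
buses = [
    Bus("read",    width_bits=1024, clock_hz=1.2e9, data_rate_mult=2),
    Bus("write",   width_bits=512,  clock_hz=0.8e9, data_rate_mult=2),
    Bus("command", width_bits=32,   clock_hz=0.8e9, data_rate_mult=1),
]

for b in buses:
    print(f"{b.name:>7}: {b.width_bits:>5} bits @ {b.clock_hz/1e9:.1f} GHz "
          f"-> {b.bandwidth_gbps():.1f} GB/s")
```

Clocking the write and command buses lower than the read bus, as in this placeholder configuration, is one way the power-saving idea in the comment could be expressed.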
meacupla - Thursday, October 12, 2023 - link
A 2048-bit bus is crazy. I would be happy if we could have a 256-bit bus on a low-end GPU.
schujj07 - Thursday, October 12, 2023 - link
That is 2048 bits per stack. A GPU with 4 stacks of HBM4 would have an 8192-bit bus.
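The arithmetic is straightforward; here is a small sketch. The 2048-bit per-stack width comes from the comment above, while the per-pin data rate is an assumed placeholder, not a published HBM4 figure.

```python
# Total bus width is simply the per-stack width times the number of stacks.
# The per-pin data rate used for the bandwidth figure is an assumption for illustration.

PER_STACK_WIDTH_BITS = 2048   # HBM4 interface width per stack (from the comment above)
stacks = 4
total_width_bits = PER_STACK_WIDTH_BITS * stacks
print(f"{stacks} stacks x {PER_STACK_WIDTH_BITS} bits = {total_width_bits}-bit bus")

assumed_gbps_per_pin = 6.4    # illustrative assumption only
bandwidth_gbs = total_width_bits * assumed_gbps_per_pin / 8
print(f"at an assumed {assumed_gbps_per_pin} Gbps/pin: ~{bandwidth_gbs:.0f} GB/s")
```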
Oxford Guy - Friday, October 13, 2023 - link
Bit of trivia: my recollection is that AMD's Fiji had a 2048-bit HBM1 bus. That went to 1024 bits with Vega (HBM2).
Kevin G - Friday, October 13, 2023 - link
Fiji was 4096 bit wide with four stacks of HBM1. Vega 64 had two stacks of HBM2 for a 2048 bit width. Radeon VII went to four stacks again, while the rarely seen mobile Vega had a single stack at 1024 bits wide. This mobile Vega part is what was used inside Intel's Skull Canyon NUC.

The largest number of HBM stacks on a product I've seen so far is eight (8192 bit width) for the MI300 series, but that is a case where each chiplet has its own HBM stack. There are a few large monolithic dies with six stacks (6144 bit width). It is technically possible to go to an even greater number of stacks, but the extra interposers and packaging headaches are not yet worth it. With products like nVidia's H100 being able to have a single layer of a stack disabled, the yield factor of packaging may not be that significant an issue.
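Tallying the examples in the comment above: HBM1 through HBM3 expose 1024 bits per stack, so total interface width is simply 1024 times the stack count. The product list below just restates the figures given in the comment.

```python
# Relate stack count to total interface width for the products mentioned above.

BITS_PER_STACK = 1024  # per-stack interface width for HBM1/HBM2/HBM3

products = {
    "Fiji (4 stacks, HBM1)":        4,
    "Vega 64 (2 stacks, HBM2)":     2,
    "Radeon VII (4 stacks, HBM2)":  4,
    "Mobile Vega (1 stack, HBM2)":  1,
    "Six-stack monolithic die":     6,
    "MI300 series (8 stacks)":      8,
}

for name, stacks in products.items():
    print(f"{name:32s} -> {stacks * BITS_PER_STACK:5d}-bit interface")
```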
Oxford Guy - Friday, October 13, 2023 - link
'Fiji was 4096 bit wide with four stacks of HBM1. Vega 64 had two stacks of HBM2 for a 2048 bit width.'
Don't get old.
Oxford Guy - Friday, October 13, 2023 - link
'the rarely seen mobile Vega had a single stack at 1024 bits wide.'
That must be what tricked my memory, as I remembered Vega as being half the width.