Issue link: https://iconnect007.uberflip.com/i/1293772
If you're familiar with stepper sizes, there's a maximum size die that you can physically build; it's about 800 square millimeters, limited by the reticle. When a die gets that large, the yields go down exponentially, which makes the part very expensive. You get only so many good parts per wafer because of the low yield. To make things worse, that maximum reticle size is not enough for the level of integration needed to bring everything closer together for the performance required by cutting-edge applications, such as AI accelerators, high-performance computing in data centers, and the high-end networking market. Splitting the functionality across multiple parts in separate packages is not a practical option for these high-end products.

Johnson: That makes for an interesting conflict. You need to get all of that functionality onto the silicon, but it's hard to get that much functionality onto the silicon.

Horner: Exactly. High-end computing technology needs reduced latencies. This is becoming very critical because every time you make a hop from one packaged part to the next device to access data, you add latency. A lot of these AI applications require a lot of access to memory. If you're making many hops back and forth, the round trips cost you a lot of latency, which is not very appealing because it means processing data will be slow. Bringing the memory closer makes more sense.

That's why there's a big movement, which started with AMD, to bring the DRAM memory much closer. As a result, JEDEC started the high bandwidth memory (HBM) standard a few years ago. Effectively, layers and layers of DRAM are stacked together as a block of memory that can be attached to a processing unit. It could be a CPU, a GPU, or any SoC that needs high levels of memory access with low latency. By putting those devices close to each other, especially in the same package, where you don't have to go out of the package and across PCB traces to another package to access data, you minimize a lot of latency and save power.

As for the economic aspect, a complex design that mixes many dies in a package involves many layers of complexity that need to be addressed. If there were a way to communicate information during package floor planning to the SoC designer, the package designer, or the person who is going to assemble the part on a board, each part of the design could be further optimized, and optimized in parallel. Packages have many substrate layers, similar to the PCB layer concept, where additional layers translate into increased cost. If you know the optimal places to put the bump connections to the package substrate to minimize the number of substrate layers, or where to place the microbumps to make the best die-to-die connection and minimize the size of the interposer, that can translate into overall solution cost savings. There needs to be more communication between the package designer, the PCB designer, and the die designer for a cost-effective and optimal solution.

Dan Feinberg: 5G is accelerating that. One of the responses to this issue that you're discussing is the use of chiplets in CPU manufacture. Do you see that?

Horner: In many applications, it's not just because the die size is getting larger; it's also about having lots of parallel processing.
At times, there are 40+ cores running in parallel inside a part. Looking at a few of AMD's designs, there are dies with four cores each that are combined with copies of the same die: four times four, so there are 16 cores within a package. Whether a large die is being partitioned into smaller parts or multiple cores are being aggregated, one has to decide how to partition the die and what interfaces to use.
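To put rough numbers on the yield argument above, the short Python sketch below uses a simple Poisson defect model, Y = exp(-D0 x A), with assumed values for defect density and wafer size; none of the figures come from the interview. It compares one near-reticle-limit die against chiplet-sized dies.

import math

WAFER_DIAMETER_MM = 300          # standard 300 mm wafer (assumed)
DEFECT_DENSITY_PER_MM2 = 0.001   # assumed: 0.1 defects per square centimeter

def die_yield(area_mm2, d0=DEFECT_DENSITY_PER_MM2):
    # Poisson defect model: fraction of dies with zero defects
    return math.exp(-d0 * area_mm2)

def gross_dies_per_wafer(area_mm2):
    # Rough count: usable wafer area divided by die area (ignores edge effects)
    wafer_area = math.pi * (WAFER_DIAMETER_MM / 2) ** 2
    return int(wafer_area // area_mm2)

def good_dies_per_wafer(area_mm2):
    return gross_dies_per_wafer(area_mm2) * die_yield(area_mm2)

# One ~800 mm^2 die near the reticle limit vs. ~200 mm^2 chiplet-sized dies
for area_mm2 in (800, 200):
    print(f"{area_mm2} mm^2: yield = {die_yield(area_mm2):.1%}, "
          f"good dies per wafer = {good_dies_per_wafer(area_mm2):.0f}")

With these assumed numbers, the 800 mm^2 die yields fewer than half of its gross dies per wafer, while the 200 mm^2 die yields over 80%, which is the economic pressure behind partitioning a large design into chiplets.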