Discussing the Fabless Advantage with Daniel Nenni (Transcribed from Broken Silicon 63)
The following was transcribed by Sayonara, a member of the Moore's Law Is Dead community. You can download a PDF of the entire transcription if you would like to read the entire conversation, or you can listen to Episode 63 of Broken Silicon on your preferred podcast app.
If I may say so myself, I think this was easily one of the best episodes, and I highly recommend it be experienced in some form by all who want to learn about the following subjects:
1) The Silicon Industry's transition from a world where almost every silicon company had their own fabs, to the current world where "captive fabs" are nearly a thing of the past.
2) Differences between how Samsung, TSMC, Intel, and GlobalFoundries conduct business.
3) Various past and upcoming products from NVIDIA, AMD, Intel, Xilinx, and many more.
If you still aren't sure if you want to read or listen to this conversation, maybe the abbreviated version below will whet your appetite... enjoy!
MLID = Tom, host of the podcast. DN = Daniel Nenni, Founder of SemiWiki.com
MLID: Welcome to Broken Silicon, a computer gaming and hardware podcast. I'm your host, Tom, and I will let my esteemed guest introduce himself.
DN: Alright. Thanks, Tom. My name is Daniel Nenni, I'm a long-time semiconductor professional. Father of four. I live here right outside of Silicon Valley, I've been here most of my life, and in the last ten years I've started writing and documenting my experiences throughout the semiconductor industry. I started in the early 1980's, which was an interesting time and now we've made a huge transformation - a couple different transformations - so there really is a lot to write about.
MLID: What would you say... just thinking here based on what you've said... the biggest difference between - and maybe there's a few you want to mention - between the semiconductor industry in the 1980's and now... what, 40 years later?
DN: 36 for me. I got married right out of college, so my marriage anniversary and my work anniversary are the same thing, so it's easier to remember. When I first came into the business, computer companies really drove the semiconductor industry. These were minicomputers. Before minicomputers we had mainframes, these big things that were in the buildings hiding somewhere. Minicomputers were a little more portable, so when I was in college, we used minicomputers to learn how to program and learn computer engineering.
They were IBM 360s, HP, Data General, Digital Equipment; those were the four big vendors. But those guys made their own chips, so they drove the semiconductor industry. You only had one mainframe for a company whereas minicomputers were spread around, so there were many more silicon chips to be found throughout your building. So that was kind of the birth of the semiconductor industry into the mainstream markets.
MLID: And they all had their own fabs too most of the time, didn't they?
DN: They all did! In fact, my first job was at a fab here in Silicon Valley making chips for a computer company, and it was just the way it was. You had no other choice, and it was viewed as a competitive advantage if you were making your own chips, so they made their own CPUs, they made their own I/O chips, everything, they did their own. Back then a CPU was a board full of chips, right. In fact, sometimes multiple boards. It took multiple chips, multiple boards to make a CPU.
The big change there was when Intel and Motorola came around and said "Hey, we're going to make a general purpose CPU, anybody can use it - and by the way, you can use the same software, you don't have to write your own code". That was a pretty big transformation, that's when the computer companies such as Data General and Digital Equipment and Prime - that really was the beginning of their end. Intel and Motorola really put them out of business.
MLID: That's starting to get into some of the notes here that I wrote down for us to discuss. Why was it seen as an advantage to own your own fab? Most people could maybe guess the mindset, but literally why do you think most people would have said they had their own fabs back then?
DN: Have you heard the saying "Real Men Have Their Own Fabs"?
MLID: [laughs] I have, yeah. I think I read that in your book too.
DN: That's something that everybody in Silicon Valley knows. It's attributed to the guy from AMD, but it actually wasn't him who said it - the founder of Cypress Semiconductor said it - but it just caught on. Jerry Sanders, the president of AMD... he had a really big ego and he got up and yelled "Real Men Have Fabs", right. There's a lot of ego going on with it, as well as the ability to customize your chips. Do you really want to be using the same chips as everybody else? How do you differentiate your product when you're just tagging on with the other guys' silicon?
MLID: When you look back at when everyone owned their own fab, I myself couldn't come up with a good reason, in hindsight, that you would want one. I think the best engineering argument you could make is "Well, we designed this node, we can design it around our needs," right, and then "That's an advantage that we can wield, that only we have this node for these products," but at the end of the day, I feel that often they'll hold each other back.
Like, "Oh, this node needs to be this way, and so we're going to focus on that," but then that might delay the chip's design, or vice versa. Is that an incorrect assumption about why it can be a hindrance to own a fab?
DN: Well... yeah. As it turns out, we know. But when we had our fabs, you could also guarantee capacity. You had control over your design, over your capacity, you could guarantee 𝑥 amount of chips by building your own fabs.
If you're renting space, what happens if there's other renters that will pay more money? It's supply and demand. The fabless model really is a little bit frightening, if you think about it, way back when - and that's why all the banks and investors said "Hey, you gotta have your own fab. Control your own destiny."
And, you know, when TSMC came out-
MLID: And that sounds flashy, control your own destiny, but what does that really mean?
DN: Control your own silicon. [laughs] Yeah, now it has different meanings.
One of the companies that rented out their fabs was Texas Instruments. One of the executives at Texas Instruments was Morris Chang - he's the founder of TSMC. That's where he got the idea for the business. He was renting out fab space, and they were very selective who they rented it to. They weren't going to rent it to a company that was going to compete with them, because that doesn't make sense.
Morris Chang said "Hey, there's a business here, we need to open up the manufacturing economies of scale, do it cheaper," and he went back to Taiwan and pitched it. The Taiwanese government and Philips Semiconductor were some of the early investors. Without the Taiwanese government's backing, they never would have made it.
That brings us to where we are today - we're still trying to make faster, cheaper semiconductors, and to make them cheaper, we have to put more on a chip. What's nice is they have these things called teardowns now that they didn't used to have in my day. In my day, we had to tear it down ourselves. If you look at the Apple phone teardowns, they will give you an example of the advancements. The first iPhones had a lot of chips in them. They had one main chip called an SoC.
The second book I wrote is actually on SoCs. It's called Mobile Unleashed. The first half of the book is a look at the technology that ARM provides - there's a company, ARM, I'm sure you've heard of them. So I did a history of ARM, and how they brought their architecture to the industry and, you know, it was a calamity of errors actually.
And then the second half of the book looks at three different companies - Samsung, Qualcomm and Apple - and how they became semiconductor manufacturers and how they came to dominate the SoC market. Apple's the best story. I grew up with Apple, they made these goofy computers - all the way back when, the Apple II and the Macintosh, and they were more toys - but Apple decided they were going to build their own semiconductors, and they were going to make portable devices. So they made the iPod, the iPad and the iPhone.
MLID: Well, I mean, that's a thing I've started to argue low-key in some of my podcasts - when you see Apple hiring, you know - people say "Well Apple's not Intel or AMD," and it's like well... until they just hire people that used to work at AMD or Intel, right? They can hire those people, guys, and as those teams continue to grow and they design their own chips... Yeah, I dunno. I'm not so sure Apple isn't going to be thought of as a silicon company soon.
DN: The problem we have now, just to bring this up quickly, is: TSMC really is the only foundry out there now.
MLID: Yeah, I see that becoming a huge problem for competitiveness in the next couple years.
DN: It's not good, only having one manufacturing source for leading edge chips. And, you know, I'm a fan of TSMC - they actually have a very good business model - but if they decided to be evil, it could be a real problem.
Google's mission statement in the beginning was "Don't Be Evil". Clearly they threw that out.
MLID: They said "Maybe not anymore..."
DN: "...maybe we can be evil if we make a lot of money." Luckily TSMC is a very moral, very friendly company - they base their business model on customers. You have to realize, TSMC doesn't design semiconductors.
MLID: And that's a decision, right. A decided move that "We're just never going to do that."
DN: Yeah. And if you don't design the semiconductors, it's hard to manufacture them because you don't know what's coming down the road. The ability to partner intimately with key designers of semiconductors like AMD, Apple, some of these other companies - gives TSMC the knowledge and the future ability to create these chips that don't exist today, that they're going to have to invent new technologies and solve new problems.
As it turns out, they have the best of both worlds, because they have access to hundreds of companies and hundreds of different types of designs and they'll see every potential problem sooner than everybody else. Whereas Intel just makes one type of product, really, or two. That limits their focus and their ability to investigate.
The other thing is that it's just pure math. The ecosystem around TSMC - its IP, its EDA, its customers - and these people spend trillions of dollars in R&D doing this, and they share all of this knowledge with TSMC. If you're making your own chips, then you don't have that big of a budget, right.
MLID: Yeah. Looking at TSMC, I would say the one thing that is hopeful - it's something that I've been covering on my channel - is how much of NVIDIA's next graphics card lineup is going to be... For the past few generations, they will make some of their lower end chips on Samsung, and it seems like this time at least most of the gaming lineup, if not the entire beginning of the gaming lineup at least for this year, is made on a Samsung 8nm - which is really just a refined 10nm node.
While there may be some performance loss versus if they were on the latest TSMC 7nm that AMD is going to use, at the same time at least that's money going to hopefully making sure Samsung can keep up with TSMC because really Samsung's a full node behind TSMC at this point. When I look at their 5nm feature set... You can kinda argue, right, that it's around TSMC 7nm EUV? In reality it seems to be even worse than that, despite them calling it 5nm. I'm hoping at least Samsung can at least get some larger partners now that, while they may be at a disadvantage, hopefully the cost makes it so it's worth it and that Samsung can keep up.
'Cuz Intel isn't.
DN: Well... you know, I've worked with Samsung and these guys for thirty years, and really it's a cultural business model issue. Samsung is really, really keen on being the first to new technologies. It's part of the culture of the company, and I'm sure the CEO says "Okay, how many patents do you have, how many firsts do you have," etcetera etcetera, but just because you're first to a technology doesn't mean you can actually manufacture it.
Samsung has had yield problems - serious, serious yield problems - throughout their history because they were first to a node, but TSMC is always first to high volume manufacturing, so you really have to separate the two. The first to announce going someplace, and then the first one to ship. The problem TSMC has is that they have some very big customers, and Apple is the game changer. Apple needs a new chip every year for their iProduct - iPod, iPad...
MLID: They will pay for it, yeah.
DN: And they write some very big checks.
Back in the day, we didn't release a semiconductor process until it was done. We would put up these specs and we would say "Hey, Moore's Law. This is the process node. We need a full node transition. This is what we need." We would hold it until we met those specs, and sometimes it would take two years, sometimes it would take three years, sometimes it would take four years. That's the history and tradition of the semiconductor industry. For companies like Intel, they need huge yields. They're making a lot of chips, and if they're not yielding, they're going to lose money.
TSMC started doing these things called half steps. We used to do half-nodes a long time ago. You would take a 65nm and then you'd shrink it and make it tighter because you had manufacturing experience with it and then you could come out with a tighter, cheaper version. We called those half-nodes. It's not something we celebrated, because it wasn't a big deal, and we didn't really have clever names for them. It was just a better version of the previous process.
Apple and TSMC changed that. Apple wants a new chip every year, a new process every year, so TSMC does these half steps. 10nm to 7nm to 6nm - those are all tiny steps. It's using the same fab, the same fab equipment - they're just making improvements. It's a different naming scheme, it's a different methodology.
Intel are still trying to do the Moore's Law thing, where 10nm to 7nm is going to be twice the transistor density. Then they said "Maybe we'll do 1.7 density," where TSMC is doing a fraction of that. If you look at the TSMC numbers, they don't compare half step to half step.
MLID: No. You always have to reference a few nodes back for each one.
DN: For example, they just did this at their conference - they're comparing 7nm to 5nm, and that's a good jump. You get double digit performance increase, power... but they left off 7nm+ and 6nm, because the difference between 6nm and 5nm is single digit. But, you know, this is what the media wanted to hear - Moore's Law is still alive, blah blah blah.
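The arithmetic behind that point can be sketched in a few lines: several single-digit half steps compound into a double-digit gain when you compare across two or three of them at once. This is only an illustration. The function name `compound_gain` and the per-step percentages are assumptions made up for the example, not figures from TSMC or from the conversation.

```python
def compound_gain(step_gains):
    """Return the total fractional gain from applying each step gain in sequence."""
    total = 1.0
    for gain in step_gains:
        total *= 1.0 + gain
    return total - 1.0


# Hypothetical per-step improvements (NOT real foundry numbers):
# three consecutive half steps, each a single-digit gain.
half_steps = [0.07, 0.06, 0.08]

for gain in half_steps:
    print(f"single step: {gain:.0%}")

# Comparing the first process to the last one skips the steps in between,
# so the headline number is the compounded gain.
print(f"compared across all steps: {compound_gain(half_steps):.1%}")
```

With these made-up inputs, each step is under 10%, while the comparison across all three comes out above 20%, which is the shape of the marketing comparison described above.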
The benefit of doing that, and this is what really is important in semiconductors - is the learning curve. When you're taking little steps, you can really accelerate the learning curve and you can find out what the problems are with the equipment, with the designs - there's so many people involved in this recipe, it's just daunting. So taking small steps, as it turns out, now that we know, is just brilliant. TSMC is the only one to get EUV yielding in high volumes.
It's funny, there was a question from the media - we did a virtual event with TSMC, which is the first time, and they had Q&A and stuff like that - and one guy says "How come you stayed with FinFETs at 3nm while Intel and Samsung are going to a new technology called Gate All Around?"
MLID: I think you've already answered your own question. Go on, though [laughs].
DN: Actually, the answer to that question is "Because Apple said so". Apple can't take the risk of not yielding, and we have not seen Gate All Around yield yet. It's in the labs, it looks good, I'm sure TSMC will do it, but they did the same thing with EUV. They brought EUV in at 7nm. So here, you already have 7nm yielding, everything's great. Then you introduce EUV, and you only do a few layers. You do a third EUV and see how that goes. With 6nm, you do half EUV. And then with 5nm, you do full EUV. That cautious approach, as it turns out, is the way to go. It's not an ego brag, "Real Men Have Fabs", "Real Men Do Gordon Moore's Law", you know...
MLID: [laughs] Well, Real Men actually manufacture products, I would say.
DN: Well, I would say Real Men actually make money manufacturing products, because some companies... don't. Samsung - the semiconductor business is such a small piece of their company that you don't even see it in their reports. You have no idea if they're making money on their foundry business or not. Even if they are, it's a very small amount of money - and it's all about yield.
MLID: I remember TSMC specifically giving some talks around the... prolonged, shall we say, 28nm era where there was all this hype that there was going to be a 20nm GPU and that just didn't happen, they just made 28nm again like three times, but then once TSMC hit 16nm they said "No, we're really going to have a 12nm node, we're really going to have a 10nm node when we say we are. We've learned how to... the things that were preventing us from 28 to 16 - and really, from 28 to even 20nm - we've learned those lessons".
It sounds like at a macro level the lesson they learned is to stop with this ridiculous everything-has-to-get-ten-times-better mentality and just take half-steps over and over and over. Eventually, like you say - you can make fun of them for not comparing 5nm to 6, but at the end of the day... yep, new node coming out on time, 20, 30% better than before, as usual.
DN: Yeah. You know... for us semiconductor guys, we know the difference between this stuff. When you design a semiconductor you get something called a PDK. It's called a Process Design Kit. You get these PDKs before the process is done, and what it is is it's simulation models and complete descriptions of the process. We don't really read an article and say "Ooh, I want this process because it's going to be better," we do a thorough analysis, so we actually know what all this stuff is.
But as a consumer, do you want a chip called 10++ or do you want a chip called 7nm? Realistically, they're kind of the same, but it's just part of marketing and Intel really has to get with the program and start matching the other foundries.
You brought up 28nm - that was an interesting node. That was actually a tipping point for the semiconductor industry. There were two ways to go on 28nm, and this was the gate-first, gate-last controversy - I don't know if you are familiar with it. It's high-κ metal gate technology, we introduced a new technology at 28nm and either you did the gate first - how you put the gate down on the die - or you do the gate last.
Intel was the first one to high-κ metal gate and they chose one direction, and TSMC - they were a year or two behind Intel, they chose the same direction as Intel - but the other foundries, Samsung, IBM, UMC, Chartered - they chose a different implementation, and that meant you could not take a design from TSMC and manufacture it elsewhere.
What happened was, and this is a big historical note, TSMC - the other 28nm implementations didn't yield, so TSMC was the only one that had 28nm!
MLID: The designs aren't compatible, and the others aren't yielding - you'd be a moron to not just - so everyone flocked to TSMC then.
DN: Well the problem is we're on allocation and this is the problem that everybody fears - you can't get enough chips no matter what, because they just didn't have enough. The other implementations, the other companies were supposed to make up half of the industry because TSMC has maintained a 50% market share, plus or minus. All of a sudden you lose half your capacity, so NVIDIA couldn't get enough chips, because TSMC said "Hey, listen, we need to be fair and share", and, ah... [laughs] other companies weren't so happy with that, they said "Hey, we're on your fab, we need more chips!"
Anyway, that's how it started. After 28nm, came FinFETs. That is a huge difference, because FinFETs are not created equal. That also changed our naming scheme. 28nm - we used to name our processes based on the length of a transistor, so they were two-dimensional. 28nm - that's how long they were. That was a very easy decision on how to name processes. With FinFETs, they're actually two-dimensional [sic], they have fins. The bigger the fin, the smaller the base, so you really can't do the naming scheme. So now we just make things up.
What's funny is that TSMC was a little bit more honorable in the beginning. They looked at Intel's 14nm and they said "Okay, well we're not as dense as that, so we're going to call ours 16nm." So TSMC called it 16. But Samsung said "Hell no, we're going to call ours 14."
MLID: And so did GlobalFoundries.
DN: Yup. And, well... actually, GlobalFoundries used the Samsung process.
MLID: Yeah, they had a co... yeah.
DN: So it was the same... But you know what happened, GlobalFoundries failed. Their 14 didn't work, so they had to license it from Samsung. The point is that TSMC and Samsung had the same process but different names, and TSMC had to spend a lot of their day explaining why theirs was just as good as Samsung's...
MLID: It was actually better, typically.
DN: Yeah, even though it was named 16 and Samsung's was 14. TSMC learned a valuable lesson - names really need to match for customers' peace of mind. You don't want to extend the sales cycle or the marketing cycle explaining the technical details why your process is better even though it's got a higher geometry name. That's something Intel hasn't learned yet, and they're really going to have to get with it if they're going to be a mainstream semiconductor manufacturer.
MLID: I think there was a bit of time where the prominence of Intel... the amount of people I saw online, gamers going "Well, Intel 14nm is real 14nm" and it's like... it's the most real, none of these are real... how we used to name them, anyways.
The funny thing that I think might come up, and we're getting a little bit into rumor territory here - I believe that most of NVIDIA's gaming lineup of Ampere is going to be 8nm. I've been told that the reps for the Quadro side of NVIDIA are saying that they will be manufacturing some of their top cards on 7nm. Obviously the A100 is TSMC 7nm.
Then there's also rumors of NVIDIA trying to get ahold, or trying to effectively utilize also Samsung's new - and I put big quotation marks here, right - 5nm EUV node. But the funny thing is that, again, "5nm" node from Samsung, despite having a similar density to TSMC's 7 EUV, I'm pretty sure it has quite a bit worse power characteristics. It would be funny, in my opinion, if you had a lineup that they put 5nm on the box and then they underperform the high-end product that says 7nm from a different fab. That would just go to show you how much of it is just straight-up marketing now.
DN: The thing is that we write in detail on this. We have a couple of process experts, and they look at the transistor density as published in technical papers. We have a lot of conferences and you have to publish your technical paper and it has to be correct, there's a lot of vetting that goes on. The tradition is to use an SRAM block, and it's a very tight embedded memory block and we say "Okay, you can get this many transistors in an SRAM, so that is your density". And that's fine, but what they don't say is "Oh, well, that SRAM only has 10% yield", so you really can't design that. You have to give much more spacing, you have to do this and do this. It's impossible to tell which process is good for a GPU and you can't compare it to SRAM or a CPU or anything else because it's really a different animal.
So just saying "we're better because we're on 5nm"? Wow. Not even close. You have to wait until the silicon comes out and run benchmarks because you can't actually design to the specifications that they release in these papers and talk about, because that's best case scenario with a tail wind.
And so the thing is that, as I said, Samsung is a little less stringent with their marketing vetting. They always want to be first and the best and it doesn't always end that way.
MLID: Well first isn't the same as the best, right?
DN: Well... best on paper, I guess you call it. But what you have to do is look at the Apple processor year over year, look at the GPUs year over year.
The reason why NVIDIA is, just to give you the background - NVIDIA CEO, Jensen, was very good friends with Morris Chang, and they started out together. NVIDIA was one of their first customers. At 28nm, NVIDIA wanted more die and Morris said "No, you signed a wafer agreement."
We do these things called wafer agreements, they're signed a couple years in advance that says "Hey, we're going to buy this many wafers, if we buy more we'll pay more, if we buy less, we'll make it up to you." I used to do wafer agreements there. They're very complicated.
MLID: And it's per wafer, it's not per chip.
DN: That's correct. And we used to do wafer agreements on good die, but we don't do that anymore because TSMC would say "Hey, you have to design to these design rules to get the optimum die," and NVIDIA was the biggest offender, they said "No, we want to put more stuff." And then they wouldn't yield and they'd say "Hey, TSMC, it's your fault." Well, not really.
MLID: And then you'd get a Fermi that just doesn't work.
DN: Yeah... yeah. I was involved in a couple of these, and I was like "You guys didn't even obey the design rules!" You get these things called waivers, it's "Well, we want to cheat, so here's a waiver". It's like "Well, okay. It's up to you, but we're not guaranteeing you yield." And then they don't yield and then they blame other people.
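The difference between paying per wafer and paying per good die comes down to who absorbs the yield loss. A minimal sketch, assuming a flat wafer price and a simple yield fraction: the function name `cost_per_good_die` and every number below are hypothetical, chosen only to show the arithmetic, and are not taken from any real wafer agreement.

```python
def cost_per_good_die(wafer_price, dies_per_wafer, yield_fraction):
    """Effective cost of each working die when the customer pays per wafer."""
    if dies_per_wafer <= 0:
        raise ValueError("dies_per_wafer must be positive")
    if not 0.0 < yield_fraction <= 1.0:
        raise ValueError("yield_fraction must be in (0, 1]")
    good_dies = dies_per_wafer * yield_fraction
    return wafer_price / good_dies


# Hypothetical inputs (NOT real pricing): same wafer price and die count,
# one design that follows the design rules and one that relies on waivers.
wafer_price = 10_000.0
dies_per_wafer = 100

for label, yield_fraction in [("within design rules", 0.8), ("with waivers", 0.4)]:
    cost = cost_per_good_die(wafer_price, dies_per_wafer, yield_fraction)
    print(f"{label}: {cost:.2f} per good die")
```

Under a per-wafer agreement the foundry is paid the same either way, so a design that halves its own yield doubles its own cost per working chip. Under a good-die agreement that loss would land on the foundry instead.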
28nm - that happened and NVIDIA couldn't get enough chips, and then the other 28nms didn't yield so that made it worse, so NVIDIA started going to multiple sources, but they were sole-source at TSMC since the beginning of time. As I said, if everybody is sole-sourced to TSMC - that's not good. What happens if Taiwan...
MLID: Well, yeah, but I don't think everybody is going to be, though. I hear that TSMC is fed up with NVIDIA as much as they used to be friends, that they are just so fed up with - and this is just what I'm told, I can't pretend to speak for everyone who works at any of these companies.
You gave an example, any time something goes wrong with an NVIDIA architecture it's the foundry's fault, and NVIDIA's very demanding without... I'm sure they pay well, but without writing Apple's checks. I hear they are very hard to work with sometimes, and from the sounds of it they're now pushing pretty hard to become big partners and prop up Samsung. I guess we'll see how well that works out.
DN: So here's the problem. With TSMC, they have an inner circle of customers that they rely on for R&D. Ever since the separation of processes - where you can no longer manufacture a chip on everybody's process, and this happened at 28nm and then at FinFET - there's a lot of secrets involved, and in order to be in the inner circle at TSMC you have to sign some very big NDAs. You cannot manufacture at other companies for 𝑥 amount of time. You have TSMC's secrets, so you can't go over to Samsung and manufacture at Samsung because TSMC's secrets are going to be involved in that. It's just something that has evolved over the years and TSMC is really keen on protecting their secrets so that people don't take their chips away and manufacture them elsewhere.
What happened was NVIDIA was in the inner circle, of course, for years and years. And if you remember, I don't know if you noticed, but now they have another inner circle player called AMD.
MLID: Yeah. They're big, big, big old friends with them.
Full conversation continued in the downloadable Transcript!