Intel Core M - Broadwell-Y 14nm Processor Overview
If you paid close attention to the announcements coming out of Computex 2014, you'll recall that Intel announced the Core M processor series. The Intel Core M processors are based on the 14nm Broadwell microarchitecture and will be the most power-efficient processors offered by Intel in this generation. Before the Computex announcement these processors were known under the code name Broadwell-Y. The Intel Core M series has been designed to fill the gap between the Atom series and the full-fledged Core series, which is a bit too powerful and power hungry to serve in portable 2-in-1s, ultra-thin tablets, ultrabooks and other handheld devices.
As one would expect, the Intel Core M (Broadwell-Y) series will be dual-core processors that feature Hyper-Threading (4 total threads), 2MB of cache per core, Intel HD Graphics (GT2 GPU), support for up to 8GB of low power dual-channel DDR3 memory and all the tweaks and improvements that come along with the move to the 14nm process. Intel hasn't disclosed any specifics on clock speeds, performance numbers or pricing, but they did mention that the Intel Core M has 20% more execution units on the next generation graphics, which means there are 24 EUs on Intel Core M. There are also plenty of rumors floating around online that Intel Core M will have a design power of 4.5 - 11.5 Watts. The move to 14nm means that Intel can provide greater performance-per-watt at lower TDPs, which opens the door for thinner solutions (less than 9mm thick) and even fanless designs. As you can see from the slide above, Intel has made a slew of changes to the processor and not just one single thing.
One of the demo machines that we were shown was a tablet called Llama Mountain. We weren't allowed to take photographs for some reason, but Llama Mountain is a proof of concept tablet made by Intel to show off what the Intel Core M processor can do. The tablet was just 7.2mm (0.28-inch) thick, which is thinner than the Apple iPad Air at 7.5mm! Intel claimed that the Windows 8.1 tablet would be able to outperform a basic Intel Core processor powered notebook from four years ago that was 26mm thick, with tremendous savings in size, battery life and weight. Intel mentioned a 4x reduction in TDP, a 7x graphics improvement, a 2x improvement in IA core performance, half the battery size and double the battery life. The Llama Mountain reference design weighs just 1.47 pounds yet can still last up to 8 hours thanks to the 32Wh battery the housing hides. This specific tablet design will never come to market, but it shows that some really exciting tablets should be hitting the streets just before the 2014 Holiday shopping season gets into full swing.
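As a rough sanity check on those battery figures, the average platform power implied by a 32Wh battery lasting 8 hours can be worked out directly. This is only a sketch: the function name is ours, and it assumes the full rated capacity is usable, which real devices do not quite manage.

```python
def average_power_watts(battery_wh, runtime_hours):
    """Average draw implied by draining a battery of this capacity over the runtime."""
    return battery_wh / runtime_hours

# Llama Mountain figures quoted above: 32Wh battery, up to 8 hours of runtime.
llama_mountain_w = average_power_watts(32, 8)
print(f"Implied average platform power: {llama_mountain_w:.1f} W")  # 4.0 W

# "Half the battery size and double the battery life" implies a quarter of the average draw.
relative_draw = 0.5 / 2.0
print(f"Relative average draw vs. the older notebook: {relative_draw:.2f}x")  # 0.25x
```

An average of roughly 4 Watts for the whole tablet (display included) gives a feel for how little power budget is left for the SoC itself.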
Intel's Core M family will offer around a 20 to 40 percent performance improvement and a greater than 2x reduction in TDP compared to the current Haswell-Y series. The Intel Core M has a 60 percent lower SoC idle power and that is key for the increased battery life and fanless design wins. One of the other key improvements, which we will cover in greater detail on the next page, is the fact that the Broadwell-Y processor has a 50 percent smaller package size (XY) and is even 30 percent thinner than Haswell-Y. This means that the processor will take up much less space inside devices and that is one of the key enablers to the emergence of 9mm and thinner devices (not including the keyboard if one is needed). As Intel started building Broadwell they had the goal and vision to bring the promise of the full Core brand in a fanless solution. Each engineer on the team was working to bring the power down considerably without giving up on the performance improvements that come with each new generation.
If you have to summarize the Intel Core M processors at the highest level, you can do it in three points.
- Intel is enabling fanless designs with form factors under 9mm in thickness with a 14nm processor
- Intel did not compromise on performance even with a greater than 2x TDP reduction
- Intel considerably reduced the power consumption and has increased battery life
Let's take a closer look at the 14nm process that is being used on Broadwell.
Intel 14nm Technology Details - Transistor Fins, Interconnects and More
For months we read stories online about how Intel was having a tough time ramping up production of their true 14nm technology. The good news is that Intel's 14nm technology is qualified and in volume production and has been for some time. With Intel releasing Broadwell before the holidays, it shouldn't come as a shock to anyone that Intel has dialed in the 14nm process with 2nd generation Tri-gate (FinFET) transistors. When Intel unveiled the 22nm processors, they started using the first generation of Tri-gate transistors. While the first generation was found to be good, there is always room for improvement, and the upcoming 14nm Broadwell processors bring the second generation with them.
The slide above shows a drawing of a 14nm Broadwell transistor fin next to the 22nm Haswell transistor fin. Intel changed the fin profile in this generation of FinFET transistors after discovering that taller and thinner fins had many added benefits. The fin pitch has gone from 60nm to 42nm, which is a 0.70x scale factor and in line with Moore's Law. This is critical as it allows for more fins per unit length. The key reason for the taller, thinner fins was to increase drive current and get better overall performance out of each fin. That in turn allowed Intel to depopulate fins, since each fin is capable of delivering more transistor strength, and that also reduced capacitance, which is key for power savings. The transistor gate pitch went from 90nm to 70nm, a 0.78x scale factor, and the interconnect pitch (very important for overall chip scaling) went from 80nm to 52nm, a 0.65x scale factor. Intel believes they have developed a true 14nm technology with good dimensional scaling.
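The scale factors quoted above are just the ratio of the new pitch to the old one. A minimal sketch that reproduces them from the pitch figures in the text (the dictionary and function names are ours):

```python
# Pitch figures in nanometres as quoted in the text: (22nm node, 14nm node).
PITCHES_NM = {
    "fin pitch": (60, 42),
    "gate pitch": (90, 70),
    "interconnect pitch": (80, 52),
}

def scale_factor(old_nm, new_nm):
    """Linear scale factor going from the old node to the new one."""
    return new_nm / old_nm

for name, (old_nm, new_nm) in PITCHES_NM.items():
    print(f"{name}: {old_nm}nm -> {new_nm}nm = {scale_factor(old_nm, new_nm):.2f}x")
# fin pitch: 60nm -> 42nm = 0.70x
# gate pitch: 90nm -> 70nm = 0.78x
# interconnect pitch: 80nm -> 52nm = 0.65x
```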
The image above is a cross section perpendicular to the fins (through the gate dielectric) from an electron micrograph. You can see the 14nm fin profiles are narrower and taller. The metal gate is sitting on top of the transistor. To summarize, the taller and thinner fins allow for increased drive current and performance. Intel was also able to reduce the number of fins, allowing for improved density and lower capacitance.
When it comes to interconnects, the cross section above shows that Intel has gone from an 80nm minimum pitch to a 52nm minimum pitch. This 0.65x scaling is said to be better-than-normal interconnect scaling. SRAM memory cells have 0.54x area scaling from 22nm to 14nm, so the 14nm design process with 2nd generation Tri-gate transistors has led to industry-leading densities that will likely take other companies years to match.
As technology advances, there is no doubt that we are going to see performance gains. That is pretty much par for the course when new processors are released by Intel. The trick is to balance the performance gains while maintaining low power usage. By taking advantage of the newest technologies available, Intel is able to increase the performance and lower the active power. Performance per watt continues to go up with this technology at a rate of 1.6x per generation.
The slide above plots gate pitch times metal pitch, which is a valid metric for logic area scaling over various generations. As you can see, through the 14nm node Intel has kept to a fairly steady ~0.53x per generation reduction when it comes to logic area scaling, and this helps show that Moore's Law is still alive and well. Other companies have had a lead in density, but they released their processes later in time. Intel believes that will be changing on the 16nm/14nm nodes as other companies will have to pause to develop FinFETs. Intel thinks it will soon be the leader when it comes to logic area scaling, both on the node level and with respect to calendar time, because the other companies have to stop and figure FinFETs out. Intel will be shipping their 2nd generation FinFET before the others ship their first. Intel clearly stated that they have 14nm in volume production and the next thing they are working on is 10nm.
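To see how the gate pitch times metal pitch metric works out for this generation, here is a small sketch using the 22nm and 14nm pitches quoted earlier in the article. The function name is ours, and we are assuming the interconnect pitch is the "metal pitch" the slide refers to.

```python
def logic_area_scale(old_gate_nm, old_metal_nm, new_gate_nm, new_metal_nm):
    """Ratio of (gate pitch x metal pitch) between two process nodes."""
    return (new_gate_nm * new_metal_nm) / (old_gate_nm * old_metal_nm)

# 22nm: 90nm gate pitch, 80nm metal pitch. 14nm: 70nm gate pitch, 52nm metal pitch.
scale_14nm = logic_area_scale(90, 80, 70, 52)
print(f"22nm -> 14nm logic area scale: {scale_14nm:.2f}x")  # 0.51x
```

By this measure the 14nm step comes out slightly ahead of the ~0.53x per generation trend.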
Intel has achieved better than expected area scaling due to the use of advanced double patterning techniques. The ratio has been a fairly steady line over the past six generations, but Intel is seeing better than normal scaling with the move to 14nm. This is critical as you want to use less area per transistor. Labor costs, on the other hand, are rising due to the number of added masking steps. At the end of the day, when you look at the cost per transistor, Intel is continuing to scale down and this generation is actually slightly better than the normal trend-line.
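The trade-off described here can be sketched as a simple product: relative cost per transistor is relative cost per unit area multiplied by relative area per transistor. Intel did not give us cost figures, so the cost growth values below are purely hypothetical placeholders, and the area scale is our own estimate derived from the quoted gate and interconnect pitches.

```python
def cost_per_transistor_scale(cost_per_area_scale, area_per_transistor_scale):
    """Relative cost per transistor = relative cost per area x relative area per transistor."""
    return cost_per_area_scale * area_per_transistor_scale

# Area scale estimated from the quoted pitches: (70 * 52) / (90 * 80).
area_scale = (70 * 52) / (90 * 80)

# HYPOTHETICAL cost-per-area growth factors, for illustration only (not Intel data).
for hypothetical_cost_growth in (1.0, 1.2, 1.5):
    result = cost_per_transistor_scale(hypothetical_cost_growth, area_scale)
    print(f"cost/area x{hypothetical_cost_growth:.1f} -> cost/transistor x{result:.2f}")
```

The point of the sketch is only that cost per transistor keeps falling as long as the area shrink outpaces the rise in cost per unit area.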
Although the 14nm technology being used for the Broadwell SoC parts is quite new, the yields are doing well. Intel has stated that the 14nm product yield is in a 'healthy range' with more improvements coming down the pipe. Intel currently has two fabs that are capable of the 14nm manufacturing process: one is located in Oregon and the other in Arizona. There is also a third fab in the works in Ireland that will be able to produce 14nm wafers, and it should be up and running at some point in 2015.
Broadwell Architecture - 2nd Gen FIVR, 3DL, PCH
We were able to chat with Stephan Jourdan, Intel Fellow and Director of SoC Architecture at Intel, about the microarchitecture changes that were made to Broadwell-Y.
The goal with Broadwell-Y was to make an 8-10mm thick device with a 10.1" display that was fanless with a 3-5 Watt SoC. Intel needed to optimize the CPU, GPU and PCH and move to the 14nm process to create a sub-5 Watt SoC that could fulfill such a role.
One of the key features of Broadwell-Y is that the SoC is significantly smaller than Haswell U/Y. The Broadwell die is about 0.63x the size of the Haswell die, going from 130mm^2 to 82mm^2. The overall package size for Haswell U/Y is 40x24x1.5mm and Broadwell will be 30x16x1.04mm. Broadwell-Y was originally designed to be 1.1mm in thickness, but Intel was able to beat that goal and get it down to 1.04mm. On Haswell the packaged core was 400um in thickness, but on Broadwell it is just 200um.
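Those dimensions line up with the "50 percent smaller package (XY)" and "30 percent thinner" claims from the previous page. A quick sketch that works out the ratios from the quoted figures (the variable and function names are ours):

```python
# Figures quoted above for Haswell U/Y and Broadwell-Y.
HASWELL = {"die_mm2": 130, "package_mm": (40, 24, 1.5), "packaged_core_um": 400}
BROADWELL = {"die_mm2": 82, "package_mm": (30, 16, 1.04), "packaged_core_um": 200}

def footprint_mm2(package_mm):
    """XY footprint of a package given (length, width, height) in mm."""
    length, width, _height = package_mm
    return length * width

die_ratio = BROADWELL["die_mm2"] / HASWELL["die_mm2"]
footprint_ratio = footprint_mm2(BROADWELL["package_mm"]) / footprint_mm2(HASWELL["package_mm"])
height_ratio = BROADWELL["package_mm"][2] / HASWELL["package_mm"][2]
core_ratio = BROADWELL["packaged_core_um"] / HASWELL["packaged_core_um"]

print(f"Die area:         {die_ratio:.2f}x")        # 0.63x
print(f"Package XY area:  {footprint_ratio:.2f}x")  # 0.50x
print(f"Package height:   {height_ratio:.2f}x")     # 0.69x
print(f"Packaged core:    {core_ratio:.2f}x")       # 0.50x
```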
The PCH is still made on the 32nm process, which is why the PCB size hasn't changed much between Haswell and Broadwell. Intel has made changes to the PCH, though, so it should still be considered a new design.
Intel is introducing a 2nd Generation Fully-Integrated Voltage Regulator (FIVR) and 3DL for increased power delivery efficiency and performance with Broadwell-Y. The FIVR design requires external inductors to be placed on the bottom of the package, and the height of the inductors was keeping Intel from going as thin as they wanted. Intel engineers and architects came up with something called the 3DL PCB. It is basically an external PCB for the inductors that mounts below the SoC and sits in a cutout in the board that the SoC is attached to. This cuts the z-height of the Broadwell-Y SoC to almost half of what it was on Haswell-Y. This design was a major breakthrough when it came to thickness. Intel also improved the low voltage performance of FIVR by adding something called Dual FIVR LVR Mode which, from what we have been told, allows for the bypass of FIVR if needed.
Fueled by the desire to go fanless, the team at Intel also needed to play around with the way Turbo Boost is handled. Intel now has three power limit settings that allow for safe Turbo Boost operations. Haswell had just two power states. The new PL3 state allows for greater turbo boosting than ever before, but only for a very limited period of time. Intel said that the PL3 mode will last just milliseconds and is mainly there for battery protection as you wouldn't want a SoC to try and pull more power than any given battery can safely deliver.
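To illustrate the idea of three tiered power limits, here is a toy model in which each limit has a power cap and a time budget, and the SoC may only use a higher limit while its burst is still inside that limit's budget. Intel did not share wattages or durations with us beyond saying PL3 lasts milliseconds, so every number below is a hypothetical placeholder, and real hardware enforces these limits with more sophisticated power averaging than this.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PowerLimit:
    name: str
    watts: float        # HYPOTHETICAL power cap
    max_seconds: float  # HYPOTHETICAL time budget; infinity means sustained

# Ordered from the most aggressive limit to the sustained one.
LIMITS = [
    PowerLimit("PL3", 12.0, 0.010),        # very short peak, milliseconds only
    PowerLimit("PL2", 7.0, 10.0),          # short-term turbo burst
    PowerLimit("PL1", 4.5, float("inf")),  # sustained limit
]

def allowed_limit(burst_seconds):
    """Return the most aggressive limit still permitted after bursting this long."""
    for limit in LIMITS:
        if burst_seconds <= limit.max_seconds:
            return limit
    return LIMITS[-1]

for t in (0.005, 1.0, 60.0):
    limit = allowed_limit(t)
    print(f"after {t:g}s of boosting: {limit.name} ({limit.watts} W cap)")
```

With these placeholder values a 5 millisecond spike may run at PL3, a one second burst falls back to PL2, and anything sustained settles at PL1.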
Intel also improved both Intel HD Graphics performance and efficiency. Broadwell-Y uses 24 Execution Units (EUs), which is 20 percent more than the 20 EUs found in Haswell SoCs. Intel is still happy with this scalable architecture and is continuing to support gamers with quarterly driver releases and support for DirectX 11.2 and OpenGL 4.3. On the GPU compute side you'll have OpenCL 1.2 and 2.0 with shared virtual memory support. Intel has also doubled the video quality engine throughput and improved the quality and performance of Intel Quick Sync Video Technology. The Intel HD Graphics in Broadwell will support 4K and Ultra HD displays on one monitor, but we aren't sure about support for multiple 4K displays at this time. Native 4K wasn't supported on the last generation Haswell Y or U processors, as they supported displays up to 2560x1600. Intel said that they are able to support just 30 Hz refresh rates at 4K, so this is a baby step in the right direction. We were also told that H.265 video is supported and that 4K 30FPS encoding isn't a problem.
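For a sense of scale, the raw pixel rate of 4K at 30 Hz can be compared with the previous 2560x1600 maximum. This is a back-of-the-envelope sketch: it takes 4K Ultra HD to mean 3840x2160, assumes the older panels ran at 60 Hz (Intel did not state a refresh rate for them), and ignores blanking intervals. We are not claiming this is the reason for the 30 Hz limit.

```python
def pixel_rate(width, height, refresh_hz):
    """Pixels per second the display pipeline must deliver, ignoring blanking."""
    return width * height * refresh_hz

uhd_30 = pixel_rate(3840, 2160, 30)    # 4K Ultra HD at the 30 Hz Intel quoted
wqxga_60 = pixel_rate(2560, 1600, 60)  # previous maximum; 60 Hz is our assumption

print(f"3840x2160 @ 30 Hz: {uhd_30:,} pixels/s")   # 248,832,000
print(f"2560x1600 @ 60 Hz: {wqxga_60:,} pixels/s")  # 245,760,000
print(f"Ratio: {uhd_30 / wqxga_60:.2f}x")           # 1.01x
```

Under those assumptions the two modes push almost the same number of pixels per second.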
Intel made significant changes to the PCH with regards to power consumption when Haswell was released and everyone was impressed with what they were able to accomplish. Many weren't sure how many more optimizations could be done to reduce power consumption on the PCH for Broadwell-Y, but it appears there was still more that could be done. Intel was able to add more power gating and further reduce idle PCH power consumption by roughly 25 percent from the prior year. Intel was also able to reduce active power use by about 20 percent compared to the Haswell PCH-LP. Intel has also included hardware, firmware and software updates to introduce or improve the monitoring and reporting of power usage in the PCH. Intel hasn't revealed specifics on all the Broadwell PCH-LP features just yet, but did note that the Audio DSP is being upgraded with increased SRAM and higher MIPS. This was done to improve power, but also helps with post processing and voice commands that can wake up a device with a keyword or phrase.
The slide above shows the only look at Broadwell performance, so be sure to take a closer look at it. Intel claims that the new 'converged-core' design in Broadwell will offer greater than five percent higher Instructions Per Cycle (IPC) when compared to Haswell. It doesn't appear that there are any major instruction set changes from Haswell, but there were obviously a number of performance optimizations to get the mentioned gain. Some of the changes include a larger out-of-order scheduler, faster store-to-load forwarding, a larger L2 TLB (1k to 1.5k entries), a dedicated 1GB page L2 TLB (16 entries) and more. Intel noted that the performance features were designed at a ~2:1 Performance:Power ratio, which is more aggressive than the 1:1 ratio that was targeted in previous processors.
It looks like Intel has been busy innovating on all fronts and the move to 14nm along with the introduction of Broadwell SOCs should be an exciting time for everyone. We can't wait to see products powered by Broadwell-Y before the holiday!