Original Link: https://www.anandtech.com/show/13735/anandtech-2018-in-review-gpus



2018 has nearly drawn to a close, and as we’re already gearing up for the event that kicks off 2019 for the industry – the mega-show that is CES – we wanted to spend one last moment going over the highs and lows of the tech industry in 2018. So, whether you’ve been out of the loop for a while and are looking to catch up, or just after a quick summary of the year behind and a glimpse at the year to come, you’re in the right place for the AnandTech Year in Review.

GPU Market: Boom, Bust, & Broken

For better or for worse, by far the leading story in the GPU space for 2018 is the latest rise and fall of the cryptocurrency markets. While the GPU space is by no means a stranger to the market-breaking effect that cryptocurrency mining has on GPU prices – we’ve been through this a couple of times now – the latest cycle was the biggest boom and the biggest bust yet. And as a result, it’s had immense repercussions on the industry that will play out well into 2019.

While the latest cycle was already well underway before 2018 started, it hit its peak right at the start of the year. Indeed, that’s when the Ethereum cryptocurrency itself – the predominant cryptocurrency for GPU mining – hit its all-time record closing price of $1,396/token. As a result, demand for GPUs for the first eight months of the year was unprecedented, which was a boon for some, and a problem for many others.


Ethereum: From Boom To Bust (Ethereumprice.org)

For the better part of three quarters AMD and NVIDIA could sell virtually every last GPU they produced, with some eager companies going straight to the source to try to secure boards and GPUs to scale up their mining operations. As a result, video card prices spiked – the limited number of cards that did make it to market were just as quickly picked up by miners – which benefitted the board vendors and GPU vendors greatly, but put a major kink in gaming. In fact, affordable gaming cards took the brunt of this demand, as their relatively high price/performance ratios meant that everything that made them good for gaming made them good for mining as well. When there’s money to be made, a Radeon RX 580 (MSRP: $229) will still find buyers, even at $370.

However, with every boom comes a bust, and the latest run-up in the price of Ethereum was no different. After peaking early in the year, the cryptocurrency’s price continued to trend downward over the rest of the year, eating away at the profitability of GPU mining. This was exacerbated by the introduction of dedicated Ethereum ASICs, which, although they required their own fab space and supporting components (e.g. RAM), alleviated some of the mining demand for video cards. Ultimately, Ethereum dropping below $300/token in August put the latest boom to an end, and in the last month prices have dropped below $100/token, while the profitability of mining even with the most efficient video cards has turned negative at times.
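To put some rough numbers on that swing, the back-of-the-envelope Python sketch below estimates the daily profit of mining on a single card. Every figure in it (the hash rate, power draw, payout rate per MH/s, and electricity price) is an illustrative assumption rather than a measured value, but it shows how a card that comfortably paid for its power at $1,396/token can slip underwater once the token falls to $100.

```python
# Back-of-the-envelope GPU mining profitability. All inputs are illustrative
# assumptions, not measured figures; the payout rate per MH/s in particular
# depends on network difficulty and moved around considerably over the year.

def daily_mining_profit(hashrate_mh: float,        # card hash rate in MH/s
                        power_watts: float,         # card power draw at the wall
                        eth_price: float,           # USD per ETH token
                        eth_per_mh_day: float,      # ETH earned per MH/s per day
                        electricity_usd_kwh: float  # electricity price per kWh
                        ) -> float:
    """Return the estimated USD profit per card per day."""
    revenue = hashrate_mh * eth_per_mh_day * eth_price
    power_cost = (power_watts / 1000.0) * 24 * electricity_usd_kwh
    return revenue - power_cost

# Hypothetical ~30 MH/s, 150 W card at $0.12/kWh
print(daily_mining_profit(30, 150, 1396, 5e-5, 0.12))  # boom-era token price: clearly positive
print(daily_mining_profit(30, 150, 100, 5e-5, 0.12))   # bust-era token price: negative
```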


CamelCamelCamel Price History for MSI's Radeon RX 580 Armor 8G

As a result, it’s only been in the last few months that video card prices have come back down to their reasonable, intended prices. Indeed, this year’s Black Friday to Cyber Monday period was especially robust, as RX 580s were going for as little as $169, a far cry from where they were 6 months earlier. The market may not be entirely fixed quite yet, but good video cards are finally affordable again.

With that said, as we’ve since learned from their recent earnings announcements, the break in video card prices isn’t just due to a drop in demand. Both AMD and NVIDIA ramped up their GPU production orders to try to take advantage of this extended boom period, and, judging from the inventory issues they’re now facing, both have been burnt by it. Hungover from the boom in GPU demand and the boost to revenue and profits that came from it, both companies are now in a small slump, as they built up sizable stockpiles of GPUs that came in right when demand really tapered off. As a result, both companies are trying to offload their inventory at a controlled pace, selling it in limited batches to avoid flooding the market and crashing video card prices entirely. The cryptocurrency market is all but impossible to predict, and hardware suppliers in particular are caught in a tough spot: do nothing and let cryptocurrency break the market indefinitely, or try to react and risk having excess inventory when the market needs it the least.

But for the moment, with most of the sanity restored to the video card market, there is a silver lining: cheap cards! Along with frequent sales for new cards, the second-hand market is also increasingly filled with video cards from miners and speculators who are offloading their equipment and inventory, even if it comes at a loss. So for buyers who are willing to take a bit more of a chance, right now is a very good time for snagging a late-generation video card for a good price.

GDDR6 Memory Hits The Scene

GPU pricing and market matters aside, 2018 was also an important year on the technology front. After years of planning, GDDR6, the next generation of memory for GPUs and other high-throughput processors, finally reached mass production and hit the market. This is an especially important development for mid-range cards, as these products have never had access to more exotic memory technologies like GDDR5X and HBM.

GDDR6 follows in the footsteps of both the long-lived GDDR5 and the more recent GDDR5X. First put into video cards in 2008 – a lifetime ago for the GPU industry – GDDR5 has been the backbone of most video cards over the last decade. It’s been taken to far greater frequencies than JEDEC ever originally planned, with the first cards launching with data rates of 3.6Gbps while the most recent cards have shipped at 8Gbps and 9Gbps. And while GDDR5X tried to pick up where GDDR5 left off in 2016, the lack of adoption outside of Micron meant that it never had the traction required to replace its ubiquitous predecessor.

GDDR6, on the other hand, has no such problems. The Big 3 memory vendors – SK Hynix, Samsung, and Micron – are all producing the memory. And while prices are still high as an early technology (don’t expect GDDR5 to disappear overnight), GDDR6 is primed to become the new backbone of the GPU memory industry.

Overall, GDDR6 introduces a trio of key improvements that vault the memory technology ahead of GDDR5. On the signaling front, it uses Quad Data Rate (QDR) signaling as opposed to Double Data Rate (DDR) signaling. With twice as many signal pumps per clock as before, GDDR6’s memory bus can reach the necessary data rates at lower clockspeeds, making it more efficient and easier to implement than a higher clocked DDR bus, though not without some signal integrity challenges of its own.
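As a quick illustration of the signaling change (setting aside the finer details of GDDR6’s actual clock domains), the same per-pin data rate can be reached at half the bus clock when four transfers are made per cycle instead of two:

```python
# Per-pin data rate = bus clock (GHz) x transfers per clock.
# Here we invert that to see the bus clock needed to hit a 14 Gbps pin
# under DDR (2 transfers/clock) versus QDR (4 transfers/clock).

def bus_clock_ghz(data_rate_gbps: float, transfers_per_clock: int) -> float:
    return data_rate_gbps / transfers_per_clock

print(bus_clock_ghz(14, 2))  # DDR: 7.0 GHz bus clock required
print(bus_clock_ghz(14, 4))  # QDR: 3.5 GHz bus clock required
```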

The second big change is to how the memory is organized and how data is prefetched: instead of a chip having a single 32-bit wide channel with an 8n prefetch, GDDR6 divides that into a pair of 16-bit wide channels, each with a 16n prefetch. This effectively doubles the amount of data that is fed to the memory bus on every memory clock cycle, matching the increased bus capacity and allowing for greater data rates out of the memory without increasing the core clock rate of the memory itself.
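The arithmetic behind that doubling is straightforward; the sketch below simply multiplies channel width by prefetch depth and channel count to show how much data a single chip hands to its interface on each memory core clock:

```python
# Data delivered to the interface per memory core clock, per chip:
# channel width (bits) x prefetch depth x number of channels.

def bits_per_core_clock(channel_width_bits: int, prefetch_n: int, channels: int) -> int:
    return channel_width_bits * prefetch_n * channels

gddr5 = bits_per_core_clock(32, 8, 1)    # one 32-bit channel, 8n prefetch  -> 256 bits
gddr6 = bits_per_core_clock(16, 16, 2)   # two 16-bit channels, 16n prefetch -> 512 bits
print(gddr5, gddr6, gddr6 / gddr5)       # 256 512 2.0
```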

Finally, GDDR6 once again lowers the memory operating voltage. Whereas GDDR5 typically ran at 1.5v, the standard voltage for GDDR6 is 1.35v. The actual power savings are a bit hard to quantify here since power consumption also depends on the memory controller used, but all-told we’re expecting to see significant increases in bandwidth with minimal power consumption increases with both GPU vendors – and network controller manufacturers and other GDDR users, for that matter.
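For a very rough sense of scale, dynamic power tends to scale with the square of voltage at a fixed frequency, so the DRAM-side savings from the voltage drop alone work out to roughly 20%. This is only a first-order estimate, and it deliberately ignores the memory controller and PHY:

```python
# Rough rule-of-thumb only: dynamic power scales roughly with V^2 at a fixed
# frequency. This ignores the memory controller/PHY and any clock changes.
v_gddr5, v_gddr6 = 1.5, 1.35
print((v_gddr6 / v_gddr5) ** 2)  # ~0.81, i.e. roughly a 19% reduction for the DRAM itself
```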

GDDR6 is currently readily available in speeds up to 14Gbps, and the standard allows for faster speeds as well. Samsung is already talking about doing 18Gbps memory, and if it ends up as long-lived as its predecessor, GDDR6 will undoubtedly go faster than that still. Meanwhile it will be interesting to see where the line between GDDR6 and HBM2 falls in the next year or two; GDDR6’s speed somewhat undermines the advantages of HBM2, but the latter memory technology recently saw its own bandwidth and capacity boost, so it may still be used as a high-end option, especially in space-constrained products such as socketed accelerators.
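For reference, aggregate card bandwidth is just the bus width multiplied by the per-pin data rate; the bus widths below are generic examples rather than any particular product’s configuration:

```python
# Aggregate card bandwidth (GB/s) = bus width (bits) x per-pin data rate (Gbps) / 8.

def card_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    return bus_width_bits * data_rate_gbps / 8

print(card_bandwidth_gbs(256, 14))  # 448 GB/s on a 256-bit bus at 14 Gbps
print(card_bandwidth_gbs(384, 14))  # 672 GB/s on a 384-bit bus at 14 Gbps
```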



NVIDIA Turing Turns to Ray Tracing

The tentpole event of the GPU industry is of course the launch of a new architecture and its chips, and 2018 didn’t disappoint. Over the summer NVIDIA launched their Turing GPU architecture, and with it their new GeForce RTX 20 series of video cards.

Turing itself is an interesting beast, as NVIDIA used the new architecture to overhaul parts of their designs and, in other places, to introduce entirely new features. The core GPU architecture is essentially a Volta derivative with additional features; this is a notable distinction, as while Volta has been available in servers as the Tesla V100 since the middle of 2017, it never came to the consumer market. From a consumer standpoint, then, Turing is the biggest update to NVIDIA’s core GPU architecture since the launch of Maxwell (1) over four and a half years ago.

Though covering the full depths of what Turing’s core architecture entails is best left for Nate Oh’s fantastic Turing Deep Dive, in short the new architecture further optimizes NVIDIA’s performance and workflow by reorganizing the layout of an individual SM, and for the first time (for a consumer part) breaking out the Integer units into their own execution block. The net result is that a single Turing SM is now composed of 4 processing blocks, each containing 16 FP cores and 16 INT cores. The benefit of this change is that it allows integer instructions to be more readily executed alongside floating point instructions, whereas previously the two occupied the same slot. Meanwhile NVIDIA also updated the cache system, introducing an L0 instruction cache to better feed all of their cores.
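To make the benefit of the separate integer units concrete, here is a deliberately simplified issue-slot model. The 100:36 FP-to-INT instruction mix is purely illustrative, and real GPU scheduling is far more involved than counting slots:

```python
# Toy issue-slot model of why a dedicated INT pipe helps.
# Assumes one instruction per pipe per cycle; the instruction mix is illustrative.

def issue_cycles(fp_ops: int, int_ops: int, separate_int_pipe: bool) -> int:
    """Cycles needed to issue a stream of FP and INT instructions."""
    if separate_int_pipe:
        # FP and INT issue concurrently on their own pipes (Turing-style)
        return max(fp_ops, int_ops)
    # FP and INT compete for the same issue slot (previous consumer designs)
    return fp_ops + int_ops

fp, integer = 100, 36  # illustrative instruction mix
print(issue_cycles(fp, integer, separate_int_pipe=False))  # 136 cycles
print(issue_cycles(fp, integer, separate_int_pipe=True))   # 100 cycles
```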

With all of that said, as it turns out the marquee feature improvement for Turing isn’t even part of the GPU’s core compute architecture; rather, it’s new hardware entirely: everything NVIDIA needs to accelerate ray tracing on a GPU. Ray tracing has long been considered the holy grail of graphics due to its accuracy and quality – and has long been out of reach of GPUs due to its absurd performance requirements – but GPUs are finally getting to the point where they’re fast enough to mix in ray tracing with traditional rasterization, improving graphics in a measured manner.


Ray Tracing Diagram (Henrik / CC BY-SA 4.0)

Turing in turn introduces two new hardware units (relative to consumer Pascal) to achieve this. The first is what NVIDIA calls an RT core, which is their hardware block for actually computing the all-important ray intersections. The second is the tensor core, which is actually another carryover from Volta. The tensor cores excel at neural network execution, and while they have many purposes – as demonstrated with NVIDIA’s Tesla accelerators – for ray tracing their purpose is to help smooth out the rough output of the ray tracing process itself. By applying a neural network model to the initial, grainy output of the ray tracing unit, NVIDIA is able to save a lot of expensive computational work by firing off far fewer rays than would otherwise be necessary for a clean ray-traced image.
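For readers unfamiliar with what “computing ray intersections” actually involves, the short Python routine below tests a ray against a sphere. It is purely illustrative of the math; the RT cores themselves accelerate BVH traversal and ray-triangle testing in fixed-function hardware rather than running anything like this code.

```python
import math

# Minimal ray-sphere intersection test, purely to illustrate the kind of
# geometry work a ray tracer performs per ray.

def ray_sphere_hit(origin, direction, center, radius):
    """Return the distance to the nearest intersection, or None on a miss.
    `direction` is assumed to be a normalized 3-vector."""
    oc = [o - c for o, c in zip(origin, center)]
    b = 2.0 * sum(d * o for d, o in zip(direction, oc))
    c = sum(o * o for o in oc) - radius * radius
    discriminant = b * b - 4.0 * c
    if discriminant < 0:
        return None                       # ray misses the sphere
    t = (-b - math.sqrt(discriminant)) / 2.0
    return t if t >= 0 else None          # nearest hit in front of the origin

# A ray fired down the -Z axis at a unit sphere 5 units away hits at distance 4
print(ray_sphere_hit((0, 0, 0), (0, 0, -1), (0, 0, -5), 1.0))  # 4.0
```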


TU104

Truthfully, the results of this whole ray tracing endeavor are a bit mixed right now, since we’re still in the very early days of the technology. Microsoft only announced the relevant DXR standard earlier this year, and the first games with ray tracing features are just now shipping, which means developers haven’t had much time to integrate and optimize the technology. The resulting image quality improvement isn’t a night-and-day difference, which makes it a bit harder for NVIDIA to quickly sell consumers on the idea. But we expect the level of integration and the resulting performance to improve over time, as game developers get better acquainted with the technology and what to use it for.

In the meantime, while the GeForce RTX 20 series is a major step up from its predecessor in terms of features, the resulting performance gains at every price segment are much smaller than what we usually see from a new GPU architecture launch. A $500 RTX 2070 is only around 10% faster than what was a $500 GTX 1080, unlike the 50%+ gains of years gone by. There are a few reasons why the latest cards haven’t significantly moved the needle on price-to-performance ratios, but the biggest factors are that the transistors allocated to Turing’s RT features can’t be used for traditional rasterization – meaning they add nothing to the performance of existing games – and that NVIDIA is carefully controlling GPU prices to deal with the inventory issues mentioned earlier. As long as NVIDIA is sitting on leftover Pascal GPUs to sell, they aren’t going to be in a hurry to sell Turing GPUs at low prices.

As for individual cards, at the moment we’ve seen the launch of three consumer cards – RTX 2070, RTX 2080, and RTX 2080 Ti – along with the more professionally-oriented Titan RTX. With even the cheapest Turing card going for $500 and mobile variants nowhere to be found, I don’t expect that NVIDIA is done rolling out their RTX 20 series quite yet.

The Incredible Shrinking Polaris

While AMD is between GPU architectures for 2018 – Vega was launched last year and Navi will launch in 2019 – AMD hasn’t spent the year entirely idle. The company’s other child, the workhorse architecture that is Polaris, received a somewhat oddly timed die shrink.

This fall AMD started shipping Polaris 30, a version of Polaris 10 that is built on long-time partner GlobalFoundries’ 12nm process. In practice Polaris 30’s die size isn’t any smaller than Polaris 10’s – officially, AMD lists it at the same 232mm2 as Polaris 10 – however AMD has tapped the 12nm process’s general performance improvements to give the Polaris 10 design a late-life performance boost.

Paired with the launch of Polaris 30 is the Radeon RX 590, which is the first (and thus far, only) video card to use the new GPU. By going all-out on performance (and throwing power efficiency into the wind), AMD has been able to muster enough performance to consistently and convincingly pull ahead of the GeForce GTX 1060 6GB, the RX 480/580’s Green competitor for the last two years. To be sure, the resulting performance increase isn’t very big, gaining an average of 12% over the RX 580. But this is enough to keep it consistently ahead of the GeForce GTX 1060 by around 9%. And given the relatively high volume of cards sold in this mainstream market segment, it’s an important win for AMD and should be a good morale boost for the GPU group after the Radeon RX Vega family didn’t quite land where AMD wanted it to.

The catch for now with Polaris 30/RX 590 is pricing, especially in light of the numerous RX 580 sales already going on. AMD launched the card at $279, and that’s where it stays to this day. And while faster than the GTX 1060, it’s also priced so far ahead of the RX 580 (regularly found at $199) that the RX 580 is serving as a spoiler to the RX 590. That, if nothing else, helps move RX 580 cards, but it doesn’t do the RX 590 any favors.

Intel Goes Xe

Last but not least we have Intel. The blue team is currently in the middle of an extensive process to ramp up and become the third major GPU vendor in the industry, a process that started with the hiring of Raja Koduri from AMD back in late 2017. At the time the company also announced that they would be developing discrete GPUs, and those plans are starting to fall into place.

For Intel, their 2018 was all about laying out their future GPU plans and illustrating to consumers and partners alike how they’re going to get from today’s integrated graphics to a top-to-bottom range of integrated and discrete GPUs. Intel didn’t have any hardware to show in 2018 – they didn’t even launch a new iGPU this year – so instead the company’s focus is on 2020, when their new GPU family will launch.

Announced at their Architecture Day event earlier this month, Intel’s discrete GPUs will be sold under the Xe brand. Nothing has been published about the architecture itself at this point, but Intel intends for Xe to be the foundation for several generations of graphics going forward. Xe will also be a true top-to-bottom stack, with the company intending to use it for everything from iGPUs up to datacenter accelerators as a replacement for Xeon Phi (itself an offshoot of the Larrabee GPU project).

Ultimately we’re talking about a GPU architecture that’s still more than a year off, so there’s still a lot of time for plans to change and for Intel to plot out how they want to handle their Xe disclosures. But it’s clear that the company is no longer content to sit on the sidelines and let the high-margin GPU accelerator market grow all around them. So it should be interesting to see how Intel fares in jumping into a market that hasn’t seen a viable third-party competitor in over 15 years.

Looking Forward to 2019

Finally, let’s break out the crystal ball for a quick look at some of the things we should see in 2019.

All but given at this point is more new GPUs from NVIDIA. As the current GeForce RTX 20 series product stack stops at $500, they’re going to need to introduce new products to finish refreshing the product lineup. The GeForce GTX 1060 in particular is due for a successor, owing both to its importance in NVIDIA’s product stack as their high-volume mainstream video card, and because of the challenge posed by AMD’s Radeon RX 590. We may see this as soon as CES 2019 – where NVIDIA is once again giving a presentation – but if not there, then I’d expect to see it not too long thereafter.

Meanwhile for AMD, 2019 is going to be the year of Navi. AMD has been playing their cards very close to their chest on this one, and besides the fact that it will be built on a 7nm process and will utilize a next-generation memory technology (presumably GDDR6), little else has been said. By the time AMD does launch Navi, Vega will be coming up on 2 years old and Polaris on 3, so it’s possible that we’ll see AMD do a top-to-bottom refresh here in order to bring everything in sync. However it’s also equally possible that they’ll replace either the top (Vega) or bottom (Polaris) end of the market first, as this is more in line with how AMD has operated over the past half-decade.

The biggest wildcard for the moment, then, is what, if anything, NVIDIA does this year to take advantage of 7nm production. A replacement for GV100 at the very high end is a likely candidate – server customers can afford the price tag that comes with early, low-yield production – however consumer parts are a bit more nebulous. NVIDIA surprised a lot of people by launching the 12nm Turing parts right when 7nm was entering mass production, getting these parts to market sooner but missing out on the density and power efficiency improvements of 7nm in the process. A 7nm mid-generation refresh is not out of the question, however NVIDIA hasn’t done a refresh like that in almost a decade. But then again, the current fab situation is unparalleled; as Moore’s Law continues to slow down, the standard 2-year GPU design cycle and the fab upgrade cycle are getting increasingly out of sync. So there are good arguments to be made on both sides, and it should prove interesting to see which route NVIDIA ultimately takes for 2019.
