Original Link: https://www.anandtech.com/show/11549/the-intel-kaby-lake-x-i7-7740x-and-i5-7640x-review-the-new-single-thread-champion-oc-to-5ghz



Intel’s direction for the high-end desktop space has taken an interesting turn. After several years of iterative updates, slowly increasing core counts and IPC, we have gotten used to the HEDT platform being at least one generation of microarchitecture behind the mainstream consumer processor families. There are many reasons for this, including enterprise requirements for long platform support as well as enterprise update cycles. For 2017, Intel is steering the ship in a slightly different direction, and launching the latest microarchitecture on the HEDT platform. These CPUs don’t feature the high core counts of the other HEDT parts, but they sit higher up the voltage/frequency curve, making them the fastest single-thread processors money can buy. They also overclock quite well.

The High-Level Product

Back at Computex, Intel announced its new high-end desktop (HEDT) platform code named Basin Falls. There are three elements to Basin Falls: the new X299 chipset which brings a number of enhancements from the more mainstream consumer line, a set of Skylake-X processors based on the Skylake-SP microarchitecture (and slightly different from the mainstream Skylake-S processors launched in 2015) from 6 cores to 18 cores, and a pair of Kaby Lake-X processors, reusing the Kaby Lake-S silicon launched earlier this year but with a bit more frequency and power consumption.

We covered all three parts in our launch pieces (linked below), but here is a quick recap.

Pushing Peak Performance

The main reasons an enthusiast invests in the HEDT platform usually revolve around multi-threaded performance, the need to support multiple graphics cards, and potentially 1.21 jiggabytes of DRAM (ed: you mean 128GB). Kaby Lake-X (KBL-X for short) doesn’t have any of the features we normally ascribe to HEDT: it only has four cores, not ten or higher; it only supports one GPU at full PCIe speed, up to two in SLI or three in Crossfire if you are willing to accept a lower link speed; and it only has dual-channel memory for a maximum of 64GB. So why is Intel bringing KBL-X to HEDT? The answer lies primarily in peak performance.

Users that have been in the enthusiast space for a while will have realized that the Holy Grail for PC performance is single-thread (ST) performance. If you master single-thread throughput, then arguably the rest, such as scaling out to more cores, is easier. There are usually three barriers to high ST performance: instructions per clock (IPC), frequency, and power consumption. A high IPC is fundamental to such a design, as any gains will propagate through the platform, but it is becoming a tough job. Over the last few generations, Intel has increased IPC by 3-10% each generation, making for a 30-45% increase since Sandy Bridge in 2010 depending on the benchmark, but we’re unlikely to see 50-100% jumps per generation any time soon. Any IPC gains are multiplied by the frequency at which the processor runs, which can be limited by a combination of things: production process (e.g. 14nm), voltage characteristics, stability, yield, etc. These factors tie directly into power consumption, which scales with the square of voltage and linearly with frequency and capacitance. With some designs, +10% frequency might be possible, but at the cost of +100% power, so there also needs to be a balance to have something marketable that people will want to buy.
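To put rough numbers on that voltage/frequency trade-off, dynamic power scales approximately as P ≈ C·V²·f, and higher frequencies typically require higher voltages. A minimal sketch, with invented voltage figures purely for illustration:

```python
# Rough dynamic power model: P ~ C * V^2 * f.
# The voltage values here are invented for illustration only.
def dynamic_power(capacitance, voltage, frequency):
    return capacitance * voltage**2 * frequency

base = dynamic_power(1.0, 1.20, 4.5e9)  # 4.5 GHz at a notional 1.20 V
oc   = dynamic_power(1.0, 1.35, 5.0e9)  # 5.0 GHz needing, say, 1.35 V

print(f"frequency gain: {5.0/4.5 - 1:.0%}, power increase: {oc/base - 1:.0%}")
# ~11% more frequency costs ~41% more power, because voltage enters squared
```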

Kaby Lake-X pairs Intel’s latest x86 microarchitecture, with the highest IPC, with the highest frequency at which Intel has ever launched a consumer CPU, all within a reasonable power window. Users can overclock another 10% fairly easily, for a slight increase in power. Simply put, Kaby Lake-X is the highest single-thread performing x86 processor out of the box. This is a different take on HEDT compared to previous generations of HEDT CPUs.

With the extra single-thread performance, the chips would appear to appeal to traders who need fast systems, gamers where single-thread performance limits the frame rate (or minimum frame rates), or science/enterprise workloads where serial code is the limiting factor for simulations or professional applications.

The Core i7 and the Core i5

At heart, the new KBL-X processors are no different to their mainstream-platform KBL-S brethren. The silicon is the same, but potentially binned for a better voltage/frequency curve, and then packaged for the HEDT platform rather than the mainstream platform. Unlike the mainstream processor stack though, Intel is only launching two processors: the Core i7-7740X and the Core i5-7640X.

Intel Kaby Lake Processors

|  | Core i7-7740X | Core i7-7700K | Core i5-7640X | Core i5-7600K |
|---|---|---|---|---|
| Socket | LGA2066 | LGA1151 | LGA2066 | LGA1151 |
| Cores/Threads | 4/8 | 4/8 | 4/4 | 4/4 |
| Base Frequency | 4.3 GHz | 4.2 GHz | 4.0 GHz | 3.8 GHz |
| Turbo Frequency | 4.5 GHz | 4.5 GHz | 4.2 GHz | 4.2 GHz |
| TDP | 112 W | 91 W | 112 W | 91 W |
| L2 Cache | 256 KB/core | 256 KB/core | 256 KB/core | 256 KB/core |
| L3 Cache | 8 MB | 8 MB | 6 MB | 6 MB |
| DRAM Channels | 2 | 2 | 2 | 2 |
| DRAM Support | DDR4-2666 | DDR4-2400 | DDR4-2666 | DDR4-2400 |
| Graphics | None | HD 630 | None | HD 630 |
| Price (MSRP) | $350 | - | $250 | - |
| Price (7/21) | $349 | $309 | $248 | $239 |
| Launched | July 2017 | Jan 2017 | July 2017 | Jan 2017 |

Both CPUs are quad core, with the Core i7 having Hyperthreading for a total of eight threads. The Core i5 does not have hyperthreading, making it the first processor of the modern Core era to come to HEDT without it. Both will have identical support to their KBL-S siblings, although the increased base/turbo frequencies have resulted in Intel raising the TDP from 91W to 112W. The TDP rating is a guide for appropriate cooling: the KBL-S processors were actually very good on their power consumption at stock frequencies, and as shown later, so are the KBL-X processors. Intel could arguably have kept the 91W TDP rating here.

So given what was said in the previous section about peak performance, the Core i7 fits the bill: it has the highest frequencies, and supports hyperthreading for increased performance. This raises the question of why the Core i5 exists at all. There are a few guesses as to why:

  • For some enterprise users, hyperthreading is not needed
  • It is a cheaper entry point into the platform
  • It overclocks just as well to perform the same as the Core i7

Intel’s official line is about giving customers options. We will not be seeing a Core i3 on HEDT any time soon, though.

Kaby Lake-X does not come with a bundled cooler, but Intel is promoting its own TS13X closed-loop liquid cooler alongside the processors. This CLC has been on the market for a good number of years now.

Competition and Market

Because these new KBL-X parts are updates to the KBL-S family, Intel is in direct competition with itself. The Core i7-7740X is an alternative to the Core i7-7700K on the mainstream platform, being similar in frequency but carrying much higher platform costs.

Comparison: Intel Core i7-7740X vs Core i7-7700K

| Intel Core i7-7740X | Features | Intel Core i7-7700K |
|---|---|---|
| X299 | Platform | Z270 etc |
| LGA2066 | Socket | LGA1151 |
| 4 / 8 | Cores/Threads | 4 / 8 |
| 4.3 / 4.5 GHz | Base/Turbo | 4.2 / 4.5 GHz |
| 16 | PCIe 3.0 Lanes | 16 |
| 256 KB/core | L2 Cache | 256 KB/core |
| 8 MB | L3 Cache | 8 MB |
| 112W | TDP | 91W |
| $349 | Retail Price (7/21) | $309 |
| MSI X299 Raider: $230; ASUS Prime X299-A: $310; GIGABYTE X299-UD4: $249 | Cheap OC Motherboard Cost | GIGABYTE Z270-HD3: $114; MSI Z270 SLI PLUS: $136; ASUS TUF Z270-MK2: $120 |

Similar platform builds for the two put an extra $100-150 of cost on the motherboard, potentially limiting the usefulness of the new KBL-X parts on paper, as that extra cost for the build as a whole might not warrant another 100 MHz on the turbo frequency. The proof is in the pudding, which is why we are doing this review. The Core i5-7640X against the Core i5-7600K is in a similar situation.

However, there is also AMD to discuss. With the launch of the Ryzen family of processors, there are now significant comparisons to be made between what Intel offers and what AMD offers. On a price-for-price comparison, using MSRP prices (not sale prices or distributor prices), the Core i7-7740X squares off against AMD’s Ryzen 7 1700.

Comparison: Intel Core i7-7740X vs AMD Ryzen 7 1700

| Intel Core i7-7740X | Features | AMD Ryzen 7 1700 |
|---|---|---|
| X299 | Platform | X370, B350, A320 |
| LGA2066 | Socket | AM4 |
| 4 / 8 | Cores/Threads | 8 / 16 |
| 4.3 / 4.5 GHz | Base/Turbo | 3.0 / 3.7 GHz |
| 16 | PCIe 3.0 Lanes | 16 |
| 256 KB/core | L2 Cache | 512 KB/core |
| 8 MB | L3 Cache | 16 MB (Victim Cache) |
| 112W | TDP | 65W |
| $349 | Retail Price (7/21) | $270 |
| MSI X299 Raider: $230; ASUS Prime X299-A: $310; GIGABYTE X299-UD4: $249 | Cheap OC Motherboard Cost | ASUS Prime B350-Plus: $90; MSI B350 Tomahawk: $100; ASRock AB350 Pro4: $100 |

At this price, both sides have their merits. Intel wins on IPC, which is about 5-15% ahead clock-for-clock, and is also at a higher frequency, compounding the difference. In its favor, AMD wins on cores and threads, offering eight cores and sixteen threads where Intel only offers four cores and eight threads. Intel wins on IO and chipset, offering 24 PCIe 3.0 lanes for USB 3.1/SATA/Ethernet/storage, while AMD is limited on that front, having 8 PCIe 2.0 lanes from the chipset. Depending on the workload, each side has its strengths: AMD will state that it is geared towards more professional multi-threaded workloads, while Intel is geared towards serial code scenarios, which still dominate certain tasks. There’s also the platform cost, where motherboards are $150-$200 cheaper for AMD when looking at cheap overclockable systems.

For the Core i5-7640X, it sits at a similar price to AMD’s Ryzen 5 1600X.

Comparison: Intel Core i5-7640X vs AMD Ryzen 5 1600X

| Intel Core i5-7640X | Features | AMD Ryzen 5 1600X |
|---|---|---|
| X299 | Platform | X370, B350, A320 |
| LGA2066 | Socket | AM4 |
| 4 / 4 | Cores/Threads | 6 / 12 |
| 4.0 / 4.2 GHz | Base/Turbo | 3.6 / 4.0 GHz |
| 16 | PCIe 3.0 Lanes | 16 |
| 256 KB/core | L2 Cache | 512 KB/core |
| 6 MB | L3 Cache | 16 MB (Victim Cache) |
| 112W | TDP | 95W |
| $248 | Retail Price (7/21) | $230 |
| MSI X299 Raider: $230; ASUS Prime X299-A: $310; GIGABYTE X299-UD4: $249 | Cheap OC Motherboard Cost | ASUS Prime B350-Plus: $90; MSI B350 Tomahawk: $100; ASRock AB350 Pro4: $100 |

For this review, we have also included numbers for other processors. These include AMD’s Ryzen 7 1800X, which at $499 is an alternative due to the money saved by investing in the AM4 platform; the Skylake-X based Core i7-7800X, which has an MSRP not so far from the i7-7740X and opens up the platform with more PCIe lanes and two more cores, at the expense of single-thread performance; and finally the perennial Core i7-2600K from the Sandy Bridge era, for the benchmarks for which we have data.

Given everything said above, it would appear Intel has an uphill struggle to convince users that Kaby Lake-X is worth the investment. In this review we will test and see what the performance story is.

Please note that our AMD data is before AGESA 1.0.0.6. We are planning an article with an update of the latest AGESA soon.


Navigating the X299 Minefield: Kaby Lake-X Support

When building a platform, keeping it limited to one particular type of product makes it simple and easy to understand, at the expense of flexibility. Flexibility is something Intel and AMD have experimented with in the past, enabling users to stay on the same underlying platform and upgrade in future generations, but with X299, Intel is taking a large step forward in support. This is both a good and a bad thing, depending on how different the support for the different generations needs to be. In this context, Skylake-X and Kaby Lake-X are like chalk and cheese, which can trip up users new to building systems, and it has already caused some minor headaches for system builders and motherboard manufacturers.

To recap, the three elements of the Basin Falls platform launch were the motherboards/X299 chipset, the SKL-X processors and the KBL-X Processors.

X299: What Is It?

The X299 chipset supports the new Skylake-X and Kaby Lake-X processors, and like its Z170 and Z270 counterparts on the mainstream consumer line, it is basically a big PCIe switch. One of the issues with the older X99 chipset was its limited capability to drive many PCIe devices – this changes with the big-switch mentality on X299. On the far side of the DMI 3.0 link into the chipset (basically a PCIe 3.0 x4), the chipset offers up to 24 PCIe 3.0 lanes for network controllers, RAID controllers, USB 3.1 controllers, Thunderbolt controllers, SATA controllers, 10GbE controllers, audio cards, more PCIe slot support, special controllers, accelerators, and anything else that requires PCIe lanes in an x4, x2 or x1 link.

The total uplink is limited by the DMI 3.0 link, but there will be very few situations where this is saturated. There are a few limits to what support is available (some ports are restricted in what they can handle), and only three PCIe 3.0 x4 drives can use the in-built PCIe RAID, but this should satiate all but the most hardcore enthusiasts.
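For those wanting to put a number on that uplink, a quick back-of-the-envelope calculation shows why chipset RAID can hit the ceiling. The drive throughput figures here are illustrative, not any specific product:

```python
# PCIe 3.0 runs at 8 GT/s per lane with 128b/130b encoding.
lane_gb_s = 8e9 * (128 / 130) / 8 / 1e9   # ~0.985 GB/s usable per lane
dmi_gb_s = 4 * lane_gb_s                  # DMI 3.0 is effectively an x4 link

# Two hypothetical NVMe drives in RAID-0 behind the chipset,
# each capable of ~2.5 GB/s sequential reads (illustrative numbers).
raid_demand_gb_s = 2 * 2.5

print(f"DMI 3.0 uplink: {dmi_gb_s:.2f} GB/s")      # ~3.94 GB/s
print(f"RAID-0 demand:  {raid_demand_gb_s:.1f} GB/s")  # ~5.0 GB/s
# The array can request more bandwidth than the uplink can carry.
```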

Skylake-X CPUs: Coming in Two Stages

The Skylake-X family of processors for Basin Falls comes in two stages, based on the way the processors are developed. Normally HEDT processors are cut-down versions of enterprise processors, usually through restricting certain functions, and the enterprise processors are typically derived from three different silicon layouts during manufacturing. Internally Intel calls these three layouts LCC (low core-count), HCC (high core-count) and XCC (extreme core-count), based on the maximum number of cores they support. Nominally Intel does not disclose which silicon layout it uses for which processors, though it is usually straightforward to work out as long as Intel has disclosed the configurations of the LCC/HCC/XCC dies. In this case, Intel has officially left everyone guessing, but the point here is that historically Intel only uses the LCC silicon from the enterprise line for its consumer desktop processors.

In previous generations, this meant either a 6, 8 or 10-core processor at the top of the stack for consumers, with lower core-count models provided by disabling cores. (This is a complex topic involving the quality of the manufacturing process and determining voltage/frequency in a process called binning. We’ve covered it before, but it’s something all manufacturers have to do to get good yields, as making processors isn’t a perfect process.) Each year we expected one of a few things from the top-end SKU: more frequency, less power, or more cores, and as such the march of progress has been predictable. If you had asked us two months ago, we would have fully expected Skylake-X to top out with LCC silicon at 10 or 12 cores, depending on how Intel was planning the manufacturing.

So the first element of Intel’s launch is the LCC processors, running up to 10 cores. We previously published that the LCC silicon was 12 cores, but we can now show it is 10 – more on that later. The three Skylake-X CPUs launching today are using LCC silicon with 6, 8 or 10 cores as the Core i7-7800X, Core i7-7820X and Core i9-7900X respectively. Intel is further separating these parts by adjusting the level of official supported DRAM frequency, as well as the PCIe lanes.

The second element of the Skylake-X launch is the one that has somewhat surprised most of the industry: the launch will contain four processors based on the HCC silicon. Technically these processors will not be out until Q4 this year (one SKU coming in August), and the fact that Intel did not have frequency numbers to share when announcing these parts shows that they were not finalized, calling into question when they were added to the roadmap (and whether they were a direct response to AMD announcing a 16-core part for this summer). We’ve written a detailed analysis in our launch coverage, but Intel is set to launch 12, 14, 16 and 18-core consumer-level processors later this year, with the top part carrying a tray price (when you buy 1k CPUs at a time) of $1999, so we expect retail to be nearer $2099.

It should be noted that due to a number of factors, the Skylake-X cores and the communication pathways therein are built slightly differently to the consumer version of Skylake-S, which is something discussed and analyzed in our Skylake-X review.

Kaby Lake-X: The Outliers

The final element of the Basin Falls launch is Kaby Lake-X, and it is the aspect of the platform that deviates most from previous generations. Intel’s HEDT line has historically been one generation behind the mainstream consumer platform, due to enterprise life cycles as well as the added difficulty of producing these larger chips. As a result, the enterprise and HEDT parts have never had the peak per-clock performance (IPC, instructions per clock) of the latest designs, and have sat in the wings, waiting. By bringing the Kaby Lake microarchitecture to HEDT in the form of a Core i7 and a Core i5, this changes the scene, albeit slightly.

Rather than bringing a new big core featuring the latest microarchitecture, Intel is repurposing the Kaby Lake-S mainstream consumer silicon, binning it to slightly more stringent requirements for frequency and power, disabling the integrated graphics, and then putting it in a package for the high-end desktop platform. There are still some significant limitations, such as having only 16 PCIe 3.0 lanes and dual channel memory which might exclude it from the traditional designation of being a true HEDT processor; however Intel has stated that these parts fill a request from customers to have the latest microarchitecture on the HEDT platform. They also overclock quite well, which is worth noting.

The Kaby Lake-X parts will consist of a Core i7 and Core i5, both of which are quad core parts, with the i7 supporting hyperthreading.

Problem Number 1: PCIe Layouts

Users can choose an X299 motherboard with a SKL-X processor, or an X299 motherboard with a KBL-X processor. Every X299 motherboard has to support both, and it is the level of support each processor needs that makes this a more difficult task than one might imagine. The obvious difference between the two is the number of PCIe lanes, and where they come from.

KBL-X processors have 16 PCIe 3.0 lanes from the processor, coming from a single PCIe root complex, and these can be bifurcated into x8/x8 or x8/x4/x4 depending on what the motherboard manufacturer wants to implement.

SKL-X processors have either 28 or 44 PCIe 3.0 lanes, depending on which model you buy, and these come from up to three PCIe x16 root complexes (some complexes are cut down to fit the lane count). Each one can still be bifurcated into x8/x4/x4, but typically one would expect one PCIe root complex to fill the first x16 slot, with the next x16 slot fed from the second root complex, filled at x8 on the 28-lane processors or x16 on the 44-lane processors. The PCIe lanes can also be routed off to support other features, such as storage or Ethernet controllers.
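To illustrate the routing problem this creates for board designers, here is a hypothetical slot-wiring table for the three lane counts; this is a sketch, not any specific vendor's layout:

```python
# Hypothetical slot wiring for a single X299 board design; real boards
# differ, and many use PCIe switches to re-route lanes per CPU.
SLOT_WIRING = {
    16: ("x16", "-", "-"),     # KBL-X: one GPU at full speed (or x8/x8)
    28: ("x16", "x8", "x4"),   # SKL-X, 28-lane models
    44: ("x16", "x16", "x8"),  # SKL-X, 44-lane models
}

for cpu_lanes, slots in SLOT_WIRING.items():
    print(f"{cpu_lanes}-lane CPU -> slots: {', '.join(slots)}")
```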

The issue here is that motherboard manufacturers have to design for all three different PCIe lane counts. It is very easy to design a motherboard for SKL-X where, when a user installs a KBL-X processor, half the features do not work. This is true for a lot of PCIe slots, and in order to manage it all, manufacturers have to equip the motherboards with PCIe switches to make sure everything is routed correctly for both CPUs; these switches add cost to the platform. If there were two different platforms, there would not be this added per-board cost (but vendors would have to build two boards instead, each one easier to design).


Chipset diagram of MSI's X299 XPower Gaming AC, their high-end motherboard

Aside from the PCIe slots, storage is also going to become an issue. With the previous generation of X99, we started to see M.2 PCIe storage hanging off the processor – guaranteeing uplink bandwidth, especially when RAID was used. On the new X299 platform, because of KBL-X support, most M.2 slots will be derived from the chipset, adding a small amount of latency and capping bandwidth at the chipset-to-CPU uplink of PCIe 3.0 x4. Two fast M.2 drives in RAID via the chipset will be limited by that connection. Motherboard vendors know this, but they also want as many features as possible supported in all situations, so finding a motherboard with a CPU-derived M.2 slot is going to be a rare thing indeed.

The solution to some of this is to have specific motherboards for each set of CPUs. Sure, both CPUs will still work in the motherboard, but when using the wrong family for the CPU design, functionality is going to be severely limited. ASUS has already posted some details about its new Extreme motherboard for X299, with the disclaimer ‘not designed for Kaby Lake-X CPUs’ because the features on board are aimed at SKL-X customers only. We might see more of this filtering through.

There’s also the DRAM: KBL-X is a dual-channel design, while SKL-X is quad-channel. On an eight-slot X299 motherboard, only four slots are operational with KBL-X, wasting board space. The primary four slots for KBL-X also differ from those for SKL-X, so a user switching CPUs may need to move DRAM around. If a user buys an X299 motherboard with only four slots, chances are only two will work with KBL-X.

One argument here is that a user can upgrade from KBL-X to SKL-X later, or to the beefier KBL-X CPUs launched next generation.

Problem Number 2: Power

Skylake and Kaby Lake are different x86 microarchitectures – the KBL core design was an ‘optimization’ pass on Skylake, picking off some low-hanging fruit and using an updated 14nm process to give better power consumption and a better voltage/frequency response from the silicon. There isn’t much drastic change in the cores, but there is in how power is delivered.

Skylake-X uses an integrated voltage regulator, or IVR. If you recognize the term, that is because Intel launched its Broadwell-based CPUs with a FIVR, or fully-integrated voltage regulator. Skylake-X does not go all-in like Broadwell did, but for some of the voltage inputs to the CPU, the processor takes in a single voltage and splits it internally, rather than relying on the external circuitry of the motherboard to do so. This affords some benefits, such as consistency in voltage delivery and, to a certain extent, some power efficiency gains, and it should simplify the motherboard design - unless you also have to design for non-IVR CPUs, like Kaby Lake-X.

Kaby Lake-X has a more typical power delivery design, with all the required voltages supplied by the motherboard. That means the motherboard has to support both types of voltage delivery, and also adjust itself at POST if a different CPU has been placed in the socket. This check adds to the boot time, but it is required, as some voltages have to be moved around: too high a voltage can kill a processor. We’ve already killed one.

Specifically, the VRIN voltage on Skylake-X needs to be a 1.8V input into the processor for the IVR to work. The same rail on Kaby Lake-X needs to be 1.1 volts for VCCIO. If the motherboard was originally running a SKL-X processor and does not detect that a KBL-X processor is now installed, it will supply 1.8 volts into the KBL-X rail and the chip will soon die.

When we received samples for SKL-X and KBL-X, we were told by our motherboard partners that when switching between the two CPUs, we would have to flush the BIOS. This involves removing AC power when the system is switched off, and holding the Clear CMOS button for 30-60 seconds to drain the capacitors and essentially reset the BIOS to defaults, so it could then detect which CPU was installed before applying full voltages.

We did this, and still ended up with a dead Kaby Lake i7-7740X. There is now a lump of sand in my CPU box. The interesting thing is that this CPU did not die instantly: we started SYSmark, which involves several reboots during the initial preparation phase. On about the fourth reboot, the system got stuck at BIOS code 0d. Nothing I did was able to get beyond this code. I put in our Kaby i5 and that ran fine; I put in SKL-X and that ran fine; I put the Kaby i5 back in and it ran benchmarks fine. It would appear that our initial Kaby i7 did not have much headroom, and we had to get a replacement for some of the benchmarks.

Incidentally, we also had an i9-7900X die on us. That seems to be unrelated to this issue.

So The Solution?

Motherboard manufacturers have told us that there may be chip-specific motherboards out there in the future. But as it stands, users looking at KBL-X would save a lot of money (and headache) by staying with Z270, as the motherboards are cheaper and more streamlined for a Kaby Lake design. Users looking at the top Skylake-X CPUs have nothing to worry about – unless they really want PCIe storage direct from the CPU. In that case the user will have to find the one or two motherboards that support it, or invest in a PCIe-to-M.2 riser card and enable it that way – as long as the card goes into a CPU-based PCIe slot.

So why even bother testing KBL-X if the platform situation comes across so downbeat? KBL-X still warrants testing as the highest-frequency processor Intel has released on its latest CPU microarchitecture. As mentioned above, overclocking on KBL-X seems very good, and some users will require the peak single-thread performance possible. The argument is more that some of these issues complicate the platform, reducing accessibility for new builders and causing extra work/time/headaches for motherboard manufacturers and system builders. The issues above are not a significant barrier for any user willing to put in some time to ensure what they buy is suited for their workload.



Test Bed and Setup

As per our processor testing policy, we take a premium category motherboard suitable for the socket, and equip the system with a suitable amount of memory running at the manufacturer's maximum supported frequency. This is also typically run at JEDEC subtimings where possible. It is noted that some users are not keen on this policy, stating that sometimes the maximum supported frequency is quite low, or faster memory is available at a similar price, or that JEDEC speeds can be prohibitive for performance. While these comments make sense, ultimately very few users apply memory profiles (XMP or other) as they require interaction with the BIOS, and most users will fall back on JEDEC supported speeds - this includes home users as well as industry, who might want to shave a cent or two from the cost or stay within the margins set by the manufacturer. Where possible, we will extend our testing to include faster memory modules, either at the same time as the review or at a later date.

| Test Setup | |
|---|---|
| Processor | Intel Core i7-7740X (4C/8T, 112W, 4.3 GHz); Intel Core i5-7640X (4C/4T, 112W, 4.0 GHz) |
| Motherboards | ASRock X299 Taichi; MSI X299 Gaming Pro Carbon; GIGABYTE X299 Gaming 9 |
| Cooling | Thermalright TRUE Copper; Silverstone AR10-115XS (for LGA1151) |
| Power Supply | Corsair AX760i PSU; Corsair AX1200i Platinum PSU |
| Memory | Corsair Vengeance Pro DDR4-2666 2x8 GB |
| Video Cards | MSI GTX 1080 Gaming 8GB; ASUS GTX 1060 Strix; Sapphire R9 Fury 4GB; Sapphire RX 480 8GB; Sapphire RX 460 2GB |
| Hard Drive | Crucial MX200 1TB |
| Optical Drive | LG GH22NS50 |
| Case | Open Test Bed |
| Operating System | Windows 10 Pro 64-bit |

Many thanks to...

We must thank the following companies for kindly providing hardware for our multiple test beds. Some of this hardware is not in this test bed specifically, but is used in other testing.

Thank you to Sapphire for providing us with several of their AMD GPUs. We met with Sapphire back at Computex 2016 and discussed a platform for our future testing on AMD GPUs with their hardware for several upcoming projects. As a result, they were able to sample us the latest silicon that AMD has to offer. At the top of the list was a pair of Sapphire Nitro R9 Fury 4GB GPUs, based on the first generation of HBM technology and AMD’s Fiji platform. As the first consumer GPU to use HBM, the R9 Fury is a key moment in graphics history, and these Nitro cards come with 3584 SPs running at 1050 MHz on the GPU, with 4GB of 4096-bit HBM memory at 1000 MHz.

Further Reading: AnandTech’s Sapphire Nitro R9 Fury Review

Following the Fury, Sapphire also supplied a pair of their latest Nitro RX 480 8GB cards to represent AMD’s current performance silicon on 14nm (as of March 2017). The move to 14nm yielded significant power consumption improvements for AMD, which combined with the latest version of GCN helped bring the target of a VR-ready graphics card as close to $200 as possible. The Sapphire Nitro RX 480 8GB OC graphics card is designed to be a premium member of the RX 480 family, having a full set of 8GB of GDDR5 memory at 6 Gbps with 2304 SPs at 1208/1342 MHz engine clocks.

Further Reading: AnandTech’s AMD RX 480 Review

With the R9 Fury and RX 480 assigned to our gaming tests, Sapphire also passed on a pair of RX 460s to be used as our CPU testing cards. The amount of GPU power available can have a direct effect on CPU performance, especially if the CPU has to spend all its time dealing with the GPU display. The RX 460 is a nice card to have here, as it is powerful yet low on power consumption and does not require any additional power connectors. The Sapphire Nitro RX 460 2GB still follows on from the Nitro philosophy, and in this case is designed to provide power at a low price point. Its 896 SPs run at 1090/1216 MHz frequencies, and it is paired with 2GB of GDDR5 at an effective 7000 MHz.

We must also say thank you to MSI for providing us with their GTX 1080 Gaming X 8GB GPUs. Despite the size of AnandTech, securing high-end graphics cards for CPU gaming tests is rather difficult. MSI stepped up to the plate in good fashion and high spirits with a pair of their high-end graphics cards. The MSI GTX 1080 Gaming X 8GB graphics card is their premium air-cooled product, sitting below the water-cooled Seahawk but above the Aero and Armor versions. The card is large with twin Torx fans, a custom PCB design, Zero-Frozr technology, enhanced PWM and a big backplate to assist with cooling. The card uses a GP104-400 silicon die from a 16nm TSMC process, contains 2560 CUDA cores, and can run up to 1847 MHz in OC mode (or 1607-1733 MHz in Silent mode). The memory interface is 8GB of GDDR5X, running at 10010 MHz. For a good amount of time, the GTX 1080 was king of the hill.

Further Reading: AnandTech’s NVIDIA GTX 1080 Founders Edition Review

Thank you to ASUS for providing us with their GTX 1060 6GB Strix GPU. To complete the high/low cases for both AMD and NVIDIA GPUs, we looked towards the GTX 1060 6GB cards to balance price and performance while giving a hefty crack at >1080p gaming in a single graphics card. ASUS offered a hand here, supplying a Strix variant of the GTX 1060. This card is even longer than our GTX 1080, with three fans and LEDs crammed under the hood. STRIX is now ASUS’ lower cost gaming brand behind ROG, and the Strix 1060 sits at nearly half a 1080, with 1280 CUDA cores but running at 1506 MHz base frequency up to 1746 MHz in OC mode. The 6 GB of GDDR5 runs at a healthy 8008 MHz across a 192-bit memory interface.

Further Reading: AnandTech’s ASUS GTX 1060 6GB STRIX Review

Thank you to Crucial for providing us with MX200 SSDs. Crucial stepped up to the plate as our benchmark list grows larger with newer benchmarks and titles, and the 1TB MX200 units are strong performers. Based on Marvell's 88SS9189 controller and using Micron's 16nm 128Gbit MLC flash, these are 7mm high, 2.5-inch drives rated for 100K random read IOPs and 555/500 MB/s sequential read and write speeds. The 1TB models we are using here support TCG Opal 2.0 and IEEE-1667 (eDrive) encryption and have a 320TB rated endurance with a three-year warranty.

Further Reading: AnandTech's Crucial MX200 (250 GB, 500 GB & 1TB) Review

Thank you to Corsair for providing us with an AX1200i PSU. The AX1200i was the first power supply to offer digital control and management via Corsair's Link system, but under the hood it commands a 1200W rating at 50C with 80 PLUS Platinum certification. This allows for a minimum 89-92% efficiency at 115V and 90-94% at 230V. The AX1200i is completely modular, running the larger 200mm design, with a dual ball bearing 140mm fan to assist high-performance use. The AX1200i is designed to be a workhorse, with up to 8 PCIe connectors for suitable four-way GPU setups. The AX1200i also comes with a Zero RPM mode for the fan, which due to the design allows the fan to be switched off when the power supply is under 30% load.

Further Reading: AnandTech's Corsair AX1500i Power Supply Review

Thank you to G.Skill for providing us with memory. G.Skill has been a long-time supporter of AnandTech over the years, for testing beyond our CPU and motherboard memory reviews. We've reported on their high capacity and high-frequency kits, and every year at Computex G.Skill holds a world overclocking tournament with liquid nitrogen right on the show floor.

Further Reading: AnandTech's Memory Scaling on Haswell Review, with G.Skill DDR3-3000



2017 CPU Benchmarking

For our review, we are implementing our fresh CPU testing benchmark suite, using new scripts developed specifically for this testing. This means that with a fresh OS install, we can configure the OS to be more consistent, install the new benchmarks, maintain version consistency without random updates, and start running the tests in under 5 minutes. After that, it's a one-button press to start an 8-10 hour test (with a high-performance CPU) covering nearly 100 relevant data points in the benchmarks given below. The tests cover a wide range of segments, some of which will be familiar, while others are new to benchmarking in general but still highly relevant for the markets they come from.
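Our scripts are in-house, but the skeleton of a one-button harness like this is simple enough to sketch. The benchmark commands below are placeholders rather than our actual suite:

```python
import csv
import subprocess
import time

# Placeholder benchmark commands; the real suite wraps ~100 tests,
# each with its own result parser and per-test settings.
BENCHMARKS = {
    "7zip": ["7z", "b"],
    "cinebench_r15_mt": ["cinebench.exe", "-cb_cpux"],  # hypothetical path/flags
}

def run_all(outfile="results.csv"):
    with open(outfile, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["benchmark", "seconds", "returncode"])
        for name, cmd in BENCHMARKS.items():
            start = time.perf_counter()
            proc = subprocess.run(cmd, capture_output=True, text=True)
            writer.writerow([name, time.perf_counter() - start, proc.returncode])

if __name__ == "__main__":
    run_all()
```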

Our new CPU tests go through six main areas. We cover the Web (we've got an un-updateable version of Chrome 56), general system tests (opening tricky PDFs, emulation, brain simulation, AI, 2D image to 3D model conversion), rendering (ray tracing, modeling), encoding (compression, AES, h264 and HEVC), office based tests (PCMark and others), and our legacy tests, throwbacks from another generation of bad code but interesting to compare.

Our graphs typically list CPUs with microarchitecture, SKU name, cost and power. The cost will be one of two numbers: either the 1k-unit 'tray price' for when a business customer purchases 1000 CPUs, or the MSRP likely to be found at retail. The problem here is that neither Intel nor AMD is consistent: Intel has a tray price for every CPU, but an MSRP only for parts sold at retail. AMD typically quotes MSRP for CPUs at retail, tray prices for enterprise CPUs, and doesn't say much about OEM-only parts. We try to find a balance here, so prices may be $10-$20 off what you might expect.

A side note on OS preparation. As we're using Windows 10, there's a large opportunity for something to come in and disrupt our testing. So our default strategy is multi-pronged: disable the ability to update as much as possible, disable Windows Defender, uninstall OneDrive, disable Cortana as much as possible, enable the high-performance mode in the power options, and disable the internal platform clock, which can drift if the base frequency drifts and leave the timing inaccurate.

Web Tests on Chrome 56

Sunspider 1.0.2
Mozilla Kraken 1.1
Google Octane 2.0
WebXPRT15

System Tests

PDF Opening
FCAT
3DPM v2.1
Dolphin v5.0
DigiCortex v1.20
Agisoft PhotoScan v1.0

Rendering Tests

Corona 1.3
Blender 2.78
LuxMark CPU C++
LuxMark CPU OpenCL
POV-Ray 3.7.1b4
Cinebench R15 ST
Cinebench R15 MT

Encoding Tests

7-Zip 9.2
WinRAR 5.40
AES Encoding (TrueCrypt 7.2)
HandBrake v1.0.2 x264 LQ
HandBrake v1.0.2 x264-HQ
HandBrake v1.0.2 HEVC-4K

Office / Professional

PCMark8
Chromium Compile (v56)
SYSmark 2014 SE

Legacy Tests

3DPM v1 ST / MT
x264 HD 3 Pass 1, Pass 2
Cinebench R11.5 ST / MT
Cinebench R10 ST / MT

A side note: a couple of benchmarks (LuxMark) were not giving consistently good data during testing. We need to go back and re-work this part of our testing.

2017 CPU Gaming Tests

For our new set of GPU tests, we wanted to think big. There are a lot of users in the ecosystem that prioritize gaming above all else, especially when it comes to choosing the correct CPU. If there's a chance to save $50 and get a better graphics card for no loss in performance, then this is the route that gamers would prefer to tread. The angle here though is tough - lots of games have different requirements and cause different stresses on a system, with various graphics cards having different reactions to the code flow of a game. Then users also have different resolutions and different perceptions of what feels 'normal'. This all amounts to more degrees of freedom than we could hope to test in a lifetime, only for the data to become irrelevant in a few months when a new game or new GPU comes into the mix. Just for good measure, let us add in DirectX 12 titles that make it easier to use more CPU cores in a game to enhance fidelity.

Our original list of nine games planned in February quickly became six, due to the lack of professional-grade controls in Ubisoft titles. If you want to see For Honor, Steep or Ghost Recon: Wildlands benchmarked on AnandTech, point Ubisoft Annecy or Ubisoft Montreal in my direction. While these games have in-game benchmarks worth using, they unfortunately do not provide enough frame-by-frame detail to the end user, despite using it internally to produce the data the user eventually sees (and it typically ends up obfuscated by another layer as well). I would perhaps choose to automate these benchmarks via scripted inputs instead; however, the extremely variable loading times are a strong barrier to this.

So we have the following benchmarks as part of our 4/2 script, automated to the point of a one-button run and out pops the results four hours later, per GPU. Also listed are the resolutions and settings used.

  • Civilization 6 (1080p Ultra, 4K Ultra)
  • Ashes of the Singularity: Escalation* (1080p Extreme, 4K Extreme)
  • Shadow of Mordor (1080p Ultra, 4K Ultra)
  • Rise of the Tomb Raider #1 - GeoValley (1080p High, 4K Medium)
  • Rise of the Tomb Raider #2 - Prophets (1080p High, 4K Medium)
  • Rise of the Tomb Raider #3 - Mountain (1080p High, 4K Medium)
  • Rocket League (1080p Ultra, 4K Ultra)
  • Grand Theft Auto V (1080p Very High, 4K High)

For each of the GPUs in our testing, these games (at each resolution/setting combination) are run four times each, with outliers discarded. Average frame rates, 99th percentiles and 'Time Under x FPS' data are recorded, and the raw data is archived.
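For the curious, these derived metrics fall straight out of raw frame times. A minimal sketch of how average FPS, a 99th percentile figure, and 'Time Under x FPS' can be computed; this is illustrative, not our tooling's actual output format:

```python
import numpy as np

def frame_stats(frame_times_ms, fps_floor=60.0):
    """Derive summary metrics from raw per-frame render times."""
    ft = np.asarray(frame_times_ms, dtype=float)
    avg_fps = 1000.0 / ft.mean()
    # 99th percentile frame time -> the FPS sustained 99% of the time
    p99_fps = 1000.0 / np.percentile(ft, 99)
    # Total seconds spent on frames slower than the FPS floor
    slow = ft[ft > 1000.0 / fps_floor]
    time_under_s = slow.sum() / 1000.0
    return avg_fps, p99_fps, time_under_s

avg, p99, under = frame_stats([15.2, 16.1, 33.0, 14.8, 40.5], fps_floor=60)
print(f"avg {avg:.1f} FPS, 99th pct {p99:.1f} FPS, {under:.3f}s under 60 FPS")
```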

The four GPUs we've managed to obtain for these tests are:

  • MSI GTX 1080 Gaming X 8G
  • ASUS GTX 1060 Strix 6G
  • Sapphire Nitro R9 Fury 4GB
  • Sapphire Nitro RX 480 8GB

In our testing script, we save a couple of special things for the GTX 1080 here. The following tests are also added:

  • Civilization 6 (8K Ultra, 16K Lowest)

This benchmark, with a little coercion, can be run beyond the specifications of the monitor being used, allowing for 'future' testing of GPUs at 8K and 16K, with some amusing results. We are only running these tests on the GTX 1080, because there's no point watching a slideshow more than once.



Benchmarking Performance: CPU System Tests

Our first set of tests is our general system tests. This set of tests is meant to emulate what people usually do on a system, like opening large files or processing small stacks of data. This is a bit different to our office testing, which uses more industry-standard benchmarks, and a few of the benchmarks here are relatively new and different.

All of our benchmark results can also be found in our benchmark engine, Bench.

PDF Opening

First up is a self-penned test using a monstrous PDF we once received in advance of attending an event. While the PDF was only a single page, it had so many high-quality layers embedded that it was taking north of 15 seconds to open and to gain control on the mid-range notebook I was using at the time. This made it a great candidate for our 'let's open an obnoxious PDF' test. Here we use Adobe Reader DC, and disable all the update functionality within. The benchmark sets the screen to 1080p, opens the PDF in fit-to-screen mode, and measures the time from sending the command to open the PDF until it is fully displayed and the user can take control of the software again. The test is repeated ten times, and the average time taken. Results are in milliseconds.

System: PDF Opening with Adobe Reader DC

PDF opening is all about single thread frequency and IPC, giving the win to the new KBL-X chips.

FCAT Processing: link

One of the more interesting workloads that has crossed our desks in recent quarters is FCAT - the tool we use to measure stuttering in gaming due to dropped or runt frames. The FCAT process requires enabling a color-based overlay onto a game, recording the gameplay, and then parsing the video file through the analysis software. The software is mostly single-threaded, however because the video is basically in a raw format, the file size is large and requires moving a lot of data around. For our test, we take a 90-second clip of the Rise of the Tomb Raider benchmark running on a GTX 980 Ti at 1440p, which comes in around 21 GB, and measure the time it takes to process through the visual analysis tool.

System: FCAT Processing ROTR 1440p GTX1080 Data

FCAT similarly favors frequency and IPC. For this sort of workload, the Core i7 is the best chip to get.

3D Movement Algorithm Test v2.1: link

This is the latest version of the self-penned 3DPM benchmark. The goal of 3DPM is to simulate semi-optimized scientific algorithms taken directly from my doctorate thesis. Version 2.1 improves over 2.0 by passing the main particle structs by reference rather than by value, and by decreasing the number of double->float->double recasts the compiler was adding in. It affords a ~25% speed-up over v2.0, which means new data.
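The recast overhead is easy to reproduce in any language: bouncing data between single and double precision in a hot loop costs real time. A small numpy sketch of the effect (not the 3DPM code itself):

```python
import time
import numpy as np

data = np.random.rand(10_000_000)         # float64 'particles'

def with_recasts(a):
    # Force a float64 -> float32 -> float64 round trip on every step
    return (a.astype(np.float32) * np.float32(1.0001)).astype(np.float64)

def without_recasts(a):
    return a * 1.0001                     # stays float64 throughout

for fn in (with_recasts, without_recasts):
    start = time.perf_counter()
    for _ in range(10):
        fn(data)
    print(f"{fn.__name__}: {time.perf_counter() - start:.2f}s")
```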

System: 3D Particle Movement v2.1

As 3DPM expands across several threads, the new quad-core parts easily get trounced here by AMD's 8-core parts at the same price. The Core i7-7800X puts on a good showing, as per core, Intel's chips give a higher score.

DigiCortex v1.20: link

Despite being a couple of years old, the DigiCortex software is a pet project for the visualization of neuron and synapse activity in the brain. The software comes with a variety of benchmark modes, and we take the small benchmark which runs a 32k neuron/1.8B synapse simulation. The results on the output are given as a fraction of whether the system can simulate in real-time, so anything above a value of one is suitable for real-time work. The benchmark offers a 'no firing synapse' mode, which in essence detects DRAM and bus speed, however we take the firing mode which adds CPU work with every firing.

System: DigiCortex 1.20 (32k Neuron, 1.8B Synapse)

DigiCortex likes a bit of everything: cores, threads, IPC, frequency, uncore frequency, and memory frequency. The Core i7 parts roughly double the Core i5s due to thread count, and the AMD Ryzen parts skip ahead as well, having double the threads of the Core i7.

Agisoft Photoscan 1.0: link

Photoscan stays in our benchmark suite from the previous version, however now we are running it on Windows 10, so features such as Speed Shift on the latest processors come into play. The concept of Photoscan is translating many 2D images into a 3D model - so the more detailed the images, and the more you have, the better the model. The algorithm has four stages, some single-threaded and some multi-threaded, along with some cache/memory dependency in there as well. For the more variably threaded workloads, features such as Speed Shift and XFR will be able to take advantage of CPU stalls or downtime, giving sizeable speedups on newer microarchitectures.

System: Agisoft Photoscan 1.0 Total Time

Agisoft is like a Formula 1 race circuit: the long fast straights and technical corners make it a nightmare to have the technology to be the best at both. Photoscan has enough serial code for high single-thread performance to take advantage, but also massively parallel sections where having 12-18 threads makes a difference. Despite having half the threads, the single-core performance of the Core i7-7740X makes it pull ahead of the Ryzen 7 chips, but when comparing the four threads of the Core i5-7640X to the twelve threads of the Ryzen 5 processors, having 12 threads wins.



Benchmarking Performance: CPU Rendering Tests

Rendering tests are a long-time favorite of reviewers and benchmarkers, as the code used by rendering packages is usually highly optimized to squeeze every little bit of performance out. Sometimes rendering programs end up being heavily memory dependent as well - when you have that many threads flying about with a ton of data, having low latency memory can be key to everything. Here we take a few of the usual rendering packages under Windows 10, as well as a few new interesting benchmarks.

All of our benchmark results can also be found in our benchmark engine, Bench.

Corona 1.3: link

Corona is a standalone package designed to assist software like 3ds Max and Maya with photorealism via ray tracing. It's simple - shoot rays, get pixels. OK, it's more complicated than that, but the benchmark renders a fixed scene six times and offers results in terms of time and rays per second. The official benchmark tables list user submitted results in terms of time, however I feel rays per second is a better metric (in general, scores where higher is better seem to be easier to explain anyway). Corona likes to pile on the threads, so the results end up being very staggered based on thread count.

Rendering: Corona Photorealism

More threads win the day, although the Core i7 does knock at the door of the Ryzen 5 (presumably with $110 in hand as well). It is worth noting that the Core i5-7640X and the older Core i7-2600K are on equal terms.

Blender 2.78: link

For a renderer that has been around for what seems like ages, Blender is still a highly popular tool. We managed to wrap a standard workload into the February 5 nightly build of Blender and measure the time it takes to render the first frame of the scene. Being one of the bigger open-source tools out there, both AMD and Intel work actively to help improve the codebase, for better or for worse on their own/each other's microarchitectures.

Rendering: Blender 2.78

Similar to Corona, more threads means a faster time.

LuxMark v3.1: Link

As a synthetic, LuxMark might come across as somewhat arbitrary as a renderer, given that it's mainly used to test GPUs, but it does offer both an OpenCL and a standard C++ mode. In this instance, aside from seeing the comparison in each coding mode for cores and IPC, we also get to see the difference in performance moving from a C++ based code-stack to an OpenCL one with a CPU as the main host.

Rendering: LuxMark CPU C++

Rendering: LuxMark CPU OpenCL

LuxMark is more thread and cache dependent, and so the Core i7 nips at the heels of the AMD parts that have double the threads. The Core i5 sits behind the Ryzen 5 parts though, due to the 1:3 thread difference.

POV-Ray 3.7.1b4: link

Another regular benchmark in most suites, POV-Ray is another ray-tracer but has been around for many years. It just so happens that during the run up to AMD's Ryzen launch, the code base started to get active again with developers making changes to the code and pushing out updates. Our version and benchmarking started just before that was happening, but given time we will see where the POV-Ray code ends up and adjust in due course.

Rendering: POV-Ray 3.7

Mirror Mirror on the wall...

Cinebench R15: link

The latest version of Cinebench has also become one of those 'used everywhere' benchmarks, particularly as an indicator of single-thread performance. High IPC and high frequency give performance in the ST test, whereas good scaling and many cores win out in the MT test.

Rendering: CineBench 15 SingleThreaded

Rendering: CineBench 15 MultiThreaded

Cinebench's single-threaded numbers make it clear who rules the roost, with scores almost reaching 200. The Core i7-2600K, due to its lack of newer instruction support, sits in the corner.



Benchmarking Performance: CPU Web Tests

One of the issues when running web-based tests is the tendency of modern browsers to automatically install updates. This means any sustained period of benchmarking will invariably fall foul of the 'it's updated beyond the state of comparison' rule, especially when browsers will update if you give them half a second to think about it. Despite this, we were able to find a series of commands to create an un-updatable version of Chrome 56 for our 2017 test suite. While this means we might not be on the bleeding edge of the latest browser, it makes the scores between CPUs comparable.

All of our benchmark results can also be found in our benchmark engine, Bench.

SunSpider 1.0.2: link

The oldest web-based benchmark in this portion of our test is SunSpider. This is a very basic JavaScript algorithm tool, and ends up being more a measure of IPC and latency than anything else, with most high-performance CPUs scoring roughly the same. The basic test is looped 10 times and the average taken. We run this 4 times.

Web: SunSpider on Chrome 56

SunSpider has a single threaded focus, and we see the Kaby Lake-X processors take their spots at the top of the graph.

Mozilla Kraken 1.1: link

Kraken is another JavaScript-based benchmark, using the same test harness as SunSpider, but focusing on more stringent real-world use cases and libraries, such as audio processing and image filters. Again, the basic test is looped ten times, and we run it four times.

Web: Mozilla Kraken 1.1 on Chrome 56

Kraken, too, relies on single-threaded IPC and frequency.

Google Octane 2.0: link

Like Mozilla, Google is a major browser developer, and having peak JS performance is typically a critical asset when comparing against the other OS developers. In the same way that SunSpider is a very early JS benchmark and Kraken is a bit newer, Octane aims to be more relevant to real workloads, especially on power-constrained devices such as smartphones and tablets.

Web: Google Octane 2.0 on Chrome 56

Octane is an interesting benchmark, requiring both cores and ST performance, but mostly the latter. It also seems that either Intel's design is optimized for the benchmark or vice versa, given the substantial difference in performance. There's no way for the benchmark to use all of the threads from AMD, nor the 12 threads of the Core i7-7800X, which has lower single-thread performance.

WebXPRT 2015: link

While the previous three benchmarks run calculations in the background and report a score, WebXPRT is designed to be a better interpretation of visual workloads that a professional user might have, such as browser-based applications, graphing, image editing, sort/analysis, scientific analysis and financial tools.

Web: WebXPRT 15 on Chrome 56

WebXPRT is a mix of ST and MT, but being web-based it still relies heavily on ST performance. Given the variable loading of the benchmark, Intel's newest features such as Speed Shift help keep it at the top.



Benchmarking Performance: CPU Encoding Tests

One of the interesting elements of modern processors is encoding performance. This includes encryption/decryption, as well as video transcoding from one video format to another. In the encrypt/decrypt scenario, this remains pertinent to on-the-fly encryption of sensitive data - a process more modern devices are leaning on for software security. Video transcoding as a tool to adjust the quality, file size and resolution of a video file has boomed in recent years, such as providing the optimum video for devices before consumption, or for game streamers who want to upload the output from their video camera in real-time. As we move into live 3D video, this task will only get more strenuous, and it turns out that the performance of certain algorithms is a function of the input/output of the content.

All of our benchmark results can also be found in our benchmark engine, Bench.

7-Zip 9.2: link

One of the freeware compression tools that offers good scaling performance between processors is 7-Zip. It runs under an open-source licence, and is a fast, easy-to-use tool for power users. We run the benchmark mode via the command line for four loops and take the output score.
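The benchmark mode is invoked as `7z b` from the command line. A hedged sketch of the loop-and-parse step; the parsing assumes the 'Tot:' summary line format and may need adjusting between 7-Zip versions:

```python
import re
import subprocess

def run_7zip_bench(loops=4):
    """Run '7z b' several times and average the overall MIPS rating."""
    scores = []
    for _ in range(loops):
        out = subprocess.run(["7z", "b"], capture_output=True, text=True).stdout
        # The summary line looks like "Tot: <usage> <r/u> <rating>";
        # we take the final number as the overall rating.
        tot = [l for l in out.splitlines() if l.strip().startswith("Tot:")][-1]
        scores.append(int(re.findall(r"\d+", tot)[-1]))
    return sum(scores) / len(scores)

print(f"Average 7-Zip rating: {run_7zip_bench():.0f} MIPS")
```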

Encoding: 7-Zip

7z loves threads. 7z loves them.

WinRAR 5.40: link

For the 2017 test suite, we move to the latest version of WinRAR in our compression test. WinRAR in some quarters is more user-friendly than 7-Zip, hence its inclusion. Rather than use a benchmark mode, as we did with 7-Zip, here we take a set of files representative of a generic stack (33 video files totalling 1.37 GB, and 2834 smaller website files in 370 folders totalling 150 MB) of compressible and incompressible formats. The results shown are the time taken to encode the files. Due to DRAM caching, we run the test 10 times and take the average of the last five runs, when the benchmark is in a steady state.

Encoding: WinRAR 5.40

WinRAR is another benchmark like Agisoft, with some parts being serial and others multithreaded. When we compare the Core i7 to the Ryzen 7, the high ST performance helps push the Core i7 to the top despite the 2:1 thread deficit. On the other hand, the Core i5 has a 3:1 thread deficit to the Ryzen 5, and falls beneath it in the results.

AES Encoding

Algorithms using AES coding have spread far and wide as a ubiquitous tool for encryption. Again, this is another CPU limited test, and modern CPUs have special AES pathways to accelerate their performance. We often see scaling in both frequency and cores with this benchmark. We use the latest version of TrueCrypt and run its benchmark mode over 1GB of in-DRAM data. Results shown are the GB/s average of encryption and decryption.
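A similar in-memory measurement can be approximated with any AES-NI-backed library. Below is a sketch using Python's cryptography package over 1GB of DRAM-resident data; note that our published numbers come from TrueCrypt's built-in benchmark, not this:

```python
import os
import time
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

data = os.urandom(1 << 30)                 # 1 GB of in-DRAM data
key, iv = os.urandom(32), os.urandom(16)

# AES-256 in CTR mode; AES-NI acceleration is used where available.
encryptor = Cipher(algorithms.AES(key), modes.CTR(iv)).encryptor()

start = time.perf_counter()
ciphertext = encryptor.update(data) + encryptor.finalize()
elapsed = time.perf_counter() - start

print(f"AES-256-CTR encryption: {1.0 / elapsed:.2f} GB/s")
```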

Encoding: AES

AES is an optimized problem for modern processors, so add frequency and cores to get a proportionally better result. Again, the Core i7-2600K and the Core i5-7640X are almost neck-and-neck.

HandBrake v1.0.2 H264 and HEVC: link

As mentioned above, video transcoding (both encode and decode) is a hot topic in performance metrics, as more and more content is being created. The first consideration is the standard in which the video is encoded, which can be lossless or lossy, trade performance for file size, trade quality for file size, or all of the above. Alongside Google's favorite codec, VP9, two others have taken hold: H264, the older codec, is practically everywhere and designed to be optimized for 1080p video, while HEVC (or H265) aims to provide the same quality as H264 at a lower file size (or better quality at the same size). HEVC is important as 4K is streamed over the air, meaning fewer bits need to be transferred for the same quality content.

Handbrake is a favored tool for transcoding, and so our test regime covers three areas.

Low Quality/Resolution H264: Here we transcode a 640x266 H264 rip of a two-hour film, and change the encoding from Main profile to High profile, using the very-fast preset.
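For reference, the rough command-line equivalent of this test through HandBrakeCLI looks like the following; the file names are placeholders and exact flag spellings may vary by HandBrake version:

```python
import subprocess

# Approximate HandBrakeCLI equivalent of our LQ test: re-encode an H264
# file from Main to High profile with the very-fast x264 preset.
cmd = [
    "HandBrakeCLI",
    "-i", "film_640x266.mp4",        # placeholder input path
    "-o", "film_high_profile.mp4",   # placeholder output path
    "--encoder", "x264",
    "--encoder-preset", "veryfast",
    "--encoder-profile", "high",
]
subprocess.run(cmd, check=True)
```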

Encoding: Handbrake H264 (LQ)

High Quality/Resolution H264: A similar test, but this time we take a ten-minute double 4K (3840x4320) file running at 60 Hz and transcode from Main to High, using the very-fast preset.

Encoding: Handbrake H264 (HQ)

HEVC Test: Using the same video in HQ, we change the resolution and codec of the original video from 4K60 in H264 into 4K60 HEVC.

Encoding: Handbrake HEVC (4K)



Benchmarking Performance: CPU Office Tests

The office programs we use for benchmarking aren't specific programs per se, but industry-standard tests that hold weight with professionals. The goal of these tests is to use an array of software and techniques that a typical office user might encounter, such as video conferencing, document editing, architectural modeling, and so on and so forth.

All of our benchmark results can also be found in our benchmark engine, Bench.

Chromium Compile (v56)

Our new compilation test uses Windows 10 Pro, VS Community 2015.3 with the Win10 SDK to compile a nightly build of Chromium. We've fixed the test to a build from late March 2017, and we run a fresh full compile in our test. Compilation is the typical example of a variable threaded workload - some of the compiling and linking is linear, whereas other parts are multithreaded.
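For context, a Chromium build boils down to generating ninja build files and timing the compile of the 'chrome' target. A rough sketch of the sequence (the output directory name is arbitrary, and the 2017-era toolchain setup steps are omitted):

```python
import subprocess
import time

# Generate ninja build files, then time a full build of the 'chrome' target.
# 'out/Default' is an arbitrary output directory name.
subprocess.run(["gn", "gen", "out/Default"], check=True)

start = time.perf_counter()
subprocess.run(["ninja", "-C", "out/Default", "chrome"], check=True)
print(f"Full compile: {(time.perf_counter() - start) / 60:.1f} minutes")
```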

Office: Chromium Compile (v56)

Our Chromium Compile test is a mix of loads, but also loves L3 cache. We've seen before that the L3 victim cache on AMD can be a deficit here, but even then, the Core i5 cannot overcome the 3:1 thread deficit to the Ryzen 5 CPUs. The Core i7-7740X hits the nail on the head for threads and single thread performance, although users that play in this space would look straight to the Core i7-7800X, and likely decide that +16.5% better performance is worth the +18.2% extra cost.

PCMark8: link

Despite its age, Futuremark has maintained PCMark8 to remain relevant in 2017. On the scale of complicated tasks, PCMark focuses more on the low-to-mid range of professional workloads, making it a good indicator of what people consider 'office' work. We run the benchmark from the command line in 'conventional' mode, meaning C++ over OpenCL, to remove the graphics card from the equation and focus purely on the CPU. PCMark8 offers Home, Work and Creative workloads, with some software tests shared and others unique to each benchmark set.

Office: PCMark8 Creative (non-OpenCL)

Office: PCMark8 Home (non-OpenCL)

Office: PCMark8 Work (non-OpenCL)

SYSmark 2014 SE: link

SYSmark is developed by BAPCo, a consortium of industry CPU companies. The goal of SYSmark is to take stripped-down versions of popular software, such as Photoshop and OneNote, and measure how long it takes to process certain tasks within that software. The end result is a score for each of the three segments (Office, Media, Data) as well as an overall score. A reference system (Core i3-6100, 4GB DDR3, 256GB SSD, integrated HD 530 graphics) is used to provide a baseline score of 1000 in each test.

A note on context for these numbers: AMD left BAPCo within the last two years due to differences of opinion on how the benchmarking suites were chosen. AMD believed the tests were angled towards Intel processors, with optimizations showing bigger differences than AMD felt were actually present. The following benchmarks are provided as data, with the conflict of opinion between the two companies on the validity of the benchmark offered as context for the numbers.

Office: SYSMark 2014 SE (Office)

Office: SYSMark 2014 SE (Media)

Office: SYSMark 2014 SE (Data)

Office: SYSMark 2014 SE (Responsiveness)

Office: SYSMark 2014 SE (Overall)



Benchmarking Performance: CPU Legacy Tests

Our legacy tests represent benchmarks that were once at the height of their popularity. Some of these are industry standard synthetics, and we have data going back over 10 years. All of the data here has been rerun on Windows 10, and we plan to go back several generations of components to see how performance has evolved.

All of our benchmark results can also be found in our benchmark engine, Bench.

3D Particle Movement v1

3DPM is a self-penned benchmark, taking basic 3D movement algorithms used in Brownian Motion simulations and testing them for speed. High floating point performance, MHz and IPC win in the single-thread version, whereas the multithreaded version has to handle the threads and loves more cores. This is the original version, written in the style of a typical non-computer-science student coding up an algorithm for their theoretical problem, without any non-obvious optimizations beyond what the compiler performs, and without avoiding pitfalls such as false sharing.
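To give a flavor of what 3DPM does, here is a minimal Python sketch of the same idea, assuming random steps on a unit sphere; it is illustrative only and shares no code with the actual benchmark. The single-thread pass stresses FP throughput and IPC, while the multiprocessing pass scales with cores.

```python
import math
import random
import time
from multiprocessing import Pool

STEPS, PARTICLES = 10_000, 1_000  # illustrative sizes, not the benchmark's

def walk(seed):
    """Move one particle through STEPS random 3D steps and return its drift."""
    rng = random.Random(seed)
    x = y = z = 0.0
    for _ in range(STEPS):
        # Pick a random direction on the unit sphere and take one step.
        theta = rng.uniform(0.0, math.pi)
        phi = rng.uniform(0.0, 2.0 * math.pi)
        x += math.sin(theta) * math.cos(phi)
        y += math.sin(theta) * math.sin(phi)
        z += math.cos(theta)
    return x + y + z

if __name__ == "__main__":
    start = time.perf_counter()
    sum(walk(i) for i in range(PARTICLES))   # single-threaded pass
    st = time.perf_counter() - start

    start = time.perf_counter()
    with Pool() as pool:                      # one worker per core
        pool.map(walk, range(PARTICLES))
    mt = time.perf_counter() - start
    print(f"single: {st:.2f} s  multi: {mt:.2f} s")
```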

Legacy: 3DPM v1 Single Threaded

Legacy: 3DPM v1 MultiThreaded

CineBench 11.5 and 10

Cinebench is a widely known benchmarking tool for measuring performance relative to MAXON's animation software Cinema 4D. Cinebench has been optimized over a decade and focuses purely on CPU horsepower, meaning that if there is a discrepancy in pure throughput characteristics, Cinebench is likely to show it. Arguably other software doesn't make use of all the tools available, so the real-world relevance might be purely academic, but given our large database of Cinebench data, it seems difficult to ignore a small five-minute test. We run the modern version 15 earlier in this review, and the older 11.5 and 10 here because of our back data.

Legacy: CineBench 11.5 Single Threaded

Legacy: CineBench 11.5 MultiThreaded

Legacy: CineBench 10 Single Threaded

Legacy: CineBench 10 MultiThreaded

x264 HD 3.0

Similarly, the x264 HD 3.0 package we use here is kept for historical regression data. The latest version is 5.0.1, which encodes a 1080p video clip into a high-quality x264 file. Version 3.0 performs the same test on a 720p file, and in most circumstances the software hits its performance limit on high-end processors, but it still works well for mainstream and low-end chips. This version also takes only a few minutes to run, whereas the latest can take over 90 minutes.
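The test itself is a two-pass encode, which can be approximated with the standalone x264 binary as sketched below; the source filename and bitrate are placeholders, not the benchmark's exact parameters.

```python
import subprocess

# Hypothetical 720p source; the x264 binary must be on the PATH. The
# benchmark reports frames per second for each of the two passes.
SRC, LOG = "source_720p.y4m", "x264_stats.log"

for p in ("1", "2"):
    subprocess.run([
        "x264", "--pass", p, "--bitrate", "4000", "--stats", LOG,
        # Pass 1 only gathers statistics, so its video output is discarded.
        "-o", "NUL" if p == "1" else "out.mkv", SRC,
    ], check=True)
```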

Legacy: x264 3.0 Pass 1

Legacy: x264 3.0 Pass 2



Civilization 6

First up in our CPU gaming tests is Civilization 6. Originally penned by Sid Meier and his team, the Civ series of turn-based strategy games is a cult classic, and many an excuse for an all-nighter trying to get Gandhi to declare war on you due to an integer overflow. Truth be told, I never actually played the first version, but I have played every edition from the second to the sixth, including the fourth as voiced by the late Leonard Nimoy. It is a game that is easy to pick up, but hard to master.

Benchmarking Civilization has always been somewhat of an oxymoron: for a turn-based strategy game, the frame rate is not necessarily the important thing, and in the right mood, something as low as 5 frames per second can be enough. With Civilization 6, however, Firaxis went hardcore on visual fidelity, trying to pull you into the game. As a result, Civilization can be taxing on graphics and CPUs as we crank up the details, especially in DirectX 12.

Perhaps a more poignant benchmark would be during the late game, when in older versions of Civilization it could take 20 minutes to cycle through the AI players before the human regained control. The new version of Civilization has an integrated 'AI Benchmark', although it is not yet part of our benchmark portfolio due to technical issues we are trying to solve. Instead, we run the graphics test, which provides an example of a mid-game setup at our settings.

At both 1920x1080 and 4K resolutions, we run the same settings. Civilization 6 has sliders for MSAA, Performance Impact and Memory Impact. The latter two refer to detail and texture size respectively, and are rated from 0 (lowest) to 5 (extreme). We run our Civ6 benchmark at position four for performance (ultra) and 0 for memory, with MSAA set to 2x.

For reviews where we include 8K and 16K benchmarks (Civ6 allows us to benchmark extreme resolutions on any monitor) on our GTX 1080, we run the 8K tests similar to the 4K tests, but the 16K tests are set to the lowest option for Performance.

For all our results, we show the average frame rate at 1080p first. Mouse over the other graphs underneath to see 99th percentile frame rates and 'Time Under' graphs, as well as results for other resolutions. All of our benchmark results can also be found in our benchmark engine, Bench.

MSI GTX 1080 Gaming 8G Performance [graphs: 1080p / 4K / 8K / 16K]

ASUS GTX 1060 Strix 6GB Performance [graphs: 1080p / 4K]

Sapphire R9 Fury 4GB Performance [graphs: 1080p / 4K]

Sapphire RX 480 8GB Performance [graphs: 1080p / 4K]

Civilization 6 Conclusion

In all our testing scenarios, AMD wins at 1080p with minor margins on the frame rates but considerable gains in the time under analysis. Intel pushes ahead in almost all of the 4K results, except with the time under analysis at 4K using an R9 Fury, perhaps indicating that AMD is offering a steadier range in its frame rate, despite the average being lower.



Ashes of the Singularity: Escalation

Seen as the holy child of DirectX 12, Ashes of the Singularity (AoTS, or just Ashes) was the first title to actively explore as many of DirectX 12's features as it possibly could. Stardock, the developer behind the Nitrous engine that powers the game, has ensured that the real-time strategy title takes advantage of multiple cores and multiple graphics cards, in as many configurations as possible.

As a real-time strategy title, Ashes is all about responsiveness, during both wide-open shots and concentrated battles. With DirectX 12 at the helm, the ability to issue more draw calls per second allows the engine to render substantial unit depth and effects that other RTS titles had to achieve with combined draw calls, which ultimately made some of their combined unit structures very rigid.

Stardock clearly understands the importance of an in-game benchmark, ensuring that such a tool was available and capable from day one; with all the additional DX12 features in use, being able to characterize how they affected the title was important for the developer. The in-game benchmark performs a four-minute fixed-seed battle environment with a variety of shots, and outputs a vast amount of data to analyze.

For our benchmark, we run a fixed v2.11 version of the game due to some peculiarities of the splash screen added after the merger with the standalone Escalation expansion, and have an automated tool to call the benchmark on the command line. (Prior to v2.11, the benchmark also supported 8K/16K testing, however v2.11 has odd behavior which nukes this.)

At both 1920x1080 and 4K resolutions, we run the same settings. Ashes has dropdown options for MSAA, Light Quality, Object Quality, Shading Samples, Shadow Quality, Textures, and separate options for the terrain. There are several presets, from Very Low to Extreme: we run our benchmarks at the Extreme settings, and take the frame-time output for our average, percentile, and time under analysis.

For all our results, we show the average frame rate at 1080p first. Mouse over the other graphs underneath to see 99th percentile frame rates and 'Time Under' graphs, as well as results for other resolutions. All of our benchmark results can also be found in our benchmark engine, Bench.

MSI GTX 1080 Gaming 8G Performance [graphs: 1080p / 4K]

ASUS GTX 1060 Strix 6GB Performance [graphs: 1080p / 4K]

Sapphire R9 Fury 4GB Performance [graphs: 1080p / 4K]

Sapphire RX 480 8GB Performance [graphs: 1080p / 4K]

Ashes Conclusion

Pretty much across the board, no matter the GPU or the resolution, Intel gets the win here. This is most noticeable in the time under analysis, although AMD seems to do better when the faster cards are running at the lower resolution. That's nothing to brag about, though.



Shadow of Mordor

The next title in our testing is a battle of system performance with the open-world action-adventure title Middle Earth: Shadow of Mordor (SoM for short). Produced by Monolith and using the LithTech Jupiter EX engine with numerous detail add-ons, SoM goes for detail and complexity. The main story was written by the same writer as Red Dead Redemption, and the game received Zero Punctuation's Game of the Year in 2014.

A 2014 game is fairly old to be testing now, however SoM has a stable code and player base, and can still stress a PC down to the ones and zeroes. At the time, SoM was unique, offering a dynamic screen resolution setting allowing users to render at high resolutions that are then scaled down to the monitor. This form of natural oversampling was designed to let the user experience a truer vision of what the developers wanted, assuming you had the graphics hardware to power it but had a sub-4K monitor.

The title has an in-game benchmark, which we run via an automated script that implements the graphics settings, selects the benchmark, and parses the frame-time output dumped to the drive. The graphics settings include standard options such as Graphical Quality, Lighting, Mesh, Motion Blur, Shadow Quality, Textures, Vegetation Range, Depth of Field, Transparency and Tessellation. There are standard presets as well.

We run the benchmark at 1080p and at native 4K, using our 4K monitors, at the Ultra preset. Results are averaged across four runs, and we report the average frame rate, 99th percentile frame rate, and time under analysis.
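For readers curious what the parsing step involves, below is a minimal Python sketch of the three metrics we report, assuming a log with one frame time (in milliseconds) per line; the real per-game log formats differ.

```python
# Minimal sketch of the frame-time analysis. The input format is an
# assumption: one frame time in milliseconds per line of the dumped log.
def analyze(path, threshold_fps=60.0):
    with open(path) as f:
        frame_ms = sorted(float(line) for line in f if line.strip())

    total_ms = sum(frame_ms)
    avg_fps = 1000.0 * len(frame_ms) / total_ms

    # 99th percentile frame rate: the frame time that 99% of frames beat.
    p99_ms = frame_ms[int(0.99 * (len(frame_ms) - 1))]
    p99_fps = 1000.0 / p99_ms

    # 'Time under': share of the run spent on frames slower than the target.
    limit_ms = 1000.0 / threshold_fps
    under = sum(ms for ms in frame_ms if ms > limit_ms) / total_ms
    return avg_fps, p99_fps, 100.0 * under

avg, p99, under = analyze("som_frametimes.txt")  # hypothetical log file
print(f"avg {avg:.1f} FPS, 99th pct {p99:.1f} FPS, {under:.1f}% under 60 FPS")
```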

For all our results, we show the average frame rate at 1080p first. Mouse over the other graphs underneath to see 99th percentile frame rates and 'Time Under' graphs, as well as results for other resolutions. All of our benchmark results can also be found in our benchmark engine, Bench.

MSI GTX 1080 Gaming 8G Performance [graphs: 1080p / 4K]

ASUS GTX 1060 Strix 6GB Performance [graphs: 1080p / 4K]

Sapphire R9 Fury 4GB Performance [graphs: 1080p / 4K]

Sapphire RX 480 8GB Performance [graphs: 1080p / 4K]

Shadow of Mordor Conclusions

Again, a win across the board for Intel, with the Core i7 taking the top spot in pretty much every scenario. AMD isn't that far behind for the most part.



Rise of the Tomb Raider

One of the newest games in the gaming benchmark suite is Rise of the Tomb Raider (RoTR), developed by Crystal Dynamics, and the sequel to the popular Tomb Raider, which was loved for its automated benchmark mode. But don't let that fool you: the benchmark mode in RoTR is very different this time around.

Visually, the previous Tomb Raider pushed realism to the limits with features such as TressFX, and the new RoTR goes one stage further when it comes to graphics fidelity. This leads to an interesting set of requirements in hardware: some sections of the game are typically GPU limited, whereas others with a lot of long-range physics can be CPU limited, depending on how the driver can translate the DirectX 12 workload.

Where the old game had one benchmark scene, the new game has three different scenes with different requirements: Geothermal Valley (1-Valley), Prophet's Tomb (2-Prophet) and Spine of the Mountain (3-Mountain), and we test all three. The scenes are designed to be representative of the game, though it has been noted that a scene like 2-Prophet can be the most CPU-limited portion of its entire level, and only a small portion of that level at that. Because of this, we report the results for each scene on each graphics card separately.

Graphics options for RoTR are similar to other games of this type, offering presets or allowing the user to configure texture quality, anisotropic filtering levels, shadow quality, soft shadows, occlusion, depth of field, tessellation, reflections, foliage, bloom, and features like PureHair, which builds on TressFX from the previous game.

Again, we test at 1920x1080 and 4K using our native 4K displays. At 1080p we run the High preset, while at 4K we use the Medium preset which still takes a sizable hit in frame rate.

It is worth noting that RoTR is a little different to our other benchmarks in that it keeps its graphics settings in the registry rather than a standard ini file, and unlike the previous TR game the benchmark cannot be called from the command-line. Nonetheless we scripted around these issues to automate the benchmark four times and parse the results. From the frame time data, we report the averages, 99th percentiles, and our time under analysis.

For all our results, we show the average frame rate at 1080p first. Mouse over the other graphs underneath to see 99th percentile frame rates and 'Time Under' graphs, as well as results for other resolutions. All of our benchmark results can also be found in our benchmark engine, Bench.

#1 Geothermal Valley

MSI GTX 1080 Gaming 8G Performance [graphs: 1080p / 4K]

ASUS GTX 1060 Strix 6GB Performance [graphs: 1080p / 4K]

Sapphire R9 Fury 4GB Performance [graphs: 1080p / 4K]

Sapphire RX 480 8GB Performance [graphs: 1080p / 4K]

RoTR: Geothermal Valley Conclusions

If we were only looking at a single GTX 1080 at 1080p, the graph might look a little odd. All the quad-core, non-HT processors (so, the Core i5s) get the best frame rates and percentiles in this specific test on this specific hardware, by a good margin. The rest of the tests do not mirror that result, though, with the results ping-ponging between Intel and AMD depending on the resolution and the graphics card.

#2 Prophet's Tomb 

MSI GTX 1080 Gaming 8G Performance [graphs: 1080p / 4K]

ASUS GTX 1060 Strix 6GB Performance [graphs: 1080p / 4K]

Sapphire R9 Fury 4GB Performance [graphs: 1080p / 4K]

Sapphire RX 480 8GB Performance [graphs: 1080p / 4K]

RoTR: Prophet's Tomb Conclusions

For Prophet's Tomb, we again see the Core i5s pull a win at 1080p using the GTX 1080, but the rest of the tests are a mix of results, some siding with AMD and others for Intel. There is the odd outlier in the Time Under analysis, which may warrant further inspection.

#3 Spine of the Mountain 

MSI GTX 1080 Gaming 8G Performance [graphs: 1080p / 4K]

ASUS GTX 1060 Strix 6GB Performance [graphs: 1080p / 4K]

Sapphire R9 Fury 4GB Performance [graphs: 1080p / 4K]

Sapphire RX 480 8GB Performance [graphs: 1080p / 4K]

RoTR: Spine of the Mountain Conclusions

Core i5, we're assigning you to run at 1080p with a GTX 1080. That's an order. The rest of you, stand easy.



Rocket League

Hilariously simple pick-up-and-play games are great fun. I'm a massive fan of the Katamari franchise for that reason: pressing start on a controller and rolling around, picking up things to get bigger, is extremely simple. Until we get a PC version of Katamari that I can benchmark, we'll focus on Rocket League.

Rocket League combines the elements of pick-up-and-play, allowing users to jump into a game with other people (or bots) to play football with cars, with zero rules. The title is built on Unreal Engine 3, which is somewhat old at this point, but it allows users to run the game on super-low-end systems while still taxing the big ones. Since its release in 2015, it has sold over 5 million copies and seems to be a fixture at LANs and game shows. Competitive users get very serious, playing in teams and leagues, and with very few settings to configure, everyone is on the same level. Rocket League is quickly becoming one of the favored titles for e-sports tournaments, especially as contests can be viewed directly from the game interface.

Based on these factors, plus the fact that it is an extremely fun title to load and play, we set out to find the best way to benchmark it. Unfortunately, automatic benchmark modes for games are few and far between. Partly because of this, but also because it is built on the Unreal 3 engine, Rocket League does not have a benchmark mode. In this case, we have to develop a consistent run and record the frame rate.

Read our initial analysis on our Rocket League benchmark on low-end graphics here.

Since Rocket League has no benchmark mode, we perform a series of automated actions, similar to a racing game having a fixed number of laps. We take the following approach: using Fraps to record the time taken to show each frame (and the overall frame rates), we use an automation tool to set up a consistent 4v4 bot match on easy, with the system applying a series of inputs throughout the run, such as switching camera angles and driving around.

It turns out that this method is nicely indicative of a real bot match: the car drives up walls, boosts, and even puts in the odd assist, save and/or goal, as weird as that sounds for an automated set of commands. To maintain consistency, the commands we apply are not random but time-fixed, and we also keep the map the same (Aquadome, known to be a tough map for GPUs due to water/transparency) and the car customization constant. We start recording just after a match starts, and record for 4 minutes of game time (think 5 laps of a DIRT: Rally benchmark), with average frame rates, 99th percentile and frame times all provided.
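The sketch below illustrates the time-fixed input idea using the third-party pyautogui package; the timings and key bindings are hypothetical, not our actual script.

```python
import time
import pyautogui  # third-party; replays the same key presses every run

# A fixed, time-stamped script of inputs so every run of the bot match is
# identical. Keys and timings are hypothetical, not Rocket League defaults.
SCRIPT = [
    (5.0,  "w", 3.0),      # (start at t seconds, key, hold duration): drive
    (10.0, "c", 0.2),      # switch camera angle
    (40.0, "space", 0.2),  # jump
]

t0 = time.time()
for start, key, hold in SCRIPT:
    # Wait until the scheduled timestamp, then hold the key for 'hold' seconds.
    time.sleep(max(0.0, start - (time.time() - t0)))
    pyautogui.keyDown(key)
    time.sleep(hold)
    pyautogui.keyUp(key)
```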

The graphics settings for Rocket League come in four broad, generic settings: Low, Medium, High and High FXAA. There are advanced settings in place for shadows and details; however, for these tests, we keep to the generic settings. For both 1920x1080 and 4K resolutions, we test at the High preset with an unlimited frame cap.

For all our results, we show the average frame rate at 1080p first. Mouse over the other graphs underneath to see 99th percentile frame rates and 'Time Under' graphs, as well as results for other resolutions. All of our benchmark results can also be found in our benchmark engine, Bench.

MSI GTX 1080 Gaming 8G Performance [graphs: 1080p / 4K]

ASUS GTX 1060 Strix 6GB Performance [graphs: 1080p / 4K]

Sapphire R9 Fury 4GB Performance [graphs: 1080p / 4K]

Sapphire RX 480 8GB Performance [graphs: 1080p / 4K]

Rocket League Conclusions

The map we use in our testing, Aquadome, is known to be strenuous on a system, hence frame rates are lower than what people expect for Rocket League; we are trying to cover the worst-case scenario. But the results also show that AMD CPUs and NVIDIA GPUs do not seem to be playing ball with each other, which we've been told is likely related to drivers. The AMD GPUs work fine here regardless of resolution, and both AMD and Intel CPUs get in the mix.



Grand Theft Auto

The highly anticipated iteration of the Grand Theft Auto franchise hit the shelves on April 14th 2015, with both AMD and NVIDIA in tow to help optimize the title. GTA doesn’t provide graphical presets, but opens up the options to users and extends the boundaries by pushing even the hardest systems to the limit using Rockstar’s Advanced Game Engine under DirectX 11. Whether the user is flying high in the mountains with long draw distances or dealing with assorted trash in the city, when cranked up to maximum it creates stunning visuals but hard work for both the CPU and the GPU.

For our test we have scripted a version of the in-game benchmark. The in-game benchmark consists of five scenarios: four short panning shots with varying lighting and weather effects, and a fifth action sequence that lasts around 90 seconds. We use only the final part of the benchmark, which combines a flight scene in a jet followed by an inner city drive-by through several intersections followed by ramming a tanker that explodes, causing other cars to explode as well. This is a mix of distance rendering followed by a detailed near-rendering action sequence, and the title thankfully spits out frame time data.

There are no presets for the graphics options on GTA, allowing the user to adjust options such as population density and distance scaling on sliders, while others, such as texture/shadow/shader/water quality, run from Low to Very High. Other options include MSAA, soft shadows, post effects, shadow resolution and extended draw distance options. There is a handy option at the top which shows how much video memory the options are expected to consume, with obvious repercussions if a user requests more video memory than is present on the card (although there's no obvious indication if you have a low-end GPU with lots of GPU memory, like an R7 240 4GB).

To that end, we run the benchmark at 1920x1080 using an average of Very High on the settings, and also at 4K using High on most of them. We take the average results of four runs, reporting frame rate averages, 99th percentiles, and our time under analysis.

For all our results, we show the average frame rate at 1080p first. Mouse over the other graphs underneath to see 99th percentile frame rates and 'Time Under' graphs, as well as results for other resolutions. All of our benchmark results can also be found in our benchmark engine, Bench.

MSI GTX 1080 Gaming 8G Performance [graphs: 1080p / 4K]

ASUS GTX 1060 Strix 6GB Performance [graphs: 1080p / 4K]

Sapphire R9 Fury 4GB Performance [graphs: 1080p / 4K]

Sapphire RX 480 8GB Performance [graphs: 1080p / 4K]

Grand Theft Auto Conclusions

Looking through the data, there seems to be a difference between the results with an AMD GPU and an NVIDIA GPU. With the GTX 1080, there's a mix of AMD and Intel results, but Intel takes a beating in the Time Under analysis at 1080p. The GTX 1060 is a mix at 1080p, but Intel takes the lead at 4K. When an AMD GPU is paired with the processor, all flags fly Intel.



Power Consumption

With the two Kaby Lake-X processors, the main comparison for power consumption is going to be against their Kaby Lake-S counterparts on the mainstream platform. For an extra 100 MHz, Intel is moving from a 95W TDP to a 112W TDP, indicating that these processors will need better cooling – but when we tested KBL-S we were rather surprised by the power, especially on the Core i5.

Power: Total Package (1T)

Power: Total Package (Full Load)

The Core i5-7640X almost perfectly matches up with its Core i5 counterparts, even beating the Core i5-7600K in every situation but especially at load. For users who want to overclock near to the official TDP, it looks like there’s some headroom (other circumstances permitting). The Core i7-7740X is noticeably more power hungry at full load, hitting 79W, but lower than the Core i7-7700K at 87W.

Overclocking to 5.0 GHz

One of the elements promoted at the launch of Basin Falls was overclockability: Intel ran a private, invite-only extreme overclocking workshop with sub-zero coolants that week, which was then followed by ASUS and GIGABYTE both stepping it up another gear by using liquid helium rather than liquid nitrogen. For some of these chips, the colder the better, and liquid helium is going to be the limit for cooling (and cost), given the inherent difficulty in using sub-zero coolants for records.

Those sub-zero events for records are just on single benchmarks, or single frequency records, and are not meant to be run at those speeds all day every day. This makes the air and water overclocking potential of the CPUs more important for regular users and enthusiasts.

When Kaby Lake-S was launched, we saw good overclocking performance on the Core i7-7700K: our set of 4.2 GHz base processors reached a range of 4.6 to 4.8 GHz, giving an 8-15% overclock, and given some of the numbers we had seen in the wild, the consistency between the processors meant that most 7700K CPUs could reach that level. One reason for such good performance was that Intel had incorporated a new overclocking feature, the AVX offset. This allows overclockers to reduce the frequency at which AVX code runs, as AVX can be the limiting factor for a stable overclock (for example, a 5.0 GHz core clock with a -10 offset runs AVX-heavy code at 4.0 GHz). This feature is in Kaby Lake-X as well.

The end result is that the Core i7-7740X went one further in our testing, and hit 5.0 GHz all-cores at 1.25V, and completed our CPU testing suite with a -10 AVX offset. We went for 5.1 GHz all-core, up to 1.35 volts, but the temperatures were getting high and the system was still not stable. 5.0 GHz is a very respectable result, even if we only have a sample of one.

The results at 5.0 GHz are as follows. Gain is shown for the 5.0 GHz run relative to stock; anything within 2.0% should be considered a tie.

Comparison: Intel Core i7-7740X at Stock vs 5.0 GHz
Stock: 4.3 GHz Base / 4.5 GHz Turbo / Auto AVX
Overclocked: 5.0 GHz Base / 5.0 GHz Turbo / 4.0 GHz AVX

Benchmark | Stock Score/Time | 5.0 GHz Score/Time | Gain
Web Benchmarks | | |
Sunspider | 135 ms | 133 ms | +1.6%
Kraken | 622 ms | 642 ms | -3.1%
Octane | 45485 | 45279 | -0.5%
WebXPRT15 | 561 | 570 | +1.5%
System Tests | | |
PDF Opening | 2229 ms | 2055 ms | +8.5%
FCAT | 46.172 s | 45.656 s | +1.1%
3DPM 2.1 | 1287 | 1429 | +11.0%
Dolphin v5 | 282 s | 280 s | +0.7%
Agisoft Total | 891.56 s | 830.06 s | +7.4%
Rendering | | |
Corona | 2474593 | 2660002 | +7.5%
Blender | 436 s | 485 s | -10.1%
Cinebench 15 ST | 195 | 215 | +10.6%
Cinebench 15 MT | 986 | 1084 | +10.0%
Encoding | | |
7-Zip | 27433 | 29433 | +8.9%
WinRAR | 48.698 s | 46.633 s | +4.4%
AES | 5.5 GB/s | 6.2 GB/s | +12.7%
Handbrake HEVC | 25.2 FPS | 22.6 FPS | -10.2%
Office Tests | | |
PCMark8 Creative | 4878 | 5084 | +4.2%
PCMark8 Home | 4844 | 5071 | +4.7%
PCMark8 Work | 4110 | 3984 | -3.1%
Chrome Compile | 19.45 per day | 17.70 per day | -9.0%
GTX 1080 | | |
1080p Ashes Avg | 72.56 FPS | 76.19 FPS | +4.8%
1080p Ashes 99% | 53.95 FPS | 55.45 FPS | +2.7%
4K Ashes Avg | 62.30 FPS | 63.46 FPS | +1.8%
4K Ashes 99% | 41.24 FPS | 40.79 FPS | -1.1%
1080p RoTR-1 Avg | 103.66 FPS | 106.9 FPS | +3.0%
1080p RoTR-1 99% | 85.14 FPS | 85.88 FPS | +0.9%
4K RoTR-1 Avg | 61.60 FPS | 61.33 FPS | -0.4%
4K RoTR-1 99% | 50.25 FPS | 50.43 FPS | +0.3%

There are a few obvious trends, and a couple of red flags. Most benchmarks split into two groups: benchmarks with a heavy CPU component tend to get a 4-12% gain when overclocked, while those that require a lot of DRAM involvement or AVX tend to sit in the 2% region. The red flags are the three big 9-10% losses in CPU-heavy benchmarks: Blender, Handbrake HEVC and the Chromium compile.

All three benchmarks are sizable workloads with sustained high frequencies throughout, and have heavy AVX components as well. The 4.0 GHz setting for AVX could account for most of that difference, however other AVX benchmarks in our testing (Corona, Cinebench, elements of PCMark, Agisoft) do not have the same effect. The only real difference here is time: the amount of time each benchmark stays at a sustained load.

The three that had the biggest dips are our longest benchmarks: Blender at 8 minutes, Handbrake HEVC at 25 minutes, and the Chromium compile at over an hour. In this case it seems we are hitting thermal limits on the power delivery, as explained by Igor Wallossek over at Tom's Hardware. He tested the emerging theory that early X299 boards are not up to the task of cooling their VRMs under heavy load, and his analysis determined that this was likely to occur in heavily overclocked scenarios: the overclocked systems he tested would hit thermal limits, throttle back down, and then ramp up again in a cyclical manner. His testing was on Skylake-X, which draws a lot more power when overclocked than our KBL-X setup here, so it likely isn't affecting our setup as much, but still enough for certain benchmarks. I fully suspect we will see second-wave X299 motherboards with substantial heatsinks on the power delivery to overcome this.

More testing is needed, especially in upcoming X299 motherboard reviews. We used an early X299 prototype for our CPU testing here, and a Gaming 9 motherboard on the GPU tests (which showed little variation and no problems). We did take power numbers for this setup but when looking back on the results, they seemed off so we need to go back and re-test those.



The Fastest for Serial Workloads

If you asked ‘what made the best processor’ fifteen years ago, the obvious answers were performance, power and price. As time has marched on, this list has started to include integrated graphics, bandwidth, platform integration, platform upgradability, core-to-core latency, and of course, cores. Marching up from a single x86 core through to CPUs that carry 10 cores for consumers, 28 cores for enterprise and 72 cores for add-in cards makes the task of creating a perfect processor almost impossible – there is no way to satisfy all of the properties that build a processor today. Both AMD and Intel start from basic building blocks (a single core) and then configure processors around that core, adding in more cores, connectivity, and then binning to the right voltage/frequency and pricing appropriately. The end result is a stack of processors aimed at different segments of the market.

The pair of Kaby Lake-X processors cover one main area listed above more than any others: core performance. By having the latest CPU microarchitecture and placing it on the newest high-end desktop platform there is room at the top for more frequency leading to a higher pure performance product. As a byproduct these CPUs are power efficient, leading to a high performance per watt, and are situated in a platform with extensive IO options. Ultimately this is where the Kaby Lake-X customer will sit: someone who wants high single thread performance but is not after massive multi-core performance. This would typically cover the majority of gamers and enthusiasts, but not necessarily content creators.

The benefits in the benchmarks are clear against the nearest competition: these are the fastest CPUs to open a complex PDF, at the top for office work, and at the top for most web interactions by a noticeable amount.

The downsides are pure throughput workflows, such as neuron simulation, rendering and non-video encoding.

The parts in the middle are the ones to dissect, and these get interesting. Let me pull up a few graphs that illustrate this middle of the road position: Chromium Compilation, Agisoft Photoscan and WinRAR.

Office: Chromium Compile (v56)

System: Agisoft Photoscan 1.0 Total Time

Encoding: WinRAR 5.40

These three results show the Core i7-7740X performing above any AMD chips of similar price, but the Core i5-7640X performing below any Ryzen 7 or Ryzen 5 parts. This comes down to the workload in each of these benchmarks, and how the processor configurations handle it. All three of these real-world benchmarks are variable-threaded workloads: some elements are serialized and rely on high single-thread performance, while other elements are fully parallelizable and can take advantage of cores and threads (though sometimes extra threads do not help). The benchmarks are ultimately governed by Amdahl's Law, where single thread speed affects the whole test, but multiple threads only help the parallelizable parts. With sufficiently parallelizable code, it becomes a balance between the two.
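A back-of-the-envelope Amdahl's Law calculation shows why the 2:1 thread ratio is survivable but the 3:1 ratio is not. The parallel fraction and the single-thread multipliers below are illustrative assumptions, not measured values.

```python
# Amdahl's Law: speedup = 1 / ((1 - p) + p / n) for a parallel fraction p
# on n threads; scale by relative per-thread speed to compare chips.
# The p value and the ~1.3x single-thread advantages are illustrative guesses.
def throughput(st_perf, threads, p=0.85):
    return st_perf / ((1.0 - p) + p / threads)

i7 = throughput(1.30, 8)    # Core i7-7740X: high ST perf, 8 threads
r7 = throughput(1.00, 16)   # Ryzen 7 1700: 16 threads
i5 = throughput(1.25, 4)    # Core i5-7640X: no HT, 4 threads
r5 = throughput(1.00, 12)   # Ryzen 5 1600X: 12 threads

# With these numbers, the i7 edges out the R7 (~1.03x) while the
# thread-starved i5 falls well behind the R5 (~0.76x).
print(f"i7 vs R7: {i7/r7:.2f}x   i5 vs R5: {i5/r5:.2f}x")
```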

For the Core i7-7740X, up against the Ryzen 7 1700 at an equivalent price, the Core i7 has eight threads where the Ryzen 7 has sixteen, but the Core i7 has much higher single thread performance. For these benchmarks, that advantage means that despite having half the cores/threads of the AMD part, the Core i7 can take the lead very easily.

But the Core i5-7640X has a different task. It has four cores, like the Core i7, but no hyperthreading, so it sits at four threads. Its direct competitor, the Ryzen 5 1600X, has six cores with simultaneous multithreading, leading to twelve threads. This gives the AMD processor a 3:1 advantage in threads, and for each of these three benchmarks it can parallelize the code sufficiently that the single thread performance of the Intel CPU is not enough. Moving from a 2:1 ratio with the Core i7 to a 3:1 ratio with the Core i5 is a turning point for ST performance compared to MT performance.

So with the X299 confusion, are these CPUs worth recommending?

When Kaby Lake-X first came out, a number of technology experts were confused at Intel’s plans. It made sense to launch the latest microarchitecture on the high-end desktop platform, although launching it in a quad-core form was an idea out-of-left-field, especially for a platform that is geared towards multiple cores, more memory, and more memory bandwidth. In that paradigm, the Kaby Lake-X is an oddball processor design choice.

There are bigger factors at play however – if Intel launched 6-10 core parts on KBL, it would cannibalize their Skylake-X and Skylake-SP sales. Also, as we’ve seen with Skylake-X CPUs, those enterprise cores are now different to the consumer Skylake-S cores, with different cache structures and AVX-512. So if Intel had launched >4 cores on KBL-X, they would have likely had to scrap Skylake-X.

But that’s a slight tangent.

The Core i7-7740X appeals to users who want the fastest out-of-the-box single thread x86 processor on the market today. This means financial traders, gamers, and professionals working with serial code bases, or anyone with deep pockets that might think about upgrading to Skylake-X in the future. Enthusiast overclockers are likely to find the better binned CPUs fun as well.

That’s if you do not mind paying a premium for the X299 platform. For users who mind the cost, the Core i7-7700K is 98% of the way there on performance, saves a hundred dollars on the motherboard, and offers the same functionality. In some of our benchmarks, where more cores helped despite the high single thread performance, spending a little more on the six-core Skylake-X Core i7-7800X is beneficial: for example, Luxmark and POV-Ray scored +33% for the 7800X over the 7740X.

The Core i7-7740X makes sense in a number of niche scenarios. By contrast, the Core i5-7640X doesn’t make much sense at all. There’s still the benefit of high single-thread performance and some good gaming performance in older titles, but in variable-threaded workloads it loses to AMD’s processors, sometimes by as much as 45%. For a chip that comes in at $242, users should expect to pay about the same for a motherboard, whereas either an AMD part or the Core i5-7600K can go in a $120 motherboard and still be overclocked.

There are only two scenarios I can see where the Core i5 adds up. First, users who just want to get onto X299 now and upgrade to a bigger CPU for quad-channel memory and more PCIe lanes later. Second, professionals who know their code cannot take advantage of hyperthreading and are happy with the performance. Perhaps in light of the hyperthreading bug (which is severely limited to minor niche edge cases), Intel felt a non-HT version was required.

In our recent CPU Buyers’ Guide (link autoupdates to the latest CPU guide) we suggested the Core i7-7740X for anyone wanting a Peak VR experience, and we still stand by that statement. It has enough threads and the biggest grunt to take on VR and the majority of enthusiast gaming experiences, if a user has pockets big enough.

The recommendations for the new CPUs boil down to platform costs. They are a minor upgrade over the Kaby Lake-S K-series processors on the Z270 platform, a platform that caters to a big audience with a more cost-sensitive structure for motherboards in mind.
