Nvidia Kepler GPU Tested: Smaller, Stronger, Made for Ultrabooks
Nvidia today officially announced its new 600M Series of GPUs, which feature the company's new Kepler architecture. The chip promises 10X the speed of Intel's integrated graphics but with the size and efficiency needed to squeeze into slim Ultrabooks like the Acer Aspire Timeline Ultra M3. We got a sneak peek at the new GPU, and took one for a spin, too. What does Nvidia promise, and does it deliver?
What is it?
Kepler is the code name for Nvidia's new GPU architecture. It is built on a new 28-nanometer manufacturing process, which is smaller than the 40-nm process used in the previous generation, known as Fermi.
Inside, the GPU has 8 geometry units, 32 ROP units, 4 raster units, and a 256-bit GDDR5 memory interface.
Additionally, the 8 streaming multiprocessors (SMX) have been redesigned for greater power efficiency. Inside each SMX are 192 CUDA cores (for a total of 1,536), 16 texture units (for a total of 128), and Polymorph Engine 2.0.
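The chip-wide totals quoted above follow directly from the per-SMX counts. A minimal sketch of that arithmetic, using only the figures in this article (the function and constant names are ours, chosen for illustration):

```python
# Figures quoted in the article for the full Kepler chip.
SMX_COUNT = 8
CUDA_CORES_PER_SMX = 192
TEXTURE_UNITS_PER_SMX = 16


def chip_total(per_smx: int, smx_count: int = SMX_COUNT) -> int:
    """Return the chip-wide total for a resource replicated in every SMX."""
    return per_smx * smx_count


total_cuda_cores = chip_total(CUDA_CORES_PER_SMX)
total_texture_units = chip_total(TEXTURE_UNITS_PER_SMX)
print(total_cuda_cores, total_texture_units)  # 1536 128
```

By the same arithmetic, a part listed with "up to 384" cores would correspond to two SMX units, assuming it keeps the same 192 cores per SMX.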
One way that Nvidia has increased power efficiency is by streamlining the scheduling process. In Fermi, the scheduler included a hardware-based step to ensure that the data being sent through was valid. However, Nvidia realized this was redundant, and was able to remove it.
Polymorph Engine 2.0 improves the GPU's performance on DX11 tessellation, enabling it to deliver double the per-clock performance of the engine in Fermi GPUs.
What can it do?
According to Nvidia, the GT 640M GPU is more than twice as efficient as the GT 540M, and ten times faster than integrated graphics.
The Kepler GPUs will have roughly twice the performance per watt of Fermi GPUs. For example, a 600M GPU has a TDP (Thermal Design Power) of about 25 watts, whereas a 500M GPU has a TDP of about 50 watts. That means notebook makers will be able to fit this GPU into systems with thinner profiles, such as the Acer Aspire Timeline Ultra M3, and still keep all the parts cool. However, don't expect discrete GPUs in Ultrabooks as thin as the MacBook Air; that's a little bit beyond the laws of physics for now.
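To see how the TDP figures relate to the performance-per-watt claim, here is a rough sketch. It assumes, purely for illustration, that the two GPUs deliver the same performance (`relative_performance=1.0`); Nvidia's figures do not say that, and TDP is only a proxy for actual power draw.

```python
def perf_per_watt_ratio(new_tdp_w: float, old_tdp_w: float,
                        relative_performance: float = 1.0) -> float:
    """Ratio of new to old performance per watt.

    relative_performance is new performance divided by old performance.
    The default of 1.0 (equal performance) is an assumption, not a
    measured value.
    """
    return relative_performance * old_tdp_w / new_tdp_w


# Approximate TDPs quoted in the article: 25 W (600M) vs. 50 W (500M).
ratio = perf_per_watt_ratio(new_tdp_w=25.0, old_tdp_w=50.0)
print(ratio)  # 2.0
```

Under that equal-performance assumption, halving the TDP is what doubles performance per watt; any real performance gain on top would push the ratio higher.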
Another feature that will be available to all GeForce 600M GPUs is support for FXAA, Nvidia's new anti-aliasing technology, which, according to Nvidia, can provide frame rates twice as high as when using 4xMSAA. The Kepler-based 600M GPUs will be able to do one better, supporting TXAA, which combines anti-aliasing with other technologies to achieve even smoother lines. According to Nvidia, TXAA1 will be visually equal to 8xMSAA, but use only as many resources as 2xMSAA. Games and developers who have committed to offering TXAA support include "MechWarrior Online," "Secret World," "Eve Online," "Borderlands 2," Unreal Engine 4, BitSquid, Slant Six Games, and Crytek.
Kepler GPUs have a new hardware-based H.264 encoding engine, called NVENC, which is almost four times faster than the CUDA-based encoder, yet consumes less power.
As with the previous generation, Kepler GPUs will also support DirectX 11, Optimus graphics-switching, PhysX, Verde, CUDA, 3D Vision, and 3DTV Play. Unlike the GTX 680 desktop GPU, the mobile processors won't have Nvidia GPU Boost, which can dynamically overclock the GPU as needed.
How powerful is it?
Pretty powerful. Let's compare the Acer Aspire Timeline Ultra M3, which has a 1.7-GHz Intel Core i7-2637M, 4GB of RAM, Nvidia GeForce GT 640M GPU, and a 256GB SSD, with some heavy hitters:
- Alienware M14x: 2.3-GHz Intel Core i7-2820QM, 8GB RAM, Nvidia GeForce GT 555M, 1.5GB VRAM, 750GB, 7,200-rpm hard drive, 1600 x 900 display
- Apple MacBook Pro (15-inch, 2011): 2.2-GHz Intel Core i7-2720QM, 8GB RAM, AMD Radeon HD 6750M, 1GB VRAM, 750GB, 5,400-rpm hard drive, 1440 x 900 display
- HP Pavilion dv7t Quad Edition: 2.0-GHz Intel Core i7-2630QM, 8GB RAM, AMD Radeon HD 6770M, 1GB VRAM, 120GB solid state drive and a 540GB, 7,200-rpm hard drive, 1600 x 900 display
On the graphics benchmark 3DMark06, the GT 640M in the Acer M3 performed better than the AMD Radeon HD 6750M in the MacBook Pro, and came in a hair below the AMD Radeon HD 6770M GPU in the HP Pavilion dv7t. The Alienware M14x, which has an Nvidia GT 555M GPU, finished about 1,400 points higher than the M3.
In our "World of Warcraft" test, where we max out all the settings, the Acer Aspire M3 came out on top with an average of 80 frames per second, beating even the Alienware M14x (77 fps). However, it should be noted that the M3 has the lowest-resolution display of the bunch (1366 x 768), so, all things being equal, we would imagine the M14x coming out slightly ahead. Still, 80 fps is nothing to sneeze at.
Finally, in the more demanding "Far Cry 2" benchmark, the Acer Aspire M3 again came out on top.
We also ran our "World of Warcraft" test on the Acer Aspire M3 using the discrete Nvidia GPU and the integrated Intel HD Graphics 3000 GPU. The results, as you can imagine, were quite disparate.
With the settings on autodetect, and the screen resolution set to 1366 x 768, the integrated GPU notched 35 fps, which is playable, but the Nvidia GPU scored 155 fps, more than four times as high. When we upped the effects to max, the integrated GPU couldn't handle it, but the discrete GPU clocked in at 80 fps.
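For readers who want the exact ratio behind "more than four times as high," here is the arithmetic as a short sketch. The frame rates are the ones we measured above; the variable names are illustrative.

```python
# Average frame rates from our "World of Warcraft" test at 1366 x 768, autodetect.
integrated_fps = 35   # Intel HD Graphics 3000
discrete_fps = 155    # Nvidia GeForce GT 640M

speedup = discrete_fps / integrated_fps
print(f"{speedup:.2f}x")  # 4.43x
```

That is a substantial gap, though in this particular test it falls short of the tenfold advantage over integrated graphics that Nvidia cites.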
We also tested the NVENC encoding engine by converting a 5-minute 1080p MPEG-4 video into an iPod touch format, using a beta version of Cyberlink MediaEspresso. Indeed, the Nvidia GPU was fast, but Intel's Quick Sync technology was even faster.
Where can I find it?
All nine of the mobile GPUs announced today belong to the GeForce 600M series. However, not all use the new Kepler architecture; some use the current Fermi architecture, as you can see in the chart below.
Mainstream: GeForce GT 620M
Performance: GeForce GT 630M, GT 635M, GT 640M LE, GT 640M, GT 650M
Enthusiast: GeForce GTX 660M, GTX 670M, GTX 675M
Kepler Specs
Specification | GT 620M | GT 630M | GT 635M | GT 640M LE | GT 640M | GT 650M |
Process | 28 nm | 28/40 nm | 40 nm | 28 nm | 28 nm | 28 nm |
Architecture | Fermi | Fermi | Fermi | Kepler | Kepler | Kepler |
Cores | Up to 96 | Up to 96 | Up to 144 | Up to 384 | Up to 384 | Up to 384 |
Features | Optimus, PhysX, Verde, CUDA, 3DTV Play | Optimus, PhysX, Verde, CUDA, 3D Vision, 3DTV Play | Optimus, PhysX, Verde, CUDA, 3D Vision, 3DTV Play | Optimus, PhysX, Verde, CUDA, 3D Vision, 3DTV Play | Optimus, PhysX, Verde, CUDA, 3D Vision, 3DTV Play | Optimus, PhysX, Verde, CUDA, 3D Vision, 3DTV Play |
Processor Clock | Up to 625 MHz | Up to 800 MHz | Up to 675 MHz | Up to 500 MHz | Up to 625 MHz | Up to 850 MHz |
Memory | Up to 1GB GDDR3 | Up to 2GB GDDR3 | Up to 2GB GDDR5 | Up to 2GB GDDR3 | Up to 2GB GDDR3 or GDDR5 | Up to 2GB GDDR3 or GDDR5 |
Memory Width | Up to 128-bit | Up to 128-bit | Up to 192-bit | Up to 128-bit | Up to 128-bit | Up to 128-bit |
Bandwidth (GB/s) | Up to 28.8 | Up to 32.0 | Up to 43.2 | Up to 28.8 | Up to 64.0 | Up to 64.0 |
Specification | GTX 660M | GTX 670M | GTX 675M |
Process | 28 nm | 40 nm | 40 nm |
Architecture | Kepler | Fermi | Fermi |
Cores | Up to 384 | Up to 336 | Up to 384 |
Features | Optimus, SLI, PhysX, Verde, CUDA, 3DTV Play | Optimus, SLI, PhysX, Verde, CUDA, 3DTV Play | Optimus, SLI, PhysX, Verde, CUDA, 3DTV Play |
Processor Clock | Up to 835 MHz | Up to 598 MHz | Up to 620 MHz |
Memory Clock | Up to 2000 MHz | Up to 1500 MHz | Up to 1500 MHz |
Memory | Up to 2GB GDDR5 | Up to 3GB GDDR5 | Up to 2GB GDDR5 |
Memory Width | Up to 128-bit | Up to 192-bit | Up to 256-bit |
Bandwidth (GB/s) | Up to 64.0 | Up to 72.0 | Up to 96.0 |
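The bandwidth row for the GTX parts can be reproduced from the memory clock and memory width rows. The sketch below assumes the listed memory clock is doubled to get the effective transfer rate (`transfers_per_clock=2`). That factor is our assumption, chosen because it matches the table, and the function name is illustrative rather than anything from Nvidia.

```python
def memory_bandwidth_gb_s(memory_clock_mhz: float, bus_width_bits: int,
                          transfers_per_clock: int = 2) -> float:
    """Estimate peak memory bandwidth in GB/s (1 GB = 1e9 bytes).

    transfers_per_clock=2 is an assumption: it treats the listed memory
    clock as double-pumped, which reproduces the table's figures.
    """
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_second = memory_clock_mhz * 1e6 * transfers_per_clock
    return transfers_per_second * bytes_per_transfer / 1e9


# (memory clock in MHz, memory width in bits), taken from the table above.
gtx_specs = {
    "GTX 660M": (2000, 128),
    "GTX 670M": (1500, 192),
    "GTX 675M": (1500, 256),
}

for name, (clock_mhz, width_bits) in gtx_specs.items():
    print(f"{name}: {memory_bandwidth_gb_s(clock_mhz, width_bits):.1f} GB/s")
# GTX 660M: 64.0 GB/s
# GTX 670M: 72.0 GB/s
# GTX 675M: 96.0 GB/s
```

The takeaway is that the wider bus on the Fermi-based GTX 675M more than makes up for its lower memory clock, which is why it posts the highest bandwidth of the three.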