GeForce GTX 980 & 970 & Lunar Landing Conspiracies with Maxwell VXGI

I’ve been waiting so long for Maxwell 8xx, but it looks like they jumped straight to 9xx to avoid confusion. Here are some videos. The “product video” has some mildly interesting news (the part about the Cornell box was quite fun [4:56]), but I found the lunar landing one more engaging.

GeForce GTX 980 & 970 Product Video


Debunking Lunar Landing Conspiracies with Maxwell and VXGI

and a lot of slides about the GTX 9xx:

…and everything looks better with images so:




edit: almost forgot about the real gems: 9xx power consumption

Graphics Card Power (W): 980: 165 W, 970: 145 W :slight_smile:
Minimum System Power Requirement: 500 W :smiley:

Meh. My trusty 780 will do for now. I’ll wait for the process shrink and 8GB before I jump into Maxwell.

I. Need. Benchmarks. :o

http://www.anandtech.com/show/8526/nvidia-geforce-gtx-980-review

Wait, doesn’t Tom’s Hardware always include compute tests on the cards? Why didn’t they do it this time?

Sweclockers.com always includes GPGPU tests, and with Blender, yay :stuck_out_tongue:

Use Google Translate to go from Swedish to English or what have you.

For me the 980 was a bit disappointing. The 970 seems to give a bigger bang for the buck, but the non-reference cooling is a no-go for me.
So I’m gonna wait for the GM200-A1 chip, which will probably be a new Titan (2), and for sure it’s gotta be 8GB, since this release is 4GB/8GB (later), the previous one was 3GB/6GB, and Titan/Titan Black got 6GB.

It could maybe happen already in November at the SuperComputing conference in New Orleans; otherwise it’s early 2015. Which is too bad: I need it now, but I will wait.

I was kinda worried about Cycles performance since there are fewer CUDA cores in the 980 compared to the 780 Ti, but those scores settle it for me. I am definitely going to upgrade my 680 to a 980 and not to a 780 Ti. 4GB of VRAM is also quite sexy.

It’s said that a Maxwell CUDA core delivers about 40% more performance than a Kepler core. So 2048 Maxwell cores amount to almost 2880 Kepler cores (2048 × 1.4 ≈ 2867), which explains the performance in the tests I’ve seen so far: it keeps up with or surpasses the 780.

But it was also said that 8gb versions should follow the initial 4gb release, and THAT would be really really nice for Cycles.

All that aside, I think I can hold on to my cash until the next Titan (2) comes along; it was rumored to have 4000 cores and a 512-bit bus. And seeing the 980 come with “only” 2048 cores compared to the current Titan Black’s 2880, I suspect the Maxwell Titan will come with 2880 cores, each again being 40% stronger than a Kepler core, so 2880 Maxwell cores ≈ 4000 Kepler cores, which fits in line with the rumors about Titan 2.
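The 40% rule of thumb used in the last few paragraphs is easy to check as plain arithmetic. This is only a sketch of that reasoning: the 1.4 factor is the hearsay figure from the thread, and the 2880-core Maxwell Titan is pure speculation, not a spec.

```python
# Rule of thumb from the thread (an assumption, not a measured figure):
# one Maxwell CUDA core does about 40% more work than one Kepler core.
MAXWELL_PER_KEPLER = 1.4


def kepler_equivalent(maxwell_cores: int) -> float:
    """Return roughly how many Kepler cores a Maxwell part would match."""
    return maxwell_cores * MAXWELL_PER_KEPLER


# The second entry is the speculated core count, not an announced product.
for name, cores in [("GTX 980", 2048), ("speculated Maxwell Titan", 2880)]:
    print(f"{name}: {cores} Maxwell cores ~ {kepler_equivalent(cores):.0f} Kepler cores")
```

So the 980 lands just under the Titan Black’s 2880 Kepler cores, and the speculated part lands right around the rumored 4000.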

But a 970 with 8GB of RAM and a reference-design cooler could be very tempting also! Too bad they keep putting non-reference coolers on 970s.

Yeah gonna stick with my 2x 4GB 680s for now… I can hope for a Titan 2

What’s with the hate for non-reference coolers? My ASUS Direct CU2 cooler is performing great.

Guys, no, forget the 980, I just saw the benches on Octane and Cycles, it’s SLOWER than a 780! It equals a 580!

From the Octane Render:
"RESULTS - pathtracing, alphashadows off
970: 3.43M/sec (1430mhz core/8224mhz memory)
Hope to get more results soon.

PS.970 performs almost the same as 670 and same as 580 and look at
this overclocked clocks!"

A 780 gets about 7 M/sec!

Why? Simple:

Less fillrate (144.1 GT/s vs. 210 GT/s on the 780 Ti),
Less memory bandwidth (224 GB/s vs. 336 GB/s),
Fewer CUDA kernel ray ops (128 TMUs vs. 240 on the 780 Ti)
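Where the fillrate numbers come from, as a rough sketch: peak texel rate is just TMU count times core clock. The base clocks below are assumed reference values (they are not stated in this thread), and the 210 GT/s / 240 TMU pair matches the reference 780 Ti as far as I can tell.

```python
def texel_fill_rate_gt_s(tmus: int, core_clock_mhz: float) -> float:
    """Peak texture fill rate in gigatexels/s: one texel per TMU per clock."""
    return tmus * core_clock_mhz / 1000.0


# (TMU count, base clock in MHz); the clocks are assumed reference values.
cards = {
    "GTX 980": (128, 1126.0),
    "GTX 780 Ti": (240, 875.0),
}

for name, (tmus, clock) in cards.items():
    print(f"{name}: {texel_fill_rate_gt_s(tmus, clock):.1f} GT/s")
```

These are theoretical peaks only; how much a path tracer actually depends on texel rate is exactly what is being argued here.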

Also:
“However the maximum number of active thread blocks per
multiprocessor has been doubled over SMX to 32, which should result
in an automatic occupancy improvement for kernels that use small
thread blocks of 64 or fewer threads (assuming available registers
and shared memory are not the occupancy limiter).”

But exactly this is NOT the case with Cycles (or Octane): we don’t have “kernels that use small thread blocks”, the Cycles kernel is huge! This is why it compiles badly on AMD.
While this IS better for “simple” gaming shaders, it’s crap for complex CUDA.
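To put numbers on the quoted occupancy claim, here is a minimal sketch. The per-multiprocessor limits are assumptions taken from my reading of NVIDIA’s tuning guides (2048 resident threads on both, 16 resident blocks on Kepler’s SMX, 32 on Maxwell’s SMM), and it deliberately ignores registers and shared memory, which the quote itself flags as possible limiters.

```python
# Assumed per-multiprocessor limits (see lead-in); not taken from this thread.
MAX_THREADS_PER_SM = 2048
MAX_BLOCKS = {"Kepler SMX": 16, "Maxwell SMM": 32}


def occupancy(threads_per_block: int, max_blocks: int) -> float:
    """Theoretical occupancy, ignoring register and shared-memory limits."""
    # Resident blocks are capped by the block limit and by the thread limit.
    blocks = min(max_blocks, MAX_THREADS_PER_SM // threads_per_block)
    return blocks * threads_per_block / MAX_THREADS_PER_SM


for size in (32, 64, 128, 256):
    row = ", ".join(
        f"{arch}: {occupancy(size, limit):.0%}" for arch, limit in MAX_BLOCKS.items()
    )
    print(f"{size:>3} threads/block -> {row}")
```

With 64-thread blocks the block limit caps Kepler at half occupancy while Maxwell can fill the multiprocessor; from 128 threads per block upward the two come out the same. For a huge kernel, register pressure is more likely to be the real limiter, and this sketch does not model that.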

I expected this, and new drivers won’t work miracles. It’s a gaming card (like the 680).
Wait for the GM210 chip! Or get the 780s/Ti (while you can, LOL).

And BTW, VXGI can be run on a 780 also. It’s an algorithm, Maxwell agnostic! Even CUDA agnostic! Licensing is another story…

BTW these tests for CUDA are unfair :slight_smile: , I think a whole new compile for Maxwell architecture would make the test 20~40% faster

I remember the same thing being said about the 680 when it came out, LOL…
The truth is that the raw texel transfer + RAM bandwidth tell the real story. The moment you need textures from RAM or rays per pixel, there is no more bargaining… this was also the case with the 580 vs the 680… no matter how much you optimize, the specs are always the limit. The 580 ate the 680 for lunch.
Sad but true…

Bottom line: NEVER buy a x04 Chip (680,980), always a x10 (580,780,Titan,next 1080?)

@enilnacs :slight_smile: , it is not a driver issue, it just needs its own kernels to “fit”, most likely a mix of the Fermi and Kepler kernels. The Fermi kernels fit the 980 architecture better, and Kepler’s dynamic fetch also works on the 980. A mix of both AND a change in the number of threads per block to 128 (instead of 64, as in both Octane and Cycles) will raise the number A LOT. I expect it to reach 11M/s easily without overclocking. BTW I’m a CUDA developer and I know what I’m saying :smiley:

edit:
the raw texel transfer WON’T affect render times…
here is a simple example:
The GTX 580 has 1.5 TFlops peak; with full optimizations you can reach around 1.2 TFlops of that (about 80%).
The GTX 680 has 3 TFlops, but guess what!! Renders on the GTX 580 are faster!! So its usable throughput is lower, around 1.0 TFlops (33% of peak instead of 80% like the GTX 580).

The GTX Titan has 5.1 TFlops; at 33% that is around 1.7 TFlops, about 40% faster than the GTX 580.

The GTX 980 should work in a similar fashion to the GTX 580, with around 80% throughput, so its 4.5 TFlops would yield around 3.6 TFlops, “which is more than 2x GTX Titan”, but it needs its own kernels to get that much out of it :slight_smile:
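The same estimate spelled out as arithmetic. The peak TFlops and the utilisation fractions are the rough figures from this post, not benchmark results, and the 80% for the 980 is the assumption being argued for.

```python
# (peak TFlops, assumed usable fraction) as quoted in the post; all estimates.
cards = {
    "GTX 580": (1.5, 0.80),
    "GTX 680": (3.0, 0.33),
    "GTX Titan": (5.1, 0.33),
    "GTX 980": (4.5, 0.80),  # assumes Maxwell-specific kernels reach ~80%
}


def effective_tflops(peak: float, utilisation: float) -> float:
    """Usable throughput under the assumed utilisation fraction."""
    return peak * utilisation


effective = {name: effective_tflops(*spec) for name, spec in cards.items()}

for name, value in effective.items():
    print(f"{name}: ~{value:.1f} TFlops usable")

print(f"Titan vs 580: {effective['GTX Titan'] / effective['GTX 580']:.2f}x")
print(f"980 vs Titan: {effective['GTX 980'] / effective['GTX Titan']:.2f}x")
```

Everything hinges on the utilisation column: drop the 980 to Kepler-like 33% and the advantage over the Titan disappears.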

@MohamedSakr
We will see… deal? :wink: Still, you can’t “re-compile” the raw hardware specs :stuck_out_tongue: so with the simple Blender cube, maybe… but the moment you deal with the memory bus or texel rate for textures… I really doubt it.

Updated reply:
One sec, I didn’t talk about TFlops (which ARE optimizable), but raw transfer, and the 580 HAD a 384-bit(!) bus and a hardware scheduler, etc., etc… it was better BY the raw specs.
A 256-bit bus WILL limit your rays/sec when accessing textures, no matter what “optimizations” you do, and so will the TMUs.
Also don’t forget that the 680 was a shrink to 28nm and still got slower; the 980 is not a shrink.
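For the bus-width argument, peak memory bandwidth is bus width in bytes times the effective data rate per pin. A sketch with assumed reference specs (the bus widths and data rates below are not from this thread; the GTX 580 is listed as 384-bit as far as I know):

```python
def bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: bytes per transfer times transfers per second."""
    return bus_width_bits / 8 * data_rate_gbps


# (bus width in bits, effective data rate in Gbps per pin); assumed reference specs.
cards = {
    "GTX 580": (384, 4.008),
    "GTX 780 Ti": (384, 7.0),
    "GTX 980": (256, 7.0),
}

for name, (width, rate) in cards.items():
    print(f"{name}: {bandwidth_gb_s(width, rate):.1f} GB/s")
```

That reproduces the 224 GB/s and 336 GB/s quoted earlier in the thread. It says nothing about how bandwidth-bound a given render is, which is the open question.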

@enilnacs deal :slight_smile: , check the edited post above

If you want a real benchmark, go with Iray :slight_smile: , the NVIDIA folks wrote it and they know how to “squeeze” the GPU :smiley:

Here, the 980 is faster than a 780 TI in Cycles: http://www.sweclockers.com/recension/19332-nvidia-geforce-gtx-980-och-gtx-970/16#pagehead

Yes, I see it, 18% faster, 19 secs instead of 22 (780). Fits exactly: same 28nm, higher clock per core (ca. 1300 MHz = about 18%). It’s the island scene.
But what I am interested in is, like I said, heavy texture access (8K textures) to RAM and heavy geometry (no instancing). There I have my doubts… the bottlenecks. I don’t doubt the technology, which will be really good with a GM210, but the raw transfer, which is the obvious limit… we will see… I will wait for the 1080? :wink:

Does that mean we are recommending the 980?