How is it that the R9-390x beats GTX-TitanX?

Nope.

  1. the 980ti crushes the 580 seemingly everywhere except for Blender, for some… odd reason. Perhaps it is because Cycles needs more optimization for the newer cards!? Banish the thought!

  2. You do understand that GFLOPS alone does not determine performance… If what you’re saying is true, then my 390x should perform about the same in games, yet that’s simply not true. The 980ti crushes it, even with the 390x on the new drivers. The Maxwell architecture is more efficient.

An analogy for you.
On paper the FX 9590 looks like a monster compared to the 4790k. But in real life the 4790k dominates it.

  1. Despite the memory bandwidth difference, the memory on Maxwell has a better memory controller and the memory itself is clocked quite a bit higher. There are also differences in how the RAM chips were placed and connected to the GPU on the board.
    Simply put, the Maxwell chip is more efficient with the available memory and the memory chips themselves are faster, and not only in clock speed; there are other factors as well that speed up the RAM chips. Bandwidth itself does not determine the overall speed in the end, though it is one of the larger factors (rough numbers in the sketch below).
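
To make the bandwidth point concrete, here is the back-of-the-envelope math using the published specs (rounded). This is just bus width times effective data rate and ignores the controller and efficiency differences mentioned above:

```python
# Theoretical memory bandwidth = bus width (in bytes) * effective data rate.
# Published specs, rounded: GTX 580 = 384-bit @ 4 Gbps GDDR5,
# GTX 980 = 256-bit @ 7 Gbps, GTX 980 Ti = 384-bit @ 7 Gbps.
def bandwidth_gb_s(bus_bits: int, effective_gbps: float) -> float:
    return bus_bits / 8 * effective_gbps

print(bandwidth_gb_s(384, 4))  # GTX 580    -> 192 GB/s
print(bandwidth_gb_s(256, 7))  # GTX 980    -> 224 GB/s
print(bandwidth_gb_s(384, 7))  # GTX 980 Ti -> 336 GB/s
```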

CPUs have very similar factors; this isn’t anything new or terribly complicated.

Here is some info on Maxwell memory; the page predates the 980ti, which is even faster with memory, soooooo, yeah.
http://www.hardwareluxx.com/index.php/reviews/hardware/vgacards/32697-xxl-test-nvidia-geforce-gtx-980-and-970-with-maxwell-architecture.html?start=2

Like I said, you don’t quite understand the differences in architecture here.

And the AMD cards aren’t for gaming? Wat.

Also, you do realize that the 980ti is only like 30ish++% faster than the 780ti, which is only “x” amount faster than the 680ti, which is only x amount faster than the 580/580ti…

I think you’ve mistaken yourself. The 980ti is over twice as powerful as the 580. Go look at some stats.

EDIT
AND THAT’S WITHOUT AN OVERCLOCK which the 900 series does amazingly well^

CASE IN POINT: Cycles DOES need to be optimized for the newer cards. You cannot argue with that, since numbers don’t lie.

The cards you have tested are not workstation GPUs, they are gaming cards… the workstation cards are the FirePro / Quadro lines.

Also, you do realize that the 980ti is only like 30ish++% faster than the 780ti, which is only “x” amount faster than the 680ti, which is only x amount faster than the 580/580ti…

Not in terms of compute performance… they significantly changed the architecture between the 5xx series and the 6xx series: the 6xx series has lots of cores, but each one is less powerful, instead of fewer, larger, more powerful cores.

Rendering is a complex beast, and it is affected the most by this sort of architecture change… simple programs benefit the most from the architecture change, which a lot of games take advantage of (PhysX).

I think you’ve mistaken yourself. The 980ti is over twice as powerful as the 580. Go look at some stats.
http://gpu.userbenchmark.com/Compare/Nvidia-GTX-980-Ti-vs-Nvidia-GTX-580/3439vs3150
EDIT
AND THAT’S WITHOUT AN OVERCLOCK which the 900 series does amazingly well^

CASE IN POINT: Cycles DOES need to be optimized for the newer cards. You cannot argue with that, since numbers don’t lie.

All those benchmarks are doing is OpenGL / DirectX calculations… that has nothing to do with compute performance.

The one who doesn’t seem to get it is you. If Maxwell isn’t as good as you think it should be it isn’t Cycles’ fault.

http://render.otoy.com/octanebench/results.php?sort_by=avg&filter=&singleGPU=1

OctaneBench scores:

GTX 580 = 64
GTX 780 Ti = 103
GTX 980 Ti = 126

Relative speed (1 / render time × 100):

GTX 580 = 1 / 86 * 100 = 1.16
GTX 780 Ti = 1 / 52 * 100 = 1.92
GTX 980 Ti = 1 / 47 * 100 = 2.13

In line with Cycles results.
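
Running that comparison as a quick Python sketch (the OctaneBench scores and render times are just the figures quoted above, nothing more):

```python
# Compare OctaneBench scaling with Cycles render-time scaling, GTX 580 as baseline.
octanebench = {"GTX 580": 64, "GTX 780 Ti": 103, "GTX 980 Ti": 126}   # higher = faster
cycles_time = {"GTX 580": 86, "GTX 780 Ti": 52, "GTX 980 Ti": 47}     # seconds, lower = faster

baseline = "GTX 580"
for card in octanebench:
    octane_x = octanebench[card] / octanebench[baseline]
    cycles_x = cycles_time[baseline] / cycles_time[card]
    print(f"{card}: OctaneBench {octane_x:.2f}x, Cycles {cycles_x:.2f}x vs {baseline}")
```

Both scale at roughly the same rate from the 580 to the 980 Ti, which is the point: another CUDA renderer shows the same scaling as Cycles does.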

Point is, AMD may be better for this specific task, with a cheaper model and more VRAM on top of that, and you’re bitter about it.

The AMD statement was sarcastic.
Further I have some points.

The GTX 580 is 1.58 TFLOPS.
The GTX 980ti is 5.63 TFLOPS…
:eek:
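
For reference, those peak numbers fall straight out of cores × clock × 2 FLOPs per core per cycle (one fused multiply-add). A quick sketch with the published specs, using the shader clock for the Fermi card and the base clock for the Maxwell one:

```python
# Peak single-precision throughput: cores * clock * 2 FLOPs per core per cycle (FMA).
# Published specs: GTX 580 = 512 CUDA cores at a 1544 MHz shader clock,
# GTX 980 Ti = 2816 CUDA cores at a 1000 MHz base clock.
def peak_tflops(cuda_cores: int, clock_mhz: float) -> float:
    return cuda_cores * clock_mhz * 1e6 * 2 / 1e12

print(f"GTX 580:    {peak_tflops(512, 1544):.2f} TFLOPS")   # ~1.58
print(f"GTX 980 Ti: {peak_tflops(2816, 1000):.2f} TFLOPS")  # ~5.63
```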

The 980ti has much more and way faster memory than the 580.

The gtx 980ti runs at some MUCH higher clock speeds across the board.

The Maxwell architecture is better/more efficient with new and old rendering technologies.

UserBenchmark GPU runs a series of intensive 3D graphics tests which measure rendering speed and overall computational throughput.

>.>

  1. “Good” and “bitter” are subjective words; I’ve only stated the facts on Maxwell.
  2. Mathematically speaking, it does not make sense that the 390x can beat a TitanX by a fair bit, since the TitanX does have the better architecture and a fair bit more computational power to boot, as well as much higher clocks once overclocked. Therefore Cycles and/or the drivers can clearly be optimized further for the newer cards.

Again, let me reiterate something…
"1)
The GTX 580 is 1.58 TFLOPS.
The GTX 980ti is 5.63 TFLOPS…
:eek: "
(stock)

This means nothing unless the card can actually use the teraflops efficiently.

The 980ti has much more and way faster memory than the 580.

More and faster… true… but is it needed? In terms of benchmarks, memory speed doesn’t play too much of a part… memory amount would give us larger scenes and more flexibility, but we can cram a bunch into a 1.5 GB card, which lots of people don’t realise. If you want examples of our animations I will post a bunch.

The gtx 980ti runs at some MUCH higher clock speeds across the board.

Again, clock speeds don’t mean much unless the card is actually efficient at what it does.

The Maxwell architecture is better/more efficient with new and old rendering technologies.

Sure it will run at a bit lower power consumption… but yeh…

UserBenchmark GPU runs a series of intensive 3D graphics tests which measure rendering speed and overall computational throughput.

>.>

Can you please tell me which of the benchmarks is compute performance? 3D graphics tests are totally irrelevant to this discussion.

Interesting how you say “unless it can actually use the Teraflops efficiently”
Does it not follow that further optimizing Cycles for the Maxwell architecture would solve this?
That’s some simple logic for you right there.

It’s all in the numbers.
Computationally, the 980ti is about 3.56× more powerful. We don’t see that, though, because, as you said, it’s not using that power effectively.

Only goes to prove my point…
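
To put a rough number on that gap, using the TFLOPS figures and the render times quoted above:

```python
# How much of the theoretical 980 Ti advantage actually shows up in Cycles?
theoretical_x = 5.63 / 1.58   # peak FP32 ratio, 980 Ti vs 580 -> ~3.56x
observed_x = 86 / 47          # render-time speedup from the times above -> ~1.83x

print(f"theoretical {theoretical_x:.2f}x, observed {observed_x:.2f}x")
print(f"scaling efficiency: {observed_x / theoretical_x:.0%}")  # roughly half
```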

If you don’t understand how two different architectures can excel at a specific task and suck at others, I just don’t even.

The GTX 680 had way more transistors and cores than the GTX 580, and yet the 580 trounced it in Cycles and about anything computational. The GTX 680 is a 3000 GFLOPS card, to put it in perspective.

You’re obviously missing the point as you’ve just reiterated one of my arguments.
Let me sidestep to something you should be able to get.

The 390x is 5.9 Teraflops.
The FuryX is 8.6 Teraflops.
Tell me, which do you think renders Cycles faster? All I want you to say is either the 390x or the FuryX.

This bait was so poor I almost laughed.

Going by the TFLOPS figure across different architectures doesn’t prove anything.

Just checking muziqaz’s posts I noticed that you didn’t even know that Cycles is a single precision task. giggles

Your post… it’s so irrelevant, just filler because you don’t want to answer the question.
Since you won’t at least guess, I’ll tell you.

The 390x on the old driver actually scored better than the FuryX on the 15.7 driver that came out, and the 390x is even faster on the new driver.
Therefore we can conclude that OpenCL for Blender needs more development to take better advantage of the Fury’s architecture. Mathematically this should not add up, but those are the facts.

The FuryX is physically a MUCH better card.

The TitanX, which is the superior card, is being outperformed by the 390x even on the old driver.
Therefore we can conclude that CUDA needs to be developed to where it can take better advantage of the newer architectures.

You’re not even arguing against this point… I don’t understand what it is you’re trying to say! Spit it out! PLEASE! No more assumptions about Maxwell or filler, no more nonsense. What the hell are you saying man!?

It may seem that it would be wasted on you, but let’s recap:

  • Performance for a given application may not scale well or at all with overclocking.
  • Performance for a given application may not scale well or at all with more cores.
  • Performance for a given application may not scale well or at all with more bandwidth.
  • Performance for a given application may not scale well or at all when increasing a combination of any of the above.
  • Trying to figure out the performance for a specific application across architectures is pointless.

What you’re saying is that Maxwell is da best for everything just because.

Has it crossed your mind that Cycles may be similar to LuxRender and AMD is just that much better at that task? Better yet, why doesn’t AMD hardware scale in LuxRender as it should based on GFLOPS figures?

It is you who don’t seem to understand how this stuff works.

I don’t see how you can be making the same points I am yet miss the point entirely…

“da best for everything just because” is subjective, nor is it EVEN CLOSE to what I said. Again, I’ve only provided factual information about Maxwell.
Further, I’d like to say I’m not opposed to AMD in the least bit, as I just bought their 390x. I’ve done most of the testing for OpenCL in the Blender community since it was added, and I’ve given out plenty of advice that AMD looks to be a really good choice over Nvidia at the moment.

All your bullets: yes, yes, yes, yes and no. You’re being subjective again and ignoring a lot of data.
If you’re not aware, cards and card architectures are compared ALL. THE. TIME., and this is essential to give customers a good basis for what to get for x task, what is better for x task, what is a good value for x task, etc. What the hell do you think a benchmark is? Why do you think we run the BMW2.7 benchmark!? HOW CAN YOU NOT UNDERSTAND THIS? Do you even understand how ignorant that statement was?

What I’m basically saying at the moment is that the 390x is performing remarkably well, yet the TitanX, which is indisputably a more powerful card, does not beat it. From this data, and from the fact that the newer Nvidia cards aren’t scaling as well as the math suggests they should, we can conclude that CUDA could use more development to take more advantage of them. You fail to understand that I’m hardly even comparing the cards; I’m using the data I collected from them to back up the factual point that CUDA could in fact be developed to take more advantage of the better hardware in a TitanX.
You don’t even seem to be disputing this fact, LOL!

Further, we can back this logic with the fact that the FuryX, a MUCH more powerful card, does not beat the 390x even on the new driver; therefore we can conclude that OpenCL needs more development to take advantage of the better hardware.

Drivers too.

  1. It seems that it is you who does not understand how this stuff works: first you make some assumptions about Maxwell, then you reiterate my points as part of your “argument”, and now you make an absurd and ignorant statement so crazy that it simply says NO to reality.

WHAT ARE YOU THINKING, dude? I’m still no closer to understanding what it is exactly that you’re trying to say here…
Also, sorry for the caps and if I seem a bit rude; I do have the best intentions of trying to understand you as well as to have you understand me.

You keep repeating this “more powerful” mantra and yet you’re unable to understand that some architectures can excel at some tasks and suck at others.

Bitcoin mining is the perfect example of this. AMD was vastly superior to Nvidia a couple of years ago, and both were thrown out of the loop by custom ASICs with a fraction of any GPU’s processing power.

Your original point is that CUDA Cycles needs to be optimized for Maxwell because theoretically a 390X shouldn’t be able to best it, and I’m saying that a completely different arch may just be better at that specific task. Is this that hard to understand?

So guys, let’s draw some conclusions here, because I am in the process of purchasing a new GPU.

The question is: what are the drawbacks of OpenCL compared to CUDA for Cycles? Where can I find some docs about that?

Thank you!

The question is: what are the drawbacks of OpenCL compared to CUDA for Cycles? Where can I find some docs about that?

http://blender.org/manual/render/cycles/features.html

Now that makes sense as an argument.

A few points to make to wrap this up.

In the hardware metrics that matter for Cycles, the TitanX is the superior card. It should be faster, but we don’t see that. Nor do we see the kind of performance scaling in Cycles that we mathematically should, whereas an earlier card like the 580 is used pretty well. Yes, there are obvious differences between these cards, but at the end of the day the TitanX is more powerful by a fair bit where it counts when you compare its hardware and architecture against the 390x’s. That’s simply the fact here.

I didn’t mention this before, but I found it suspicious that during renders the TitanX’s and 980ti’s power usage never seemed to exceed 77%. I’m not sure if this is normal, because when doing stress and performance benchmarks on these cards I saw more like the 95% mark, which is similar to what the 390x reported for Cycles. This is quite possibly another indicator that the cards are not being used quite to their full potential, but this is an area I’m not terribly sure about, so I won’t draw any definite conclusions on this.
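
If anyone wants to double-check this on their own card, polling nvidia-smi during a render is a quick way to watch utilization and reported power draw. A minimal sketch, assuming an Nvidia card with nvidia-smi on the PATH:

```python
import subprocess
import time

# Sample GPU utilization and power draw once a second for a minute
# while a Cycles render runs in another window.
for _ in range(60):
    out = subprocess.check_output([
        "nvidia-smi",
        "--query-gpu=utilization.gpu,power.draw,power.limit",
        "--format=csv,noheader",
    ]).decode().strip()
    print(out)  # e.g. "95 %, 180.50 W, 250.00 W"
    time.sleep(1)
```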

You use bitcoin mining as a great example to explain that the 390x could simply be better for rendering in Cycles. Like I mentioned, in the metrics that matter when measuring hardware, this shouldn’t be the case; even mathematically it does not add up. I’m not saying that you’re wrong, just that this does not factor in as much as you would think for Cycles, especially when you look at the other data gathered from testing the TitanX and 980ti.

It’s certainly not terribly absurd to say that CUDA perhaps is not optimized as well as it could be for the newer architectures. One example is that when the 980 and TitanX came out they performed questionably in Iray, but later updates that contained optimizations for the Maxwell architecture brought massive performance improvements.

Further, perhaps the reason OpenCL is performing rather well is that it was just recently added and tested mainly on the Hawaii architecture and its predecessor. We’re seeing some stellar performance here, and yet the FuryX, which is indisputably more powerful than a 390x, is currently scoring slower times. Obviously we can conclude that OpenCL looks to be in need of further development to take advantage of the FuryX’s hardware.

Now, I don’t think that CUDA is all that far off. What I do think is that we could certainly see some nice optimizations that would bring things more in line with what the data and hardware tell us about these Maxwell cards. Simply put, the TitanX should quite literally be able to beat the 390x, even if it’s just by a little bit. The physical hardware and the numbers don’t lie; you can’t argue against these hard facts about the hardware. As we all know, software/drivers are the next area to look at after hardware.

I see what you’re saying here, but according to the data and information it would not seem to be that much of a factor in this case.

I’m leaving it here, you’re hopeless.

The only current drawback is that there are some features not currently supported for OpenCL.
See Richard’s post.
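
If you end up testing both backends, switching between them can also be scripted from the Blender Python console. A minimal sketch; the exact property paths differ between Blender versions (older 2.7x builds keep these settings under bpy.context.user_preferences instead), so treat the names here as an assumption to adapt rather than gospel:

```python
import bpy

# Select the Cycles compute backend and enable GPU rendering.
# Property paths follow the newer preferences API; adjust for your version.
prefs = bpy.context.preferences.addons["cycles"].preferences
prefs.compute_device_type = "CUDA"   # "OPENCL" for AMD cards on builds that still offer it

prefs.get_devices()                  # refresh the detected device list
for dev in prefs.devices:
    dev.use = True                   # enable each detected device
    print(dev.name, dev.type, dev.use)

bpy.context.scene.cycles.device = "GPU"
```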

I would like to suggest that you look at purchasing two of the somewhat higher-end AMD cards for CrossFire and some pretty stellar performance that will likely improve in the future. You will, as I mentioned, want a 1000+ watt power supply for this.

  • Two 290x cards will run you around $660, a GREAT value, and they will render almost as fast as 390x cards.

  • A 295x2 (two 290x chips on the same water-cooled board) will be around $650ish and perform about the same as the previous option, but use a little less power, make a little less heat, and be quieter.

  • You could get one or two 390x cards, which seem to deliver the best single-card performance at the moment; two of these would render FAST and would run around $850. Beware of heat and likely some noise.

  • A pair of the very soon-to-be-released air-cooled Fury cards would likely perform a fair bit better than the 390x in the future, and for two you’d spend around the same price as a TitanX.

  • One or two water-cooled FuryX cards.

  • Or wait until later for the dual-GPU FuryX.

  • Or wait not quite as long and pick up two Fury Nano cards, which are supposed to cost around as much as a 390x and, as I understand it, will perform between a 290x and a 390x, but use MUCH less power and therefore make less heat and noise. It’s also a pretty small card.

You’ve got a lot of good options.