Cycles NVidia MAXWELL Benchmarks

Almost exactly the same result here on exactly the same card in Win8.1.
Mike Pan’s BMW (256x256): 00:59.27

I was a little miffed that the CUDA rendering didn’t work out of the box in Windows (since it did under Linux, though twice as slow). Will the newer sm_50 kernel be supplied with Blender in the future to save others the frustration of hunting this fix down?

EDIT: Thanks to Juicyfruit’s sm_50 Linux kernel, I’ve halved my Linux BMW benchmark from 2:10 to 1:04(!) with 256x256 tiles, but Windows 8.1 is still slightly faster at 59.XX seconds. Either way, the GPU still beats my CPU’s best-case time of 1:48 (with 8x8 tiles), so now I can default to GPU rendering.

Where do you put it on Linux? There is no “lib” folder. I created one in the cycles dir and put it there, but that didn’t work.

You’re right that the linux distro-packaged version of blender might not have a cycles lib dir, as mine doesn’t either in Fedora 20. Perhaps it’s a closed-source legal issue?

$ find /usr/share/blender/2.69/ -iname '*cubin*'
$ sudo yum provides \*cycles/lib\*
Loaded plugins: changelog, langpacks, refresh-packagekit
No matches found

However, the version of blender downloaded directly from blender.org does include it, and it’s the one I use most often anyway because it’s usually more up-to-date than the packaged version:

$ find ~/bin/blender-beta/blender-2.70a-linux-glibc211-x86_64 -name 'kernel_sm*'
/home/jaf/bin/blender-beta/blender-2.70a-linux-glibc211-x86_64/2.70/scripts/addons/cycles/lib/kernel_sm_30.cubin
/home/jaf/bin/blender-beta/blender-2.70a-linux-glibc211-x86_64/2.70/scripts/addons/cycles/lib/kernel_sm_21.cubin
/home/jaf/bin/blender-beta/blender-2.70a-linux-glibc211-x86_64/2.70/scripts/addons/cycles/lib/kernel_sm_35.cubin
/home/jaf/bin/blender-beta/blender-2.70a-linux-glibc211-x86_64/2.70/scripts/addons/cycles/lib/kernel_sm_20.cubin
/home/jaf/bin/blender-beta/blender-2.70a-linux-glibc211-x86_64/2.70/scripts/addons/cycles/lib/kernel_sm_50.cubin
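
If you would rather check this from a script than with find, here is a minimal Python sketch that does the same search. The function names are my own (nothing Blender ships), and it only assumes the kernels follow the kernel_sm_NN.cubin naming shown in the listing above.

```python
import re
from pathlib import Path


def find_cuda_kernels(blender_root):
    """Return the sorted kernel tags (e.g. 'sm_50') found under blender_root.

    Hypothetical helper: it only looks at file names of the form
    kernel_sm_NN.cubin and does not load or validate the kernels.
    """
    pattern = re.compile(r"^kernel_(sm_\d+)\.cubin$")
    found = set()
    for path in Path(blender_root).rglob("kernel_sm_*.cubin"):
        match = pattern.match(path.name)
        if match:
            found.add(match.group(1))
    return sorted(found)


def has_maxwell_kernel(blender_root):
    """True if a precompiled sm_50 kernel is present under blender_root."""
    return "sm_50" in find_cuda_kernels(blender_root)
```

Point it at the unpacked blender.org directory or at /usr/share/blender/2.69/; an empty list means that install carries no precompiled CUDA kernels at all, which matches what the distro package showed above.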

Thank you so very much, you saved me! I had tried everything I could possibly think of: different OSes, different CUDA versions, drivers, and I compiled it I don’t know how many times. Now it works, and it was pretty simple.

Hi,

Can someone please upload the 2.69.11 Win64 build again?
Rolf says the newest builds are much slower, especially with two cards, but he removed the old builds from his dropbox.

Guess who has just placed an order for two 750 Tis?
:wink:

Here is my build, the one I primarily use.
Blender 2.69.11 eb4f2b4

On newer builds, the utilization of 2 or more 750 Tis looks like this =>

I don’t know if the problem exists just for me or for everyone.

Thanks a lot Rolf!

I’ll test both builds once my cards arrive and post the results here.
Note: for this 2.69.11 build to run, I had to replace the 2.69.11 directory with its subdirectory 2.69.

Best regards

mkdir /usr/share/blender/2.70/scripts/addons/cycles/lib <-- This worked on mine… don’t know why it didn’t work for you. :frowning:
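
For anyone who wants to script that step, here is a rough Python sketch of "create the lib dir and drop the kernel in". The function name and arguments are made up for illustration; the only thing taken from this thread is the scripts/addons/cycles/lib layout. Whether a given Blender build actually picks the kernel up from there is not guaranteed (it worked for me, it did not for the poster above).

```python
import shutil
from pathlib import Path


def install_kernel(cubin_file, blender_version_dir):
    """Copy a .cubin kernel into <version_dir>/scripts/addons/cycles/lib.

    Hypothetical helper. blender_version_dir is the versioned directory,
    e.g. /usr/share/blender/2.70 (writing there needs root).
    """
    lib_dir = Path(blender_version_dir) / "scripts" / "addons" / "cycles" / "lib"
    # Same effect as `mkdir -p`: create missing parents, ignore if it exists.
    lib_dir.mkdir(parents=True, exist_ok=True)
    target = lib_dir / Path(cubin_file).name
    shutil.copy2(cubin_file, target)
    return target
```

Something like install_kernel("kernel_sm_50.cubin", "/usr/share/blender/2.70") would mirror the mkdir above plus the copy.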

GTX 750 1GB GDDR5 (not Ti) <-- yes, the el cheapo card.

Using Juicyfruit’s or Rolf’s sm_50 kernel with Blender 2.70a stable, compiled from source on 64-bit Gentoo.

Mike Pan’s BMW (default): 1 min 22 s
Mike Pan’s BMW (256x128 tiles): 1 min 13 s

Cornell Blend (default): 2 min 41 s

Official Windows Blender builds are now compiled with CUDA 6.0 and come with an sm_50 kernel for Maxwell cards.

http://builder.blender.org/download/

Thanks for the information, Rolf.
Official Windows Blender 2.70-685316b (Win8 x64) tested and working here with a 750 Ti.
For the moment, subsurface scattering and volume scatter/absorption are not supported in GPU rendering.
For that there is your ‘special sm_50 kernel’, thankfully.
:slight_smile:

Someone (nickname Cycliste) on the French Blender Clan Forum made a quite interesting test using 6 (yes six) 750 Ti cards and comparing it to the more expensive Titan:

http://blenderclan.tuxfamily.org/html/modules/newbb/viewtopic.php?topic_id=41732

Even if you don’t read French you might like to have a look at his rig and the test results.

Yes, but it’s 2 GB vs 6 GB of VRAM. Very good for mining.

I tested the Cavalier scene in 2.69.11.
With 1 card I get 14:45.41.
With 2 cards I get 7:34.43, so I beat a Titan with only 2 of them.
The time of 3:29 with 6x 750 Ti in 2.69.11 seems slow to me.
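
To put a number on how well the second card scales, here is a small Python sketch using the two Cavalier times above. The helper names are mine, and "efficiency" here just means speedup divided by the number of cards; it is a back-of-the-envelope figure, not a proper benchmark methodology.

```python
def parse_time(text):
    """Convert a render time like '14:45.41' (or 'h:mm:ss') into seconds."""
    seconds = 0.0
    for part in text.strip().split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def scaling(single_time, multi_time, cards):
    """Return (speedup, efficiency) of a multi-GPU run relative to one card."""
    speedup = parse_time(single_time) / parse_time(multi_time)
    return speedup, speedup / cards


# Cavalier scene, Blender 2.69.11: 1 card vs 2 cards (times from this post).
speedup, efficiency = scaling("14:45.41", "7:34.43", cards=2)
print(f"2 cards: {speedup:.2f}x speedup, {efficiency:.0%} efficiency")
```

For these two runs it works out to roughly 1.95x, so the second card is close to a free doubling. Comparing against the 6-card time only makes sense if the single-card baseline comes from the same rig and clocks.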

Those cards are CUDA 6 capable. What about unified memory?

I think those are not OC cards. If you tested the scene with the nice OC you mentioned earlier it could be an explanation.

Well, I get 12:04.63 with 2 OC cards, so YMMV
But I had the display plugged to one of them, and I was clicking around.
I also had GPU-Z started (twice) and it showed GPU load oscillating, a bit like on your lower chart in post #69, but I’m using 2.69.11

I tried a little overclocking with my 750Ti.

EDIT: Be careful if you want to overclock; the 337.50 beta driver limits the core clock offset to +135MHz.

Using driver 335.23, the card reaches 1289MHz (boost 1367MHz) on the GPU and 1525MHz on the memory without a voltage increase.
That gives 50.54s for Mike Pan’s BMW with 256x256 tiles (instead of 57-58s with Blender 2.70a at stock clocks).

I got 15:46.65 for the Cavalier.

@Rolf: how much do you overclock to reach 14:45.41 with one card?

Are the downloads from Graphicall.org compiled for Maxwell?

It looks very suspicious. Rolf, do you think it could have something to do with this 2.71 optimization, which does …

  • CUDA handling is more asynchronous now. This results in a lower CPU usage while CUDA rendering with multiple GPUs

devs can be found here … https://developer.blender.org/rB1d016758330b