Topic: Performance Profiling

I'm running out of things I can do on my netbook while commuting, as my netbook currently still renders at 2-4fps. This might be a graphics driver issue, but I had a look at what I can do to fix this and I couldn't find anything.

So I've begun profiling Maratis, to see if I can figure out exactly what it is that is causing it to slow down.

I have uploaded a sample frame to Pastebin. I've not profiled the entire engine yet, as you can see, but it's already shown some interesting things. For one thing, MaratisPlayer::graphicLoop is only taking 72ns (1%) and MaratisPlayer::logicLoop is taking 3us (22%). Also strange is that MLevel::getCurrentScene takes a huge 2us (14%)

The current profiles are done using my fork of Maratis but I've made only minor changes to Maratis, which I don't believe affect performance at all.

The level loaded is just the demo cube, one light, and a camera.

I will add more profiles when I've added more functions to the profiler, and if I've figured out some ways to potentially optimise Maratis. If anyone can see something I've missed, feel free to shout smile

Re: Performance Profiling

Hi, that's interesting, a very good idea.

One thing about recording the drawing time is that graphicLoop will only show you how much time sending the data is taking, to have also the rending time of the graphic card you have to include the swap buffer call.

Did you try with a bigger scene, the total time of the frame I believe is 8 ms which is vey short ?
getCurrentScene 14% can seem strange, but 2 us is nothing.

About your netbook, did you try testing another opengl engine or a demo ?

Re: Performance Profiling

I used to play Minecraft on it. True, it wasn't totally smooth, but it was definitely a playable framerate.

Those values are still a very initial set of profile tags. I've been going through MEngine alphabetically and adding a tag to each function. I think I will leave it at MEngine, Common and Player for now, it should give me enough information.

I was wondering if it would if it would count the actual rendering time or not, I assume there's a yield somewhere that will wait for the actual draw? Maybe that would explain the discrepency with timing...

Re: Performance Profiling

Done another run at profiling. I have MEngine and MaratisPlayer fully tagged, as well as MContexts and MRenderers. I'm still missing ~300ms somewhere according to the profiler.

However, I did find out that the swap buffer calls are taking 50+% of the time, so I guess it is the rendering that's taking a while. That's sort of disappointing because it means that it's unlikely I can fix it tongue

It's just occured to me that I'm flushing the profiler once every logic loop. Maybe if I do it at the end of the while loop in main... yeah that gave a totally different log. I'm getting 89% in MGLContext::deleteQuery...being called 4284 times. Weird.

Re: Performance Profiling

::deleteQuery is strange, because it is suposed to be called only in the destructor of MStandardRenderer.
It is used for occlusion query, and it make me think that it could be a component that can slow your computer,
as occlusion query is a recent feature not always supported.

Is Maratis also slow in your machine when you are using the FixedRenderer ? (for example Sponza fps demo)
FixedRenderer is not using shaders nor occlusion query.

Re: Performance Profiling

That's even stranger then... because my netbook doesn't support Standard Renderer... checking through the other things that the profiler is spitting out, I'm getting calls from both StandardRenderer and FixedRenderer... I wonder why that's happening, the project is even asking to use FixedRender

Re: Performance Profiling

I tryed to put a printf on both the renderer and tested with various examples, it never seemed to run at the same time on maratis, but I didn't try on maratisPlayer. It seem hard to happen as there is only one renderer in MEngine.

The only thing that could explain it if it really happen would be the system switching the renderers all the time...
Check how much time MEngine::setRenderer is called, it should be called only one time per project loading.

Re: Performance Profiling

I've just taken a look, and setRenderer is called 3 times. Time to figure out why...

Re: Performance Profiling

I've figured out what was going on. The MaratisPlayer constructor called start() which was called a little later in restart(), both of which called MEngine::setRenderer, using MStandardRenderer. A little later the project is loaded and calls MaratisPlayer::changeRenderer. In my case this uses FixedRenderer.

I've changed the default to FixedRenderer for now, and removed the call to start() from the constructor (because it's being called shortly afterwards) everything seems to run fine, and I've lost all calls to Standard Renderer stuff from the profiler. I still have 219ms unaccounted for. I've got no idea where that's gone... the top time now is MObject3d::updateBehaviors (77us) followed by MObject3d::computeChildsMatrices (70us)

I think I'm going to put a debug render in MProfiler to display an FPS in the top right corner of the screen.

Re: Performance Profiling

I just made it cough out the FPS to the stdout. I didn't realise that Maratis only stopped rendering when it didn't have focus, but keeps updating. I get ~4fps when the window has focus, 60+fps when it doesn't. I guess that proves that it's definitely the rendering. Ahh well. That means I either need to kick my netbook into not falling over constantly, or upgrade...

Re: Performance Profiling

mh, that's bad because you probably have ok directx drivers, they should provide at least a decent opengl 1.2 drivers...

Re: Performance Profiling

Erm, I'm running linux, so I wouldn't know about DX tongue
I've checked glxgears and get a consistent 60fps. glxinfo gives my version as 1.4, Maratis also claims I have 1.4 support. I also checked that I have direct rendering.

Re: Performance Profiling

Ho, you are on linux ok,
on my desktop computer I have a (only 3 years old) Nvidia good 3d card,
but it's not supported on the new linux, because nvidia refuse to update their drivers for the new xfree.
Technically I could have it accelerated by installing an old linux using an old xfree.

If glxgear works good for you, a non-textured cube should work fine in Maratis. If not, the cause could be a particular opengl instruction, even not used but activated : Alpha-blending, alpha-testing, mip-mapping... something like that maybe.

Did you check your mesa driver ?

Re: Performance Profiling

The only cards I have available to test on are an intel 945 chipset with integrated graphics... and I now have had to downgrade my PC from a 4 year old ATI to a much older ATI because my card burnt out sad Before it did, I seem to remember my PC struggling on Linux, and I definitely had 3D support on there working...

At lunch, I can pastebin all of glxinfo and I can take a look at what the issue might be.
Edit: My glxinfo

Re: Performance Profiling

I've kept looking at what might be causing the problems here and I am still getting ~200ms missing time which is really confusing as I've tagged the main loop so if anything, it should all fall under that. At one point, I thought it could be maybe a threading issue, but then again, waiting on a thread to complete execution should _still_ show up...

Out of curiosity, I stuck a return in the top of draw() in main.cpp and... I got unprofiled time 1ms, 1000FPS. Moving the return down the function confirmed that it's definitely the render loop that's causing it. I'm still a bit confused why it's not showing up in the profiler at all though

Re: Performance Profiling

It's hard to profile the 3d card rendering in detail, because we don't know how the card execute the rendering exactly. OpenGL just send orders, and the sending itself is not always the bottleneck (exept when using very big dynamic meshs).

The 200ms missing can also be the system been late because of the slow rendering ? inputs waiting list etc.

Re: Performance Profiling

But surely if it takes time to wait for the GPU at all, that time would show up on the CPU side? I would assume in something like a flush or swap buffer or something. What it looks like to me is that it's losing the time in between the loops of while(true) somehow... but I'm not sure how that could happen.

I am going to see if I can steal someone's time today to take a look at it, it's probably something relatively simple I'm missing out, but I'm just not seeing it. I would really like to tune Maratis to run a bit smoother, but I can't really optimise blind like this tongue

Re: Performance Profiling

Right. So I feel a bit silly. I got the simple scene, with just a cube, running at more than 4fps. It's actually at 60fps now. Woo!

You were absolutely correct, it was a GL feature that my graphics chip didn't support. In particular it was point sprites... so yeah, the thing that I added for particles. It means I've got to be able to fallback on using billboard sprites rather than point sprites if possible. I will see if I can get it to check whether GL supports point sprites natively before enabling them...

The strange thing is, I've tried to load a more complex scene and I get an infinite update... I guess it's probably something to do with the lighting. I think it would be nice to be able to switch between a standard renderer scene and a fixed renderer scene...

Re: Performance Profiling

Hi, good, I guess the GL Rendering context is a bit too brut force as it doesn't check if something is supported before enable it, there is some cleaning to do on this.

Im not sure to get the infinite update thing you said. I taught your laptop only support fixed renderer anyway ?
What fonction exactly runs slow ?

Re: Performance Profiling

Heh, I just realised that the scene was trying to use the standard renderer, so I changed that in the mproj and the scene loads now (well, after about 5 minutes parsing the xml meshs...) and I'm down to... 2fps... and once again, it's not in any profilable part of Maratis, which suggests OpenGL again...

I realise that an Intel GMA 945 netbook might not be the target machine for Maratis, but I think I would like to try and get a reasonable framerate on my netbook.

So continues my trek into profiling and optimising.

First, a couple of things I may be using in the scene that might cause the slowdown. First of all lights. The scene is pretty full with them. I can replace them with pre-generated lightmaps, but I would like for less-useless machines to still use dynamic lights where possible... so maybe having it so you can specify a lightmap, which only gets used in the fixedrenderer, and lights which only get used in the standardrenderer? Next, poly count. I will have to do some profiling where I change the amount of lights in a scene, keeping the polys the same, and then just add more polys, but I think reducing polys is always good anyway. The obvious solution to this would be to add a LoD system, and on less powerful machines (potentially exposable to players via a "detail" slider in a menu) it would use lower polygon models.

Re: Performance Profiling

Also, check some stuff that can make it slow like :
- power of two texture
- mipmap
- mesh export if you are using the 3dsmax exporter, be aware that it is still less optimized than the Blender exporter (vertices are duplicated)

I'm thinking about something that could slow your machine, in MGLContext.cpp look at "sendTextureImage"
your machine probably doesn't support "glGenerateMipmapEXT" :
you'll see a commented section using "gluBuild2DMipmaps" (what I was using before),
I suggest you try removing glGenerateMipmapEXT to see if it is faster, if it is we should use gluBuild2DMipmaps if opengl < 2.0

Re: Performance Profiling

Thanks for these smile I'll try and take a look at what it might be.

Re: Performance Profiling

I finally got around to switching to gluBuild2DMipmaps and I'm still on about 2fps, so I guess it's not that tongue It's a shame that I really can't profile OpenGL at all to see what might be causing the problem. I'll try taking lights and  things out... see if I can figure it out... based on the fact that I can get a standard simple scene running at 60fps, I should be able to figure out which feature I'm using that's causing the problems...

Re: Performance Profiling

I found that after doing some research :
it can trace opengl calls apparently, to be tested ?

Re: Performance Profiling

I am just grabbing gDEBugger to take a look at that tonight. I had a run through Sponza with the profiler, I was getting 6fps so I guess it isn't necessarily the lighting that's causing the slow performance, probably the large amount of polys. I guess this is a reasonable stress test tongue
I will do more investigation into why this might be anyway.

Just a quick question though, where do you think the best place to "pump" the profiler would be? I've got different readings when I put it in different places. For example on Sponza, I get ~300fps if I put it at the end of  MaratisPlayer::LogicLoop, if I put it at the end of the while(true) loop in main() then I get 6fps, and if I put it in draw() then I get 8fps