Perfromance

For topics that do not fit in another sub-forum.

Moderators: Oberlus, Oberlus

Post Reply
Message
Author
yandonman
Creative Contributor
Posts: 699
Joined: Thu Aug 30, 2012 12:32 am

Perfromance

#1 Post by yandonman » Mon Sep 03, 2012 10:40 pm

http://www.youtube.com/watch?v=0jxX2-EY5eI

I apologies for any negative tone in the commentary - it was early and I didn't have any caffeine in me :P
Code released under GPL 2.0. Content released under GPL 2.0 and Creative Commons Attribution-ShareAlike 3.0.

User avatar
Geoff the Medio
Programming, Design, Admin
Posts: 12478
Joined: Wed Oct 08, 2003 1:33 am
Location: Munich

Re: Perfromance

#2 Post by Geoff the Medio » Mon Sep 03, 2012 11:04 pm

Your voice is actually quite pleasant. You wouldn't be the worst Let's Player commentary I've heard.

What are your computer stats? Is it a single / dual / quad core? RAM?

If you don't have a great system, try turning off the galaxy gas and starfields rendering in the options menu galaxy map tab. You can also enable FPS limiting and set it to a low number. It doesn't quite work as advertised, but will reduce CPU use, which I suspect is mostly rendering-related.

Turn update times would probably improve if you had fewer AIs and a smaller galaxy, or more CPU cores.

I'd also question whether running fraps has any effect on the performance, though I assume you had the hangs without in in order to inspire the video making.

There are a few issues like you had with the fleets window moving the bottom right and refusing to reposition. I also get an issue when resizing and then moving the main panel on the design screen, in which it won't ever move again, always tries to resize when dragged.

The producible items list scroll reset has been fixed in the latest SVN.

The production auto-boot thing is annoying, but can be worked around in the latest version by disabling all sitrep messages. It only does the boot when a new sitrep is available to show.

Also, as I suggested earlier, turn in the verbose logging and have a look at the freeorion.log file "time:" lines, which should give some indications of what's taking the longest when doing various UI transitions and turn updates.

yandonman
Creative Contributor
Posts: 699
Joined: Thu Aug 30, 2012 12:32 am

Re: Perfromance

#3 Post by yandonman » Tue Sep 04, 2012 4:31 am

I'm running: x86, Quad 2.4 GHz CPU, RAM: 4 GB, NVidia GeForce 8800 FT - Not the beefiest machine these days, but not really a slouch. Sins of Solar Empire, STO, Diablo III all play real nice and a lot more seems to be going on in those games.


Fraps does have some effect, of course, but it's usually not a significant factor, unless there's a lot of anti-aliasing going on. You'll have to take my word for it though :) (The double mouse cursor in the vid when moving the mouse only showed up when Fraps was running, but the double mouse pointer on pushing a button was their prior to Fraps)


Xperf would be the next thing to do for performance profiling - it'll get right to the source (stack trace and all). Though I'll get to the debug logs as well.

Your voice is actually quite pleasant. You wouldn't be the worst Let's Player commentary I've heard.
Hehe - face for radio, as they say ;) I'll getcha some promo vids up soon...
Code released under GPL 2.0. Content released under GPL 2.0 and Creative Commons Attribution-ShareAlike 3.0.

User avatar
Geoff the Medio
Programming, Design, Admin
Posts: 12478
Joined: Wed Oct 08, 2003 1:33 am
Location: Munich

Re: Perfromance

#4 Post by Geoff the Medio » Tue Sep 04, 2012 4:52 am

yandonman wrote:I'm running: x86, Quad 2.4 GHz CPU, RAM: 4 GB, NVidia GeForce 8800 FT - Not the beefiest machine these days, but not really a slouch. Sins of Solar Empire, STO, Diablo III all play real nice and a lot more seems to be going on in those games.
It's very strange that you get such drastic hangs then... I'm building and running on 2.4 GHz Core i5 laptop with 8 GB RAM, and there is nowhere near that much slowdown. The video card is, if my googling is right, about 6 years old, and might have poor OpenGL support, which wouldn't affect most commercial games, though that wouldn't explain the delays when opening / closing screens or processing turns. Maybe turn turning off the starfield rendering, galaxy gas rendering, and try turning off optimized system rendering (which might be sub-optimal for you) and the scanline shaders options, all on the galaxy map tab.

Dart00_Tech
Space Kraken
Posts: 143
Joined: Sat Jun 11, 2011 1:27 am
Location: Modesto, CA USA

Re: Perfromance

#5 Post by Dart00_Tech » Fri Sep 07, 2012 8:38 pm

Hey yandonman, Could you post your saved game? I wana try it on my machine and see if I get same or better lag? Also what version/revision were you running?

My specs:
- Windows 7 64-Bit
- AMD Phenom II X64 960T (Quad) (3.00Ghz)
- 8.00GB DDR3 RAM
- ATI Radeon HD 4200 (256MB DDR2)

If i could get a link to Fraps I could record a similar video.

yandonman
Creative Contributor
Posts: 699
Joined: Thu Aug 30, 2012 12:32 am

Re: Perfromance

#6 Post by yandonman » Sat Sep 08, 2012 5:26 am

Attaching save game file... (you'll need 7zip to unzip it)

I am running v0.4.1 (SVN: 5096).

My impression is that it's tied to the number of ships on the board, as in my last game was non-laggy until I started building fleets. This is anecdotal, and so not to be taken as truth, but maybe it points us in the right direction.

Fraps (http://www.fraps.com/) is $37 - and I honestly think it's worth it if you're going to do any game recordings.
Code released under GPL 2.0. Content released under GPL 2.0 and Creative Commons Attribution-ShareAlike 3.0.

yandonman
Creative Contributor
Posts: 699
Joined: Thu Aug 30, 2012 12:32 am

Re: Perfromance

#7 Post by yandonman » Sun Sep 09, 2012 8:06 am

(brain dump for the night)

~15% constant CPU usage by FreeOrion. Haven't dug into the top 3 that were accounting for ~9% constant CPU.
The attachment CPU_Performance.jpg is no longer available

UI Hangs are caused by too much processing on the main (UI) thread, which does appear to be a function of at least the number of ships.

I thought it was going to be Universe.cpp's Line 1111 debug statement, but commenting it out didn't make any perceptible difference.

Code: Select all

for (ObjectMap::const_iterator it = m_objects.const_begin(); it != m_objects.const_end(); ++it) {
       Logger().debugStream() << "... object (" << it->first << "): " << it->second->Name();
      

Of note, "Logger().debugStream() <<" seems to flush to disk. Disk IO should generally be avoided on the UI thread. Shouldn't this be

Code: Select all

Logger.debug(...) 
as this can take advantage of flushing to disk on a background thread?
Attachments
CPU_Performance.jpg
CPU_Performance.jpg (207.73 KiB) Viewed 1531 times
Code released under GPL 2.0. Content released under GPL 2.0 and Creative Commons Attribution-ShareAlike 3.0.

User avatar
Geoff the Medio
Programming, Design, Admin
Posts: 12478
Joined: Wed Oct 08, 2003 1:33 am
Location: Munich

Re: Perfromance

#8 Post by Geoff the Medio » Sun Sep 09, 2012 8:53 am

yandonman wrote:~15% constant CPU usage by FreeOrion. Haven't dug into the top 3 that were accounting for ~9% constant CPU.
If you're suggesting that evaluating conditions is consuming 9% CPU constantly, that's unlikely. The conditions are not evaluated each frame; they are only evaluated when something changes that prompts a re-execution of effects or testing of what objects are matched by a condition (such as changing focus, rearranging ships in fleets, switching the shown fleets or system, or manipulating the production queue or list of producible items), and (extensively) during turn processing. They may end up taking an average of about 9% of time when playing through several turns, but it shouldn't be constant.
UI Hangs are caused by too much processing on the main (UI) thread, which does appear to be a function of at least the number of ships.
Yes, this is expected. This could partly be worked around by moving turn data unpacking to a separate thread, though some of the processing, such as after an action that prompts a mid-turn effects update, needs to be done before updating the UI.
I thought it was going to be Universe.cpp's Line 1111 debug statement, but commenting it out didn't make any perceptible difference.
That's in the effect updating code, which isn't (or at least shouldn't be) running every frame, as discussed above.
Of note, "Logger().debugStream() <<" seems to flush to disk. Disk IO should generally be avoided on the UI thread.
Perhaps, but as you note, the non-verbose logging generally doesn't impact the hangs that much. Generally such logging with be O(N) where N = number of objects being processed, while evaluating conditions like ContainedBy or WithinStarlaneJumps might be O(N^2) depending how they're written.
Shouldn't this be

Code: Select all

Logger.debug(...) 
as this can take advantage of flushing to disk on a background thread?
I don't know enough about how log4cpp works to know if there's a difference in immediacy of writing out to the log file between .debug and .debugStream. Why do you think that / where is it documented that there is?

yandonman
Creative Contributor
Posts: 699
Joined: Thu Aug 30, 2012 12:32 am

Re: Perfromance

#9 Post by yandonman » Sun Sep 09, 2012 6:48 pm

I caught Universe::GetEffectsAndTargets() going off on a merry ride with a target_objects filled with more than a million objects - on the UI thread, which seems to be causing a hang.
GuiHang090912_maybe.jpg

What is the intent of GetEffectsAndTargets()?
Code released under GPL 2.0. Content released under GPL 2.0 and Creative Commons Attribution-ShareAlike 3.0.

User avatar
Geoff the Medio
Programming, Design, Admin
Posts: 12478
Joined: Wed Oct 08, 2003 1:33 am
Location: Munich

Re: Perfromance

#10 Post by Geoff the Medio » Sun Sep 09, 2012 6:55 pm

yandonman wrote:What is the intent of GetEffectsAndTargets()?
It determines what effects will act on what target objects, before the effects are actually executed on those objects. Effects make planets have resource outputs, stealth, target population, supply ranges, and other properties, and make ships have fuel, shields, and speeds, and other properties, and make monsters move, and other such things. On the client, effects are re-run when you change something like a focus setting that might alter things like the resource output or target population, in order to determine what the new values are.

How have you generated over 4 million object ids? In a 400 turn game, that would be 10000 new (or rather more) objects on average each turn. In a 500 system galaxy, that'd be 20 objects per turn per system... Which I guess isn't completely unreasonable later in the game, but is still quite a lot...

Oh wait, you're looking in the MSVC debugger. That's notoriously unreliable, and likes to make random large number guesses when it can't find the actual contents of a variable. Have a look in your log files with verbose logging on for lines containing "ObjectMap contains UniverseObjects:" and see what the largest object ID is in the list of objects after that.

yandonman
Creative Contributor
Posts: 699
Joined: Thu Aug 30, 2012 12:32 am

Re: Perfromance

#11 Post by yandonman » Sun Sep 09, 2012 7:16 pm

I don't know enough about how log4cpp works to know if there's a difference in immediacy of writing out to the log file between .debug and .debugStream. Why do you think that / where is it documented that there is?
Digging into log4cpp more, it doesn't appear to matter if you're using debug() or debugStream() - they both eventually call the Category::log() function.

CategoryStream.cpp:

Code: Select all

    CategoryStream::~CategoryStream() { 
        flush();
    }

    void CategoryStream::flush() {
        if (_buffer) {
            getCategory().log(getPriority(), _buffer->str());
            delete _buffer;
            _buffer = NULL;
        }
    }
    
    CategoryStream& CategoryStream::operator<<(const char* t)
    {
       if (getPriority() != Priority::NOTSET) {
          if (!_buffer) {
             if (!(_buffer = new std::ostringstream)) {
                // XXX help help help
             }
          }
          (*_buffer) << t;
       }
       return *this;
    }
Minor performance note, the use of debug() would prevent string concatenation as it evaluates if it's going to log before it concatenating the strings. With debugStream(), it concatenates the strings together first, then decides if it needs to log.


To get threaded file write support, a ThreadedFileAppender class would have to be written. At least I can't find one that exists. Be that as it may, I'm betting this is not the biggest performance hit at the moment...
Code released under GPL 2.0. Content released under GPL 2.0 and Creative Commons Attribution-ShareAlike 3.0.

User avatar
Geoff the Medio
Programming, Design, Admin
Posts: 12478
Joined: Wed Oct 08, 2003 1:33 am
Location: Munich

Re: Perfromance

#12 Post by Geoff the Medio » Sun Sep 09, 2012 7:24 pm

yandonman wrote:To get threaded file write support, a ThreadedFileAppender class would have to be written. At least I can't find one that exists.
There are various thread-related things going on already in the log4cpp code... I'd have thought that, if not already automatically enabled, threading wouldn't require a large rewrite to enable.

yandonman
Creative Contributor
Posts: 699
Joined: Thu Aug 30, 2012 12:32 am

Re: Perfromance

#13 Post by yandonman » Mon Sep 10, 2012 12:34 am

NVidia's OpenGL support (nvoglv32.dll [v: 8.17.12.9619] - NVIDIA Compatible OpenGL ICD) appears to be the CPU culprit on my system.
CPU_Performance_NVideaOpenGL.jpg
CPU_Performance_NVideaOpenGL.jpg (148.59 KiB) Viewed 1505 times
(let xperf run for a longer sampling)
Code released under GPL 2.0. Content released under GPL 2.0 and Creative Commons Attribution-ShareAlike 3.0.

User avatar
Geoff the Medio
Programming, Design, Admin
Posts: 12478
Joined: Wed Oct 08, 2003 1:33 am
Location: Munich

Re: Perfromance

#14 Post by Geoff the Medio » Mon Sep 10, 2012 12:37 am

Try disabling / adjusting graphics and video options as suggested earlier.

yandonman
Creative Contributor
Posts: 699
Joined: Thu Aug 30, 2012 12:32 am

Re: Perfromance

#15 Post by yandonman » Mon Sep 10, 2012 3:25 am

I did try those and none of those made a perceptible difference to CPU. Apparently, NVIDIA just likes chewing on a CPU core. :/ I guess I'm going to put that to the side for a bit at concentrate on other perf issues.
Code released under GPL 2.0. Content released under GPL 2.0 and Creative Commons Attribution-ShareAlike 3.0.

Post Reply