multithreaded DLL

Sephi

I haven't been able to keep up with the modding community, so maybe someone can give me information about the current status.

Are there mods that already use multithreading to reduce turn times?
Are there mods that plan to add this feature?

I am currently implementing multithreading for the AI in my mod, and I'd like to know whether others have tried it before: how they approached it, results, limitations they discovered, etc. For example, has anyone managed to multithread Python? :D I don't know enough about Python, so I haven't even tried.

So far I have multithreaded the worker AI as a test (still not very optimized). The list shows the time needed:

1 thread used: 3.5s
2 threads used: 2.5s
3 threads used: 2.3s

4 or more threads are slower. I have a CPU with only two cores and hyperthreading, so an optimum at 3 threads is kinda expected.

I might make a modcomp when it is done, but there is probably no demand for it anyway.
 
Yes, we can multi-thread the DLL. C2C already does that for Spawns and Properties, but Koshling and AIAndy assure us that much much more can be done.

As for python, the game engine apparently has a global lock on the interpreter, so multi-threading that is NOT possible. :(

edit: Also, does this mean you are coming back to CFC?
 
I haven't been able to keep up with the modding community, so maybe someone can give me information about the current status.

Are there mods that already use multithreading to reduce turn times?
Are there mods that plan to add this feature?
Yes, the C2C DLL has some multithreaded sections and plans to extend that in the future.

I am currently implementing multithreading for the AI in my mod, and I'd like to know whether others have tried it before: how they approached it, results, limitations they discovered, etc. For example, has anyone managed to multithread Python? :D I don't know enough about Python, so I haven't even tried.
You can't multithread the Python version in Civ4 in a meaningful way, as it has a global interpreter lock. Any attempt at multithreading within it would actually only multithread the parts that are outside of Python but called by it.

So far I have multithreaded the worker AI as a test (still not very optimized). The list shows the time needed:

1 thread used: 3.5s
2 threads used: 2.5s
3 threads used: 2.3s

4 or more threads are slower. I have a CPU with only two cores and hyperthreading, so an optimum at 3 threads is kinda expected.

I might make a modcomp when it is done, but there is probably no demand for it anyway.
Well, there definitely is demand in the major mods that have long turn times, but given the amount of code changes that accumulate in such a mod over time, it might be difficult to apply multithreaded modcomps.

The difficult part is to do the multithreading in a way that does not destroy the deterministic property needed for multiplayer.
 
Very cool. I looked at CvPropertySolver.cpp, good stuff there. Have you measured the speedup when you set NUM_THREADS to different values?

Since there is some interest, I'm going to explain a bit of what I have done so you can get an idea of whether it is useful for you or not.

There are a lot of for loops in the game that look like this:

for i = 0 to x
getValue(i)

and then at the end the for loop returns the maximum or the minimum of the return values of getValue. For example, in the worker AI you loop over all builds on a given plot and then return the value of the best build so the worker knows what to build.

I have written some code so that basically you write a parallel for instead of a for, and everything else, like managing threads, is done by a thread organizing class.

Here is the full example.
Old:
Code:
	for(iI = 0; iI < GC.getNumBuildInfos(); iI++)
	{
		iValue = 0;
		if(GET_PLAYER(getOwnerINLINE()).canBuild(pPlot, (BuildTypes)iI, false))
		{
			iValue = AI_PlotBuildValue((BuildTypes)iI, pPlot);
		}

		if(iValue > iBestValue)
		{
			iBestValue = iValue;
			eBestBuild = (BuildTypes)iI;
		}
	}
replaced by new:
Code:
	GC.getGame().getThreadOrganizer().setLoopObject(pPlot);
	GC.getGame().getThreadOrganizer().setLoopTarget(this);
	GC.getGame().getThreadOrganizer().parallel_loop(&CvGameAI::testParallelFor, GC.getNumBuildInfos(), iBestValue, (int&)eBestBuild);
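and also a new function:
Code:
// Called by parallel_loop once per loop index i; returns the value of build i.
int CvGame::testParallelFor(int i)
{
	// Fetch the context that the caller stored via setLoopTarget / setLoopObject.
	ThreadOrganizer &kOrganizer = getThreadOrganizer();
	CvCityAI* pCity = (CvCityAI*)kOrganizer.getLoopTarget();
	CvPlot* pPlot = (CvPlot*)kOrganizer.getLoopObject();

	// Same test as the body of the old serial loop.
	if(GET_PLAYER(pCity->getOwnerINLINE()).canBuild(pPlot, (BuildTypes)i, false))
		return pCity->AI_PlotBuildValue((BuildTypes)i, pPlot);

	// Builds that are not possible get a value of 0.
	return 0;
}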
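
For readers who want to see what a helper like parallel_loop might do internally, here is a minimal standalone sketch. It is not the ThreadOrganizer code from the mod: parallelBest and BestResult are hypothetical names, and it uses C++11 std::thread for brevity, which the Civ4 DLL toolchain may not offer (Win32 or Boost threads would be the likely substitute there). Each thread scans a contiguous index range, and the partial results are merged in index order so that ties resolve the same way as in the serial loop, which matters for the determinism mentioned earlier in the thread.

```cpp
#include <algorithm>
#include <functional>
#include <thread>
#include <vector>

// Result of a best-value search: the winning index and its value.
struct BestResult
{
    int index;
    int value;
};

// Hypothetical helper: evaluates getValue(i) for i in [0, count) on up to
// numThreads threads and returns the first index with the highest value.
// Like the serial loop, it starts from (startIndex, startValue) and only
// replaces the best on a strictly greater value.
BestResult parallelBest(int count, int numThreads,
                        const std::function<int(int)>& getValue,
                        int startIndex = -1, int startValue = 0)
{
    numThreads = std::max(1, std::min(numThreads, count));
    std::vector<BestResult> partial(numThreads, BestResult{startIndex, startValue});
    std::vector<std::thread> workers;

    for (int t = 0; t < numThreads; ++t)
    {
        // Thread t owns the contiguous index range [begin, end).
        const int begin = static_cast<int>(static_cast<long long>(count) * t / numThreads);
        const int end = static_cast<int>(static_cast<long long>(count) * (t + 1) / numThreads);
        workers.emplace_back([&partial, &getValue, t, begin, end]
        {
            BestResult best = partial[t];
            for (int i = begin; i < end; ++i)
            {
                const int value = getValue(i);
                if (value > best.value)
                {
                    best.index = i;
                    best.value = value;
                }
            }
            partial[t] = best; // each thread writes only its own slot
        });
    }
    for (std::thread& w : workers)
        w.join();

    // Merge in index order so ties resolve as in the serial loop,
    // independent of which thread finished first.
    BestResult best{startIndex, startValue};
    for (const BestResult& p : partial)
        if (p.value > best.value)
            best = p;
    return best;
}
```

This only works if getValue is safe to call from several threads at once, which is the same restriction as in the post (const functions, no Python, no random numbers).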

So basically I only use it on const functions that don't run Python or the random number generator (a workaround for that should be easy, though). It's been a while since I profiled my mod, but I expect that I should be able to multithread code this way that uses at least 50% of the turn time. (It's probably very different for C2C, though.)
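
One possible shape for the random number workaround, as a sketch only (the post does not say which workaround is meant): draw all random numbers on the main thread in a fixed order before the parallel part, and let the workers read them by index. Here std::mt19937 stands in for the game's RNG, and predrawRandoms and jitteredValues are hypothetical names.

```cpp
#include <cstdint>
#include <random>
#include <thread>
#include <vector>

// Draws one random number per loop index on the calling thread, in index
// order, so the RNG state advances identically no matter how the work is
// later split across threads.
std::vector<std::uint32_t> predrawRandoms(std::mt19937& rng, int count)
{
    std::vector<std::uint32_t> draws(count);
    for (int i = 0; i < count; ++i)
        draws[i] = static_cast<std::uint32_t>(rng());
    return draws;
}

// Hypothetical parallel step: each index adds a small random offset (0..9)
// taken from its own pre-drawn number. Workers never touch the RNG itself.
std::vector<int> jitteredValues(const std::vector<int>& base, std::mt19937& rng, int numThreads)
{
    const int count = static_cast<int>(base.size());
    const std::vector<std::uint32_t> draws = predrawRandoms(rng, count);
    std::vector<int> out(count);
    std::vector<std::thread> workers;
    if (numThreads < 1)
        numThreads = 1;

    for (int t = 0; t < numThreads; ++t)
    {
        workers.emplace_back([&, t]
        {
            // Strided split: thread t handles indices t, t + numThreads, ...
            for (int i = t; i < count; i += numThreads)
                out[i] = base[i] + static_cast<int>(draws[i] % 10);
        });
    }
    for (std::thread& w : workers)
        w.join();
    return out;
}
```

The cost is that every index consumes a random number even if the serial code would have skipped it, so the RNG sequence differs from the unthreaded version of the mod.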

By the way, I have written code that does a test run on game start to find the optimum number of worker threads (instead of hardcoding a number). It does a test calculation with 0 to 100 threads and then uses the number of threads that needs the lowest time. You might want to copy that to C2C, unless you have something similar planned or done.

@ls612 I have really only been at the FFH and SDK forums in the past, and both are kinda dead (at least compared to the past). But I guess I'll take a look at C2C every now and then, lots of activity there :goodjob:
 
Very cool. I looked at CvPropertySolver.cpp, good stuff there. Have you measured the speedup when you set NUM_THREADS to different values?
Not yet, but it is something I want to do at some point.

So basically I only use it on const functions that don't run Python or the random number generator (a workaround for that should be easy, though). It's been a while since I profiled my mod, but I expect that I should be able to multithread code this way that uses at least 50% of the turn time. (It's probably very different for C2C, though.)
It will be different but I expect with the number of assets C2C has that there will be an even larger amount of turn time that can profit from multithreading.
Koshling regularly profiles with different savegames to find performance bottlenecks so he will know which part is best multithreaded (although all the caching on different levels he has added will need some extra treatment for multithreading).

By the way, I have written code that does a test run on game start to find the optimum number of worker threads (instead of hardcoding a number). It does a test calculation with 0 to 100 threads and then uses the number of threads that needs the lowest time. You might want to copy that to C2C, unless you have something similar planned or done.
That would definitely be useful and I agree that making it dynamic is better than hardcoding (I was mainly going the fast way for the first tests with multithreading in the Civ environment).
 