Mini-engine progress

*Optimisation in progress*

Turn time of T209 on the gigantic test map is down to 1.3s-1.4s using clang-cl or ~1.6s using MSVC.

Switched out the "plot index"-based A* to coordinate-based, which is what a sane person would do if they needed to wrap coordinates.
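
A minimal sketch of why coordinates make wrapping easier, assuming a map that wraps in X only. With a plot index you first have to split it back into x and y before you can wrap; with coordinates the wrap is one small step. All names here (`Coord`, `wrap`, `step`, `wrappedDx`) are hypothetical and not taken from the engine.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical coordinate type, for illustration only.
struct Coord { int x; int y; };

// Wrap v into [0, size). Unlike plain %, this also handles negative v.
inline int wrap(int v, int size) {
    assert(size > 0);
    int r = v % size;
    return r < 0 ? r + size : r;
}

// Step from c by (dx, dy) on a map that wraps in X but not in Y.
// Returns false if the step would leave the map vertically.
inline bool step(Coord c, int dx, int dy, int width, int height, Coord& out) {
    int ny = c.y + dy;
    if (ny < 0 || ny >= height) return false;
    out = Coord{ wrap(c.x + dx, width), ny };
    return true;
}

// Shortest horizontal distance across the wrap, assuming both
// x values are already in [0, width).
inline int wrappedDx(int x0, int x1, int width) {
    int d = std::abs(x1 - x0);
    return d < width - d ? d : width - d;
}
```

Something like `wrappedDx` is what an A* heuristic would need so that it does not overestimate distances across the seam.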

A lot of time in pathing is spent just doing priority queue pops. It's disappointingly difficult to make a faster priority queue. Currently using a quaternary heap, which is ever so slightly faster than the std heap. I also have an SIMD quaternary vector heap which can sometimes be faster, but not by much. And an SIMD "multi root heap" which is slower.
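
For reference, a minimal sketch of a quaternary (4-ary) min-heap, not the engine's actual implementation. `QuadHeap` is a hypothetical name. Each node has up to four children stored next to each other, so the tree is about half as deep as a binary heap, at the price of up to four comparisons per level on a pop.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical 4-ary min-heap, for illustration only.
template <typename T>
class QuadHeap {
public:
    bool empty() const { return data_.empty(); }
    std::size_t size() const { return data_.size(); }

    void push(T value) {
        data_.push_back(std::move(value));
        std::size_t i = data_.size() - 1;
        // Sift up: the parent of node i is (i - 1) / 4.
        while (i > 0) {
            std::size_t parent = (i - 1) / 4;
            if (!(data_[i] < data_[parent])) break;
            std::swap(data_[i], data_[parent]);
            i = parent;
        }
    }

    T pop() {
        assert(!data_.empty());
        T top = std::move(data_.front());
        if (data_.size() > 1) data_.front() = std::move(data_.back());
        data_.pop_back();
        // Sift down: the children of node i are 4i + 1 .. 4i + 4.
        const std::size_t n = data_.size();
        std::size_t i = 0;
        for (;;) {
            std::size_t first = 4 * i + 1;
            if (first >= n) break;
            std::size_t last = first + 4 < n ? first + 4 : n;
            std::size_t best = first;
            for (std::size_t c = first + 1; c < last; ++c)
                if (data_[c] < data_[best]) best = c;
            if (!(data_[best] < data_[i])) break;
            std::swap(data_[i], data_[best]);
            i = best;
        }
        return top;
    }

private:
    std::vector<T> data_;
};
```

The inner loop over the four adjacent children is the part a SIMD variant would presumably try to replace with a vector minimum.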

I also tried Block A* (without LDDB), which was faster in simple pathing tests, but turned out to be slower for Civ4 pathing. Cost function overhead probably.

Think I can finally stop trying to make pathing faster and move onto other things.

I've also set Windows Terminal as the default now, finally, so maybe I could actually use all the Unicode at some point.
 
Sounds awesome.

My question is, roughly what would the turn time be in the original engine?
I guess about 30 seconds?
 
Wow some incredible advancement in this thread recently. snowern is doing incredible work.
I really wish I could help, but C# is my thing, not C++ or Python.
A few mods already have a limit of 50 civs in a game, or even 100. For a real simulationist game 32 is hardly enough. Such mods want to have as many civs as possible even if most of them are no challenge. But that's realistic: there are big countries and small countries.
I remember a modder even tried a 200-civ limit but that failed. He discovered that the final limit was around 120-130.
Am I the only lunatic whose opinion regarding more civs/bigger map is: MORE!!!!!!!!!!!!!!!!!!!! ?
When I play strategy games, I always go max map size/civilizations my computer can take.
 
I think players will be limited only by memory in my engine *(and whether the UI can handle your 128x128 relations matrix...). They will each need a largish chunk of memory for their caches. But performance might be terrible. Each player needs to calculate a cache of pathing plot props and found values, and maintain it. And some cache invalidations are broadcasted to each player: every player needs to maintain a plot danger bitmap that gets invalidated every turn by thousands of barbs. But I suppose you could solve all those problems with a little bit of war. It would be a good game if performance can only get better.
 
Music to my ears, I assume the same goes for map size?

Are you a memory salesman? Because I guess I finally got the excuse I needed to buy 64gb in the near future, mwhahahahahah.

I dream of a map that makes the UEM world map look small. I always felt Civilization maps suffered from their scale, they cannot properly portray a good world map at all.
 
Hi,

@snowern,
May I ask you, kindly and carefully: do you have any idea when you expect your work to be ready for the public? Days, weeks, months (years)?

I guess a lot of people can't wait for that moment, me too :-)
 
Music to my ears, I assume the same goes for map size?
1040x640 is enough for everybody! Might be too much for me though. I end up feeling disoriented on Huge maps because everything looks the same... I'm thinking location hotkeys would be nice.

And then you have maze maps. How about a 1040x640 maze. You'll be there for weeks trying to win domination (you'd also have a super marathon game speed so you have enough time to meet everybody).

Now, found value calculation. I've overhauled the SIMD code to be more generic, more sane, and actually tested. And found value calculation has been re-verified for 1040x640 for AVX2, AVX512, MSVC, and clang.

Calculation does 16 or 32 found values at a time. Bottom half of the procedure uses 32-bit ints, so 32x will double-pump AVX512.

Timings in milliseconds, single-threaded, *after caching inputs:
Code:
              AI_foundValue          computeFoundValue_Vectorised
            avx512      avx2  avx512 32x  avx512 16x  avx2 32x  avx2 16x
msvc       1251.99   1245.88      186.62      124.12    155.38    170.21
clang-cl    997.49    966.29        7.69       10.38     15.93     15.05
msvc/clang    1.26      1.29       24.25       11.96      9.75     11.31

24x faster than MSVC. Clang is great at punching through my levels of abstraction. What is MSVC even doing. Did I do something wrong. The generated code is filled with calls instead of clang's beautiful wall of AVX.

Multithreading is not so great for the vectorised case. Only about a 2x improvement.

May I ask you, kindly and carefully: do you have any idea when you expect your work to be ready for the public? Days, weeks, months (years)?
Soon. One day. I really should stop optimising... making sure at least a normal game works would be great.
 
1040x640 is enough for everybody!
Bill Gates said:
640K ought to be enough for anybody.
😅


I end up feeling disoriented on Huge maps because everything looks the same..
Mods need bigger maps not the base game. Mods with more terrain types and features, even natural wonders, and of course better mapscripts are not like that.
 
What would happen if one tries to run your code on a CPU that does not support AVX512?
It would be the AVX2 build. I'm on Alder Lake, so I shouldn't even have AVX512!

Controlled by compiler switch:
C++:
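// __AVX512F__ is predefined by the compiler when AVX-512 is enabled (e.g. /arch:AVX512 or -mavx512f).
#if defined(__AVX512F__)
    inline constexpr bool kEnableAVX512 = true;
#else
    inline constexpr bool kEnableAVX512 = false;
#endif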

Even for 128/256-bit vectors, that enables use of mask regs and useful instructions.

And another benefit of clang: It complains if you use AVX512 in an AVX2 build.
 
1040x640 is enough for everybody! Might be too much for me though. I end up feeling disoriented on Huge maps because everything looks the same... I'm thinking location hotkeys would be nice.
Narrator: It wasn't enough for everybody.

And then you have maze maps. How about a 1040x640 maze. You'll be there for weeks trying to win domination (you'd also have a super marathon game speed so you have enough time to meet everybody).
Oh yeah, a Super Marathon game speed would definitely be required.

You could also argue for the need for a movement speed multiplication setting, otherwise wars would last foreveeeer once empires are big enough.
 
You could also argue for the need for a movement speed multiplication setting, otherwise wars would last foreveeeer once empires are big enough.
I wouldn't like that. I prefer Epic and Snail game speeds. Yes, it takes much longer but I like it that way in my mod :)
 
@snowern a question: How are tiles implemented in your engine? Do you think it would be possible to add an alternate implementation? Such as hexes, or SMAC-esque dynamic voxel terrain?
I wouldn't like that. I prefer Epic and Snail game speeds. Yes, it takes much longer but I like it that way in my mod :)
I don't mind slower speeds, I like long games myself.

I always thought 4X games like Civ had an issue with time-space scale myself. Armies are too slow, wars take too long, years go by too fast, especially in the early game. It's not a big thing, but it just bugged me. SMAC is better in that regard.
Also, since I started playing Pdox games, I realized the way they do diplomacy and ownership is just superior. Take Territory = Own Territory is a bad implementation. 4X diplomacy implementations lean too much towards "Winner Takes All" and "Total War" types of game loops, but that's not how things happened historically, especially in pre-modern times.

(honestly I love 4X games but my big issue with them is that people are still cargo culting Civilization and MOO after decades, when we should be innovating. There are features from games like Call to Power, SMAC and Space Empires IV that were never picked up by their successors, or were barely reused)
 
@snowern a question: How are tiles implemented in your engine? Do you think it would be possible to add an alternate implementation? Such as hexes, or SMAC-esque dynamic voxel terrain?
For hexes, what you'd do I think is just create a whole new renderer. And reuse some components like city billboards and symbols.

It is something I've wondered about. Take the hexes from modern civ and just force civ 4 onto a hex grid. But, not very easy. Maybe you'd need new map scripts. But at least, a lot of things in the DLL appear to be grid-agnostic.

Could probably do it with graphical civ 4 too. The art assets weren't made for it, but it could work.
 
I always thought 4X games like Civ had an issue with time-space scale myself. Armies are too slow, wars take too long, years go by too fast, especially in the early game. It's not a big thing, but it just bugged me. SMAC is better in that regard.
Yeah, but SMAC's turns represent weeks or months, not decades, right?
Also, since I started playing Pdox games, I realized the way they do diplomacy and ownership is just superior. Take Territory = Own Territory is a bad implementation. 4X diplomacy implementations lean too much towards "Winner Takes All" and "Total War" types of game loops, but that's not how things happened historically, especially in pre-modern times.
How is it in Paradox games? And could those be used in Civ4, if possible at all?

(honestly I love 4X games but my big issue with them is that people are still cargo culting Civilization and MOO after decades, when we should be innovating. There are features from games like Call to Power, SMAC and Space Empires IV that were never picked up by their successors, or were barely reused)
Care to explain? I'm always looking for ideas for what to add/improve/implement in my mod, if I can :)


Maybe you'd need new map scripts.
Maybe Civ5-6-7 map scripts could be used/adapted but AFAIK those are not as good as the ones written for Civ4 by some modders.
 
Maybe Civ5-6-7 map scripts could be used/adapted but AFAIK those are not as good as the ones written for Civ4 by some modders.
There are still only 2 dimensions for hexes, so most things except those hardcoded with squares in mind (rivers and city working range-related) would still work if you changed rules of movement and range calculation between them. In Snowern's words:
a lot of things in the DLL appear to be grid-agnostic.
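
To make the "rules of movement and range calculation" part concrete, here is a rough sketch assuming axial hex coordinates (q, r), which is one common way to put hexes on a 2D grid. Nothing here comes from the Civ4 DLL; `Hex`, `kHexDirs`, `hexNeighbour`, `hexDistance` and `squareStepDistance` are hypothetical names.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical axial hex coordinate, for illustration only.
struct Hex { int q; int r; };

// The six neighbour offsets in axial coordinates
// (a square grid with diagonal moves would have eight).
constexpr Hex kHexDirs[6] = {
    {1, 0}, {1, -1}, {0, -1}, {-1, 0}, {-1, 1}, {0, 1}
};

// Neighbour of h in direction dir (0..5).
inline Hex hexNeighbour(Hex h, int dir) {
    assert(dir >= 0 && dir < 6);
    return Hex{ h.q + kHexDirs[dir].q, h.r + kHexDirs[dir].r };
}

// Number of steps between two hexes.
inline int hexDistance(Hex a, Hex b) {
    int dq = b.q - a.q;
    int dr = b.r - a.r;
    return (std::abs(dq) + std::abs(dr) + std::abs(dq + dr)) / 2;
}

// For comparison: number of steps on a square grid where
// diagonal moves are allowed.
inline int squareStepDistance(int dx, int dy) {
    int ax = std::abs(dx);
    int ay = std::abs(dy);
    return ax > ay ? ax : ay;
}
```

Storage stays a plain 2D array either way; only the neighbour table and the distance function differ.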
 