[Dev] Performance Improvements Thread

Puppeteer · Nov 27, 2021

Flintlock said:
By the way, do you have any plans for the Civ3Map class beyond what it's currently being used for?

No, not anymore. I was using it for TempTiles but decided it was better to pull TempTiles out of the repo than try to keep it, C7, and Civ3Map synced up. I'll just pull the scene and LegacyMap from an older commit into a different utility repo.

I was surprised when you first said this, but no, I wound up deleting or commenting out pretty much everything but the tile indexing, and there is really no need for it to even be its own Node2D anymore.

I'm not sure I touched MapView...yep, every single character of that file was committed by you as of right now, so nothing to merge. All I had to do was feed it a different TileSet and map tile index. And I have my fingerprints all over GameData and GameMap, but I don't think I changed anything substantial. Just removed map generation code which isn't sophisticated enough to work with all the base terrain tiles and not nearly close to being ready for it. (You know about git blame, right? Code view, "Blame" button on upper-right...love it. `git blame <file>` on the command line.)

Flintlock · Nov 28, 2021

You hadn't changed MapView in Development but I had changed it in the ExperimentsWithMapView branch. I expected merging the old version into the newer one would have produced a conflict, or at least been noted somewhere if Git was going to make the merge work by reverting the file. I don't understand why the file was not even mentioned.

Puppeteer said:
You know about git blame, right? Code view, "Blame" button on upper-right...love it. `git blame <file>` on the command line.

Neat, I'll try to keep this in mind. I've heard of Git blame before but I don't think I've ever used it.

Puppeteer · Nov 28, 2021

I just noticed something weird in Civ3Map.cs that it says I added. I have no idea where this came from; what kind of fat fingering led me to this? https://github.com/C7-Game/Prototyp...efbb13bd39e4d296416/C7/Civ3Map/Civ3Map.cs#L51

Middle two lines are useless:

Code:

                if(TileIDLookup[tile.ExtraInfo.BaseTerrainFileID,1] == 0) { LoadTileSet(tile.ExtraInfo.BaseTerrainFileID); }
                var _ = TileIDLookup[tile.ExtraInfo.BaseTerrainFileID,tile.ExtraInfo.BaseTerrainImageID];
                Map[tile.xCoordinate,tile.yCoordinate] = 0;
                Map[tile.xCoordinate,tile.yCoordinate] = TileIDLookup[tile.ExtraInfo.BaseTerrainFileID,tile.ExtraInfo.BaseTerrainImageID];

Just FYI, but those weren't there before my last merge, and they don't seem to be actually doing anything now. Just thought I'd mention it if you're copying/moving that functionality elsewhere. I won't try to patch it since you may be in there now.

Wild guess is that I accidentally clicked on one of those VSCode "code action" suggestion buttons. I did that intentionally today out of curiosity and it did something really really stupid. (Added a copy of a parameter I already defined into a function declaration causing an error.)

Edit: As for merging, git is pretty smart: it compares both branches against their common ancestor commit (the merge base), not just against each other. So presuming MapView in Development wasn't changed after you branched off, git sees that only your branch touched the file and keeps your newer version, with no conflict to report. Even though Development HEAD was ahead of your branch overall, that particular file was behind. git is awesome.

Edit 2: I also just noticed that the top of our new map seems to have lighter water at the North pole but not at the South pole. That's a new game, so I don't have a way to double-check it without playing it a while, but I found an old save of another game and it doesn't do that light edge at either pole. It kinda looks like a hack I did with the earlier code when I had no top or bottom tile to reference, but that shouldn't even be possible in this new code as we're not bitmasking tile selection anymore.

Incidentally, that save is seed 1234567, small map, default (middle) map settings.

Quintillus · Nov 28, 2021

Git blame can be really useful. One of my colleagues takes a less pessimistic view than Linus Torvalds, and aliases it to "git credit", and I've taken to doing the same on machines where I use Git primarily via the command line.

Subversion and Mercurial call the same functionality "annotate", which you'll see in some IDEs as well.

That is one of the cool things about working with a new team, there are always opportunities to learn new ways of doing things.

I'm glad to hear the map refactor is merged! I'll be pulling that code and rebasing my city branch on it. Sounds like there was a lot of progress in general, though I'll try to avoid spilling that over into this thread.

Quintillus · Nov 28, 2021

Puppeteer said:
Incidentally, that save is seed 1234567, small map, default (middle) map settings.

I got the map generated locally (70% Continents), but am having trouble figuring out how to run the Program.cs script to generate the JSON map. Probably because I'm not a C# guy. I found some documentation that said Run -> Start without debugging should work, but when I do that, it auto-generates a .vscode/launch.json file at the top level of the repo that contains:

Code:

{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": []
}

The output window doesn't show anything, and the only problems are in our existing code base (unused variables). So I'm not sure why it's just generating a JSON file when I'm trying to run it. Doesn't seem to matter if I choose .NET 5+/.netcore, or .NET 4.

I also tried loading the .csproj in Visual Studio 2019 (not VSCode). Trying to run it there, I get slightly more output:

Code:

NuGet package restore failed. Please see Error List window for detailed warnings and errors.
1>------ Rebuild All started: Project: BuildDevSave, Configuration: Debug Any CPU ------
1>C:\Development\Civ Related Projects\Prototype\_Console\BuildDevSave\BuildDevSave.csproj : error NU1105: Unable to find project information for 'C:\Development\Civ Related Projects\Prototype\C7GameData\C7GameData.csproj'. If you are using Visual Studio, this may be because the project is unloaded or not part of the current solution so please run a restore from the command-line. Otherwise, the project file may be invalid or missing targets required for restore.
1>C:\Development\Civ Related Projects\Prototype\_Console\BuildDevSave\BuildDevSave.csproj : error NU1105: Unable to find project information for 'C:\Development\Civ Related Projects\Prototype\QueryCiv3\QueryCiv3.csproj'. If you are using Visual Studio, this may be because the project is unloaded or not part of the current solution so please run a restore from the command-line. Otherwise, the project file may be invalid or missing targets required for restore.
1>Done building project "BuildDevSave.csproj" -- FAILED.
========== Rebuild All: 0 succeeded, 1 failed, 0 skipped ==========

At least it kind of makes sense that it wants things to be in a .sln file. But I'm still suspecting that there must be an obvious way to get it to work in VSCode that I'm missing.

It's also interesting looking at the JSON file, seeing what's in it. I didn't even have to fire up my hex editor to read it! At some point we'll have to make some decisions about how much to put in e.g. unitsOnTile - do we store all the units on a tile there? Do we have a separate top-level units area under gameData -> map? I haven't thought through it, so I don't know the answers. (This is also intended as a "keep in the back of the mind" thought, not a "let's think about this now" thought)

Puppeteer · Nov 28, 2021

For the console stuff I've been using dotnetcore, so cd to /_Console/BuildDevSave, and then `dotnet run`. I have it hard-coded to read a file from my system, though. Easy enough to change in Program.cs, but if you look there you'll see all it does anymore is call ImportCiv3.ImportSav and then C7SaveFormat.Save.

Quintillus · Nov 29, 2021

It looks like all it took was you replying for it to start working. I tried 'dotnet run' shortly after posting, after being reminded of its existence in documentation (I think I used it back in 2019 when I was brushing up on C# for a potential assignment, but had since forgotten it). But it didn't work, and gave an error that I didn't have the right targeting pack. So I downloaded it from the page the error message pointed me to, tried again, same error. Used the Visual Studio Installer to add targeting packs, no dice, despite starting new command prompts and a Visual Studio special command prompt. Decided to give up and watch Rocky instead. Now that I try it again, for some reason, it works. My guess is some part of the install process wasn't 100% finished yet, but I'll give the credit to your post.

I am currently using .NET 4.8 instead of the 5.0 it was using before, since Microsoft's documentation and the VS Installer were pointing me to .NET 4.x targeting packs. I'll try again later with 5.0, but for now I'm just happy it's working. Switching the paths was the easy part; convincing .NET that it was okay to run was the hard part.

Puppeteer · Nov 29, 2021

Yeah, it's really messy jumping between SDKs, but I keep going to dotnetcore/dotnet because I have it at least a little bit figured out. I know `nuget restore` for the Mono side but not really anything else. I like simple command line stuff for a lot of my tasks.

I should really just figure out how to Mono on the console and make sure everything is on ... if not the same SDK then mono 4.72 or whatever Godot Mono 3.4 is using. Right now the non-Godot libraries are on netstandard2.0 for compatibility between the dotnetcore and Mono lines.

But now that I have the file opener in C7, we could manually open the file...oh we don't have a save function yet. Hmm...

Quintillus · Nov 29, 2021

I went ahead and merged the FLC improvements. Haven't done anything else since 5 days ago, beyond rebasing it and removing some console writes this morning when reviewing the changes. But it had fallen 82 commits behind development, so it was time to merge it. It'll give that 40% boost to FLC import performance.

Incidentally, we've added enough stuff in that time that the game scene load is now 2.4 seconds (with FLC improvements), versus 1.2 when I started that branch, and about 0.8 with the improvements. I don't feel like chasing performance right now, but it is interesting to observe. The "Game scene load time" will print out on the console when you load things up, so you can track the performance informally that way.

WildWeazel · Dec 12, 2021

I was just poking around in the IndieCiv forum and came across some relevant discussion of FLC loading at the top of this thread https://forums.civfanatics.com/threads/indieciv-updates.451947/

Caro-Kann · Dec 31, 2021

I've opened a pull request centered around optimizing PCX stuff: loading and Godot converting. https://github.com/C7-Game/Prototype/pull/71

It takes the "Game scene load time" timer from roughly 1800 ms to roughly 750 ms, at least on my machine and for the particular Sav I was testing. There are still some ways PCX stuff in particular can be improved further (pointers), but I think it'll be diminishing returns. My guess is that most of the remaining time to be saved will come from Godot-specific tricks rather than C# ones, but that's just a hunch.

Quintillus · Jan 13, 2022

Caro-Kann said:
I've opened a pull request centered around optimizing PCX stuff: loading and Godot converting. https://github.com/C7-Game/Prototype/pull/71

It takes the "Game scene load time" timer from roughly 1800 ms to roughly 750 ms, at least on my machine and for the particular Sav I was testing. There are still some ways PCX stuff in particular can be improved further (pointers), but I think it'll be diminishing returns. My guess is that most of the remaining time to be saved will come from Godot-specific tricks rather than C# ones, but that's just a hunch.

I just reviewed the PR, and left some comments. There's one pretty-sure-it's-a-bug around junk bytes being added that probably hasn't been noticeable since all or almost all Civ3 PCX files are of even width. Otherwise, the general theme of my comments is questioning micro-optimizations that IMO decrease readability. Maybe they do make a difference, and if so, great, but in my FLC and in-general experience, micro-optimizations rarely make the sort of difference that SetPixel being eliminated does, so I'd lean towards readability if there's not a measurable difference. E.g. if my readability suggestions take it from 750 ms to 800 ms, they're probably worth it. If they take it from 750 ms to 1750 ms, then it turns out they matter... but given how much of a difference SetPixel made for FLC, I'd be quite surprised if it's not closer to the former than the latter.

The other key comment of note is saving time/memory by sharing palette/temporary storage structures between PCX files. IMO, this is hazardous, as if at any point in the future those are used at the same time by two threads, it will cause a subtle and hard-to-detect bug (and one that may be dependent on the timing of the threads... e.g. the developer who introduces the bug may be completely unaware, and only a small fraction of the users may see it, but those who do will have corrupt graphics).
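
To make the hazard concrete, here is a minimal Java sketch (Java only because the test snippet in this thread is Java; this is not the C7 PCX code). The class, the method names, and the `baseColor` stand-in for "palette read from a file" are all hypothetical. `decodeShared` reuses one static scratch palette, which is the pattern being warned about; `decodeIsolated` allocates its own palette per call, so two threads cannot interfere with each other.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PaletteBufferSketch {
    // Hazardous pattern: one scratch palette reused by every decode call.
    // If two threads decode at once, each can overwrite entries the other is reading.
    private static final int[] SHARED_PALETTE = new int[256];

    static int[] decodeShared(byte[] indices, int baseColor) {
        fillPalette(SHARED_PALETTE, baseColor);
        return applyPalette(indices, SHARED_PALETTE);
    }

    // Safer pattern: each call owns its palette, so concurrent calls stay independent.
    static int[] decodeIsolated(byte[] indices, int baseColor) {
        int[] palette = new int[256];
        fillPalette(palette, baseColor);
        return applyPalette(indices, palette);
    }

    // Hypothetical stand-in for reading a 256-entry palette out of an image file.
    private static void fillPalette(int[] palette, int baseColor) {
        for (int i = 0; i < palette.length; i++) {
            palette[i] = baseColor + i;
        }
    }

    // Maps each 8-bit palette index to its color.
    private static int[] applyPalette(byte[] indices, int[] palette) {
        int[] pixels = new int[indices.length];
        for (int i = 0; i < indices.length; i++) {
            pixels[i] = palette[indices[i] & 0xFF];
        }
        return pixels;
    }

    public static void main(String[] args) throws Exception {
        final byte[] indices = {0, 1, (byte) 255};
        Callable<int[]> first = () -> decodeIsolated(indices, 1000);
        Callable<int[]> second = () -> decodeIsolated(indices, 5000);
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<int[]> a = pool.submit(first);
            Future<int[]> b = pool.submit(second);
            System.out.println(a.get()[2] + " " + b.get()[2]); // prints "1255 5255"
        } finally {
            pool.shutdown();
        }
    }
}
```

The cost of the isolated version is one small allocation per image, which is the time/memory trade being discussed; swapping `decodeShared` into `main` would make the result depend on thread timing.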

Quintillus · Jan 13, 2022

Oh, and the other thing that occurred to me but didn't really fit into the PR context is that it would be good to have some testing around image loading, particularly as @Puppeteer reportedly made a Godot testing breakthrough in another thread.

I looked at what I have for Java testing in editor-related PCX world, and found that I have a bunch of test cases where it verifies that reading in a PCX results in the same image as reading an externally-converted-to-PNG file does:

Code:

    @Test
    public void pcx_8_Bit_Palette() throws Exception {
       
        BufferedImage pcxImage = getPCXImage("src/test/resources/testImages/images_to_test/pcx/Knights_1024_768.pcx");
        BufferedImage pngImage = ImageIO.read(new File("src/test/resources/testImages/reference/pcx/Knights_1024_768.png"));
       
        assertTrue(imagesAreEqual(pcxImage, pngImage));
    }
   
    @Test
    public void pcx_8_Bit_Palette_Custom_MSPaint_Tree_Image() throws Exception {
       
        BufferedImage pcxImage = getPCXImage("src/test/resources/testImages/images_to_test/pcx/TREE3112.PCX");
        BufferedImage pngImage = ImageIO.read(new File("src/test/resources/testImages/reference/pcx/TREE3112.png"));
       
        assertTrue(imagesAreEqual(pcxImage, pngImage));
    }

    protected boolean imagesAreEqual(BufferedImage a, BufferedImage b) throws Exception {
        if (a.getHeight() != b.getHeight() || a.getWidth() != b.getWidth()) {
            // Report dimensions as width by height so the message matches its own wording.
            throw new Exception("Width and height do not match; a is " + a.getWidth() + " by " + a.getHeight() + " and b is " + b.getWidth() + " by " + b.getHeight());
        }
        for (int y = 0; y < a.getHeight(); y++) {
            for (int x = 0; x < a.getWidth(); x++) {
                if (a.getRGB(x, y) != b.getRGB(x, y)) {
                    Color aColor = new Color(a.getRGB(x, y));
                    Color bColor = new Color(b.getRGB(x, y));
                    throw new Exception("Pixel (" + x + ", " + y + ") has different RGB values.  A: " + aColor + "; B: " + bColor);
                }
            }
        }
        return true;
    }

I probably used IrfanView to convert the PCX files to PNG (you'll need the image format extension, but it'll prompt you for it automagically when you open a PCX, IIRC), but any other PCX-supporting editor should work. Obviously the MSPaint_Tree test should say more about what's custom about the palette, but I think the general concept would be a good idea here, especially if we're considering any future changes to PCX processing.

Caro-Kann · Jan 14, 2022

Alright, I think I've followed all the suggestions you made and reverted things back to closer to where they were before. Overall performance loss seems to be about 60-70ms (on the Sav I've been testing with), I think mostly coming from going back to the old loop style for reading in Pcx data. So yeah, that's the tradeoff for better readability/threadability, so if you're fine with that, it's ready to merge.

Quintillus · Feb 7, 2022

I happened to leave C7 up today, and saw it had eaten a couple hours of CPU time when I returned, and was going through about half a core's worth of power to render a scene with admittedly a bunch of units, but nothing moving other than an End Turn alert. So, I took some steps into profiling.

The Godot profiler is highly focused on frame times and video memory consumption. It does tell you how much standard memory you're using (we're < 100 MB), and can tell you something about network traffic. But I haven't found anything useful about CPU usage.

So I decided to see if Rider could tell me anything. They've got some tutorials on it here. Two notes: one, you need the DotTrace plugin; and two, they changed the UI, so to get the profiler button you have to go to View -> Appearance -> Toolbar Classic to get the old view back.

So far, however, no dice there either; it says our Mono version is too old and to use Mono 5.10 or later instead. Now, WildWeazel posted that the bundled one is Mono 6.x, so I'm not sure what's going on there yet. Attaching the profiler to another .NET program that I randomly had running, I can see that it does break down which methods are using lots of CPU, which would be great. But those are probably using the traditional .NET Framework... back to that whole Mono/.NET disambiguation topic.

Writing this mainly so that the next time one of us looks at it, we'll know what's been tried. Took me a bit to figure out why the profiler button wasn't where it was supposed to be.

------------------

Edit: Found some useful information.

It should work with Rider + DotTrace on Mac and Linux (source). However, it doesn't yet on Windows, unless you compile your own Godot (source). I vaguely recall compiling my own Godot on Windows a few months ago, although I can't remember why. But it's probably easier to try everything on Linux, assuming my Rider trial works cross-platform.

Played around with the Godot profiler too, it still shows almost everything as idle time despite 17-20% CPU being in-use. And... now it dropped to 30 FPS instead of 60, but still shows as 99% idle. Although I do see the number of Godot nodes and Draw Calls increasing, reaching 311 and 1647 by turn 250 as the AI spams Warriors. Those are around 91/323 in a fresh game with one city.

Okay, if I move to an uncrowded area of the map, the number of draw calls falls sharply, and the FPS skyrockets. It still shows almost all the time as idle in the Profiler tab but the FPS, Nodes, and Draw Call charts on the Monitor tab suggest that as the map becomes filled with units (and I mean filled, we're talking Sid-level AI levels of lack of free tiles), there's a bottleneck with draw calls, likely to the GPU.

Flintlock · Feb 7, 2022

It's interesting that you're looking at this now. Coincidentally, I just made a pass over MapView to improve performance among other things. I submitted a PR a couple of hours ago; the branch is called PolishingMapView. By the way, if you're measuring FPS numbers you should first disable v-sync (it's under Project Settings -> Display -> Window); otherwise you'll be locked at 60, 30, etc. FPS.

At first I too thought we were limited by draw calls but optimizing to reduce that number didn't improve performance much. I used as reference the basic test map, in other words what you get when you load into a new game, zoomed out all the way. Drawing that used to generate 1483 draw calls and ran at 58 FPS on my system. Those draw calls were almost all terrain sprites, naturally since there was little else on the map. So I optimized it by making TerrainLayer assemble a list of terrain sprites to draw then sort by texture before submitting the draws to Godot, enabling Godot to batch them together into fewer draw calls to the GPU. That was very effective at reducing draw calls down to 312 but only modestly improved performance to about 65 FPS for me.

After experimenting some more, I discovered that the major limitation on framerate was the loop over visible tiles in LooseView._Draw using MapView.visibleTiles. One of the reasons that was so bad is that the drawing code redoes the loop for each map layer. I very significantly improved performance by first assembling a list of visible tiles containing only what's needed for drawing, a Tile reference & location, then looping over that for each layer. That improved my framerate to over 90 FPS. I'm not sure why that iteration is so slow, as far as I can tell it's the tileAt method that's mostly responsible. I thought maybe "yield return" was the problem but inlining visibleTiles to remove it only gave a tiny perf improvement.
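
A rough Java sketch of that restructuring, purely to show the shape of the change (it is not MapView/LooseView code, and it says nothing about why the real iteration is slow). The class, `Layer`, `VisibleTile`, and this simplified `tileAt` are hypothetical stand-ins. Each method returns how many tile lookups it performed so the two strategies can be compared.

```java
import java.util.ArrayList;
import java.util.List;

final class VisibleTileListSketch {
    /** Only what drawing needs: the tile (here just a terrain name) and its location. */
    static final class VisibleTile {
        final String terrain;
        final int x;
        final int y;

        VisibleTile(String terrain, int x, int y) {
            this.terrain = terrain;
            this.x = x;
            this.y = y;
        }
    }

    /** Hypothetical stand-in for one map layer's per-tile drawing. */
    interface Layer {
        void drawTile(VisibleTile tile);
    }

    private final String[][] map; // indexed as map[y][x]
    private int lookups = 0;      // number of tileAt calls in the current draw

    VisibleTileListSketch(String[][] map) {
        this.map = map;
    }

    /** Bounds-checked lookup; returns null when (x, y) is off the map. */
    String tileAt(int x, int y) {
        lookups++;
        if (y < 0 || y >= map.length || x < 0 || x >= map[y].length) {
            return null;
        }
        return map[y][x];
    }

    /** Before: every layer walks the visible rectangle and looks each tile up again. */
    int drawRescanning(List<Layer> layers, int width, int height) {
        lookups = 0;
        for (Layer layer : layers) {
            for (int y = 0; y < height; y++) {
                for (int x = 0; x < width; x++) {
                    String terrain = tileAt(x, y);
                    if (terrain != null) {
                        layer.drawTile(new VisibleTile(terrain, x, y));
                    }
                }
            }
        }
        return lookups;
    }

    /** After: walk the rectangle once, keep the results, then loop over that list per layer. */
    int drawFromList(List<Layer> layers, int width, int height) {
        lookups = 0;
        List<VisibleTile> visible = new ArrayList<>();
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                String terrain = tileAt(x, y);
                if (terrain != null) {
                    visible.add(new VisibleTile(terrain, x, y));
                }
            }
        }
        for (Layer layer : layers) {
            for (VisibleTile tile : visible) {
                layer.drawTile(tile);
            }
        }
        return lookups;
    }

    public static void main(String[] args) {
        String[][] map = {{"grass", "hills"}, {"coast", "grass"}};
        List<Layer> layers = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            layers.add(tile -> { }); // three no-op layers
        }
        VisibleTileListSketch view = new VisibleTileListSketch(map);
        System.out.println(view.drawRescanning(layers, 2, 2)); // prints 12 (3 layers x 4 tiles)
        System.out.println(view.drawFromList(layers, 2, 2));   // prints 4 (one pass)
    }
}
```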

Something to try in the future would be looping over tiles first, then layers for each tile. I'm hesitant about that, though, since it means objects might not overlap correctly according to their map layer, in other words, it means an object belonging to a lower layer might be drawn overtop of one from a higher layer, if they're on different tiles. Another thing we could do is consolidate some map layers. For example the forest, hills, and marsh layers could be combined into one overlay layer.
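
The ordering concern can be seen in a small Java sketch that only records the sequence of draws for each loop nesting; the layer and tile names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DrawOrderSketch {
    /** Current approach: finish a whole layer across all tiles before starting the next layer. */
    static List<String> layersThenTiles(List<String> layers, List<String> tiles) {
        List<String> order = new ArrayList<>();
        for (String layer : layers) {
            for (String tile : tiles) {
                order.add(layer + "@" + tile);
            }
        }
        return order;
    }

    /** Alternative: draw every layer of one tile before moving to the next tile. */
    static List<String> tilesThenLayers(List<String> layers, List<String> tiles) {
        List<String> order = new ArrayList<>();
        for (String tile : tiles) {
            for (String layer : layers) {
                order.add(layer + "@" + tile);
            }
        }
        return order;
    }

    public static void main(String[] args) {
        List<String> layers = Arrays.asList("terrain", "hills"); // lowest layer first
        List<String> tiles = Arrays.asList("tile0", "tile1");
        // prints [terrain@tile0, terrain@tile1, hills@tile0, hills@tile1]
        System.out.println(layersThenTiles(layers, tiles));
        // prints [terrain@tile0, hills@tile0, terrain@tile1, hills@tile1]
        System.out.println(tilesThenLayers(layers, tiles));
    }
}
```

In the second ordering, hills@tile0 is drawn before terrain@tile1, so if the hills graphic overhangs into tile1, that tile's lower-layer terrain would be painted over it.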

Quintillus · Feb 8, 2022

I was only really looking at it after noticing how much CPU it was taking up. I was away for another few hours this evening and found more than 4 hours of CPU usage by an idling C7.

Interesting findings. Looking over it now, and what you've written, I see what you mean that it now sorts the terrain draws to group the same graphics together, which as you say lets it make fewer draw calls? It can essentially say, "draw W at locations X, Y, and Z, and then A at B, C, and D", instead of "Draw W at X, then Draw A at B, then W at Y, then A at C, then W at Z, then A at D"?

At a higher level, the thing that's still interesting me is not why the FPS can be relatively low, but why the CPU usage is so high. Before factoring in your recent changes, I can load up Conquests at full res (1920x1200), with a decent number of units and full animation in a mid-game Rise and Fall of the Roman Empire game, and it uses less than a windowed C7 instance with two units visible, and no terrain improvements. I'd rather have 60 FPS and low CPU/battery usage than super-high FPS, but to the extent that the game is CPU-limited in rendering, they should somewhat go together in theory.

tileAt... answering the question of whether it's responsible is exactly the sort of thing I'd like to use a CPU profiler to examine. In theory each invocation should be cheap, but it does call the bounds-checking isTileAt function too, and maybe it's being called enough in a loop that it adds up to something significant. Looking at how my editor does it, it's similar (it also calls it a bunch to figure out which hills graphic to use, for example), although it dispenses with the bounds checking, and instead returns null if it throws an IndexOutOfBoundsException, and the calling code has to check for null which is inconvenient. I'd be surprised if that's a major factor, but if it is I suppose the null checks are worth it. But I think I'm also the one who wrote it so we wouldn't have to put up with those forever in C7. And it looks like we're calling it less than in my editor. C7 caches tile neighbors; my editor re-calculates them using its tileAt method, which should more than cancel out the lack of bounds checks.

Flintlock said:
Something to try in the future would be looping over tiles first, then layers for each tile. I'm hesitant about that, though, since it means objects might not overlap correctly according to their map layer, in other words, it means an object belonging to a lower layer might be drawn overtop of one from a higher layer, if they're on different tiles. Another thing we could do is consolidate some map layers. For example the forest, hills, and marsh layers could be combined into one overlay layer.

I checked how I did it in my editor, and that is (with exceptions, such as always drawing city labels last) how I do it. First I figure out which tiles are visible, then I iterate over those tiles, top-down, and left-to-right within a row, and for each of those visible tiles, I draw each layer. But you aren't wrong that there are edge cases; for example Hills, with their larger-than-average graphics, can overlap a Mine graphic in a mod that puts Mine graphics towards the southeastern edge of a tile. And now that I look more closely, I can see that Forests that overlap to the northeast/northwest also cover the edges of roads going in that direction. So... it might be a fine art to make sure all possible edge cases are covered, and might impose some limitations on over-size graphics. No one has ever reported that difference in my editor, but the expectation level for 100% graphical fidelity is likely not as strong.

Puppeteer · Feb 8, 2022

A thought: Try leaving a blank/simple Godot "game" running for a few hours and see what its CPU use is by comparison.

My Macbook tends to turn the fan on–an unusual and loud jet engine sound–whenever I leave a Godot game running, even things like the prototype maps from a couple of years ago.

It *does* call 1-4 callbacks about 60 times a second while running. (Process, physics process, draw...maybe just 3.)

Quintillus · Feb 8, 2022

Yeah... right now I'm on my laptop, just displaying a 1024x768 terrain-only (no units) map, and Godot uses 3-4% CPU. Which doesn't sound like much, but this is a hex-core CPU with hyperthreading, so that's about 25-33% of a CPU, not really lightweight. I brought my charger, which is a good thing as I'd probably be out of power by now (after about 2 hours) otherwise. Granted it's not the world's largest battery and has 14% wear, but even if I set C7 to run on the iGPU, it's not easy on it. But at least the cooling is robust on this laptop. If the fan's running, it's at low and I can't hear it above the background noise.

It's also using about 30% of my GPU power, a GTX 1050 Mobile, verified to be using the dGPU, not the iGPU, according to Task Manager on Windows 10. This is with vSync enabled on a 60 Hertz monitor, so I don't see why it should be so intensive. Civ3 ran well on my low-end dedicated GPU that was new in 2002, back when I joined CFC, and ran poorly on an upper-midrange GPU from 1998, so using 30% of a low-end GPU from the mid-2010s already seems suspicious. Although this is on the "Rivers" branch I'm working on, without Flintlock's latest performance improvements.

It's perhaps my biggest remaining concern about Godot. I like programming in it, but we shouldn't already be heavier-weight than Civ4, should we? Although I'm also aware that modern game engine != great performance; some of the fan-spinniest titles I have on Steam are indie titles that don't have particularly impressive graphics, and AFAIK aren't pegging a core on AI like Civ3 will do on large maps.

Flintlock · Feb 8, 2022

Did you try running the Mono profiler through Godot? While poking around in the Project Settings I found the Mono -> Profiler -> Enabled option. It seems to work: when I turned it on it made the game run much slower (took 75 seconds to load the map) and it dropped an output.mlpd file in my C7 project folder. Too bad the file is useless to me, since it requires a Mono tool called "mprof-report" to read, but I don't have Mono installed and that tool doesn't appear to be part of the Mono system bundled with Godot.

We definitely want to get a profiler set up if/when we go to do a thorough job of optimizing the game. Another thing we should look at is the compiler settings. Are we losing out on any optimization options when compiling the game through the editor?

Also I'm not sure that caching tile neighbors is faster than regenerating them each time even if tileAt is slow-ish. In general caching is just a tradeoff for memory instead of processing, and memory is relatively slow on modern computers. If those tile neighbor pointers are not in the CPU cache it may easily be faster to regenerate them than read them from main memory.

Quintillus said:
Interesting findings. Looking over it now, and what you've written, I see what you mean that it now sorts the terrain draws to group the same graphics together, which as you say lets it make fewer draw calls? It can essentially say, "draw W at locations X, Y, and Z, and then A at B, C, and D", instead of "Draw W at X, then Draw A at B, then W at Y, then A at C, then W at Z, then A at D"?

Pretty much. More specifically, GPUs are designed to work on batches of data, i.e. arrays of triangles not one at a time. Then they render against some internal setup that was supplied beforehand. The setup includes a set of active shaders and textures, and a lot of knobs and dials for things like MSAA and many more that I don't remember. The important part is that you can draw multiple things at once but only if they use the same set of textures, and Godot will automatically gather DrawTextureRectRegion operations into a single draw call but it can only do that if you submit multiple ones in sequence that all pull from the same texture.
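
As a toy model of that rule, the Java sketch below counts one draw call per run of consecutive draws that share a texture, using the W/A example from the quote. This is an assumption-laden simplification, not Godot's actual batching logic; `Draw`, `countDrawCalls`, and `sortedByTexture` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class DrawBatchingSketch {
    /** One queued sprite draw: which texture it samples and where it goes. */
    static final class Draw {
        final String texture;
        final String location;

        Draw(String texture, String location) {
            this.texture = texture;
            this.location = location;
        }
    }

    /** Assumes consecutive draws with the same texture merge into a single draw call. */
    static int countDrawCalls(List<Draw> draws) {
        int calls = 0;
        String current = null;
        for (Draw draw : draws) {
            if (!draw.texture.equals(current)) {
                calls++;
                current = draw.texture;
            }
        }
        return calls;
    }

    /** Returns a copy ordered by texture; List.sort is stable, so per-texture order is preserved. */
    static List<Draw> sortedByTexture(List<Draw> draws) {
        List<Draw> sorted = new ArrayList<>(draws);
        sorted.sort(Comparator.comparing((Draw d) -> d.texture));
        return sorted;
    }

    /** "Draw W at X, then A at B, then W at Y, then A at C, then W at Z, then A at D". */
    static List<Draw> interleavedExample() {
        List<Draw> draws = new ArrayList<>();
        draws.add(new Draw("W", "X"));
        draws.add(new Draw("A", "B"));
        draws.add(new Draw("W", "Y"));
        draws.add(new Draw("A", "C"));
        draws.add(new Draw("W", "Z"));
        draws.add(new Draw("A", "D"));
        return draws;
    }

    public static void main(String[] args) {
        List<Draw> draws = interleavedExample();
        System.out.println(countDrawCalls(draws));                  // prints 6
        System.out.println(countDrawCalls(sortedByTexture(draws))); // prints 2
    }
}
```

Sorting by texture is only safe within one layer where the sprites do not depend on draw order relative to each other, which is why this is done per layer.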

Quintillus said:
So... it might be a fine art to make sure all possible edge cases are covered, and might impose some limitations on over-size graphics.

The annoying thing about this is that it would be easy to solve if we were writing our own renderer but Godot makes it difficult. If we were on our own, we could use the depth buffer to make sure each sprite is layered correctly on an individual basis regardless of what order they're drawn in. (The gist of how it works is that the graphics hardware is designed for 3D so we could place the sprites along the third axis perpendicular to the screen. Then the hardware tracks the depth of each pixel relative to the screen and ensures that far ones aren't drawn on top of nearer ones.) Godot uses this feature of course but it's awkwardly tied into the Node system. We could make layers into Node2Ds so that they can each be given a depth (called z-index by Godot) but then we couldn't have a simple LooseLayer.drawObject method b/c we're only allowed to draw onto a Node2D inside its own _Draw method.
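
For anyone unfamiliar with the depth-buffer idea, here is a tiny software imitation in Java: a one-dimensional "screen" where each pixel remembers the layer of whatever was last drawn there and rejects anything from a lower layer. It is only a sketch of the concept, not Godot or GPU API code; the class and method names are hypothetical, and treating a larger layer number as nearer to the viewer is a convention chosen for this example.

```java
import java.util.Arrays;

final class DepthBufferSketch {
    private final char[] color; // what ends up on screen, one glyph per pixel
    private final int[] depth;  // layer of whatever currently owns each pixel

    DepthBufferSketch(int width) {
        color = new char[width];
        depth = new int[width];
        Arrays.fill(color, '.');
        Arrays.fill(depth, Integer.MIN_VALUE);
    }

    /** Draws a run of pixels, keeping only those not already covered by a higher layer. */
    void drawSpan(int start, int length, int layer, char glyph) {
        int end = Math.min(color.length, start + length);
        for (int x = Math.max(0, start); x < end; x++) {
            if (layer >= depth[x]) { // the per-pixel depth test
                depth[x] = layer;
                color[x] = glyph;
            }
        }
    }

    String pixels() {
        return new String(color);
    }

    /** Draws terrain (layer 0) and hills (layer 1) in either submission order. */
    static String render(boolean hillsFirst) {
        DepthBufferSketch screen = new DepthBufferSketch(8);
        if (hillsFirst) {
            screen.drawSpan(2, 3, 1, 'H');
            screen.drawSpan(0, 8, 0, 'T');
        } else {
            screen.drawSpan(0, 8, 0, 'T');
            screen.drawSpan(2, 3, 1, 'H');
        }
        return screen.pixels();
    }

    public static void main(String[] args) {
        System.out.println(render(false)); // prints TTHHHTTT
        System.out.println(render(true));  // prints TTHHHTTT
    }
}
```

Both submission orders give the same picture, which is the property that would let tiles be drawn in any convenient order.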
