DLL development discussions

@Koshling, I had a headache, so I was a little indirect. Those landmarks are not as static as I thought. They can be removed -- for example when a forest is removed -- and I would say they should theoretically be addable in the future too -- for example on forest planting, nuking, terraforming. So I think an ordinary array is too weak for this. I quite dislike FFreeListTrashArray too, so I've been thinking about an alternative, and I think I may have one that could be slightly better. Please tell me what you think.

* it would not need a heap allocation for each new element (currently each element is allocated separately)
* growing would take constant time
* there would be some memory lost to buffering. It would reach at most the size of one block, where we can choose that size (15MB?), and be linearly smaller for less-filled arrays
* adding a new element would take constant time. It would involve copying one additional element -- yes, the whole element, not just a wrapper, but I believe memcpy can do that quite fast
* removing would take constant time. It would involve copying 1-2 additional elements
* it would be able to shrink, and that would also magically happen in constant time
* it would provide a class for references to elements (an encapsulated pointer) that can be dereferenced without needing a pointer to the storage structure
* it would ensure an overwritten element is never read as if it were the old one. Currently there is a small chance of this
* it could possibly be slightly bigger than FFreeListTrashArray, maybe 4-8 bytes per element. But the references would be smaller than IDInfos, and the heap allocation mechanism would not eat our memory

Also, when system memory is nearly full, heap allocation may be space-inefficient due to fragmentation, or time-inefficient if it does some on-the-fly defragmentation. This could be a win. However, I don't have enough experience to be sure.
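
To make it a bit more concrete, here is a rough sketch of the storage part of this idea. All the names are made up (it is not DLL code), and the reference/indirection class that would keep element references stable across those copies is left out:
Code:
#include <cstddef>
#include <vector>

// Hypothetical sketch, not DLL code: elements live in fixed-size blocks, so
// there is no per-element heap allocation and growth never copies the whole
// array.  removeAt() copies the last element into the hole, which is why
// add/remove are constant time but move whole elements around.
template <typename T>
class BlockPool
{
	enum { BLOCK_SIZE = 4096 };	// elements per block - tune for the 15MB figure above

public:
	BlockPool() : m_count(0) {}

	~BlockPool()
	{
		for (std::size_t i = 0; i < m_blocks.size(); ++i)
			delete [] m_blocks[i];
	}

	int add(const T& value)
	{
		if (m_count == (int)m_blocks.size() * BLOCK_SIZE)
			m_blocks.push_back(new T[BLOCK_SIZE]);	// grow by one block, O(1)
		at(m_count) = value;
		return m_count++;
	}

	void removeAt(int index)
	{
		at(index) = at(m_count - 1);	// copy the last element into the hole
		--m_count;
		if (!m_blocks.empty() && m_count <= (int)(m_blocks.size() - 1) * BLOCK_SIZE)
		{
			delete [] m_blocks.back();	// shrink by whole blocks, also O(1)
			m_blocks.pop_back();
		}
	}

	T& at(int index) { return m_blocks[index / BLOCK_SIZE][index % BLOCK_SIZE]; }

private:
	std::vector<T*> m_blocks;
	int m_count;
};
The missing piece is the indirection table that lets outside references survive those element moves -- that is where the extra 4-8 bytes per element would go.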

Oh my. What a technical talk. I think I will need to cook something now to bring some balance to my brain. :crazyeye:
 
For landmarks specifically, or to replace all uses of FFreeListTrashArray? Either way, what you say sounds interesting, so all I can advise is try it and see if the net effect is positive.
 
Yup, it was too beautiful to be true. I forgot that I can't move the index structures which I hold inside the structure, so the array of indexes would not shrink. Those indexes are something like FFreeListTrashArrayNode, so in this respect we get a structure as stupid as it was. :( Still, it will not double its size on growth and will not need to copy all elements at once. But I am disappointed; it's so inefficient to have an array as holey as a cheese.
 
In practice it tends to be over half full once the game gets beyond a certain stage, so it's not as sparse as it might appear to be. Obviously the worst case is much sparser than that, but I mean with typical usage patterns during a game.
 
I've been thinking about this as a nice general-use structure, not just something dedicated to C2C. :(
 
Ok, I have an idea. But it would involve 3, or in the pessimistic case 4, memory load instructions per object access. Originally it would be 2. I guess I will first implement it the stupid -- I mean original -- way, then extend it and compare the results. :rolleyes:
 
Oh how well I remember this sort of discussion from my code maintenance days, many many years ago. :D BTW the first bug I caught was a 1900s bug in COBOL, when they converted a manual system to computer and forgot that some people still alive had been born in the 1800s. ;)
 
I've heard about those bugs many times. They were like stories about princesses and dragons at my university. :D The best one was when someone built a system for an exchange and forgot that floating-point numbers round. They twigged when they couldn't find a few million dollars. :D It is such a shame they didn't blow up all the exchanges in the world after that. This planet would be a better place.
 
@Koshling, I am wondering whether it is right that we are trying to dispose of objects which still have live references. When I think about it, it looks like a design issue to me. In languages with garbage collection there isn't even such an option. Maybe we shouldn't be concerned with how to provide references that stay valid after an object's deletion, but with a cleaning mechanism that removes references to dead objects? I think that is the general problem with all those data structures.

If we don't want more memory accesses per get/dereference than there are now, I would need to make the indexes almost as stupid as the management of FFreeListTrashArrayNode is now. And if we could guarantee that references to an object are removed after the object is killed -- not necessarily instantly, but within some time -- the data structure would be quite a bit more effective.

Not to mention that we would then be able to store objects even without any special structure. Double pointers plus some array list -- just to be able to enumerate all objects -- would do. But still, if memory were nearly full, I think the heap would do really poorly. Storing millions (?) of objects, each in a separate allocation unit, sounds like a horror Tetris session for the allocator.
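
What I mean by double pointers is roughly this (a made-up sketch, not real DLL code): each object owns a little slot shared with all its references; killing the object nulls the slot, so stale references resolve to NULL and can be cleaned up whenever it is convenient:
Code:
#include <cstddef>

// Hypothetical double-pointer reference: the extra indirection is what lets
// references to a dead object see NULL instead of a dangling pointer.
template <typename T>
class Ref
{
public:
	explicit Ref(T** ppSlot) : m_ppSlot(ppSlot) {}

	T* get() const { return *m_ppSlot; }	// NULL once the object has been killed

private:
	T** m_ppSlot;
};

class Thing
{
public:
	Thing() : m_ppSelf(new Thing*(this)) {}

	Ref<Thing> ref() { return Ref<Thing>(m_ppSelf); }

	void kill()
	{
		*m_ppSelf = NULL;	// every outstanding Ref now resolves to NULL
		// in a real version the slot would be reference counted and freed
		// once the last Ref notices the object is dead (the lazy clean-up)
		delete this;
	}

private:
	Thing** m_ppSelf;
};

// Usage: Ref<Thing> r = pThing->ref(); ... later ... if (r.get() == NULL) drop it.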
 
It's an original BTS design decision (in fact base Civ 4), so fairly baked in. It's safe because the references are through ids, which are never re-used, and the dereferences are checked for resolving to NULL before use in circumstances where the underlying object may have been destroyed. That check-on-deref is crucial, which is why the id cannot simply be replaced by a pointer. Millions of objects of the same size is not really a problem for a heap allocator. Millions of objects of slightly different sizes is, but if there are a few highly recurrent sizes (e.g. sizeof(CvUnit)) it won't fragment badly.
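
The pattern, reduced to a toy example (this is just an illustration of the principle, not the DLL code -- FFreeListTrashArray does the same thing with an array and an id that encodes a slot index):
Code:
#include <cstddef>
#include <map>

// Toy illustration of id-based references with check-on-deref: callers store
// an id, never a raw pointer, and re-resolve it every time they need the object.
struct Unit { /* ... */ };

class UnitStore
{
public:
	UnitStore() : m_iNextID(0) {}

	int add() { m_units[m_iNextID] = Unit(); return m_iNextID++; }	// ids never handed out twice
	void kill(int iID) { m_units.erase(iID); }

	Unit* get(int iID)
	{
		std::map<int, Unit>::iterator it = m_units.find(iID);
		return (it == m_units.end()) ? NULL : &it->second;	// NULL if already destroyed
	}

private:
	std::map<int, Unit> m_units;
	int m_iNextID;
};

// Usage: a stale id resolves to NULL instead of a dangling pointer.
//	int iID = store.add();
//	store.kill(iID);
//	if (store.get(iID) == NULL) { /* object is gone - skip it */ }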
 
fairly baked in
What do you mean? -- I'm not sure what the phrase means.

It's safe because the references are through ids, which are never re-used
I may be wrong, but I think there is a very, very ... very small chance that an id will be reused. There is only one byte responsible for the id space, so once it hits 255, ids will start to be reused. For example, if the array were full and we performed 257 remove-add operations, an id would get reused. The probability of that single situation is damn small, but if people play the game millions of times, I think there is a real chance of it. -- I quite dislike such an attitude. Maybe it is not a big deal, as you can always reload a save, but if there is a way to do something efficiently and one hundred percent surely, why not do it like that?

Millions of objects of the same size is not really a problem for a heap allocator.
Maybe you are right. But if so, the allocation algorithm needs to take some time. Allocation units, from what I know, are stored as 2^n blocks. So if we want to allocate 129 bytes, we will get 256. If there is some defragmentation involved, it would need to check for some existing block to expand. If there is none, and we don't want to lose space, 64 bytes will probably be cut from what's left and added to the free blocks. But there are still 63 bytes left. So if it proceeds, it again needs to cut, and again... And if some neighboring block is disposed of, those chunks need to be aggregated.

I remember there were quite a lot of professionals who advised avoiding the heap as much as possible in applications which use memory heavily. I can't say much from my own experience, but looking at the above, it sounds reasonable.

Oh, and there is still the memory consumed by the heap itself. If there are many objects, and probably chunks of memory left after removals, wouldn't this weigh a little?
 
What do you mean? -- I'm not sure what the phrase means.
It means 'assumed in many places', so hard to change without impacting lots of potential areas. What I meant in this case was that the designers of Civ 4 assumed that access would always be via id, and therefore didn't worry about objects being destroyed while still referenced (as long as it was by id). Consequently it happens a lot.
It's safe because the references are through ids, which are never re-used
I may be wrong, but I think there is a very, very ... very small chance that an id will be reused. There is only one byte responsible for the id space, so once it hits 255, ids will start to be reused. For example, if the array were full and we performed 257 remove-add operations, an id would get reused. The probability of that single situation is damn small, but if people play the game millions of times, I think there is a real chance of it. -- I quite dislike such an attitude. Maybe it is not a big deal, as you can always reload a save, but if there is a way to do something efficiently and one hundred percent surely, why not do it like that?
No - they are guaranteed to be unique for any given FFreeListTrashArray in any game, for the lifetime of that game. That is why it's not just the index into the array that makes up the returned id, but rather that index in the bottom 12 bits (FLTA_INDEX_MASK) plus a re-use counter (the 'major id space' referred to in the comment in this code inside FFreeListTrashArray<T>::add()):
Code:
		//	Try to choose a free node that is not from the current major id space so as
		//	to minimize the rate we move through the global id space
		while( iIndex != FFreeList::INVALID_INDEX && m_pArray[iIndex].iLastUsed == m_iCurrentID )
		{
			iLast = iIndex;
			iIndex = m_pArray[iIndex].iNextFreeIndex;
		}
Basically the code in add() ensures that:
1) The bits comprising the major id space (set from m_iCurrentID at the time of the add()) form a number that is monotonically increasing over time
2) On reuse of a slot that has previously been used, the number in the major-id-space bits is incremented.
Together these conditions ensure no id re-use. Strictly speaking it is possible for the entire id space to wrap, because the line:
Code:
		m_iCurrentID += FLTA_MAX_BUCKETS;
isn't error checked for an overflow. In practice we run out of unoccupied array slots and our ability to grow the (limited to a 12-bit index) array far sooner than we run through the global id space, so this case doesn't actually occur (it should probably be checked really, but currently it isn't).
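Put another way, the returned id is roughly 'slot index in the low bits, major id in the high bits', and a lookup only succeeds while the slot still carries the major id the reference was minted in. Very loosely (this is a paraphrase from memory, not the actual class, and the exact bit width doesn't matter for the idea):
Code:
// Loose paraphrase of the id scheme - constants follow FFreeListTrashArray in
// spirit only; the shift value here is from memory.
const int FLTA_ID_SHIFT    = 13;
const int FLTA_MAX_BUCKETS = (1 << FLTA_ID_SHIFT);
const int FLTA_INDEX_MASK  = (FLTA_MAX_BUCKETS - 1);

struct Node { int iLastUsed; /* ...the element itself... */ };

int makeID(int iIndex, int iCurrentID)
{
	return iCurrentID | iIndex;	// major id in the high bits, slot index in the low bits
}

bool isStillValid(const Node* pArray, int iID)
{
	int iIndex = (iID & FLTA_INDEX_MASK);
	// stale if the slot has since been reused in a newer major id space
	return pArray[iIndex].iLastUsed == (iID & ~FLTA_INDEX_MASK);
}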
Maybe you are right. But if so, the allocation algorithm needs to take some time. Allocation units, from what I know, are stored as 2^n blocks. So if we want to allocate 129 bytes, we will get 256. If there is some defragmentation involved, it would need to check for some existing block to expand. If there is none, and we don't want to lose space, 64 bytes will probably be cut from what's left and added to the free blocks. But there are still 63 bytes left. So if it proceeds, it again needs to cut, and again... And if some neighboring block is disposed of, those chunks need to be aggregated.

I remember there were quite a lot of professionals who advised avoiding the heap as much as possible in applications which use memory heavily. I can't say much from my own experience, but looking at the above, it sounds reasonable.

Oh, and there is still the memory consumed by the heap itself. If there are many objects, and probably chunks of memory left after removals, wouldn't this weigh a little?
Nope - the standard Windows runtime heap allocator allocates N + <header overhead> rounded up to the nearest 4 bytes, I believe. I think the header overhead is 12 bytes. Anyway, the point is not that a freed block can later be exactly occupied by MULTIPLE smaller blocks (which a 2^N or Fibonacci buddy scheme would allow for), but that another allocation of EXACTLY the same size will fit a previously freed 'hole' without fragmentation. In typical usage most of the allocations are sizeof(<some class>), so a small set of sizes is used repeatedly and this condition is met. The standard heap allocator also reduces fragmentation through a few other tricks, such as internally maintaining multiple separate heaps for objects of different size ranges. For most new/delete type allocation patterns it works well with low fragmentation (and O(log N) worst case allocation times, with MOST cases being O(1)). If you severely abuse it with lots of semi-random length, non-short-lived allocations (through malloc of variable length structures, typically) it can get pretty badly fragmented and perform badly, but Civ doesn't exhibit those allocation patterns.
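
You can see the 'same size refills the hole' behaviour with a trivial test like the one below -- address reuse isn't guaranteed by anything, it's just what the runtime heap typically does for recurring sizes:
Code:
#include <cstdio>
#include <cstdlib>

// Not guaranteed behaviour, just what the allocator usually does: freeing a
// block and then requesting the same size tends to hand back the same hole.
int main()
{
	void* pFirst = std::malloc(128);
	std::printf("first  128-byte block at %p\n", pFirst);
	std::free(pFirst);

	void* pSecond = std::malloc(128);	// same size as the freed block
	std::printf("second 128-byte block at %p\n", pSecond);	// usually the same address

	void* pOther = std::malloc(200);	// a different size will typically land elsewhere
	std::printf("200-byte block at        %p\n", pOther);

	std::free(pSecond);
	std::free(pOther);
	return 0;
}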
 
Thanks. :D It's a little hard to believe that with such an approach it can find a block of the proper size in O(1). But if there are not too many block sizes, it can in fact be quite fast. It could also be quite a poor solution, though, if you allocate many arrays of different sizes.
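
If the common case really is O(1), I guess it is because of something like per-size free lists: a freed block of size N goes on a list for N, and the next request for exactly N pops it straight off. Something like this sketch (just the idea, certainly not how the Windows heap is actually written):
Code:
#include <cstddef>
#include <cstdlib>
#include <map>
#include <vector>

// Sketch of per-size free lists.  A real allocator indexes a fixed table of
// size classes instead of a map, which is what makes the common path O(1).
class SizeBinAllocator
{
public:
	void* allocate(std::size_t n)
	{
		std::vector<void*>& bin = m_bins[n];
		if (!bin.empty())
		{
			void* p = bin.back();	// O(1) reuse of a hole of exactly this size
			bin.pop_back();
			return p;
		}
		return std::malloc(n);		// otherwise carve fresh memory
	}

	void deallocate(void* p, std::size_t n)
	{
		m_bins[n].push_back(p);		// keep the hole for the next same-size request
	}

private:
	std::map<std::size_t, std::vector<void*> > m_bins;	// one free list per size
};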

That is why it's not just the index into the array that makes up the returned id, but rather that index in the bottom 12 bits
Oh sweet lord. :rolleyes: Why did I think it was 24? I thought there were only 8 bits for the (major) id space. But why do you say ids are guaranteed to be unique for any given FFreeListTrashArray in any game, for the lifetime of that game, and then show they are not?
Strictly speaking it is possible for the entire id space to wrap, because the line:
m_iCurrentID += FLTA_MAX_BUCKETS;
isn't error checked for an overflow.
Guaranteed is 100%, not 99.99999999999999999999999999999999999999999999999999999999999999999999999999999999999999.

Take a situation like this:
- |m_pArray| == 1000 elements and is full
- m_iFreeListHead == -2 (empty)
- m_iCurrentID == 3<<13
- m_pArray[999].iLastUsed == 3<<13

- call removeAt(999)
- m_iFreeListHead == 999
- call add()
- m_iCurrentID == 4<<13

- call removeAt(999)
- m_iFreeListHead == 999
- call add()
- m_iCurrentID == 5<<13


- call removeAt(999)
- m_iFreeListHead == 999
- call add()
- m_iCurrentID == 6<<13

repeat 1048572 times

- call removeAt(999)
- m_iFreeListHead == 999
- call add()
- m_iCurrentID == 3<<13

An id gets reused. :) (Or did I miss something again? :D)
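
If you want to see it without the remove/add boilerplate, the counter alone can be simulated (unsigned here only to keep the overflow well defined; the real field is a plain int, and the exact iteration count depends on what FLTA_ID_SHIFT really is):
Code:
#include <cstdio>

// Counts how many add() calls it takes for the major id counter to come back
// to its starting value - i.e. for an old id to become reusable.
int main()
{
	const int FLTA_ID_SHIFT = 13;	// placeholder - use whatever the real define is
	const unsigned int FLTA_MAX_BUCKETS = (1u << FLTA_ID_SHIFT);
	const unsigned int iStart = 3u << FLTA_ID_SHIFT;
	unsigned int iCurrentID = iStart;
	unsigned long long iAdds = 0;

	do
	{
		iCurrentID += FLTA_MAX_BUCKETS;	// what add() does when it opens a new major id space
		++iAdds;
	} while (iCurrentID != iStart);

	std::printf("major id space wrapped after %llu remove/add cycles\n", iAdds);
	return 0;
}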
 
Yes, but in practice it never happens, because the allocation patterns in C2C are such that the probability is vanishingly small that you'd get through the major id space before encountering a situation where the array is full.

My use of the word 'guaranteed' was a little loose ;). In one game in a few million this error could occur, with the result that a reference (almost certainly to a unit, since they are the only ones that really get allocated on this scale) could point to an incorrect unit. The overflow really should be trapped, but even if it were, the chances of that trap ever going off for you would be remote.
 
I guess it is much smaller than 1 in a few million. Still, it is there. If you gave this to someone building software for a nuclear power plant and said the ids are guaranteed to never collide :rolleyes: I think I would rip your head off.

For half a year I took a project management course at the best technical university in my country. At least two guys in my group were already great programmers. And what? They prepared a project plan under which it was quite probable that at least one person/family would die. It wouldn't have been such a big deal if, when this was pointed out, they had said it was just a toy project and they didn't take it seriously. But they said: it is quite a good result, there could have been many more. In fact, I was quite sure they didn't even care about the consequences of their work. -- No, programmers are totally irresponsible.

A small chance is still a chance and should always be considered.

Someone could read what you have written and use the structure for something else, where the above pattern would be frequent. No, I deeply believe that what you have done is bad practice.
 
I have done some tests. And you were both right and wrong about the heap. :) There is on average 10.5B lost per allocated block. And in two test cases there were no great losses after memory shuffling - 6MB for 987MB allocated and 66MB for 690MB. But in the third case it was 102MB for 442MB!

I did the tests according to what you said. I drew 10 random numbers and have been allocating and removing blocks of sizes equal to them. -- 20,000,000 operations - mostly allocations. -- So it was not the case of many blocks of different lengths.
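
The loop itself is roughly this shape (simplified; the size range and the allocate/free mix here are placeholders, and the actual measurement of committed heap memory is done outside of it):
Code:
#include <cstdlib>
#include <ctime>
#include <vector>

// Simplified shape of the test: 10 random block sizes, millions of operations
// that are mostly allocations, then compare the process' heap usage with the
// sum of the block sizes still held (measurement not shown here).
int main()
{
	std::srand((unsigned)std::time(0));

	std::size_t aSizes[10];
	for (int i = 0; i < 10; ++i)
		aSizes[i] = 16 + std::rand() % 112;	// placeholder size range

	std::vector<void*> apBlocks;
	for (long iOp = 0; iOp < 20000000L; ++iOp)
	{
		if (!apBlocks.empty() && std::rand() % 3 == 0)	// free roughly a third of the time
		{
			std::size_t iVictim = std::rand() % apBlocks.size();	// crude, but fine for a sketch
			std::free(apBlocks[iVictim]);
			apBlocks[iVictim] = apBlocks.back();
			apBlocks.pop_back();
		}
		else
		{
			apBlocks.push_back(std::malloc(aSizes[std::rand() % 10]));
		}
	}
	// ...snapshot heap usage here and compare against what is still allocated...

	for (std::size_t i = 0; i < apBlocks.size(); ++i)
		std::free(apBlocks[i]);
	return 0;
}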

I'll toy with the tests a little bit more. But if the last case repeats, I guess it will mean the heap is not as beautiful as the salesman was saying. ;) -- Or that I have missed something again. :crazyeye:

Btw, I just have a feeling that O(1) for most operations is far too little work to also optimize memory usage well.
 
If someone lifts code from a game, where failure is an annoyance, and tries to reuse it in a safety-critical domain, they deserve to be thrown in jail and have the key thrown away.

A good-enough approach to software in a game context is simply a way of regulating effort. In other domains it might be totally inappropriate, but here I do not believe that is the case. That is why the cost of developing software that controls nuclear reactors is orders of magnitude higher than the cost of developing something of similar size (in functional terms) in an entertainment domain.
 
Yup. Totally agree. :D
But I'm actually quite sensitive about words like "guaranteed", "sure", "100%". People who use them incorrectly in small cases tend to use them incorrectly in big cases too. For the rest I do not have the strength to write today...

Back to the heap. -- The big loss of space shows up roughly once per 7 seeds. For 5 of them there is more or less 60MB lost, and for 1 of them 5-10MB. I also checked with 2,000,000 operations. The losses were smaller, but the proportion of allocated to lost MB looked the same. In bad cases there was even over 200MB lost per 400-500MB allocated.

For me it doesn't look good. But maybe I'm tired.
 