Disassembly of loop section that hangs the game for quite a while, maybe even forever.

Skybuck

Civilization 1 version: 474.05 Dos-Box-X version: 0.83.22

Civilization 1 enters this very time costly loop.

It keeps running/jumping from LOOP SECTION 1 to LOOP SECTION 2 and then from LOOP SECTION 2 to LOOP SECTION 1 and it keeps
doing this over and over and over again.

The registers seem to flip flop between certain values. Perhaps the code is trying to search through something. I suspect it might
be the AI that is trying to find some kind of path. As a young kid back in 199x I also witnessed this time-costly loop; it could
maybe take 14 hours before it would exit on an 80486, or it might even never exit, not sure.

Thanks to modern day DosBox-X software it is now possible to easily debug it.

There is an SR command, "set register", that might be used to set/overwrite one of the registers so that maybe Civilization 1 can break
out of this loop more easily.

I have copy & pasted both loop sections below, basically the debugger will run through it from the top instruction to the bottom instruction
and then jump to the other loop section, back and forth.

This code is currently running on a 2.6 GHz Toshiba laptop, but the loop still runs for a very long time.

This assembly code was produced/acquired from DOSBox-X by Skybuck Flying on 22 April 2022 !

Enjoy, and if you spot any point where you think a register can be overwritten for safe exit, let us all know ! =D


// LOOP SECTION 1:
1E1E:000016B9 8B4608 mov ax,[bp+08] ss:[F006]=0057
1E1E:000016BC 3946E0 cmp [bp-20],ax ss:[EFDE]=0075
1E1E:000016BF 7503 jne 000016C4 ($+3) (down)
1E1E:000016C1 E98E00 jmp 00001752 ($+8e) (down)
1E1E:000016C4 B80006 mov ax,0600
1E1E:000016C7 F76E06 imul word [bp+06] ss:[F004]=0004
1E1E:000016CA 8BF0 mov si,ax
1E1E:000016CC B80C00 mov ax,000C
1E1E:000016CF F76EE0 imul word [bp-20] ss:[EFDE]=0075
1E1E:000016D2 03F0 add si,ax
1E1E:000016D4 F684D48108 test byte [si-7E2C],08 ds:[FFFF99D4]=C48
1E1E:000016D9 751D jne 000016F8 ($+1d) (down)
1E1E:000016DB B022 mov al,22
1E1E:000016DD F6ACD781 imul byte [si-7E29] ds:[FFFF9F53]=7500
1E1E:000016E1 8BF8 mov di,ax
1E1E:000016E3 B80100 mov ax,0001
1E1E:000016E6 8A8D4811 mov cl,[di+1148] ds:[12BE]=0001
1E1E:000016EA D3E0 shl ax,cl
1E1E:000016EC 0946E2 or [bp-1E],ax ss:[EFE0]=0007
1E1E:000016EF 83BD381100 cmp word [di+1138],0000 ds:[12AE]=0000
1E1E:000016F4 753C jne 00001732 ($+3c) (down)
1E1E:000016F6 EB37 jmp short 0000172F ($+37) (down)


// LOOP SECTION 2:
1E1E:0000172F FF46AC inc word [bp-54] ss:[EFAA]=29A5
1E1E:00001732 B80C00 mov ax,000C
1E1E:00001735 F76EE0 imul word [bp-20] ss:[EFDE]=0075
1E1E:00001738 8BD8 mov bx,ax
1E1E:0000173A B80006 mov ax,0600
1E1E:0000173D F76E06 imul word [bp+06] ss:[F004]=0004
1E1E:00001740 8BF0 mov si,ax
1E1E:00001742 8A80DE81 mov al,[bx+si-7E22] ds:[FFFF9F8A]=B802
1E1E:00001746 98 cbw
1E1E:00001747 8946E0 mov [bp-20],ax ss:[EFDE]=0075
1E1E:0000174A 3DFFFF cmp ax,FFFF
1E1E:0000174D 7403 je 00001752 ($+3) (no jmp)
1E1E:0000174F E967FF jmp 000016B9 ($-99) (up)


// REGISTER OUTPUT FOR LOOP SECTION 1, WITH THE INSTRUCTION POINTER AT THE BEGINNING OF IT:
----Register Overview-----------------------------------------------------------
EAX=00000004 ESI=00001800 DS=3324 ES=625D FS=0000 GS=0000 SS=3324 Real
EBX=000004D4 EDI=00000000 CS=1E1E EIP=000016B9 C1 Z0 S0 O0 A1 P1 D0 I1 T0
ECX=00000000 EBP=0000EFFE NOPG IOPL3 CPL0
EDX=00000000 ESP=0000EF9C 3082409161
ST0=00000.00 ST1=00000.00 ST2=00000.00 ST3=00000.00
ST4=00000.00 ST5=00000.00 ST6=00000.00 ST7=00000.00

// REGISTER OUTPUT FOR LOOP SECTION 1, WITH THE INSTRUCTION POINTER AT THE END OF IT:
----Register Overview-----------------------------------------------------------
EAX=00000002 ESI=00001830 DS=3324 ES=625D FS=0000 GS=0000 SS=3324 Real
EBX=000004D4 EDI=00000176 CS=1E1E EIP=000016F6 C0 Z1 S0 O0 A0 P1 D0 I1 T0
ECX=00000001 EBP=0000EFFE NOPG IOPL3 CPL0
EDX=00000000 ESP=0000EF9C 3082409181
ST0=00000.00 ST1=00000.00 ST2=00000.00 ST3=00000.00
ST4=00000.00 ST5=00000.00 ST6=00000.00 ST7=00000.00


// REGISTER OUTPUT FOR LOOP SECTION 2, WITH THE INSTRUCTION POINTER AT THE BEGINNING OF IT:
----Register Overview-----------------------------------------------------------
EAX=00000002 ESI=00001830 DS=3324 ES=625D FS=0000 GS=0000 SS=3324 Real
EBX=000004D4 EDI=00000176 CS=1E1E EIP=0000172F C0 Z1 S0 O0 A0 P1 D0 I1 T0
ECX=00000001 EBP=0000EFFE NOPG IOPL3 CPL0
EDX=00000000 ESP=0000EF9C 3082409182
ST0=00000.00 ST1=00000.00 ST2=00000.00 ST3=00000.00
ST4=00000.00 ST5=00000.00 ST6=00000.00 ST7=00000.00

// REGISTER OUTPUT FOR LOOP SECTION 2, WITH THE INSTRUCTION POINTER AT THE END OF IT:
----Register Overview-----------------------------------------------------------
EAX=00000075 ESI=00001800 DS=3324 ES=625D FS=0000 GS=0000 SS=3324 Real
EBX=00000030 EDI=00000176 CS=1E1E EIP=0000174F C1 Z0 S0 O0 A1 P0 D0 I1 T0
ECX=00000001 EBP=0000EFFE NOPG IOPL3 CPL0
EDX=00000000 ESP=0000EF9C 3082409194
ST0=00000.00 ST1=00000.00 ST2=00000.00 ST3=00000.00
ST4=00000.00 ST5=00000.00 ST6=00000.00 ST7=00000.00


YOU CAN SEE MY GAME AND MY ATTEMPT TO FIX IT ON THIS YOUTUBE STREAM =D:

 
I think all I need to do is maybe set AX to FFFF when the debugger reaches this line of code:

1E1E:0000174A 3DFFFF cmp ax,FFFF

I am gonna give it a go, fingers crossed ! ;) =D

F10 can be used to step over an instruction.
F11 can be used to step into an instruction, which may be useful for calls, but there are none in these sections so it will behave like F10.

Pausing the debugger is best done via the file menu. I tried alt-pause but that did not seem to work, maybe it needs a special function key on this laptop, but that is a bit risky to do.
 
I tried SR ax, FFFF

DEBUG: Set Register failure.

Maybe it needs a special symbol in front of FFFF ? maybe $FFFF ? or 0xFFFF ? Hmmm..
 
OK THE CORRECT COMMAND IS:

SR ax FFFF

without a comma, yeah.

DEBUG: Set Register success.
 
LOL IT SOLVED IT FOR ONE MOVEMENT OF A SAIL, IT THEN ATTACKED MY TRANSPORT AND THEN IT ENTERED BACK INTO SOME LOOP AGAIN, FUNNY STUFF ! =D
OH horsehockey I SEE LOTS OF NOPS.... I THINK THIS MIGHT HAVE CAUSED IT TO RUN OUTSIDE OF ITS INSTRUCTIONS. THIS IS PROBABLY NOT YET THE CORRECT SOLUTION LOL... I WAS AFRAID OF THIS, I SHOULD MAYBE HAVE MADE A SAVE STATE TO TRY AGAIN... OH WELL, NEXT TIME ! ;)

IF I COULD MANIPULATE THE INSTRUCTION POINTER MAYBE I CAN GET IT BACK INTO ITS INSTRUCTIONS... AX IS STILL FFFF LOL.
 
I tried SR EIP 000016B9 but that does not seem correct.

Maybe I also need to set Code Segment to the front of it... yeah that might be it:

CS to 1E1E ? hmmm interesting.

No, setting CS doesn't seem to work. How can I get the program back to executing at:

1E1E:000016B9

Assuming the code is still there... hmmmm... gonna try some other segments I guess... or I study the registers which I acquired at that particular moment ! wow good thing I did that ! ;)

Gonna set em all up yeah ! ;)
 
COOL SETTING THESE REGISTERS GOT THE GAME BACK UNDER CONTROL AND RUNNING, BUT STILL A HANG LOOP, BUT I CAN SEE THE ANIMATIONS RUN HEHE:

SR EAX 00000004
SR ESI 00001800
SR DS 3324
SR ES 625D
SR FS 0000
SR GS 0000
SR SS 3324

SR EBX 000004D4
SR EDI 00000000
SR CS 1E1E
SR EIP 000016B9
SR ECX 00000000
SR EBP 0000EFFE
SR EDX 00000000
SR ESP 0000EF9C

DON'T TRY TO RUN THEM ALL AT ONCE FROM THE COMMAND LINE THAT DON'T WORK.

THERE IS A SECOND WAY TO ENTER IT VIA A GUI, SO I DID IT THAT WAY, BECAUSE TRYING TO ENTER IT ALL AT ONCE THROUGH A COPY & PASTE MIGHT SCREW UP THE COMMAND LINE, NOT SURE, MAYBE IT JUST TAKES A WHILE TO DO A BACKSPACE.
 
MY NEW STRATEGY IS TO DIRECTLY MANIPULATE THE INSTRUCTION POINTER AND TO TRY THE DIFFERENT JUMP POINTS:

1E1E:000016BF 7503 jne 000016C4 ($+3) (down)
1E1E:000016C1 E98E00 jmp 00001752 ($+8e) (down)
1E1E:000016D9 751D jne 000016F8 ($+1d) (down)
1E1E:000016F4 753C jne 00001732 ($+3c) (down)
1E1E:000016F6 EB37 jmp short 0000172F ($+37) (down)
1E1E:0000174D 7403 je 00001752 ($+3) (no jmp)
1E1E:0000174F E967FF jmp 000016B9 ($-99) (up)

So basically it's one of these:

SR EIP 000016C4
SR EIP 00001752
SR EIP 000016F8
SR EIP 00001732
SR EIP 0000172F
SR EIP 00001752
SR EIP 000016B9

My best guess would be one of them that does not jump back into these loop sections lol.
 
ANALYZING THEM:

NO = JUMPS BACK INTO LOOP SECTIONS
YES = SEEMS TO JUMP OUT

SR EIP 000016C4 (NO)
SR EIP 00001752 (YES)
SR EIP 000016F8 (YES)
SR EIP 00001732 (NO)
SR EIP 0000172F (NO)
SR EIP 00001752 (YES)
SR EIP 000016B9 (NO)

This one below seems to jump in between, maybe safest, so I am going to give that one a try when the execution is over/near it.
SR EIP 000016F8 (YES)
 
UNFORTUNATELY SETTING THE INSTRUCTION POINTER DOES NOT IMMEDIATELY MOVE THERE... HMMM... SO SO FAR THIS STRATEGY DON'T WORK... I ALSO TRIED IT WHEN IT WAS ON TOP OF A JMP INSTRUCTION, BUT ALAS... HMMM...
 
NEW STRATEGY IDEA: TRY AND SET THE REGISTER FLAGS, LIKE C0 AND Z0 AND SUCH... MAYBE THERE IS A FLAG FOR JNE AND SUCH THEN ONLY THESE FLAGS NEED TO BE SET WITH MAYBE

SR Z0 0

DONT WORK

BUT

SR FLAGS 0

DID WORK... WOOPS... HEHEHE... MUST COMBINE ALL 8 OF THEM I GUESS... I GO CONSULT INTEL INSTRUCTIONS TO SEE WHICH FLAG MUST BE SET TO TRIGGER THE JNE JUMPS AND SUCH GOOD TIMES.
 
TO SET THE FLAGS, THE FLAGS MUST BE COMBINED LIKE, 128 + 64 + 32 + 16 + 8 + 4 + 2 + 1

EACH POSITION REPRESENTS ITS CORRESPONDING VALUE, SO SUM THE ONES THAT YOU WANT TO SET TO 1 TOGETHER. NOT SURE IF IT'S FROM LEFT TO RIGHT OR RIGHT TO LEFT, EXPERIMENT ! ;)
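For reference, here is a small sketch of how the individual flags combine into one FLAGS value. The bit positions are the standard x86 FLAGS layout (the flags are not eight consecutive bits, there are gaps). The helper names flags_value and jne_taken are made up for illustration, and whether SR FLAGS expects the value as plain hex digits is an assumption based on SR ax FFFF working.

```python
# Standard x86 FLAGS bit positions, using the one-letter names
# from the DOSBox-X register overview (C1 Z0 S0 O0 A1 P1 D0 I1 T0).
FLAG_BITS = {
    "C": 0,   # carry
    "P": 2,   # parity
    "A": 4,   # auxiliary carry
    "Z": 6,   # zero
    "S": 7,   # sign
    "T": 8,   # trap (single-step)
    "I": 9,   # interrupt enable
    "D": 10,  # direction
    "O": 11,  # overflow
}
RESERVED_ONE = 1 << 1  # bit 1 of FLAGS always reads as 1


def flags_value(**flags):
    """Combine named flags (e.g. C=1, Z=0) into a 16-bit FLAGS value."""
    value = RESERVED_ONE
    for name, on in flags.items():
        if on:
            value |= 1 << FLAG_BITS[name]
    return value


def jne_taken(value):
    """JNE jumps when the zero flag is clear; JE jumps when it is set."""
    return ((value >> 6) & 1) == 0


# Flags from the first register dump: C1 Z0 S0 O0 A1 P1 D0 I1 T0
dump = flags_value(C=1, A=1, P=1, I=1)
print(format(dump, "04X"))
```

So only the Z flag decides the JNE/JE jumps in the loop sections. Keeping I set and T clear when changing Z is probably wise, since setting T makes the CPU raise a single-step interrupt after every instruction.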
 
UNFORTUNATELY I SCREWED IT UP THE FIRST TIME WHEN I SET FLAGS TO 111111111 OR SOMETHING AND IT CAUSED SOME KIND OF UNHANDLED INTERRUPT ERROR, WELL AT LEAST I LEARNED SOMETHING TODAY, MAYBE NEXT TIME I MASTER IT ! ;)
 
// LOOP SECTION 1:
1E1E:000016B9 8B4608 mov ax,[bp+08] ss:[F006]=0057
1E1E:000016BC 3946E0 cmp [bp-20],ax ss:[EFDE]=0075

I looked up 8B46083946E0 in CIV.EXE, and found one unique occurrence (you can look it up too with any hex editor).
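If you prefer a script over a hex editor, the same search can be sketched in a few lines. The function name find_pattern is made up for illustration, and this assumes the bytes appear unpacked in the file, as they did in my copy.

```python
def find_pattern(data, pattern_hex):
    """Return every offset in `data` where the hex byte pattern occurs."""
    pattern = bytes.fromhex(pattern_hex)
    offsets = []
    start = 0
    while True:
        idx = data.find(pattern, start)
        if idx == -1:
            return offsets
        offsets.append(idx)
        start = idx + 1  # keep searching after this hit


# Tiny self-contained demonstration on made-up bytes:
sample = b"\x90\x8b\x46\x08\x39\x46\xe0\x75\x03"
print(find_pattern(sample, "8B46083946E0"))
```

To run it on the real game, read CIV.EXE in binary mode and pass its contents as data. A result list with exactly one offset means the byte sequence is unique in the file.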

It is in a routine I disassembled in IDA and called "AIprocessUnit".

My guess is that you have met the famous "corrupt unit stack" problem, whereby units that are located on the same map square get linked together in a stack (a field of the unit data points to the ID of the next unit in the stack, if any), but somehow, at some point, this linked list gets corrupted and you get endless unit lists, thus endless loops.

You can see an 8-year old (!!!) discussion about it here: https://forums.civfanatics.com/threads/civ-freezes-during-computer-move.535663/

I also used to have this problem very often when I was young and played Civ...

In JCivED there is a function to detect unit stack errors and fix them :

upload_2022-4-22_13-29-50.png


You can try JCivED, or share your savegame for analysis.

Cheers
 
A circular linked list could occur if "next" or "prev" pointers of existing nodes in lists are not cleared to nil as the list changes, especially when nodes are deleted or removed and existing nodes move forward or backward towards the begin or end pointers of the doubly linked list.

Example of a doubly linked list:

A -> B -> C -> NIL
NIL <- A <- B <- C

if A is deleted or removed, then the prev pointer of B should be cleared/set to NIL.

For example:

List.Begin := List.Begin.Next;

Programmer forgets to set the prev of the new begin to nil:

List.Begin.Prev := nil;

Since it was not cleared List.Begin.Prev still points to A.

If A later gets added back to the list at the end, the following will occur:

B -> C -> A
A <- B <- C <- A

So if Civilization has this programming bug and if it uses doubly linked lists, then when it removes the front node and adds it back to the list it has accidentally created a circular list.

Then when it travels backwards through the doubly linked list it travels in a circle forever.
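The scenario above can be sketched in a few lines. This only illustrates the hypothesis, it is not Civilization's actual data structure; the Node class and the demo and walk_backward helpers are made up for the example.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.next = None
        self.prev = None


def walk_backward(tail, limit=10):
    """Follow prev pointers from `tail`, giving up after `limit` nodes."""
    names = []
    node = tail
    while node is not None and len(names) < limit:
        names.append(node.name)
        node = node.prev
    # The second value is True when the walk was cut off before reaching nil.
    return names, node is not None


def demo(clear_prev):
    # Build A -> B -> C with matching prev pointers.
    a, b, c = Node("A"), Node("B"), Node("C")
    a.next, b.next = b, c
    b.prev, c.prev = a, b

    # Remove A from the front: List.Begin := List.Begin.Next
    begin = a.next
    if clear_prev:
        begin.prev = None  # the step the programmer may have forgotten
    a.next = None

    # Add A back at the end of the list.
    c.next = a
    a.prev = c
    return walk_backward(a)


print(demo(clear_prev=True))   # walk ends at B
print(demo(clear_prev=False))  # walk keeps cycling A, C, B, A, ...
```

With the stale prev pointer left in place, the backward walk never reaches nil and only stops because of the step limit.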

The code does seem to have forward/backward capability though not entirely sure ! ;)
 
This could also occur in the opposite/forward/next way (even for a single linked list) if it removes C and forgets to clear the next pointer of B

A -> B -> C

C removed from list but B is still pointing towards C.

if C now somehow gets added back to the front of the list:

C -> A -> B -> C

However if it is a single linked list then it is unlikely that C will be added to the front of the list, it is more likely that it might be added to the back of the list, cause that is the only capability of a single forward linked list.

However I think it might be conceivable that two lists of only 1 element are merged together where they happened to point towards each other:

List 1:
B -> C (removed)

List 2:
C -> B (removed)

Merge scenario 1, B/List 1 added to C/List 2:

C->(B->C (removed))

Merge scenario 2, C/List 2 added to B/List 1:

B->(C->B (removed))

This most closely resembles the unit stack bug you described, where two units "meet" on the same square and possibly get merged together into a list.

They may have met before and added up in the same list, causing their next or prev pointers to point towards each other.

Next time they meet and get added to the same list, the bug occurs.

(For now this is a hypothesis, it is not yet confirmed :))
 
The instructions I posted in the original post do not seem to be related to linked lists or to the creation or merging of lists, and they do not seem to traverse lists either. At least that's how it looks to me, but I am not an expert at reading assembly instructions. I see some multiplications and some byte accesses; it almost looks like some kind of pixel/graphics processing or math/index/coordinate calculation, so an AI path-search loop seems more likely to me. I have seen AI units go back and forth, back and forth, especially near the center of America: they fail to sail around America and can get stuck there. This can also happen with human-driven unit movement, where the search algorithm fails to find a path around America. So the potential for a loop seems present; perhaps it fails to terminate some kind of loop, or to cap the number of tries at finding a good movement path.

In this particular game there were a lot of ships, transports, cruisers, sails, so this adds credibility to this hypothesis that a unit near center of America is getting stuck. Therefore an interesting question to ask for you is:

The link which you posted to an 8 year old discussion, was that save game also on the Earth map ? (I'll try and go check it out :)) Interesting, it is not Earth, but it looks somewhat similar, though a smaller map (difficulty is unknown/can't tell) ! ;) I'll see if it enters the same code as I posted above when it hangs.
His version was 475.01, the version I currently use is: 475.05.
His bug does not occur in version 475.05. So perhaps his bug/hang is unrelated to my hang and it might be a different issue. I have other versions of civ 1 as well. Version 475.01 too, called civ1ch in my folder... I will try and see if I can reproduce his hang ! =D It indeed hangs ! Interesting... going to debug it ! ;)
I have posted a disassembly of his hang bug in his thread, to keep it separated from this one. I can confirm that his hang bug has totally different code/assembly, so it seems most likely that it is not related to this bug, but it could still be the same kind of bug if it's indeed list related, though for now I believe it is not. If I manage to get a save game close to the hang point then I can post it too, maybe I already posted one a while ago... I think so... hmmm... maybe a DOSBox save state might also be of some help, not sure.

So for now I take this hypothesis of linked/unit stack bugs with a grain of salt. I think it's trying to do some kind of computation and it expects some kind of result or index to occur, but that never happens the way the programmer expected it to... somehow the code/calculations fail to produce a desired exiting/jump out value.
 
I took another more detailed look, and I'm upping my confidence that you have a unit stack problem: a unit whose "next unit in stack" is... itself. So endless loop.

Hereunder I show you the match between your assembly and reverse from IDA: you will see that apparently, unit with ID 0x75 has itself as "next unit in stack", since when retrieving the next value, the variable var_neighbourOccupant (my naming) still has the value 0x75, so the loop continues trying to process unit 0x75...

upload_2022-4-23_16-4-20.png


upload_2022-4-23_16-4-28.png
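As a rough model of what the two loop sections do, going only by the posted disassembly: the loop stops when the current unit equals the value at [bp+08] (the jmp to 1752 in section 1) or when the "next unit" byte is FFFF after cbw (the je to 1752 in section 2). The function and variable names below are mine, not the game's, and the per-unit work in the middle of section 1 is left out.

```python
NO_UNIT = -1  # FFFF after sign extension (cbw)


def walk_stack(next_of, start, stop_unit, max_steps=1000):
    """Follow 'next unit in stack' links the way the posted loop does.

    next_of   : dict mapping unit ID -> next unit ID (hypothetical model)
    start     : first unit examined (the [bp-20] variable)
    stop_unit : unit ID compared at 16B9 (the [bp+08] argument)
    Returns (visited, finished); finished is False when max_steps ran out.
    """
    visited = []
    cur = start
    for _ in range(max_steps):
        if cur == stop_unit:      # cmp [bp-20],ax -> jmp 1752
            return visited, True
        visited.append(cur)       # the per-unit work of section 1 goes here
        cur = next_of[cur]        # mov al,[bx+si-7E22] / cbw / mov [bp-20],ax
        if cur == NO_UNIT:        # cmp ax,FFFF -> je 1752
            return visited, True
    return visited, False


# Healthy chain: unit 0x75 leads back to 0x57, so the loop ends.
print(walk_stack({0x75: 0x57}, 0x75, 0x57))
# Corrupt chain: unit 0x75 points at itself, so the loop never ends.
print(walk_stack({0x75: 0x75}, 0x75, 0x57, max_steps=5))
```

The real code has no step limit, which is why the self-linked unit 0x75 keeps it spinning.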
 
OK, let's suppose you are right :) How to fix this ? I tried setting ax to FFFF but then it seemed the program ended up outside of its normal instructions, though I could be wrong.

How does your tool solve it ? I would assume setting AX or some pointer to 0/nil but perhaps FFFF is used as a nil indicator.

Perhaps loading AX with FFFF at some of the other branches might work.

By collecting multiple save games right before this bug happens it might become possible to find the location in the computer program where this happens.

It would work by setting a watch on the next pointer and checking if the value becomes the same as the original/entity itself.

Then loading each save game, playing a couple of turns, and then hoping the watchpoint/data breakpoint condition gets triggered.

Data breakpoint:

Unit.Next = Unit

Or perhaps some more complex breakpoint, checking different locations, but mostly the same idea.
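To know where to put such a watch, the address of the "next" field can be worked out from the constants in the posted disassembly: 0600 per civ, 000C per unit, the -7E2C displacement (81D4 as a 16-bit offset) as the start of the table, and the -7E22 displacement (81DE) for the byte that gets loaded as the next unit. This is a sketch under those assumptions only; the names are made up, and treating [bp+06] as a civ index and [bp-20] as a unit index is a guess.

```python
UNIT_TABLE = 0x81D4   # 16-bit form of the -7E2C displacement
CIV_STRIDE = 0x0600   # mov ax,0600 / imul word [bp+06]
RECORD_SIZE = 0x000C  # mov ax,000C / imul word [bp-20]
NEXT_FIELD = 0x000A   # 0x81DE - 0x81D4, the byte read at 1742


def next_field_offset(civ, unit):
    """DS offset of the 'next unit in stack' byte for (civ, unit)."""
    return (UNIT_TABLE + CIV_STRIDE * civ + RECORD_SIZE * unit + NEXT_FIELD) & 0xFFFF


def read_next(ds, civ, unit):
    """Read the byte and sign-extend it like cbw does (0xFF -> -1)."""
    raw = ds[next_field_offset(civ, unit)]
    return raw - 0x100 if raw >= 0x80 else raw


def is_self_linked(ds, civ, unit):
    """The watch condition Unit.Next = Unit."""
    return read_next(ds, civ, unit) == unit


# Values from the register dumps: [bp+06] = 4, [bp-20] = 0x75
print(format(next_field_offset(4, 0x75), "04X"))
```

For the values in the dumps this gives offset 9F5A in segment DS=3324, so a memory breakpoint on that byte (I believe the DOSBox debugger has a BPM command for this) should trigger when the link gets written.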

Then maybe an executable patch could be created to fix civ1.exe so that this never happens again ! =D That would be great/awesome ! =D
 
An alternative idea/fix would be to replace/intercept one of the jmps and lead it to a new piece of code/instructions, perhaps somewhere in the executable where there are some empty instructions or nops, or perhaps at the end of the file, or perhaps allocated in memory.

Then check for this condition to occur, and if so then fix it, by setting the next pointer to some kind of nil value. (FFFF) ?

Then this fix would simply let the problem occur, but then detect it, fix it, and then return/jump back to those code sections and then the game should continue to run fine =D

However it might be possible that this "circular list" problem happens even for multiple units linked together in some circle. Then the checking code would have to be a bit more complex: assume some maximum number of linked units like 65535 or something, and if the counter is about to overflow then decide that a circular list is detected. Or perhaps build a new list/stack where some code checks to see if the unit to be added is already in the stack, so re-build the stack/list in a safe way, and once done, set the last next pointer to NIL... hehe interesting idea.

This would be quite an expensive algorithm... to constantly check the already existing list, but it is safe. A simpler approach would be to traverse the list in one go and just set the next pointer to nil after adding a unit, maybe that would be enough to fix it.
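The "check if the unit is already in the stack" idea can be sketched like this. It works on a plain dictionary of next links, not on the real game memory, and repair_stack is a made-up name. Whether the game expects a stack to end in FFFF or to loop back to its starting unit is not established here, so cutting the chain with FFFF is an assumption.

```python
NO_UNIT = -1  # stands in for the FFFF / nil marker


def repair_stack(next_of, start):
    """Walk the chain from `start`; if a unit shows up twice, cut the link
    that closes the circle by pointing it at NO_UNIT.

    Returns True when a circular link was found and cut.
    """
    seen = set()
    cur = start
    while cur != NO_UNIT:
        seen.add(cur)
        nxt = next_of[cur]
        if nxt in seen:            # this link leads back into the chain
            next_of[cur] = NO_UNIT
            return True
        cur = nxt
    return False


links = {0x75: 0x75}               # unit that is its own "next"
print(repair_stack(links, 0x75), links)
```

Because every unit is visited at most once, the walk is bounded by the number of units in the chain, so no 65535 counter is needed for this variant.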
 