Retributive morality, prisoner's dilemma and more: A challenge to CFC

Erik Mesoy

I thought of a game, based on the Prisoner's Dilemma, that I could put to CFC for discussion while practicing my Java programming skills.

Here's the setup: In any given round, you can choose to play nice (N) or play dirty (D).
If both parties play nice, they both win 3 points.
If only one party plays dirty, that party wins 4 points and the nice party gets none.
If both parties play dirty, both parties win 1 point.

Symbolically:
(N, N) -> (3, 3)
(N, D) -> (0, 4)
(D, N) -> (4, 0)
(D, D) -> (1, 1)
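
For anyone who wants to try this at home, the table above could be written in Java roughly like this. This is only a sketch: the class name `Payoff` and the method `score` are made up for illustration and are not taken from the actual program.

```java
// Hypothetical helper: encodes the payoff table from the post above.
final class Payoff {
    /** Returns {points for the first player, points for the second}; 'N' = nice, 'D' = dirty. */
    static int[] score(char first, char second) {
        if (first == 'N' && second == 'N') return new int[] {3, 3};
        if (first == 'N' && second == 'D') return new int[] {0, 4};
        if (first == 'D' && second == 'N') return new int[] {4, 0};
        if (first == 'D' && second == 'D') return new int[] {1, 1};
        throw new IllegalArgumentException("moves must be 'N' or 'D'");
    }
}
```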

Slap whatever political or ideological labels you like on the above. I suspect a favorite will be (N, D) as "America is losing the war on terror because it's giving too good treatment to terrorists who don't deserve it because of their actions". ;)

Anyway. The challenge to any number of people at CFC is this - tell me, algorithmically, how you would behave over time in such a situation. Please base it on your RL moral judgement (or someone's) so that I can check performance levels of different codes of morality. I will be running this on my computer in Java, so you can't analyze the situation between each round and discuss tactics. However, you are free to discuss tactics with your fellow posters, and you're also free to submit different tactics to me than you actually say in public. (Send to me by PM. Algorithms posted here will not be run.)

Each submitted algorithm will be run against each other algorithm, plus a few of my own, for an unknown, varying number of rounds, to avoid the problem that crops up at the last round, and the round before that, and the round before that again...
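
A rough sketch of what such an all-against-all run might look like in Java follows. Everything in it is an assumption made for illustration: the `Strategy` interface, the `minRounds`/`maxRounds` bounds, and the uniform random draw of the match length are hypothetical and not a description of the real program.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

final class RoundRobin {
    /** A strategy sees its own and its opponent's past moves ('N'/'D') and picks the next move. */
    interface Strategy {
        char next(String mine, String theirs);
    }

    /** Points earned by the player who played 'me' against 'other'. */
    static int points(char me, char other) {
        if (me == 'N') return other == 'N' ? 3 : 0;
        return other == 'N' ? 4 : 1;
    }

    /** Plays every pair once; the match length is drawn from [minRounds, maxRounds] (assumed). */
    static Map<String, Integer> run(Map<String, Strategy> entrants,
                                    int minRounds, int maxRounds, Random rng) {
        Map<String, Integer> totals = new LinkedHashMap<>();
        List<String> names = new ArrayList<>(entrants.keySet());
        for (String name : names) totals.put(name, 0);
        for (int i = 0; i < names.size(); i++) {
            for (int j = i + 1; j < names.size(); j++) {
                // Neither strategy is told how many rounds this match will last.
                int rounds = minRounds + rng.nextInt(maxRounds - minRounds + 1);
                StringBuilder a = new StringBuilder();
                StringBuilder b = new StringBuilder();
                for (int r = 0; r < rounds; r++) {
                    char moveA = entrants.get(names.get(i)).next(a.toString(), b.toString());
                    char moveB = entrants.get(names.get(j)).next(b.toString(), a.toString());
                    a.append(moveA);
                    b.append(moveB);
                    totals.merge(names.get(i), points(moveA, moveB), Integer::sum);
                    totals.merge(names.get(j), points(moveB, moveA), Integer::sum);
                }
            }
        }
        return totals;
    }
}
```

Because a strategy only ever sees the histories so far, it can count the rounds played but cannot tell which round is the last one.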

Example of a strategy you can submit:
"If the last two things done by the other guy were D, punish him by playing D twice. Otherwise, play N and forgive a single D." (Call this A)

Another one:
"If my opponent plays N, I get 3 points for N and 4 points for D. If my opponent plays D, I get 0 points for N and 1 point for D. Therefore I should always play D." (Call this B)

Note that if A and B play against one another, the result is:
0,4
0,4
1,1
... (repeat 1,1 until end of game)
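
To check that sequence by machine, here is a small Java sketch that plays A against B. It takes one possible reading of A (two D's in a row from the other guy start, or restart, a two-round punishment); the class and method names are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

final class AVersusB {
    /** Payoffs from the table: {points for A, points for B}. */
    static int[] score(char a, char b) {
        if (a == 'N') return b == 'N' ? new int[] {3, 3} : new int[] {0, 4};
        return b == 'N' ? new int[] {4, 0} : new int[] {1, 1};
    }

    /** Plays A against B for the given number of rounds; returns "a,b" per round. */
    static List<String> play(int rounds) {
        List<String> results = new ArrayList<>();
        StringBuilder bHistory = new StringBuilder(); // B's past moves, as seen by A
        int punishLeft = 0;                           // rounds of D that A still owes B
        for (int round = 0; round < rounds; round++) {
            int n = bHistory.length();
            // A: two D's in a row from the opponent trigger two rounds of D.
            if (n >= 2 && bHistory.charAt(n - 1) == 'D' && bHistory.charAt(n - 2) == 'D') {
                punishLeft = 2;
            }
            char a;
            if (punishLeft > 0) {
                a = 'D';
                punishLeft--;
            } else {
                a = 'N'; // forgive a single D
            }
            char b = 'D'; // B: always play dirty
            int[] s = score(a, b);
            results.add(s[0] + "," + s[1]);
            bHistory.append(b);
        }
        return results;
    }

    public static void main(String[] args) {
        // Prints 0,4 then 0,4 then 1,1 for every remaining round.
        play(5).forEach(System.out::println);
    }
}
```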

Your play instructions can have any number of conditionals. (Assuming that they're sane, of course! Don't make your plays dependent on the phase of the moon, please.) You can include random plays with percentage probabilities. You can refer to the number of rounds played, but not to the (unknown) number of total or remaining rounds. Your instructions can include dependencies on previous plays. If you want to submit multiple algorithms, that's allowed, but please give them names, as I will otherwise be naming each algorithm after its submitter.

Finally, this is not meant to be an accurate model! Merely a curiosity.


Fire away! Special request for El_Machinae to submit an algorithm based on his personal way of life. I'm very curious as to how he'll model that.
 
I've gotten two PMs from people who have bumped into this before and claim to know of the same "optimal" strategy.

It has lost. Repeatedly. :p Ten seconds on google probably does not get you the optimal strategy. Make up your own.
 
OK, here is my strategy:

Start Nice; after that, play whatever the opponent chose in the previous round.

If the opponent played Dirty, there is a small chance (5%) that it will choose Nice anyway.
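
In Java, that rule might look something like the sketch below. The class name, the `nextMove` signature, and the injected random source are all invented here for illustration (the injected source just makes the 5% branch easy to test).

```java
import java.util.function.DoubleSupplier;

final class ForgivingTitForTat {
    private static final double FORGIVE_CHANCE = 0.05; // the 5% from the description
    private final DoubleSupplier random;               // supplies values in [0, 1)

    ForgivingTitForTat(DoubleSupplier random) {
        this.random = random;
    }

    /** opponentLast is null in the first round, otherwise 'N' or 'D'. */
    char nextMove(Character opponentLast) {
        if (opponentLast == null) {
            return 'N'; // start Nice
        }
        if (opponentLast == 'D' && random.getAsDouble() < FORGIVE_CHANCE) {
            return 'N'; // occasionally forgive a Dirty move
        }
        return opponentLast; // otherwise copy the opponent's previous move
    }
}
```

For real play it could be constructed as `new ForgivingTitForTat(Math::random)`.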

Edit: If it's Java, I might be able to look into it. I dabbled in Java some time ago.
 
This actually weakens the tit-for-tat strategy. If you go N against D and are N against N for every other round in the game, you'll come off worse than the other guy.

I submitted a couple possibilities.
 
Only slightly, and if the two parties are caught in a Dirty-Dirty loop, it could work to its advantage.

PS: 5%, not 50%.
 
Is doing better than the other guy necessarily the goal?
What's the aim?

This is where your personal morality/ethics/value system/whatever comes in. If you're willing to sacrifice yourself for the public good, send in "all N" as your strategy. If you're sending in a ruleset meant to caricature your political opponents, you can send in "do the opposite of what the other guy is doing". For most people, the aim will probably be to score the most points. Since two rounds of (3,3) give each side 6 points, while trading (4,0) for (0,4) gives each side only 4, you may want to favor cooperation.
 
In the Prisoner's Dilemma, yes, yes it is.
It looks as though you are assuming that punkbass2000 prefers less time spent in prison. ;) (Edit: Heh, crosspost!)

Incidentally, the standings so far are

1. Dawkins' Grudger (Gogf)
2. ArneHD
3. The Psychotic (Gogf)

These were done on paper as I haven't gotten the program working yet.
 
Erik, could you post the code? I would like to have a look at it.
 
Assuming an intelligent opponent who can adjust his own play and who seeks to maximise his own score, I think I have a strategy that works in the long run:

1) Play N in the first round.
2) If the opponent plays N, play N in the following round.
3) If the opponent plays D, randomly play either N or D in the following round.

Once the opponent has guessed that this is your rule, the thinking goes like this: any time when both play N, the opponent knows he can score an extra profit of 1 in the following round by playing D; however, this comes at the expected cost of .5 in each subsequent round (assuming he keeps playing D) and the expected cost of 1.5 when he plays his next N.
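
The numbers in that paragraph can be checked with a few lines of Java. This is only a worked calculation under the payoffs from the first post and a 50/50 reply after every D; the class and method names are made up for the example.

```java
final class RandomReplyOdds {
    // Payoffs to the opponent, taken from the table in the first post.
    static final double BOTH_NICE = 3, TEMPTATION = 4, SUCKER = 0, BOTH_DIRTY = 1;

    /** Extra profit from the first D, played while the rule-follower still plays N. */
    static double firstDefectionGain() {
        return TEMPTATION - BOTH_NICE; // 4 - 3 = 1
    }

    /** Expected loss per further round of D against a 50/50 reply, compared with mutual N. */
    static double continuedDefectionCost() {
        return BOTH_NICE - (0.5 * TEMPTATION + 0.5 * BOTH_DIRTY); // 3 - 2.5 = 0.5
    }

    /** Expected loss in the round where the opponent returns to N and meets a 50/50 reply. */
    static double returnToNiceCost() {
        return BOTH_NICE - (0.5 * BOTH_NICE + 0.5 * SUCKER); // 3 - 1.5 = 1.5
    }

    /** Net expected result of a single D followed straight away by N. */
    static double singleDefectionNet() {
        return firstDefectionGain() - returnToNiceCost(); // 1 - 1.5 = -0.5
    }
}
```

Even the shortest possible defection comes out half a point behind on average, and every extra round of D costs another half point.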

This will force the opponent to play N all the time, if he values his own absolute score more than his relative advantage over you. It goes to hell, I think, if he decides it's more important to beat you than to score well.

It won't work in Erik's game, though, because nobody can submit an algorithm that will 'learn' that you're playing randomly.
 
Sorry, yes. :blush: I've edited that in. Thanks.
 
Isn't that in essence a variation of the "Forgiving Tit for Tat" that I suggested?
 