Retributive morality, prisoner's dilemma and more: A challenge to CFC

Erik Mesoy · Mar 17, 2007

I thought of a game, based on the Prisoner's Dilemma, that I could put to CFC for discussion while practicing my Java programming skills.

Here's the setup: In any given round, you can choose to play nice (N) or play dirty (D).
If both parties play nice, they both win 3 points.
If one party plays dirty, that party wins 4 points and the nice party none.
If both parties play dirty, both parties win 1 point.

Symbolically:
(N, N) -> (3, 3)
(N, D) -> (0, 4)
(D, N) -> (4, 0)
(D, D) -> (1, 1)
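For anyone who wants to check their reading of the table, here is a minimal Java sketch of the scoring rule. The class name `Payoff` and the method `score` are made up for illustration and are not taken from the actual program.

```java
// Hypothetical helper: scores a single round of the game described above.
final class Payoff {
    /** Returns {myPoints, theirPoints}; each move is 'N' (nice) or 'D' (dirty). */
    static int[] score(char mine, char theirs) {
        if (mine == 'N' && theirs == 'N') return new int[] {3, 3};
        if (mine == 'N' && theirs == 'D') return new int[] {0, 4};
        if (mine == 'D' && theirs == 'N') return new int[] {4, 0};
        if (mine == 'D' && theirs == 'D') return new int[] {1, 1};
        throw new IllegalArgumentException("moves must be 'N' or 'D'");
    }
}
```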

Slap whatever political or ideological labels you like on the above. I suspect a favorite will be (N, D) as "America is losing the war on terror because it's giving too good treatment to terrorists who don't deserve it because of their actions".

Anyway. The challenge to any number of people at CFC is this - tell me, algorithmically, how you would behave over time in such a situation. Please base it on your RL moral judgement (or someone's) so that I can check performance levels of different codes of morality. I will be running this on my computer in Java, so you can't analyze the situation between each round and discuss tactics. However, you are free to discuss tactics with your fellow posters, and you're also free to submit different tactics to me than you actually say in public. (Send to me by PM. Algorithms posted here will not be run.)

Each submitted algorithm will be run against each other algorithm, plus a few of my own, for an unknown, varying number of rounds, to avoid the problem that crops up at the last round, and the round before that, and the round before that again...

Example of a strategy you can submit:
"If the last two things done by the other guy were D, punish him by playing D twice. Otherwise, play N and forgive a single D." (Call this A)

Another one:
"If my opponent plays N, I get 3 points for N and 4 points for D. If my opponent plays D, I get 0 points for N and 1 point for D. Therefore I should always play D." (Call this B)

Note that if A and B play against one another, the result is:
0,4
0,4
1,1
... (repeat 1,1 until end of game)
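The A-versus-B result can be reproduced with a short sketch. This assumes one reading of A ("D twice" means the current round plus the next one); the class `AVersusB` and its `play` method are hypothetical names, not the real tournament code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: strategy A (forgive a single D) against strategy B (always D).
final class AVersusB {
    /** Returns the per-round scores as "aPoints,bPoints" strings. */
    static List<String> play(int rounds) {
        List<Character> bHistory = new ArrayList<>();
        List<String> results = new ArrayList<>();
        int punishLeft = 0; // extra D rounds A still owes after triggering
        for (int i = 0; i < rounds; i++) {
            int n = bHistory.size();
            char a;
            if (punishLeft > 0) {
                a = 'D';
                punishLeft--;
            } else if (n >= 2 && bHistory.get(n - 1) == 'D' && bHistory.get(n - 2) == 'D') {
                a = 'D';        // first of the two punishing plays
                punishLeft = 1; // second one follows next round
            } else {
                a = 'N';
            }
            char b = 'D'; // B always plays dirty
            // Since B always plays D, A scores 0 with N and 1 with D.
            results.add(a == 'N' ? "0,4" : "1,1");
            bHistory.add(b);
        }
        return results;
    }
}
```

Calling `AVersusB.play(5)` gives two rounds of 0,4 followed by 1,1 for the rest, matching the listing above.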

Your play instructions can have any amount of conditionals. (Assuming that they're sane, of course! Don't make your plays dependent on the phase of the moon, please.) You can include random plays with percentage probabilities. You can refer to the number of rounds played, but not to the (unknown) number of total or remaining rounds. Your instructions can include dependencies on previous plays. If you want to submit multiple algorithms, that's allowed, but please give them names as I will otherwise be naming each algorithm after its submitter.

Finally, this is not meant to be an accurate model! Merely a curiosity.

Fire away! Special request for El_Machinae to submit an algorithm based on his personal way of life. I'm very curious as to how he'll model that.

Erik Mesoy · Mar 17, 2007

I've gotten two PMs from people who have bumped into this before and claim to know of the same "optimal" strategy.

It has lost. Repeatedly.

Ten seconds on google probably does not get you the optimal strategy. Make up your own.

ArneHD · Mar 17, 2007

Ok, Here is my strategy:

Start Nice, next moves are whatever the opponent chose the last round

If the opponent is dirty there is a small chance (5%) that it will choose Nice anyway

Edit: If it's Java, I might be able to look into it. I dabbled in Java some time ago.
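A rough Java sketch of the strategy above, in case it helps to see it written out. The class name `ForgivingTitForTat`, the constructor parameters, and the use of `null` for "no previous round" are all assumptions made for this example; the 5% figure would be passed in as `0.05`.

```java
import java.util.Random;

// Hypothetical sketch: tit-for-tat that occasionally forgives a dirty play.
final class ForgivingTitForTat {
    private final double forgiveChance; // e.g. 0.05 for a 5% chance
    private final Random rng;

    ForgivingTitForTat(double forgiveChance, Random rng) {
        this.forgiveChance = forgiveChance;
        this.rng = rng;
    }

    /** opponentLast is 'N' or 'D', or null in the first round. */
    char next(Character opponentLast) {
        if (opponentLast == null) {
            return 'N'; // start nice
        }
        // Copy a D unless the random draw falls inside the forgiveness window.
        if (opponentLast == 'D' && rng.nextDouble() >= forgiveChance) {
            return 'D';
        }
        return 'N';
    }
}
```

Passing the `Random` in from outside makes it possible to seed it and rerun the same game.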

Gogf · Mar 17, 2007

ArneHD said:
Ok, Here is my strategy:

Start Nice, next moves are whatever the opponent chose the last round

If the opponent is dirty there is a small chance (5%) that it will choose Nice anyway

Edit: If it's Java, I might be able to look into it. I dabbled in Java some time ago.

This actually weakens the tit-for-tat strategy. If you go N against D and are N against N for every other round in the game, you'll come off worse than the other guy.

I submitted a couple possibilities.

ArneHD · Mar 17, 2007

Gogf said:
This actually weakens the tit-for-tat strategy. If you go N against D and are N against N for every other round in the game, you'll come off worse than the other guy.

I submitted a couple possibilities.

Only slightly, and if the two parties are caught in a Dirty-Dirty loop, it could work to its advantage.

PS: 5 not 50%.

punkbass2000 · Mar 17, 2007

Gogf said:
you'll come off worse than the other guy.

Is doing better than the other guy necessarily the goal?

Truronian · Mar 17, 2007

What's the aim?

Gogf · Mar 17, 2007

punkbass2000 said:
Is doing better than the other guy necessarily the goal?

No, but doing worse is something that should be avoided.

Another idea is the "inverse tit-for-tat." Start with a D and then do the opposite of what the other guy does.
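As a sketch only, the "inverse tit-for-tat" idea might look like this in Java. The class and method names are invented for illustration, and `null` is assumed to mean there is no previous round yet.

```java
// Hypothetical sketch: open with D, then play the opposite of the opponent's last move.
final class InverseTitForTat {
    /** opponentLast is 'N' or 'D', or null in the first round. */
    static char next(Character opponentLast) {
        if (opponentLast == null) {
            return 'D'; // start dirty
        }
        return opponentLast == 'N' ? 'D' : 'N';
    }
}
```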

ArneHD · Mar 17, 2007

punkbass2000 said:
Is doing better than the other guy necessarily the goal?

In the Prisoner's Dilemma, yes, yes it is.

Erik Mesoy · Mar 17, 2007

punkbass2000 said:
Is doing better than the other guy necessarily the goal?

Truronian said:
What's the aim?

This is where your personal morality/ethics/value system/whatever comes in. If you're willing to sacrifice yourself for the public good, send in "all N" as your strategy. If you're sending in a ruleset meant to caricature your political opponents, you can send in "do the opposite of what the other guy is doing". For most people, the aim will probably be to score the most points. Since (3,3)*2 > (4,0)+(0,4), you may want to favor cooperation.

punkbass2000 · Mar 17, 2007

ArneHD said:
In the Prisoner's Dilemma, yes, yes it is.

Why do you think that? For me, anyway, the goal would be as little jail time as possible.

Erik Mesoy · Mar 17, 2007

ArneHD said:
In the Prisoner's Dilemma, yes, yes it is.

It looks as though you are assuming that punkbass2000 prefers less time spent in prison.

(Edit: Heh, crosspost!)

Incidentally, the standings so far are

1. Dawkins' Grudger (Gogf)
2. ArneHD
3. The Psychotic (Gogf)

These were done on paper as I haven't gotten the program working yet.

punkbass2000 · Mar 17, 2007

Gogf said:
No, but doing worse is something that should be avoided.

Say there's ten rounds. You start N, they start D. They're now "winning" 4-0. Would it be better to both play D from now and end 13-9 or both play N and end up 31-27?
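The two totals in that question can be checked with a few lines of Java. This is just the arithmetic, assuming the opening round went (N, D) for 0 and 4 points and both sides then score the same amount every remaining round; `TenRounds` and `totals` are made-up names.

```java
// Hypothetical helper: final scores after an opening (N, D) round followed by
// identical play from both sides for the rest of the game.
final class TenRounds {
    /** Returns {theirTotal, yourTotal}; perRoundEach is 1 for (D, D) or 3 for (N, N). */
    static int[] totals(int rounds, int perRoundEach) {
        int remaining = rounds - 1; // the first round is already played
        int theirs = 4 + remaining * perRoundEach; // they opened with D against your N
        int yours = 0 + remaining * perRoundEach;
        return new int[] {theirs, yours};
    }
}
```

With ten rounds, `totals(10, 1)` comes out at 13 and 9, and `totals(10, 3)` at 31 and 27: the gap is 4 points either way, but both totals are 18 points higher under mutual N.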

punkbass2000 · Mar 17, 2007

Incidentally, the setup of this particular game doesn't really align well with "Prisoner's Dilemma". It uses a similar gaming method, but the results and process are quite distinct.

ArneHD · Mar 17, 2007

Erik, could you post the code? I would like to have a look at it.

Taliesin · Mar 17, 2007

Assuming an intelligent opponent who can adjust his own play and who seeks to maximise his own score, I think I have a strategy that works in the long run:

1) Play N in the first round.
2) If the opponent plays N, play N in the following round.
3) If the opponent plays D, randomly play either N or D in the following round.

Once the opponent has guessed that this is your rule, the thinking goes like this: any time when both play N, the opponent knows he can score an extra profit of 1 in the following round by playing D; however, this comes at the expected cost of .5 in each subsequent round (assuming he keeps playing D) and the expected cost of 1.5 when he plays his next N.

This will force the opponent to play N all the time, if he values his own absolute score more than his relative advantage over you. It goes to hell, I think, if he decides it's more important to beat you than to score well.

It won't work in Erik's game, though, because nobody can submit an algorithm that will 'learn' that you're playing randomly.
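The expected costs quoted above follow from the payoff table if the punishing move is a fair coin flip. A small Java sketch of that arithmetic, with the class and method names being assumptions for this example:

```java
// Hypothetical helper: the opponent's expected score against a 50/50 N-or-D move.
final class RandomPunishment {
    /** Average of the payoff when the coin lands N and when it lands D. */
    static double expectedVsCoinFlip(int payoffVsNice, int payoffVsDirty) {
        return (payoffVsNice + payoffVsDirty) / 2.0;
    }

    /** Shortfall compared with the 3 points per round of steady (N, N) play. */
    static double costVsMutualNice(int payoffVsNice, int payoffVsDirty) {
        return 3.0 - expectedVsCoinFlip(payoffVsNice, payoffVsDirty);
    }
}
```

Playing D against the coin flip averages (4 + 1) / 2 = 2.5, a cost of 0.5 per round; playing N against it averages (3 + 0) / 2 = 1.5, a cost of 1.5. The one-off gain from the first D is 4 - 3 = 1.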

punkbass2000 · Mar 17, 2007

I assume '3)' should read: "If the opponent plays D, randomly play either N or D in the following round."

Taliesin · Mar 17, 2007

Sorry, yes. :blush:

I've edited that in. Thanks.

punkbass2000 · Mar 17, 2007

ArneHD · Mar 17, 2007

Taliesin said:
Assuming an intelligent opponent who can adjust his own play and who seeks to maximise his own score, I think I have a strategy that works in the long run:

1) Play N in the first round.
2) If the opponent plays N, play N in the following round.
3) If the opponent plays D, randomly play either N or D in the following round.

Once the opponent has guessed that this is your rule, the thinking goes like this: any time when both play N, the opponent knows he can score an extra profit of 1 in the following round by playing D; however, this comes at the expected cost of .5 in each subsequent round (assuming he keeps playing D) and the expected cost of 1.5 when he plays his next N.

This will force the opponent to play N all the time, if he values his own absolute score more than his relative advantage over you. It goes to hell, I think, if he decides it's more important to beat you than to score well.

It won't work in Erik's game, though, because nobody can submit an algorithm that will 'learn' that you're playing randomly.

Isn't that in essence a variation of the "Forgiving Tit for Tat" that I suggested?
