Image Image Image




Post new topic Reply to topic  [ 16 posts ] 
Author Message
 Post subject: CFRM
PostPosted: Tue Dec 14, 2010 2:17 am 
Offline
Junior member
User avatar

Posts: 38
Favourite Bot: Mine :D
Hi Guyz,

I'm trying to build my CFR algo. for 1 card poker... Will it be possible ?


I have watched the video and I understand the basic concept of the CFR (I think).

What I don't understand is how to take in count the card of the vilain. Do I have to create a 'node' for each card ?

Next question, when I have calculated the accumative regrets what should I do ? How should I use the calculated values ?

For the root node at start, I have 2 actions : bet or check with probability of 50% each. I suppose that I will have to update these values when I will have read all the tree and calculatec the accumulative regrets.Am I wrong ?

What they mean by 'repeat until suffiently small epsilon' ?

Sorry for all this questions but I'm trying to understand the CFR theory.

Thank you in advance for your help.

MrNice


Top
 Profile E-mail  
 
 Post subject: Re: Monte Carlo Sampling for Regret Minimization in Extensive Games
PostPosted: Tue Dec 14, 2010 9:21 am 
Offline
PokerAI fellow
User avatar

Posts: 1239
Favourite Bot: my bot
@MrNice

1 card poker solution here http://code.google.com/p/alexoholdem/
Have you read michael johansons MSc thesis? viewtopic.php?f=64&t=83&p=11108&hilit=johanson#p11108
Lots of previous discussions on this forum - have you tried a search?


Top
 Profile E-mail  
 
 Post subject: Re: Monte Carlo Sampling for Regret Minimization in Extensive Games
PostPosted: Wed Dec 15, 2010 6:43 pm 
Offline
Junior member
User avatar

Posts: 38
Favourite Bot: Mine :D
Hi Spears,

I have read the johansons thesis and I will have another look.

I have already search on the forum but i couldn't find the answer.


I will read again the thesis of johansons and come back to the forum.

regards,

MrNice


Top
 Profile E-mail  
 
 Post subject: Re: Monte Carlo Sampling for Regret Minimization in Extensive Games
PostPosted: Wed Dec 29, 2010 2:47 pm 
Offline
Junior member
User avatar

Posts: 38
Favourite Bot: Mine :D
Hi Guyz,


I have read and read and read the thesis of Johanson, specially the chapter 3... I have finally achieved to create my game tree and calculate the CFR for one card poker (1->13). Following my algo used to calculate CFR :

Code:
//calculate CFR

            CurrentNode->Ev=(CurrentNode->NextNode1->Ev * CurrentNode->ProbabilityAction1)+(CurrentNode->NextNode2->Ev * CurrentNode->ProbabilityAction2);

            float RegretAction1=CurrentNode->NextNode1->Ev - CurrentNode->Ev;
            float RegretAction2=CurrentNode->NextNode2->Ev - CurrentNode->Ev;
            float CounterFactualRegret1;
            float CounterFactualRegret2;
            
            if(CurrentNode->PreviousNode!=NULL)
            {
               if(CurrentNode->PreviousAction==CurrentNode->PreviousNode->NextAction1)
               {
                  CounterFactualRegret1= RegretAction1 * CurrentNode->PreviousNode->ProbabilityAction1;
                  CounterFactualRegret2= RegretAction2 * CurrentNode->PreviousNode->ProbabilityAction1;

               }
               if(CurrentNode->PreviousAction==CurrentNode->PreviousNode->NextAction2)
               {
                  CounterFactualRegret1= RegretAction1 * CurrentNode->PreviousNode->ProbabilityAction2;
                  CounterFactualRegret2= RegretAction2 * CurrentNode->PreviousNode->ProbabilityAction2;
               }
            }
            else
            {
               CounterFactualRegret1= RegretAction1;
               CounterFactualRegret2= RegretAction2;

            }
            
            CurrentNode->CounterFactualRegret1 = CurrentNode->CounterFactualRegret1+CounterFactualRegret1;
            CurrentNode->CounterFactualRegret2 = CurrentNode->CounterFactualRegret2+CounterFactualRegret2;
            
            CounterFactualRegret1=CurrentNode->CounterFactualRegret1;
            CounterFactualRegret2=CurrentNode->CounterFactualRegret2;


            if(CounterFactualRegret1>0 && CounterFactualRegret2>0)
            {
               CurrentNode->ProbabilityAction1=CounterFactualRegret1/(CounterFactualRegret1+CounterFactualRegret2);
               CurrentNode->ProbabilityAction2=CounterFactualRegret2/(CounterFactualRegret1+CounterFactualRegret2);
            }
            
            if(CounterFactualRegret1<0 && CounterFactualRegret2>0)
            {
               CurrentNode->ProbabilityAction1=0;
               CurrentNode->ProbabilityAction2=1;
            }
            
            if(CounterFactualRegret2<0 && CounterFactualRegret1>0)
            {
               CurrentNode->ProbabilityAction1=1;
               CurrentNode->ProbabilityAction2=0;
            }
            

            CurrentNode->Iteration++;
            
            if(CurrentNode->PreviousNode != NULL)
            {
               ActionatNode(CurrentNode->PreviousNode,HeroCard,VilainCard);
            }
            else
            {
               return;
            }
         }


What do you think about the code ? Is it the right way to calculate it ?

On my implementation, I have only created one tree (from my point of view). Do I have to create a tree on the opponent side ? If yes, what will be the utility ?

What I will do now is to create a game tree for each card (1->13) and run the CFR calculation on each tree against each other cards. So at the end I will have 13 trees with the CFR computed against each other cards... Do you think that's the right way to do ? Or is there any other way to do it?

Regarding the first tests, I have run the algo on the following way :

Hero card : Card 10
Vilain cards : Card from 1 to 13

After that, I have printed the probability for my first action (RAISE or CHECK)... and it's always 'RAISE'... I don't know if it's normal... Does anybody has used the CFR on one card poker to compare our results ?

Thank you for your input.

MrNice


Top
 Profile E-mail  
 
 Post subject: Re: Monte Carlo Sampling for Regret Minimization in Extensive Games
PostPosted: Wed Dec 29, 2010 8:25 pm 
Offline
Junior member
User avatar

Posts: 38
Favourite Bot: Mine :D
Hi Guyz,

In the Johanson thesis on page 39, he talks about “half-strategies".... What does that mean ?

Do we have to create two trees one from hero side and a second one from the vilain side ?

If we do that, for exemple in the hero tree, do we have to evaluate CFR and modify the probability of the vilain to reach the next node ? So do we have to evaluate CFR on each node or only on the owner's tree information sets ?

Have a nice evening.


MrNice


Top
 Profile E-mail  
 
 Post subject: Re: Monte Carlo Sampling for Regret Minimization in Extensive Games
PostPosted: Thu Dec 30, 2010 9:16 pm 
Offline
Junior member
User avatar

Posts: 25
Favourite Bot: Polaris Orange
Hi MrNice,

A strategy is a set of action probabilities for every choice node / information set in the game tree for both players. A half-strategy is just a set of action probabilities for every choice node for one player.

On each iteration of CFR, you update both players' half-strategies. The easiest way to do that is to do it one player at a time. First, update the hero (leaving the villain unchanged), then update the villain (leaving the hero unchanged). When you're updating the regrets and probabilities for one of hero's information sets, you need to know the probability that the villain will take the actions in his tree that gets you to that point.

-- MBJ


Top
 Profile  
 
 Post subject: Re: CFRM
PostPosted: Thu Dec 30, 2010 9:26 pm 
Offline
Junior member
User avatar

Posts: 38
Favourite Bot: Mine :D
Hi MBJ,

Thank you for your help. I will try to build it with two trees.

Have you some remarks on how I have implemented the CFR calculation ?


Regarding the simuation, do I have to simulate each cards agains each cards ?


Does anybody has the results of CFR with one card poker to compare with my results ?

Thank you for your precious help.

MrNice


Top
 Profile E-mail  
 
 Post subject: Re: CFRM
PostPosted: Thu Dec 30, 2010 10:17 pm 
Offline
Junior member
User avatar

Posts: 25
Favourite Bot: Polaris Orange
Hi MrNice,

MrNice wrote:
Regarding the simuation, do I have to simulate each cards agains each cards ?


Do you mean for the card sampling on each iteration? That's how we do it, yes; we've found that "Poker-specific" CFR (from the CFR paper, which is the same as "Chance-Sampled CFR" from the MC-CFR paper) works best for poker games. On each CFR iteration, randomly deal out one hand for each of the players and the board, and then use those cards to determine the outcome of all of the chance nodes on that iteration.

Quote:
Does anybody has the results of CFR with one card poker to compare with my results ?


Instead of one-card poker, you might like to try using your implementation to solve Kuhn poker or Leduc Hold'em. Kuhn poker is well studied and you can do the math by hand to determine the set of equilibrium strategies, so you can verify that your code is working. Leduc Hold'em is a bit more of a real game, in that it has two rounds and a board card. The OpenCFR package solves Leduc Hold'em, too, so that's another way to verify that your code works: http://poker.cs.ualberta.ca/open_cfr.html

-- MBJ


Top
 Profile  
 
 Post subject: Re: CFRM
PostPosted: Fri Dec 31, 2010 1:31 am 
Offline
Senior member
User avatar

Posts: 100
Favourite Bot: LGC
MBJ wrote:
Hi MrNice,

MrNice wrote:
Regarding the simuation, do I have to simulate each cards agains each cards ?


Do you mean for the card sampling on each iteration? That's how we do it, yes; we've found that "Poker-specific" CFR (from the CFR paper, which is the same as "Chance-Sampled CFR" from the MC-CFR paper) works best for poker games. On each CFR iteration, randomly deal out one hand for each of the players and the board, and then use those cards to determine the outcome of all of the chance nodes on that iteration.

Quote:
Does anybody has the results of CFR with one card poker to compare with my results ?


Instead of one-card poker, you might like to try using your implementation to solve Kuhn poker or Leduc Hold'em. Kuhn poker is well studied and you can do the math by hand to determine the set of equilibrium strategies, so you can verify that your code is working. Leduc Hold'em is a bit more of a real game, in that it has two rounds and a board card. The OpenCFR package solves Leduc Hold'em, too, so that's another way to verify that your code works: http://poker.cs.ualberta.ca/open_cfr.html

-- MBJ


Talking of OpenCFR, In game.cpp we have

Code:
                                                                     
void deal_hand(int& hole0, int& hole1, int& board) {                           
  static int deck[6] = {0,0,1,1,2,2};                                           
                                                                               
  swap(deck[0], deck[1 + rand()%5]);                                           
  swap(deck[1], deck[2 + rand()%4]);                                           
  swap(deck[2], deck[3 + rand()%3]);                                           
                                                                               
  hole0 = deck[0];                                                             
  hole1 = deck[1];                                                             
  board = deck[2];                                                             
}           


Isn't this wrong? (first swap should be with deck[0 + rand()%6] and so on.)

Marv


Top
 Profile  
 
 Post subject: Re: CFRM
PostPosted: Fri Dec 31, 2010 10:28 am 
Offline
Junior member
User avatar

Posts: 25
Favourite Bot: Polaris Orange
Hi Marv,

ok|select wrote:
Talking of OpenCFR, In game.cpp we have

Code:
                                                                     
void deal_hand(int& hole0, int& hole1, int& board) {                           
  static int deck[6] = {0,0,1,1,2,2};                                           
                                                                               
  swap(deck[0], deck[1 + rand()%5]);                                           
  swap(deck[1], deck[2 + rand()%4]);                                           
  swap(deck[2], deck[3 + rand()%3]);                                           
                                                                               
  hole0 = deck[0];                                                             
  hole1 = deck[1];                                                             
  board = deck[2];                                                             
}           


Isn't this wrong? (first swap should be with deck[0 + rand()%6] and so on.)

Marv


You're right; that's a bug. The array is static, so it'll be biased towards the first player not having the same hand as on the previous iteration. We'll fix it - thanks.


Top
 Profile  
 
 Post subject: Re: CFRM
PostPosted: Fri Dec 31, 2010 10:54 am 
Offline
PokerAI fellow
User avatar

Posts: 1239
Favourite Bot: my bot
MrNice wrote:
Does anybody has the results of CFR with one card poker to compare with my results ?


http://lmgtfy.com/?q=one+card+poker&l=1


Top
 Profile E-mail  
 
 Post subject: Re: CFRM
PostPosted: Fri Dec 31, 2010 1:53 pm 
Offline
Junior member
User avatar

Posts: 38
Favourite Bot: Mine :D
Hi Spears,


I know this site but as I have read, the values haven't been computed with CFR but with equation resolution ... So I don't know if the results will be the same....

Any way thanks.


MrNice


Top
 Profile E-mail  
 
 Post subject: Re: CFRM
PostPosted: Sun Jan 02, 2011 6:18 am 
Offline
Junior member
User avatar

Posts: 25
Favourite Bot: Polaris Orange
MrNice wrote:
I know this site but as I have read, the values haven't been computed with CFR but with equation resolution ... So I don't know if the results will be the same....


Hi MrNice,

Even if those values had been generated with CFR, your implementation could be correct and still return a different strategy. Even small poker games usually have multiple equilibria, and since it's 2-player and zero sum, if there's more than one then there is an infinite number of them. CFR will find an equilibrium, but makes no guarantees on which one it will find. If you run CFR twice with different starting conditions - a different random number seed for generating the cards, for example - it can produce a different strategy that is also in equilibrium.

-- MBJ


Top
 Profile  
 
 Post subject: Re: CFRM
PostPosted: Tue Jan 04, 2011 9:46 pm 
Offline
Junior member
User avatar

Posts: 38
Favourite Bot: Mine :D
Hi MBJ,

thank you for your support :D greatly appreciated :D

I think I'm close to the solution but doesn't matter the value of the card, the proability of the RAISE at the first node is always 1... I would expect a different probability base on the initial card... No ?

A have another question regarding Ev.

At a terminal node, I evaluate Ev based on the action and on players's cards... For nodes previous to the terminal node, I will evaluate Ev as :

Code:
Ev=(CurrentNode->NextNode1->Ev * CurrentNode->ProbabilityAction1)+(CurrentNode->NextNode2->Ev * CurrentNode->ProbabilityAction2);


So at a upper level, I suppose that I will use the same formula to compute Ev and calculate CFR....


Is it correct ?


Thank you for your help.

MrNice


Top
 Profile E-mail  
 
 Post subject: Re: CFRM
PostPosted: Mon Aug 22, 2011 10:37 am 
Offline
New member
User avatar

Posts: 2
Favourite Bot: VexBot
Another Question about OpenCFR:

in "train.cpp" file there is a function named "update_regret". One of functions parameters is named "chance", but all through the code it's value is equals "1.0" and doesn't change.

What is the purpose of this parameter ?


Top
 Profile E-mail  
 
 Post subject: Re: CFRM
PostPosted: Wed Aug 31, 2011 4:38 pm 
Offline
New member
User avatar

Posts: 3
Favourite Bot: Behemoth
vengo wrote:
in "train.cpp" file there is a function named "update_regret". One of functions parameters is named "chance", but all through the code it's value is equals "1.0" and doesn't change.

What is the purpose of this parameter ?


If you were to use unsampled CFR, this parameter would be required. It's an artifact of the code being more sophisticated at one point.


Top
 Profile E-mail  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 16 posts ] 


Who is online

Users browsing this forum: Google and 8 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to: