MultiAgentDecisionProcess
Release 0.2.1
BayesianGameForDecPOMDPStage represents a Bayesian game (BG) for a single stage.
#include <BayesianGameForDecPOMDPStage.h>
Public Member Functions | |
BayesianGameForDecPOMDPStage (const PlanningUnitDecPOMDPDiscrete *pu, const QFunctionJAOHInterface *q, const PartialJointPolicyDiscretePure *pastJPol) | |
Constructor that creates and initializes a BG from scratch. | |
BayesianGameForDecPOMDPStage (const PlanningUnitDecPOMDPDiscrete *pu) | |
Constructor that creates an empty BG. | |
BayesianGameForDecPOMDPStage (const BayesianGameForDecPOMDPStage &a) | |
Copy constructor. | |
void | ClearAllImmediateRewards () |
Clears the cache of immediate rewards. | |
void | ComputeAllImmediateRewards () |
When performing a lot of GetImmediateReward calls we can first compute a cache of immediate rewards, to speed things up. | |
double | ComputeDiscountedImmediateRewardForJPol (JointPolicyDiscretePure *jpolBG) const |
Computes the discounted expected immediate reward for jpolBG. | |
double | GetImmediateReward (Index jtI, Index jaI) const |
Returns the (expected) immediate reward for jtI, jaI. | |
const PlanningUnitDecPOMDPDiscrete * | GetPUDecPOMDPDiscrete () const |
BayesianGameForDecPOMDPStage & | operator= (const BayesianGameForDecPOMDPStage &o) |
Copy assignment operator. | |
void | Print () const |
Print this BayesianGameIdenticalPayoff to cout. | |
std::string | SoftPrint () const |
Prints a description of this entire BayesianGameIdenticalPayoff to a string. | |
~BayesianGameForDecPOMDPStage () | |
Destructor. | |
Public Member Functions inherited from BayesianGameForDecPOMDPStageInterface | |
BayesianGameForDecPOMDPStageInterface (const PartialJointPolicyDiscretePure *pastJPol) | |
(default) Constructor | |
BayesianGameForDecPOMDPStageInterface (Index t) | |
const PartialJointPolicyDiscretePure * | GetPastJointPolicy () const |
Index | GetStage () const |
virtual | ~BayesianGameForDecPOMDPStageInterface () |
Destructor. | |
Public Member Functions inherited from BayesianGameIdenticalPayoff | |
BayesianGameIdenticalPayoff () | |
BayesianGameIdenticalPayoff (size_t nrAgents, const std::vector< size_t > &nrActions, const std::vector< size_t > &nrTypes) | |
double | GetUtility (const Index jtype, const Index ja) const |
Gets the utility (for all agents) for jtype, ja. | |
double | GetUtility (const std::vector< Index > &indTypeIndices, const std::vector< Index > &indActionIndices) const |
Gets the utility (for all agents) for the joint type corresponding to the individual type indices (indTypeIndices) and the joint action corresponding to the individual action indices (indActionIndices). | |
void | PrintUtilForJointType (Index jtype) const |
bool | SetInitialized (bool b) |
Sets the initialized status of the BG to b. | |
void | SetUtility (const Index jtype, const Index ja, const double u) |
Sets the utility (for all agents) for jtype, ja to u. | |
void | SetUtility (const std::vector< Index > &indTypeIndices, const std::vector< Index > &indActionIndices, const double u) |
Sets the utility (for all agents) for the joint type corresponding to the individual type indices (indTypeIndices) and the joint action corresponding to the individual action indices (indActionIndices) to u. | |
std::string | SoftPrintUtilForJointType (Index jtype) const |
Prints the utilities for jtype. | |
Public Member Functions inherited from BayesianGameIdenticalPayoffInterface | |
BayesianGameIdenticalPayoffInterface () | |
(default) Constructor | |
BayesianGameIdenticalPayoffInterface (size_t nrAgents, const std::vector< size_t > &nrActions, const std::vector< size_t > &nrTypes) | |
Public Member Functions inherited from BayesianGameBase | |
void | AddProbability (const Index i, const double p) |
Adds p to the probability of joint type i. | |
void | AddProbability (const std::vector< Index > &indIndices, const double p) |
Adds p to the probability of joint type corresponding to the individual type indices (indIndices). | |
virtual bool | AreCachedJointToIndivIndices (const PolicyGlobals::IndexDomainCategory pdc) const |
Check whether certain index conversions are cached. | |
BayesianGameBase () | |
BayesianGameBase (const size_t nrAgents, const std::vector< size_t > &nrActions, const std::vector< size_t > &nrTypes, int verboseness=0) | |
BayesianGameBase (const BayesianGameBase &a) | |
Copy constructor. | |
bool | CacheJointToIndivAOH_Indices () const |
bool | CacheJointToIndivOH_Indices () const |
bool | CacheJointToIndivType_Indices () const |
virtual PolicyGlobals::IndexDomainCategory | GetDefaultIndexDomCat () const |
Return the default IndexDomainCategory for the problem. | |
size_t | GetNrActions (Index agentI) const |
Get the number of individual actions of a particular agent. | |
const std::vector< size_t > & | GetNrActions () const |
size_t | GetNrAgents () const |
Get the number of agents (implements part of the Interface_ProblemToPolicyDiscrete interface). | |
size_t | GetNrJointActions () const |
Get the number of joint actions. | |
LIndex | GetNrJointPolicies () const |
size_t | GetNrJointTypes () const |
LIndex | GetNrPolicies (Index ag) const |
size_t | GetNrPolicyDomainElements (Index agentI, PolicyGlobals::IndexDomainCategory cat, size_t depth=MAXHORIZON) const |
Get the number of elements in the domain of an agent's policy. | |
const std::vector< size_t > & | GetNrTypes () const |
size_t | GetNrTypes (Index agI) const |
virtual double | GetProbability (const Index i) const |
Gets the probability of joint type i. | |
double | GetProbability (const std::vector< Index > &indIndices) |
Gets the probability of joint type corresponding to the individual type indices (indIndices) | |
Index | IndividualToJointActionIndices (const Index *IndArr) const |
Converts individual action indices to a joint action index. | |
Index | IndividualToJointActionIndices (const std::vector< Index > &indices) const |
Converts individual action indices to a joint action index. | |
Index | IndividualToJointTypeIndices (const std::vector< Index > &indices) const |
std::vector< Index > | JointToIndividualActionIndices (Index jaI) const |
std::vector< Index > | JointToIndividualPolicyDomainIndices (Index jdI, PolicyGlobals::IndexDomainCategory cat) const |
implementation of JointToIndividualPolicyDomainIndices | |
const std::vector< Index > & | JointToIndividualPolicyDomainIndicesRef (Index jdI, PolicyGlobals::IndexDomainCategory cat) const |
implementation of JointToIndividualPolicyDomainIndicesRef | |
const std::vector< Index > & | JointToIndividualTypeIndices (Index jTypeI) const |
void | PrintAction (Index agentI, Index actionI) const |
void | PrintPolicyDomain (Index agentI, Index typeIndex) const |
bool | SanityCheck () const |
virtual void | SanityCheck () |
void | SanityCheckBGBase () |
void | SetProbability (const Index i, const double p) |
Sets the probability of joint type i to p. | |
void | SetProbability (const std::vector< Index > &indIndices, const double p) |
Sets the probability of joint type corresponding to the individual type indices (indIndices) to p. | |
std::string | SoftPrintAction (Index agentI, Index actionI) const |
std::string | SoftPrintPolicyDomainElement (Index agentI, Index typeIndex, PolicyGlobals::IndexDomainCategory cat) const |
~BayesianGameBase () | |
Destructor. | |
Public Member Functions inherited from Interface_ProblemToPolicyDiscretePure | |
LIndex | GetNrJointPolicies (PolicyGlobals::IndexDomainCategory cat, size_t depth=MAXHORIZON) const |
Get the number of joint policies, given the policy's domain. | |
LIndex | GetNrPolicies (Index ag, PolicyGlobals::IndexDomainCategory cat, size_t depth=MAXHORIZON) const |
Get the number of policies for an agent, given the policy's domain. | |
virtual | ~Interface_ProblemToPolicyDiscretePure () |
Destructor. | |
Public Member Functions inherited from Interface_ProblemToPolicyDiscrete | |
Interface_ProblemToPolicyDiscrete () | |
(default) Constructor | |
virtual | ~Interface_ProblemToPolicyDiscrete () |
Destructor. |
Protected Member Functions | |
BayesianGameForDecPOMDPStage (const PlanningUnitDecPOMDPDiscrete *pu, const QFunctionJAOHInterface *q, Index t, size_t nrAgents, const std::vector< size_t > &nrActions, const std::vector< size_t > &nrTypes) | |
Constructor that only creates a BG of specified dimensions. | |
double | ComputeImmediateReward (Index jtI, Index jaI) const |
Compute the immediate reward for an action and joint type. | |
PartialJointPolicyDiscretePure * | ConstructExtendedPolicy (PartialJointPolicyDiscretePure &jpolPrevTs, JointPolicyDiscretePure &jpolBG, std::vector< size_t > &nrOHts, std::vector< Index > &firstOHtsI) |
Extends a previous policy jpolPrevTs to the next stage. | |
void | Fill_FirstOHtsI (Index ts, std::vector< Index > &firstOHtsI) |
Fills the (empty) vector firstOHtsI, with the indices (for each agent) of the first observation history of time step ts. | |
void | Fill_jaI_Array (Index ts, Index joIs[], const JointPolicyDiscretePure *jpolPrevTs, Index *jaI_arr) |
Fills the array jaI_arr with the joint actions taken for the JOHs as specified by the array of joint observations joIs according to jpolPrevTs. | |
void | Fill_joI_Array (const Index ts, const std::vector< Index > &indTypes, const std::vector< Index > &firstOHtsI, Index *joI_arr) |
Fills the array of joint observations given the individual types and offsets (firstOHtsI). | |
void | Initialize () |
Initializes the BG; called from the constructor. | |
void | ProbRewardForjoahI (Index ts, Index jtI, Index *jaI_arr, Index *joI_arr, Index &jaohI, double &PjaohI, double &ExpR_0_prevTS_thisJAOH) |
Calculates the jaohI corresponding to jaI_arr and joI_arr, and also returns P(jaohI) and the expected reward obtained in previous time steps given this joint action history. |
Protected Attributes | |
bool | _m_areCachedImmediateRewards |
are the immediate rewards cached? | |
std::vector< std::vector < double > > | _m_immR |
the cache for the immediate rewards: immR[jt][ja] | |
std::vector < JointBeliefInterface * > | _m_JBs |
The joint beliefs induced by the joint types. | |
const PlanningUnitDecPOMDPDiscrete * | _m_pu |
Stores pointer to the PU. | |
const QFunctionJAOHInterface * | _m_qHeuristic |
A pointer to the heuristic used by this Bayesian game (necessary?). | |
Protected Attributes inherited from BayesianGameForDecPOMDPStageInterface | |
const PartialJointPolicyDiscretePure * | _m_pJPol |
Stores pointer to the past policy - perhaps not needed? | |
Index | _m_t |
The stage (time step) that this BG represents. |
Additional Inherited Members | |
Static Public Member Functions inherited from BayesianGameIdenticalPayoff | |
static BayesianGameIdenticalPayoff | GenerateRandomBG (size_t nrAgents, std::vector< size_t > acs, std::vector< size_t > obs) |
Generates a random BG with identical payoffs. | |
static BayesianGameIdenticalPayoff | Load (std::string filename) |
Loads a BG from file. | |
static void | Save (const BayesianGameIdenticalPayoff &bg, std::string filename) |
BayesianGameForDecPOMDPStage represents a BG for a single stage.
Definition at line 47 of file BayesianGameForDecPOMDPStage.h.
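BayesianGameForDecPOMDPStage::BayesianGameForDecPOMDPStage (const PlanningUnitDecPOMDPDiscrete *pu, const QFunctionJAOHInterface *q, Index t, size_t nrAgents, const std::vector< size_t > &nrActions, const std::vector< size_t > &nrTypes) [protected]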
Constructor that only creates a BG of specified dimensions.
This constructor does not initialize the BG. This is useful when additional computations have to be done first. For example, when this is a clusterable BG (BayesianGameWithClusterInfo), the types do not correspond in a straightforward manner to action-observation histories, and therefore the probabilities and payoff function have to be computed at that higher level.
Definition at line 91 of file BayesianGameForDecPOMDPStage.cpp.
BayesianGameForDecPOMDPStage::BayesianGameForDecPOMDPStage (const PlanningUnitDecPOMDPDiscrete *pu, const QFunctionJAOHInterface *q, const PartialJointPolicyDiscretePure *pastJPol)
Constructor that creates and initializes a BG from scratch.
This constructor creates and initializes a BG for the next stage given the past policy and q function.
Definition at line 47 of file BayesianGameForDecPOMDPStage.cpp.
References Initialize().
BayesianGameForDecPOMDPStage::BayesianGameForDecPOMDPStage (const PlanningUnitDecPOMDPDiscrete *pu)
Constructor that creates an empty BG.
Definition at line 75 of file BayesianGameForDecPOMDPStage.cpp.
BayesianGameForDecPOMDPStage::BayesianGameForDecPOMDPStage (const BayesianGameForDecPOMDPStage &a)
Copy constructor.
Definition at line 117 of file BayesianGameForDecPOMDPStage.cpp.
BayesianGameForDecPOMDPStage::~BayesianGameForDecPOMDPStage ()
void BayesianGameForDecPOMDPStage::ClearAllImmediateRewards () [inline, virtual]
Clears the cache of immediate rewards.
Implements BayesianGameForDecPOMDPStageInterface.
Definition at line 188 of file BayesianGameForDecPOMDPStage.h.
References _m_areCachedImmediateRewards, and _m_immR.
Referenced by GMAA_MAAstar::ConstructAndValuateNextPolicies().
void BayesianGameForDecPOMDPStage::ComputeAllImmediateRewards () [virtual]
When performing a lot of GetImmediateReward calls we can first compute a cache of immediate rewards, to speed things up.
Implements BayesianGameForDecPOMDPStageInterface.
Definition at line 590 of file BayesianGameForDecPOMDPStage.cpp.
References _m_areCachedImmediateRewards, _m_immR, ComputeImmediateReward(), BayesianGameBase::GetNrJointActions(), and BayesianGameBase::GetNrJointTypes().
Referenced by GMAA_MAAstar::ConstructAndValuateNextPolicies().
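The caching behaviour described for ComputeAllImmediateRewards(), ClearAllImmediateRewards() and GetImmediateReward() can be sketched as follows. This is a minimal stand-alone illustration, not the MADP implementation: the class ImmediateRewardCache, its RewardFn callback and IsCached() are hypothetical names, and only the table layout immR[jt][ja] and the cached flag are taken from the member documentation.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for the caching members of the class; not MADP code.
class ImmediateRewardCache {
public:
    // Callback that computes the reward for one (joint type, joint action).
    using RewardFn = std::function<double(std::size_t, std::size_t)>;

    ImmediateRewardCache(std::size_t nrJointTypes, std::size_t nrJointActions,
                         RewardFn compute)
        : nrJT_(nrJointTypes), nrJA_(nrJointActions),
          compute_(std::move(compute)) {}

    // Fill immR_[jt][ja] once for every (joint type, joint action) pair.
    void ComputeAll() {
        immR_.assign(nrJT_, std::vector<double>(nrJA_, 0.0));
        for (std::size_t jt = 0; jt < nrJT_; ++jt)
            for (std::size_t ja = 0; ja < nrJA_; ++ja)
                immR_[jt][ja] = compute_(jt, ja);
        cached_ = true;
    }

    // Drop the table; later lookups compute the value on demand again.
    void ClearAll() {
        immR_.clear();
        cached_ = false;
    }

    // Serve from the table when it exists, otherwise compute directly.
    double Get(std::size_t jt, std::size_t ja) const {
        assert(jt < nrJT_ && ja < nrJA_);
        return cached_ ? immR_[jt][ja] : compute_(jt, ja);
    }

    bool IsCached() const { return cached_; }

private:
    std::size_t nrJT_, nrJA_;
    RewardFn compute_;
    bool cached_ = false;
    std::vector<std::vector<double>> immR_;
};
```

With this layout, filling the table costs one reward computation per (joint type, joint action) pair, which pays off only when many lookups follow.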
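double BayesianGameForDecPOMDPStage::ComputeDiscountedImmediateRewardForJPol (JointPolicyDiscretePure *jpolBG) const [virtual]
Computes the discounted expected immediate reward for jpolBG.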
Implements BayesianGameForDecPOMDPStageInterface.
Definition at line 622 of file BayesianGameForDecPOMDPStage.cpp.
References _m_pu, BayesianGameForDecPOMDPStageInterface::_m_t, PlanningUnitDecPOMDPDiscrete::GetDiscount(), GetImmediateReward(), JointPolicyDiscretePure::GetJointActionIndex(), BayesianGameBase::GetNrJointTypes(), and BayesianGameBase::GetProbability().
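Judging from the referenced members (the discount, the stage _m_t, the joint type probabilities and GetImmediateReward()), the computation is presumably a probability-weighted sum over joint types, scaled by the discount raised to the stage. The sketch below makes that assumption explicit; DiscountedImmediateReward and all of its parameters are hypothetical stand-ins, not MADP API.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical free function; every parameter is an illustrative stand-in.
//   jtProb[jt]    probability of joint type jt
//   jaForType[jt] joint action the BG policy selects for joint type jt
//   immR[jt][ja]  expected immediate reward for (jt, ja)
//   discount      discount factor of the problem
//   stage         stage t represented by the BG
// Assumption: the result is discount^stage * sum_jt P(jt) * R(jt, policy(jt)).
double DiscountedImmediateReward(const std::vector<double>& jtProb,
                                 const std::vector<std::size_t>& jaForType,
                                 const std::vector<std::vector<double>>& immR,
                                 double discount, unsigned stage) {
    assert(jtProb.size() == jaForType.size() && jtProb.size() == immR.size());
    double expected = 0.0;
    for (std::size_t jt = 0; jt < jtProb.size(); ++jt)
        expected += jtProb[jt] * immR[jt][jaForType[jt]];
    return std::pow(discount, static_cast<double>(stage)) * expected;
}
```

For example, with probabilities {0.25, 0.75}, selected joint actions {1, 0}, rewards {{1, 2}, {4, 8}}, discount 0.5 and stage 2, the expected reward is 3.5 and the discounted value is 0.875.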
double BayesianGameForDecPOMDPStage::ComputeImmediateReward (Index jtI, Index jaI) const [protected]
Compute the immediate reward for an action and joint type.
Definition at line 607 of file BayesianGameForDecPOMDPStage.cpp.
References _m_JBs, _m_pu, BeliefInterface::GetIterator(), BeliefIteratorGeneric::GetProbability(), PlanningUnitDecPOMDPDiscrete::GetReward(), BeliefIteratorGeneric::GetStateIndex(), and BeliefIteratorGeneric::Next().
Referenced by ComputeAllImmediateRewards(), and GetImmediateReward().
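The references above (a belief iterator yielding state indices and probabilities, plus a reward lookup) suggest that the immediate reward is the expectation of the state-based reward under the joint belief induced by the joint type. A minimal sketch under that assumption follows; SparseBelief and ExpectedReward are hypothetical names, and the reward table reward[s][ja] stands in for the planning unit's reward model.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sparse joint belief: (state index, probability) pairs.
using SparseBelief = std::vector<std::pair<std::size_t, double>>;

// Expected reward of joint action ja under the given belief:
//   sum over states s of b(s) * reward[s][ja].
double ExpectedReward(const SparseBelief& belief,
                      const std::vector<std::vector<double>>& reward,
                      std::size_t ja) {
    double r = 0.0;
    for (const auto& entry : belief) {
        const std::size_t s = entry.first;   // state index
        const double p = entry.second;       // belief probability of s
        assert(s < reward.size() && ja < reward[s].size());
        r += p * reward[s][ja];
    }
    return r;
}
```

Iterating only over the states with non-zero probability, as a belief iterator would, keeps the cost proportional to the support of the belief rather than to the number of states.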
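PartialJointPolicyDiscretePure * BayesianGameForDecPOMDPStage::ConstructExtendedPolicy (PartialJointPolicyDiscretePure &jpolPrevTs, JointPolicyDiscretePure &jpolBG, std::vector< size_t > &nrOHts, std::vector< Index > &firstOHtsI) [protected]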
Extends a previous policy jpolPrevTs to the next stage.
This function extends a previous policy jpolPrevTs for ts-1 with the behavior specified by the policy of the BayesianGame for time step ts (jpolBG).
jpolPrevTs - a joint policy for the DecPOMDP up to time step ts-1 (i.e. with depth=ts-2).
jpolBG - a joint policy for the BayesianGame for time step ts.
nrOHts - a vector that specifies the number of observation histories for each agent at time step ts.
firstOHtsI - a vector that specifies the index of the first time step ts observation history for each agent (this serves as the offset in the BG->DecPOMDP index conversion).
Returns a newly allocated policy (declared return type: PartialJointPolicyDiscretePure *), which the caller must explicitly delete.
Definition at line 190 of file BayesianGameForDecPOMDPStage.cpp.
References JointPolicyDiscretePure::GetActionIndex(), JointPolicy::GetDepth(), JointPolicyDiscrete::GetIndexDomainCategory(), BayesianGameBase::GetNrAgents(), PolicyGlobals::OHIST_INDEX, JointPolicyDiscretePure::SetAction(), JointPolicy::SetDepth(), and PolicyGlobals::TYPE_INDEX.
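The role of firstOHtsI as an offset can be illustrated with a small sketch. It assumes, beyond what the text states, that an agent's types at stage ts are enumerated in the same order as its observation histories of that stage, so that a type index plus the offset gives the DecPOMDP observation history index. TypesToObservationHistoryIndices is a hypothetical helper, not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: convert each agent's BG type index at stage ts into
// a DecPOMDP observation history index by adding that agent's offset.
//   indTypes[i]   type index of agent i in the BG
//   firstOHtsI[i] index of agent i's first observation history at stage ts
//   nrOHts[i]     number of observation histories of agent i at stage ts
std::vector<std::size_t> TypesToObservationHistoryIndices(
        const std::vector<std::size_t>& indTypes,
        const std::vector<std::size_t>& firstOHtsI,
        const std::vector<std::size_t>& nrOHts) {
    assert(indTypes.size() == firstOHtsI.size());
    assert(indTypes.size() == nrOHts.size());
    std::vector<std::size_t> ohI(indTypes.size());
    for (std::size_t i = 0; i < indTypes.size(); ++i) {
        assert(indTypes[i] < nrOHts[i]);  // type must exist at this stage
        ohI[i] = firstOHtsI[i] + indTypes[i];
    }
    return ohI;
}
```

Extending the previous policy then amounts to assigning, for every agent and every type, the BG policy's action to the observation history index obtained this way.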
void BayesianGameForDecPOMDPStage::Fill_FirstOHtsI (Index ts, std::vector< Index > &firstOHtsI) [protected]
Fills the (empty) vector firstOHtsI, with the indices (for each agent) of the first observation history of time step ts.
Definition at line 169 of file BayesianGameForDecPOMDPStage.cpp.
References _m_pu, PlanningUnitMADPDiscrete::GetFirstObservationHistoryIndex(), and BayesianGameBase::GetNrAgents().
Referenced by Initialize().
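In the class itself the first index is obtained from the planning unit. To give an idea of what such an index looks like, the sketch below assumes a breadth-first numbering of observation histories (the empty history has index 0, followed by all histories of length 1, then length 2, and so on). That numbering is an assumption made for illustration only, and FirstObservationHistoryIndex is a hypothetical function.

```cpp
#include <cassert>
#include <cstddef>

// Assumption (not stated in the source): an agent's observation histories
// are numbered breadth-first, so all histories shorter than ts precede the
// first history of time step ts. With nrObservations individual
// observations there are nrObservations^k histories of length k.
std::size_t FirstObservationHistoryIndex(std::size_t nrObservations,
                                         unsigned ts) {
    assert(nrObservations > 0);
    std::size_t first = 0;      // number of histories shorter than ts
    std::size_t levelSize = 1;  // number of histories of the current length
    for (unsigned t = 0; t < ts; ++t) {
        first += levelSize;
        levelSize *= nrObservations;
    }
    return first;
}
```

Under this assumption an agent with 2 observations has first indices 0, 1, 3 and 7 for time steps 0 to 3.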
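void BayesianGameForDecPOMDPStage::Fill_jaI_Array (Index ts, Index joIs[], const JointPolicyDiscretePure *jpolPrevTs, Index *jaI_arr) [protected]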
Fills the array jaI_arr with the joint actions taken for the JOHs as specified by the array of joint observations joIs according to jpolPrevTs.
Definition at line 246 of file BayesianGameForDecPOMDPStage.cpp.
References _m_pu, JointPolicyDiscretePure::GetJointActionIndex(), and PlanningUnitMADPDiscrete::GetSuccessorJOHI().
Referenced by Initialize().
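void BayesianGameForDecPOMDPStage::Fill_joI_Array (const Index ts, const std::vector< Index > &indTypes, const std::vector< Index > &firstOHtsI, Index *joI_arr) [protected]
Fills the array of joint observations given the individual types and offsets (firstOHtsI).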
Definition at line 220 of file BayesianGameForDecPOMDPStage.cpp.
References _m_pu, BayesianGameBase::GetNrAgents(), PlanningUnitMADPDiscrete::GetObservationHistoryArrays(), and PlanningUnitMADPDiscrete::IndividualToJointObservationIndices().
Referenced by Initialize().
double BayesianGameForDecPOMDPStage::GetImmediateReward (Index jtI, Index jaI) const [inline, virtual]
Returns the (expected) immediate reward for jtI, jaI.
Implements BayesianGameForDecPOMDPStageInterface.
Definition at line 175 of file BayesianGameForDecPOMDPStage.h.
References _m_areCachedImmediateRewards, _m_immR, and ComputeImmediateReward().
Referenced by ComputeDiscountedImmediateRewardForJPol(), and GMAA_MAAstar::ConstructAndValuateNextPolicies().
const PlanningUnitDecPOMDPDiscrete * BayesianGameForDecPOMDPStage::GetPUDecPOMDPDiscrete () const [inline]
Definition at line 200 of file BayesianGameForDecPOMDPStage.h.
References _m_pu.
void BayesianGameForDecPOMDPStage::Initialize () [protected]
Initializes the BG; called from the constructor.
Given the past policy and q function, the probabilities and utility function are initialized.
Definition at line 472 of file BayesianGameForDecPOMDPStage.cpp.
References _m_JBs, BayesianGameForDecPOMDPStageInterface::_m_pJPol, _m_pu, _m_qHeuristic, BayesianGameForDecPOMDPStageInterface::_m_t, DEBUG_BG4DECPOMDP1, DEBUG_BG4DECPOMDP2, Fill_FirstOHtsI(), Fill_jaI_Array(), Fill_joI_Array(), PlanningUnitMADPDiscrete::GetJAOHProbsRecursively(), PlanningUnitMADPDiscrete::GetJointActionObservationHistoryIndex(), PlanningUnitMADPDiscrete::GetNewJointBeliefFromISD(), BayesianGameBase::GetNrJointActions(), PlanningUnitMADPDiscrete::GetNrJointObservationHistories(), QFunctionJAOHInterface::GetQ(), BayesianGameBase::JointToIndividualTypeIndices(), PrintTools::PrintProgress(), BayesianGameBase::SetProbability(), BayesianGameIdenticalPayoff::SetUtility(), and JointPolicyDiscretePure::SoftPrintBrief().
Referenced by BayesianGameForDecPOMDPStage().
BayesianGameForDecPOMDPStage & BayesianGameForDecPOMDPStage::operator= (const BayesianGameForDecPOMDPStage &o)
Copy assignment operator.
Definition at line 151 of file BayesianGameForDecPOMDPStage.cpp.
References _m_areCachedImmediateRewards, _m_immR, _m_pu, and _m_qHeuristic.
void BayesianGameForDecPOMDPStage::Print () const [inline, virtual]
Print this BayesianGameIdenticalPayoff to cout.
Reimplemented from BayesianGameIdenticalPayoff.
Definition at line 207 of file BayesianGameForDecPOMDPStage.h.
References SoftPrint().
Referenced by GMAA_MAAstar::ConstructAndValuateNextPolicies().
void BayesianGameForDecPOMDPStage::ProbRewardForjoahI (Index ts, Index jtI, Index *jaI_arr, Index *joI_arr, Index &jaohI, double &PjaohI, double &ExpR_0_prevTS_thisJAOH) [protected]
Calculates the jaohI corresponding to jaI_arr and joI_arr, and also returns P(jaohI) and the expected reward obtained in previous time steps given this joint action history.
Input arguments: Index ts, Index jtI, Index* jaI_arr, Index* joI_arr.
Output arguments: Index& jaohI, double& PjaohI, double& ExpR_0_prevTS_thisJAOH.
Basically, this function is a form of PlanningUnitMADPDiscrete::GetJAOHProbs(Recursively) that also computes the reward.
Definition at line 278 of file BayesianGameForDecPOMDPStage.cpp.
References _m_pu, DEBUG_BG4DECPOMDP4, BeliefInterface::Get(), TreeNode< Tcontained >::GetIndex(), BeliefInterface::GetIterator(), JointActionObservationHistoryTree::GetJointActionObservationHistory(), PlanningUnitMADPDiscrete::GetJointActionObservationHistoryTree(), PlanningUnitMADPDiscrete::GetNewJointBeliefFromISD(), PlanningUnitMADPDiscrete::GetNrStates(), BeliefIteratorGeneric::GetProbability(), PlanningUnitDecPOMDPDiscrete::GetReferred(), PlanningUnitDecPOMDPDiscrete::GetReward(), BeliefIteratorGeneric::GetStateIndex(), JointActionObservationHistoryTree::GetSuccessor(), BeliefIteratorGeneric::Next(), JointActionObservationHistory::Print(), and JointBeliefInterface::Update().
std::string BayesianGameForDecPOMDPStage::SoftPrint () const [virtual]
Prints a description of this entire BayesianGameIdenticalPayoff to a string.
Reimplemented from BayesianGameIdenticalPayoff.
Definition at line 642 of file BayesianGameForDecPOMDPStage.cpp.
References BayesianGameForDecPOMDPStageInterface::_m_pJPol, BayesianGameForDecPOMDPStageInterface::_m_t, and JointPolicyDiscretePure::SoftPrint().
Referenced by Print().
bool BayesianGameForDecPOMDPStage::_m_areCachedImmediateRewards [protected]
are the immediate rewards cached?
Definition at line 64 of file BayesianGameForDecPOMDPStage.h.
Referenced by ClearAllImmediateRewards(), ComputeAllImmediateRewards(), GetImmediateReward(), and operator=().
std::vector< std::vector< double > > BayesianGameForDecPOMDPStage::_m_immR [protected]
the cache for the immediate rewards: immR[jt][ja]
Definition at line 66 of file BayesianGameForDecPOMDPStage.h.
Referenced by ClearAllImmediateRewards(), ComputeAllImmediateRewards(), GetImmediateReward(), and operator=().
std::vector< JointBeliefInterface * > BayesianGameForDecPOMDPStage::_m_JBs [protected]
The joint beliefs induced by the joint types.
Definition at line 62 of file BayesianGameForDecPOMDPStage.h.
Referenced by ComputeImmediateReward(), Initialize(), and ~BayesianGameForDecPOMDPStage().
const PlanningUnitDecPOMDPDiscrete * BayesianGameForDecPOMDPStage::_m_pu [protected]
Stores pointer to the PU.
Definition at line 58 of file BayesianGameForDecPOMDPStage.h.
Referenced by ComputeDiscountedImmediateRewardForJPol(), ComputeImmediateReward(), Fill_FirstOHtsI(), Fill_jaI_Array(), Fill_joI_Array(), GetPUDecPOMDPDiscrete(), Initialize(), operator=(), and ProbRewardForjoahI().
const QFunctionJAOHInterface * BayesianGameForDecPOMDPStage::_m_qHeuristic [protected]
A pointer to the heuristic used by this Bayesian game (necessary?).
Definition at line 60 of file BayesianGameForDecPOMDPStage.h.
Referenced by Initialize(), and operator=().