# example.dpomdp
# -----------
# Frans Oliehoek - faolieho@science.uva.nl
# 2006-08-03
#
#
# This is a Dec-POMDP (.dpomdp) test file, which demonstrates most syntactic
# constructs.
# Non-demonstrated constructs are mentioned in the comments; therefore this
# document also serves as documentation for the .dpomdp file format.
#
# First we note a few important things:
#
# 1) the file format is case-sensitive
# 2) the first entries in this file are:
#        agents
#        discount
#        values
#        states
#        start
#        actions
#        observations
#    These entries should *all* be present exactly *once* and in the above *order*.
# 3) Other inconsistencies with Cassandra's format are mentioned throughout
#    this file.
# 4) The file is 'newline sensitive': new lines should start at the correct
#    places!
#    (i.e. no empty lines are allowed! If you want blank space, use a comment
#    line, like this:
#
#    )
# 5) Comments should start on a new line.
# 6) Identifiers consist of a letter followed by alphanumerics, '-' and '_'.
# 7) As a general rule of thumb: a single number (int or real) is placed on
#    the same line, while multiple numbers (a vector) or matrices start on a
#    new line. The keywords "uniform" and "identity" represent vectors/matrices
#    and thus are also placed on a new line. (An example follows after this
#    list.)
# 8) Unlike Cassandra's format, at the end of transition, observation and
#    reward specifications we also require a colon right before the number.
#    I.e.:
#     R: open-right open-right : tiger-left : * : * : 20
#    This is because in
#     R: open-right open-right : tiger-left : * : * 20
#    the 20 could also indicate the 20th individual observation of agent 2.
#
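#For example (rule 7), a single number stays on the keyword's line:
# discount: 0.9
#while a vector starts on the following line:
# start:
# 0.5 0.5
#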
# Alright, here we go!
#
#The agents.
#----------
#Either 1) the number of agents:
# agents: %d
#or 2) a list of agent identifiers, e.g.:
# agents: agent1_name name-of-agent2 ... //TODO: this is not implemented yet!
#agents: agent1 the_second-agent
agents: 2
#
#the discount factor.
#-------------------
#As this is dependent on the planning horizon (it is typically
#1.0 for finite-horizon problems), it is more a property of the planning
#problem than of the Dec-POMDP itself. Nevertheless we include it to be as
#compatible with the Cassandra and UMass formats as possible.
# discount: %f
discount: 1.0
#
#values
#------
#The reward type: reward or cost.
# values: [ reward, cost ]
values: reward
#
#the states declaration
#-----------------------
#the number of states or a state list
# states: [ %d, <list of states> ]
states: state-one s2
#
#The initial state distribution
#------------------------------
#NOTE: unlike Cassandra's format, this is not optional, and its position is
#BEFORE the specification of the actions and observations.
#There are 4 ways to specify the starting distribution:
# * enumerate the probabilities for each state,
# * specify a single starting state,
# * give a uniform distribution over states, or
# * give a uniform distribution over a subset of states.
#
#In the last case, you can either specify a list of states to be included,
#or a list of states to be excluded.
#Examples of this are:
# start:
# 0.3 0.1 0.0 0.2 0.4
#or
# start:
# uniform
#or
# start: first-state
#or
# start: 5
#or
# start include: first-state third-state
#or
# start include: 1 3 indices-mixed-with-names s2 4
#or
# start exclude: fifth-state seventh-state
start exclude: s2
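#(With only two states declared, excluding s2 leaves a uniform distribution
#over the remaining states; i.e., this problem always starts in state-one.)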
#start:
#0.3 0.1
#
#The actions declarations
#------------------------
#the (number/list of) actions for each of the agents on a separate line
# actions:
# [ %d, <list of actions> ]
# [ %d, <list of actions> ]
# ...
# [ %d, <list of actions> ]
#
# e.g. 3 named actions for agent 1 and 2 unnamed actions for agent 2:
actions:
agent1-a1 a12 a13
2
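#(Agent 2's actions are unnamed; they can only be referred to by their indices
#0 and 1, or through the '*' wildcard.)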
#the (number/list of) observations for each of the agents on a separate line
# observations:
# [ %d, <list of observations> ]
# [ %d, <list of observations> ]
# ...
# [ %d, <list of observations> ]
observations:
2
hear-left hear-right
#Joint actions
#-------------
#Although not explicitly specified, joint actions are an important part of the
#model. Their syntax is explained here. There are (will be) 3 ways to denote a
#joint action:
#
# 1) by index
# In accordance with C++ conventions, indices start at 0.
# Because, earlier in this file, only individual actions are specified, we use
# a convention as to how the joint actions (and joint observations) are
# enumerated: the index of the last agent is incremented first:
#
#    jaI       first agent <- individual indices -> last agent
#     0           0    0   .....    0       0
#     1           0    0   .....    0       1
#    ...
#  |A_n|-1        0    0   .....    0    |A_n|-1
#
# (A worked enumeration for this file is given below. Note that this ordering
# is enforced by the way joint observation objects are created in the
# implementation. In particular see
# MADPComponentDiscreteObservations::ConstructJointObservations )
#
# 2) by individual action components
#
# For now, a joint action is specified by its individual action components;
# these can be indices (again, starting at 0), names, or combinations thereof.
# E.g., following the action specification above, a valid joint action would be
# "a12 1".
# It is also allowed to use the '*' wildcard. E.g., "a12 *" denotes all
# joint actions in which the first agent performs action a12.
#
# 3) the wild-card
#It is also possible to use the joint action "*", denoting all joint actions.
#
#
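#For example, following the convention above, the 3 actions of agent 1 and the
#2 actions of agent 2 declared earlier yield 3 x 2 = 6 joint actions:
#
#    jaI    agent 1      agent 2
#     0     agent1-a1       0
#     1     agent1-a1       1
#     2     a12             0
#     3     a12             1
#     4     a13             0
#     5     a13             1
#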
#Joint Observations
#------------------
#Joint observations use the same syntax as joint actions.
#
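#Assuming the same enumeration convention, the 2 unnamed observations of
#agent 1 and the hear-left/hear-right observations of agent 2 yield 4 joint
#observations, indexed 0-3 as: (0 hear-left), (0 hear-right), (1 hear-left),
#(1 hear-right).
#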
#Transition probabilities
#------------------------
#This explains the syntax for specifying transition probabilities. In the
#below, <a1 a2...an> denotes a joint action, which can be specified using the
#syntax explained above.
#
# NOTE: in contrast to Tony Cassandra's POMDP file format, we also require a
# colon after the end-state.
#
# T: <a1 a2...an> : <start-state> : <end-state> : %f
#or
# T: <a1 a2...an> : <start-state> :
# %f %f ... %f        P(s_1'|ja,s) ... P(s_k'|ja,s)
#or
# T: <a1 a2...an> :   <- this is a |S| x |S| matrix
# %f %f ... %f        P(s_1'|ja,s_1) ... P(s_k'|ja,s_1)
# %f %f ... %f        ...
# ...                 ...
# %f %f ... %f        P(s_1'|ja,s_k) ... P(s_k'|ja,s_k)
#or
# T: <a1 a2...an> :
# [ identity, uniform ]
#
#T: * :
#uniform
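#The entries below exercise all of these forms; note that later entries
#overwrite earlier ones.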
T: * :
uniform
T: a12 1 : s2 :
0.4 0.6
#T: 1 2 :
#identity
T: 1 2 :
0.2 0.8
0.6 0.4
T: * * : * : * : 0.5
T: 1 1 : 0 : 1 : 0.33333
T: 1 1 : 0 : 0 : 0.66667
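#Note: in "T: 1 :", "T: 2 :" and "T: 3 :" below, the joint action is given by
#a single joint action index (way 1 above), not by an individual action index.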
T: 1 : 0 :
0.123 0.877
T: 0 0 : 0 :
0.11111 0.88889
T: 2 :
identity
T: 3 :
0.2222 0.7778
0.2244 0.7756
#Observation probabilities
#-------------------------
# O: <a1 a2...an> : <end-state> : <o1 o2 ... om> : %f
#or
# O: <a1 a2...an> : <end-state> :
# %f %f ... %f        P(jo_1|ja,s') ... P(jo_x|ja,s')
#or
# O: <a1 a2...an> :   <- a |S| x |JO| matrix
# %f %f ... %f        P(jo_1|ja,s_1') ... P(jo_x|ja,s_1')
# %f %f ... %f        ...
# ...                 ...
# %f %f ... %f        P(jo_1|ja,s_k') ... P(jo_x|ja,s_k')
#This uses the fact that earlier specified entries can be overwritten; this IS
#explicitly mentioned by Cassandra (but not by UMass) and is very convenient.
#TODO: the current format does not allow for independent transitions and
#observations - this might also be convenient...
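#For example, "O: * * :" with "uniform" below first sets all observation
#probabilities, after which the subsequent entries overwrite specific values.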
O: * * :
uniform
O: a12 1 : state-one : * * : 0.002
O: a12 1 : state-one : 0 hear-left : 0.7225
O: 0 0 : * : * : 0.9
O: 0 1 : s2 : 0 hear-right : 0.1275
O: 0 1 : 0 : 1 hear-left : 0.1275
O: 1 0 : 1 : * hear-right : 0.0225
O: 1 1 : 1 : 1 * : 0.7225
O: 2 0 : 0 : 0 hear-right : 0.1275
O: 2 1 : 0 : 0 0 : 0.1275
O: 2 1 : 1 : * : 0.0225
O: * 1 :
uniform
O: a12 1 : s2 :
0.1 0.1 0.2 0.6
O: 1 2 :
0.1 0.8 0.05 0.05
0.6 0.3 0.06 0.04
#The rewards
#-----------
# R: <a1 a2...an> : <start-state> : <end-state> : <o1 o2 ... om> : %f
#or
# R: <a1 a2...an> : <start-state> : <end-state> :
# %f %f ... %f
#or
# R: <a1 a2...an> : <start-state> :
# %f %f ... %f
# %f %f ... %f
# ...
# %f %f ... %f
#
#Here the matrix has nrStates rows and nrJointObservations columns.
#
#Typical problems only use R(s,ja), which is specified by:
# R: <a1 a2...an> : <start-state> : * : * : %f
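#Examples follow; the first two use the vector and matrix forms, and the
#remaining entries specify individual R(s,ja) values: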
R: * : 1 : 3 :
4.3 34.2 253 12
R: * : s2 :
4.3 34.2 253 12
5.3 2352 2.3 1.2
R: a12 0 : * : * : * : -2
R: 0 0 : 0 : * : * : -50
R: 0 1 : 1 : * : * : +20
R: 0 2 : 1 : * : * : 20
R: 1 0 : 0 : * : * : -100
R: 1 1 : 1 : * : * : -10
R: 1 2 : 0 : * : * : -101
R: 2 0 : 1 : * : * : -101
R: 2 1 : 1 : * : * : 9