# Frans Oliehoek - faolieho@science.uva.nl
# This is a Dec-POMDP (.dpomdp) test file, which demonstrates most syntactic
# constructs of the format. Non-demonstrated constructs are mentioned in the
# comments, therefore this document also serves as documentation for the
# .dpomdp file format.
# First we note a few important things:
#
# 1) the file format is case-sensitive
# 2) the first entries in this file are:
#    agents, discount, values, states, start, actions and observations.
#    These entries should *all* be present exactly *once* and in the above
#    *order*.
# 3) Other inconsistencies with cassandra's format are mentioned throughout
# 4) The file is 'newline sensitive': new lines should start at the correct
#    places (i.e. no empty lines allowed! if you want space, do like this:
#
#    )
# 5) Comments should start on a new-line.
# 6) Identifiers consist of a letter followed by alphanumerics and '-' and '_'
# 7) as a general rule of thumb: 1 number (int or real) is placed on the same
#    line; multiple numbers (a vector) or matrices start on a new line.
#    The keywords "uniform" and "identity" represent vectors/matrices and
#    thus are also placed on a new line.
# 8) Unlike cassandra's format, at the end of transition, observation and
#    reward specifications, right before the number we also require a colon.
#E.g., with the colon:
# R: open-right open-right : tiger-left : * : * : 20
#is unambiguous, while without it:
# R: open-right open-right : tiger-left : * : * 20
#the 20 could also indicate the 20th individual observation of agent 2.
# Alright, here we go!
#the agents declaration
#----------------------
#Either 1) the number of agents:
agents: 2
#or 2) a list of agent identifiers, e.g.:
# agents: agent1_name name-of-agent2 ... //TODO: this is not implemented yet!
#agents: agent1 the_second-agent
#the discount declaration
#------------------------
discount: 1.0
#As this is dependent on the planning horizon (typically
#1.0 for finite horizon problems) this is more a property of the planning
#problem than one of the dec-pomdps. Nevertheless we include it to be as
#compatible with cassandra and umass as possible.
#the value type declaration
#--------------------------
#whether the values denote rewards or costs:
# values: [ reward, cost ]
values: reward
#the states declaration
#-----------------------
#the number of states or a state list
# states: [ %d, <list of states> ]
#The initial state distribution
#------------------------------
#NOTE: unlike cassandra, this is not optional and the position is BEFORE the
#specification of the actions and observations.
#There are 4 ways to specify the starting distribution:
# * enumerate the probabilities for each state,
# * specify a single starting state,
# * give a uniform distribution over states, or
# * give a uniform distribution over a subset of states.
#
#In the last case, you can either specify a list of states to be included,
#or a list of states to be excluded.
#Examples of this are:
# start include: first-state third-state
# start include: 1 3 indices-mixed-with-names s2 4
# start exclude: fifth-state seventh-state
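#The other three forms are not demonstrated above; they would presumably look
#like this (the probabilities, names and number of states are illustrative
#only, and per rule 7 above a vector and the keyword "uniform" start on a
#new line):
# start:
# 0.4 0.1 0.25 0.25
# start: first-state
# start:
# uniform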
#The actions declarations
#------------------------
#the (number/list of) actions for each of the agents, each on a separate line:
# actions:
# [ %d, <list of actions> ]
# [ %d, <list of actions> ]
# ...
# [ %d, <list of actions> ]
# e.g. 3 named actions for agent 1 - 2 unnamed actions for agent 2:
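#Such a declaration would presumably look like this (a sketch; the names a11,
#a12, a13 match the action names used further below):
# actions:
# a11 a12 a13
# 2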
#The observations declarations
#-----------------------------
#the (number/list of) observations for each of the agents on a separate line:
# observations:
# [ %d, <list of observations> ]
# [ %d, <list of observations> ]
# ...
# [ %d, <list of observations> ]
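#A declaration matching the observation names used further below would
#presumably look like this (a sketch; both agents are assumed to have the
#same two observations):
# observations:
# hear-left hear-right
# hear-left hear-right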
#Joint actions and observations
#------------------------------
#Although not explicitly specified, joint actions are an important part of the
#model. Their syntax is explained here. There are (will be) 3 ways to denote a
#joint action:
# 1) by its joint index
# In accordance with C++ conventions, indices start at 0.
# Because, earlier in this file, only individual actions are specified, we use
# a convention as to how the joint actions (and joint observations) are
# ordered:
#  jaI      first agent <- individual indices -> last agent
#  0          0     0    ...    0     0
#  1          0     0    ...    0     1
#  ...
#  |O_n|-1    0     0    ...    0     |O_n|-1
#
# (note that this ordering is enforced by the way joint observation objects
# are created in the implementation. In particular see
# MADPComponentDiscreteObservations::ConstructJointObservations )
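#For example, if agent 1 has 3 individual actions and agent 2 has 2, the 6
#joint actions are enumerated (0,0), (0,1), (1,0), (1,1), (2,0), (2,1), so
#joint action index 3 would denote the individual pair (1,1) (a worked
#example, assuming the ordering of the table above in which the last agent's
#index varies fastest).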
# 2) by individual action components
#
# For now a joint action is specified by its individual action components,
# which can be indices (again, starting at 0), names or combinations thereof.
# E.g., following the action specification above, a valid joint action would
# be "a12 0".
# It is also allowed to use the '*' wildcard. E.g., "a12 *" would denote all
# joint actions in which the first agent performs action a12.
#
#It is also possible to use the joint action "*", denoting all joint actions.
#Joint observations use the same syntax as joint actions.
#Transition probabilities
#------------------------
#This explains the syntax of specifying transition probabilities. In the
#below, <a1 a2...an> denotes a joint action, which can be specified using the
#syntax explained above.
#
# NOTE: in contrast to Tony's pomdp file format, we also require a colon
# right before the number(s) (see point 8 above).
#
# T: <a1 a2...an> : <start-state> : <end-state> : %f
#or
# T: <a1 a2...an> : <start-state> :
# %f %f ... %f          P(s_1'|ja,s) ... P(s_k'|ja,s)
#or
# T: <a1 a2...an> :                     - a |S| x |S| matrix
# %f %f ... %f          P(s_1'|ja,s_1) ... P(s_k'|ja,s_1)
# ...
# %f %f ... %f          P(s_1'|ja,s_k) ... P(s_k'|ja,s_k)
#
#Instead of a full matrix, one of the following keywords can also be used:
# [ identity, uniform ]
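#For instance, uniform transitions for all joint actions could presumably be
#written as follows (a sketch combining the '*' wildcard with the keyword
#form; per rule 7 the keyword starts on a new line):
# T: * * :
# uniform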
T: 1 1 : 0 : 1 : 0.33334
T: 1 1 : 0 : 0 : 0.66666
#Observation probabilities
#-------------------------
#Observations are specified analogously to transitions:
# O: <a1 a2...an> : <end-state> : <o1 o2 ... om> : %f
#or
# O: <a1 a2...an> : <end-state> :
# %f %f ... %f          P(jo_1|ja,s') ... P(jo_x|ja,s')
#or
# O: <a1 a2...an> :                     - a |S| x |JO| matrix
# %f %f ... %f          P(jo_1|ja,s_1') ... P(jo_x|ja,s_1')
# ...
# %f %f ... %f          P(jo_1|ja,s_k') ... P(jo_x|ja,s_k')
#
#The entries below use the fact that earlier specified entries can be
#overwritten - this IS explicitly mentioned by cassandra (but not by umass)
#and is very convenient...
#TODO: the current format does not allow for independent
#transitions/observations - this might also be convenient...
O: a12 1 : state-one : * * : 0.002
O: a12 1 : state-one : 0 hear-left : 0.7225
O: 0 1 : s2 : 0 hear-right : 0.1275
O: 0 1 : 0 : 1 hear-left : 0.1275
O: 1 0 : 1 : * hear-right : 0.0225
O: 1 1 : 1 : 1 * : 0.7225
O: 2 0 : 0 : 0 hear-right : 0.1275
O: 2 1 : 0 : 0 0 : 0.1275
O: 2 1 : 1 : * : 0.0225
#The reward specification
#------------------------
#The full format is:
# R: <a1 a2...an> : <start-state> : <end-state> : <o1 o2 ... om> : %f
#or
# R: <a1 a2...an> : <start-state> : <end-state> :
# %f %f ... %f          R(s,ja,s',jo_1) ... R(s,ja,s',jo_x)
#or
# R: <a1 a2...an> : <start-state> :
# %f %f ... %f          R(s,ja,s_1',jo_1) ... R(s,ja,s_1',jo_x)
# ...
# %f %f ... %f          R(s,ja,s_k',jo_1) ... R(s,ja,s_k',jo_x)
#
#here the matrix has nrStates rows and nrJointObservations columns.
#
#Typical problems only use R(s,ja), which is specified by:
# R: <a1 a2...an> : <start-state> : * : * : %f
R: a12 0 : * : * : * : -2
R: 0 0 : 0 : * : * : -50
R: 0 1 : 1 : * : * : +20
R: 0 2 : 1 : * : * : 20
R: 1 0 : 0 : * : * : -100
R: 1 1 : 1 : * : * : -10
R: 1 2 : 0 : * : * : -101
R: 2 0 : 1 : * : * : -101
R: 2 1 : 1 : * : * : 9