
\chapter{Promises and games}\label{gameschap}

This is based on \cite{siri2}.

\section{The relationship between games and promises}

The mechanism of rational, voluntary decision-making used in economics
is the theory of games\cite{rasmusen1,myerson1}. In this section, we
would like to show that there is a natural relationship between game
theory and certain constellations of promises which provides a
motivation for keeping those particular promises out of all the
promises an agent might make. We confine our discussion here to
normal or strategic form games.

We alert the reader to a potential confusion of notation here. The
symbol $\pi$ is normally used for a promise. In the context of game
theory it is often used also for the payoff or utility matrix. We
shall therefore use the symbol $U_{ijk\ldots}$ for the payoff or {\em
utility} matrix of an $N$ player game.


We follow Rasmussen in our description of games to avoid repeating
well-known definitions\cite{rasmusen1}. A game is a number of players
$n_i$, where $i=1\ldots N$, a set of actions, payoffs and
information. A player $n_i$'s {\em action set} is the entire set of
actions or pure strategies available to him.  An action combination is
a set of one action per player $s_i$ in the game.  Player $n_i$'s
payoff or utility is denoted $U_i(s_1,s_2,\ldots,s_N)$ and has one of
two interpretations:
\begin{itemize}
\item It is the utility received at the end of a trial by a player, based on what
the player did.
\item It is an estimated or expected utility to be considered as a function of the possible
or expected strategies a player might choose in a future game.
\end{itemize}
It is this second interpretation that most closely relates to
promises, since we shall identify promises with expectations of
actions. However, we should be clear that the latter cannot exist
without some measure of the former.  This duplicity of interpretation
mirrors a similar one in promise theory, in which one can measure the
reliability of a promise with a scalar probability value. The
probability can then be regarded as a history of a player, or as a
prediction of future behaviour.

A {\em strategy} is a rule that tells player $n_i$ which action $s_i$
to choose at each instant in the game, given his or her information
set. An equilibrium $s^* = (s_1,s_2,\ldots,s_N)$ is a strategy
combination consisting of the `best' strategies for each player, with
respect to the expected payoff. Several such equilibria might exist.

There is a class of games in which the possible actions or moves of a
player are the same in each round of a game. Thus each round of an
extended game is indistinguishable from multiple games, except perhaps
in the information available available to the players. Iterated games
are special because both {\em each round} and the entire {\em iterated
game} can be represented as a utility matrix of the same form. Thus we
may discuss both extended interactions and one-shot interactions using
a utility matrix formulation as long as we restrict attention to such
games. We shall do this here.

Promises are static snapshots of policy.
A promise graph such as that in fig. \ref{f1} can be viewed in two ways:
\begin{figure}[ht]
\begin{center}
%\psfig{file=yinyang.eps,width=2cm}
\includegraphics[width=2cm]{figs/yinyang}
\caption{A simple exchange of promises.\label{f1}}
\end{center}
\end{figure}
either as an equilibrium over a complete interaction epoch or as a
single round in an repeated interaction. To make the move from games
with actions to games with promises, we simply replace actions with promises.
Any ordering of promises (who makes which promise first) is analogous to
an ordering of actions. In a game one considers a strategy to be a choice
of actions. Here we shall assume it to be a set of promises made:
\beq
n_i \promise{\{\pi\}_i} n_j
\eeq
where $\{\pi\}$ is potentially a bundle of different promises made from one
player to another. This bundle might also comprise the empty set.

The payoff or utility received by the $i$th player or agent in the system
is the sum of {\em its own} private valuations of promises made and
received:
\beq
U_i(\{\pi\}_1,\{\pi\}_2,\ldots,\{\pi\}_N) = \sum_{j\not=i} v_i\left(
n_j \promise{\{\pi\}_j} n_i,  n_i \promise{\{\pi\}_i} n_j 
\right)
\eeq
Any kind of promise can be evaluated, regardless of whether it is a
service promise or a use-promise\cite{burgessdsom2005} (give or take promise).

\section{Classification of games}

Games are classified in various ways, particularly with regard to
cooperation and conflict. An interaction is {\em cooperative} if
players coordinate their actions in order to improve their final
payoff (utility). Cooperation might be contingent on available
information.  An interaction is a {\em conflict} if the actions or
intentions of one player can adversely affect the payoff (utility) of
another player's actions.
Let us give some examples of these representations of games in terms
of promise exchanges.

\subsection{Cooperative, no conflict}

Consider two agents helping one another to provide a sufficient level
of service for a third party. By helping one another, they are able to
work together to meet 3's requirements and secure contracts with the
third party and hence both profit: they are not in competition with
one another.  This scenario is shown in fig. \ref{e1}.

\begin{figure}[ht]
\begin{center}
%\psfig{file=coop3.eps,width=3cm}
\includegraphics[width=3cm]{figs/coop3}
\caption{Cooperative game, no conflict.\label{e1}}
\end{center}
\end{figure}
Two agents 1 and 2, make coordinated promises to serve node 3. The
third party promises to reward both agents for their service,
independently. 
The values of the remuneration promises $v_{1,2}(pay)$ are not related.

\subsection{Cooperative, with conflict}

Consider now two agents who bargain with one another directly for
services from one another. Both need these services from the other, so the services are valuable
(fig \ref{e2}). By making each other's promises contingent on what
they receive from the other, the value each of them receives
becomes moderated by the other player. If they cooperate with one another, they can
both win a fair amount, but each change can have negative consequences
through the feedback.
\begin{figure}[ht]
\begin{center}
\includegraphics[width=2cm]{figs/coop4}
%\psfig{file=coop4.eps,width=2cm}
\caption{Cooperative game, with conflict.\label{e2}}
\end{center}
\end{figure}
Agent 1 promises $\pi$ if the other promises $\rho$ and vice versa.
The values $v_1(\rho/\pi)$ and $v_2(\pi/\rho)$ are inter-dependent.

\subsection{Non-cooperative, no conflict}

Two agents providing a service in a free market promise the same
service to a third party, without any cooperation (fig. \ref{e3}). The
third party makes its choice and accepts the promised service partly
from one and partly from the other. Each is promised a payment.
Again, the value of the payment promises are quite independent, hence
there is no conflict between agents 1 and 2.
\begin{figure}[ht]
\begin{center}
\includegraphics[width=3cm]{figs/noncoop1}
%\psfig{file=noncoop1.eps,width=3cm}
\caption{Non-cooperative game, no conflict.\label{e3}}
\end{center}
\end{figure}


\subsection{Non-cooperative, with conflict}

This example is the classic and well-known Prisoner's Dilemma
(fig. \ref{e4}).  Two agents promise to talk (or not talk) to their
prison keeper (a third party).  The prison keeper rewards both agents
depending on the combined promises of both of the agents. The agents 1 and 2 are not in
a position to cooperate (being held in different cells), but they affect one another's
outcomes indirectly.
\begin{figure}[ht]
\begin{center}
\includegraphics[width=3cm]{figs/noncoop2}
%\psfig{file=noncoop2.eps,width=3cm}
\caption{Non-cooperative game, with conflict.\label{e4}}
\end{center}
\end{figure}
This is a conflict because the actions of each player can have consequences for both of them,
and it is non-cooperative because the two players cannot communicate.


\subsection{Constant-sum games}

Let us now show that we can represent the special case of constant-sum
(usually zero-sum) games, provided a certain level of cooperative
infrastructure is in place. Such an infrastructure is often assumed in
economic texts, but here we must be more careful.  A constant-sum game
is special because it implies an ability for one agent to control the
value received by another agent. In a world of complete autonomy, that
is forbidden, since each agent makes its own value judgements. Hence,
to understand such a scenario, we must secure certain agreements
between agents to agree on a currency standard. Constant sum games are
not dependent on centralization, but we present two versions of such a
game with and without centralization, both of which however show the
fragility of the concept.

\begin{theorem}[Constant-sum game]
A promise interaction that forms a constant-sum game requires a finite
number (at least $N$) of agents ($N$ players and at least one
adjudicator which may optionally be taken amongst the players), and
all agents must agree on a standard currency.
\end{theorem}

\begin{proof}
Consider $N$ promise interactions of the form:
\beq
n_i &\promise{\pi_i} &A\nonumber\\
A &\promise{r_i / \chi(\pi_1,\pi_2,\ldots,\pi_N)}& n_i.
\eeq
Each agent $n_i$ promises an adjudicator $A$ some service or behaviour $\pi_i$,
and the adjudicator promises, in return, a piece of a common reward $r_i$,
subject to a constraint $\chi(\pi_1,\pi_2,\ldots,\pi_N)$. We would like this
contraint to be that the sum of the values perceived by the agents is constant,
i.e. we would like
\beq
\chi = \sum_{i=1}^N \; v_i(A \promise{r_i / \chi(\pi_1,\pi_2,\ldots,\pi_N)} n_i) = {\rm const}.\label{c1}
\eeq
However, each agent's valuation function is its own private estimate and is unknown to $A$, so the
best it can do is to arrange for
\beq
\chi = \sum_{i=1}^N \; v_A(A \promise{r_i / \chi(\pi_1,\pi_2,\ldots,\pi_N)} n_i) = {\rm const}.\label{c2}
\eeq
Eqns. (\ref{c1}) and  (\ref{c2}) can only agree if each agent $n_i$ makes an agreement to
use the same valuation $v_i = v_A, \forall i$, implying that the agents have a common currency.
In a centralized framework, this implies an agreement of the form
\beq
A &\promise{v_A}& n_i\nonumber\\
n_i &\promise{U(v_A)}& A
\eeq
between each node and the adjudicator.
Hence the payoff matrix for the game is:
\beq
&& U_i\left(v_i(n_i \promise{\pi_i} A,A \promise{r_i/\chi}n_i)\right)\nonumber\\
&=& U_i\left(v_A(n_i \promise{\pi_i} A,A \promise{r_i/\chi}n_i)\right)\nonumber\\
&=& U_A\left(v_A(n_i \promise{\pi_i} A,A \promise{r_i/\chi}n_i)\right).
\eeq
The adjudicator node has the role of book-keeper. There is no
impediment to identifying $A$ as any {\em one} of the $n_i$ agents,
the conditions are binding and sufficient regardless of which agent
plays the role of $A$. Similarly, agents with the role of $A$ can be repeated
any number of times.
\end{proof}
The issue of who elects the adjudicator of the game (game-master) is still open,
and we shall not discuss that further here.

In the proof above, we made use of a centralized construction. This is
not strictly necessary for the currency. It is sufficient that every
agent or player participating in the game should share a common
knowledge of the valuation standard for currency and obeys a constant
sum rule. The knowledge of the currency standard can spread
epidemically through a promise graph to achieve the same result,
provided the agents all abide by the same rules.  This requires a
special source node for the standard, which can be identified as $A$,
so there is still a adjudicator node. The nodes must
subordinate themselves to a set of rules that preserves:
\beq
\sum_{\pi_i,i,j} v_i (n_i \promise{\pi_i} n_j) = {\rm const}.
\eeq
This cannot be done without relinquishing part of their autonomy.
Turning the argument around, it is likely that constant sum games
do not represent a realistic interaction of autonomous agents.

\subsection{Multiple strategies and mixtures}

We have tacitly identified the promise of services with strategies in
the interpretation of a game in normal form. A promise becomes an
alternative in the expectation of rational play. However, there are
subtleties about actions and promises to be addressed. What if, for
instance, two promises are mutually exclusive?  What happens when we
form mixed strategies?  There are two ways we can envisage mapping
games onto collections of promises:
\begin{itemize}
\item Strategies are binary: either we keep a promise or do not keep a promise.

\item Strategies are multiple: distinct alternative promises for future action,
some of which might be mutually exclusive.
\end{itemize}
Consider the game matrix for a two-agent game below, with payoffs..
\begin{center}
\begin{tabular}{c|ccc}
(1,2) & $\pi_1$ & $\pi_2$ & $\pi_3$\\
\hline
$\pi_4$ & (10,5) & (4,6) & (3,2)\\
$\pi_5$ & (0,5) & (7,2) & (5,4)\\
$\pi_6$ & (10,5) & (6,6) & (8,3)\\
\end{tabular}
\end{center}
The normal solution concept for such a game is to search for an
equilibrium configuration of strategies for the players. Generally,
such equilibria required mixed strategies to balance their
books. However, this leads to a difficulty in interpretation for
promises. Some promises might be mutually exclusive, some promises
might break other promises, etc. Thus we cannot simply search for a
numerical equilibrium for this matrix without applying the logic of
promises to the problem also. The formal ambiguity can be dealt with
by making promises conditional on certain prerequisites. However,
there will be many games represented in the literature that assume information
channels that are unavailable to autonomous agents. Readers should therefore
take care to remember the implications of complete autonomy. In particular,
two agents can only be assumed to share common knowledge about $\pi$ if they
agree\cite{siri1}:
\beq
n_1 &\promise{\pi} &n_2\nonumber\\
n_2 &\promise{U(\pi)}& n_1.
\eeq
The issue of mixed strategies is something that must be addressed in
both of the cases above. In a zeroth-level approximation to promise
theory, one generally makes a default assumption that promises are
always kept. This allows one to discuss the logic of promise graphs.
However, as soon as one considers the matter of observation and repeated
trials, and takes seriously the issue of why agents would keep promises,
this default assumption becomes too simplistic to model real situations.
We discuss some of these matters in the next section. For the remainder
of the paper, we restrict out discussions to two-person games (or games that
are two person factorizable).

\section{Repeated bargaining: conditional promises}

A dilemma that is faced by agents is whether to favour a long term
benefit with a certain risk, or short-term guaranteed rewards by
choosing a policy towards neighbours. If an agent only interacts once
in its lifetime with another, it has nothing to lose by betraying the
trust of the other. However, if there is the possibility of a future
reprisal against it, the reckoning is a different one\cite{nash1}.

Agents have the possibility of {\em trading} promises of different
types using a valuation based on an agreed measure of common currency,
as pointed out in ref. \cite{siri1}. This allows them to secure long
term relationships with one another in a way that secures stability of
interaction. Selfish agents will only predictably continue to interact
if they receive some net payoff from the interaction.

However, in a real world pervasive computing scenario,
one must be careful in attributing the failure to comply with promises
to non-cooperative behaviour, as one might in economics.
There are several reasons why an agent might fail to live up to its
committments in a cooperative relationship:
\begin{itemize}
\item Environmental uncertainties beyond the control of the agent intervene (noise).
\item Unwillingness to cooperate emerges (change of policy).
\item The failure of a dependency agreement, which invalidates a
conditional promise (a third party spoils it).
\end{itemize}
This is not an extensive list, but each case is significant and
important to the fragility of developing a regional policy in the
ad hoc assembly of individuals. 

So, given short term certainty versus long term risk, how might agents choose
to behave?  Suppose they surmise an incentive to cooperate:

\begin{enumerate}
 \item Under what conditions would cooperation emerge
 and be stable in a group of selfish-minded individuals?

\item What kind of \emph{behaviour} is it most beneficial for each individual to
 choose (strategy)? i.e. How does one harmonize selfish interest with common
goals?

\item What might be done to promote the emergence
 of cooperation in a group by altering their policy for interaction?
\end{enumerate}


The preferred solution method of such game theoretical problems is the
concept of Nash equilibrium\cite{myerson1}. An equilibrium is the end
result of a `tug of war' between players.  Game theory therefore
embodies a notion of {\em stability} in the result of the contest.
This is an appropriate model for management. The restriction to two
players simply means that, in a graph of interactions between
individual agents, each negotiation is made between individual pairs,
without reference to third parties. This is also what is appropriate
for an autonomous environment.  We recall some basic notions from game
theory.

\bigskip

\begin{definition}[Two person iterated game] A
 \emph{game} consist of two \emph{players} or agents, which interact
 and make choices. For each choice $i$, by player $a$, and counter
 choice $j$ by the other player, they each receive a payoff, which is
 determined by the \emph{payoff matrix} $\Pi_a$ with components
$[\Pi_a]_{ij}$ (see
 fig. \ref{matrix}).  The game can consist of one or several
 \emph{rounds}. A game of several rounds is called an \emph{iterated}
 game.  \end{definition}

\bigskip

\begin{figure}
  \begin{center}
\includegraphics[width=6cm]{figs/dilemma}
%\epsfig{file=dilemma.eps,width=4cm}
  \end{center}
  \caption{The payoff matrix $\Pi_{(1,2)}$ representing rewards for a single move 
in a dilemma game. The components $[\Pi]_{ij}$ represent the four
quadrants for $i,j=1,2$\label{matrix}}
\end{figure}


This matrix form of the game represents the benefits and losses (hence risks)
during \emph{one} encounter between two agents. If they should meet
again, and the previous encounter might have influence on how the
agents act next time they meet, this is modelled by several
\emph{iterations} of the game. Each iteration hence represents one
interaction between two agents, and this interaction could be defined
in several ways.

\bigskip
\begin{definition}[Move and strategy] 
 The choice of action $\vec {M_{a,i}}^{\rm T} \in \{ (1,0), (0,1) \} =
 \{\vec C,\vec D\}$ made by agent $a$ in round $i$ of a game is called
 a {\em move}. These concern keeping or breaking a given promise to the
other player. For historical reasons, this is referred to as `cooperation' (C)
and `defection' (D), see fig. \ref{matrix}.
A complete sequence of choices $\{ \vec M_{a,i} | \forall
 i\}$ for agent $a$ takes over the entire game, constitutes the
 \emph{strategy} of that player. \end{definition}

\bigskip
Moves are in fact micro-strategies in the interpretation of the
dilemma game as a normal-form, strategic game with constant payoffs.
To avoid confusing micro- and macro-strategies, or extensive and normal
form games, we use these terms: move and strategy.

To account for the effects of a receding history, the total payoff of
a player in an interated game is generally a discounted sum of the
payoffs in the individual rounds\cite{axelrod1,axelrod2}. Let
$M_{a,r}$ be the move made by agent $a$ in the $r$th round of the
game, and let $\Pi_{a}$ be the payoff matrix in this round for agent
(player) $a$, given a counter-player $\overline a$, then over $m$
moves,
\beq
\Pi_{{\rm total},a} =\sum_{r=1}^m \delta^m\;( \vec {M_{a,r}}^{\rm T}\Pi_{a}\;\vec M_{\overline a,r}).
\eeq
The factor $\delta \le 1$ is called the `discount paramter'. One
normally imagines oneself fixed at the present and looking out into
(i.e. predicting) the future, based on past experience, so that $m$
recedes into the future. What happens in the distant future (or
distant past) is less important than what is immediately upon us, so
its importance to the payoff is reduced in weight, or {\em
discounted}. This duality between past and future is the same as that
in all probabilistic theories in which evidence from the past is used as
a model for the future.

The dilemma game has proven to be a very powerful tool for describing
cooperation. The goal of this method is typically to find stable
\emph{solutions} to the iterated game, by balancing short and long term
gain against one another. The iterated game is played for a given
number of rounds and the most profitable player, either during one
round, or in the long run (several iterations) wins.  The actual
winner of a game is not really of interest to us in policy based
management. What is important is the average equilibrium outcome, i.e.  which
strategies or choices of promises agents choose to keep.  To be able
to predict the outcome of such an iterated game, one needs:

\begin{itemize}
 \item The strategies of the other players involved in the game
 \item The number of encounters/interactions with the other players
 \item The payoff/reward for the different combinations of encounters
 \item The start state.
\end{itemize}
This translates into finding out what kind of behaviour is most
beneficial for an agent in the long run, dependent on what kind of
environment the agent acts in, how often each agent meets agents with
certain behaviour and so on.

\section{Promises: constructing policy}

It is plausible that there is a connection between the two person
games studied in the theory of cooperation and the likelihood that
promises are kept between individuals in a pervasive computing
environment. We shall now show this.

A promise graph $\Gamma = \langle I,L \rangle$ is a set of nodes $I$
and a set of labelled edges or links between them $L$. Each directed edge
represents a promise from one agent to another. The set of links $L$
falls into several non-overlapping types, representing different kinds
of promise.  Thus each promise begins with a label $\pi_i$, denoting its type,
and with a specification of the promise constraint, which it promises to
uphold. 
\bigskip

\begin{lemma}[Games are bilateral promises]
Let a two person strategic game be represented by a pair of $2\times
2$ matrices $\Pi_a$, and moves $\vec M_i$ where $n_i, i=1,2$ are the agents
and $a,b=1,2$ label the choices to cooperate or defect in the payoff
matrix, e.g.  $[\Pi_i]_{ab}$.  There is a mapping of each
game between the players onto a pair of promises.
\end{lemma}
\begin{proof}
For the players in the game, introduce a promise of type $\pi_a$ and a
return promise of type $\pi_b$, which are traded in the manner of
fig. \ref{pgame}a. Now let $\vec T(\cdot)$ be a `truth' function that
maps any promise into a one of the values $\{\vec C,\vec D\}$,
indicating whether the promise is obeyed or not by the promising
agent. The truth function may be associated with a move in a game, of
a type that maps one to one onto the promise type, and hence may be written
equivalently:
\beq
 \vec T(n_1 \stackrel{\pi_i}{\rightarrow} A_2)&\rightarrow& \vec M_1 \\
 \vec T(n_2 \stackrel{\pi_j}{\rightarrow} A_1) &\rightarrow&\vec M_2\\
{\left[v_a\left(n_1 \stackrel{\pi_i}{\rightarrow} A_2, n_2\stackrel{\pi_j}{\rightarrow} A_1\right)\right]}_{ab}&\rightarrow& {[U_i]}_{ab} 
\eeq
where $v_i$ is the matrix-valued valuation function for agent $i$,
which interprets the value of the mutual relationship to agent $i$;
$\pi_a$ represents the promise body, which is labelled by the
types $a,b$. We must have either $(A_1,A_2)=(n_1,n_2)$ or
$(A_1,A_2) = (n_3,n_3)$ to mediate the game.
The proof is now trivial
noting that the arrow is from promises {\em onto} games, by a policy
determined function $v_i$.
\end{proof}
Note that, although we can identify a game with a pair of
promises, not all promises can be meaningfully
viewed as games, as we see in section \ref{x}. Any reasonable function
$v_i$ can be used to defined the mapping from promise to game, since the
function $v_i$ is to provide a autonomous {\em valuation} of received
promises by each agent $n_i$. The game can then be used as an estimator for
the rational behaviour of the agents as they interact.

A consequence of this relationship between promises and games is that
promises must also specify a policy for stabiliizing the cooperative
relationship over time, i.e. a strategy for the entire iterated game.

\section{Management dilemmas}\label{dilemmas}

A common approach for modelling and analysing cooperation is to use
dilemma games from two-person game theory\cite{axelrod1,axelrod2}.
Here two individuals, without strictly opposing interests, have the
possibility of exploiting each other's willingness to cooperate for
selfish gain. This is a scenario that has shown to fit several
real-life cases\cite{axelrod1,axelrod2}. It is known as a cooperative
dilemma.

In the dilemma game, each player can choose between two different moves,
\emph{cooperate} (keep promise) or \emph{defect} (i.e. do not keep promise).
The interpretation of the dilemma game symbols is thus: $T$ is the
payoff for receiving services from another agent, while refusing to
contribute; $R$ is the reward two agents receive when both are
cooperating; $P$ is the inconvenience both agents experience when they
both refuse to collaborate; and $S$ is the inconvenience of servicing
an agent who refuse to contribute anything back.

There are many possible outcomes, which depend on the
relative sizes of these parameters.  For the game model to qualify as
a Prisoner's Dilemma, in the common usage of the term, certain
requirements need to be fulfilled: (i) $T > R > P >S$, and (ii) $R >
\frac{(T + S)}{2}$.  These requirements ensure that conclusions
associated with the game are true.

A player can win the maximum payoff by `defecting' (failing to act
according to its own promises) when the other player is `cooperating'
(acting according to its promises), but if both players are defecting,
they both receive a lower score than if they both had
cooperated. Constraint (ii) assures that mutual cooperation is more
beneficial than alternating cooperation and defection between the two
players.

What makes this game model interesting, is the controversy that even
though the choice \emph{defect} is \emph{dominant}, both players will
actually end up with a better result cooperating in the long
term. This model is completely analogous to a scenario of autonomous
agents, looking to establish agreements (promises) between one
another, in order to create a larger system voluntarily.  There is a
potential long-term reward associated with providing services for
other peer agents, but avoiding a commitment could thus seem
beneficial in the short run.

We model this situation with multi-move games in which one agent can
follow the history of its interaction with another and \emph{punish}
previous defections from the other player by failing to behave
cooperatively in return.
An example strategy, which has received much attention in the
literature, is \emph{tit for tat}. This has proven to be very
successful in playing cooperative dilemmas with a good average
performance. Tit for tat is a macro-strategy and it is defined as
follows:

\bigskip
\begin{definition}[Tit for tat]\label{t4t}
Let $r$ label a sequence of rounds in a game, and let $\vec M_{1,r} \in
\{\vec C,\vec D\}$ be the move taken by agent 1 in round $r$, etc. The tit for
tat strategy is defined by:
\beq
\vec M_{1,1} = \vec C\\
\left.
\begin{array}{c}
\vec M_{1,r} = \vec M_{2,r-1} \\
\vec M_{2,r} = \vec M_{1,r-1} \\
\end{array}\right\}  &r& > 1.
\eeq
Notice that the agents always start out \emph{cooperating}, or obeying
policy rules, and then do against the opponent what the opponent did
on the \emph{previous} move.
\end{definition}

\bigskip
The success of this strategy, in dilemma games, is characterised by
the fact that it gets high scores on average, even though it does not
necessarily always win (i.e. it does not always maximize the utility
of its interaction). This strategy has several beneficial qualities:
it is simple, it rewards cooperation, it punishes defection and it is
forgiving, hence single defection noise (including accidents) does not
harm a player too much.  The disadvantage is that it does not exploit
recovery from random noise well enough. It tends to end in fruitless
reactive oscillations between agents, once they have started. A
continuous generalization of the game has further instability
problems\cite{burgessC15}.  Some thing is required to dampen out the
oscillations caused by random errors. A small amount of noise to
unleash oscillations throughout a peer network, in which the agents
had dependent promises.

The real dilemma in these games is this: if an agent in a trading
relationship does not retaliate to broken promises, it will allow
itself to be exploited, exhausted and stability will be compromised.
However, if it does retaliate, it participates in the contributing to
a policy instability in the network.

\section{Examples of games and the economics of promises}\label{x}

What is the complete relationship between games and promises? We need
to establish how the strategies interact and affect one another. Some
relationships are purely fortuitous, with no guarantees of receiving a
payoff for their actions. Others interactions are games which are
bound by a trade of mutually beneficial services.  Thus not all
promises are games, but all dilemma or bargaining games involve
promises.  How does the probability of keeping a promise relate to the
experience gained from iterative dilemma games, and the discount
parameter?

Since the cooperative dilemma gives us a natural notion of stable
equilibrium for autonomous agents, let us ask: in what sense is a
promise a cooperative dilemma, and which agents fall outside this
model?

Consider two nodes, node $A$ and node $B$, which each represent an
autonomous node. As described in \cite{siri1}, a promise
$a\stackrel{\pi}{\rightarrow} b$ from $a$ to $b$ is represented in
graph notation by a directed edge (arrow) from node $a$ to node $b$,
and the label on the edge defines the \emph{content} of the promise.

\bigskip
\begin{definition}[Policy dilemma] 
A bi-lateral promise between two nodes, in which cooperation is
interpreted as \emph{keeping} the promise, and act as promise, and
defection is defined as not keeping the promise. In order to qualify
as a bargaining game, both nodes must receive some utility from the
bilateral arrangement. 
\end{definition}
\bigskip

\begin{figure}[ht]
  \begin{center}
%\epsfig{file=pgames.eps,width=4cm}
\includegraphics[width=4cm]{figs/pgames}
  \end{center}
  \caption{Some 2 person games and their corresponding promises.\label{pgame}}
\end{figure}

Consider the images in fig. \ref{pgame}. Here we see a number of
promise graphs, some of which are policy dilemmas and some of which
are not. There are three basic categories
of promise\cite{burgessdsom2005}:
\begin{enumerate}
\item In a service promise of the form $a \stackrel{s}{\rightarrow} b$,
node $b$ is the recipient of something of value (the promised
service).

\item In a use-promise, $a \stackrel{U(s)}{\rightarrow} b$, node $a$
promises node $b$ that is will avail itself of a service $s$ , which
is has been promised by some node. In this case, node $b$ does not
receive anything tangible, except an assurance that $a$ 
will use the service it has been offered. 

\item Similarly, in a cooperative promise  $a \stackrel{C(s)}{\rightarrow} b$, node $a$
promises node $b$ that it will imitate $b$'s behaviour with respect to
service $s$. Again, the only thing $b$ receives is an assurance.
\end{enumerate}

We must now ask: are all these dilemma games? Consider fig. \ref{pgame}.

In case (a) the two nodes exchange services or other similar promises
of some value to one another. We may consider this interaction as
representing a baraining game between the players, if the promises are
judged beneficial to the agents, e.g. BGP peering agreements between
service providers.

In case (b) however, the assurance of usage does not seem enough to
bargain with. Indeed, node $a$ says ``if you do not give me $s$, I
will not use it!''. This does not seem like a strong bargaining
position. One could similarly apply (b) to a cooperative promise: ``If
you do not give me $s$, I will not do as you do with regard to $s$''
(give myself $s$). Again, there are no grounds for bargaining.  This
could be misleading however. Consider the example of refuse recycling
or garbage collection in society. Users promise to deliver their
garbage and the collectors agree to take it. Both parties can benefit
from this because by {\em not} cooperating they would incur a loss.
This warns that caution is required in jumping to conclusions.

In case (c), there is a game, since the two promises are on equal
footing: ``if you don't do as I do, I won't do as you do''. In both
cases, the promises are of equal value and so they have a symmetry of
mutual assurance, and we can define them to be a game. Indeed, case
(c) can easily be shown to be a stable fixed point.

Case (d) contains no game either. Node $c$ gives nothing in return to
node $b$ for its service $s_2$; node $b$ offers nothing in return to
node $a$ for $s_1$. 

The services are promised without any trade taking
place.  In order to establish a game, an agent must be able to exert an
influence against its peers. It must have something to bargain with.
There are two kinds of nodes that do not fit into this contract of
stability, easily located by network analysis.
\bigskip
\begin{definition}[Opportunist sink]
A network sink (a node with out-degree zero) is {\em opportunist}. It
is vulnerable to loss of service, as it offers nothing in return.
It is a non-profit node.
\end{definition}
\begin{definition}[Altruist source]
A network source (a node with in-degree zero) is an {\em altruist}. It is
vulnerable and exploited, as it gives without taking anything in
return. It is a `single point of failure' for the dependent promises.
\end{definition}
\bigskip
Thus network analysis could enable us to repair the lack of economic
incentive to cooperate in case (d) by modifying the graph to allow for
some chain of payment (see fig. \ref{pay}).
\begin{figure}[ht]
  \begin{center}
\includegraphics[width=4cm]{figs/pay}
%\epsfig{file=pay.eps,width=4cm}
  \end{center} \caption{An act of uncertain altruism can be turned
  into an incentivized bargaining game by assuring some sustaining payoff to the
  altruist.\label{pay}}
\end{figure}
Thus longevity of relationships provides a direct motivation for
introducing a trading currency.

\section{Minimum incentive requirements}

We can now propose some minimal requirements for sustainable promises,
based in economic incentives.

\begin{proposition}[Common currency]
The need to bargain or trade with different currencies motivates an
undirected graph of common currency, representing bilateral promises.
\end{proposition}
\bigskip
\begin{proposition}[Principle of sufficient remuneration]
Every agent should receive a `handshake' promise in return for giving a
promise.  The handshake can be valued in any currency used by the agent.
\end{proposition}
\bigskip

This proposition can easily be proven if one posits equilibrium
as a criterion for a well-formed policy. We shall not attempt to
squeeze this into the present work.

\begin{example}[Legally binding agreements]
The literature of games and economics often refers to `binding
agreements' in the sense of agreements that are enforced by some legal
framework\cite{contracts}. In promise theory there is no such outside
legal framework, unless one builds it from promises. Such a framework
is very complex.  A third party (government or law court) must promise
to make laws known.  All agents must promise to use the law. They must
promise to make visible any contracts and practices they make to an
arbitrator, who in turn must promise to inform law-enforcers of any
breakage of laws. These in turn could enforce an isolation
(incarceration) of an agent. To make an agreement binding by force,
one must subordinate oneself to an outside force, or suffer large
consequences.  Such a scenario takes considerable infra-structure,
which shows that commonly held assumptions can often represent very
complex scenarios (which is why criminals stand a chance of succeeding
in breaking laws). Ironically the main reason for contract success is
a sound economic basis, not the threat of punitive action.
\end{example}

The idea of offering different interpretations of currency has been
considered also by Piaget\cite{piaget1}. In his view, societies of
agents bind into working constellations because they value one another
by different criteria.  Consider a transaction performed between two
`agents', represented by $A$ and $B$, where $A$ performs some action
\emph{on behalf} of $B$.  Piaget assumes the following basic
conditions, which differ from our starting point:
\begin{itemize}
 \item A common scale of values for all agents. 
 \item Conservation of values in time. 
\end{itemize}
and remuneration values: i) Renouncement value $r_{A}$, representing
the \emph{investment} made by agent $A$ in order to perform an action
on behalf of $B$.  ii) Satisfaction value $s_{B}$, representing the
satisfaction expressed by $B$ as a result of receiving the action from
$A$.  iii) Acknowledgement value $t_{B}$: representing the
acknowledgement made by $B$ to $A$ as a result of receiving the
action.  iv) Reward value $v_{A}$, representing the associated value
of receiving the acknowledgement from $B$, as experienced by $A$.
Piaget distinguishes between so-called \emph{real} and \emph{virtual}
values, where $r$ and $s$ are the real values, and $t$ and $v$ the
virtual values in the transaction. The virtual values turn into real
values when they are 'cashed in'.

\section{Service Level Agreements as promise networks}

Will service trade be \emph{stable} over time?
Consider a graph of $n$
nodes, each making promises to a subset of the other nodes. Each
promise is represented by an arrow between the node making the
promise, and the node which is the receiver of the promise. In the
result graph, the number of directed edges indicates how many
\emph{relationships} are established in the group, which also 
says something about how \emph{clustered} the group is. Using graph
theoretical terminology, we can compute the \emph{connectivity} and
the \emph{clustering coefficient} of the
group\cite{burgessbook2,schaumgraph,berge1,west1}.

If we, for instance, assume all nodes are trading \emph{tit for tat},
and the environment is without noise, it is possible to determine,
without uncertainty, what choices each node will make in the next
iteration. This means we can determine which nodes will cooperate, or
how many will cooperate. 

If we can only speak of probabilities, the probability of whether each
promise will be kept, is then a number in the interval $[0,1]$, which
in graph theoretical terminology is a \emph{weight} on the
corresponding promise edge. 
For two agents, we can easily predict the stable configurations under
tit-for-tat by computing the spanning eigenvectors of the common
currency promise graph. To do this, we write the strategy as an
iterative matrix operator whose fixed points will be the stable
solutions of the policy:
\beq
\lambda \left(
\begin{array}{c}
M_1\\
M_2
\end{array}
\right)
=
\left(
\begin{array}{cc}
0 & 1\\
1 & 0
\end{array}
\right)
\left(
\begin{array}{c}
M_1\\
M_2
\end{array}
\right)
\eeq
which has stable eigenvectors
\beq
 \left(
\begin{array}{c}
M_1\\
M_2
\end{array}
\right)
= \left(
\begin{array}{c}
1\\
1
\end{array}
\right)
, \left(
\begin{array}{c}
1\\
-1
\end{array}
\right)
\eeq
This shows that the two strategy modes that can persist in the graph
are that both agents agree to agree (either keep their promises or
not) for all time, or they enter into an eternal battle of opposites
(cooperation and defection) for all time. In the absence of an initial
condition (as in def. \ref{t4t}) there is no way to pick out which of
the former is the case, but it seems reasonable to assume that
promises will initially be kept if an agent bothers to make them.
This method can be extended to larger graphs and we shall return to
this matter in later work.

To choose the stable alternative, we must ensure sufficient promises
to fulfill all of our propositions for stable cooperation.

Let us define an {\em agreement} as a bilateral promise between two
agents to affirm and comply with a common statement.  A {\em contract}
is a bilateral collection of promises and commitments between the
agents.  We then have that a contract Service Level Agreement (SLA) is
a mutual promise between agents to comply with documented promises
that go in both directions.  Our results show minimum requirements for
such promises, for successful service management.

\section{Summary}

In this paper we have considered the relationship between promise
graphs and the management of services through voluntary `economic'
cooperation. This viewpoint is particularly applicable for peer to
peer services and free market economic models for electronic commerce.
We have shown that, given the ability of agents to make personal
valuations of promised services, there are natural promise
configurations that form economic games. We propose that {\em
sustainable configurations} are naturally predicted by the outcomes of
these equivalent games, and that imbalanced promises are likely to
lead to disintegration of promise networks, or unsustainable
Service Level Agreements.

The existing body of work from cooperative game theory suggests that a
complete view of a human-computer policy, as an equilibrium of
individual promises, requires three things: identifiability of agents,
a notion of common currency for motivating cooperation, and a
stabilizing principle of remuneration (trading), that invokes
bilateral promises to bind all nodes together. The fragility of
promise networks is easily analyzed by methods of graph and
reliability theory.


Naive cooperative game theory suggests policy strategies for maximizing the
utility of an autonomous agent's policies.
\begin{enumerate}
 \item {\em Initial cooperative behaviour:} When establishing a new promise or relationship, an agent should always act in a cooperative way, i.e. should keep the promise made to another agent.
 \item {\em Punish bad behaviour:} If an agent fails to fulfil/keep a promise, this should be punished by the environment, so the agent lose some privileges.
 \item {\em Reward good behaviour:} Agents that keep their promises should be rewarded.
\end{enumerate}
This viewpoint must deal with defections that happen by accident:
i.e. faults or `acts of God'. Immediate punishment is an efficient,
but not necessarily desirable strategy, unless used as an incentive to
motivate redundancy and fault tolerance of systems. Simulation leading
to empirical conclusion is an avenue for future study.

Pervasive computing scenarios typically imagine an extremely dynamic
environment, where the number of agents involved changes. Also,
they are assumed to be very
\emph{heterogeneous}, which would imply a wide variety of different
promises. We propose that understanding the global impact of this
world requires an economic model.
There are many threads to pick up on. We hope to report on these in
future work.

