State Resolution: Reloaded

Table of Contents

1 Introduction

This is an attempted Haskell Why Haskell? Because I'm strange. implementation of the State Resolution: Reloaded proposal. This implementation is not complete yet.

1.1 Running this document

Yes, this document is intended to be run. It is written in emacs with Org mode in a literate programming style, which interleaves executable code with plain language explanation. If you have emacs and Org mode, you can generate the Haskell file by downloading the reloaded.org file from https://gitlab.com/uhoreg/matrix_state_resolution, opening it in emacs, and doing M-x org-babel-tangle. If you don't have emacs (or don't want to generate the Haskell file), the above repository also contains a pre-generated Haskell file.

Once you have the Haskell file, you can compile it and run the demos in this document. In emacs, you can do C-c C-c in a code block to execute it I had to fiddle with some of my config to get the output to work properly. YMMV. . Note that some demos span multiple code blocks, and you will need to execute them in order.

1.2 Reading this document

Of course, this document is also intended to be read. As already mention, this document was written in a literate programming style which mixes code and text blocks. Hopefully the text is easily understandable. The code blocks are prefixed by a line that indicates what the code block is for. This is the chunk name and will be included in other code blocks when their names are included, surrounded by double angle brackets (e.g. <<foo>> will insert any chunks named foo). Chunks named types define some sort of data type, or a utility function for using that type. Chunks named functions define a function. Chunks named foo_demo are part a demonstration named foo.

arithmetic_demo:

1+1
2

means that the result of the expression 1+1 is 2.

Some of the foo_demo code blocks are immediately followed by a second code block. This second code block is the value of the final expression of the first code block.

1.3 Understanding Haskell

Haskell looks different from what most programmers are used to. For those unfamiliar with Haskell, Appendix A presents a brief primer on Haskell to help understand the code in this document.

In order to make the code easier to understand, I have attempted to avoid overly-idiomatic Haskell, at least in the main parts of the algorithm. Some of the utility functions may have Haskell idioms for brevity, but I have tried to limit this to functions whose implementations would be fairly obvious or that are less important. The hope is that most things should be either intuitively understandable (for those familiar with mathematical notation and with other programming languages), or ignorable.

2 State

For this algorithm, we use only a subset of the Matrix event data, so for clarity, we use a simplified event model. Our event type will only define the fields that are used, and move all the data to the top level. Event contents will be a simplified version, rather than a full JSON object. For example, the content field of message events will just be the message text, the content field of topic changes will be the new topic, and the content field of a membership event will be the membership type (join, leave, etc.). For power levels events, we only use the assignments of power levels to users; we will not worry about power levels required to send events, and just assume that state events require PL 50 to send, and message events require PL 0 to send.

types:

data EventType -- enum of event types that we will use
  = Create
  | PowerLevels
  | JoinRules
  | Membership
  | Topic
  | Message
  deriving (Eq, Generic)
data Event -- the event data structure
  = Event { eventId :: String
          , eventType :: EventType
          , timeStamp :: Int
          , stateKey :: Maybe String
          , sender :: String
          , content :: String
          , prevEvents :: [Event]
          , authEvents :: [Event]
          , powerLevels :: Maybe (Map.HashMap String Int) -- only for PowerLevels events
          }

We define some convenience functions to create events.

types: The exact definitions of these functions can be largely ignored; the main thing is to be aware that they are defined.

newStateEvent
  = Event { eventId = ""
          , eventType = Create
          , timeStamp = 0
          , stateKey = Just ""
          , sender = ""
          , content = ""
          , prevEvents = []
          , authEvents = []
          , powerLevels = Nothing
          }

newMessageEvent
  = Event { eventId = ""
          , eventType = Message
          , timeStamp = 0
          , stateKey = Nothing
          , sender = ""
          , content = ""
          , prevEvents = []
          , authEvents = []
          , powerLevels = Nothing
          }

join user
  = newStateEvent { eventType = Membership
                  , sender = user
                  , stateKey = Just user
                  , content = "join"
                  }

leave user
  = newStateEvent { eventType = Membership
                  , sender = user
                  , stateKey = Just user
                  , content = "leave"
                  }

u1 `invites` u2
  = newStateEvent { eventType = Membership
                  , sender = u1
                  , stateKey = Just u2
                  , content = "invite"
                  }

u1 `kicks` u2
  = newStateEvent { eventType = Membership
                  , sender = u1
                  , stateKey = Just u2
                  , content = "leave"
                  }

u1 `bans` u2
  = newStateEvent { eventType = Membership
                  , sender = u1
                  , stateKey = Just u2
                  , content = "ban"
                  }

user `setsPowerLevels` pl
  = newStateEvent { eventType = PowerLevels
                  , sender = user
                  , powerLevels = Just $ Map.fromList pl
                  }

user `setsTopic` topic
  = newStateEvent { eventType = Topic
                  , sender = user
                  , content = topic
                  }

A state set is a map from (event type,state key) pairs to events.

types:

type StateSet = Map.HashMap (EventType,String) Event

insertEvent :: Event -> StateSet -> StateSet
insertEvent
  e@(Event { eventType = eventType
           , stateKey = Just stateKey })
  stateSet
  = Map.insert (eventType, stateKey) e stateSet

-- construct a state set from a list of events
stateSetFromEventList :: [Event] -> StateSet
stateSetFromEventList
  = Map.fromList
    . map (\e@(Event { eventType = eventType
                     , stateKey = Just stateKey }) -> ((eventType, stateKey), e))

3 Event authorization

Each event has a set of auth events, which are events that justify why it is allowed to be sent. For most events, their auth events will be current create, power levels, and join rules states, and the sender's membership.

3.1 Auth chain

Note that there may be some opportunity for optimization due to the fact that the entire auth chain may not be needed. It may be possible to stop at a certain point, though the hard part is in determining what that point is.The auth chain of an event is its set of auth events, along with their auth events, recursively.

functions:

authChain :: Event -> (Set.HashSet Event)
authChain (Event { authEvents = e })
  -- ((Set.fromList e):(map authChain e)) means to prepend (Set.fromList e) to
  -- the list returned by (map authChain e), yielding a list of sets.
  -- Set.unions calculates the union of all these sets
  = Set.unions ((Set.fromList e):(map authChain e))

Sorry, your browser does not support SVG.

Black arrows represent prev events. Red arrows represent auth events. Since all events have the create event as an auth event, those arrows are omitted.

Let us look at an example where Alice creates a room and invites Bob and Carol, and then Bob joins. We will look at the auth chain for Bob's join event.

First, we start with the room creation event. Since it is the first event, it has no auth events.

auth_chain_demo:

create = newStateEvent { eventId = "create"
                       , eventType = Create
                       , sender = "@alice:example.com"
                       }

Next, Alice joins the room. Since she is the room creator and the only previous event is the create event, she is allowed to join the room.

auth_chain_demo:

alice_join = (join "@alice:example.com")
              { eventId = "alice joins"
              , prevEvents = [create]
              , authEvents = [create]
              }

The power levels are set for the room, giving Alice admin powers. The auth events for this event are Alice's join event (she can only send the event if she is in the room), and the room creation event (the room creation event is an auth event for every event in the room) The first power level event sent to a room is a special case: a room that does not have any existing power levels will accept any power level event. .

auth_chain_demo:

pl = ("@alice:example.com" `setsPowerLevels` [ ("@alice:example.com", 100)
                                             ])
     { eventId = "power level"
     , prevEvents = [alice_join]
     , authEvents = [alice_join, create]
     }

The join rules are then set, and Alice invites Bob and Carol. The auth events for these events are the same: the power level event (indicating that Alice has sufficient power to send the event), Alice's join event, and the room creation event.

auth_chain_demo:

join_rules = newStateEvent { eventId = "join rules"
                           , eventType = JoinRules
                           , sender = "@alice:example.com"
                           , content = "private"
                           , prevEvents = [pl]
                           , authEvents = [pl, alice_join, create]
                           }

invite_bob = ("@alice:example.com" `invites` "@bob:example.com")
             { eventId = "invite bob"
             , prevEvents = [join_rules]
             , authEvents = [pl, alice_join, create]
             }

invite_carol = ("@alice:example.com" `invites` "@carol:example.com")
               { eventId = "invite carol"
               , prevEvents = [invite_bob]
               , authEvents = [pl, alice_join, create]
               }

Finally, Bob joins the room. The auth events for this event are Bob's invitation, the join rules (indicating that Bob can join if he was invited), and the room creation event.

auth_chain_demo:

bob_join = (join "@bob:example.com")
           { eventId = "bob joins"
           , prevEvents = [invite_carol]
           , authEvents = [invite_bob, join_rules, create]
           }

Now we calculate the auth chain for Bob's join event. This will be the auth events for Bob's join event (Bob's invitation, the join rules, and the room creation), the auth events for Bob's invitation (the power levels, Alice's join, and the room creation), the auth events for the power levels (Alice's join and the room creation), the auth events for Alices join (the room creation), and the auth events for the room creation (nothing).

auth_chain_demo: The fromList in the output just means that the result is either a set or a map created from the list that follows. In this case, it is a set. Since this is a set, the list is not in any particular order.

authChain bob_join
fromList ["join rules","invite bob","power level","create","alice joins"]

3.2 Event authorization checks

When an event is sent to a room, it will only be accepted if the sender is authorized to send them according to the room's state set The exact rules are documented in the Authorization of PDUs section of the Matrix spec. . Users who are not in a room cannot send events other than join events; users can only join if the room is public, or if they have been invited. Events can only be sent by users who are in the room, and who have sufficient power level (in this document, we assume PL 50 is required for state events, and PL 0 for messages).

functions:

-- is the sender in the room?
isInRoom :: String -> StateSet -> Bool
isInRoom sender stateSet
  = maybe False (\x -> (content x) == "join") $ Map.lookup (Membership,sender) stateSet

-- get the power levels (if any) from the state set
stateSetPowerLevels :: StateSet -> Maybe (Map.HashMap String Int)
stateSetPowerLevels stateSet
  = Map.lookup (PowerLevels,"") stateSet
  >>= powerLevels

-- determine if the user has sufficient power level
hasPowerLevel :: String -> Maybe (Map.HashMap String Int) -> Int -> Bool
hasPowerLevel _ Nothing _ = True -- if the room has no power levels, then it's a free-for-all
hasPowerLevel sender (Just powerLevels) pl = (Map.lookupDefault 0 sender powerLevels) >= pl

isAuthorized :: Event -> StateSet -> Bool
-- create events are only allowed if they are the first event
isAuthorized
  (Event { eventType = Create
         , stateKey = Just _
         , prevEvents = prevEvents})
  stateSet
  = null prevEvents -- null tests whether it is empty
-- FIXME: join events
-- FIXME: power level events
-- everything else: PL 50 for state events, PL 0 for messages
isAuthorized
  (Event { stateKey = stateKey
         , sender = sender })
  stateSet
  = isInRoom sender stateSet
    && hasPowerLevel sender (stateSetPowerLevels stateSet) (if (stateKey == Nothing) then 0 else 50)

When checking whether an event is authorized in the state resolution algorithm (in particular, in the iterative auth checks algorithm below), the current state set might not be complete, since the (event type, state key) pair may be one that needs to be resolved by the state resolution algorithm. In this case, the missing state events are taken from the auth events of the event under consideration. We can do this by augmenting the state set with the event's auth events when there is a key missing.

functions:

isAuthorized' :: Event -> StateSet -> Bool
isAuthorized' e stateSet = isAuthorized e augmentedStateSet
  where
    -- fold takes a function, initial accumulator value, and a list, and then
    -- iterates over the values in the list, calling the function to update the
    -- accumulator, and returns the final value
    augmentedStateSet = foldl' insertMissingState stateSet $ authEvents e
    insertMissingState stateSet e@(Event { eventType = eventType
                                         , stateKey = Just stateKey })
      = if (eventType, stateKey) `Map.member` stateSet
        then stateSet
        else Map.insert (eventType, stateKey) e stateSet

The iterative auth checks algorithm takes a list of events and and initial state set, and iterates through the events, updating the state set if the event is authorized.

functions:

iterativeAuthChecks :: [Event] -> StateSet -> StateSet
iterativeAuthChecks events stateSet = foldl' addStateIfAuthorized stateSet events
  where
    addStateIfAuthorized stateSet e
      = if isAuthorized' e stateSet
        then insertEvent e stateSet
        else stateSet

3.3 Control events

Control events (called power events in previous versions, not to be confused with power levels events) are events that have the potential to remove the ability of another user to do something. Power levels, join rules, bans and kicks are control events Technically, normal leave events are also control events, but they are omitted in order to avoid breaking delta state resolution, and it is less likely that someone would attempt to reverse their own leave event through a malicious state reset. (https://matrix.to/#/!WoCKDnRBaGUpRPqTbx:jki.re/$1532339909206jhdSP:jki.re) .

functions:

isControlEvent :: Event -> Bool
isControlEvent (Event { eventType = PowerLevels
                      , stateKey = Just _ }) = True
isControlEvent (Event { eventType = JoinRules
                      , stateKey = Just _ }) = True
isControlEvent (Event { eventType = Membership
                      , sender = sender
                      , content = "ban"
                      , stateKey = Just stateKey }) = sender /= stateKey
-- a kick is a leave event where the sender of the event is not the same as the
-- person leaving (the state key)
isControlEvent (Event { eventType = Membership
                      , sender = sender
                      , content = "leave"
                      , stateKey = Just stateKey }) = sender /= stateKey
-- everything else is not a control event
isControlEvent _ = False

4 Algorithm overview

The state resolution algorithm operates in four stages:

  1. The state from the different state sets are split into unconflicted and conflicted state maps. The unconflicted state map is the initial value of the partially-resolved state set.
  2. The conflicted state map is split into control events (and events in their auth chains), and other (non-control) events.
  3. The control events and the events in their auth chains are sorted topologically, and applied to the partially-resolved state set. The ordering used here is called "reverse topological power ordering."
  4. The non-control events are sorted according to their relationship to control events, and applied to the partially-resolved state set. The ordering used here is called "mainline ordering."
  5. The unconflicted state map is re-applied to the partially-resolved state set.

5 State conflicts

The unconflicted state map of some state sets is the state set where each key exists and is the same in every state set. The conflicted state map is the set of all the events that are not in the unconfliced state map. Naturally, the unconflicted state map contains the state events that do not need to be resolved, while the conflicted state map contains the state events that do need to be resolved. For each key in the conflicted state map, the state resolution algorithm will pick one event (or Nothing It will pick Nothing if none of the events are authorized according to the auth checks. ) for that key to be part of the resolved state.

Another way of looking at this is to define the full state map \(F_S(t,k)\) of some state sets \(S=\{S_1, S_2, \ldots\}\) to be a mapping from (event type (\(t\)), state key (\(k\))) pairs to sets of events along with, possibly, the Nothing element. The domain of \(F\) (the set of keys in the mapping) is the union of the domain of each \(S_i\), and \(F_S(t,k)=\cup \{ S_i(t,k) \}\), where \(S_i(t,k)\) is the state event in state set \(S_i\) for the \((t,k)\) pair, or Nothing if \(S_i\) has no such state event. Then the unconflicted state map is the sub-map of \(F_S\) where \(F_S(t,k)\) has one element (modified so that the unconflicted state map returns the single element rather than a set of one element), and the conflicted state map is the union of all \(F_S\) where \(F_S(t,k)\) has more than one element, minus the Nothing element.

To demonstate how this works, consider three state sets \(S_1, S_2\), and \(S_3\). All three state sets reference the same create event. \(S_1\)'s power levels event (\(pl_1\)) will be one in which only @alice:example.com has non-default power levels, and both \(S_2\) and \(S_3\)'s power levels event (\(pl_2\)) will be one in which @alice:example.com and @bob:example.com both have non-default power levels. \(S_2\) will have a topic event, while \(S_1\) and \(S_3\) will both have no topic event. First, we create the events and the state sets:

conflict_demo:

create = newStateEvent { eventId = "create"
                       , eventType = Create
                       , sender = "@alice:example.com"
                       }

pl1 = ("@alice:example.com" `setsPowerLevels` [ ("@alice:example.com", 100)
                                              ])
      { eventId = "power levels1" }

pl2 = ("@alice:example.com" `setsPowerLevels` [ ("@alice:example.com", 100)
                                              , ("@bob:example.com", 50)
                                              ])
      { eventId = "power levels2"}

topic = ("@bob:example.com" `setsTopic` "This is a topic")
        { eventId = "topic"}

stateSet1 = stateSetFromEventList [ create, pl1 ]
stateSet2 = stateSetFromEventList [ create, pl2, topic ]
stateSet3 = stateSetFromEventList [ create, pl2 ]

stateSets = [ stateSet1, stateSet2, stateSet3 ]

Next, we calculate the domain by getting the domains from all the state sets and putting them all in a HashMap.

conflict_demo and conflict_calculation:The following code chunks will both serve as a demo and, through the magic of literate programming, become part of our function implementation.

domain = Set.fromList $ concat $ map Map.keys stateSets
fromList [(Power Levels,""),(Topic,""),(Create,"")]

As the state sets only have create, power level, and topic events, these are the elements (all with state key "") that are in the domain.

Next, we calculate the full state map by iterating through the domain values, and calculating the event sets for each key. This will return a list of key-value pairs.

conflict_demo and conflict_calculation:

fullStateMapList = map (\k -> (k, eventsForKey k)) $ Set.toList domain
  where
    eventsForKey key = Set.fromList $ map (Map.lookup key) stateSets
[((Power Levels,""),fromList [Just "power levels1",Just "power levels2"]),((Topic,""),fromList [Nothing,Just "topic"]),((Create,""),fromList [Just "create"])]

Finally, we partition into two lists based on whether the event set has size 1 or not. Looking at the output above, we see that the Topic set has two elements (a topic event, and Nothing), the Create set has one element, and PowerLevels set has two elements, so we would expect the Create set to go in the unconflicted state map, and the Topic and PowerLevels sets to go in the conflicted state map.

conflict_demo and conflict_calculation:

(unconflictedList, conflictedList) = partition (\(k, events) -> Set.size events == 1) fullStateMapList
([((Create,""),fromList [Just "create"])],[((Power Levels,""),fromList [Just "power levels1",Just "power levels2"]),((Topic,""),fromList [Nothing,Just "topic"])])

So we see that, as expected, the unconflicted state map is just the Create event, and the conflicted state map has the Topic event (along with Nothing since some state sets didn't have any topic set), and the PowerLevel events.

Putting this all together, along with converting the lists into the correct types:

functions:

calculateConflict :: [StateSet] -> (StateSet, (Set.HashSet Event))
calculateConflict stateSets = (unconflictedStateMap, conflictedStateMap)
  where
    <<conflict_calculation>>
    unconflictedStateMap
      -- convert single-element sets to just the element, and convert to map
      = Map.fromList
        $ map (\(k, eventSet)
                -> (k, fromJust $ head $ Set.toList eventSet)) unconflictedList
    conflictedStateMap
      -- collect all the non-Nothing elements, and convert to a set
      = Set.fromList $ catMaybes $ concat $ map (Set.toList . snd) conflictedList

conflict_demo:

calculateConflict stateSets
(fromList [((Create,""),"create")],fromList ["power levels1","topic","power levels2"])

5.1 Auth difference

The auth difference is used:

… to solve the problem where you have e.g. chains of power level events, like Alice gives Bob power (A), then Bob gives Charlie power (B) and then Charlie, say, changes the ban level (C). If you try and resolve two state sets one of which has A and the other has C, C will never pass auth since Charlie doesn't have power in A. If you pull in the auth difference (B) then you get A -> B -> C, which does pass auth. (https://github.com/matrix-org/matrix-doc/pull/1441#issuecomment-408159689)

Given some state sets, the auth difference is calculated by first calculating the full auth chain for each state set (that is, the union of the auth chains for the events in the state set) and taking every event that doesn't appear in every auth chain. In other words, if we take the full auth chains for the state sets, then the auth difference is their union minus their intersection.

functions:

fullAuthChain :: [Event] -> (Set.HashSet Event)
fullAuthChain stateSet = Set.unions $ map authChain stateSet

authDifference :: [StateSet] -> (Set.HashSet Event)
authDifference stateSets = authChainsUnion `Set.difference` authChainsIntersection
  where
    authChainsUnion = Set.unions fullAuthChains
    authChainsIntersection = foldl1' Set.intersection fullAuthChains
    fullAuthChains = map (fullAuthChain . Map.elems) stateSets

5.2 Full conflicted set

The full conflicted set is the union of the the conflicted state map and the auth difference. If we were to write this as a function, using the previously defined functions, it would look like:

functions:

fullConflictedSet :: [StateSet] -> (Set.HashSet Event)
fullConflictedSet stateSets = conflictedSet `Set.union` (authDifference stateSets)
  where
    (_, conflictedSet) = calculateConflict stateSets

However, as this is a very simple function, and uses part of the result of calculateConflict, which will be called elsewhere in the state resolution algorithm, we will not use this function, and will instead just inline the definition where it is needed.

6 Ordering

6.1 Reverse topological power ordering

When resolving state, we apply two different orderings to events. The first ordering that we apply is called reverse topological power ordering, which is a topological ordering of the events based on the auth events DAG. However, topological orderings are not necessarily unique, and all servers must use the same ordering in order to get the same result from the state resolution algorithm. Thus we will choose a specific topological ordering to use: we define an order on the events by comparing events based on their sender's power levels, then by time stamp, and finally by event ID. The topological ordering that we use will be the lexicographically smallest topological ordering out of all the topological orderings for the set of events based on the event order.

More specifically, given two topological orders \(A = a_1, a_2, \ldots, a_n\) and \(B = b_1, b_2, \ldots, b_n\) from earliest event to latest event, if \(i\) is the first index where \(A\) and \(B\) differ, then \(A\) is lexicographically smaller than \(B\) if \(a_i < b_i\) using the following definition: \(a_i < b_i\) if:

  • \(a_i\)'s sender has greater power level than \(b_i\)'s sender when looking at their respective auth events; or (if the power levels are the same)
  • \(a_i\)'s timestamp is less than \(b_i\)'s timestamp; or (if the power levels and timestamps are the same)
  • \(a_i\)'s event ID is lexicographically less than \(b_i\)'s event ID.

functions:

-- find the power level event (if any) in an e's auth events
_findPL :: Event -> Maybe Event
_findPL e = find (\x -> eventType x == PowerLevels) $ authEvents e

orderForRevTopPowOrd :: Event -> Event -> Ordering
orderForRevTopPowOrd e1 e2
  -- the <> operator, when applied to an ordering, returns the first comparison
  -- that is not equality
  = comparing power e1 e2
    <> comparing timeStamp e1 e2
    <> comparing eventId e1 e2
  where
    -- get the power level of the event's sender
    power e = _findPL e >>= powerLevels >>= (Just . (Map.lookupDefault 0 (sender e)))

We will calculate this topological ordering by using Kahn's algorithm Kahn, Arthur B. (1962), "Topological sorting of large networks", Communications of the ACM, 5 (11): 558–562, or if you can't locate a copy of that paper, see Wikipedia. functions: ] revTopPowSort events = fromJust $ kahn orderForRevTopPowOrd graph where graph = makeGraph eventId childrenOf events childrenOf event = map eventId $ prevEvents event FIXME: add example , with the modification that when we take a node from the set of nodes under consideration, we will take the node with smallest priority.

6.2 Mainline ordering

The second ordering that we apply is called mainline ordering with respect to a given power level event \(p\). The power level event that the mainline ordering will be based on is either the power level event that is chosen by an earlier step in the state resolution algorithm, or the power level in the unconflicted state map, depending on whether or not there is a conflict in the power level events. The idea for this ordering is that events that are sent using power levels that are "closer" in the DAG to \(p\) are given priority over events sent using power levels that are "farther".

The mainline of an event \(p\) is the list created by starting with \(p\) and recursively taking the power level events from its auth events. In the proposal, the list is ordered such that \(p\) is the last event. In our implementation, \(p\) will be the first event. So when we compare positions in this list, we will use the negation of the position.

Given another event \(e\), the closest mainline event of \(e\) is the first power level event encountered in the mainline of \(p\) when recursively looking at \(e\)'s power level events. In other words, it is the first event in the mainline of \(p\) that is also in the mainline of \(e\), or equivalently, the first event in the mainline of \(e\) that is also in the mainline of \(p\).

Note that in most cases the mainlines from any two events from the same room will converge at some point (at the first power level event, at the latest) with the exception of events that are sent before the first power level event. (In a most rooms, only two events are sent before the first power level event: the create event and the room creator's initial join event. The create event will never be involved in any state conflict, so this will only be relevant for the room creator's initial join event.) In some other abnormal cases, the mainlines may not converge, such as maliciously crafted events or server bugs. In the case of events that are sent before the first power level event, the mainline ordering implementation should order these events before any others. In the case of abnormal events, we want these events to be ordered earlier so that they will be superceded by other events. So in both cases, events whose mainline does not meet the mainline of \(p\) should be considered to be earlier than other events.

The mainline order first orders events based on their closest mainline events: if \(e_1\)'s closest mainline event is closer to \(p\) than \(e_2\)'s closest mainline event, then \(e_1\) is greater than \(e_2\). If \(e_1\) and \(e_2\) have the same closest mainline event, then they are ordered by timestamp, and then by event ID.

functions:The mainlineOrder function here takes advantage of a Haskell feature called "currying": calling a function in Haskell with fewer arguments than what it expects will result in a function that takes the remaining arguments (this is a bit of a lie: technically, Haskell functions only take a single argument, but we usually pretend that they take multiple arguments). Thus calling mainlineOrder p will result in a function that will compare two events by the mainline order with respect to \(p\), and which can be passed to sortBy to sort a list of events using the mainline order.

-- The mainline list of an event e.  unfoldr generates a list by iteratively
-- applying a function until it returns Nothing.  The function returns a tuple
-- where the first element is the element to add to the list and the second
-- element is the input to the next iteration.
mainline e = unfoldr (maybe Nothing (\event -> Just (event, _findPL event))) (Just e)

mainlineOrder :: Event -> Event -> Event -> Ordering
mainlineOrder p e1 e2
  = comparing mainlineDepth e1 e2
    <> comparing timeStamp e1 e2
    <> comparing eventId e1 e2
  where
    -- creates a map from event to the negation of its position in the mainline
    -- list
    mainlineMap = Map.fromList $ zip (mainline p) [0,-1..]
    -- the "closest mainline event" of e is the first event in (mainline e)
    -- that is also in (mainline p).  So iterate through the mainline e list,
    -- look up each event in mainlineMap, and return the first non-Nothing
    -- result that we find.  If there is no non-Nothing result, then return
    -- Nothing (which will sort below everything else).
    mainlineDepth e = listToMaybe $ mapMaybe (\e -> Map.lookup e mainlineMap) $ mainline e

Sorry, your browser does not support SVG.

Let us look at an example. For simplicity, we will only include power levels events in the auth events, and we will not indicate any prev events.

mainline_demo:

pl1 = ("@alice:example.com" `setsPowerLevels` [])
      { eventId = "pl1"
      , timeStamp = 1 }
pl2 = ("@alice:example.com" `setsPowerLevels` [])
      { eventId = "pl2"
      , authEvents = [pl1]
      , timeStamp = 2 }
pl3 = ("@alice:example.com" `setsPowerLevels` [])
      { eventId = "pl3"
      , authEvents = [pl1]
      , timeStamp = 4 }
pl4 = ("@alice:example.com" `setsPowerLevels` [])
      { eventId = "pl4"
      , authEvents = [pl2]
      , timeStamp = 6 }
pl5 = ("@alice:example.com" `setsPowerLevels` [])
      { eventId = "pl5"
      , authEvents = [pl4]
      , timeStamp = 6 }
pl6 = ("@alice:example.com" `setsPowerLevels` [])
      { eventId = "pl6"
      , authEvents = [pl4]
      , timeStamp = 5 }
pl7 = ("@alice:example.com" `setsPowerLevels` [])
      { eventId = "pl7"
      , authEvents = [pl2]
      , timeStamp = 5 }
pl8 = ("@alice:example.com" `setsPowerLevels` [])
      { eventId = "pl8"
      , authEvents = [pl7]
      , timeStamp = 6 }

We will look at the mainline ordering with respect to pl7, and so the mainline events are pl7, pl2, and pl1 (marked as rectangle nodes).

mainline_demo:

myMainlineOrder = mainlineOrder pl7

Looking at pl3 and pl5, the closest mainline event for pl3 is pl1, and the closest mainline event for pl5 is pl2. Since pl2 is closer to pl7 than pl1 is, pl3 < pl5.

mainline_demo:

myMainlineOrder pl3 pl5
*StateResReloaded Set Map Data.List| *StateResReloaded Set Map Data.List| LT

Looking at pl4 and pl6, they both share the same closest mainline event (pl2). So we look at the timestamps. pl4 has timestamp 6, and pl6 has timestamp 5, so pl4 > pl6.

mainline_demo:

myMainlineOrder pl4 pl6
*StateResReloaded Set Map Data.List| *StateResReloaded Set Map Data.List| GT

Looking at pl4 and pl5, they both share the same same closest mainline event (pl2). Next we look at the timestamps, but they both have timestamp 6. So we finally look at their event IDs, which results in pl4 < pl5.

mainline_demo:

myMainlineOrder pl4 pl5
*StateResReloaded Set Map Data.List| *StateResReloaded Set Map Data.List| LT

If we sort all of the events using the mainline ordering, we get the following:

mainline_demo:

sortBy myMainlineOrder [pl1, pl2, pl3, pl4, pl5, pl6, pl7, pl8]
*StateResReloaded Set Map Data.List| *StateResReloaded Set Map Data.List| ["pl1","pl3","pl2","pl6","pl4","pl5","pl7","pl8"]

We note that this is not a topological ordering, as any topological ordering would have pl4 before pl6.

7 State resolution algorithm

First, we start by calculating the unconflicted state map and the full conflicted set (the expression for calculating the full conflicted set was discussed above).

resolve:

(unconflictedStateMap, conflictedSet) = calculateConflict stateSets
fullConflictedSet = conflictedSet `Set.union` (authDifference stateSets)

The state resolution algorithm will try to resolve the state from the full conflicted set.

First, we focus on trying to resolve the power levels of the room. We do this by considering the control events from the full conflicted set. We will also add in the events that are in the full auth chain of the control events that are also in the full conflicted set.

resolve:

conflictedControlEvents = Set.filter isControlEvent fullConflictedSet
conflictedControlEventsWithAuth
  = conflictedControlEvents
    `Set.union`
    ((fullAuthChain $ Set.toList conflictedControlEvents)
      `Set.intersection`
      fullConflictedSet)

We sort these events using the reverse topological power ordering and, starting from the unconflicted state map, build up a new partially resolved state set by performing the iterative auth checks on these events.

resolve:

sortedControlEvents = revTopPowSort $ Set.toList conflictedControlEventsWithAuth
partialResolvedState = iterativeAuthChecks sortedControlEvents unconflictedStateMap

Next, we consider the remaining events from the full conflicted set.

resolve:

otherConflictedEvents = fullConflictedSet `Set.difference` conflictedControlEventsWithAuth

We sort them in ascending order using the mainline ordering with respect to the previously resolved power levels event, and then, using the iterative auth checks, we again build up a new resolved state set.

resolve:

resolvedPowerLevels = fromJust $ Map.lookup (PowerLevels, "") partialResolvedState
sortedOtherEvents = sortBy (mainlineOrder resolvedPowerLevels) $ Set.toList otherConflictedEvents
nearlyFinalState = iterativeAuthChecks sortedOtherEvents partialResolvedState

Finally, we make sure that the final state has all the events from the original unconflicted state map.

functions:

resolve :: [StateSet] -> StateSet
resolve stateSets
  = let
      <<resolve>>
    -- Map.union will take the value from the first map if a key exists in both
    -- maps.
    in unconflictedStateMap `Map.union` nearlyFinalState

FIXME: add example

Appendix A: A Haskell primer

Defining functions

Haskell heavily uses pattern matching when defining functions. For example, consider the following:

fib :: Int -> Int
fib 0 = 0
fib 1 = 1
fib n = fib (n-1) + fib (n-2)

This defines a function called fib.

The first line defines the input and output types. This is mostly optional, as Haskell will deduce the types, but in this document, I have included them in most top-level functions for clarity. This function takes an integer and returns an integer.

The remaining lines are the actual function definitions, and say that when fib is called with arguments that match the provided patterns (on the left hand side of the =), then the return value of the function is given on the right hand side of the =. Patterns that consist of literals match those literals, so the second line means "when the argument to fib is 0, then return 0". Patterns that consist of variable names match anything, and capture the value of the argument into a variable of that name for use in the right-hand side. So the fourth line means "when the argument to fib is anything, return the value of fib (n-1) + fib (n-2) where n is the value of the argument to fib". The three definitions are evaluated in order, so when fib is called, it will first check if the argument is 0, and if not, it will check if the argument is 1, and if not, then it will use the third definition.

The special pattern "_" matches anything without capturing it value.It is the same as matching with a variable, but never using that variable in the expression. This can be seen in the following definition, which defines a function that provides the functionality of an if-then-else statement:

if' True value _ = value
if' False _ value = value

This defines a function called if' that takes three arguments: a boolean and two other arguments (whose types must be the same). The pattern if' True value _ will match any call to if' where the first argument is True, and the second and third arguments are anything. The second argument is captured into the variable value, and the third argument is ignored. The function in this case just returns value, that is, the second argument. Similarly, when if' is called with False as the first argument, then it will just return its third argument, ignoring the second argument.

In the above examples, the function definitions just return a single expression. More complicated functions can be created using let and where clauses, which can define functions or variables to use in the expression. For example:

l2 (x1,y1) (x2,y2) = sqrt ((sq dx) + (sq dy))
  where
    dx = x2 - x1
    dy = y2 - y1
    sq x = x * x

l2' (x1,y1) (x2,y2)
  = let  dx = x2 - x1
         dy = y2 - y1
         sq x = x * x
    in sqrt ((sq dx) + (sq dy))

Both function definitions are equivalent, and define a function that gives the Euclidean (\(L^2\)) distance between two points. In most cases, the where and let forms can be used interchangeably. In this document, the form used will mostly depend on whether the final expression or the intermediate variables are of most interest.

Anonymous functions are written as (\ arguments -> return value). For example, an anonymous function for doubling a number is (\x -> 2*x). The arguments can also use destructuring. For example, an anonymous function for exchanging the elements in a tuple is (\(x,y) -> (y,x)).

Maybe

Haskell has a special data type called Maybe, which is used to signify an optional value. In a dynamically-typed language such as JavaScript or Python, it would be similar to a variable being allowed to have, for example, either a string value or null (in JavaScript) or None (in Python). However, since variables in Haskell can only have one type, the type Maybe T is used, where T is any other data type. If a variable is of type Maybe String, then it can have a value Just s, where s is any string, or it can have the value Nothing.

Special operators

Haskell has an >>= operator which is related to a concept called monads, which many people find confusing. However, in this document, the >>= operator is only used with the Maybe data type, which makes things simpler. Suppose that x is a value of type Maybe T1, and f is a function that takes an argument of type T1 and returns a Maybe T2. If x has value Just s, then x >>= f is equal to f s, and if x is Nothing, then x >>= f is equal to Nothing. In other words, >>= will pass through Nothing, but if the left-hand side has a value, then >>= will unwrap it from the Maybe type and apply the right-hand side. It can be seen as something like an exception handler in that returning a Nothing will cause it to skip over the remaining computations.

In Haskell, the $ operator is just a trick to avoid using parentheses. It is equivalent to inserting an open parenthesis where the $ is, and the matching close parenthesis at the end of the expression. For example, the expression

Set.fromList $ concat $ map Map.keys stateSets

is equivalent to

Set.fromList ( concat ( map Map.keys stateSets ))

The . operator represents function composition. For example, if we write

f = g . h

then evaluating f x would be the same as evaluating g (h x).

Infix functions

Haskell allows any binary function to be used as an infix operator by wrapping the function name in backticks ("`"). In this document, this is mostly used for set and map operations and for creating events where doing so makes things read more like the way you would read it in English. For example, x `Set.member` y is equivalent to Set.member x y and can be read as "x is a member of the set y".

Appendix B: Odds and ends

This appendix contains mostly internal things to make things work, so is largely uninteresting for understanding the state resolution algorithm.

We define how to display event types.

types:

instance Show EventType where
  show Create = "Create"
  show PowerLevels = "Power Levels"
  show JoinRules = "Join Rules"
  show Membership = "Membership"
  show Topic = "Topic"
  show Message = "Message"

We will consider events to be equal if they have the same ID (that is, event IDs are unique). Similarly, when displaying events, we will just display the event ID. Since we do this, we will use meaningful names as the event IDs.

types:

instance Eq Event where
  (==) e1 e2 = (eventId e1) == (eventId e2)

instance Show Event where
  show = show . eventId

Since events and event types are used in HashSet and HashMap, they must be instances of the Hashable typeclass. With event types, we use the implementation that we get for free from deriving from Generic. For events, we just hash the event ID.

types:

instance Hashable EventType

instance Hashable Event where
  hashWithSalt i = hashWithSalt i . eventId

We also create a function that we will use to generate the graph drawings in this document from our event descriptions. The function will just iterate through the prev events and auth events for each event and print out a line for each edge in graphviz's DOT format.

The function will also take a set of events to omit from the auth DAG. This will be used to, for example, omit auth edges going to the create event.

functions:

eventsToDot :: [Event] -> Set.HashSet Event -> IO ()
eventsToDot [] _ = return ()
eventsToDot (x:xs) omit = do
  printEdges x (prevEvents x) Set.empty ""
  printEdges x (authEvents x) omit " [color=red]"
  eventsToDot xs omit
  where
    printEdges _ [] _ _ = return ()
    printEdges e (x:xs) omit suffix
      | x `Set.member` omit = printEdges e xs omit suffix
      | otherwise = do
          putStr $ show e
          putStr " -> "
          putStr $ show x
          putStrLn suffix
          printEdges e xs omit suffix

Finally, we put all the code blocks together in a module, with the necessary imports.

StateResReloaded.hs:

{-# LANGUAGE DeriveGeneric #-}

module StateResReloaded
where

import GHC.Generics (Generic)
import Data.Hashable
import qualified Data.HashMap.Strict as Map
import qualified Data.HashSet as Set
import Data.Maybe (catMaybes, fromJust, fromMaybe, listToMaybe, mapMaybe)
import Data.List (find, foldl', foldl1', partition, sortBy, unfoldr)
import Data.Ord
import Kahn (makeGraph, kahn)

<<types>>

<<functions>>

Author: Hubert Chathi

Created: 2021-06-15 Tue 21:51

Validate