arithmetic_demo
:
1+1
2
means that the result of the expression 1+1
is 2
.
This is an attempted Haskell Why Haskell? Because I'm strange. implementation of the State Resolution: Reloaded proposal. This implementation is not complete yet.
Yes, this document is intended to be run. It is written in emacs with Org mode
in a literate programming style, which interleaves executable code with plain
language explanation. If you have emacs and Org mode, you can generate the
Haskell file by downloading the reloaded.org
file from
https://gitlab.com/uhoreg/matrix_state_resolution, opening it in emacs, and
doing M-x org-babel-tangle
. If you don't have emacs (or don't want to
generate the Haskell file), the above repository also contains a pre-generated
Haskell file.
Once you have the Haskell file, you can compile it and run the demos in this
document. In emacs, you can do C-c C-c
in a code block to execute
it
I had to fiddle with some of my config to get the output to
work properly. YMMV.
. Note that some demos span multiple code blocks, and you
will need to execute them in order.
Of course, this document is also intended to be read. As already mention, this
document was written in a literate programming style which mixes code and text
blocks. Hopefully the text is easily understandable. The code blocks are
prefixed by a line that indicates what the code block is for. This is the
chunk name and will be included in other code blocks when their names are
included, surrounded by double angle brackets (e.g. <<foo>>
will insert any
chunks named foo
). Chunks named types
define some sort of data type, or a
utility function for using that type. Chunks named functions
define a
function. Chunks named foo_demo
are part a demonstration named foo
.
arithmetic_demo
:
1+1
2
means that the result of the expression 1+1
is 2
.
Some of the foo_demo
code blocks are immediately followed by a second code
block. This second code block is the value of the final expression of the
first code block.
Haskell looks different from what most programmers are used to. For those unfamiliar with Haskell, Appendix A presents a brief primer on Haskell to help understand the code in this document.
In order to make the code easier to understand, I have attempted to avoid overly-idiomatic Haskell, at least in the main parts of the algorithm. Some of the utility functions may have Haskell idioms for brevity, but I have tried to limit this to functions whose implementations would be fairly obvious or that are less important. The hope is that most things should be either intuitively understandable (for those familiar with mathematical notation and with other programming languages), or ignorable.
For this algorithm, we use only a subset of the Matrix event data, so for
clarity, we use a simplified event model. Our event type will only define the
fields that are used, and move all the data to the top level. Event contents
will be a simplified version, rather than a full JSON object. For example, the
content
field of message events will just be the message text, the content
field of topic changes will be the new topic, and the content
field of a
membership event will be the membership type (join, leave, etc.). For power
levels events, we only use the assignments of power levels to users; we will
not worry about power levels required to send events, and just assume that
state events require PL 50 to send, and message events require PL 0 to send.
types
:
data EventType -- enum of event types that we will use
= Create
| PowerLevels
| JoinRules
| Membership
| Topic
| Message
deriving (Eq, Generic)
data Event -- the event data structure
= Event { eventId :: String
, eventType :: EventType
, timeStamp :: Int
, stateKey :: Maybe String
, sender :: String
, content :: String
, prevEvents :: [Event]
, authEvents :: [Event]
, powerLevels :: Maybe (Map.HashMap String Int) -- only for PowerLevels events
}
We define some convenience functions to create events.
types
: The exact definitions of these functions can be largely ignored; the
main thing is to be aware that they are defined.
newStateEvent
= Event { eventId = ""
, eventType = Create
, timeStamp = 0
, stateKey = Just ""
, sender = ""
, content = ""
, prevEvents = []
, authEvents = []
, powerLevels = Nothing
}
newMessageEvent
= Event { eventId = ""
, eventType = Message
, timeStamp = 0
, stateKey = Nothing
, sender = ""
, content = ""
, prevEvents = []
, authEvents = []
, powerLevels = Nothing
}
join user
= newStateEvent { eventType = Membership
, sender = user
, stateKey = Just user
, content = "join"
}
leave user
= newStateEvent { eventType = Membership
, sender = user
, stateKey = Just user
, content = "leave"
}
u1 `invites` u2
= newStateEvent { eventType = Membership
, sender = u1
, stateKey = Just u2
, content = "invite"
}
u1 `kicks` u2
= newStateEvent { eventType = Membership
, sender = u1
, stateKey = Just u2
, content = "leave"
}
u1 `bans` u2
= newStateEvent { eventType = Membership
, sender = u1
, stateKey = Just u2
, content = "ban"
}
user `setsPowerLevels` pl
= newStateEvent { eventType = PowerLevels
, sender = user
, powerLevels = Just $ Map.fromList pl
}
user `setsTopic` topic
= newStateEvent { eventType = Topic
, sender = user
, content = topic
}
A state set is a map from (event type,state key) pairs to events.
types
:
type StateSet = Map.HashMap (EventType,String) Event
insertEvent :: Event -> StateSet -> StateSet
insertEvent
e@(Event { eventType = eventType
, stateKey = Just stateKey })
stateSet
= Map.insert (eventType, stateKey) e stateSet
-- construct a state set from a list of events
stateSetFromEventList :: [Event] -> StateSet
stateSetFromEventList
= Map.fromList
. map (\e@(Event { eventType = eventType
, stateKey = Just stateKey }) -> ((eventType, stateKey), e))
Each event has a set of auth events, which are events that justify why it is allowed to be sent. For most events, their auth events will be current create, power levels, and join rules states, and the sender's membership.
Note that there may be some opportunity for optimization due to the fact that the entire auth chain may not be needed. It may be possible to stop at a certain point, though the hard part is in determining what that point is.The auth chain of an event is its set of auth events, along with their auth events, recursively.
functions
:
authChain :: Event -> (Set.HashSet Event)
authChain (Event { authEvents = e })
-- ((Set.fromList e):(map authChain e)) means to prepend (Set.fromList e) to
-- the list returned by (map authChain e), yielding a list of sets.
-- Set.unions calculates the union of all these sets
= Set.unions ((Set.fromList e):(map authChain e))
Black arrows represent prev events. Red arrows represent auth events. Since all events have the create event as an auth event, those arrows are omitted.
Let us look at an example where Alice creates a room and invites Bob and Carol, and then Bob joins. We will look at the auth chain for Bob's join event.
First, we start with the room creation event. Since it is the first event, it has no auth events.
auth_chain_demo
:
create = newStateEvent { eventId = "create"
, eventType = Create
, sender = "@alice:example.com"
}
Next, Alice joins the room. Since she is the room creator and the only previous event is the create event, she is allowed to join the room.
auth_chain_demo
:
alice_join = (join "@alice:example.com")
{ eventId = "alice joins"
, prevEvents = [create]
, authEvents = [create]
}
The power levels are set for the room, giving Alice admin powers. The auth events for this event are Alice's join event (she can only send the event if she is in the room), and the room creation event (the room creation event is an auth event for every event in the room) The first power level event sent to a room is a special case: a room that does not have any existing power levels will accept any power level event. .
auth_chain_demo
:
pl = ("@alice:example.com" `setsPowerLevels` [ ("@alice:example.com", 100)
])
{ eventId = "power level"
, prevEvents = [alice_join]
, authEvents = [alice_join, create]
}
The join rules are then set, and Alice invites Bob and Carol. The auth events for these events are the same: the power level event (indicating that Alice has sufficient power to send the event), Alice's join event, and the room creation event.
auth_chain_demo
:
join_rules = newStateEvent { eventId = "join rules"
, eventType = JoinRules
, sender = "@alice:example.com"
, content = "private"
, prevEvents = [pl]
, authEvents = [pl, alice_join, create]
}
invite_bob = ("@alice:example.com" `invites` "@bob:example.com")
{ eventId = "invite bob"
, prevEvents = [join_rules]
, authEvents = [pl, alice_join, create]
}
invite_carol = ("@alice:example.com" `invites` "@carol:example.com")
{ eventId = "invite carol"
, prevEvents = [invite_bob]
, authEvents = [pl, alice_join, create]
}
Finally, Bob joins the room. The auth events for this event are Bob's invitation, the join rules (indicating that Bob can join if he was invited), and the room creation event.
auth_chain_demo
:
bob_join = (join "@bob:example.com")
{ eventId = "bob joins"
, prevEvents = [invite_carol]
, authEvents = [invite_bob, join_rules, create]
}
Now we calculate the auth chain for Bob's join event. This will be the auth events for Bob's join event (Bob's invitation, the join rules, and the room creation), the auth events for Bob's invitation (the power levels, Alice's join, and the room creation), the auth events for the power levels (Alice's join and the room creation), the auth events for Alices join (the room creation), and the auth events for the room creation (nothing).
auth_chain_demo
: The fromList
in the output just means that the result is
either a set or a map created from the list that follows. In this case, it is
a set. Since this is a set, the list is not in any particular order.
authChain bob_join
fromList ["join rules","invite bob","power level","create","alice joins"]
When an event is sent to a room, it will only be accepted if the sender is authorized to send them according to the room's state set The exact rules are documented in the Authorization of PDUs section of the Matrix spec. . Users who are not in a room cannot send events other than join events; users can only join if the room is public, or if they have been invited. Events can only be sent by users who are in the room, and who have sufficient power level (in this document, we assume PL 50 is required for state events, and PL 0 for messages).
functions
:
-- is the sender in the room?
isInRoom :: String -> StateSet -> Bool
isInRoom sender stateSet
= maybe False (\x -> (content x) == "join") $ Map.lookup (Membership,sender) stateSet
-- get the power levels (if any) from the state set
stateSetPowerLevels :: StateSet -> Maybe (Map.HashMap String Int)
stateSetPowerLevels stateSet
= Map.lookup (PowerLevels,"") stateSet
>>= powerLevels
-- determine if the user has sufficient power level
hasPowerLevel :: String -> Maybe (Map.HashMap String Int) -> Int -> Bool
hasPowerLevel _ Nothing _ = True -- if the room has no power levels, then it's a free-for-all
hasPowerLevel sender (Just powerLevels) pl = (Map.lookupDefault 0 sender powerLevels) >= pl
isAuthorized :: Event -> StateSet -> Bool
-- create events are only allowed if they are the first event
isAuthorized
(Event { eventType = Create
, stateKey = Just _
, prevEvents = prevEvents})
stateSet
= null prevEvents -- null tests whether it is empty
-- FIXME: join events
-- FIXME: power level events
-- everything else: PL 50 for state events, PL 0 for messages
isAuthorized
(Event { stateKey = stateKey
, sender = sender })
stateSet
= isInRoom sender stateSet
&& hasPowerLevel sender (stateSetPowerLevels stateSet) (if (stateKey == Nothing) then 0 else 50)
When checking whether an event is authorized in the state resolution algorithm (in particular, in the iterative auth checks algorithm below), the current state set might not be complete, since the (event type, state key) pair may be one that needs to be resolved by the state resolution algorithm. In this case, the missing state events are taken from the auth events of the event under consideration. We can do this by augmenting the state set with the event's auth events when there is a key missing.
functions
:
isAuthorized' :: Event -> StateSet -> Bool
isAuthorized' e stateSet = isAuthorized e augmentedStateSet
where
-- fold takes a function, initial accumulator value, and a list, and then
-- iterates over the values in the list, calling the function to update the
-- accumulator, and returns the final value
augmentedStateSet = foldl' insertMissingState stateSet $ authEvents e
insertMissingState stateSet e@(Event { eventType = eventType
, stateKey = Just stateKey })
= if (eventType, stateKey) `Map.member` stateSet
then stateSet
else Map.insert (eventType, stateKey) e stateSet
The iterative auth checks algorithm takes a list of events and and initial state set, and iterates through the events, updating the state set if the event is authorized.
functions
:
iterativeAuthChecks :: [Event] -> StateSet -> StateSet
iterativeAuthChecks events stateSet = foldl' addStateIfAuthorized stateSet events
where
addStateIfAuthorized stateSet e
= if isAuthorized' e stateSet
then insertEvent e stateSet
else stateSet
Control events (called power events in previous versions, not to be confused with power levels events) are events that have the potential to remove the ability of another user to do something. Power levels, join rules, bans and kicks are control events Technically, normal leave events are also control events, but they are omitted in order to avoid breaking delta state resolution, and it is less likely that someone would attempt to reverse their own leave event through a malicious state reset. (https://matrix.to/#/!WoCKDnRBaGUpRPqTbx:jki.re/$1532339909206jhdSP:jki.re) .
functions
:
isControlEvent :: Event -> Bool
isControlEvent (Event { eventType = PowerLevels
, stateKey = Just _ }) = True
isControlEvent (Event { eventType = JoinRules
, stateKey = Just _ }) = True
isControlEvent (Event { eventType = Membership
, sender = sender
, content = "ban"
, stateKey = Just stateKey }) = sender /= stateKey
-- a kick is a leave event where the sender of the event is not the same as the
-- person leaving (the state key)
isControlEvent (Event { eventType = Membership
, sender = sender
, content = "leave"
, stateKey = Just stateKey }) = sender /= stateKey
-- everything else is not a control event
isControlEvent _ = False
The state resolution algorithm operates in four stages:
The unconflicted state map of some state sets is the state set where each key
exists and is the same in every state set. The conflicted state map is the
set of all the events that are not in the unconfliced state map. Naturally,
the unconflicted state map contains the state events that do not need to be
resolved, while the conflicted state map contains the state events that do need
to be resolved. For each key in the conflicted state map, the state resolution
algorithm will pick one event (or Nothing
It will pick Nothing
if none of the events are authorized
according to the auth checks.
) for that key to be part of the
resolved state.
Another way of looking at this is to define the full state map \(F_S(t,k)\) of
some state sets \(S=\{S_1, S_2, \ldots\}\) to be a mapping from (event type
(\(t\)), state key (\(k\))) pairs to sets of events along with, possibly, the
Nothing
element. The domain of \(F\) (the set of keys in the mapping) is the
union of the domain of each \(S_i\), and \(F_S(t,k)=\cup \{ S_i(t,k) \}\), where
\(S_i(t,k)\) is the state event in state set \(S_i\) for the \((t,k)\) pair, or
Nothing
if \(S_i\) has no such state event. Then the unconflicted state map is
the sub-map of \(F_S\) where \(F_S(t,k)\) has one element (modified so that the
unconflicted state map returns the single element rather than a set of one
element), and the conflicted state map is the union of all \(F_S\) where
\(F_S(t,k)\) has more than one element, minus the Nothing
element.
To demonstate how this works, consider three state sets \(S_1, S_2\), and \(S_3\).
All three state sets reference the same create event. \(S_1\)'s power levels
event (\(pl_1\)) will be one in which only @alice:example.com
has non-default
power levels, and both \(S_2\) and \(S_3\)'s power levels event (\(pl_2\)) will be
one in which @alice:example.com
and @bob:example.com
both have non-default
power levels. \(S_2\) will have a topic event, while \(S_1\) and \(S_3\) will both
have no topic event. First, we create the events and the state sets:
conflict_demo
:
create = newStateEvent { eventId = "create"
, eventType = Create
, sender = "@alice:example.com"
}
pl1 = ("@alice:example.com" `setsPowerLevels` [ ("@alice:example.com", 100)
])
{ eventId = "power levels1" }
pl2 = ("@alice:example.com" `setsPowerLevels` [ ("@alice:example.com", 100)
, ("@bob:example.com", 50)
])
{ eventId = "power levels2"}
topic = ("@bob:example.com" `setsTopic` "This is a topic")
{ eventId = "topic"}
stateSet1 = stateSetFromEventList [ create, pl1 ]
stateSet2 = stateSetFromEventList [ create, pl2, topic ]
stateSet3 = stateSetFromEventList [ create, pl2 ]
stateSets = [ stateSet1, stateSet2, stateSet3 ]
Next, we calculate the domain by getting the domains from all the state sets
and putting them all in a HashMap
.
conflict_demo
and conflict_calculation
:The following code chunks
will both serve as a demo and, through the magic of literate programming,
become part of our function implementation.
domain = Set.fromList $ concat $ map Map.keys stateSets
fromList [(Power Levels,""),(Topic,""),(Create,"")]
As the state sets only have create, power level, and topic events, these are the elements (all with state key "") that are in the domain.
Next, we calculate the full state map by iterating through the domain values, and calculating the event sets for each key. This will return a list of key-value pairs.
conflict_demo
and conflict_calculation
:
fullStateMapList = map (\k -> (k, eventsForKey k)) $ Set.toList domain
where
eventsForKey key = Set.fromList $ map (Map.lookup key) stateSets
[((Power Levels,""),fromList [Just "power levels1",Just "power levels2"]),((Topic,""),fromList [Nothing,Just "topic"]),((Create,""),fromList [Just "create"])]
Finally, we partition into two lists based on whether the event set has size 1
or not. Looking at the output above, we see that the Topic
set has two
elements (a topic event, and Nothing
), the Create
set has one element, and
PowerLevels
set has two elements, so we would expect the Create
set to go
in the unconflicted state map, and the Topic
and PowerLevels
sets to go in
the conflicted state map.
conflict_demo
and conflict_calculation
:
(unconflictedList, conflictedList) = partition (\(k, events) -> Set.size events == 1) fullStateMapList
([((Create,""),fromList [Just "create"])],[((Power Levels,""),fromList [Just "power levels1",Just "power levels2"]),((Topic,""),fromList [Nothing,Just "topic"])])
So we see that, as expected, the unconflicted state map is just the Create
event, and the conflicted state map has the Topic
event (along with Nothing
since some state sets didn't have any topic set), and the PowerLevel
events.
Putting this all together, along with converting the lists into the correct types:
functions
:
calculateConflict :: [StateSet] -> (StateSet, (Set.HashSet Event))
calculateConflict stateSets = (unconflictedStateMap, conflictedStateMap)
where
<<conflict_calculation>>
unconflictedStateMap
-- convert single-element sets to just the element, and convert to map
= Map.fromList
$ map (\(k, eventSet)
-> (k, fromJust $ head $ Set.toList eventSet)) unconflictedList
conflictedStateMap
-- collect all the non-Nothing elements, and convert to a set
= Set.fromList $ catMaybes $ concat $ map (Set.toList . snd) conflictedList
conflict_demo
:
calculateConflict stateSets
(fromList [((Create,""),"create")],fromList ["power levels1","topic","power levels2"])
The auth difference is used:
… to solve the problem where you have e.g. chains of power level events, like Alice gives Bob power (A), then Bob gives Charlie power (B) and then Charlie, say, changes the ban level (C). If you try and resolve two state sets one of which has A and the other has C, C will never pass auth since Charlie doesn't have power in A. If you pull in the auth difference (B) then you get A -> B -> C, which does pass auth. (https://github.com/matrix-org/matrix-doc/pull/1441#issuecomment-408159689)
Given some state sets, the auth difference is calculated by first calculating the full auth chain for each state set (that is, the union of the auth chains for the events in the state set) and taking every event that doesn't appear in every auth chain. In other words, if we take the full auth chains for the state sets, then the auth difference is their union minus their intersection.
functions
:
fullAuthChain :: [Event] -> (Set.HashSet Event)
fullAuthChain stateSet = Set.unions $ map authChain stateSet
authDifference :: [StateSet] -> (Set.HashSet Event)
authDifference stateSets = authChainsUnion `Set.difference` authChainsIntersection
where
authChainsUnion = Set.unions fullAuthChains
authChainsIntersection = foldl1' Set.intersection fullAuthChains
fullAuthChains = map (fullAuthChain . Map.elems) stateSets
The full conflicted set is the union of the the conflicted state map and the auth difference. If we were to write this as a function, using the previously defined functions, it would look like:
functions
:
fullConflictedSet :: [StateSet] -> (Set.HashSet Event)
fullConflictedSet stateSets = conflictedSet `Set.union` (authDifference stateSets)
where
(_, conflictedSet) = calculateConflict stateSets
However, as this is a very simple function, and uses part of the result of
calculateConflict
, which will be called elsewhere in the state resolution
algorithm, we will not use this function, and will instead just inline the
definition where it is needed.
When resolving state, we apply two different orderings to events. The first ordering that we apply is called reverse topological power ordering, which is a topological ordering of the events based on the auth events DAG. However, topological orderings are not necessarily unique, and all servers must use the same ordering in order to get the same result from the state resolution algorithm. Thus we will choose a specific topological ordering to use: we define an order on the events by comparing events based on their sender's power levels, then by time stamp, and finally by event ID. The topological ordering that we use will be the lexicographically smallest topological ordering out of all the topological orderings for the set of events based on the event order.
More specifically, given two topological orders \(A = a_1, a_2, \ldots, a_n\) and \(B = b_1, b_2, \ldots, b_n\) from earliest event to latest event, if \(i\) is the first index where \(A\) and \(B\) differ, then \(A\) is lexicographically smaller than \(B\) if \(a_i < b_i\) using the following definition: \(a_i < b_i\) if:
functions
:
-- find the power level event (if any) in an e's auth events
_findPL :: Event -> Maybe Event
_findPL e = find (\x -> eventType x == PowerLevels) $ authEvents e
orderForRevTopPowOrd :: Event -> Event -> Ordering
orderForRevTopPowOrd e1 e2
-- the <> operator, when applied to an ordering, returns the first comparison
-- that is not equality
= comparing power e1 e2
<> comparing timeStamp e1 e2
<> comparing eventId e1 e2
where
-- get the power level of the event's sender
power e = _findPL e >>= powerLevels >>= (Just . (Map.lookupDefault 0 (sender e)))
We will calculate this topological ordering by using Kahn's algorithm
Kahn, Arthur B. (1962), "Topological sorting of large networks",
Communications of the ACM, 5 (11): 558–562, or if you can't locate a copy
of that paper, see Wikipedia.
functions
:
]
revTopPowSort events = fromJust $ kahn orderForRevTopPowOrd graph
where
graph = makeGraph eventId childrenOf events
childrenOf event = map eventId $ prevEvents event
FIXME: add example
,
with the modification that when we take a node from the set of nodes under
consideration, we will take the node with smallest priority.
The second ordering that we apply is called mainline ordering with respect to a given power level event \(p\). The power level event that the mainline ordering will be based on is either the power level event that is chosen by an earlier step in the state resolution algorithm, or the power level in the unconflicted state map, depending on whether or not there is a conflict in the power level events. The idea for this ordering is that events that are sent using power levels that are "closer" in the DAG to \(p\) are given priority over events sent using power levels that are "farther".
The mainline of an event \(p\) is the list created by starting with \(p\) and recursively taking the power level events from its auth events. In the proposal, the list is ordered such that \(p\) is the last event. In our implementation, \(p\) will be the first event. So when we compare positions in this list, we will use the negation of the position.
Given another event \(e\), the closest mainline event of \(e\) is the first power level event encountered in the mainline of \(p\) when recursively looking at \(e\)'s power level events. In other words, it is the first event in the mainline of \(p\) that is also in the mainline of \(e\), or equivalently, the first event in the mainline of \(e\) that is also in the mainline of \(p\).
Note that in most cases the mainlines from any two events from the same room will converge at some point (at the first power level event, at the latest) with the exception of events that are sent before the first power level event. (In a most rooms, only two events are sent before the first power level event: the create event and the room creator's initial join event. The create event will never be involved in any state conflict, so this will only be relevant for the room creator's initial join event.) In some other abnormal cases, the mainlines may not converge, such as maliciously crafted events or server bugs. In the case of events that are sent before the first power level event, the mainline ordering implementation should order these events before any others. In the case of abnormal events, we want these events to be ordered earlier so that they will be superceded by other events. So in both cases, events whose mainline does not meet the mainline of \(p\) should be considered to be earlier than other events.
The mainline order first orders events based on their closest mainline events: if \(e_1\)'s closest mainline event is closer to \(p\) than \(e_2\)'s closest mainline event, then \(e_1\) is greater than \(e_2\). If \(e_1\) and \(e_2\) have the same closest mainline event, then they are ordered by timestamp, and then by event ID.
functions
:The mainlineOrder
function here takes advantage of a
Haskell feature called "currying": calling a function in Haskell with fewer
arguments than what it expects will result in a function that takes the
remaining arguments (this is a bit of a lie: technically, Haskell functions
only take a single argument, but we usually pretend that they take multiple
arguments). Thus calling mainlineOrder p
will result in a function that will
compare two events by the mainline order with respect to \(p\), and which can be
passed to sortBy
to sort a list of events using the mainline order.
-- The mainline list of an event e. unfoldr generates a list by iteratively
-- applying a function until it returns Nothing. The function returns a tuple
-- where the first element is the element to add to the list and the second
-- element is the input to the next iteration.
mainline e = unfoldr (maybe Nothing (\event -> Just (event, _findPL event))) (Just e)
mainlineOrder :: Event -> Event -> Event -> Ordering
mainlineOrder p e1 e2
= comparing mainlineDepth e1 e2
<> comparing timeStamp e1 e2
<> comparing eventId e1 e2
where
-- creates a map from event to the negation of its position in the mainline
-- list
mainlineMap = Map.fromList $ zip (mainline p) [0,-1..]
-- the "closest mainline event" of e is the first event in (mainline e)
-- that is also in (mainline p). So iterate through the mainline e list,
-- look up each event in mainlineMap, and return the first non-Nothing
-- result that we find. If there is no non-Nothing result, then return
-- Nothing (which will sort below everything else).
mainlineDepth e = listToMaybe $ mapMaybe (\e -> Map.lookup e mainlineMap) $ mainline e
Let us look at an example. For simplicity, we will only include power levels events in the auth events, and we will not indicate any prev events.
mainline_demo
:
pl1 = ("@alice:example.com" `setsPowerLevels` [])
{ eventId = "pl1"
, timeStamp = 1 }
pl2 = ("@alice:example.com" `setsPowerLevels` [])
{ eventId = "pl2"
, authEvents = [pl1]
, timeStamp = 2 }
pl3 = ("@alice:example.com" `setsPowerLevels` [])
{ eventId = "pl3"
, authEvents = [pl1]
, timeStamp = 4 }
pl4 = ("@alice:example.com" `setsPowerLevels` [])
{ eventId = "pl4"
, authEvents = [pl2]
, timeStamp = 6 }
pl5 = ("@alice:example.com" `setsPowerLevels` [])
{ eventId = "pl5"
, authEvents = [pl4]
, timeStamp = 6 }
pl6 = ("@alice:example.com" `setsPowerLevels` [])
{ eventId = "pl6"
, authEvents = [pl4]
, timeStamp = 5 }
pl7 = ("@alice:example.com" `setsPowerLevels` [])
{ eventId = "pl7"
, authEvents = [pl2]
, timeStamp = 5 }
pl8 = ("@alice:example.com" `setsPowerLevels` [])
{ eventId = "pl8"
, authEvents = [pl7]
, timeStamp = 6 }
We will look at the mainline ordering with respect to pl7
, and so the
mainline events are pl7
, pl2
, and pl1
(marked as rectangle nodes).
mainline_demo
:
myMainlineOrder = mainlineOrder pl7
Looking at pl3
and pl5
, the closest mainline event for pl3 is pl1, and the
closest mainline event for pl5
is pl2
. Since pl2
is closer to pl7
than
pl1
is, pl3
< pl5
.
mainline_demo
:
myMainlineOrder pl3 pl5
*StateResReloaded Set Map Data.List| *StateResReloaded Set Map Data.List| LT
Looking at pl4
and pl6
, they both share the same closest mainline event
(pl2
). So we look at the timestamps. pl4
has timestamp 6, and pl6
has
timestamp 5, so pl4
> pl6
.
mainline_demo
:
myMainlineOrder pl4 pl6
*StateResReloaded Set Map Data.List| *StateResReloaded Set Map Data.List| GT
Looking at pl4
and pl5
, they both share the same same closest mainline
event (pl2
). Next we look at the timestamps, but they both have timestamp 6.
So we finally look at their event IDs, which results in pl4
< pl5
.
mainline_demo
:
myMainlineOrder pl4 pl5
*StateResReloaded Set Map Data.List| *StateResReloaded Set Map Data.List| LT
If we sort all of the events using the mainline ordering, we get the following:
mainline_demo
:
sortBy myMainlineOrder [pl1, pl2, pl3, pl4, pl5, pl6, pl7, pl8]
*StateResReloaded Set Map Data.List| *StateResReloaded Set Map Data.List| ["pl1","pl3","pl2","pl6","pl4","pl5","pl7","pl8"]
We note that this is not a topological ordering, as any topological ordering
would have pl4
before pl6
.
First, we start by calculating the unconflicted state map and the full conflicted set (the expression for calculating the full conflicted set was discussed above).
resolve
:
(unconflictedStateMap, conflictedSet) = calculateConflict stateSets
fullConflictedSet = conflictedSet `Set.union` (authDifference stateSets)
The state resolution algorithm will try to resolve the state from the full conflicted set.
First, we focus on trying to resolve the power levels of the room. We do this by considering the control events from the full conflicted set. We will also add in the events that are in the full auth chain of the control events that are also in the full conflicted set.
resolve
:
conflictedControlEvents = Set.filter isControlEvent fullConflictedSet
conflictedControlEventsWithAuth
= conflictedControlEvents
`Set.union`
((fullAuthChain $ Set.toList conflictedControlEvents)
`Set.intersection`
fullConflictedSet)
We sort these events using the reverse topological power ordering and, starting from the unconflicted state map, build up a new partially resolved state set by performing the iterative auth checks on these events.
resolve
:
sortedControlEvents = revTopPowSort $ Set.toList conflictedControlEventsWithAuth
partialResolvedState = iterativeAuthChecks sortedControlEvents unconflictedStateMap
Next, we consider the remaining events from the full conflicted set.
resolve
:
otherConflictedEvents = fullConflictedSet `Set.difference` conflictedControlEventsWithAuth
We sort them in ascending order using the mainline ordering with respect to the previously resolved power levels event, and then, using the iterative auth checks, we again build up a new resolved state set.
resolve
:
resolvedPowerLevels = fromJust $ Map.lookup (PowerLevels, "") partialResolvedState
sortedOtherEvents = sortBy (mainlineOrder resolvedPowerLevels) $ Set.toList otherConflictedEvents
nearlyFinalState = iterativeAuthChecks sortedOtherEvents partialResolvedState
Finally, we make sure that the final state has all the events from the original unconflicted state map.
functions
:
resolve :: [StateSet] -> StateSet
resolve stateSets
= let
<<resolve>>
-- Map.union will take the value from the first map if a key exists in both
-- maps.
in unconflictedStateMap `Map.union` nearlyFinalState
FIXME: add example
Haskell heavily uses pattern matching when defining functions. For example, consider the following:
fib :: Int -> Int
fib 0 = 0
fib 1 = 1
fib n = fib (n-1) + fib (n-2)
This defines a function called fib
.
The first line defines the input and output types. This is mostly optional, as Haskell will deduce the types, but in this document, I have included them in most top-level functions for clarity. This function takes an integer and returns an integer.
The remaining lines are the actual function definitions, and say that when
fib
is called with arguments that match the provided patterns (on the left
hand side of the =
), then the return value of the function is given on the
right hand side of the =
. Patterns that consist of literals match those
literals, so the second line means "when the argument to fib
is 0, then
return 0". Patterns that consist of variable names match anything, and capture
the value of the argument into a variable of that name for use in the
right-hand side. So the fourth line means "when the argument to fib
is
anything, return the value of fib (n-1) + fib (n-2)
where n
is the value of
the argument to fib
". The three definitions are evaluated in order, so when
fib
is called, it will first check if the argument is 0, and if not, it will
check if the argument is 1, and if not, then it will use the third definition.
The special pattern "_
" matches anything without capturing it
value.It is the same as matching with a variable, but never using that
variable in the expression. This can be seen in the following definition,
which defines a function that provides the functionality of an if-then-else
statement:
if' True value _ = value
if' False _ value = value
This defines a function called if'
that takes three arguments: a boolean and
two other arguments (whose types must be the same). The pattern if' True value _
will match any call to if'
where the first argument is True
, and the second
and third arguments are anything. The second argument is captured into the
variable value
, and the third argument is ignored. The function in this case
just returns value
, that is, the second argument. Similarly, when if'
is
called with False
as the first argument, then it will just return its third
argument, ignoring the second argument.
In the above examples, the function definitions just return a single
expression. More complicated functions can be created using let
and where
clauses, which can define functions or variables to use in the expression. For
example:
l2 (x1,y1) (x2,y2) = sqrt ((sq dx) + (sq dy))
where
dx = x2 - x1
dy = y2 - y1
sq x = x * x
l2' (x1,y1) (x2,y2)
= let dx = x2 - x1
dy = y2 - y1
sq x = x * x
in sqrt ((sq dx) + (sq dy))
Both function definitions are equivalent, and define a function that gives the
Euclidean (\(L^2\)) distance between two points. In most cases, the where
and
let
forms can be used interchangeably. In this document, the form used will
mostly depend on whether the final expression or the intermediate variables are
of most interest.
Anonymous functions are written as (\ arguments -> return value)
. For
example, an anonymous function for doubling a number is (\x -> 2*x)
. The
arguments can also use destructuring. For example, an anonymous function for
exchanging the elements in a tuple is (\(x,y) -> (y,x))
.
Haskell has a special data type called Maybe
, which is used to signify an
optional value. In a dynamically-typed language such as JavaScript or Python,
it would be similar to a variable being allowed to have, for example, either a
string value or null
(in JavaScript) or None
(in Python). However, since
variables in Haskell can only have one type, the type Maybe T
is used, where
T
is any other data type. If a variable is of type Maybe String
, then it
can have a value Just s
, where s
is any string, or it can have the value
Nothing
.
Haskell has an >>=
operator which is related to a concept called monads,
which many people find confusing. However, in this document, the >>=
operator is only used with the Maybe
data type, which makes things simpler.
Suppose that x
is a value of type Maybe T1
, and f
is a function that
takes an argument of type T1
and returns a Maybe T2
. If x
has value
Just s
, then x >>= f
is equal to f s
, and if x
is Nothing
, then x
>>= f
is equal to Nothing
. In other words, >>=
will pass through
Nothing
, but if the left-hand side has a value, then >>=
will unwrap it
from the Maybe
type and apply the right-hand side. It can be seen as
something like an exception handler in that returning a Nothing
will cause it
to skip over the remaining computations.
In Haskell, the $
operator is just a trick to avoid using parentheses. It is
equivalent to inserting an open parenthesis where the $
is, and the matching
close parenthesis at the end of the expression. For example, the expression
Set.fromList $ concat $ map Map.keys stateSets
is equivalent to
Set.fromList ( concat ( map Map.keys stateSets ))
The .
operator represents function composition. For example, if we write
f = g . h
then evaluating f x
would be the same as evaluating g (h x)
.
Haskell allows any binary function to be used as an infix operator by wrapping
the function name in backticks ("`"). In this document, this is mostly used
for set and map operations and for creating events where doing so makes things
read more like the way you would read it in English. For example, x
`Set.member` y
is equivalent to Set.member x y
and can be read as "x is a
member of the set y".
This appendix contains mostly internal things to make things work, so is largely uninteresting for understanding the state resolution algorithm.
We define how to display event types.
types
:
instance Show EventType where
show Create = "Create"
show PowerLevels = "Power Levels"
show JoinRules = "Join Rules"
show Membership = "Membership"
show Topic = "Topic"
show Message = "Message"
We will consider events to be equal if they have the same ID (that is, event IDs are unique). Similarly, when displaying events, we will just display the event ID. Since we do this, we will use meaningful names as the event IDs.
types
:
instance Eq Event where
(==) e1 e2 = (eventId e1) == (eventId e2)
instance Show Event where
show = show . eventId
Since events and event types are used in HashSet
and HashMap
, they must be
instances of the Hashable
typeclass. With event types, we use the
implementation that we get for free from deriving from Generic
. For events,
we just hash the event ID.
types
:
instance Hashable EventType
instance Hashable Event where
hashWithSalt i = hashWithSalt i . eventId
We also create a function that we will use to generate the graph drawings in this document from our event descriptions. The function will just iterate through the prev events and auth events for each event and print out a line for each edge in graphviz's DOT format.
The function will also take a set of events to omit from the auth DAG. This will be used to, for example, omit auth edges going to the create event.
functions
:
eventsToDot :: [Event] -> Set.HashSet Event -> IO ()
eventsToDot [] _ = return ()
eventsToDot (x:xs) omit = do
printEdges x (prevEvents x) Set.empty ""
printEdges x (authEvents x) omit " [color=red]"
eventsToDot xs omit
where
printEdges _ [] _ _ = return ()
printEdges e (x:xs) omit suffix
| x `Set.member` omit = printEdges e xs omit suffix
| otherwise = do
putStr $ show e
putStr " -> "
putStr $ show x
putStrLn suffix
printEdges e xs omit suffix
Finally, we put all the code blocks together in a module, with the necessary imports.
StateResReloaded.hs
:
{-# LANGUAGE DeriveGeneric #-}
module StateResReloaded
where
import GHC.Generics (Generic)
import Data.Hashable
import qualified Data.HashMap.Strict as Map
import qualified Data.HashSet as Set
import Data.Maybe (catMaybes, fromJust, fromMaybe, listToMaybe, mapMaybe)
import Data.List (find, foldl', foldl1', partition, sortBy, unfoldr)
import Data.Ord
import Kahn (makeGraph, kahn)
<<types>>
<<functions>>
Created: 2021-06-15 Tue 21:51