Sunday, October 15, 2017

Eventual consistency does not exist

There's a lot of talk about reactive systems and reactive coding nowadays. It's very hot and very trendy, and there are a bunch of cool new technologies related to it. We are led to believe that reactive platforms are fundamentally built on an event-driven architecture. Several new platforms are evolving from this pattern and a lot of fun new things are being done with it. I think there are also a lot of misconceptions about it, and too much religious belief.

Event-driven architecture differs from what you might be used to if you have worked with request/response systems. Its mental model is roughly orthogonal to request/response in terms of execution. With a request/response system you lock each resource by calling it and making it wait until you are done. This creates a kind of "synchronization", because all the systems involved in that call appear to be updated at the "same time". (This is not entirely true once more than one system is involved.)
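To make the contrast concrete, here is a minimal Java sketch of the two styles. The InventoryService, the queue and the event payload are my own illustrative stand-ins, not anything from a specific platform; the point is only that the request/response caller waits for the effect, while the event-driven caller does not.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Minimal sketch of the two styles. "InventoryService" and the event queue
// are hypothetical stand-ins for illustration only.
public class RequestVsEvent {

    // Request/response: the caller blocks until the downstream update is done,
    // so both sides appear to be updated at the "same time".
    static class InventoryService {
        private int stock = 10;
        synchronized int reserve(int amount) {
            stock -= amount;          // the caller waits for this to complete
            return stock;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        InventoryService service = new InventoryService();
        int remaining = service.reserve(2);   // synchronous: result is known here
        System.out.println("request/response, stock now: " + remaining);

        // Event-driven: the caller only enqueues an event and moves on.
        // The downstream consumer updates its own state some time later.
        BlockingQueue<String> events = new LinkedBlockingQueue<>();
        Thread consumer = new Thread(() -> {
            try {
                String event = events.take();  // processed whenever the consumer gets to it
                System.out.println("event-driven, consumer saw: " + event);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        events.put("reserved 2 items");        // the caller does not wait for the effect
        System.out.println("event-driven, event published, caller moves on");
        consumer.join();
    }
}
```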

I don't think event-driven can or will work as a multipurpose platform that solves all your problems in a satisfying way. I think most advocates of this pattern are selling something where you, as a developer, will either have to stick your head in the sand or quit your job to avoid cleaning up the mess these systems create.

Everyone always tries to sell you the "silver bullet" for all of your problems. If there were a universal system for everything, we would be out of a job.

Eventual consistency does not exist in the real world. Here's why.

The short version: a system can't be eventually consistent if there is no one with the information needed to tell whether the system is consistent or not. And if there is an observer who can tell that it's consistent, then it is immediately consistent, because the observer already has the correct (consistent) information to decide it. (I like to think of this as the Schrödinger's cat scenario, just the moment after opening the box.)

Something is only eventually consistent if there is something transitioning between states and there is an observer that, whenever it looks at a particular state, immediately knows how to react properly to that consistent state. If the observer, by definition, knows that the system is thoroughly consistent, meaning that every event that has happened up to the point the observer looks at the system has occurred in order relative to every other event, in a predetermined (sequential) order, then the system is consistent. This means the system has to be immediately consistent at some point in time, and any other state is not consistent. The mutual ordering can only be relaxed if the producers have no dependency on each other.

The kicker is that, in the real world, it really doesn't matter whether there is an observer to tell if the world is consistent or not; it is still going to be consistent, you as an observer just don't know it yet.

I'd like to make a distinction here: an observer that merely receives information is a consumer, but the second the observer acts on the information, by passing it on or making an "informed" decision, it becomes an "interactor". In a computer system, observers cannot deal with all possible scenarios by adapting to the current event; events have to be dealt with in some imposed order. The events will thereby be delayed in relation to other events (unless they are artificially synchronized). This is in stark contrast to the real world, where none of this is necessary.

In the real world, every potential "observer" can deal with all possible scenarios, so there is no "transition" that would imply eventual consistency.

An example of this:

You have two number generators, A and B. The assignment is to have A and B produce random numbers for a consumer R. Random, by definition, implies no mutual ordering between a number produced by A and one produced by B, between two numbers produced by A, or between two numbers produced by B. This system could be called "eventually consistent" because there is no state (no mutual ordering) kept in the system. However, because there is no state to share, there can be no consistency either: consistency is undefined, since as an observer I don't know what state the system should be in. That means that R, our consumer, can't tell anything about the system (there is no consistent state to recognize). To be able to call something consistent you need to know the expected outcome at any given sample, which by definition you cannot know for randomly produced numbers. Ergo, this system is neither consistent nor eventually consistent, because we don't have enough information about what state the system needs to be in to be consistent.
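Here is a minimal Java sketch of that setup, with A and B as threads pushing into a shared queue; the names and the queue are my own assumptions. Notice that there is nothing R could possibly check to decide whether what it has seen is "correct".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Two random producers A and B and a consumer R. The names and the queue are
// illustrative assumptions; the point is that R has no rule it could use to
// decide whether the numbers it has seen form a "consistent" state.
public class RandomProducers {

    static Thread producer(String name, BlockingQueue<Integer> queue) {
        return new Thread(() -> {
            Random random = new Random();
            for (int i = 0; i < 5; i++) {
                try {
                    queue.put(random.nextInt(100)); // no ordering, no shared state
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }, name);
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        Thread a = producer("A", queue);
        Thread b = producer("B", queue);
        a.start();
        b.start();
        a.join();
        b.join();

        // R drains whatever arrived. There is no expected value or order,
        // so "consistent" is simply undefined for what R observes.
        List<Integer> observedByR = new ArrayList<>();
        queue.drainTo(observedByR);
        System.out.println("R observed: " + observedByR);
    }
}
```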

There is another system here as well, unrelated to the system of producing random numbers: the rate at which numbers are produced. This could, in itself, be consistent or not consistent, but it is unrelated to the example above. That problem is not eventually consistent but immediately consistent, if the producer/consumer rate is what we care about.

Note that this example is not the same if there is only one generator; that falls into a completely different category.

Second example
You have two number generators A and B, and the assignment is to have A and B generate a sequence of numbers. A sequence of numbers implies, per definition, some sort of order, in this case the natural order. We can solve this problem in several ways and still reach the correct (consistent) result; one way is to have A and B both produce numbers and introduce a receiver R. R must now somehow keep information (state) about what A and B have done and when. Because A and B don't share their state, R has to interpret that supposed state and keep it correct and consistent in relation to everything else happening in the system. The only part of this system that can say whether the system is consistent is R, because only R holds the definition of what is correct. A and B, since they don't share state, know nothing about the current state except their own.

So if R is now able to tell the state of the system, or in this case the "eventual" state, with R = {1,2,3}, does R know that a 4 is coming? Well, if it expects another value, then the state R = {1,2,3} is not valid as a final state, so we don't have an "eventual" state yet and we have to wait before we can tell that the state is correct. If R = {1,2,4}, we know we are in an incorrect state, since we expect a 3 before a 4. And in the extreme case where the correct state is all natural numbers, we will never be in any consistent state, because a consistent state is never reached (it isn't even defined). If we instead define that ending at 3 is the correct state, given the nature of natural numbers, then we are immediately consistent the moment we see 3. I want to point out that there is information here not described by the system itself, namely the "natural order of numbers". This is an implicit requirement, because R cannot ask A or B for 4; R has no idea whether 4 was actually sent, and may therefore have an incomplete picture of the situation. That picture could be entirely correct, though, and not an issue at this time.
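A minimal sketch of what R has to do, assuming (and this is exactly the implicit requirement above) that R knows it should end up with the consecutive numbers 1..N. The class and method names are my own illustration.

```java
import java.util.Set;
import java.util.TreeSet;

// A receiver R that knows, as an implicit external requirement, that the
// correct state is the consecutive natural numbers 1..expectedMax. The class
// name and the expectedMax parameter are illustrative assumptions.
public class SequenceReceiver {
    private final Set<Integer> seen = new TreeSet<>();
    private final int expectedMax;

    SequenceReceiver(int expectedMax) {
        this.expectedMax = expectedMax;
    }

    void receive(int number) {
        seen.add(number);
    }

    // R is only "consistent" once every number 1..expectedMax has arrived,
    // i.e. there are no gaps. Anything else is just an intermediate state.
    boolean isConsistent() {
        if (seen.size() != expectedMax) {
            return false;
        }
        int expected = 1;
        for (int n : seen) {            // TreeSet iterates in natural order
            if (n != expected++) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        SequenceReceiver r = new SequenceReceiver(4);
        r.receive(1);
        r.receive(2);
        r.receive(4);                          // 4 arrived before 3
        System.out.println(r.isConsistent());  // false: {1,2,4} has a gap
        r.receive(3);
        System.out.println(r.isConsistent());  // true: {1,2,3,4}
    }
}
```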

The presence of synchronization.

Some people compare "eventual consistency" with the real world, for example comparing the Sun's light beams, which are said to be "eventually consistent" with how you perceive the Sun's position. The comparison is that light takes about 8 minutes to reach your eye after it leaves the Sun's surface, and that this delay is "eventual consistency".
There are fallacies in this analogy.

The difference between the real-life scenario and a computer system is that in real life, time synchronizes ALL events. No events can happen out of order (in relation to time, even relative time), and this leaves the system, in this case the Sun, the sunlight and you, consistent. So the real world does have a synchronizer, and it works at all times, which means a reader of the system always reads a consistent state no matter how or when it observes the information. On top of this, computer systems can fail, and they do so quite frequently. The weird part is that "eventualists" even emphasize that one should embrace system failure, yet there is again no guarantee that the system will be consistent. Real-world events just don't crash with an exception or suddenly produce, out of thin air, "compensating" events (that never really happened) to undo everything that just happened.

Systems can seemingly have the property of consistency or "eventual consistency", but ironically this is harder to maintain (unless there is a specific synchronizing factor) in a system where the load is high. And that again contradicts the claims made by "eventualists".

In the discrete world where computers run, any guarantee of synchronization is synthetic and has to be designed and accounted for. There is no guarantee, unless explicitly designed in, that synchronization holds, and this is much more acute in a system with several machines and gets worse with the number of them. It is particularly true in an asynchronous environment, where the "eventual consistency" idea is somehow the norm; it should be the other way around. Because of their discrete nature, such systems also suffer from events being duplicated or having to be undone, which can result in the observer seeing partial results.
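This is roughly what that synthetic synchronization looks like in code: the producer has to stamp every event with a sequence number, and the consumer has to check for gaps and duplicates itself. The Event record and field names below are my own illustration, not any particular framework's API.

```java
import java.util.HashSet;
import java.util.Set;

// Synthetic synchronization: the producer must stamp each event with a
// sequence number, and the consumer must detect gaps and duplicates itself.
// The Event record and field names are illustrative only.
public class SyntheticSync {

    record Event(long sequence, String payload) {}

    static class CheckingConsumer {
        private long nextExpected = 1;
        private final Set<Long> seen = new HashSet<>();

        void onEvent(Event event) {
            if (!seen.add(event.sequence())) {
                System.out.println("duplicate event ignored: " + event.sequence());
                return;
            }
            if (event.sequence() != nextExpected) {
                // A real consumer would have to buffer this and wait for the gap to fill.
                System.out.println("out of order: expected " + nextExpected
                        + " but got " + event.sequence());
                return;
            }
            System.out.println("applied: " + event.payload());
            nextExpected++;
        }
    }

    public static void main(String[] args) {
        CheckingConsumer consumer = new CheckingConsumer();
        consumer.onEvent(new Event(1, "a"));
        consumer.onEvent(new Event(3, "c"));  // gap: 2 has not arrived yet
        consumer.onEvent(new Event(1, "a"));  // duplicate delivery
        consumer.onEvent(new Event(2, "b"));
    }
}
```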

More importantly, you can often find a solution to a specific problem where "eventual consistency" is strong enough, or "good enough", because you can guarantee the ordering of things happening. But I do think that new problems (requirements) will come along, after you have come up with the initial solution, where this breaks apart.

The fallacy here is comparing a discrete resolution with a continuous one. These components are everywhere, and here are some examples.

* A thread's execution is a continuous resolution
* Threads, when sharing data, are a discrete resolution
* Queues are discrete resolutions, even when run in a thread, if they leave the thread's "compound", for example by being run on several computers
* Networks are discrete resolutions
* IO is a discrete resolution
* Thread pools and thread priorities are discrete resolutions in relation to other threads
* Machines interacting with each other are discrete resolutions

Wherever there are discrete resolutions, those are the places where "eventual consistency" can and will go wrong, since there is no actual order of events. A continuous resolution, like an isolated thread (no outside interaction), is continuous and therefore always consistent.

A simple example where events could end up out of order.

Consider a system A which produces a sequence of natural numbers. Because you want to produce and deal with these numbers in a scalable fashion, you have two queues (these are everywhere: in the OS, in a load balancer, on a network card), and for the sake of this example they run on two different machines. You then have an event source which records these events to create a "source of truth". Now, the events, which would be 1, 2, ..., will probably arrive in the order they were produced if the queues are equally fast. But if either of them is slower, this is no longer the case, and your event source will be corrupted.
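A small Java simulation of this, where the two queues are just threads with different artificial delays standing in for two machines of different speed, and the event source is a shared list recording arrival order.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Simulation of the two-queue example: event 1 goes through a slow "queue"
// (a thread with a longer delay), event 2 through a fast one, and the shared
// log records arrival order. The delays are artificial stand-ins for two
// machines with different speeds.
public class OutOfOrderQueues {

    public static void main(String[] args) throws InterruptedException {
        List<Integer> eventLog = new CopyOnWriteArrayList<>(); // the "source of truth"

        Thread slowQueue = deliver(eventLog, 1, 200); // event 1, slower path
        Thread fastQueue = deliver(eventLog, 2, 50);  // event 2, faster path

        slowQueue.start();
        fastQueue.start();
        slowQueue.join();
        fastQueue.join();

        // Likely prints [2, 1]: the log no longer reflects production order.
        System.out.println("recorded order: " + eventLog);
    }

    static Thread deliver(List<Integer> log, int event, long delayMillis) {
        return new Thread(() -> {
            try {
                Thread.sleep(delayMillis);  // simulate queue/network latency
                log.add(event);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }
}
```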

Also, if you have a huge amount of traffic, like LinkedIn, there is no way of telling whether the data is correct, due to the sheer amount of data coming in. You won't be able to look at all of that data and spot that a particular event is out of order in relation to another event. Now, it might not matter for LinkedIn, but in other areas, like money transactions or trading, it is a huge deal and can even be illegal (since you need to be able to account for all transactions in your system, and if they are in the wrong order and you made decisions based on them, you have effectively corrupted your reality).

Focus points

The major problem with "eventual consistency" is not the writing part, which can be mitigated, but the reading part, where a system can make a decision based on values coming from two different, not yet converged, states.
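A sketch of the kind of read I mean, assuming two replicas of an account balance that have not yet converged; the account and the amounts are made up for illustration. The reader's decision is based on a state that never actually existed.

```java
import java.util.HashMap;
import java.util.Map;

// Two replicas of the same account balance that have not yet converged.
// A reader that makes a decision from the stale one acts on a state that
// never existed. The account name and amounts are illustrative only.
public class StaleReadDecision {

    public static void main(String[] args) {
        Map<String, Integer> replica1 = new HashMap<>();
        Map<String, Integer> replica2 = new HashMap<>();

        // Initial state replicated to both.
        replica1.put("account", 100);
        replica2.put("account", 100);

        // A withdrawal of 80 has only reached replica1 so far.
        replica1.put("account", 20);

        // The reader checks the balance on the stale replica...
        int balanceSeen = replica2.get("account");       // 100
        if (balanceSeen >= 80) {
            // ...and approves another withdrawal based on it.
            replica1.put("account", replica1.get("account") - 80); // now -60
        }

        System.out.println("replica1: " + replica1.get("account")); // -60
        System.out.println("replica2: " + replica2.get("account")); // 100
    }
}
```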

I want to conclude by saying that I recognize that some problems and solutions can be "eventually consistent", but I'll argue that those are, as in the case with the Sun, immediately consistent because the reader doesn't know anything else. However, to design a solution to be exclusively "eventually consistent" is just stupid and a waste of your employer's or customers' money.
