Elections and Rewards Math

Understand the mathematics behind Aleph Zero's validator elections and rewards.

High Level Idea

The current election and reward mechanism on Aleph Zero was designed with the following goals in mind:

1. Simplicity: the mechanism should be simple, both for validators and nominators, so that it is easy to figure out what the optimal staking actions are.
2. Low Entry Barrier for Validators: given enough technical knowledge, one should be able to start validating on Aleph Zero. In particular, entry should not require a huge stake, and there should not be any filtering of candidates based on non-transparent criteria.
3. Fairness: under normal circumstances, the rewards should always be proportional to the stake (this is Proof of Stake, after all!) of a given actor (nominator or validator). It should not be possible to abuse this rule in any way.

Goals 1 and 2 are self-explanatory and need no further discussion; goal 3, however, might look enigmatic. This objective is the result of extensive research and, most of all, of experience with how PoS works on other chains (especially the flaws and possible pitfalls).

The main principle of PoS is that participants should be rewarded proportionally to the stake they put at risk. There are many examples of PoS systems that don't satisfy this requirement. Take, for instance, a system with a fixed number of 100 validators in which each validator (with optimal uptime) receives the same reward $r$ for its duty. In such a case, there is no point in a validator putting up more stake than the bare minimum. Moreover, smart validators would try to get several of their nodes into the committee, because in this system an operator who runs multiple nodes collects the reward $r$ once per node. In our opinion this is an undesirable property, because:

In most cases, running multiple nodes by a single operator does not contribute to decentralization. This is because a profit-driven operator would likely run all the nodes on the same infrastructure, with the same cloud provider, and maybe even on the same machine. What really contributes to decentralization is having as many independent operators as possible, running their nodes in different geographical locations, with different providers and using separate infrastructures. At the same time, we wish to emphasize that running multiple nodes by a single operator is by no means a bad thing: we are glad to have such validators but we kindly ask them to diversify the infrastructure on which their nodes are run.
If having more nodes per operator is financially incentivized, then naturally a single operator may occupy multiple spots in the committee that otherwise would be taken by other, independent operators. When the number of spots is limited, this also incites unhealthy competition between operators: each one wants to take for themselves as many spots as possible.

Note that the above disadvantages are consequences of the fact that the Fairness property we described earlier was not satisfied. If each validator were rewarded proportionally to their stake, the problem would disappear: in order to run multiple nodes, the validator would need to distribute their stake among them, but the sum of the stakes, and hence the total reward, would still be the same.

At the other end of the spectrum is a property which says: by running a node with stake X+Y one receives more rewards than by running two nodes, one with stake X and one with stake Y. If that were the case, then the system would favor big players (whales) and would incentivize pooling. That is not a good thing either, since it generates a force towards centralization.

To conclude, fairness is a property which guarantees that neither splitting your stake into multiple piles nor combining your stake with someone else's will give you any edge when it comes to rewards. Since extra nodes bring no extra rewards, it is then optimal to run just one node. That's the property that we aim to achieve.
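The contrast between a flat per-node reward and a proportional reward can be checked numerically. The following is only a minimal sketch, not Aleph Zero code: the function names, the reward pool and all stake figures are made up for illustration, and the flat rule is simplified to "every node earns the same amount".

```python
def flat_rewards(stakes, reward_per_node):
    """Hypothetical flat rule: every node earns the same amount, whatever its stake."""
    return [reward_per_node for _ in stakes]


def proportional_rewards(stakes, era_reward):
    """Proportional rule: each node earns its share of the total stake."""
    total = sum(stakes)
    return [era_reward * s / total for s in stakes]


ERA_REWARD = 1000.0          # illustrative pool, not a real network value
OTHERS = [300.0, 500.0]      # stakes of the other operators (illustrative)

# One operator holding 200 units either runs one node or splits into two.
one_node = proportional_rewards([200.0] + OTHERS, ERA_REWARD)[0]
two_nodes = sum(proportional_rewards([120.0, 80.0] + OTHERS, ERA_REWARD)[:2])

# Under the flat rule the same split doubles the operator's income.
flat_one = flat_rewards([200.0] + OTHERS, 10.0)[0]
flat_two = sum(flat_rewards([120.0, 80.0] + OTHERS, 10.0)[:2])

print(one_node, two_nodes)   # the same amount: splitting gives no edge
print(flat_one, flat_two)    # the second is twice the first: splitting pays
```

Under the proportional rule the operator's income depends only on the sum of their stakes, which is exactly the Fairness property described above.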

Detailed Description

Having described the motivations behind our system, we proceed with a detailed explanation of how it works.

Rewards

In the Aleph Zero network, as in many PoS systems, there are two types of actors: validators and nominators. Validators run aleph-node and keep the chain running. Nominators delegate their stake to validators they trust. Both validators and nominators receive rewards for their contribution.

To start validating, you must stake a minimum bond of 25000 AZERO. Once this condition is satisfied, you will enter the era-committee (every era lasts 86400 blocks, roughly 24 hours). Now for the good news: assuming your node is running OK, this is already enough for you to obtain rewards from validation. In particular, you don't need to go through KYC, you don't need to be manually verified, and you don't need to be lucky.

How are the rewards distributed, then? To answer that, we need to introduce some additional concepts and notation. First of all, each validator $v$ from the set of all validators $V$ specifies its commission $c(v)$, a number in the interval $[0.02, 1]$ (2% being the minimal commission) that determines what percentage of the rewards for the nominated stake the validator keeps for itself. Let us denote by $\mathrm{own}(v)$ the amount the validator staked on itself (recall, the minimum is 25000 AZERO) and by $\mathrm{nominated}(v)$ the total amount staked on $v$ by its nominators. We also denote $\mathrm{total}(v) = \mathrm{own}(v) + \mathrm{nominated}(v)$.

Assume for a moment that all validators have 100% uptime; in other words, all of them do their job properly. In such a case, according to our Fairness objective, validator $v$ should gain rewards proportional to $\mathrm{total}(v)$. And indeed this is the case. Let $ER$ be the era reward: the total reward given out for an era. This is roughly $0.9 \cdot 3 \cdot 10^7 \ \mathrm{AZERO}/365 \approx 73973 \ \mathrm{AZERO}$, computed as 90% of the 30M yearly inflation divided by 365 eras per year; the remaining 10% goes to the ecosystem fund. Then the $\mathrm{reward_{total}}(v)$ for the validator $v$ is computed as:

$$
\mathrm{reward_{total}}(v) = \mathrm{total}(v) \cdot \frac{ER}{SoS}
$$

where $SoS$ (sum of stakes) is the sum $\mathrm{total}(v_1) + \mathrm{total}(v_2) + \dots + \mathrm{total}(v_n)$ of the stakes of all validators in this era-committee $V$ (i.e., all the validators that have bonded 25k AZERO before the era started).
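As a rough illustration of the formula, the sketch below computes $ER$ from the figures quoted above and then $\mathrm{reward_{total}}(v)$ for a small committee. The committee itself (names and stakes) is hypothetical, and full uptime is assumed for everyone.

```python
YEARLY_INFLATION = 30_000_000   # AZERO per year, from the text
VALIDATOR_SHARE = 0.9           # 90% goes to validators and nominators
ERAS_PER_YEAR = 365             # one era is roughly 24 hours

ERA_REWARD = VALIDATOR_SHARE * YEARLY_INFLATION / ERAS_PER_YEAR


def reward_total(total_stakes, validator):
    """reward_total(v) = total(v) * ER / SoS for one validator in the era-committee."""
    sum_of_stakes = sum(total_stakes.values())
    return total_stakes[validator] * ERA_REWARD / sum_of_stakes


# Hypothetical era-committee: validator name -> total(v) in AZERO.
committee = {"alice": 25_000, "bob": 75_000, "carol": 150_000}

for name in committee:
    print(name, round(reward_total(committee, name), 2))
```

The individual rewards always add up to $ER$, whatever the stakes are, because each validator receives its fraction $\mathrm{total}(v)/SoS$ of the era reward.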

Note, however, that $v$ does not get the whole $\mathrm{reward_{total}}(v)$ for itself. It must pay its nominators! The mechanism here is really simple. It keeps $c(v) \cdot \mathrm{reward_{total}}(v)$ for itself (the commission), and the rest, $(1-c(v)) \cdot \mathrm{reward_{total}}(v)$, is distributed to nominators in proportion to their individual stakes. Note also that among the nominators there is $v$ itself (who nominated $\ge 25000$ to itself), and hence it also gets paid for this part of the stake. In particular, if $v$ has only one nominator -- itself -- then naturally the whole reward goes to $v$, which agrees with the math (check!).
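The check suggested above can be written out. Assuming $v$ has no external nominators, so that $\mathrm{nominated}(v) = 0$ and $\mathrm{total}(v) = \mathrm{own}(v)$, the commission and the share for the validator's own stake add up to the whole reward:

$$
\begin{aligned}
c(v) \cdot \mathrm{reward_{total}}(v) + \frac{\mathrm{own}(v)}{\mathrm{total}(v)} \cdot (1 - c(v)) \cdot \mathrm{reward_{total}}(v)
&= c(v) \cdot \mathrm{reward_{total}}(v) + (1 - c(v)) \cdot \mathrm{reward_{total}}(v) \\
&= \mathrm{reward_{total}}(v)
\end{aligned}
$$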

To recap, the exact formula for the reward for a validator $v$ is the following:

$$
\begin{aligned}
\mathrm{reward}(v) &= \mathrm{reward_{comm}}(v) + \mathrm{reward_{nom}}(v) \\
\mathrm{reward_{comm}}(v) &= c(v) \cdot \mathrm{reward_{total}}(v) \\
\mathrm{reward_{nom}}(v) &= \frac{\mathrm{own}(v)}{\mathrm{total}(v)} \cdot (1 - c(v)) \cdot \mathrm{reward_{total}}(v)
\end{aligned}
$$

where:

$$
\begin{aligned}
\mathrm{total}(v) &= \mathrm{own}(v) + \mathrm{nominated}(v) \\
ER &\approx 73973 \ \mathrm{AZERO} \\
SoS &= \sum_{v \in V} \mathrm{total}(v)
\end{aligned}
$$
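The split between a validator and its nominators can be sketched as follows. This is an illustration of the formulas only, not the on-chain implementation: the `Validator` class, the `split_reward` helper, the nominator names and all numbers are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Validator:
    own: float                      # own(v): stake the validator put on itself
    commission: float               # c(v), a number in [0.02, 1]
    nominations: dict = field(default_factory=dict)  # nominator -> stake

    @property
    def total(self):
        """total(v) = own(v) + nominated(v)."""
        return self.own + sum(self.nominations.values())


def split_reward(v, reward_total):
    """Split reward_total(v) into the validator's part and each nominator's part."""
    reward_comm = v.commission * reward_total          # the commission
    to_stakers = (1 - v.commission) * reward_total     # shared in proportion to stake
    reward_nom = v.own / v.total * to_stakers          # share for the validator's own stake
    nominators = {who: stake / v.total * to_stakers
                  for who, stake in v.nominations.items()}
    return reward_comm + reward_nom, nominators


v = Validator(own=25_000, commission=0.05, nominations={"n1": 50_000, "n2": 25_000})
validator_part, nominator_parts = split_reward(v, reward_total=1_000.0)
print(validator_part)     # commission plus the share for the validator's own stake
print(nominator_parts)    # what each nominator receives
```

Whatever the commission, the validator's part and the nominators' parts together equal $\mathrm{reward_{total}}(v)$; the commission only changes how that amount is divided.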

Performance and uptime calculation

While in normal circumstances the rewards are as described above, there are sometimes cases where some validators don't do their job properly. Naturally, the system must disincentivize such behavior. There are essentially two types of misbehavior that a validator can "commit":

Bad Performance -- this is typically when a node is run on poor hardware or with a bad network connection. Under this we also include cases where the node just shuts down and stops functioning because of a hardware malfunction or a similar reason.
Malicious Activity -- this is when a node violates the protocol, which in most cases means the operator does not run the software we asked them to run. It can also happen when the software is not run properly. One important case is when an operator runs a single validator, with the same keys, simultaneously on two different machines. This causes the validator to create forked blocks, which is highly unhealthy for the network and is thus considered malicious.

There are different penalties for the two kinds of misbehavior. To explain them, we need to start by introducing the way performance is calculated for validators. First of all, if the era-committee is large, a validator $v$ will likely not be actively validating (creating blocks and taking part in finalization) all the time. Instead, each era is divided into 96 sessions, and each session is run by a smaller set of validators: a session-committee (the size of this committee will grow over time, but note that this size has absolutely no impact on the reward distribution). Currently the session-committee consists of a certain number of foundation validators plus a set of community validators that changes from session to session. Every few sessions, each community validator in the era-committee enters the session-committee. For every session in which a validator $v$ participates in the session-committee, it receives an uptime (between 0% and 100%), which has the following consequences:

If the uptime is >=90% (note that the threshold might be subject to change), then the validator receives 100% of its rewards. If it is less, then the reward of $v$ for this session is scaled down appropriately.
If the uptime is very low, the validator receives one penalty point. After receiving 48 penalty points, the validator is suspended. This means that it cannot enter the era-committee for at least 10 eras.

Currently the uptime is calculated as the number of blocks created in this session by this validator divided by the number of blocks it was expected to produce.
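The uptime rules can be sketched roughly as below. Only the 90% threshold, the 48-point limit and the blocks-created-over-blocks-expected ratio come from the text. The exact scale-down rule and the cutoff for "very low" uptime are not specified here, so the linear scaling and the `LOW_UPTIME` value are assumptions made purely for illustration, as are the per-session block counts.

```python
UPTIME_THRESHOLD = 0.90   # from the text; may be subject to change
LOW_UPTIME = 0.10         # HYPOTHETICAL cutoff for "very low" uptime
MAX_PENALTY_POINTS = 48   # from the text


def uptime(blocks_created, blocks_expected):
    """Blocks created in the session divided by the number expected, capped at 1."""
    if blocks_expected == 0:
        return 1.0  # assumption: nothing was expected, so nothing was missed
    return min(blocks_created / blocks_expected, 1.0)


def reward_factor(u):
    """Full reward at or above the threshold; below it, an ASSUMED linear scale-down."""
    return 1.0 if u >= UPTIME_THRESHOLD else u


def session_penalty(u):
    """One penalty point for a session with very low uptime (cutoff is hypothetical)."""
    return 1 if u < LOW_UPTIME else 0


# Hypothetical per-session (created, expected) block counts for one validator.
sessions = [(45, 45), (41, 45), (30, 45), (2, 45)]
points = 0
for created, expected in sessions:
    u = uptime(created, expected)
    points += session_penalty(u)
    print(f"uptime={u:.2f} factor={reward_factor(u):.2f} points={points}")

suspended = points >= MAX_PENALTY_POINTS
```

In this made-up run the second session still earns the full reward (its uptime is just above 90%), the third is scaled down, and only the fourth adds a penalty point.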

If a validator is a member of the finalization committee, an on-chain algorithm calculates its ABFT performance each session as the distance from the current session DAG's head to the validator's DAG head. If this distance is more than 11 units when the session ends, then the validator gets one finalization penalty point and gets no rewards in this session. After receiving more than 24 finalization penalty points, the validator gets suspended for 10 eras.
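The finalization rule reduces to two comparisons, sketched below. The function name and its return shape are hypothetical, and the sketch says nothing about how the distance itself is measured or over what period points accumulate, since neither is described here.

```python
MAX_DAG_DISTANCE = 11            # from the text: more than 11 units is penalized
MAX_FINALIZATION_POINTS = 24     # from the text: suspension after MORE than 24 points
SUSPENSION_ERAS = 10             # from the text


def finalization_outcome(distance_at_session_end, points_so_far):
    """Return (earns_rewards, new_points, suspended) for one session."""
    if distance_at_session_end > MAX_DAG_DISTANCE:
        points_so_far += 1       # one finalization penalty point
        earns_rewards = False    # and no rewards in this session
    else:
        earns_rewards = True
    suspended = points_so_far > MAX_FINALIZATION_POINTS
    return earns_rewards, points_so_far, suspended


print(finalization_outcome(11, 0))    # (True, 0, False): exactly 11 is still fine
print(finalization_outcome(12, 0))    # (False, 1, False)
print(finalization_outcome(12, 24))   # (False, 25, True): the 25th point suspends
```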

The above concerns penalties for bad performance only. In case of malicious activity, the node is suspended from the committee as the first step. Subsequently, each such case is individually analyzed by governance, and a decision can be made to ban the defendant from the era validators if the malicious activity is confirmed beyond doubt. We note that it is virtually impossible to commit such a fault "by coincidence": malicious activity is almost always intentional. Perhaps the only exception to that is when an operator runs two copies of a node (with the same keys) simultaneously. Such a node is harmful to the whole network, and thus if we detect such a validator and they do not fix the problem immediately upon notice, the validator might be banned from the committee.

How this system satisfies our goals

Let us now go back and reexamine the goals we listed above:

Simplicity: from the viewpoint of the validator, the optimal strategy is to run just one node and run it well. From the viewpoint of the nominator, the optimal strategy is to nominate the validator that they like (meaning they trust them and expect good performance) and that has the lowest commission. Importantly, note that if two validators have equal commission and good performance, then there is absolutely no difference in the rewards when nominating them. This is another consequence of Fairness.
Low Entry Barrier for Validators: one could indeed argue that 25000 AZERO is a lot. However, this is necessary for safety. Moreover, compared to other PoS chains, this is actually very low, taking into account that it guarantees you a seat in the committee and rewards. In particular, as previously mentioned, no KYC or any other kind of filtering happens.
Fairness: this is the main motivation behind technical decisions in this design, and as you can easily verify, fairness is indeed achieved in this system.


