Troubleshooting on the Mainnet

Troubleshoot issues when running an Aleph node on the Mainnet for validating transactions.
The instructions for the Mainnet are almost identical to the Testnet ones, but there is currently a slight difference in the logs generated by the Validator Network.

Accepting consensus connections on the Mainnet

Another reason your node is not taking part in consensus might be that it is not accepting connections from other Validator nodes on port 30343. This might be caused by passing a wrong IP/DNS argument to the node.
You can diagnose this by searching the node logs, which you can access as described in the Logs section. It is a good idea to inspect the logs from the last 24 hours of your node running. Look for the status logs of the Validator Network. Those are the logs that start with: Validator Network status.
When your node does not take part in the current session of consensus, this status log will report the following message:
Validator Network status: not maintaining any connections;
This is fine and nothing to worry about. When your node is going to be in a session, this status log will change and start reporting your connections with other Validators. You can find the start of a session in the logs by searching for runway initialization. An example of such a log is:
2022-11-04 14:36:04 NodeIndex(4) Runway initialized.
where NodeIndex is your node's index in the committee, so it may differ from the one in the example. This log indicates the initialization of AlephBFT, after which your node will start taking part in consensus.
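The log search described above can be sketched in Python. This is only a minimal illustration, assuming the logs are available as plain text with one entry per line. The function name find_consensus_log_lines is hypothetical, and the two search strings are copied from the example logs on this page.

```python
from typing import Iterable, List

# Substrings taken from the example log lines on this page.
STATUS_MARKER = "Validator Network status"
RUNWAY_MARKER = "Runway initialized"


def find_consensus_log_lines(lines: Iterable[str]) -> List[str]:
    """Keep only status and session-start entries, preserving their order."""
    return [
        line.rstrip("\n")
        for line in lines
        if STATUS_MARKER in line or RUNWAY_MARKER in line
    ]


# Sample input: two entries from this page plus one made-up, unrelated line.
sample = [
    "Validator Network status: not maintaining any connections;\n",
    "an unrelated line, made up for this example\n",
    "2022-11-04 14:36:04 NodeIndex(4) Runway initialized.\n",
]
for entry in find_consensus_log_lines(sample):
    print(entry)
```

To scan real logs, pass an open file object instead of the sample list, for example a file you have saved your node's logs to.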
Once you have found a time at which your node is taking part in consensus, a typical healthy Validator Network status log looks like this:
Validator Network status: target - 9 connections; both ways - 9 [5DzE…KSJDaCLA, 5Fuv…jF768eAe, 5D46…6hBv8wem, 5Deg…LwU5mf56, 5Cja…Bx2MZpKy, 5Cz3…24E93sj3, 5DfM…Fn6KeWmz, 5HZy…JJsL5pJ7, 5HJB…s13BRiz6];
In this log, target stands for how many Validators your node is maintaining connections with. both ways gives the number of Validators you are connected to, followed by a list of them.
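If you want to read these numbers programmatically, the status line can be split into its parts. The sketch below assumes the line keeps the format shown in the example above (segments separated by semicolons, each of the form "label - count" with an optional bracketed list). The function name parse_status and the regular expression are illustrative assumptions, not part of the node software.

```python
import re
from typing import Dict, List, Tuple

MARKER = "Validator Network status:"
# One segment looks like "target - 9 connections" or "both ways - 9 [a, b, ...]".
SEGMENT = re.compile(
    r"^(?P<label>[a-z ]+?) - (?P<count>\d+)"
    r"(?: connections)?(?: \[(?P<peers>[^\]]*)\])?$"
)


def parse_status(line: str) -> Dict[str, Tuple[int, List[str]]]:
    """Map each segment label to (count, list of peers) for one status line."""
    _, found, rest = line.partition(MARKER)
    if not found:
        raise ValueError("not a Validator Network status line")
    result: Dict[str, Tuple[int, List[str]]] = {}
    for segment in rest.split(";"):
        match = SEGMENT.match(segment.strip())
        if match is None:
            continue  # free-form text, e.g. "not maintaining any connections"
        peers_text = match.group("peers")
        peers = peers_text.split(", ") if peers_text else []
        result[match.group("label")] = (int(match.group("count")), peers)
    return result


# The healthy example line from this page.
healthy = (
    "Validator Network status: target - 9 connections; both ways - 9 "
    "[5DzE…KSJDaCLA, 5Fuv…jF768eAe, 5D46…6hBv8wem, 5Deg…LwU5mf56, "
    "5Cja…Bx2MZpKy, 5Cz3…24E93sj3, 5DfM…Fn6KeWmz, 5HZy…JJsL5pJ7, "
    "5HJB…s13BRiz6];"
)
status = parse_status(healthy)
print(status["target"][0], status["both ways"][0])
```

Segments that do not follow the "label - count" shape are skipped rather than treated as errors, so a line reporting no connections simply yields an empty result.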
If your node is healthy but there are some minor problems with the network, your node may start reporting other logs like:
Validator Network status: target - 9 connections; both ways - 6 [5DzE…KSJDaCLA, 5Fuv…jF768eAe, 5D46…6hBv8wem, 5Deg…LwU5mf56, 5Cja…Bx2MZpKy, 5Cz3…24E93sj3]; incoming only - 1 [5HJB…s13BRiz6]; outgoing only - 1 [5HZy…JJsL5pJ7]; missing - 1 [5DfM…Fn6KeWmz];
This log reports other types of connections, as well as missing connections. As long as you are connected with at least 2/3 of the nodes, you should not worry about it. That said, a common problem that might appear if you do not have port 30343 open is that no one can connect to you. The status log will then look like the following:
Validator Network status: target - 9 connections; WARNING! No incoming peers even though we expected tham, maybe connecting to us is impossible; outgoing only - 9 [5DzE…KSJDaCLA, 5Fuv…jF768eAe, 5D46…6hBv8wem, 5Deg…LwU5mf56, 5Cja…Bx2MZpKy, 5Cz3…24E93sj3, 5DfM…Fn6KeWmz, 5HZy…JJsL5pJ7, 5HJB…s13BRiz6];
This can mean two things:
  • your public validator address is not available for other Validators to connect to,
  • mapping of your ports is incorrect and your node does not listen on port 30343.
For your node to function correctly, this needs to be fixed. Make sure that your validator address and port (--ip/--dns argument or VALIDATOR_PUBLIC_ADDRESS environment variable) are not blocked by any firewall and are not hidden behind NAT (which probably comes down to getting a public, fixed IP). You can also verify that your node is set up to listen on the correct port (VALIDATOR_PORT environment variable, set to 30343 by default).
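One rough way to check reachability is to try opening a connection to your validator address from a machine outside your own network. The sketch below assumes the validator port accepts plain TCP connections. The function name can_connect is hypothetical. A successful connection only shows that the port is reachable, not that the node is otherwise healthy.

```python
import socket


def can_connect(host: str, port: int = 30343, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and applies the timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Call can_connect with the same IP or DNS name that you pass to the node. If it returns False from an outside machine while the node is running, a firewall, NAT, or an incorrect port mapping is a likely cause.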