Attestation Troubleshooting

I'm missing attestations! Help!

Don't panic.

Missing some attestations is actually quite normal and has minimal impact. If your validators stop working and miss attestations, you will incur some small penalties. The penalties are small amounts of ETH that are approximately equal to the rewards the validator would have received if they had not missed their duties. For example, the penalties for missed attestations amount to around ETH or about per validator per day (as of April 2024).

But I'm being slashed!

No, you're not.

Penalties are not the same as "slashing".

Slashing is reserved for more serious offenses. Validators can be slashed for actions such as double signing or other malicious behavior that compromises the security and integrity of the network. Slashing results in a reduction of the validator's stake, typically at the level of 1 ETH.

Missing attestation duties do not result in slashing.

OK, so what should I do?

If your validators are missing attestations, perform the following checks to aid diagnosis.

Tips: Use ctrl-click (command ⌘-click) to open links

There are many links in the following steps pointing to various screens on the AVADO and different sections in this documentation. You may find it useful to use ctrl-click (on Windows) or command ⌘-click (on Macs) to open the links in new tabs. On mobile devices, long-press the link, then choose "Open in new tab" in the pop-up menu. This allows you to easily review the contents of those linked pages without losing your place in the current steps. You can then switch back to the next steps whenever required.

Check the Consensus Client

Open your Consensus Client DApp (Teku, Prysm, Nimbus) and its Management Page (Teku, Prysm, Nimbus).

If you notice that the status in the DApp remains stuck at Waiting for beacon chain to become ready without any progress, or if you come across error messages in the Logs, the Consensus Client might have a problem syncing.

The most common problems are:

The Consensus Client fails to connect to a sufficient number of peers. If you are getting a low peer count, there may be a problem with your network settings. See Opening Network Ports to learn how to resolve this (by setting up "port forwarding" on the router).
The Consensus Client fails to find the Execution Client. You may not have installed the Execution Client yet; once you have set it up, this error will go away. For example, the following error messages indicate that the Consensus Client expects to connect to the Execution Client but fails to do so.

ERROR - Execution Client request failed. Make sure the Execution Client is online and can respond to requests.

level=error msg="Could not connect to execution client endpoint"

No synced execution layer available for deposit syncing

If you have already installed an Execution Client but this error persists, check that the Execution Engine (Geth or Nethermind) selected on the Consensus Client's Settings page matches your installed Execution Client. Remember to click Apply changes to update the setting.

If the error persists, check the Logs of the Execution Client to see what is going on there. Sometimes a Restart of the Execution Client and the Consensus Client is all that is needed to resolve the issue.
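
As a complement to reading the Logs, the Consensus Client can also be queried over the standard Beacon node REST API. The following is a minimal sketch, not an AVADO tool. It assumes the API is reachable at http://localhost:5052 (the port that appears in the Nimbus log sample in this guide; your package may expose a different address). The min_peers threshold is an arbitrary illustrative value.

```python
import json
from urllib.request import urlopen

# Assumption: the Beacon node's REST API is reachable at this address.
# Replace it with the address your Consensus Client package exposes.
BEACON_API = "http://localhost:5052"


def fetch_json(path, base_url=BEACON_API, timeout=5):
    """GET a Beacon API path and return the decoded JSON body."""
    with urlopen(base_url + path, timeout=timeout) as resp:
        return json.load(resp)


def summarize(syncing, peer_count, min_peers=10):
    """Turn the two API responses into a list of human-readable findings.

    syncing    -- body of GET /eth/v1/node/syncing
    peer_count -- body of GET /eth/v1/node/peer_count
    min_peers  -- illustrative threshold only, not an official value
    """
    findings = []
    sync = syncing["data"]
    if sync.get("is_syncing"):
        findings.append(f"still syncing, {sync.get('sync_distance')} slots behind")
    if sync.get("el_offline"):
        findings.append("Execution Client reported as offline")
    # The API encodes counters as decimal strings.
    connected = int(peer_count["data"]["connected"])
    if connected < min_peers:
        findings.append(f"low peer count: {connected}")
    return findings or ["no obvious problem"]
```

To use it against a running node, call summarize(fetch_json("/eth/v1/node/syncing"), fetch_json("/eth/v1/node/peer_count")) and compare the findings with the two common problems described above.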

Refer to the following sections for more details on the syncing process of the Consensus Clients and for troubleshooting ideas:

Other Errors

The following error points to a problem in MEV-Boost. Refer to Setting up MEV-Boost, especially the section on Configuring the Relays, to confirm that your settings are correct.

The builder is not available... all relays are unavailable.  Block production will fallback to the execution engine.

The following errors suggest a communication problem between the Execution and Consensus Clients. Try restarting both Clients, or a soft reboot of the AVADO (System > Reboot my AVADO).

Waiting for the JWT Token

FATAL - PLEASE CHECK YOUR ETH1 NODE | Encountered a problem retrieving deposit events from eth1 endpoint

The following error is related to Teku's storage memory. Make sure the -Xmx8g setting is present in JAVA_OPTS (the old settings of -Xmx3g and -Xmx5g will not work).

FATAL - Exiting due to fatal error in RetryingStorageUpdateChannel
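
The JAVA_OPTS check can be expressed as a small helper. This is a sketch only: the function names are hypothetical, and the 8 GiB requirement is taken from the -Xmx8g recommendation above. It reads the JVM's -Xmx flag, which accepts k, m or g suffixes, and compares it with the required size.

```python
import re


def max_heap_gib(java_opts):
    """Return the -Xmx value in GiB, or None if no -Xmx flag is present."""
    match = re.search(r"-Xmx(\d+)([kKmMgG]?)", java_opts)
    if match is None:
        return None
    value = int(match.group(1))
    unit = match.group(2).lower()
    # A value without a suffix is interpreted by the JVM as bytes.
    divisor = {"": 1024 ** 3, "k": 1024 ** 2, "m": 1024, "g": 1}[unit]
    return value / divisor


def heap_is_sufficient(java_opts, required_gib=8):
    """True when JAVA_OPTS sets a maximum heap of at least required_gib."""
    heap = max_heap_gib(java_opts)
    return heap is not None and heap >= required_gib
```

With this helper, "-Xmx8g" passes, while the old "-Xmx3g" and "-Xmx5g" settings do not.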

In the following cases, Teku fails to start. There is likely an issue with the user settings. Check the values of JAVA_OPTS (default is -Xmx8g), EXTRA_OPTS (default is empty) and "Initial State" (default is: https://beaconstate.ethstaker.cc). Try "Reset defaults". If nothing works, try removing Teku and installing a fresh copy to ensure the factory defaults are applied.

ERROR - Validator *** Error while connecting to beacon node event stream

INFO exited: teku (exit status 2; not expected)

INFO gave up: teku entered FATAL state, too many start retries too quickly

In the following cases, the database may be corrupt beyond repair. To recover, you will need to remove Teku, perform System > Disk Cleanup, re-install Teku and start over again.

FATAL error - failed to initialise storage. Teku failing to start

Teku failed to start: org.iq80.leveldb.DBException: Corruption: corrupted compressed block contents

The following error points to a problem in MEV-Boost. Refer to Setting up MEV-Boost, especially the section on Configuring the Relays, to confirm that your settings are correct.

level=fatal msg="proposer settings is empty after unmarshalling from file specified by proposer-settings-file flag"

The following error suggests a problem with the Execution Client. Perhaps it isn't synced yet; in that case, the error should go away once the Execution Client catches up.

level=error msg="Unable to process past deposit contract logs, perhaps your execution client is not fully synced"

In the following case, Prysm fails to start. There is likely an issue with the user settings. Check the values of EXTRA_OPTS and "Initial State". Try "Reset defaults". If nothing works, try removing Prysm (both Beacon Chain and Consensus Client) and installing a fresh copy to ensure the factory defaults are applied.

INFO gave up: prysm entered FATAL state, too many start retries too quickly

The following error is also bad. The database may be corrupted beyond repair. Most likely you will need to remove Prysm and reinstall a fresh copy.

error msg="Unable to prune directory" ... error="slot could not be read from blob file 3.ssz: EOF" prefix=filesystem

On start-up, Nimbus attempts to download an "initial state" file hosted by AVADO, at https://snapshots.ava.do/state.ssz. There may be a problem with this file. If this happens, raise a ticket at AVADO Discord to alert the team.

Error error sending request for url (http://localhost:5052/eth/v1/node/health): error trying to connect: tcp connect error: Cannot assign requested address (os error 99)
++ curl --insecure --silent --fail https://snapshots.ava.do/state.ssz --output /data/data-mainnet/initial_state.ssz
+ echo 'Waiting for initial state download'
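
The Consensus Client messages in this section can also be matched mechanically when the Logs are long. The sketch below pairs substrings copied from the log samples above with the remedy this guide suggests. The table and the triage function are illustrative names, not part of any client, and the list is not exhaustive.

```python
# Substrings copied from the log samples in this guide, paired with the
# check the guide suggests. Extend the table as needed.
EL_HINT = "Check that the Execution Client is installed, selected in Settings, and running."
MEV_HINT = "Review the MEV-Boost settings, especially the relay configuration."
RESTART_HINT = "Restart both Clients, or do a soft reboot of the AVADO."
HEAP_HINT = "Check that JAVA_OPTS contains -Xmx8g."
DB_HINT = "Database may be corrupt: remove the client, run Disk Cleanup, reinstall."
SETTINGS_HINT = "Check the user settings and try Reset defaults."

KNOWN_ERRORS = [
    ("Execution Client request failed", EL_HINT),
    ("Could not connect to execution client endpoint", EL_HINT),
    ("No synced execution layer available", EL_HINT),
    ("all relays are unavailable", MEV_HINT),
    ("proposer settings is empty", MEV_HINT),
    ("Waiting for the JWT Token", RESTART_HINT),
    ("PLEASE CHECK YOUR ETH1 NODE", RESTART_HINT),
    ("RetryingStorageUpdateChannel", HEAP_HINT),
    ("failed to initialise storage", DB_HINT),
    ("corrupted compressed block contents", DB_HINT),
    ("entered FATAL state", SETTINGS_HINT),
]


def triage(log_lines):
    """Return the distinct hints triggered by the given log lines, in order."""
    hints = []
    for line in log_lines:
        for needle, hint in KNOWN_ERRORS:
            if needle in line and hint not in hints:
                hints.append(hint)
    return hints
```

Paste a batch of log lines into triage() as a list of strings; an empty result means none of the known messages were found, not that the client is healthy.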

Check the Execution Client

Open the Management Page of your Execution Client DApp (Geth, Nethermind). For Nethermind, there is also a Health Checks page.

If your validators are not attesting, very often the culprit is the Execution Client (Geth or Nethermind). The syncing process of the Execution Clients is more lengthy and involves a lot more data, giving rise to more possibilities for errors.

The following are the most common problems.

The Execution Client is unable to find peers

If you are getting a low peer count, or none at all, there may be a problem with your network settings. See Opening Network Ports to learn how to resolve this (by setting up "port forwarding" on the router).
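
The peer count and sync state can also be read from the Execution Client's JSON-RPC interface using the standard net_peerCount and eth_syncing methods. The following is a minimal sketch under stated assumptions: the endpoint address http://localhost:8545 is a placeholder for wherever your Execution Client package exposes its HTTP API, and the helper names are hypothetical.

```python
import json
from urllib.request import Request, urlopen

# Assumption: the Execution Client's HTTP JSON-RPC endpoint is at this address.
RPC_URL = "http://localhost:8545"


def rpc_call(method, params=None, url=RPC_URL, timeout=5):
    """Send one JSON-RPC request and return its "result" field."""
    payload = json.dumps(
        {"jsonrpc": "2.0", "id": 1, "method": method, "params": params or []}
    ).encode()
    req = Request(url, data=payload, headers={"Content-Type": "application/json"})
    with urlopen(req, timeout=timeout) as resp:
        return json.load(resp)["result"]


def peer_count(result):
    """net_peerCount returns a hex string such as '0x19'."""
    return int(result, 16)


def sync_status(result):
    """eth_syncing returns False when idle, else an object with hex block numbers."""
    if result is False:
        # False can mean fully synced, or that syncing has not started yet.
        return "not currently syncing"
    current = int(result["currentBlock"], 16)
    highest = int(result["highestBlock"], 16)
    return f"syncing: block {current:,} of {highest:,}"
```

For example, peer_count(rpc_call("net_peerCount")) gives the number of connected peers; a value of zero or close to it points to the network settings issue described above.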

No beacon client detected or beacon chain not operational

If you have not installed and synchronized a Consensus Client (such as Nimbus, Teku, or Prysm), the Execution Client cannot sync. If you encounter the following messages:

Post-merge network, but no beacon client seen. Please launch one to follow the chain!

Beacon client online, but never received consensus updates. Please ensure your beacon client is operational to follow the chain!

No incoming messages from the consensus client that is required for sync.

No incoming messages from Consensus Client. Consensus client is required to sync the node. Please make sure that it's working properly.

Waiting for Forkchoice message from consensus layer to set fresh pivot block [60s]

It is likely that either you have not installed and synced a Consensus Client, or there is an issue with your existing Consensus Client. Check the logs of the Consensus Client to determine the cause of the problem.

Sometimes, there may be a communication problem between the Execution Client and the Consensus Client. Restarting both can often resolve this. Begin by restarting the Consensus Client, and then restart the Execution Client. You can find the Restart button on the respective client's Management Page.

Connection refused, or NaN%

When initially starting the Execution Client, you may see this error on the Home Page:

This warning may suggest a potential issue with the Execution Client, but it could also be a non-issue. To determine the cause, it is advisable to examine the Logs for any problems related to starting the Execution Client, finding peers, or connecting to the Beacon Chain. By addressing any identified issues, the errors and warnings should no longer persist.

If the Execution Client logs appear normal and do not indicate any problems, it is possible that the problems are temporary and will resolve themselves without further intervention.

Blocks Synced: 0

This is Nethermind specific. During the initial phase, you may find that the progress bar on the Home Page does not progress and remains at "0":

There is a known issue with the progress bar on the Home Page not functioning properly for Nethermind. You may notice that the progress remains at 0% for a significant amount of time, and then suddenly jumps to 100% when Nethermind is fully synced. To reliably monitor the progress, it is recommended to use the Logs, as they provide a more accurate depiction of the syncing process.

Sync progress "stuck" at 99%

This is Geth specific. In earlier versions of Geth, some users reported that the syncing process appeared to be "stuck" at 99% for an extended period. This was primarily due to the state regeneration and state heal steps which could take a long time.

However, newer versions of Geth have significantly improved this user experience. With the introduction of the "path state scheme" (--state.scheme=path) option, syncing should no longer get stuck at 99%. This option has become standard starting from Geth v1.14.0 (AVADO Geth package 10.0.64).

Explainer: Geth's "path state scheme"

The "path state scheme" is an internal method used by Geth to organize Ethereum state data in its database. Starting from Geth v1.14.0, the default method is the "path" scheme, which offers several advantages over the previous "hash" scheme:

The "path" scheme can remove unnecessary data on-the-fly, preventing Geth from becoming "bloated".
The database size will not experience runaway growth.
Unlike before, there is no need to manually prune or reinstall Geth to reclaim disk space.
The database is more resilient to data corruption, even in the event of unclean shutdowns.
Syncing no longer gets stuck at 99%.

For new installations, Geth will now default to the path state scheme.

This change does not affect Geth instances with pre-existing databases, as Geth will continue to use the existing database format. If you are currently using a previous version of Geth and have not set the --state.scheme=path option, you may choose to switch to the new path state scheme. To do so, follow these steps:

Remove the old version of Geth.
Perform a disk cleanup on your system.
Reinstall the latest version of the Geth package.
Allow it to resync from scratch, which may take approximately 12 hours on an i7 processor with a 1 Gbps fiber network connection.

Note that switching to the new scheme is optional. You can continue using Geth with its previous "hash state scheme" and switch to the new scheme only when you need to prune or otherwise reinstall/resync Geth.

CPU usage remains high after the node is synced, with attestation misses

Once the sync process is complete, your node is fully synced. If your validators are already activated, you should notice that they start attesting.

However, for a few hours, the Execution Client may still appear busy, and the CPU usage may remain high. This may even cause your validators to miss attestations if they are already activated.

This is because Geth and Nethermind are still busy finishing off their respective tasks.

In the case of Geth, it is engaged in generating state snapshots. State snapshots are a feature introduced in Geth to improve the efficiency and speed of syncing Ethereum nodes. Generating state snapshots is a resource intensive task that requires significant computational power and disk space. However, once the state snapshot is generated, it can be stored and shared among multiple nodes, significantly reducing the time and resources needed for syncing.

During this phase, you will observe the following messages in the logs repeatedly:

Generating state snapshot
Aborting state snapshot generation
Resuming state snapshot generation

This behavior is normal. Geth attempts to perform snapshot generation between the arrival of new blocks to minimize interruption. Allow Geth to complete the snapshot generation process.

In the case of Nethermind, it is engaged in downloading the old blockchain data, called "Old Bodies" and "Old Receipts". These data sets are needed to ensure that your node possesses a complete view of the blockchain. Downloading the old chain data is a resource intensive task that requires significant computational power and disk space.

During this phase, you will observe similar messages in the logs:

Old Bodies        7,529,598 / 19,491,929 ( 38.63 %)

and later:

Old Receipts      2,491,766 / 19,491,929 ( 12.78 %)

Notice that the percentage value should steadily increase. It is normal for your validators to potentially miss some attestations during this phase due to the high system load. Operations will stabilize once the download is complete.
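
To follow this progress without reading every line, the "Old Bodies" and "Old Receipts" lines can be parsed. The sketch below assumes the log line layout shown in the samples above; the function name is hypothetical. It recomputes the percentage from the two counters rather than trusting the printed value.

```python
import re

# Matches lines such as:
#   Old Bodies        7,529,598 / 19,491,929 ( 38.63 %)
PROGRESS_RE = re.compile(r"(Old Bodies|Old Receipts)\s+([\d,]+)\s*/\s*([\d,]+)")


def parse_progress(line):
    """Return (stage, done, total, percent) or None if the line does not match."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    done = int(match.group(2).replace(",", ""))
    total = int(match.group(3).replace(",", ""))
    if total == 0:
        return None
    return match.group(1), done, total, round(100 * done / total, 2)
```

Feeding successive log lines through parse_progress and comparing the percentages shows whether the download is still advancing.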

Remember to ensure that you continue using a fan to keep your AVADO device cool.

Specific Syncing Guides

Refer to the following sections for more details on the syncing process of the Execution Clients, particularly on what to expect during the syncing process.

Other Errors

If you see any of the following errors in the Logs, the situation is usually serious: the database may be corrupt beyond repair. To recover, you will need to remove Geth, perform System > Disk Cleanup, re-install Geth and let it re-sync from scratch.

Unexpected trie node in disk
State snapshotter failed to iterate trie
Fatal: Failed to register the Ethereum service
Failed to retrieve genesis from ancient EOF
Fatal: could not open database.
Inserting block failed
Invalid merkle root
Bad block
Corruption on data-block
Error in block freeze operation
NewPayloadV1: inserting block failed error="invalid gas used"
Number of finalized block is missing
Head state missing, repairing
Block state missing, rewinding further

The following errors are likely a result of incorrect user settings. Check the value of EXTRA_OPTS (on the Management Page).

INFO gave up: geth entered FATAL state, too many start retries too quickly

WARN exited: geth (exit status 1; not expected)

Default value of EXTRA_OPTS should be:

--http.api eth,net,web3,txpool

or the following if you manually specified to use the "path state scheme":

--http.api eth,net,web3,txpool --state.scheme=path

Note the double minus signs, and note that there are no spaces around the commas and the periods.
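
The formatting rules above can be checked before clicking Apply. This is a rough sketch with a hypothetical function name; it only tests the conventions stated in this section (two leading minus signs, no spaces around commas or periods) and does not validate that the flags themselves exist in Geth.

```python
import re


def check_extra_opts(extra_opts):
    """Return a list of formatting problems found in an EXTRA_OPTS string."""
    problems = []
    for token in extra_opts.split():
        # A flag must start with exactly two minus signs followed by a letter.
        if token.startswith("-") and not re.match(r"--[A-Za-z]", token):
            problems.append(f"flag should start with two minus signs: {token}")
    if re.search(r"\s,|,\s", extra_opts):
        problems.append("remove spaces around commas")
    if re.search(r"\s\.|\.\s", extra_opts):
        problems.append("remove spaces around periods")
    return problems
```

Both default values shown above produce an empty list, while a value such as "-http.api eth, net" is reported for the single minus sign and the space after the comma.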

I still have a problem!

If you have performed all the necessary checks and your validators continue to fail in delivering attestations, we recommend reaching out to the AVADO community on Discord or raising a support ticket there.

