Velero
Velero alerts provided by Avisi Cloud
Default Alerts
AME Kubernetes comes with a set of default alerts for Velero. This page serves as a reference for when one of these alerts fires within your cluster. For each alert, it gives a brief description of what the alert means, a list of possible causes, and suggestions on how to resolve the issue.
Velero Alerts
VeleroBackupPartialFailures [warning]
A backup created by Velero has one or more errors.
This alert means that Velero was only able to create a partial backup. Certain resources and/or persistent disks were not included in the backup. You should investigate using velero backup describe <backup-name> and velero backup logs <backup-name> to determine which resources failed.
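The two commands above can be combined in a small shell helper. This is a minimal sketch, not part of Velero or AME Kubernetes: the function name inspect_backup is hypothetical, and the filter assumes that Velero's backup logs contain level=error on error lines.

```shell
# Hypothetical helper (not part of Velero): summarise why a backup partially failed.
inspect_backup() {
  backup_name="$1"
  if [ -z "$backup_name" ]; then
    echo "usage: inspect_backup <backup-name>" >&2
    return 2
  fi

  # Show the backup phase plus per-resource and per-volume details.
  velero backup describe "$backup_name" --details

  # Print only error-level log lines (assumes lines contain "level=error").
  velero backup logs "$backup_name" | grep -i 'level=error' \
    || echo "no error-level log lines found for $backup_name"
}
```

Calling inspect_backup <backup-name> prints the describe output followed by the error lines, which is usually enough to see which resources or volumes were skipped.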
Possible causes
- A namespace that is included in the backup does not exist. Adjust your backup schedule if necessary.
- Velero ran into memory constraints while creating the backup (e.g. an OOM event for restic).
- Restic backups timed out because they took too long.
VeleroBackupFailures [warning]
Velero failed to create the backup.
The backup failed entirely and no resources were stored in S3.
Possible causes
- No access to S3 object storage due to incorrect authentication.
- No network connection.
- The service account has incorrect RBAC permissions.
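The first two causes can often be confirmed by checking whether Velero considers its backup storage location reachable. The following is a minimal sketch: the function name check_backup_locations is hypothetical, and it assumes that velero backup-location get reports a phase of Available or Unavailable for each location.

```shell
# Hypothetical helper: report whether any backup storage location is unavailable.
check_backup_locations() {
  # List all backup storage locations together with their current phase.
  locations="$(velero backup-location get)"
  echo "$locations"

  # Assumption: an unreachable or misconfigured location shows "Unavailable".
  if echo "$locations" | grep -qw 'Unavailable'; then
    echo "at least one backup storage location is Unavailable" >&2
    return 1
  fi
}
```

If a location is reported as unavailable, the Velero server logs usually contain the underlying authentication or network error. Assuming Velero runs as a deployment named velero in the velero namespace, they can be read with kubectl -n velero logs deployment/velero.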
VeleroBackupRemovalFailures [warning]
Failed to delete a backup that should have been removed.
Either the backup could not be removed at all, or Velero was unable to acquire a lock on the restic repository in S3.
This could result in unnecessary storage usage in S3 or orphaned backup files.
Possible causes
- No access to S3 object storage due to incorrect authentication, no network permissions, ...
- OOM events for restic in the Velero pod (memory constraints).
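To check for the OOM events mentioned above, you can look at the last termination reason of each container in the Velero namespace. This is a minimal sketch: the function name find_oom_killed_pods is hypothetical, and the default namespace velero is an assumption that may not match your cluster.

```shell
# Hypothetical helper: list pods whose containers were last terminated by the OOM killer.
find_oom_killed_pods() {
  namespace="${1:-velero}"  # assumption: Velero runs in the "velero" namespace

  # Print "<pod name><TAB><last termination reason(s)>" for every pod,
  # then keep only the pods with an OOMKilled container.
  kubectl -n "$namespace" get pods \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[*].lastState.terminated.reason}{"\n"}{end}' \
    | grep 'OOMKilled' \
    || echo "no OOM-killed containers found in namespace $namespace"
}
```

A pod that shows up here was restarted after exceeding its memory limit, which points to the memory constraints listed as a possible cause.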