Auditbot crashloops due to OneTimeKey conflicts
Issue
Auditbot crashloops due to OneTimeKey conflicts. Its logs contain messages such as: One time key signed_curve25519:AAAAAAAAAAA already exists
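To confirm the symptom, the auditbot logs can be searched for the message quoted above. The following is a minimal sketch, not an official diagnostic: the helper name has_otk_conflict is hypothetical, the key ID pattern is an assumption based on the sample message, and the StatefulSet name and namespace are the ones used in the Resolution script and may differ in your deployment.

```shell
#!/bin/bash
# Succeeds (exit code 0) if the log text on stdin contains a one-time key
# conflict message. The key ID character set below is an assumption.
has_otk_conflict() {
  grep -Eq 'One time key signed_curve25519:[A-Za-z0-9+/]+ already exists'
}

# Hypothetical usage (resource name and namespace are assumptions):
#   kubectl logs sts/first-element-deployment-auditbot-pipe -n element-onprem \
#     | has_otk_conflict && echo "OneTimeKey conflict found"
```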
Environment
ESS 24.04 and later
Resolution
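Run the following script. It stops the operator and auditbot, claims the remaining one-time keys of the auditbot device until none are left, then restarts the operator so that it recreates auditbot.
#!/bin/bash
# Matrix server name, i.e. the domain part of the auditbot user ID
SERVER_NAME=<server name>
URL='https://<synapse fqdn>/_matrix/client/v3/keys/claim'
# Access token used to authenticate the requests against Synapse
TOKEN=
echo "Stopping the operator"
kubectl scale deploy/element-operator-controller-manager -n operator-onprem --replicas=0
echo "Stopping auditbot"
kubectl delete sts/first-element-deployment-auditbot-pipe -n element-onprem
# Claim one-time keys of the AUDITBOTPIPE device until the server has none left
while true; do
  # The body is double-quoted so that $SERVER_NAME is expanded inside the JSON
  out=$(curl -s "$URL" -H "Authorization: Bearer $TOKEN" --data "{\"one_time_keys\": {\"@auditbot:$SERVER_NAME\": {\"AUDITBOTPIPE\": \"signed_curve25519\"}}}")
  echo "$out"
  # Check if the response contains an empty "one_time_keys" property
  isempty=$(echo "$out" | jq 'if .one_time_keys == {} then true else false end')
  if [[ "$isempty" == "true" ]]; then
    echo "All claims are exhausted"
    break
  fi
  sleep 1
done
echo "Restarting the operator so that it restarts auditbot"
kubectl scale deploy/element-operator-controller-manager -n operator-onprem --replicas=1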
Root Cause
This issue can occur after an incident on your filesystem, such as unreliable writes or a full disk.
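Since a full disk is one of the possible triggers, it may help to check disk usage before restarting auditbot. The sketch below is an illustration only: the helper name full_filesystems and the 90 percent default threshold are assumptions, and where to run it (on the node or in the container that holds the affected data) depends on your deployment.

```shell
#!/bin/bash
# Reads `df -P` output on stdin and prints "<mount point> <use%>" for every
# filesystem at or above the given usage threshold (percent, default 90).
# Mount points containing spaces are not handled by this sketch.
full_filesystems() {
  awk -v limit="${1:-90}" 'NR > 1 {
    use = $5
    sub(/%/, "", use)
    if (use + 0 >= limit + 0) print $6, $5
  }'
}

# Hypothetical usage:
#   df -P | full_filesystems 90
```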