Skip to main content

Auditbot crashloops due OneTimeKey conflicts

Issue

Auditbot crashloops due OneTimeKey conflicts. You see logs such as One time key signed_curve25519:AAAAAAAAAAA already exists

Environment

ESS 24.04 and more

Resolution


#!/bin/bash
SERVER_NAME=<server name>
URL='https://<synapse fqdn>/_matrix/client/v3/keys/claim'
TOKEN=

echo "Stopping the operator"
kubectl scale deploy/element-operator-controller-manager -n operator-onprem --replicas=0
echo "Stopping auditbot"
kubectl delete sts/first-element-deployment-auditbot-pipe -n element-onprem

while true; do
    out=$(curl -s $URL -H "authorization: Bearer $TOKEN" --data '{"one_time_keys": {"@auditbot:$SERVER_NAME": { "AUDITBOTPIPE": "signed_curve25519"}}}')
    echo $out

    # Check if the response contains an empty "one_time_keys" property
    isempty=$(echo $out | jq 'if .one_time_keys == {} then true else false end')

    if [[ "$isempty" == "true" ]]; then
        echo "All claims are exhausted"
        break
    fi
    sleep 1
done
echo "Restarting the operator so that it restarts auditbot"
kubectl scale deploy/element-operator-controller-manager -n operator-onprem --replicas=1

Root Cause

This issue can occur when you have an incident on your filesystem : unreliable writes, full disks, etc.