Element Server Suite Pro
Documentation supporting ess-helm Pro edition.
- Introduction to ESS Pro
- Setting up ESS Pro Helm Chart
- Handling secrets in ESS Pro
- Maintenance
- Configuring Components
- Configuring Synapse
- Configuring Matrix Authentication Service
- Configuring Authentication
- Configuring Element Web
- Configuring Matrix RTC
- Setting up Advanced Identity Management
- Advanced Identity Management
- Administration
- Authentication Configuration Examples
- Backup and Restore
- Calculate monthly active users
- Configuring Element Desktop
- Guidance on High Availability
- Migrating from Self-Hosted to ESS
- Mobile client provisioning
- Starting and Stopping ESS Services
- Advanced
- Troubleshooting
Introduction to ESS Pro
[WORK IN PROGRESS]
Element Server Suite Pro (ESS Pro) is the commercial Matrix distribution from Element for professional use. It is based on ESS Community and includes additional features and services tailored to professional environments, from more than 100 users up to massive deployments with millions of users.
ESS Pro is designed to support enterprise requirements in terms of advanced IAM, compliance, scalability, high availability and multi-tenancy. ESS Pro makes use of Synapse Pro to provide infrastructure cost savings with unmatched stability and user experience under high load. It uses Element's Secure Border Gateway (SBG) as an application-layer firewall to manage federation and to ensure that deployments stay compliant at all times. ESS Pro includes L3 support, Long-term Support (LTS), Advanced Security Advisory and prepares customers for the Cyber Resilience Act (CRA).
This documentation provides all the information Element customers need to get started with and operate ESS Pro.
Editions
There are three editions of Element Server Suite:
ESS Community
ESS Community is a cutting-edge Matrix distribution including all the latest features of the Matrix server Synapse and other components. It is freely available under the AGPLv3 license and tailored to small-/mid-scale, non-commercial community use cases. It's designed to easily and quickly set up a Matrix deployment. It comprises the basic components needed to get you running and is a great way to get started.
ESS Pro
ESS Pro is the commercial Matrix distribution from Element for professional use (see above) which is described in this documentation.
ESS TI-M
ESS TI-M is a special version of ESS Pro focused on the requirements of TI-Messenger Pro and ePA as specified by the German National Digital Health Agency Gematik. It complies with a specific Matrix version and does not make use of experimental features.
Contents
TOC
Deploying ESS Pro
ESS Pro comes as a Helm chart and can be deployed using any Kubernetes distribution. It requires an existing Kubernetes cluster and can be operated on the public internet as well as in airgapped scenarios.
A full step-by-step deployment guide for ESS Pro using K3s can be found here.
Components
Below you will find an overview of the components in ESS Pro, including their purpose and additional information. Most components are deployed by default, but some require additional configuration first. Any component can be enabled or disabled as desired.
The following components are included in ESS Pro (bolded items are deployed by default):
- Synapse Pro
- Matrix Authentication Service (MAS)
- Dex (for LDAP support)
- Element Web
- Element Call / Matrix RTC
- Advanced Identity Management (AIM, formerly Group Sync)
- Secure Border Gateway (SBG)
- Sygnal (Push Gateway)
- PostgreSQL database
- .well-known delegation
Find below more details on each of the components, information about their capabilities and our recommendations for deployment.
Synapse Pro
Purpose
- The Matrix server that provides client-to-server and server-to-server APIs
- Consists of Synapse and additional Pro components that improve performance, scalability and stability
Deployment recommendations
- Enabled and deployed by default. Should only be disabled if there is an external Synapse deployment to be used instead.
- Works out-of-the-box with default configuration. For advanced configuration, see the below guide.
- Deployment and configuration guide
- Documentation
Matrix Authentication Service (MAS)
Purpose
- Authentication server for Matrix using the OpenID Connect / OAuth 2.0 standard
- Provides local user management capabilities
- Allows integration of external IDM systems
Deployment recommendations
- Enabled and deployed by default. Should only be disabled if Matrix legacy authentication is required.
- Works out-of-the-box with default configuration. For advanced configuration, see the below guide.
- Deployment and configuration guide
- Authentication configuration guide (LDAP / OIDC)
- Documentation
Dex (for LDAP support)
Purpose
- Lightweight Identity Provider supporting various protocols
- Only used for LDAP support with MAS
Deployment recommendations
- Disabled by default as enabling it requires configuration
- Automatically enabled if LDAP authentication is configured
- Authentication configuration guide (LDAP / OIDC)
Element Web
Purpose
- The browser-based client from Element
Deployment recommendations
- Enabled and deployed by default. Should only be disabled if a browser-based client is undesired.
- Deployment and configuration guide
- Documentation
Element Call / Matrix RTC
Purpose
- Backend to support Element Call in-app calling
- Includes an SFU (selective forwarding unit)
Deployment recommendations
- Enabled and deployed by default. Should only be disabled if in-app calling functionality is undesired.
- Deployment and configuration guide
- Documentation
Advanced Identity Management (AIM, formerly Group Sync)
Purpose
- Integration and automation between external Identity Management (IDM) systems and the Matrix backend
- Supports LDAP and SCIM
- Features
- Synchronize user attributes (e.g., display name, email address, etc.) with external IDM systems
- User lifecycle management (automated user deprovisioning)
- Mirror organizational structures to Matrix rooms and Spaces
- Automated room memberships based on user attributes in external IDM systems (e.g., group memberships)
- Automated room permission management based on user attributes in external IDM systems
Deployment recommendations
- Disabled by default as enabling it requires configuration
- For organizations with external IDM (LDAP or OIDC IdP), it is highly recommended to configure and enable AIM
- Deployment and configuration guide
- Documentation
Secure Border Gateway (SBG)
Purpose
Deployment recommendations
Sygnal (Push Gateway)
Purpose
Deployment recommendations
PostgreSQL database
Purpose
Deployment recommendations
.well-known delegation
Purpose
Deployment recommendations
Architecture
Setting up ESS Pro Helm Chart
Getting started
This guide is primarily intended as a simple walkthrough to set up ESS Pro. Users experienced with Helm and Kubernetes can refer directly to the chart README in Element's charts.
Resource requirements
The quick setup relies on K3s. It requires at least 2 CPU cores and 2 GB of memory available.
Prerequisites
You first need to choose what your server name is going to be. The server name makes up the latter part of a user's Matrix ID. In the following example Matrix ID, `server-name.tld` is the server name, and should point to your ESS Pro installation:
@alice:server-name.tld
It is currently not possible to change your server name without resetting your database and having to recreate the server.
Quick setup
Setting up a basic environment involves only 6 steps:
- Setting up DNS entries
- Setting up K3s (or use another Kubernetes distribution)
- Setting up TLS/certificates
- Installing the stack
- Creating an initial user
- Verifying the setup
The below instructions will guide you through each of the steps.
Preparing the environment
DNS
You need to create DNS entries to set up ESS Pro. All of these DNS entries must point to your server's IP.
- Server name: This DNS entry should point to the installation ingress. It should be the `server-name.tld` you chose above.
- Synapse: For example, you could use `matrix.<server-name.tld>`.
- Matrix Authentication Service: For example, you could use `account.<server-name.tld>`.
- Matrix RTC Backend: For example, you could use `mrtc.<server-name.tld>`.
- Element Web: This will be the address of the chat client of your server. For example, you could use `chat.<server-name.tld>`.
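As an illustration, if your server name were example.com and your server's public IP were 192.0.2.1, the resulting records could look like the following (hostnames follow the examples above; adjust them to your own choices):
example.com.          IN A    192.0.2.1
matrix.example.com.   IN A    192.0.2.1
account.example.com.  IN A    192.0.2.1
mrtc.example.com.     IN A    192.0.2.1
chat.example.com.     IN A    192.0.2.1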
Ports
For this simple setup you need to open the following ports:
- TCP 80: This port will be used for the HTTP connections of all services, which will redirect to the HTTPS connection.
- TCP 443: This port will be used for the HTTPS connections of all services.
- TCP 30881: This port will be used for the TCP WebRTC connections of Matrix RTC Backend.
- UDP 30882: This port will be used for the Muxed WebRTC connections of Matrix RTC Backend.
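For example, if you manage the host firewall with ufw, opening these ports could look like this (an illustrative sketch; also review the K3s firewall recommendations referenced below before changing firewall rules):
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw allow 30881/tcp
sudo ufw allow 30882/udp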
K3s - Kubernetes single node setup
This guide suggests using K3s as the Kubernetes node hosting ESS Pro. Other options are possible: you can use an existing Kubernetes cluster, or another distribution such as MicroK8s. Any Kubernetes distribution is compatible with ESS Pro, so choose one according to your needs. Please raise any issues or incompatibilities you discover with your support contact or account manager.
The following will install K3s on the node, and configure its Traefik proxy automatically. If you want to configure K3s behind an existing reverse proxy on the same node, please see the dedicated section.
If you have a firewall running on your server, please follow the official K3s recommendations.
- Run the following command to install K3s:
curl -sfL https://get.k3s.io | sh -
- Once K3s is set up, copy its kubeconfig to your home directory to get access to it:
mkdir ~/.kube
export KUBECONFIG=~/.kube/config
sudo k3s kubectl config view --raw > "$KUBECONFIG"
chmod 600 "$KUBECONFIG"
chown "$USER:$USER" "$KUBECONFIG"
- Add `export KUBECONFIG=~/.kube/config` to `~/.bashrc` to make it persistent.
- Install Helm, the Kubernetes package manager. You can use your OS repository or run the following command:
curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
- Create your Kubernetes namespace where you will deploy the Element Server Suite Pro:
kubectl create namespace ess
- Create a directory containing your Element Server Suite configuration values:
mkdir ~/ess-config-values
Logging in to Element's registry
You can use the following command to log in to Element's registry. Use your ESS credentials issued in your EMS Admin Dashboard, under On Premise Subscriptions.
helm registry login registry.element.io
Downloading ESS Pro example values files
The example configuration values files are shipped in the Helm chart archive. To download them, you can use the following command:
helm pull oci://registry.element.io/matrix-stack --untar -d charts
You can find the example configuration values files in the `charts/matrix-stack/ci` directory.
Configuring image pull authentication
ESS Pro images are hosted on the private Element registry. To use these images, you need to configure your authentication tokens.
Copy the file from `charts/matrix-stack/ci/fragments/ess-credentials.yaml` to `ess-credentials.yaml` in your ESS configuration values directory. Adjust the values according to your credentials.
Certificates
We present three options to set up certificates in Element Server Suite. To configure Element Server Suite behind an existing reverse proxy that already serves TLS, you can jump to the end of this section.
Let's Encrypt
To use Let’s Encrypt with ESS Pro, you should use Cert Manager. This is a Kubernetes component which allows you to get certificates issued by an ACME provider. The installation follows the official manual:
- Add Helm Jetstack repository:
helm repo add jetstack https://charts.jetstack.io --force-update
- Install Cert-Manager:
helm install \
cert-manager jetstack/cert-manager \
--namespace cert-manager \
--create-namespace \
--version v1.17.0 \
--set crds.enabled=true
- Configure Cert-Manager to allow ESS Pro to request Let’s Encrypt certificates automatically. Create a “ClusterIssuer” resource in your K3s node to do so:
export USER_EMAIL=<your email>
kubectl apply -f - <<EOF
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: $USER_EMAIL
    privateKeySecretRef:
      name: letsencrypt-prod-private-key
    solvers:
      - http01:
          ingress:
            class: traefik
EOF
- In your ESS configuration values directory, copy the file `charts/matrix-stack/ci/fragments/quick-setup-letsencrypt.yaml` to `tls.yaml`.
Certificate File
Wildcard certificate
If your wildcard certificate covers both the server-name and the hosts of your services, you can use it directly.
- Import your certificate file in your namespace using kubectl:
kubectl create secret tls ess-certificate --namespace ess \
--cert=path/to/cert/file --key=path/to/key/file
- In your ESS configuration values directory, copy the file `charts/matrix-stack/ci/fragments/quick-setup-wildcard-cert.yaml` to `tls.yaml`. Adjust the TLS Secret name accordingly if needed.
Individual certificates
- If you have a distinct certificate for each of your DNS names, you will need to import each certificate in your namespace using kubectl:
kubectl create secret tls ess-chat-certificate --namespace ess \
--cert=path/to/cert/file --key=path/to/key/file
kubectl create secret tls ess-matrix-certificate --namespace ess \
--cert=path/to/cert/file --key=path/to/key/file
kubectl create secret tls ess-auth-certificate --namespace ess \
--cert=path/to/cert/file --key=path/to/key/file
kubectl create secret tls ess-mtrc-certificate --namespace ess \
--cert=path/to/cert/file --key=path/to/key/file
kubectl create secret tls ess-well-known-certificate --namespace ess \
--cert=path/to/cert/file --key=path/to/key/file
- In your ESS configuration values directory, copy the file `charts/matrix-stack/ci/fragments/quick-setup-certificates.yaml` to `tls.yaml`. Adjust the TLS Secret names accordingly if needed.
Using an existing reverse proxy
- If the certificates are handled in an external load balancer, you can disable TLS in ESS. Copy the file `charts/matrix-stack/ci/fragments/quick-setup-external-cert.yaml` to `tls.yaml`.
Configuring the database
You can either use the database provided with ESS Pro or a dedicated PostgreSQL server. We recommend using a PostgreSQL server installed from your own distribution packages. For a quick setup, feel free to use the internal PostgreSQL database; the chart will configure it automatically by default.
Installation
The ESS Pro installation is performed using the Helm package manager and requires the configuration of values files as described in this documentation.
Setting up the stack
For a quick setup using the default settings, copy the file from `charts/matrix-stack/ci/fragments/quick-setup-hostnames.yaml` to `hostnames.yaml` in your ESS configuration values directory and edit the hostnames accordingly.
Run the setup using the following helm command. This command supports combining multiple values files depending on your setup. Typically you would pass a combination of the following to the command line:
- If using Let's Encrypt or certificate files: `--values ~/ess-config-values/tls.yaml`
- If using your own PostgreSQL server: `--values ~/ess-config-values/postgresql.yaml`
Each optional additional values file needs to be prefixed with `--values`.
To install a specific version, append `:version` after `/matrix-stack`. This is required to stay on the LTS; without specifying a version, you will install the latest available. See charts.element.io for a list of available versions. For example, use `oci://registry.element.io/matrix-stack:25.4.1` for the April 2025 LTS.
helm upgrade --install --namespace "ess" ess \
oci://registry.element.io/matrix-stack \
--values ~/ess-config-values/ess-credentials.yaml \
--values ~/ess-config-values/hostnames.yaml \
--values ~/ess-config-values/tls.yaml \
--values <optional additional values files to pass> \
--wait
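For instance, pinning the April 2025 LTS mentioned above only changes the chart reference; everything else stays the same:
helm upgrade --install --namespace "ess" ess \
  oci://registry.element.io/matrix-stack:25.4.1 \
  --values ~/ess-config-values/ess-credentials.yaml \
  --values ~/ess-config-values/hostnames.yaml \
  --values ~/ess-config-values/tls.yaml \
  --wait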
Wait for the helm command to finish up. ESS Pro is now installed!
Creating an initial user
ESS Pro does not allow user registration by default. To create your initial user, use the `mas-cli manage register-user` command in the Matrix Authentication Service pod:
kubectl exec --namespace ess -it deploy/ess-matrix-authentication-service -- \
mas-cli manage register-user
This should give you the following output:
Defaulted container "matrix-authentication-service" out of: matrix-authentication-service, render-config (init), db-wait (init), config (init)
✔ Username · alice
User attributes
Username: alice
Matrix ID: @alice:thisservername.tld
No email address provided, user will be prompted to add one
No password or upstream provider mapping provided, user will not be able to log in
Non-interactive equivalent to create this user:
mas-cli manage register-user --yes alice
✔ What do you want to do next? (<Esc> to abort) · Set a password
✔ Password · ********
User attributes
Username: alice
Matrix ID: @alice:thisservername.tld
Password: ********
No email address provided, user will be prompted to add one
Allowing user registration
See the MAS configuration page for details and a configuration example.
Verifying the setup
To verify the setup, you can:
- Open your Element Web client website and log in with the user you created above.
- Verify that federation works fine using the Matrix Federation Tester.
- Log in with an Element X mobile client using the user you created above.
- You can use a Kubernetes UI client such as k9s (TUI-based) or Lens (Electron-based) to see your cluster status.
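If you prefer plain kubectl over a UI client, a quick health check is to list the workloads and ingresses in the namespace:
kubectl get pods --namespace ess
kubectl get ingress --namespace ess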
Advanced setup
For advanced setup instructions, please refer to the Advanced setup guide.
Maintenance
For maintenance topics like upgrading, backups and restoring from backups, please refer to the Maintenance guide.
Uninstalling
If you wish to remove ESS Pro from your cluster, you can simply run the following commands to clean up the installation. Please note that deleting the `ess` namespace will remove everything within it, including any resources you may have manually created within it:
helm uninstall ess --namespace ess
kubectl delete namespace ess
If you want to also uninstall other components installed in this guide, you can do so using the following commands:
# Remove cert-manager from cluster
helm uninstall cert-manager --namespace cert-manager
# Uninstall helm
rm -rf /usr/local/bin/helm $HOME/.cache/helm $HOME/.config/helm $HOME/.local/share/helm
# Uninstall k3s
/usr/local/bin/k3s-uninstall.sh
# (Optional) Remove config
rm -rf ~/ess-config-values ~/.kube
Handling secrets in ESS Pro
Overview
The `matrix-stack` Helm chart provides flexible secret configuration options:
- Automatic generation via the `init-secrets` job.
- In-Helm values for simple, inline secret definitions.
- External secrets for integration with existing secret management systems.
A key feature of the chart is the init-secrets job, which automatically generates and stores secrets in a Kubernetes secret named `generated-secrets`. This simplifies the setup of sensitive configurations without manual intervention. This is supported only for secrets internal to the system.
The chart also supports custom secret configurations via either inline values or existing Kubernetes secrets.
The init secrets job
The init-secrets job is a Kubernetes job that runs once during the chart deployment as a helm `pre-install`/`pre-upgrade` hook to generate a secret named `generated-secrets`. This secret contains the necessary keys and configurations for the ESS Pro components.
Permissions required: To ensure the job can create the `generated-secrets` secret, the Kubernetes user must have permissions to manage RBAC in the target namespace. The Helm chart will create a service account with appropriate RBAC roles.
Use Case: This job is ideal for automated secret generation, especially when deploying ESS Pro for the first time. It avoids manual configuration of sensitive data and ensures consistency across deployments.
If you do not want to use the job, you will have to set:
initSecrets:
  enabled: false
The chart will then require all secrets to be defined in the values.
Configuring secrets using in-Helm values
You can directly define secrets in the `values.yaml` file using the secret's `value` field. This is useful for simple configurations or when secrets are not stored externally.
Example:
synapse:
  registrationSharedSecret:
    value: "your-secret-value"
This method is not considered the safest, as the secrets will be stored in clear text in the `values.yaml` file.
Configuring secrets using external secrets
For advanced use cases, you can reference existing Kubernetes secrets to inject values into the ESS Pro components. This is ideal when secrets are managed elsewhere (e.g., in a secret management system).
Example:
synapse:
  registrationSharedSecret:
    secret: existing-secret
    secretKey: key-in-secret
Requirements:
- The `existing-secret` must be a valid Kubernetes secret containing the key `key-in-secret`.
- This method allows seamless integration with external secret management tools (e.g., HashiCorp Vault, Azure Key Vault).
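For reference, a secret matching the example above could be created ahead of the deployment with standard kubectl (the names are the placeholders used in this example):
kubectl create secret generic existing-secret --namespace ess \
  --from-literal=key-in-secret="your-secret-value"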
Use Case: This approach is preferred when secrets need to be rotated or managed externally, ensuring compliance with security policies and reducing the risk of hardcoded credentials.
Maintenance
Maintenance
Contents
Upgrading
In order to upgrade your deployment, you should:
- Read the release notes of the new version and check if there are any breaking changes. The changelog is available on the Element matrix-stack chart page, in the right panel.
- Adjust your values if necessary.
- Re-run the install command. It will upgrade your installation to the latest version of the chart.
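Before re-running the install command, you can check which chart version is currently deployed with a standard Helm command:
helm list --namespace ess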
Backup & restore
Backup
You need to back up several things to be able to restore your deployment:
- Stop Synapse and Matrix Authentication Service workloads:
kubectl scale sts -l "app.kubernetes.io/component=matrix-server" -n ess --replicas=0
kubectl scale deploy -l "app.kubernetes.io/component=matrix-authentication" -n ess --replicas=0
- The database. You need to back up your database and restore it on a new deployment.
  - If you are using the provided Postgres database, build a dump using the command `kubectl exec --namespace ess -it sts/ess-postgres -- pg_dumpall -U postgres > dump.sql`. Adjust to your own Kubernetes namespace and release name if required.
  - If you are using your own Postgres database, please build your backup according to your database documentation.
- Your values files used to deploy the chart.
- The chart will generate some secrets if you do not provide them. To copy them to a local file, you can run the following command: `kubectl get secrets -l "app.kubernetes.io/managed-by=matrix-tools-init-secrets" -n ess -o yaml > secrets.yaml`. Adjust to your own Kubernetes namespace if required.
- The media files: Synapse stores media in a persistent volume that should be backed up. On a default K3s setup, you can find where Synapse media is stored on your node using the command `kubectl get pv -n ess -o yaml | grep synapse-media`.
- Run the `helm upgrade --install ...` command again to restore your workloads' pods.
Restore
- Recreate the namespace and apply the secrets backed up earlier:
kubectl create ns ess
kubectl apply -f secrets.yaml
- Redeploy the chart using the values files you backed up earlier.
- Stop Synapse and Matrix Authentication Service workloads:
kubectl scale sts -l "app.kubernetes.io/component=matrix-server" -n ess --replicas=0
kubectl scale deploy -l "app.kubernetes.io/component=matrix-authentication" -n ess --replicas=0
- Restore the postgres dump. If you are using the provided Postgres database, this can be achieved using the following commands:
# Drop newly created databases and roles
kubectl exec -n ess sts/ess-postgres -- psql -U postgres -c 'DROP DATABASE matrixauthenticationservice'
kubectl exec -n ess sts/ess-postgres -- psql -U postgres -c 'DROP DATABASE synapse'
kubectl exec -n ess sts/ess-postgres -- psql -U postgres -c 'DROP ROLE synapse_user'
kubectl exec -n ess sts/ess-postgres -- psql -U postgres -c 'DROP ROLE matrixauthenticationservice_user'
kubectl cp dump.sql ess-postgres-0:/tmp -n ess
kubectl exec -n ess sts/ess-postgres -- bash -c "psql -U postgres -d postgres < /tmp/dump.sql"
Adjust to your own Kubernetes namespace and release name if required.
- Restore the Synapse media files using `kubectl cp` to copy them into the Synapse pod. If you are using K3s, you can find where the new persistent volume has been mounted with `kubectl get pv -n ess -o yaml | grep synapse-media` and copy your files to the destination path.
- Run the `helm upgrade --install ...` command again to restore your workloads' pods.
Configuring Components
Configuring Synapse
See how to download example files from the helm chart here.
Configuration
For a quick setup using the default settings, see the minimal fragment example in `charts/matrix-stack/ci/fragments/synapse-minimal.yaml`.
Configuring a PostgreSQL database
If you want to use an external PostgreSQL database, see the following fragment examples:
- `charts/matrix-stack/ci/fragments/synapse-postgres.yaml`
- `charts/matrix-stack/ci/fragments/synapse-postgres-secrets-in-helm.yaml` or `charts/matrix-stack/ci/fragments/synapse-postgres-secrets-externally.yaml`
Credentials
Credentials are generated if possible. Alternatively they can either be provided inline in the values with `value`, or, if you have an existing `Secret` in the cluster in the same namespace, you can use `secret` and `secretKey` to reference it.
If you don't want the chart to generate the secrets, please refer to the following values fragment examples to see the secrets to configure.
Synapse requires the `registrationSharedSecret`, `signingKey` and `macaroon` secrets:
- `charts/matrix-stack/ci/fragments/synapse-secrets-in-helm.yaml`
- `charts/matrix-stack/ci/fragments/synapse-secrets-externally.yaml`
If you are configuring S3 storage, see the following values fragment examples to see the secrets to configure:
- `charts/matrix-stack/ci/fragments/synapse-s3-secrets-in-helm.yaml`
- `charts/matrix-stack/ci/fragments/synapse-s3-secrets-externally.yaml`
Additional configuration
Additional Synapse configuration can be provided inline in the values as a string with:
synapse:
  additional:
    ## Either reference config to inject by:
    1-custom-config:
      config: |
        admin_contact: "mailto:admin@example.com"
    ## Or reference an existing `Secret` by:
    2-custom-config:
      configSecret: custom-synapse-config
      configSecretKey: shared.yaml
Workers
The following Synapse workers are disabled by default and can be enabled on a per-worker basis:
- appservice
- background
- client-reader
- encryption
- event-creator
- event-persister
- federation-sender
- initial-synchrotron
- media-repository
- presence-writer
- pusher
- receipts-account
- sliding-sync
- sso-login
- synchrotron
- typing-persister
- user-dir
Synapse workers can be configured in the values with:
synapse:
  workers:
    <worker name>:
      enabled: true
Each worker comes with different options (static replicas, horizontal scaling, resources, etc.). These options can be seen under the `synapse.workers.<name>` section of `helm show values` for this chart.
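To inspect the options available for a given worker, you can dump the chart's default values and look at the relevant section, for example:
# Dump the chart's default values (append :version to pin a chart version)
helm show values oci://registry.element.io/matrix-stack > matrix-stack-values.yaml
# Then look at a specific worker's options, e.g. the synchrotron worker
grep -A 10 "synchrotron:" matrix-stack-values.yaml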
The following Synapse Pro workers are enabled by default:
- federation-reader
They can be disabled in the values with:
synapse:
  workers:
    <worker name>:
      enabled: false
Full details on available configuration options can be found at https://element-hq.github.io/synapse/latest/usage/configuration/config_documentation.html
Disabling Synapse
Synapse is enabled for deployment by default and can be disabled with the following values:
synapse:
  enabled: false
Configuring Matrix Authentication Service
See how to download example files from the helm chart here.
Configuration
For a quick setup using the default settings, see the minimal fragment example in `charts/matrix-stack/ci/fragments/matrix-authentication-service-minimal.yaml`.
Using Element Web ingress
If Element Web is deployed, you can use its ingress host to access the Matrix Authentication Service. To do so, you can skip configuring `matrixAuthenticationService.ingress`. The chart will automatically expose the Matrix Authentication Service on the same ingress as Element Web, under the path `/account`.
Configuring a PostgreSQL database
If you want to use an external PostgreSQL database, merge the following two files into `postgresql.yaml`:
- `charts/matrix-stack/ci/fragments/matrix-authentication-service-postgres.yaml`
- `charts/matrix-stack/ci/fragments/matrix-authentication-service-postgres-secrets-in-helm.yaml` or `charts/matrix-stack/ci/fragments/matrix-authentication-service-postgres-secrets-externally.yaml`
Credentials
Credentials are generated if possible. Alternatively they can either be provided inline in the values with `value`, or, if you have an existing `Secret` in the cluster in the same namespace, you can use `secret` and `secretKey` to reference it.
If you don't want the chart to generate the secrets, please refer to the following values fragment examples to see the secrets to configure.
Matrix Authentication Service requires the `encryptionSecret`, `synapseSharedSecret` and `synapseOIDCClientSecret` secrets:
- `charts/matrix-stack/ci/fragments/matrix-authentication-service-secrets-in-helm.yaml`
- `charts/matrix-stack/ci/fragments/matrix-authentication-service-secrets-externally.yaml`
If you are using LDAP authentication, you will also need to configure `dex.masClientSecret`.
Additional configuration
Additional Matrix Authentication Service configuration can be provided inline in the values as a string with:
matrixAuthenticationService:
  additional:
    ## Either reference config to inject by:
    1-custom-config:
      config: |
        admin_contact: "mailto:admin@example.com"
    ## Or reference an existing `Secret` by:
    2-custom-config:
      configSecret: custom-matrix-authentication-service-config
      configSecretKey: shared.yaml
Disabling Matrix Authentication Service
Matrix Authentication Service is enabled for deployment by default and can be disabled with the following values:
matrixAuthenticationService:
  enabled: false
Enable user registration
To allow user registration, you will need to configure MAS with SMTP. To do so, follow the steps in Configuring Matrix Authentication Service to inject the additional email configuration.
Here is a sample minimal MAS configuration that allows user registration. You are encouraged to look through the MAS documentation linked above and customise the options to your requirements.
matrixAuthenticationService:
  additional:
    user-config.yaml:
      config: |
        email:
          from: '"Company Ink" <noreply@example.com>'
          reply_to: '"Company Ink" <noreply@example.com>'
          transport: smtp
          mode: starttls
          hostname: "smtp.example.com"
          port: 587
          username: smtpuser
          password: secretsmtppassword
        account:
          password_registration_enabled: true
          password_recovery_enabled: true
          account_deactivation_allowed: true
          login_with_email_allowed: true
        policy:
          data:
            emails:
              allowed_addresses:
                suffixes: ["@example.com"]
        rate_limiting:
          account_recovery:
            per_ip:
              burst: 3
              per_second: 0.0008
            per_address:
              burst: 3
              per_second: 0.0002
          login:
            per_ip:
              burst: 3
              per_second: 0.05
            per_account:
              burst: 1800
              per_second: 0.5
          registration:
            burst: 3
            per_second: 0.0008
Enable the MAS Admin API
To enable the MAS Admin API, you need to add some additional MAS configuration. There are two modes of using the Admin API; you can enable either one on its own or both, as per your requirements. Note that you will need to generate valid ULIDs for the client IDs below using a ULID generator, for example https://ulidgenerator.com/.
- Using the Swagger UI provided with MAS. An example is available on the MAS documentation page at https://element-hq.github.io/matrix-authentication-service/api/index.html. However, we encourage you to instead use the one hosted by your MAS instance at `https://your-mas-domain.tld/api/doc/`. `ULID_Admin_Client_1` in the below example enables authentication for graphical MAS clients like the Swagger UI.
- Manually calling the API using a REST client, for example cURL or Bruno. This is documented in this example in the MAS documentation. This is `ULID_Admin_Client_2` in the below example.
Ensure you protect the Client IDs and Secrets as these grant full access to manage all accounts on your server.
Example configuration:
matrixAuthenticationService:
  additional:
    user-config.yaml:
      config: |
        policy:
          data:
            admin_clients:
              - ULID_Admin_Client_1
              - ULID_Admin_Client_2
            admin_users:
              - your-admin-user
        clients:
          - client_id: ULID_Admin_Client_1
            client_auth_method: client_secret_post
            client_secret: A-secret
            redirect_uris:
              - https://account.example.com/api/doc/oauth2-callback
          - client_id: ULID_Admin_Client_2
            client_auth_method: client_secret_basic
            client_secret: Another-secret
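As a sketch of the second mode (manual API calls), the flow is a standard OAuth 2.0 client credentials grant followed by a call to the Admin API. The URLs below assume the account.example.com hostname from the example and the usual MAS endpoint paths; confirm both against your instance's OIDC discovery document and the MAS Admin API reference:
# 1. Obtain an access token using ULID_Admin_Client_2 (client_secret_basic)
curl -u 'ULID_Admin_Client_2:Another-secret' \
  -d 'grant_type=client_credentials' \
  -d 'scope=urn:mas:admin' \
  https://account.example.com/oauth2/token
# 2. Call an Admin API endpoint with the returned access token
curl -H 'Authorization: Bearer <access_token>' \
  https://account.example.com/api/admin/v1/users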
Synapse Admin API
Access tokens returned by the above MAS Admin API configuration cannot be used with the Synapse Admin API. Long term, we plan to implement Personal Access Tokens in MAS. However, until that feature has landed, the only way to get an access token for the Synapse Admin API is using `mas-cli`.
kubectl exec --container matrix-authentication-service --namespace ess \
--stdin --tty deploy/ess-matrix-authentication-service \
-- mas-cli manage issue-compatibility-token \
--yes-i-want-to-grant-synapse-admin-privileges \
your-username
This will return a response similar to this:
2025-05-21T11:11:53.564226Z INFO mas_cli::commands::manage:320 Compatibility
token issued: mct_secret compat_access_token.id=ZI1UZZKCNWFOBFUUOQEYZBSIU8
compat_session.id=9X1BFZGXOYXGG5MDHPODT3ER6Q compat_session.device=MI71UWHZLG
user.id=QZEMHAYQCYXS8AYYQ3QWTRMNJZ user.username=your-username
In this example, `mct_secret` is your admin access token.
Ensure you protect the access token as this grants full access to manage your server.
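As a quick, illustrative check that the token works, you can call a Synapse Admin API endpoint (the Admin API lives under /_synapse/admin on your Synapse host; the hostname below is a placeholder):
curl -H "Authorization: Bearer mct_secret" \
  "https://matrix.server-name.tld/_synapse/admin/v2/users?from=0&limit=10"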
Configuring Authentication
See how to download example files from the helm chart here.
Overview
Authentication for the chart components is managed by the top-level `authentication` key.
The configuration is similar whether you use standalone Synapse (legacy authentication) or enable Matrix Authentication Service.
You can find configuration examples in `charts/matrix-stack/ci/fragments/authentication-secrets-externally.yaml` and `charts/matrix-stack/ci/fragments/authentication-secrets-in-helm.yaml`.
Registration and Password Authentication
The charts come with:
- registration disabled by default
- password authentication enabled by default
To change this default behaviour, you will have to configure it through the `synapse.additional` or `matrixAuthenticationService.additional` key. See the Synapse documentation or Matrix Authentication Service documentation for more details.
Configuring OIDC
You can configure a list of OIDC providers to use in the chart. Please refer to the description of the `authentication.oidc` key in the values file for details.
Configuring LDAP
You can configure a list of LDAP providers to use in the chart. Please refer to the description of the `authentication.ldap` key in the values file for details.
If LDAP is configured, and Advanced Identity Management is enabled, it will use the first LDAP provider configured in the list as the source of its users.
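The inline descriptions of these keys can be read straight from the chart's default values, for example:
helm show values oci://registry.element.io/matrix-stack > matrix-stack-values.yaml
# then read the commented descriptions under the top-level authentication: key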
Configuring Element Web
See how to download example files from the helm chart here.
Configuration
For a quick setup using the default settings, see the minimal fragment example in `charts/matrix-stack/ci/fragments/element-web-minimal.yaml`.
Additional configuration
Additional Element Web configuration can be provided inline in the values as a JSON string with:
elementWeb:
  additional:
    user-config.json: |
      {
        "some": "settings"
      }
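For example, a standard Element Web config.json option such as default_theme can be set this way (the value shown is purely illustrative):
elementWeb:
  additional:
    user-config.json: |
      {
        "default_theme": "dark"
      }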
Disabling Element Web
Element Web is enabled for deployment by default and can be disabled with the following values:
elementWeb:
  enabled: false
Configuring Matrix RTC
See how to download example files from the helm chart here.
Configuration
For a quick setup using the default settings, see the minimal fragment example in `charts/matrix-stack/ci/fragments/matrix-rtc-minimal.yaml`.
Credentials
Credentials are generated if possible. Alternatively they can either be provided inline in the values with `value`, or, if you have an existing `Secret` in the cluster in the same namespace, you can use `secret` and `secretKey` to reference it.
If you don't want the chart to generate the secret, please refer to the following values fragment examples to see the secrets to configure.
Matrix RTC requires the `livekitAuth.secret` secret:
- `charts/matrix-stack/ci/fragments/matrix-rtc-secrets-in-helm.yaml`
- `charts/matrix-stack/ci/fragments/matrix-rtc-secrets-externally.yaml`
SFU Networking
The Matrix RTC SFU networking relies on NodePort services by default. This means that the node must be reachable from outside of the cluster. The default ports are:
- RTC TCP: 30000/TCP
- RTC Muxed UDP: 30001/UDP
This can be configured using `matrixRTC.sfu.exposedServices`.
The default SFU networking relies on STUN to discover its public IP, which it will automatically advertise to the clients. The STUN servers can be configured in the LiveKit configuration using the `additional` section:
matrixRTC:
  sfu:
    additional: |
      rtc:
        stun_servers:
          - ip:port
          - ip:port
          - ...
Accessing from behind a Load Balancer
If you are behind a load balancer, you must forward the ports from the load balancer to the nodes. The ports must be the same on the load balancer and the nodes. In this situation, the SFU cannot discover the load balancer's public IP using the STUN method. Instead, you must set the environment variable `NODE_IP`:
matrixRTC:
  sfu:
    extraEnv:
      - name: NODE_IP
        value: 1.2.3.4
    additional: |
      rtc:
        use_external_ip: false
        # To workaround https://github.com/livekit/livekit/issues/2088
        # Any IP address is acceptable, it doesn't need to be a correct one,
        # it just needs to be present to get LiveKit to skip checking all local interfaces
        # We assign here a TEST-NET IP which is
        # overridden by the NODE_IP env var at runtime
        node_ip: 198.51.100.1
Additional SFU configuration
Additional Matrix RTC SFU configuration can be provided inline in the values as a string with:
matrixRTC:
  sfu:
    additional:
      ## Either reference config to inject by:
      1-custom-config:
        config: |
          admin_contact: "mailto:admin@example.com"
      ## Or reference an existing `Secret` by:
      2-custom-config:
        configSecret: custom-matrix-rtc-config
        configSecretKey: shared.yaml
Disabling Matrix RTC
Matrix RTC is enabled for deployment by default and can be disabled with the following values:
matrixRTC:
  enabled: false
Setting up Advanced Identity Management
See how to download example files from the helm chart here.
Configuration
For a quick setup using the default settings, see the minimal fragment from `charts/matrix-stack/ci/fragments/advanced-identity-management-minimal.yaml`.
Configuring a PostgreSQL database
If you want to configure the `advancedIdentityManagement.postgres` database manually, see the following fragments:
- `charts/matrix-stack/ci/fragments/advanced-identity-management-test-postgres.yaml`
- `charts/matrix-stack/ci/fragments/advanced-identity-management-test-postgres-secrets-in-helm.yaml` or `charts/matrix-stack/ci/fragments/advanced-identity-management-test-postgres-secrets-externally.yaml`
Edit the values accordingly.
Configuring with SCIM bridging
To use Advanced Identity Management SCIM bridging, one of the following is required:
- Configure an Advanced Identity Management ingress. You can use the example from `charts/matrix-stack/ci/fragments/advanced-identity-management-ingress.yaml`. The SCIM endpoint will be available at the root of the Advanced Identity Management hostname.
- Use the existing Synapse ingress. If Synapse and Advanced Identity Management are deployed in the same chart release, a `/scim/v2` path will be available at the root of the Synapse ingress.
Configuring Advanced Identity Management synchronization
If LDAP is configured under `authentication.ldap`, Advanced Identity Management will use the first provider in the list as its own LDAP source provider.
If you want to configure the LDAP provider manually, you can do so using the `advancedIdentityManagement.additional` property. See the Advanced Identity Management Overview for how to configure it.
Advanced Identity Management
Overview
Advanced Identity Management allows you to represent your organization's structure within Matrix and Element: creating a space for all its members, maintaining their membership in rooms and subspaces, managing power levels and more.
It is composed of two main parts:
- Bridges connect to an existing data source (LDAP, Azure AD or others) and extract the list of users and groups from it. Multiple Bridges exist, and more can be added by implementing the `Bridge` interface (see `src/bridging`). See Bridging for more details.
- The Provisioner takes the directory produced by a bridge, maps it to Matrix spaces (see Space mapping) and enforces its presence on a Matrix server. Enforcing means that it will both create and modify it as needed, but also act as a Matrix Application Service that automatically reacts to changes on the Matrix server and checks them against the rules established prior. The Provisioner is ignorant of its data source: it is not aware of the Bridge being used and is merely fed data from it. See Provisioning for more details.
In addition to that, Advanced Identity Management is also an Application Service. The Provisioner observes the events reported by the AS in case it needs to enforce its rules on entities that it didn't itself create: for example, demoting a room creator to their expected power level (see LDAP as a source of truth).
Example configuration
provisioner:
# Optional. A list of rooms that'll get automatically created in the managed space.
# The ID is required to enable GPS to track whether they were already created or not
# – you can change it, but it'll cause new rooms to be generated.
default_rooms:
- id: 'general'
properties: { name: 'General discussion' }
# Optional. A list of userid patterns that will not get kicked from rooms
# even if they don't belong to them according to LDAP.
# This is useful for things like the auditbot.
# Patterns listed here will be wrapped in ^ and $ before matching.
allowed_users:
- '@adminbot:.*'
# Optional. Determines whether users will be automatically invited to rooms (default, public and space-joinable)
# when they gain access to them. Defaults to true. Users will still get invited to spaces regardless of this setting.
invite_to_public_rooms: false
# Optional: A list of remote Advanced Identity Management we'll be federating with. Requests from other remote users will be ignored.
federation:
federates_with:
- '@aim_bot:consultancy.test'
# Optional. When enabled, spaces that are no longer configured, and rooms belonging to those spaces will be cleaned up.
# This will likely become enabled by default in the future.
# When disabled (or omitted), GS will log the rooms and spaces it would clean up if allowed to.
gc:
enabled: true
# Optional. If configured, Advanced Identity Management will synchronize user accounts (attributes and account validity)
# found in the data directory to the specified list of targets.
userProvisioner:
# Optional. Configure to enable user deprovisioning. Disabled by default.
deprovisioning:
enabled: false
# Optional. When users get removed from the directory their accounts will only be deactivated,
# but their erasure will be delayed by the specified time period, allowing them to be reactivated in the meantime.
# The format is <amount><unit>, with amount being numeric and unit being one of: [s, m, h, d], for seconds, minutes,
# hours or days respectively (for example: "24h", "31d" etc.).
# The specified period will be translated into seconds, so won't account for things like DST, leap seconds etc.
# Users will be deleted *no sooner* than that, but may be removed a bit later, depending on other Advanced Identity Management operations.
# By default set to 30 days.
soft_delete_period: '30d'
# Configure the spaces you want Advanced Identity Management to manage on your Matrix server
spaces:
# The internal ID of this space. Don't change it after you set it, or it will create a new one and abandon the old one.
- id: main
# The display name of the space, safe to change later.
name: 'My Company'
# The list of groups from your user directory to add as members to this space.
# An empty string is a special name that means "all available users, regardless of group memberships".
# You can set a power level for any of the groups. By default it's 0.
# This space is going to contain all the users in the directory,
# and those that are present in the "managers" groups will be the moderators (in the space and its child rooms).
groups:
- externalId: ''
- externalId: 'cn=managers,ou=employees,dc=element,dc=test'
powerLevel: 50
- id: management
name: 'Management'
groups:
- externalId: 'cn=managers,ou=employees,dc=element,dc=test'
# You can configure spaces that'll be federated between multiple GS servers.
# Each Advanced Identity Management will only manage its local users.
- id: shared
name: 'Federated space'
groups:
- externalId: 'cn=engineering,ou=employees,dc=element,dc=test'
federatedGroups:
# The external ID of the group on the foreign server.
# This will be enforced by the Advanced Identity Management running on consultancy.test, not this instance configured here.
# The Advanced Identity Management running on consultancy.test needs to have our MXID
# (@gpsbot:element.test by default) configured in its provisioner.federates_with config option.
- externalId: 'ou=element-contractors,dc=consultancy,dc=test'
# The MXID of the remote Advanced Identity Management bot.
agent: '@aim_bot:consultancy.test'
source:
type: 'ldap'
# LDAP will be checked for changes every this many seconds
check_interval_seconds: 60
# The following can be copied straight from Synapse's homeserver.yaml
# if you're already using its LDAP password provider
uri: "ldap://element.test"
# The base ou we specify here will become the root space
base: "ou=employees,dc=element,dc=test"
# Optional. An LDAP filter to use when searching for entries
filter: '(!(ou=Domain Controllers))'
# Make sure the account you use here has enough permissions to perform searches on your `base`
bind_dn: "ELEMENT\\administrator"
bind_password: "donkey.8"
# Needs `uid` to be able to determine Matrix localparts for users
# and `name`s to pick the right names for spaces
attributes:
uid: "sAMAccountName"
name: "name"
# If the LDAP server requires a client certificate, enable this option.
# cert:
# The path to the file
# file: "./my-cert-file.pem"
# OR the PEM-encoded cert itself
# cert: "foobar"
# Passphrase for the cert, if required.
# passphrase: "passphrase"
# For Microsoft Graph:
# source:
# type: 'ms-graph-ad'
#
# # This is the "Tenant ID" from your Azure Active Directory Overview
# tenant_id: 'b9355cb3-feed-dead-beef-9cc325f0335b'
#
# # Register your app in "App registrations". This will be its "Application (client) ID"
# client_id: '5c955b66-18b3-42de-bb5a-13b5a202d4fc'
#
# # Go to "Certificates & secrets", and click on "New client secret".
# # This will be the "Value" of the created secret (not the "Secret ID").
# client_secret: 'yOb7Q~Km~~YMKzpeq73swJj3kOeJpUwXSZamr'
# # For the bridge to be able to operate correctly, navigate to API permissions and ensure
# # it has access to GroupMember.Read.All and User.Read.All
# # Application permissions for Microsoft Graph. Remember to grant the admin consent for those.
#
# # Optional. The url to reach Graph on. Override if your deployment uses a specific graph endpoint.
# base_url: 'https://graph.microsoft.com/'
#
# # Optional. Specific scopes to set for graph to use.
# scopes: ['https://graph.microsoft.com/.default']
# For SCIM:
# type: 'scim'
# # HTTP port that the SCIM server will listen on
# port: 8040
# # Optional URL prefix for all routes
# base_url: '/scim/v2'
# client:
# # Unique ID for the SCIM client.
# # This will be used to keep track of the managed Space and User/Group storage in Matrix.
# id: 'element-ad'
# # You can set up multiple client tokens with different permission levels.
# rbac:
# # Bearer token for the client, as per RFC 6750
# - token: 'foo-bar-baz'
# # What's the token allowed to do: in this case, everything (read+write on all endpoints).
# # The format for these is 'access:scope', access being 'read', 'write' or '*' for both,
# # scope being 'users', 'groups' or '*' for everything.
# roles: ['*:*']
# # You can specify permissions for anyone who presents a valid Matrix access_token for an admin user
# - synapse_user: 'admin'
# # ...and assign more fine-tuned permissions to it
# roles: ['read:*', 'write:groups']
# attributeMapping:
# # The SCIM user attribute that'll be used as the Matrix username for provisioned users
# username: 'externalId'
# # Should SCIM user creation register a Matrix account for the user.
# # Possible values are 'yes', 'no' and 'if-missing'
# # - 'yes' will register Matrix accounts on the server upon a SCIM create user request,
# # and error out if the user with that username already exists.
# # - 'if-missing' will register Matrix accounts unless they exist already.
# # This is useful if some users have their user accounts created independently before the SCIM bridge was set up.
# # - 'no' will not create user accounts, only work with existing ones.
# register_users: 'no'
# # Optional: Should SCIM responses wait for Matrix provisioning to complete.
# # It is recommended to leave it as false. HTTP responses will be sent quicker,
# # and Matrix provisioning may still fail in the background (to be retried later).
# synchronous_provisioning: false
# # Optional: Configure a mailer to send email notifications to newly registered, activated and deactivated users.
# # mailer:
# # # The email address emails will be sent from
# # from: 'element@element.com'
# # # Path to a directory with email templates.
# # # Each template should be a directory containing 'subject.pug', 'text.pug' and 'html.pug',
# # # all using https://pugjs.org/ as a template language.
# # # Advanced Identity Management ships with standard, Element-branded templates in templates/
# # templates_path: './templates'
# # # SMTP transport configuration, as per https://nodemailer.com/smtp/,
# # # except that we default `secure` to `true` and `port` to 465.
# # transport:
# # host: 'smtp.example.com'
# # auth:
# # user: 'mailer'
# # pass: 'mailerpass.8'
# Optional. Configure this to gather usage statistics.
# See telemetry spec at https://gitlab.matrix.org/new-vector/modular/telemetry-schema
# for details on what's being gathered and sent.
telemetry:
# Identifier of this Advanced Identity Management instance
instance_id: 'foo'
# Every this many seconds (and on startup) telemetry will be recorded (and optionally sent)
send_interval: 3600
# Optional: the EMS endpoint to submit telemetry entries to.
# This is optional as it wouldn't work for airgapped environments,
# and by default no telemetry is sent (but it is still gathered).
endpoint: 'https://ems.com/telemetry'
# Optional: how many times should we retry sending telemetry if it fails. Defaults to 3
retry_count: 3
# Optional: how long should we wait between retries. Defaults to 60, in seconds
retry_interval: 60
# Optional
logging:
# Allowed levels are: error, warn, info, http, verbose, debug, silly - case sensitive.
# "info" will typically notify of all "write" actions (affecting the state of the homeserver),
# while "debug" will also be reporting checks performed that didn't result in any changes.
level: "info"
# Optional. Allowed formats are:
# - pretty: the default. A timestamped, colorized output suitable for humans
# - json: logging a json object containing a `level`, `message`, `timestamp` and optionally a `label`
format: "json"
Bridging
Bridging directories
The Bridges' job is to turn the contents of an external data directory into a data structure that can then be constructed on the Matrix server by the Provisioner. See State representation for a detailed description of the data structure being produced.
See specific bridges (in the sidebar) to learn more about how GS interprets the contents of specific data sources.
Bridges run continuously and trigger provisioning whenever they observe changes in the data source.
LDAP
The LDAP bridge will periodically (according to its configuration) fetch the LDAP tree from the server (filtering out the things it doesn't find interesting).
To enable maximum flexibility it "flattens" the LDAP tree, so that the users' (and groups') place in the directory tree doesn't matter.
Groups, OrgUnits and Domains (if found) will all be flattened and treated like containers for users. This makes it possible to use their DNs[^note] (fully qualified names) to assign users to spaces and power levels in those spaces.
[^note]: CNs are also allowed here for backwards compatibility reasons, but only for groups. It is however advised to avoid using CNs and use DNs instead, since they are guaranteed to be unique across the LDAP tree. GS' behaviour is undefined when mapping groups with duplicate names.
For example, for the following LDAP tree:
- Company (Domain) (`dc=company`)
- Alfred (User) (`cn=alfred,cn=company`)
- Engineering (OrgUnit) (`ou=engineering,cn=company`)
- Barbara (User) (`cn=barbara,ou=engineering,cn=company`)
- Moderators (Group) (`cn=moderators,ou=engineering,cn=company`)
- Charlie (User) (`cn=charlie,ou=engineering,cn=company`)
Company, Engineering and Moderators will all be treated as if they were a group. We could then use the following space mapping configuration with it:
spaces:
  - id: root
    name: "Company"
    groups:
      - externalId: 'dc=company' # or leave it empty with the same result
    subspaces:
      - id: engineering
        name: Engineering
        groups:
          - externalId: 'ou=engineering,cn=company'
          - externalId: 'cn=moderators,ou=engineering,cn=company'
            powerLevel: 50
Example Configuration
When using the Helm chart, the authentication schema is automatically used to configure the GroupSync LDAP source. If you want to override some settings, you can always provide the following configuration:
source:
type: 'ldap'
# LDAP will be checked for changes every this many seconds
check_interval_seconds: 60
# The following can be copied straight from Synapse's homeserver.yaml
# if you're already using its LDAP password provider
uri: "ldap://element.test"
# The base ou we specify here will become the root space
base: "ou=employees,dc=element,dc=test"
# Optional. An LDAP filter to use when searching for entries
filter: '(!(ou=Domain Controllers))'
# Make sure the account you use here has enough permissions to perform searches on your `base`
bind_dn: "ELEMENT\\administrator"
bind_password: "donkey.8"
# Needs `uid` to be able to determine Matrix localparts for users
# and `name`s to pick the right names for spaces
attributes:
uid: "sAMAccountName"
name: "name"
# If the LDAP server requires a client certificate, enable this option.
# cert:
# The path to the file
# file: "./my-cert-file.pem"
# OR the PEM-encoded cert itself
# cert: "foobar"
# Passphrase for the cert, if required.
# passphrase: "passphrase"
Microsoft Graph
The MsGraph bridge will periodically (according to its configuration) perform the following API calls:
- `/organization`, to determine the name of the organization
- `/users`, to get the list of users in the org
- `/groups`, to get the list of groups in the org
- `/groups/<id>/members`, to get a list of members for a particular group
In order to perform the queries successfully, Advanced Identity Management's Application needs to have the following permissions granted in Azure:
- `User.Read.All`
- `GroupMember.Read.All`
It emits a list of users and groups as-is, without performing any transformations on them.
Example Configuration
Using MS-Graph requires the following GroupSync configuration:
source:
type: 'ms-graph-ad'
# This is the "Tenant ID" from your Azure Active Directory Overview
tenant_id: 'b9355cb3-feed-dead-beef-9cc325f0335b'
# Register your app in "App registrations". This will be its "Application (client) ID"
client_id: '5c955b66-18b3-42de-bb5a-13b5a202d4fc'
# Go to "Certificates & secrets", and click on "New client secret".
# This will be the "Value" of the created secret (not the "Secret ID").
client_secret: 'yOb7Q~Km~~YMKzpeq73swJj3kOeJpUwXSZamr'
# For the bridge to be able to operate correctly, navigate to API permissions and ensure
# it has access to GroupMember.Read.All and User.Read.All
# Application permissions for Microsoft Graph. Remember to grant the admin consent for those.
# Optional. The url to reach Graph on. Override if your deployment uses a specific graph endpoint.
base_url: 'https://graph.microsoft.com/'
# Optional. Specific scopes to set for graph to use.
scopes: ['https://graph.microsoft.com/.default']
SCIM
The SCIM bridge maintains an HTTP service that conforms to the SCIM protocol (RFC 7644) and provisions a Matrix server with the SCIM resources sent to it.
Configuration
The following options are available when configuring the SCIM bridge:
- base_url (optional) - the URL prefix for each route. For instance, with base_url set to /scim/azure, requests need to hit /scim/azure/Users etc. Useful when running behind a non-rewriting proxy. Set to an empty string by default.
- client – a structure with the following fields:
  - id - a string specifying the name of the client, e.g. the name of the organization. Must be unique across SCIM bridge instances running on the server. Changing this value after some SCIM resources have been provisioned is equivalent to creating a new user/group database and a new Matrix Space. The value will also be the default name of the organization Space (which can later be changed by Space Moderators).
  - token - access token for the SCIM client, as per RFC 6750
  - attributeMapping - a structure with the following fields:
    - username - the SCIM user attribute to use when determining the Matrix username. If you're using OIDC, make sure it matches its setup.
- synchronous_provisioning (optional) - a boolean flag. If set to true, SCIM responses won't be sent before the Matrix provisioning finishes, and any Matrix errors may cause SCIM requests to fail and potentially leave the server in an invalid state. Useful for testing. False by default, and it's strongly recommended to leave it that way.
Example configuration
Configuring the SCIM bridge requires the following values. When using the ESS Helm chart, you need to set groupSync.enableSCIM to expose the SCIM ingress. It will be available under the GroupSync ingress if it is enabled, or under the Synapse ingress at the /scim/v2 path.
source:
type: 'scim'
client:
# Unique ID for the SCIM client.
# This will be used to keep track of the managed Space and User/Group storage in Matrix.
id: 'element-ad'
# You can set up multiple client tokens with different permission levels.
rbac:
# Bearer token for the client, as per RFC 6750
- token: 'foo-bar-baz'
# What's the token allowed to do: in this case, everything (read+write on all endpoints).
# The format for these is 'access:scope', access being 'read', 'write' or '*' for both,
# scope being 'users', 'groups' or '*' for everything.
roles: ['*:*']
# You can specify permissions for anyone who presents a valid Matrix access_token for an admin user
- synapse_user: 'admin'
# ...and assign more fine-tuned permissions to it
roles: ['read:*', 'write:groups']
attributeMapping:
# The SCIM user attribute that'll be used as the Matrix username for provisioned users
username: 'externalId'
# Should SCIM user creation register a Matrix account for the user.
# Possible values are 'yes', 'no' and 'if-missing'
# - 'yes' will register Matrix accounts on the server upon a SCIM create user request,
# and error out if the user with that username already exists.
# - 'if-missing' will register Matrix accounts unless they exist already.
# This is useful if some users have their user accounts created independently before the SCIM bridge was set up.
# - 'no' will not create user accounts, only work with existing ones.
register_users: 'no'
# Optional: Should SCIM responses wait for Matrix provisioning to complete.
# It is recommended to leave it as false. HTTP responses will be sent quicker,
# and Matrix provisioning may still fail in the background (to be retried later).
synchronous_provisioning: false
# Optional: Configure a mailer to send email notifications to newly registered, activated and deactivated users.
# mailer:
# # The email address emails will be sent from
# from: 'element@element.com'
# # Path to a directory with email templates.
# # Each template should be a directory containing 'subject.pug', 'text.pug' and 'html.pug',
# # all using https://pugjs.org/ as a template language.
# # Group sync ships with standard, Element-branded templates in templates/
# templates_path: './templates'
# # SMTP transport configuration, as per https://nodemailer.com/smtp/,
# # except that we default `secure` to `true` and `port` to 465.
# transport:
# host: 'smtp.example.com'
# auth:
# user: 'mailer'
# pass: 'mailerpass.8'
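When deploying with the ESS Helm chart, the SCIM ingress mentioned above is exposed via groupSync.enableSCIM. A minimal sketch of the corresponding Helm values -- the key name comes from the text above, while the surrounding layout is assumed:
groupSync:
  # Expose the SCIM ingress (served under the GroupSync ingress if enabled, otherwise under Synapse at /scim/v2)
  enableSCIM: true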
Space Mapping
This mechanism allows us to configure spaces that Advanced Identity Management will maintain.
Configuration
We define each space by giving it a name (which will be displayed in Element), a unique ID (which allows Advanced Identity Management to track the Space even if it gets renamed), and a list of groups whose users will become the members of the Space. Users need to be a member of any one of the configured groups, not all of them.
You can pick any ID you want, but if you change it later Advanced Identity Management will create a brand new space and abandon the old ones, likely confusing the users.
In order to limit space membership to a specific Group, we include its Group ID.
Each group may optionally include a powerLevel setting, allowing specific groups to have elevated permissions in the space.
A special group ID of ''
(an empty string) indicates that all users from the server, regardless of their group membership,
should become the members of the Space.
In addition to regular groups, you may also make a space federated by specifying federatedGroups
and a remote Advanced Identity Management server.
See Federation for more details.
An optional list of subspaces may also be configured, each using the same configuration format and behaviour (recursively).
If a space has subspaces configured, its members list will be composed of the members of the space itself and any of its subspaces, recursively -- so a subspace's member list is always a subset of its parent space's member list. This may change in the future, so it's advised not to rely on this when configuring your spaces.
spaces:
id: root
name: 'Company'
groups:
- externalId: 'element-users'
The powerLevel option allows us to give users extra permissions. This is equivalent to the group_power_level setting[^note].
spaces:
id: root
name: 'Company'
groups:
# regular users
- externalId: 'element-users'
# moderators
- externalId: 'element-moderators'
powerLevel: 50
In case of Power Level conflicts, the highest power level will be used. With the following configuration:
spaces:
id: root
name: 'Company'
groups:
- externalId: 'moderators'
powerLevel: 50
- externalId: 'admins'
powerLevel: 100
A user who's a member of both moderators and admins will end up with a power level of 100.
Subspaces can be configured analogously:
spaces:
id: shared
name: "Element Corp"
groups:
- externalId: 'matrix-mods'
powerLevel: 50
- externalId: ''
subspaces:
- id: london
name: "London Office"
groups:
- externalId: 'london-matrix-mods'
powerLevel: 50
- externalId: 'london-employees'
Provisioning
The role of the provisioner is to take the expected state representation produced by Bridges and ensure that the server state matches these expectations. The provisioner will try to do as little as possible to go from the existing to the desired state — in particular, running a Provisioner twice will result in no operations being performed on the second run.
Provisioning will typically be triggered by the bridge, either on its startup or whenever it becomes aware of changes in the data source.
See Usage Scenarios for examples of provisioning actions in response to data source changes.
Example Configuration
provisioner:
# Optional. A list of rooms that'll get automatically created in the managed space.
# The ID is required to enable GS to track whether they were already created or not
# – you can change it, but it'll cause new rooms to be generated.
default_rooms:
- id: 'general'
properties: { name: 'General discussion' }
# Optional. A list of userid patterns that will not get kicked from rooms
# even if they don't belong to them according to LDAP.
# This is useful for things like the auditbot.
# Patterns listed here will be wrapped in ^ and $ before matching.
allowed_users:
- '@adminbot:.*'
# Optional. Determines whether users will be automatically invited to rooms (default, public and space-joinable)
# when they gain access to them. Defaults to true. Users will still get invited to spaces regardless of this setting.
invite_to_public_rooms: false
# Optional: A list of remote Advanced Identity Management instances we'll be federating with. Requests from other remote users will be ignored.
federation:
federates_with:
- '@gs_bot:consultancy.test'
State representation
Both users and power level targets are currently only represented as a localpart: Advanced Identity Management is meant to manage a single server, where each organization member has an account on the server being provisioned.
Advanced Identity Management is not involved in the registration of user accounts themselves — this is typically handled by Synapse's authentication provider. Some bridges may take this responsibility upon themselves — for example the SCIM bridge, when new User accounts are being sent to it. Still, even in that case, Provisioner is not responsible for ensuring that the accounts exist before it starts managing them.
User provisioning
Advanced Identity Management can be configured to synchronize user accounts found in the bridged data directory to a specified list of targets.
Currently the only supported target is Synapse, and the synchronization is limited to user attributes for already existing accounts.
If you are using the helm chart, this can be configured through groupSync.syncedUserAttributes.
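As a hedged sketch of what those Helm values might look like -- the exact layout is assumed, and the attribute names correspond to the supported attributes described in the next section:
groupSync:
  # Sync only these user attributes to Synapse (see "Attribute sync" below)
  syncedUserAttributes:
    - displayName
    - emails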
Attribute sync
When users in a data directory change, Advanced Identity Management will ensure that the attributes match those in Synapse (and in the future, other user provisioning targets). Advanced Identity Management will only update users if any updates need to be performed, and only update the attributes it needs to.
Supported attributes are:
- displayName
  Refers to a displayname attribute in Synapse.
  In LDAP, displayName is obtained from the value of an attribute configured as name in the attribute mapping.
  In Azure AD and SCIM its value is taken from the displayName attribute of a given user.
- emails
  Refers to Synapse's threepids for medium: "email". When updating this attribute, all threepids other than "email" will be left intact.
  In LDAP, the value of this attribute is determined by the value of the attribute configured as mail in the attribute mapping.
  In Azure AD, the value of this attribute is taken from the mail attribute in Azure AD. This is limited to just one email address per user.
  In SCIM, the value of this attribute is taken from the emails attribute for a given user.
Advanced Identity Management can be configured to sync all of them, or a limited set. See the example config for more details.
Federation
Advanced Identity Management supports closed federation — as in, one where all participating servers are known in advance.
Each federated server maintains its own Advanced Identity Management instance, crucially its own Provisioner. Each Provisioner is responsible for managing users belonging to its homeserver, and ignores those that belong to another homeserver and another Provisioner.
The servers are equal, and any of them may invite other Advanced Identity Management servers to any of its spaces -- but they do need to be on a preconfigured list of servers (see the federates_with option in the example config).
When an Advanced Identity Management server wishes to federate with another, it should specify which of its spaces should include the remote Advanced Identity Management server, and which of its groups should be invited.
Example
Let's say we have an organization with two servers -- dallas.example.com and berlin.example.com. Both use Advanced Identity Management with their own data directories.
Dallas has 3 users: Alice, Bob and Cyril. Alice is additionally in a group called "dallas-management".
Berlin has 3 users too: Dave, Eve and Francis. Dave is a member of "berlin-management" group.
We'll set up a space federated between the two, so that users from both servers end up there, with both managers having a power level of 50.
Federation whitelist
Both provisioners need to be aware of the other participating server, so for Dallas we need:
provisioner:
federates_with: ['@groupsync:berlin.example.com']
And for Berlin:
provisioner:
federates_with: ['@groupsync:dallas.example.com']
Without having that configured, each Advanced Identity Management will ignore the requests sent by the other one, in order to not accidentally expose information to an untrusted party.
Note that the Matrix IDs in the federates_with section must match the other servers: both the server_name and the sender_localpart, so that each Advanced Identity Management instance knows how to invite the others to rooms and spaces as co-conspirators.
Federated space mapping
While the space will be replicated on both servers, with both being equally responsible for it, we need to pick a server to create it on. Let's do it on the Dallas server:
- id: shared
name: 'Federated space'
groups:
- externalId: '' # include all our users
- externalId: 'dallas-managers'
powerLevel: 50
federatedGroups:
# The MXID of the remote Advanced Identity Management bot.
agent: '@groupsync:berlin.example.com'
- externalId: '' # include all users known to the Berlin Advanced Identity Management
- externalId: 'berlin-managers'
powerLevel: 50
No additional configuration is needed on the Berlin server.
Once this configuration is applied, the following will happen:
- The Dallas GS will create the Federated space, invite its users (Alice, Bob and Cyril) and make Alice a moderator (PL50).
- The Dallas GS will invite the Berlin GS (@groupsync:berlin.example.com) to that Space, and store its expectations for it in a Matrix State Event.
- The Berlin GS will receive an invitation to Federated space and recognize it as coming from a federating GS (the inviter is on the federates_with list).
- The Berlin GS will join the space and read the State Event to figure out which of its users should be members of the Federated space.
- The Berlin GS will invite all its users (Dave, Eve and Francis) to Federated space and make Dave a moderator (PL50).
- When enforcing membership rules, each server will only consider users from its own side: the Dallas GS will never touch Berlin users and vice versa.
Room Cleanup
After each provisioning cycle, Advanced Identity Management will clean up the rooms and spaces that it no longer needs to manage. Spaces in Matrix are still rooms, but we treat them a little differently during the cleanup, matching their distinct uses.
Internally, room cleanup is referred to as Room GC (Garbage Collection).
This is meant to be a reversible process (in case it was performed accidentally), so we avoid information loss when possible.
Space cleanup
Spaces are cleaned up when they are no longer configured -- once they are removed from the Space mapping configuration, Advanced Identity Management will abandon them by kicking every member and then leaving itself, resulting in an empty room that will eventually get cleaned up entirely by the homeserver.
The kicking of the users is done so that the deconfigured spaces don't show up in their clients anymore. The rooms inside those spaces remain accessible though, so no conversations are being lost.
We don't draw a distinction here between GS- and user-created spaces, because GS doesn't care about user-created spaces at all. It never joins them and it never manages them, so they will never be part of the cleanup process.
Room cleanup
Rooms are cleaned up when they're no longer accessible from any of the spaces that Advanced Identity Management manages. This can happen in a few cases:
- The room belonged to a space that was cleaned up by Advanced Identity Management
- The room has been removed from a space managed by Advanced Identity Management
- The room is made private (but remains in a managed space)
Notably, none of these apply if a default room gets deconfigured in Advanced Identity Management. Those get created in each GS-managed space, but after their creation they're treated like any other space-public[^note] room.
When a room is cleaned up, GS cleans up its room metadata (this is stored in state events) and leaves the room. All the room members remain in the room so that the conversation is preserved and can continue if needed. Room moderators can then tombstone the room if they so desire, or add it to a different space.
If the room was not originally created by Advanced Identity Management, we give PL 100 back to its original creator (having taken it away back when we took control of it). If a room was created by Advanced Identity Management, its power levels are not touched. Advanced Identity Management remains a room admin in case it needs to take control of the room again in the future (e.g. because it gets added to a different managed space).
[^note]: Space-public meaning: with join_rule: restricted, allowing space members to join.
Configuration
Room and Space cleanup can be configured through the provisioner configuration:
provisioner:
# Optional. When enabled, spaces that are no longer configured, and rooms belonging to those spaces will be cleaned up.
# This will likely become enabled by default in the future.
# When disabled (or omitted), GS will log the rooms and spaces it would clean up if allowed to.
gc:
enabled: true
Usage Scenarios
With LDAP bridge as an example data source.
Onboarding
- User logs in to Element, either using OpenID or with a user+password with LDAP integration
- User automatically gets invited to spaces matching their LDAP Organizational Unit memberships, which are nested the way they are nested in LDAP
- For each space they’re in, user gets invited to every room that’s configured to be joinable by space members
- Result: user learns their place in the company structure, discovers their peers and becomes aware of all the public conversations happening in the company
For the following sections, we assume a hierarchy of LDAP Organizational Units:
- Employees
- Engineering
- Support
Restructuring
- In the previously described OrgUnit hierarchy, assume a user Evan belonging to Engineering
- Evan is being moved from Engineering to the more generic Employees
- GS kicks Evan from the Engineering space and from all the public rooms in it
- Evan is added to the Support OrgUnit
- GS invites Evan to the Support Matrix space and all its space-public rooms
- Result: Moving users around within the company hierarchy is represented by moving them around in the Matrix space.
Offboarding
- In the previously described OrgUnit hierarchy, assume a user Evan belonging to Engineering
- Evan is being removed from LDAP entirely, or moved outside of Employees, which is the root space managed by GS
- Evan’s Matrix account is being deactivated, preventing them from logging in.
- Evan is subject to the user deletion flow.
- Result: GS will automatically erase a user from Matrix if they no longer belong to the part of the LDAP directory managed by it.
Permission management
- GS is configured to assign a power level of 50 to every user in groups called moderators or engineering-moderators
- In the previously described OrgUnit hierarchy, we have users Evan and Brenda belonging to Engineering (and therefore also to Employees).
- We create two LDAP Security Groups: moderators in Employees and engineering-moderators in Engineering
- We make Evan a member of moderators, and Brenda a member of engineering-moderators
- GS assigns a power level of 50 to Evan in the Employees Matrix space and all rooms contained within it – except its subspaces (including Engineering) where Evan’s power level is still the default 0
- GS assigns a power level of 50 to Brenda in the Engineering and all its child rooms. Brenda still has a default power level of 0 in Employees
- The spaces managed by GS allow its moderators (PL 50) to create child rooms and spaces
- Result: LDAP Security Groups can be used to manage Matrix power levels in a granular and configurable manner
LDAP as a source of truth
- In a space hierarchy managed by GS, a user acquires a power level higher than the one described in their LDAP security group
- In response to that, GS demotes the user back to the power level that they should have according to LDAP. If the user is currently an Admin, GS will temporarily take over their account and make them demote themselves.
- A user, for any reason, ends up in a room that they shouldn’t be in – for example, a room they were a member of now becomes a part of the company Space.
- In response to that, GS immediately kicks them from the room they weren’t supposed to be in.
- Result: LDAP is the source of truth, and any changes in Matrix that don’t conform to the rules established in LDAP get automatically corrected.
User Deletion
When users are no longer desired on the AIM-managed server, they fall under the user deletion flow described here.
Deletion criteria
They are considered no longer desired if they are present in the root space of the organization, but they're not on the list of users who are supposed to be provisioned. In a GS-managed server this will only happen if a user was first added to the directory (and provisioned), and then removed from it.
Potential issues
If a user who was never part of the directory finds themselves in the company space somehow, they will be subject to the user deletion flow as well. This should never happen, as GS will not ever invite them to the root space by itself. However, a user with Admin (PL 100) permissions could invite them to the root space manually, which would make GS consider them a deletion candidate -- despite them never having been part of the directory and not being managed by GS before that point.
Configuration
User deletion can happen either instantly, or delayed by a configurable "grace period".
In practice, the instant deletion is implemented the exact same way as the delayed deletion, but with grace period set to 0 seconds.
It's controlled by the user_soft_delete_period parameter in the config, and leaving it unset is the same as setting it to "0s".
userProvisioner:
# Optional. Configure to enable user deprovisioning. Disabled by default.
deprovisioning:
enabled: false
# Optional. When users get removed from the directory their accounts will only be deactivated,
# but their erasure will be delayed by the specified time period, allowing them to be reactivated in the meantime.
# The format is <amount><unit>, with amount being numeric and unit being one of: [s, m, h, d], for seconds, minutes,
# hours or days respectively (for example: "24h", "31d" etc.).
# The specified period will be translated into seconds, so won't account for things like DST, leap seconds etc.
# Users will be deleted *no sooner* than that, but may be removed a bit later, depending on other Advanced Identity Management operations.
# By default set to 30 days.
soft_delete_period: '30d'
Actions performed instantly
When a user is found to be undesirable, their Matrix account is deactivated, preventing them from logging in.
They are not being removed from any rooms though, and their permissions and power levels stay the same.
It's equivalent to using the deactivate user Synapse API (with erase set to false).
If a grace period is configured and the user gets added back to the directory before they get deleted completely, their account will get reactivated.
The user then gets put on the "grace period" list.
NOTE: Historically Advanced Identity Management would reset the password for the user to a random value, but modern versions will just lock the account.
Actions performed after the grace period
Once the grace period expires for a user, they get erased from the server. This is done under the hood by the Synapse deactivate user API, with erase set to true.
Administration
Migrating? Automating your deployment? Configuring backups? Guides for administrators are here!
Authentication Configuration Examples
Authentication configuration examples for LDAP, OpenID on Azure and SAML.
Provided below are some configuration examples covering how you can set up various types of Delegated Authentication.
LDAP on Windows AD
- Base. The distinguished name of the root level Org Unit in your LDAP directory.
  - The distinguished name can be displayed by selecting View / Advanced Features in the Active Directory console and then, right-clicking on the object, selecting Properties / Attributes Editor.
- Bind DN. The distinguished name of the LDAP account with read access.
- Filter. An LDAP filter to filter out objects under the LDAP Base DN.
- URI. The URI of your LDAP server, e.g. ldap://dc.example.com.
  - This is often your Domain Controller; you can also pass in ldaps:// for SSL connectivity.
  - The following are the typical ports for Windows AD LDAP servers:
    - ldap://ServerName:389
    - ldaps://ServerName:636
- LDAP Bind Password. The password of the AD account with read access.
- LDAP Attributes.
  - Mail. mail
  - Name. cn
  - UID. sAMAccountName
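For orientation only, the fields above can be pictured in the same shape as the GroupSync LDAP source example shown earlier in this documentation. This is a hedged sketch, not the installer's own format, and the Base DN and bind account below are hypothetical placeholders:
source:
  type: 'ldap'
  uri: "ldap://dc.example.com"
  base: "ou=employees,dc=example,dc=com"   # hypothetical Base DN
  filter: '(!(ou=Domain Controllers))'
  bind_dn: "EXAMPLE\\svc-ldap-read"        # hypothetical account with read access
  bind_password: "change-me"
  attributes:
    uid: "sAMAccountName"
    name: "cn"
    mail: "mail"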
OpenID on Microsoft Azure
Before configuring within the installer, you have to configure Microsoft Azure Active Directory.
Set up Microsoft Azure Active Directory
- You need to create an App registration.
- You have to select Redirect URI (optional) and set it to the following, where matrix is the subdomain of Synapse and example.com is your base domain as configured on the Domains section: https://matrix.example.com/_synapse/client/oidc/callback
For the bridge to be able to operate correctly, navigate to API permissions, add Microsoft Graph APIs, choose Delegated Permissions and add:
- openid
- profile
- email
Remember to grant the admin consent for those.
To set up the installer, you'll need:
- The Application (client) ID
- The Directory (tenant) ID
- A secret generated from Certificates & Secrets on the app.
Configure the installer
- IdP Name. A user-facing name for this identity provider, which is used to offer the user a choice of login mechanisms in the Element UI.
- IdP ID. A string identifying your identity provider in your configuration; this will be auto-generated for you (but can be changed).
- IdP Brand. An optional brand for this identity provider, allowing clients to style the login flow according to the identity provider in question.
- Issuer. The OIDC issuer. Used to validate tokens and (if discovery is enabled) to discover the provider's endpoints. Use https://login.microsoftonline.com/DIRECTORY_TENANT_ID/v2.0, replacing DIRECTORY_TENANT_ID.
- Client Auth Method. Auth method to use when exchanging the token. Set it to Client Secret Post or any method supported by your IdP.
- Client ID. Set this to your Application (client) ID.
- Client Secret. Set this to the secret value defined under "Certificates and secrets".
- Scopes. By default openid, profile and email are added; you shouldn't need to modify these.
- User Mapping Provider. Configuration for how attributes returned from an OIDC provider are mapped onto a Matrix user.
  - Localpart Template. Jinja2 template for the localpart of the MXID. Set it to {{ user.preferred_username.split('@')[0] }} if using Legacy Auth, or {{ (user.preferred_username | split('@'))[0] }} if using MAS.
  - Display Name Template. Jinja2 template for the display name to set on first login. If unset, no display name will be set. Set it to {{ user.name }}.
- Discover. Enable / disable the use of the OIDC discovery mechanism to discover endpoints.
- Backchannel Logout Enabled. Synapse supports receiving OpenID Connect Back-Channel Logout notifications. This lets the OpenID Connect Provider notify Synapse when a user logs out, so that Synapse can end that user session. This property has to be set to https://matrix.example.com/_synapse/client/oidc/backchannel_logout in your identity provider, where matrix is the subdomain of Synapse and example.com is your base domain as configured on the Domains section.
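For reference, the same settings expressed as a Synapse oidc_providers entry would look roughly like the sketch below. This is a hedged illustration only -- ESS generates the actual configuration from the installer fields above, and the tenant ID, client ID and client secret are placeholders:
oidc_providers:
  - idp_id: azure
    idp_name: "Azure AD"
    issuer: "https://login.microsoftonline.com/DIRECTORY_TENANT_ID/v2.0"
    client_id: "APPLICATION_CLIENT_ID"
    client_secret: "CLIENT_SECRET_VALUE"
    client_auth_method: "client_secret_post"
    scopes: ["openid", "profile", "email"]
    user_mapping_provider:
      config:
        localpart_template: "{{ user.preferred_username.split('@')[0] }}"
        display_name_template: "{{ user.name }}"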
OpenID on Microsoft AD FS
Install Microsoft AD FS
Before starting the installation, make sure:
- your Windows computer name is correct since you won't be able to change it after having installed AD FS
- you configured your server with a static IP address
- your server joined a domain and your domain is defined under Server Manager > Local server
- you can resolve your server FQDN, like computername.my-domain.com
You can find a checklist here.
Steps to follow:
- Install AD CS (Certificate Server) to issue valid certificates for AD FS. AD CS provides a platform for issuing and managing public key infrastructure [PKI] certificates.
- Install AD FS (Federation Server)
Install AD CS
You need to install the AD CS Server Role.
- Follow this guide.
Obtain and Configure an SSL Certificate for AD FS
Before installing AD FS, you are required to generate a certificate for your federation service. The SSL certificate is used for securing communications between federation servers and clients.
- Follow this guide.
- Additionally, this guide provides more details on how to create a certificate template.
Install AD FS
You need to install the AD FS Role Service.
- Follow this guide.
Configure the federation service
AD FS is installed but not configured.
- Click on Configure the federation service on this server under Post-deployment configuration in the Server Manager.
- Ensure Create the first federation server in a federation server farm is selected
- Click Next
- Select the SSL Certificate and set a Federation Service Display Name
- On the Specify Service Account page, you can either Create a Group Managed Service Account (gMSA) or Specify an existing Service or gMSA Account
- Choose your database
- Review Options, check prerequisites are completed and click on Configure
- Restart the server
Add AD FS as an OpenID Connect identity provider
To enable sign-in for users with an AD FS account, create an Application Group in your AD FS.
To create an Application Group, follow these steps:
- In Server Manager, select Tools, and then select AD FS Management
- In AD FS Management, right-click on Application Groups and select Add Application Group
- On the Application Group Wizard Welcome screen:
  - Enter the Name of your application
  - Under the Standalone applications section, select Server application and click Next
- Enter https://<matrix domain>/_synapse/client/oidc/callback (e.g. https://matrix.domain.com/_synapse/client/oidc/callback) in the Redirect URI: field, click Add, save the Client Identifier somewhere (you will need it when setting up Element) and click Next
- Select the Generate a shared secret checkbox, make a note of the generated Secret and press Next (the Secret needs to be added in the Element Installer GUI in a later step)
- Right-click on the created Application Group and select Properties
- Select the Add application... button
- Select Web API
- In the Identifier field, type in the client_id you saved before and click Next
- Select Permit everyone and click Next
- Under Permitted scopes: select openid and profile and click Next
- On the Summary page, click Next
- Click Close and then OK
Export Domain Trusted Root Certificate
- Run mmc.exe
- Add the Certificates snap-in
  - File / Add snap-in for Certificates, Computer account
- Under Trusted Root Certification Authorities / Certificates, select your DC cert
- Right-click and select All Tasks / Export... and export as Base-64 encoded X.509 (.CER)
- Copy the file to your local machine
Configure the installer
Add an OIDC provider in the 'Synapse' configuration after enabling Delegated Auth and set the following fields in the installer:
- Allow Existing Users: if checked, it allows a user logging in via OIDC to match a pre-existing account instead of failing. This could be used if switching from password logins to OIDC.
- Authorization Endpoint: the OAuth2 authorization endpoint. Required if provider discovery is disabled. https://login.microsoftonline.com/<Directory (tenant) ID>/oauth2/v2.0/authorize
- Backchannel Logout Enabled: Synapse supports receiving OpenID Connect Back-Channel Logout notifications. This lets the OpenID Connect Provider notify Synapse when a user logs out, so that Synapse can end that user session.
- Client Auth Method: auth method to use when exchanging the token. Set it to Client Secret Basic or any method supported by your IdP
- Client ID: the Client ID you saved before
- Discover: enable/disable the use of the OIDC discovery mechanism to discover endpoints
- Idp Brand: an optional brand for this identity provider, allowing clients to style the login flow according to the identity provider in question
- Idp ID: a string identifying your identity provider in your configuration
- Idp Name: a user-facing name for this identity provider, which is used to offer the user a choice of login mechanisms in the Element UI. In the screenshot below, Idp Name is set to Azure AD
- Issuer: the OIDC issuer. Used to validate tokens and (if discovery is enabled) to discover the provider's endpoints. https://<your-adfs.domain.com>/adfs/
- Token Endpoint: the OAuth2 token endpoint. Required if provider discovery is disabled.
- Client Secret: the client secret you saved before.
- Scopes: add every scope on a different line
  - The openid scope is required; it translates to the Sign you in permission in the consent UI
  - You might also include other scopes in this request for requesting consent.
- User Mapping Provider: configuration for how attributes returned from an OIDC provider are mapped onto a Matrix user.
  - Localpart Template: Jinja2 template for the localpart of the MXID. For AD FS set it to {{ user.upn.split('@')[0] }} if using Legacy Auth, or {{ (user.preferred_username | split('@'))[0] }} if using MAS.
- Other configurations are documented here.
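The AD FS case maps onto the same oidc_providers shape as the Azure sketch earlier; a hedged sketch of just the fields that typically differ (issuer and localpart template, both placeholders):
oidc_providers:
  - idp_id: adfs
    idp_name: "AD FS"
    issuer: "https://<your-adfs.domain.com>/adfs/"
    client_auth_method: "client_secret_basic"
    user_mapping_provider:
      config:
        # Legacy Auth variant; use the MAS variant from the list above if applicable
        localpart_template: "{{ user.upn.split('@')[0] }}"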
SAML on Microsoft Azure
Before setting up the installer, you have to configure Microsoft Entra ID.
Set up Microsoft Entra ID
With an account with enough rights, go to: Enterprise Applications
- Click on New Application
- Click on Create your own application on the top left corner
- Choose a name for it, and select Integrate any other application you don't find in the gallery
- Click on Create
- Select Set up single sign on
- Select SAML
- Edit on Basic SAML Configuration
- In Identifier, add the following URL: https://synapse_fqdn/_synapse/client/saml2/metadata.xml
- Remove the default URL
- In Reply URL, add the following URL: https://synapse_fqdn/_synapse/client/saml2/authn_response
- Click on Save
- Make a note of the App Federation Metadata Url under SAML Certificates as this will be required in a later step.
- Edit on Attributes & Claims
- Remove all defaults for additional claims
- Click on Add new claim to add the following (suggested) claims (the UID will be used as the MXID):
  - Name: uid, Transformation: ExtractMailPrefix, Parameter 1: user.userprincipalname
  - Name: email, Source attribute: user.mail
  - Name: displayName, Source attribute: user.displayname
- Click on Save
- In the application overview screen, select Users and Groups and add groups and users which may have access to Element
Configure the installer
Add a SAML provider in the 'Synapse' configuration after enabling Delegated Auth and set the following (suggested) fields in the installer:
- Allow Unknown Attributes. Checked
- Attribute Map. Select URN:Oasis:Names:TC:SAML:2.0:Attrname Format:Basic as the Identifier
- Mapping. Set the following mappings:
  - From: Primary Email To: email
  - From: First Name To: firstname
  - From: Last Name To: lastname
- Entity.
  - Description.
  - Entity ID. (From Azure)
  - Name.
- User Mapping Provider. Set the following:
  - MXID Mapping: Dotreplace
  - MXID Source Attribute: uid
- Metadata URL. Add the App Federation Metadata URL from Azure.
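For reference, a hedged sketch of how these fields map onto Synapse's saml2_config. ESS generates the actual configuration from the installer; the metadata URL is the App Federation Metadata Url noted earlier and is shown here only as a placeholder:
saml2_config:
  sp_config:
    allow_unknown_attributes: true
    metadata:
      remote:
        - url: "https://login.microsoftonline.com/<tenant>/federationmetadata/2007-06/federationmetadata.xml?appid=<app id>"
  user_mapping_provider:
    config:
      mxid_source_attribute: uid
      mxid_mapping: dotreplace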
Troubleshooting
Redirection loop on SSO
Synapse needs to have the X-Forwarded-For
and X-Forwarded-Proto
headers set by the reverse proxy doing the TLS termination. If you are using a Kubernetes installation with your own reverse proxy terminating TLS, please make sure that the appropriate headers are set.
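For example, if you run the ingress-nginx controller behind another proxy or load balancer that terminates TLS, a hedged sketch of the controller ConfigMap setting that makes it trust and forward those headers (the ConfigMap name and namespace depend on how ingress-nginx was installed):
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller   # adjust to your installation
  namespace: ingress-nginx
data:
  # Trust X-Forwarded-* headers coming from the upstream proxy
  use-forwarded-headers: "true"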
Backup and Restore
An ESS Administrators focused guide on backing up and restoring Element Server Suite.
Welcome, ESS Administrators. This guide is crafted for your role, focusing on the pragmatic aspects of securing crucial data within the Element Server Suite (ESS). ESS integrates with external PostgreSQL databases and persistent volumes and is deployable in standalone or Kubernetes mode. To ensure data integrity, we recommend including valuable, though not strictly consistent, data in backups. The guide also addresses data restoration and a straightforward disaster recovery plan.
Software Overview
ESS provides Synapse and Integrations which require an external PostgreSQL and persistent volumes. It offers standalone or Kubernetes deployment.
- Standalone Deployments. The free version of our Element Server Suite, allowing you to easily install a Synapse homeserver and hosted Element Web client.
- Kubernetes Deployments. We strongly recommend leveraging your own cluster backup solutions for effective data protection.
Below you'll find a description of the contents of each component's data and database backups.
Synapse
- Synapse deployments create a PVC named <element deployment cr name>-synapse-media. It contains all user media (avatars, photos, videos, etc.). It does not need strict consistency with the database content, but the more in sync they are, the more media can be correctly synced with room state in case of restore.
- Synapse requires an external postgresql database which contains all the server state.
Adminbot
- Adminbot integration creates a PVC named <element deployment cr name>-adminbot. It contains the bot decryption keys, and a cache of the adminbot logins.
Auditbot
- Auditbot integration creates a PVC named <element deployment cr name>-auditbot. It contains the bot decryption keys, and a cache of the auditbot logins.
- Auditbot stores the room logs of your organization either in an S3 bucket or the aforementioned PVC. Depending on how critical it is for you to be able to provide room logs for audit, you need to properly back up your S3 bucket or the PVC.
Matrix Authentication Service
- Matrix Authentication Service requires an external postgresql database. It contains the homeserver users, their access tokens and their Sessions/Devices.
Sliding Sync
- Sliding Sync requires an external postgresql database. It contains Sliding Sync running state and a data cache. The database backup needs to be properly secured. This database needs to be backed up to be able to avoid UTDs and initial syncs on a disaster recovery.
Sydent
- Sydent integration creates a PVC named <element_deployment_cr_name>-sydent. It contains the integration SQLite database.
Integrator
- Integrator requires an external postgresql database. It contains information about which integration was added to each room.
Bridges (XMPP, IRC, Whatsapp, SIP, Telegram)
- The bridges each require an external postgresql database. It contains mapping data between Matrix rooms and channels on the other side of the bridge.
Backup Policy & Backup Procedure
There are no particular prerequisites before executing an ESS backup. Only the Synapse and MAS databases should be backed up in sync and kept consistent with each other. All other components can be backed up on their own lifecycle.
Backup frequency and retention periods must be defined according to your own SLAs and SLIs.
Data restoration
The following ESS components should be restored first in case of a complete restoration. Other components can be restored separately, in their own time:
- Synapse Postgresql database
- Synapse media
- Matrix Authentication Service database (if installed)
- Restart Synapse & MAS (if installed)
- Restore and restart each individual component
Disaster Recovery Plan
In case of disaster recovery, the following components are critical for your system recovery:
- Synapse Postgresql Database is critical for Synapse to send consistent data to other servers, integrations and clients.
- Synapse Keys configured in ESS configuration (Signing Key, Macaroon Secret Key, Registration Shared Secret) are critical for Synapse to start and identify itself as the same server as before.
- Matrix Authentication Service Postgresql Database is critical for your system to recover your user accounts, their devices and sessions.
The following systems will recover feature subsets, and might involve resets and data loss if not recovered:
- Synapse Media Storage. Users will lose their avatars, and all photos, videos and files uploaded to rooms won't be available anymore
- AdminBot and AuditBot Data. The bots will need to be renamed for them to start joining all rooms and logging events again
- Sliding Sync. Users will have to do an initial-sync again, and their encrypted messages will display as "Unable to decrypt" if its database cannot be recovered
- Integrator. Integrations will have to be added back to the rooms where they were configured. Their configuration will be desynced from Integrator, and they might need to be reconfigured from scratch to have them synced with Integrator.
Security Considerations
Some backups will contain sensitive data. Here is a description of the types of data and the risks associated with them. When available, make sure to enable encryption for your stored backups. You should use appropriate access controls and authentication for your backup processes.
Synapse
Synapse media and db backups should be considered sensitive.
Synapse media backups will contain all user media (avatars, photos, videos, files). If your organization is enforcing encrypted rooms, the media will be stored encrypted with each user's e2ee keys. If you are not enforcing encryption, you might have media stored in cleartext here, and appropriate measures should be taken to ensure that the backups are safely secured.
Synapse postgresql backups will contain all user key backup storage, where their keys are stored safely encrypted with each user's passphrase. The Synapse DB will also store room states and events. If your organization is enforcing encrypted rooms, these will be stored encrypted with each user's e2ee keys.
The Synapse documentation contains further details on backup and restoration. Importantly the e2e_one_time_keys_json
table should not be restored from backup.
Adminbot
Adminbot PV backup should be considered sensitive.
Any user accessing it could read the content of your organization's rooms. Should such an event occur, revoking the bot tokens would prevent logging in as the AdminBot and stop any pulling of room message content.
Auditbot
Auditbot PV backup should be considered sensitive.
Any user accessing it could read the content of your organization's rooms. Should such an event occur, revoking the bot tokens would prevent logging in as the AuditBot and stop any pulling of room message content.
Logs stored by the AuditBot for audit capabilities are not encrypted, so any user able to access them will be able to read any logged room content.
Sliding Sync
Sliding-Sync DB Backups should be considered sensitive.
Sliding-Sync database backups will contain user access tokens, which are encrypted with the Sliding Sync secret key. The tokens are only refreshed regularly if you are using Matrix Authentication Service. These tokens give access to users' message-sending capabilities, but cannot read encrypted messages without the user keys.
Sydent
Sydent DB Backups should be considered sensitive.
Sydent DB Backups contain association between user matrix accounts and their external identifiers (mails, phone numbers, external social networks, etc).
Matrix Authentication Service
Matrix Authentication Service DB Backups should be considered sensitive.
Matrix Authentication Service database backups will contain user access tokens, so they give access to user accounts. It will also contain the OIDC providers and confidential OAuth 2.0 Clients configuration, with secrets stored encrypted using MAS encryption key.
IRC Bridge
IRC Bridge DB Backups should be considered sensitive.
IRC Bridge DB backups contain user IRC passwords. These passwords give access to users' IRC accounts, and should be reinitialized in case of incident.
Standalone Deployment Guidelines
General storage recommendations for single-node instances
- /data is where the standalone deployment installs PostgreSQL data and Element Deployment data. It should be a distinct mount point.
  - Ideally this would have an independent lifecycle from the server itself
  - Ideally this would be easily snapshot-able, either at a filesystem level or with the backing storage
Adminbot storage:
- Files stored with uid=10006/gid=10006, default config uses /data/element-deployment/adminbot for single-node instances
- Storage space required is proportional to the number of user devices on the server. 1GB is sufficient for most servers
Auditbot storage:
- Files stored with uid=10006/gid=10006, default config uses /data/element-deployment/auditbot for single-node instances
- Storage space required is proportional to the number of events tracked.
Synapse storage:
- Media:
  - Files stored with uid=10991/gid=10991, default config uses /data/element-deployment/synapse for single-node instances
  - Storage space required grows with the number and size of uploaded media. For more information, see the Synapse Media section from the Requirements and Recommendations doc.
Postgres (in-cluster) storage:
- Files stored with uid=999/gid=999, default config uses /data/postgres for single-node instances
Backup Guidance:
- AdminBot. Backups should be made by taking a snapshot of the PV (ideally) or rsyncing the backing directory to backup storage
- AuditBot. Backups should be made by taking a snapshot of the PV (ideally) or rsyncing the backing directory to backup storage
- Synapse Media. Backups should be made by taking a snapshot of the PV (ideally) or rsyncing the backing directory to backup storage (a VolumeSnapshot sketch follows this list)
- Postgres.
  - In Cluster: Backups should be made by kubectl -n element-onprem exec -it postgres-synapse-0 -- sh -c 'pg_dump --exclude-table-data e2e_one_time_keys_json -U $POSTGRES_USER $POSTGRES_DB' > synapse_postgres_backup_$(date +%Y%m%d-%H%M%S).sql
  - External: Backup procedures as per your DBA, keeping in mind Synapse specific details
- Configuration. Please ensure that your entire configuration directory (that contains at least parameters.yml & secrets.yml but may also include other sub-directories & configuration files) is regularly backed up. The suggested configuration path in Element's documentation is ~/.element-onpremise-config but could be anything. It is whatever directory you used with the installer.
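For PV-snapshot based backups on clusters with a CSI snapshot controller, a hedged sketch of a VolumeSnapshot for the Synapse media PVC. The snapshot class is illustrative, and the PVC name is a hypothetical instance of the <element deployment cr name>-synapse-media pattern described earlier:
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: synapse-media-backup
  namespace: element-onprem
spec:
  volumeSnapshotClassName: csi-snapclass              # illustrative; use your CSI driver's snapshot class
  source:
    persistentVolumeClaimName: first-element-deployment-synapse-media  # hypothetical CR name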
Calculate monthly active users
Take great care when modifying and running queries in your database. Ensure you understand what the queries do and double check that your query is correct.
Incorrect queries can cause irrecoverable data loss.
We recommend you familiarize yourself with Transactions. That way, changes are not immediately written and you can undo any errors.
- Connect to your Synapse database
- Get the UNIX timestamps in milliseconds for the time frame you are interested in. You want the time set to 00:00:00 GMT. https://www.epochconverter.com/ is a great tool to convert to/from UNIX timestamps.
  a. If you are interested in the current MAU number, pick the date 30 days ago. Note that if you have MAU metrics enabled, this information is also available in Grafana (or your metrics system of choice)
  b. If you want a specific month, get the timestamps for the 1st of that month and the 1st of the following month
- Modify and run the appropriate query below
Get your current MAU number. This uses the timestamp for 30 days ago. For example, if you're running this on January 7, 2025, you would use December 8, 2024. This is similar to the query used by Synapse to calculate user count for phone-home stats (Synapse source).
SELECT COUNT(*) FROM (
SELECT user_id
FROM user_ips
WHERE
last_seen >= 1733616000000 AND -- Sunday, 8 December 2024 00:00:00 GMT
user_id NOT IN (
SELECT name
FROM users
WHERE user_type = 'support'
)
GROUP BY user_id
) AS temp;
For reference, this is equal to
SELECT COUNT(*) FROM (
SELECT
user_id,
MAX(timestamp) AS timestamp
FROM user_daily_visits
WHERE
timestamp >= 1733616000000 AND -- Sunday, 8 December 2024 00:00:00 GMT
user_id NOT IN (
SELECT name
FROM users
WHERE user_type = 'support'
)
GROUP BY user_id
) AS temp;
To get retrospective statistics, use this query instead
SELECT COUNT(*) FROM (
SELECT
user_id,
MAX(timestamp) AS timestamp
FROM user_daily_visits
WHERE
timestamp >= 1730419200000 AND -- Friday, 1 November 2024 00:00:00 GMT
timestamp < 1733011200000 AND -- Sunday, 1 December 2024 00:00:00 GMT
user_id NOT IN (
SELECT name
FROM users
WHERE user_type = 'support'
)
GROUP BY user_id
) AS temp;
Configuring Element Desktop
Element Desktop is a Matrix client for desktop platforms with Element Web at its core.
You can download Element Desktop for Mac, Linux or Windows from the Element downloads page.
See https://web-docs.element.dev/ for the Element Web and Desktop documentation.
Aligning Element Desktop with your ESS deployed Element Web
By default, Element Desktop will be configured to point to the Matrix.org homeserver; however, this is configurable by supplying a User Specified config.json.
As Element Desktop is essentially Element Web packaged as a desktop application, this config.json is identical to the config.json ESS will configure and deploy for you at https://<element_web_fqdn>/config.json, so it is recommended to set up Element Desktop using that file directly.
How you do this will depend on your specific environment, but you will need to ensure the config.json
is placed in the correct location to be used by Element Desktop.
- %APPDATA%\$NAME\config.json on Windows
- $XDG_CONFIG_HOME/$NAME/config.json or ~/.config/$NAME/config.json on Linux
- ~/Library/Application Support/$NAME/config.json on macOS
In the paths above, $NAME is typically Element, unless you use --profile $PROFILE in which case it becomes Element-$PROFILE.
As Microsoft Windows File Explorer by default hides file extensions, please double check to ensure the config.json does indeed have the .json file extension, not .txt.
Customising your desktop configuration
You may wish to further customise Element Desktop. If the changes you wish to make should not also apply to your ESS deployed Element Web, you will need to add them in addition to your existing config.json.
You can find Desktop specific configuration options, or just customise using any options from the Element Web Config docs.
The Element Desktop MSI
Where to download
Customers who have a subscription to the Enterprise edition of the Element Server Suite (ESS) can download an MSI version of Element Desktop. This version of Element Desktop is installed into Program Files by default (instead of per user) and can be used to deploy into enterprise environments. To download, log in to your EMS Account and access it from the same download page where you'd find the enterprise installer, https://ems.element.io/on-premise/download.
Using the Element Desktop MSI
The Element Desktop MSI can be used to install Element Desktop on all desired machines in your environment. Unlike the usual installer, you can customise its install directory (which now defaults to Program Files).
You can customise the installation directory by installing the MSI with, or otherwise configuring, the APPLICATIONFOLDER property:
msiexec /i "Element 1.11.66.msi" APPLICATIONFOLDER="C:\Element"
MSI and config.json
Once users run Element for the first time, an Element folder will be created in their AppData profile specific to that user. By using Group Policy, logon scripts, SCCM or whatever other method you like, ensure the desired config.json is present within %APPDATA%\Element. (The config.json can be present prior to the directory's creation.)
Guidance on High Availability
ESS makes use of Kubernetes for deployment, so most guidance on high availability ties directly into general Kubernetes high-availability guidance.
Kubernetes
Essential Links
- Options for Highly Available Topology
- Creating Highly Available Clusters with kubeadm
- Set up a High Availability etcd Cluster with kubeadm
- Production environment
High-Level Overview
It is strongly advised to make use of the Kubernetes documentation to ensure your environment is set up for high availability; see the links above. At a high level, Kubernetes achieves high availability through:
- Cluster Architecture.
  - Multiple Masters: In a highly available Kubernetes cluster, multiple master nodes (control plane nodes) are deployed. These nodes run the critical components such as etcd, the API server, scheduler, and controller-manager. By using multiple master nodes, the cluster can continue to operate even if one or more master nodes fail.
  - Etcd Clustering: etcd is the key-value store used by Kubernetes to store all cluster data. It can be configured as a cluster with multiple nodes to provide data redundancy and consistency. This ensures that if one etcd instance fails, the data remains available from other instances.
- Pod and Node Management.
  - Replication Controllers and ReplicaSets: Kubernetes uses replication controllers and ReplicaSets to ensure that a specified number of pod replicas are running at any given time. If a pod fails, the ReplicaSet automatically replaces it, ensuring continuous availability of the application.
  - Deployments: Deployments provide declarative updates to applications, allowing rolling updates and rollbacks. This ensures that application updates do not cause downtime and can be rolled back if issues occur.
  - DaemonSets: DaemonSets ensure that a copy of a pod runs on all (or a subset of) nodes. This is useful for deploying critical system services across the entire cluster.
- Service Discovery and Load Balancing.
  - Services: Kubernetes Services provide a stable IP and DNS name for accessing a set of pods. Services use built-in load balancing to distribute traffic among the pods, ensuring that traffic is not sent to failed pods.
  - Ingress Controllers: Ingress controllers manage external access to the services in a cluster, typically HTTP. They provide load balancing, SSL termination, and name-based virtual hosting, enhancing the availability and reliability of web applications.
- Node Health Management.
  - Node Monitoring and Self-Healing: Kubernetes continuously monitors the health of nodes and pods. If a node fails, Kubernetes can automatically reschedule the pods from the failed node onto healthy nodes. This self-healing capability ensures minimal disruption to the running applications.
  - Pod Disruption Budgets (PDBs): PDBs allow administrators to define the minimum number of pods that must be available during disruptions (such as during maintenance or upgrades), ensuring application availability even during planned outages. A minimal example is sketched after this list.
- Persistent Storage.
  - Persistent Volumes and Claims: Kubernetes provides abstractions for managing persistent storage. Persistent Volumes (PVs) and Persistent Volume Claims (PVCs) decouple storage from the pod lifecycle, ensuring that data is preserved even if pods are rescheduled or nodes fail.
  - Storage Classes and Dynamic Provisioning: Storage classes allow administrators to define different storage types (e.g., SSDs, network-attached storage) and enable dynamic provisioning of storage resources, ensuring that applications always have access to the required storage.
- Geographical Distribution.
  - Multi-Zone and Multi-Region Deployments: Kubernetes supports deploying clusters across multiple availability zones and regions. This geographical distribution helps in maintaining high availability even in the event of data center or regional failures.
- Network Policies and Security.
  - Network Policies: These policies allow administrators to control the communication between pods, enhancing security and ensuring that only authorized traffic reaches critical applications.
  - RBAC (Role-Based Access Control): RBAC restricts access to cluster resources based on roles and permissions, reducing the risk of accidental or malicious disruptions to the cluster's operations.
- Automated Upgrades and Rollbacks.
  - Cluster Upgrade Tools: Tools like kubeadm and managed Kubernetes services (e.g., Google Kubernetes Engine, Amazon EKS, Azure AKS) provide automated upgrade capabilities, ensuring that clusters can be kept up-to-date with minimal downtime.
  - Automated Rollbacks: In the event of a failed update, Kubernetes can automatically roll back to a previous stable state, ensuring that applications remain available.
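As mentioned under Node Health Management above, a minimal PodDisruptionBudget sketch. The label selector is illustrative and must match the labels of your actual Synapse pods:
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: synapse-pdb
spec:
  # Keep at least one Synapse pod available during voluntary disruptions (node drains, upgrades)
  minAvailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: synapse   # hypothetical label; adjust to your deployment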
How does this tie into ESS
As ESS is deployed into a Kubernetes cluster, if you are looking for high availability you should ensure your environment is configured with that in mind. One important factor is to ensure you deploy using the Kubernetes deployment option: whilst Standalone mode will deploy to a Kubernetes cluster, by definition it exists solely on a single node, so options for high availability will be limited.
PostgreSQL
Essential links
- PostgreSQL - High Availability, Load Balancing, and Replication
- PostgreSQL - Different replication solutions
High-Level Overview
To ensure a smooth failover process for ESS, it is crucial to prepare a robust database topology. The following list outlines the necessary elements to take into consideration:
- Database replicas
  - Location: Deploy the database replicas in a separate data center from the primary database to provide geographical redundancy.
  - Replication: Configure continuous replication from the primary database to the secondary database. This ensures that the secondary database has an up-to-date copy of all data.
- Synchronization and Monitoring
  - Synchronization: Ensure that the secondary database is consistently synchronized with the primary database. Use reliable replication technologies and monitor for any lag or synchronization issues.
  - Monitoring Tools: Implement monitoring tools to keep track of the replication status and performance metrics of both databases. Set up alerts for any discrepancies or failures in the replication process (see the example queries after this list).
- Data Integrity and Consistency
  - Consistency Checks: Periodically perform consistency checks between the primary and secondary databases to ensure data integrity.
  - Backups: Maintain regular backups of both the primary and secondary databases. Store backups in a secure, redundant location to prevent data loss.
- Testing and Validation
  - Failover Testing: Conduct regular failover drills to test the transition from the primary to the secondary database. Validate that the secondary database can handle the load and that the failover process works seamlessly.
  - Performance Testing: Evaluate the performance of the secondary database under expected load conditions to ensure it can maintain the required service levels.
By carefully preparing the database topology as described, you can ensure that the failover process for ESS is efficient and reliable, minimizing downtime and maintaining data integrity.
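As an example of the replication monitoring referred to above, the following queries are a minimal sketch for native PostgreSQL streaming replication; if you use a different replication solution (Patroni, a managed service, etc.), rely on its own tooling instead.

# Hedged sketch: check streaming replication status and lag.
# On the primary: one row per connected standby, including write/flush/replay lag.
psql -U postgres -c "SELECT client_addr, state, write_lag, flush_lag, replay_lag FROM pg_stat_replication;"
# On the standby: confirm it is in recovery and how far behind replay is.
psql -U postgres -c "SELECT pg_is_in_recovery(), now() - pg_last_xact_replay_timestamp() AS replay_delay;"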
How does this tie into ESS
As ESS relies on PostgreSQL for its database, if you are looking for high availability you should ensure your environment is configured with that in mind. The database replicas can be achieved the same way in both Kubernetes and Standalone deployments, as the database is not managed by ESS.
ESS failover plan
This document outlines a high-level, semi-automatic, failover plan for ESS. The plan ensures continuity of service by switching to a secondary data center (DC) in the event of a failure in the primary data center.
Prerequisites
- Database Replica: A replica of the main database, located in a secondary data center, continuously reading from the primary database.
- Secondary ESS Deployment: An instance of the ESS deployment, configured in a secondary data center.
- Signing Keys Synchronization: The signing keys stored in ESS secrets need to be kept synchronized between the primary and secondary data centers (see the sketch after this list).
- Media Repository: Media files are stored on a redundant S3 bucket accessible from both data centers.
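One simple way to keep such secrets in sync is to export them from the primary cluster and apply them to the secondary clusters, either during maintenance windows or from a small scheduled job. The kubectl context names, namespace and secret name below are illustrative assumptions, and yq (v4) is assumed to be available to strip cluster-specific metadata; adapt this to your own deployment, or use a dedicated secret-replication tool if you need continuous synchronization.

# Hedged sketch: copy a signing-key secret from the DC1 cluster to the DC2 cluster.
# Context names, namespace and secret name are examples only.
kubectl --context dc1 -n element-onprem get secret synapse-signing-key -o yaml \
  | yq 'del(.metadata.resourceVersion) | del(.metadata.uid) | del(.metadata.creationTimestamp)' \
  | kubectl --context dc2 -n element-onprem apply -f -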
ESS Architecture for failover capabilities based on 3 datacenters
DC1 (Primary)
- ElementDeployment Manifest.
  - Manifest points to addresses in DC1.
  - TLS Secrets managed by ACME.
- TLS Secrets.
  - Replicated to DC2 and DC3.
- Operator.
  - 1 replica.
- Updater.
  - 1 replica.
- PostgreSQL.
  - Primary database.
DC2
- ElementDeployment Manifest.
  - Manifest points to addresses in DC2.
  - TLS Secrets pointing to existing secrets, replicated locally from DC1.
- Operator.
  - 0 replicas; this prevents the deployment of the Kubernetes workloads.
- Updater.
  - 1 replica; the base Element manifests are ready for the operator to deploy the workloads.
- PostgreSQL.
  - Hot standby, replicating from DC1.
DC3
- ElementDeployment Manifest.
  - Manifest points to addresses in DC3.
  - TLS Secrets pointing to existing secrets, replicated locally from DC1.
- Operator.
  - 0 replicas; this prevents the deployment of the Kubernetes workloads.
- Updater.
  - 1 replica; the base Element manifests are ready for the operator to deploy the workloads.
- PostgreSQL.
  - Hot standby, replicating from DC1.
Failover Process
When DC1 experiences downtime and needs to be failed over to DC2, follow these steps:
- Disable DC1.
  - Firewall outbound traffic to prevent federation/outbound requests such as push notifications.
  - Scale down the Operator to 0 replicas and remove workloads from DC1.
- Activate DC2.
  - Promote the PostgreSQL instance in DC2 to the primary role.
  - Set Operator replicas: increase the Operator replicas to 1. This starts the Synapse workloads in DC2.
  - Update the DNS to point the ingress to DC2.
  - Open the firewall if it was closed to ensure proper network access.
- Synchronize DC3.
  - Ensure PostgreSQL replication: make sure that the PostgreSQL in DC3 is properly replicating from the new primary in DC2.
  - Adjust the PostgreSQL topology if necessary to ensure proper synchronization.
You should define your own failover procedure based on this high-level failover overview. By doing so, you can ensure that ESS continues to operate smoothly and with minimal downtime, maintaining service availability even when the primary data center goes down.
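As a rough illustration of the "Activate DC2" step, the commands below sketch the sequence with placeholder names and paths; how you promote PostgreSQL depends entirely on how your database is managed (native streaming replication, Patroni, a managed service, ...), so treat this as an outline rather than a runbook.

# Hedged sketch of activating DC2; names, paths and contexts are placeholders.
# 1. Promote the DC2 PostgreSQL hot standby to primary (native streaming replication
#    shown; Patroni or managed databases have their own promotion commands).
pg_ctl promote -D /path/to/postgres/data
# 2. Start the ESS workloads in DC2 by scaling the operator up.
kubectl --context dc2 -n operator-onprem scale deploy/element-operator-controller-manager --replicas 1
# 3. Update DNS to point the ingress at DC2 and re-open the firewall (provider-specific).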
Migrating from Self-Hosted to ESS
This document is currently work-in-progress and might not be accurate. Please speak with your Element contact if you have any questions.
Preparation
This section outlines what you should do ahead of the migration in order to ensure the migration goes as quickly as possible and without issues.
- At the latest 48 hours before your migration is scheduled, set the TTL on any DNS records that need to be updated to the lowest allowed value.
- Check the size of your database:
  - PostgreSQL: Connect to your database and issue the command \l+
- Check the size of your media:
  - Synapse Media Store: du -hs /path/to/synapse/media_store/
- If you are using SQLite instead of PostgreSQL, you should port your database to PostgreSQL by following this guide before dumping your database (see the sketch below)
Note that the database and media may be duplicated (stored twice) on your ESS host during the import process, depending on how you perform the import.
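If you do need to move off SQLite first, Synapse ships a synapse_port_db script for this purpose. The sketch below shows a typical invocation with example paths; refer to the linked guide for the authoritative steps and run it while Synapse is stopped.

# Hedged sketch: port a SQLite Synapse database to PostgreSQL.
# Paths are examples; homeserver-postgres.yaml is a copy of homeserver.yaml whose
# database section points at the new PostgreSQL database.
synapse_port_db --sqlite-database /path/to/homeserver.db \
                --postgres-config /path/to/homeserver-postgres.yaml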
Setup your new ESS server
Follow the ESS docs for first-time installation, configuring to match your existing homeserver before proceeding with the below.
The Domain Name on the Domains page during the ESS initial setup wizard must be the same as you have on your current setup. The other domains can be changed if you wish.
To make the import later easier, we recommend you select the following Synapse Profile. You can change this as required after the import.
- Monthly Active Users: 500
- Federation Type: closed
After the ESS installation, you can check your ESS Synapse version on the Admin -> Server Info page.
Export your old Matrix server
SSH to your old Matrix server
You might want to run everything in a tmux or screen session to avoid disruption in case of a lost SSH connection.
Upgrade your old Synapse to the same version ESS is running
Follow https://element-hq.github.io/synapse/latest/upgrade.html
Please be aware that ESS, especially our LTS releases, may not run the latest available Synapse release. Please speak with your Element contact for advice on how to resolve this issue. Note that Synapse does support downgrading, but occasionally a new Synapse version includes database schema changes and this limits downgrading. See https://element-hq.github.io/synapse/latest/upgrade.html#rolling-back-to-older-versions for additional details and compatible versions.
Start Synapse, make sure it's happy.
Stop Synapse
Create a folder to store everything
mkdir -p /tmp/synapse_export
cd /tmp/synapse_export
The guide from here on assumes your current working directory is /tmp/synapse_export.
Set restrictive permissions on the folder
If you are working as root: (otherwise set restrictive permissions as needed):
chmod 700 /tmp/synapse_export
Copy Synapse config
Get the following files:
- Your Synapse configuration file (usually homeserver.yaml)
- Your message signing key. This is stored in a separate file; see the Synapse config file (homeserver.yaml) for the path. The variable is signing_key_path: https://element-hq.github.io/synapse/latest/usage/configuration/config_documentation.html?highlight=signing_key_path#signing_key_path
- Grab macaroon_secret_key from homeserver.yaml and place it in "Secrets \ Synapse \ Macaroon" on your ESS server
- If you use native Synapse user authentication, password.pepper must remain unchanged. If not, you need to reset all passwords. Note that setting the pepper is not supported in ESS at the time of writing; please check with your Element contact.
Stop Synapse
Once Synapse is stopped, do not start it again after this
Doing so can cause issues with federation and inconsistent data for your users.
While you wait for the database to export or files to transfer, you should edit or create the well-known files and DNS records to point to your new ESS host. This can take a while to propagate, so it should be done as soon as possible to ensure your server will function properly when the migration is complete.
Database export
Dump your database:
pg_dump -Fc -O -h <dbhost> -U <dbusername> -d <dbname> -W -f synapse.dump
- <dbhost> (IP or FQDN of your database server)
- <dbusername> (username for your Synapse database)
- <dbname> (the name of the database for Synapse)
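For example, with illustrative values filled in (the host, user and database names are assumptions; use your own):

# Hedged example of the dump command with placeholder values substituted.
pg_dump -Fc -O -h db.example.com -U synapse_user -d synapse -W -f synapse.dump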
Import to your ESS server
Database import
Stop Synapse
kubectl .... replicas=0
Note that the exact commands might differ depending on how your Synapse workloads and Postgres are managed. Please consult the documentation for your deployment system.
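A hedged sketch of what stopping Synapse can look like when it runs in the element-onprem namespace; the workload name is a placeholder, so list the workloads first and scale down whichever Synapse ones you find.

# Hedged sketch: find the Synapse workloads, then scale them to zero.
# <synapse-workload> is a placeholder; names vary between deployments.
kubectl -n element-onprem get deployments,statefulsets
kubectl -n element-onprem scale statefulset/<synapse-workload> --replicas=0

Once Synapse is stopped, enter a bash shell on the Synapse postgres container: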
kubectl exec -it -n element-onprem synapse-postgres-0 --container postgres -- /bin/bash
Then on postgres container shell run:
psql -U synapse_user synapse
The following command will erase the existing Synapse database without warning or confirmation. Please ensure that it is the correct database and that there is no production data on it.
DO $$ DECLARE
r RECORD;
BEGIN
FOR r IN (SELECT tablename FROM pg_tables WHERE schemaname = current_schema()) LOOP
EXECUTE 'DROP TABLE ' || quote_ident(r.tablename) || ' CASCADE';
END LOOP;
END $$;
DROP sequence cache_invalidation_stream_seq;
DROP sequence state_group_id_seq;
DROP sequence user_id_seq;
DROP sequence account_data_sequence;
DROP sequence application_services_txn_id_seq;
DROP sequence device_inbox_sequence;
DROP sequence event_auth_chain_id;
DROP sequence events_backfill_stream_seq;
DROP sequence events_stream_seq;
DROP sequence presence_stream_sequence;
DROP sequence receipts_sequence;
DROP sequence un_partial_stated_event_stream_sequence;
DROP sequence un_partial_stated_room_stream_sequence;
Use \q to quit, then back on the host run:
# If you compressed the dump for transfer, decompress it first
gzip -d synapse.dump.gz
# Copy the dump into the postgres data directory on the host
sudo cp synapse.dump /data/postgres/synapse/
# or copy it into the postgres pod
kubectl --namespace element-onprem cp synapse.dump synapse-postgres-0:/tmp
Finally on the pod:
cd /var/lib/postgresql/data
# or
cd /tmp
pg_restore <connection> --no-owner --role=<new role> -d <new db name> synapse.dump
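For example, using the database and role names seen in the psql command above (adjust if yours differ):

# Hedged example: restore the custom-format dump into the synapse database.
pg_restore -U synapse_user --no-owner --role=synapse_user -d synapse /tmp/synapse.dump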
Mobile client provisioning
The default sign-in flow in the mobile app requires the user to enter the correct account provider (homeserver), which is error prone and adds unnecessary friction.
In order to improve the user experience, and to skip this step, there are 3 options to specify the account provider or other configuration for the user:
- Use mobile device management (MDM).
- Use a deeplink to open the app.
- Use a custom app.
This is a new feature available since 25.05.2 (iOS) and 25.06.0 (Android).
Deeplinking
Deeplinking is the simplest option (unless the customer is already using MDM for their users or has a custom app).
How does it work?
The customer would send an on-boarding email to its users which contains two major steps:
- Download the app
- Sign in to the app
The second step then contains the deeplink which opens the app and sets the desired account provider. Optionally, a QR code of the link would be included in case the user can't access the mail on their phone.
Here's an example of what such an email could look like. Note that currently ESS does not provide a means of composing and sending these emails to users automatically.
The configuration set by opening the app with the deeplink is retained only while the app is running. If the app is killed and opened without using the deeplink, the default behaviour will apply (e.g. the user has to enter the account provider).
If the account provider is already set by MDM, or the customer publishes their own app which does not allow using an arbitrary account provider, the account provider in the deeplink is ignored.
If the user tries to open the app with the deeplink, but does not have the app installed yet on their phone, the user lands on a web page that guides them to install it.
How to Create the Deeplink?
The format for the deeplink is:
https://mobile.element.io/<app>/?account_provider=<example.com>&login_hint=<alice>
The <app> specifies the mobile app to open:
- element -> Element X
- element-pro -> Element Pro
- missing -> Element classic
Note: while Element X is supported, it is expected that customers use the Element Pro app.
The <example.com> specifies the server name of the account provider aka homeserver, e.g. example.com. Note that for backward compatibility purposes the hs_url parameter works as an alternative for the Element (classic) app, but it expects a URL, e.g. https://example.com.
The <alice> is intended to populate the username for the login. Since this is implemented by providing it as the login_hint parameter in the OIDC protocol, its exact value/format depends on which means of authentication is used.
When MAS is used without an upstream identity provider (e.g. no SSO for users), the format mxid:@alice:example.com is expected. An example deeplink/URL for @alice:example.com to sign in to Element Pro in such a case is: https://mobile.element.io/element-pro/?account_provider=example.com&login_hint=mxid:@alice:example.com
When MAS is used with an upstream identity provider, you need to set forward_login_hint: true in your MAS configuration for the hint to be forwarded. See the MAS documentation for more details. The hint itself depends on what the provider expects, but in many cases the email address is used as the user ID, e.g. alice@example.com. Also, in the case of SSO, providing the username is less critical, as the user is very likely already signed in because they have used other apps with the SSO.
To create a QR code of the link, online services can be used.
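Alternatively, if you prefer generating it locally, a command-line tool such as qrencode works too; a minimal sketch using the example deeplink from above:

# Hedged sketch: generate a QR code PNG of the sign-in deeplink with qrencode.
qrencode -o element-pro-signin.png \
  "https://mobile.element.io/element-pro/?account_provider=example.com"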
App Download Links
The links to download the app are:
- Element Pro
- iOS: https://apps.apple.com/app/element-pro-for-work/id6502951615
- Android: https://play.google.com/store/apps/details?id=io.element.enterprise
- Element X
- iOS: https://apps.apple.com/ee/app/element-x-secure-chat-call/id1631335820
- Android: https://play.google.com/store/apps/details?id=io.element.android.x
Please refer to the Apple and Google documentation on how to use badges for the download buttons.
Note: Redirect from Element Web
Note: when a user tries to access the Element web app with a browser on a mobile device (e.g. they go to https://chat.example.com), the user is directed to a page which also guides them to download the native mobile app and has a button to open the app with the deeplink. In this case the account provider is set to the one that the Element Web installation specifies.
Mobile Device Management (MDM)
If the customer is using MDM, the account provider can be delivered to the user's phone via the AppConfig.
When applying the MDM configuration to Element Pro, the Application/Bundle ID is io.element.enterprise. The key for specifying the account provider is accountProvider. The value must be the server name, e.g. example.com.
The end user just needs to download the app (if not installed by the MDM solution already) and open it.
Note that when MDM is used for specifying the account provider, other means like deeplink are ignored.
Custom app
If the customer is publishing their own custom app in the stores, the account provider(s) are built into the app.
The account provider(s) are specified by the accountProviders parameter of the custom app's build pipeline (by Element, who sets up the pipeline). Details are provided in the pipeline inputs documentation.
With a custom app it is possible to specify more than one account provider. In that case the user would still need to make a selection, but would not need to enter it manually.
Note that the account provider(s) in the custom app are overridden by the one from MDM (if it exists).
If a custom app is opened with a deeplink that also specifies an account provider, the one in the deeplink is only applied when the accountProviders parameter of the custom app contains *. The latter means that users of the custom app are allowed to use arbitrary account providers.
Starting and Stopping ESS Services
Stopping a component
To stop a component, such as Synapse, it is necessary to stop the operator:
kubectl scale deploy/element-operator-controller-manager -n operator-onprem --replicas 0
Once the operator is stopped, you can delete the Synapse resource to remove all Synapse workloads:
kubectl delete synapse/first-element-deployment -n element-onprem
To get a list of resources that you can remove, use the following command:
kubectl get elementdeployment/first-element-deployment -n element-onprem --template='{{range $key, $value := .status.dependentCRs}}{{$key}}{{"\n"}}{{end}}'
Example:
ElementWeb/first-element-deployment
Hookshot/first-element-deployment
Integrator/first-element-deployment
MatrixAuthenticationService/first-element-deployment
Synapse/first-element-deployment
SynapseAdminUI/first-element-deployment
SynapseUser/first-element-deployment-adminuser-donotdelete
SynapseUser/first-element-deployment-telemetry-donotdelete
WellKnownDelegation/first-element-deployment
Starting a component
To start a component, such as Synapse, it is necessary to start the operator:
kubectl scale deploy/element-operator-controller-manager -n operator-onprem --replicas 1
Because the Synapse resource will automatically have been recreated by the updater, the operator will detect it on startup and recreate all Synapse workloads.
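To confirm the component is coming back, you can watch the pods reappear in the deployment namespace used above:

# Watch the Synapse workloads being recreated after the operator is scaled back up.
kubectl get pods -n element-onprem -w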
Advanced
Advanced setup
Contents
- Values documentation
- Using a dedicated PostgreSQL database
- Configuring the storage path when using k3s
- Monitoring
- Components Configuration
Values documentation
The Helm chart values documentation is available in:
- The GitHub repository values files.
- The chart README.
- Artifacthub.io.
Configuration samples are available in the GitHub repository.
Using a dedicated PostgreSQL database
The stack may need up to 3 databases:
- For Synapse: https://element-hq.github.io/synapse/latest/postgres.html
- For MAS: https://element-hq.github.io/matrix-authentication-service/setup/database.html
- For GroupSync
To configure your own PostgreSQL database in your installation, copy the file charts/matrix-stack/ci/fragments/quick-setup-postgresql.yaml to postgresql.yaml in your ESS configuration values directory and configure it accordingly.
For Group Sync, merge the file charts/matrix-stack/ci/fragments/group-sync-test-postgres.yaml together with charts/matrix-stack/ci/fragments/group-sync-test-postgres-secrets-in-helm.yaml into the postgresql.yaml of your ESS configuration values.
Configuring the storage path when using K3s
K3s by default deploys the storage in /var/lib/rancher/k3s/storage/. If you want to change the path, you will have to run the K3s setup with the parameter --default-local-storage-path <your path>.
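For example, when installing K3s with the official install script, the flag can be passed through INSTALL_K3S_EXEC (the target path below is just an example):

# Hedged example: install K3s with a custom local storage path.
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--default-local-storage-path /data/k3s-storage" sh -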
Monitoring
The chart automatically provides ServiceMonitor resources for the metrics exposed by ESS Pro.
If your cluster has Prometheus Operator or Victoria Metrics Operator installed, the metrics will automatically be scraped.
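You can verify that the monitors exist with kubectl, assuming the Prometheus Operator CRDs are installed and ESS is deployed in the ess namespace:

# List the ServiceMonitor resources created by the chart.
kubectl get servicemonitors.monitoring.coreos.com -n ess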
Configuration
ESS Pro allows you to easily configure its individual components. Create a values file for each component in which you specify your custom configuration. Below you will find sections for each component.
If you have created new values files for custom configuration, make sure to apply them by passing them with the helm upgrade command (see Setting up the stack).
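For instance, assuming the release name, namespace and configuration directory used elsewhere in this documentation, applying a new component values file (element-web.yaml here is an example name) looks like this:

# Hedged example: pass the component values file alongside your existing values files.
helm upgrade --install --namespace "ess" ess \
  oci://registry.element.io/matrix-stack:<version> \
  --values "$HOME/ess-config/hostnames.yaml" \
  --values "$HOME/ess-config/element-web.yaml" \
  --wait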
Configuring Element Web
Element Web configuration is written in JSON. The documentation can be found in the Element Web repository.
To configure Element Web, create a values file with the JSON config to inject as a string under “additional”:
elementWeb:
additional:
user-config.json: |
{
"some": "settings"
}
Configuring Synapse
Synapse configuration is written in YAML. The documentation can be found here.
synapse:
additional:
user-config.yaml:
config: |
# Add your settings below, taking care of the spacing indentation
some: settings
Configuring Matrix Authentication Service
Matrix Authentication Service configuration is written in YAML. The MAS documentation can be found here.
See this document for additional ESS MAS documentation.
matrixAuthenticationService:
additional:
user-config.yaml:
config: |
# Add your settings below, taking care of the spacing indentation
some: settings
Configuring GroupSync
GroupSync configuration is written in YAML. The documentation can be found here.
groupSync:
additional:
user-config.yaml:
config: |
# Add your settings below, taking care of the spacing indentation
some: settings
Troubleshooting
Update times out
If your helm upgrade times out, you can extend the time limit from the default 5 minutes with --timeout 1h, or pass the --debug option to get more information. For example:
helm upgrade --install --namespace "ess" ess \
oci://registry.element.io/matrix-stack:25.4.2 \
--values "$HOME/ess-config/element.yaml" \
--values "$HOME/ess-config/essCredentials.yam"l \
--values "$HOME/ess-config/hostnames.yaml" \
--values "$HOME/ess-config/mas.yaml" \
--values "$HOME/ess-config/postgres.yaml" \
--values "$HOME/ess-config/rtc.yaml" \
--values "$HOME/ess-config/tls.yaml" \
--wait --timeout 1h --debug 2>&1 | tee ess-upgrade.log
Stuck upgrade
If an upgrade has failed and is preventing you from running a new upgrade:
Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress
You can use Helm to roll back. Make sure you are not rolling back to an incompatible version. For example for Synapse, see this page for compatible Synapse versions.
List Helm deployment versions
$ helm --namespace ess history ess
REVISION UPDATED STATUS CHART APP VERSION DESCRIPTION
6 Tue Mar 25 13:17:00 2025 superseded matrix-stack-0.7.5 Upgrade complete
7 Mon Mar 31 13:00:49 2025 superseded matrix-stack-0.8.1 Upgrade complete
8 Tue Apr 8 10:49:10 2025 superseded matrix-stack-0.9.1 Upgrade complete
9 Fri May 2 11:31:29 2025 failed matrix-stack-0.11.3 Upgrade "ess" failed: failed to create resource: Service "ess-matrix-rtc-sfu-tcp" is invalid: spec.ports[0].nodePort: Invalid value: 30881: provided port is already allocated
10 Fri May 2 11:38:15 2025 superseded matrix-stack-0.11.3 Upgrade complete
11 Tue May 6 15:51:44 2025 superseded matrix-stack-0.11.4 Upgrade complete
12 Thu May 8 12:55:26 2025 superseded matrix-stack-0.11.5 Upgrade complete
13 Fri May 16 09:22:36 2025 deployed matrix-stack-0.12.0 Upgrade complete
14 Wed May 21 14:47:23 2025 failed matrix-stack-25.4.2 Upgrade "ess" failed: pre-upgrade hooks failed: 1 error occurred:
* timed out waiting for the condition
15 Wed May 21 14:55:14 2025 pending-upgrade matrix-stack-25.4.2 Preparing upgrade
Roll back to the last good deployment
$ helm --namespace ess rollback ess 13
Rollback was a success! Happy Helming!
You can now run the normal helm upgrade --install ...
command again.