🗃️ Element Server Suite Documentation LTS 23.10
LTS 23.10 is now out of support; we recommend upgrading to the latest LTS, 24.10. To upgrade, first update your deployment to the latest LTS 23.10 release, then upgrade to the latest patch release of LTS 24.10.
- Introduction to Element Server Suite
- Kubernetes Installations
- Kubernetes Installations - Quick Start
- Single Node Installations
- Single Node Installs: Storage and Backup Guidelines
- Configuring Element Desktop
- Using the Installer in an Air-Gapped Environment
- Troubleshooting
- Setting up Permalinks With the Installer
- Setting Up Well Known Delegation
- Setting up Delegated Authentication With the Installer
- Integrations and Add-Ons
- Setting Up Jitsi and TURN With the Installer
- Setting up Group Sync with the Installer
- Setting up GitLab, GitHub, JIRA and Webhooks Integrations With the Installer
- Setting up Adminbot and Auditbot
- Setting Up Hydrogen
- Setting up On-Premise Metrics
- Setting Up the Telegram Bridge
- Setting Up the Teams Bridge
- Setting Up the IRC Bridge
- Setting Up the SIP Bridge
- Setting Up the XMPP Bridge
- Setting up Location Sharing
- Removing Legacy Integrations
- Setting up Sliding Sync
- Setting up Element Call
- Setting Up the Skype for Business Bridge
- Migration from self-hosted to ESS On-Premise
- Configuring Synapse workers
- Setting up Delegated Authentication with LDAP on Windows AD
- Setting up Delegated Authentication with OpenID on Microsoft Azure
- Setting up Delegated Authentication with OpenID on Microsoft AD FS
- Getting Started with the Enterprise Helm Charts
- Automating ESS Deployment
- Kubernetes : namespace-scoped deployments
- Customize containers ran by ESS
- Support Policies
- Appendices
- Preparing Element Server Suite PoC
- How to run a Webserver on Standalone Deployments
- Notifications, MDM & Push Gateway
- Verifying ESS releases against Cosign
- ESS CRDs support in ArgoCD
- Synapse database troubleshooting
- Auditbot troubleshooting
- Archived Documentation Repository
- Documentation covering v1 and installers prior to 2022-07.03
- Documentation Covering Installers From 2022.07.03 to 2022.09.05
- Documentation Covering Installers From 2022.10.01 to 2023.02.01
- Documentation Covering Installer 2023-02.02 CLI Only.
- Documentation Covering Installers from 2023-03.01 to 2023-05.04
- ESS Sizing
- ESS - Backup & Restore Guide
- Guidance on High Availability
Introduction to Element Server Suite
What is Element Server Suite?
Element Server Suite provides an enterprise-grade secure communications platform that can be run either on your own premises or in our Element Cloud. Element Server Suite includes the Element Matrix Server, which provides a host of security and privacy features, including:
- Built on the Matrix open communications standard.
- Provides end-to-end encrypted messaging, voice, and video through a consumer-style messenger with the power of a collaboration tool.
- Delivers data sovereignty.
- Affords a high degree of flexibility that can be tailored to many use cases.
- Allows secure federation within a single organisation or across a supply chain or ecosystem.
and combines them with the following Element Server Extensions:
- Group Sync: Synchronize group data from your identity provider and map these into Element spaces.
- Adminbot: Give your server administrator the ability to be an admin in any room on your homeserver.
- Auditbot: Have an auditable record of conversations conducted on your homeserver.
- Security and feature updates: Updates are easy to deploy and handled by our installer.
- Bridges: Bridge to IRC, XMPP, Telegram, Microsoft Teams, or SIP. More coming soon.
Further, we also offer Enterprise Support, giving you access to experts in federated, secure communications and the confidence to deploy our platform for your most critical secure communications needs.
Given the flexibility afforded by this platform, there are a number of moving parts to configure. This documentation will step you through architecting and deploying Element Enterprise On-Premise.
Deploying to Kubernetes or to a standalone server?
Element Enterprise On-Premise can be deployed either to Kubernetes (a container orchestration platform) or to a standalone server. One key benefit of going with Kubernetes is that you can add more resources and nodes to a cluster as you need them, whereas you are capped at one node with our standalone server. In the case of our standalone server installation, we deploy microk8s (a smaller, lightweight distribution of Kubernetes), which we then use for deploying our application.
In general, regardless of whether you pick the standalone server or the Kubernetes deployment, you will need a base level of hardware to support the application.
For scenarios that utilise open federation, Element recommends a minimum of 8 vCPUs/CPUs and 32GB RAM for the host(s) running Synapse pods.
For scenarios that utilise closed federation, Element recommends a minimum of 6 vCPUs/CPUs and 16GB RAM for the host(s) running Synapse pods.
Architecture
This document gives an overview of our secure communications platform architecture:
Our secure communications platform comprises the following components:
- synapse : The homeserver itself.
- element-web : The Element Web client.
- integrator: Our integration manager.
- synapse admin ui : Our Element Enterprise Administrator Dashboard.
- postgresql (Optional) : Our database. Optional only if you already have a separate PostgreSQL database, which is required for a multi-node setup. Use an external database if you have more than 300 users.
- groupsync (Optional) : Our group sync software.
- adminbot (Optional) : Our bot for admin tasks.
- auditbot (Optional) : Our bot that provides auditability.
- hookshot (Optional) : Our integrations with GitLab, GitHub, JIRA, and custom webhooks.
- hydrogen (Optional) : A lightweight alternative chat client.
- jitsi (Optional) : Our VoIP platform for group conferencing.
- coturn (Optional) : TURN server. Required if deploying VoIP.
- element-call (Optional) : Our new VoIP platform for group conferencing.
- sfu (Optional) : Element Call LiveKit component for scalable conferencing.
- prometheus (Optional) : Provides metrics about the application and platform.
- grafana (Optional) : Graphs metrics to make them consumable.
- telegram bridge (Optional) : Bridge to connect Element to Telegram.
- teams bridge (Optional) : Bridge to connect Element to MS Teams.
- xmpp bridge (Optional) : Bridge to connect Element to XMPP.
- irc bridge (Optional) : Bridge to connect Element to IRC.
- sip bridge (Optional) : Bridge to connect Element to SIP.
For each of the components in this list (excluding postgresql, groupsync, adminbot, auditbot, and prometheus), you must provide a hostname on your network that meets these criteria:
- Fully resolvable to an IP address that is accessible from your clients.
- Signed PEM-encoded certificates for the hostname in a crt/key pair. Certificates should be signed by an internet-recognised authority, an authority internal to your company, or Let's Encrypt.
It is possible to deploy Element Enterprise On-Premise with self-signed certificates and without proper DNS in place, but this is not ideal, as mobile clients and federation do not work with self-signed certificates. Information on how to use self-signed certificates and hostname mappings instead of DNS can be found in How to Setup Local Host Resolution Without DNS.
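As a quick sanity check before installation, you can confirm that a hostname resolves and that the certificate pair you intend to upload is valid and matches its key. This is an illustrative sketch; matrix.example.com and the file names are placeholders for your own values.
# Confirm the hostname resolves from a client machine
dig +short matrix.example.com
# Inspect the PEM certificate you plan to upload: subject and validity window
openssl x509 -in matrix.example.com.crt -noout -subject -dates
# Confirm the certificate and private key belong together (the two digests must match)
openssl x509 -in matrix.example.com.crt -noout -pubkey | openssl sha256
openssl pkey -in matrix.example.com.key -pubout | openssl sha256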
In addition to hostnames for the above, you will also need a hostname and a PEM-encoded certificate key/cert pair for your base domain. If we were deploying a domain called example.com and wanted to deploy all of the software, we would have the following hostnames in our environment that need to meet the above criteria:
- example.com (base domain)
- matrix.example.com (the synapse homeserver)
- element.example.com (element web)
- integrator.example.com (integration manager)
- admin.example.com (admin dashboard)
- hookshot.example.com (Our integrations)
- hydrogen.example.com (Our lightweight chat client)
- jitsi.example.com (Our VoIP platform)
- coturn.example.com (Our TURN server)
- grafana.example.com (Our Grafana server)
- telegrambridge.example.com (Our Telegram Bridge)
- teamsbridge.example.com (Our Teams Bridge)
- roomadmin.example.com (AdminBot)
- audit.example.com (Audit capability)
- element-call.example.com (Our new VoIP platform)
- sfu.example.com (Scaling for Element Call)
As mentioned above, this list excludes postgresql, groupsync, adminbot, auditbot, and prometheus.
Wildcard certificates do work with our application, so for the above scenario you could use a single certificate that validates both *.example.com and example.com. The certificate must cover both the base domain and the wildcard for this to work.
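If you go the wildcard route, you can confirm that a single certificate covers both names by listing its Subject Alternative Names; the file name below is a placeholder, and the -ext option requires a reasonably recent OpenSSL.
# Both example.com and *.example.com should appear in the SAN list
openssl x509 -in wildcard.example.com.crt -noout -ext subjectAltName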
Further, if you want to do voice or video across network boundaries (i.e. between people not on the same local network), you will need a TURN server. If you already have one, you do not have to set up coturn. If you do not already have a TURN server, you will want to set up coturn (our installer can do this for you), and if your server is behind NAT, you will need an external IP in order for coturn to work.
Installation
Airgapped Environments
If you are going to be installing into an airgapped environment (one without internet connectivity), you will also need to download the airgapped installer, which is ~6GB of data that will need to be transferred to your airgapped environment. More information on this can be found in our airgapped installation documentation here: https://ems-docs.element.io/books/element-on-premise-documentation/page/using-the-installer-in-an-air-gapped-environment
Software
To obtain our software, please visit our downloads page at: https://ems.element.io/on-premise/download
Kubernetes Application (Multiple Nodes)
For an installation into a Kubernetes environment, make sure you have access to a deployed Kubernetes platform and head over to Kubernetes Installations. You will also need a Linux computer to run the installer from; it should be running RHEL 8, RHEL 9, or Ubuntu.
Standalone (Single Node)
For a standalone installation, please note that we support these on the following platforms, depending on the ESS version:
| LTS ESS Version | Supported Ubuntu | Supported Enterprise Linux (RHEL, Rocky, etc.) | General Python Version Requirements |
|---|---|---|---|
| 23.10 | 20.04, 22.04 | 8, 9 | Python 3.8-3.10 |
| 24.04 | 20.04, 22.04 | 8, 9 | Python 3.9-3.11 |
| 24.10 | 22.04, 24.04 | 8, 9 | Python 3.10-3.12 |
Once you have a server with one of these installed, please head over to Single Node Installations
Kubernetes Installations
Overview
Our installer can handle the installation of Element Enterprise into your existing production Kubernetes (k8s) environment.
Server minimum requirements
The ESS deployment resource usage is described in ESS Sizing.
Prerequisites
Before beginning the installation, there are a few things that must be prepared to ensure a successful deployment and functioning installation.
Python environment
The installer needs python3, pip3 and python3-venv installed to run.
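A quick way to check for these, and to install anything missing, is shown below; the package names assume a Debian/Ubuntu or RHEL-family host and may differ on your distribution.
python3 --version && python3 -m pip --version && python3 -m venv --help > /dev/null && echo "python3, pip3 and venv OK"
# If anything is missing, install it with your package manager, for example:
sudo apt-get install -y python3 python3-pip python3-venv    # Debian / Ubuntu
sudo dnf install -y python3 python3-pip                     # RHEL 8 / 9 (the venv module ships with python3)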
Kubectl environment
The installer uses your currently active kubectl context, which can be determined with kubectl config current-context - make sure this is the correct context, as all subsequent operations will be performed under it.
More information on configuring this can be found in the upstream kubectl docs.
Be sure to export K8S_AUTH_CONTEXT=<kube context name> for the Installer if you need to use a context other than your currently active one.
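For example, to list the contexts available to you, check which one is active, and point the installer at a specific one (the context name below is a placeholder):
kubectl config get-contexts
kubectl config current-context
# Only needed if the installer should use a context other than the active one
export K8S_AUTH_CONTEXT=my-production-cluster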
PostgreSQL
Before you can begin with the installation you must have a PostgreSQL database instance available. The installer does not manage databases itself.
The database you use must be set to a locale of C and use UTF8 encoding - see https://element-hq.github.io/synapse/latest/postgres.html#set-up-database for further details as they relate to Synapse. If the locale / encoding are incorrect, Synapse will fail to initialize the database and get stuck in a CrashLoopBackoff cycle.
Please make note of the database hostname, database name, user, and password as you will need these to begin the installation.
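If you are creating the database yourself, a minimal sketch of suitable statements (run as a PostgreSQL superuser on your database host; the role name, database name, and password are examples) follows the guidance in the Synapse documentation linked above:
sudo -u postgres psql <<'SQL'
CREATE ROLE synapse_user LOGIN PASSWORD 'PleaseChangeMe!';
CREATE DATABASE synapse
  ENCODING 'UTF8'
  LC_COLLATE 'C'
  LC_CTYPE 'C'
  TEMPLATE template0
  OWNER synapse_user;
SQL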
For testing and evaluation purposes, you can deploy PostgreSQL to k8s before you begin the installation process - see Kubernetes Installations - Quick Start - Deploying PostgreSQL to Kubernetes for more information.
Kubernetes Ingress Controller
The installer does not manage cluster Ingress capabilities since this is typically a cluster-wide concern - You must have this available prior to installation. Without a working Ingress Controller you will be unable to route traffic to your services without manual configuration.
If you do not have an Ingress Controller deployed, please see Kubernetes Installations - Quick Start - Deploying ingress-nginx to Kubernetes for information on how to set up a bare-bones ingress-nginx installation in your cluster.
Use an existing Ingress Controller
If you have an Ingress Controller deployed already and it is set to the default class for the cluster, you shouldn't have to do anything else.
If you're unsure you can see which providers are available in your cluster with the following command:
$ kubectl get IngressClass
NAME CONTROLLER PARAMETERS AGE
nginx k8s.io/ingress-nginx <none> 40d
And you can check to see whether an IngressClass is set to default using kubectl, for example:
$ kubectl describe IngressClass nginx
Name: nginx
Labels: app.kubernetes.io/component=controller
app.kubernetes.io/instance=ingress-nginx
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=ingress-nginx
app.kubernetes.io/part-of=ingress-nginx
app.kubernetes.io/version=1.1.1
argocd.argoproj.io/instance=ingress-nginx
helm.sh/chart=ingress-nginx-4.0.17
Annotations: ingressclass.kubernetes.io/is-default-class: true
Controller: k8s.io/ingress-nginx
Events: <none>
In this example cluster there is only an nginx IngressClass and it is already the default, but depending on the cluster you are deploying to, this may be something you must set manually.
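If your IngressClass exists but is not marked as the default, you can set the annotation yourself; this example assumes an IngressClass named nginx:
kubectl annotate ingressclass nginx ingressclass.kubernetes.io/is-default-class="true"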
Beginning the Installation
Head to https://ems.element.io/on-premise/download and download the latest installer. The installer will be called element-enterprise-graphical-installer-YYYY-MM.VERSION-gui.bin. You will take this file and copy it to the machine from which you will be installing the Element Server Suite. This machine will need to be running RHEL 8, RHEL 9, or Ubuntu and have access to your Kubernetes cluster network. Once you have this file on the machine in a directory accessible to your sudo-enabled user, you will run:
chmod +x ./element-enterprise-graphical-installer-YYYY-MM.VERSION-gui.bin
replacing YYYY-MM.VERSION with the appropriate tag for the installer you downloaded.
If you have multiple Kubernetes clusters configured in your kubeconfig, you will have to export the K8S_AUTH_CONTEXT variable before running the installer:
export K8S_AUTH_CONTEXT=<kube context name>
Once you have done this, you will run:
./element-enterprise-graphical-installer-YYYY-MM.VERSION-gui.bin
replacing YYYY-MM.VERSION with the appropriate tag for the installer you downloaded. This will start a web server with the installer loaded.
You will see a message similar to:
[user@element-demo ~]$ ./element-enterprise-graphical-installer-2023-02.02-gui.bin
Testing network...
Using self-signed certificate with SHA-256 fingerprint:
F3:76:B3:2E:1B:B3:D2:20:3C:CD:D0:72:A3:5E:EC:4F:BC:3E:F5:71:37:0B:D7:68:36:2E:2C:AA:7A:F2:83:94
To start configuration open:
https://192.168.122.47:8443 or https://10.1.185.64:8443 or https://127.0.0.1:8443
At this point, you will need to open a web browser and browse to one of these IPs. You may need to open port 8443 in your firewall to be able to access this address from a different machine.
If you are unable to open port 8443 or you are having difficulty connecting from a different machine, you may want to try ssh port forwarding in which you would run:
ssh <host> -L 8443:127.0.0.1:8443
replacing host with the IP address or hostname of the machine that is running the installer. At this point, with ssh connected in this manner, you should be able to use the https://127.0.0.1:8443 link as this will then forward that request to the installer box via ssh.
Upon loading this address for the first time, you may be greeted with a message informing you that your connection isn't private such as this:
In this case, you'll need to click "Advanced" and then "Continue to <IP> (unsafe)" in order to view the installer. As the exact button names and links can vary between browsers, you may have slightly different wording depending on your browser.
The Hosts Screen
The very first page that you come to is the host screen.
You will want to make sure that "Kubernetes Application" is selected. You will then need to specify the Kubernetes context name into which you are deploying. (Hint: Use kubectl config view to see which contexts you have access to.)
You can opt to skip the update setup or the operator setup, but unless you know why you are doing that, you should leave them alone.
The very next prompt that you come to is for an EMS Image Store Username and Token. These are provided to you by Element as access tokens for our enterprise container registries. If you have lost your token, you can always generate a new token at https://ems.element.io/on-premise/subscriptions.
Here, we find the ability to set the namespaces that the application will be deployed into.
The next options on the host page are related to connectivity. For this guide, we are assuming "Connected" and you can leave that be. If you are doing an airgapped installation, you would pick "Airgapped" at this point; please then see the section on airgapped installations.
You are presented with the option to provide docker hub credentials. These are optional, but if you do not provide them, you may be rate limited by Docker and this could cause issues pulling container images.
Finally, we come to the host admin page, which allows you to set parameters around which domain the installer and admin console should run on post-deployment. This section is optional.
The Domains Screen
On this page, we get to specify the domains for our installation. In this example, we have a domain name of airgap.local and this would mean our MXIDs would look like @kabbott:airgap.local.
Our domains page checks that the host names resolve. Once you get green checks across the board, you can click continue.
The Certificates Screen
On the Certificates screen, you will provide SSL certificate information for well-known delegation, Synapse, Element Web, Synapse Admin, and Integrator.
If you are using Let's Encrypt, then each of the sections should look like:
If you are using certificate files, then you will see a screen like:
which allows you to upload a .crt and .key file for each host. These files must be in PEM encoding. Our installer does accept wildcard certificates.
Once you have completed the certificate section for each host on the page, you may click continue.
The Database Screen
As you must use an external PostgreSQL database with our Kubernetes install, on this page we provide the option to specify the database name, the database host name, the port to connect to, the SSL mode to use, and finally the username and password to connect with. Once you have completed this section, you may click continue.
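Before continuing, it can save time to confirm that pods inside the cluster can actually reach that database with the credentials you are about to enter. One way to do that is a throwaway psql pod; the hostname, database name, user, and SSL mode below are placeholders for your own values.
kubectl run psql-check -it --rm --restart=Never --image=postgres:15 -- \
  psql "host=db.example.com port=5432 dbname=synapse user=synapse sslmode=require" -c 'SELECT 1;'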
The Media Screen
On this page, you can specify the size of your Synapse media volume. Please leave "Create New Volume" checked and specify the size of the volume that you wish to allocate. You must have this space available in /data/element-deployment or whatever you specified back on the hosts screen. If you wish to create a 50GB volume, you would specify 50Gi for the Volume size.
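The volume is provisioned through your cluster's storage layer, so it is worth confirming up front that a usable (ideally default) StorageClass exists, and after installation that the media volume claim was actually bound; the namespace here matches the example deployment namespace used elsewhere in this guide.
# Before installation: confirm a default StorageClass is available
kubectl get storageclass
# After installation: confirm the media volume claim is Bound
kubectl get pvc -n element-onprem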
The Cluster Screen
Most deployments can ignore this; however, if you want to change any Kubernetes cluster parameters, this is where to do it.
If you are in an environment where you have self-signed certificates, you will want to disable TLS verification by clicking "Advanced", scrolling down, and unchecking Verify TLS:
Please bear in mind that disabling TLS verification and using self-signed certificates is not recommended for production deployments.
If your host names are not DNS-resolvable, you need to use host aliases, which can be set up here. Click "Advanced" and scroll down to the "Host Aliases" section in "k8s". In here, click "Add Host Aliases" and then specify an IP and the host names that resolve to that IP, as such:
Important: If you are not using OpenShift, you will need to set "Force UID GID" and "Set Sec Comp" to "Enable" under the section "Security Context" so that it looks like:
If you are using OpenShift, you should leave the values of "Force UID GID" and "Set Sec Comp" set to "Auto".
When you are finished with this page, you can click continue.
The Synapse Screen
The first setting that you will come to is our built-in performance profiles. Select the appropriate answers for "Monthly Active Users" and "Federation Type" to apply our best practices based on years of running Matrix homeservers.
The next setting that you will see is whether you want to auto accept invites. The default of "Manual" will fit most use cases, but you are welcome to change this value.
The next setting is the maximum number of monthly active users (MAU) that you have purchased for your server. Your server will not allow you to go past this value. If you set this higher than your purchased MAU and you go over your purchased MAU, you will need to true up with Element to cover the cost of the unpaid users.
The next setting concerns registration. A server with open registration on the open internet can become a target, so we default to closed registration. You will notice that there is a setting called "Custom" and this requires explicit custom settings in the additional configuration section. Unless instructed by Element, you will not need the "Custom" option and should instead pick "Closed" or "Open" depending on your needs.
After this, you will see that the installer has picked an admin password for you. You will want to use the eye icon to view the password and copy it down, as you will use it with the user onprem-admin-donotdelete to log into the admin panel after installation.
Continuing, we see telemetry. You should leave this enabled as you are required to report MAU to Element. In the event that you are installing into an environment without internet access, you may disable this so that it does not continue to try talking to Element. That said, you are still required to generate an MAU report at regular intervals and share that with Element.
For more information on the data that Element collects, please see: What Telemetry Data is Collected by Element?
Next, we have an advanced button, which allows you to configure further settings about Synapse and the Kubernetes environment in which it runs. The additional configuration text box allows you to inject additional Synapse configuration (homeserver.yaml).
You can hit continue to go to the next screen.
The Element Web Screen
Most users will be able to simply click "Continue" here.
The Advanced section allows you to set any custom element web configurations you would like (config.json).
A common custom configuration would be configuring permalinks for Element, which we have documented here: Setting up Permalinks With the Installer
Further, it provides access to the k8s section, allowing you to explicitly set any kubernetes cluster settings that you would like just for the element-web pod.
The Homeserver Admin
Most users will be able to simply click "Continue" here. The Advanced section allows you to explicitly set any kubernetes cluster settings that you would like just for the synapse-admin-ui pod.
One thing to note here is that if you are not using delegated authentication, then the initial username that an administrator will use to log into this dashboard post-installation is onprem-admin-donotdelete. You can find the password for this user on the Synapse page in the "Admin Password" field.
If you are using delegated authentication, you will need to assign a user admin rights as detailed in this article: How do I give a user admin rights when I am using delegated authentication and cannot log into the admin console?
The Integrator Screen
On this page, you can set up Integrator, the integrations manager.
The first option allows you to choose whether users can add custom widgets to their rooms with the integrator or not.
The next option allows you to specify which Jitsi instance the Jitsi widget will create conferences on.
The verify TLS option allows you to set this specifically for Integrator, regardless of what you set on the cluster screen.
The logging section allows you to set the log level and whether the output should be structured or not.
The Advanced section allows you to explicitly set any kubernetes cluster settings that you would like just for the integrator pods.
Click "Continue to go to the next screen".
The Integrations Screen
This screen is where you can install any available integrations.
Some of these integrations will have "YAML" next to them. When you see this designation, the integration requires configuration via YAML, much like the old installer. However, with this installer, these YAML files are pre-populated and often only require a few changes.
If you do not see a "YAML" designation next to the integration, this means that you will use regular GUI elements to configure it.
Over time, we will do the work required to move the integrations with "YAML" next to them to the new GUI format.
For specifics on configuring well known delegation, please see Setting Up Well Known Delegation
For specifics on setting up Delegated Authentication, please see Setting up Delegated Authentication With the Installer
For specifics on setting up Group Sync, please see Setting up Group Sync with the Installer
For specifics on setting up GitLab, GitHub, and JIRA integrations, please see Setting up GitLab, GitHub, and JIRA Integrations With the Installer
For specifics on setting up Adminbot and Auditbot, please see: Setting up Adminbot and Auditbot
For specifics on setting up Hydrogen, please see: Setting Up Hydrogen
For specifics on pointing your installation at an existing Jitsi instance, please see Setting Up Jitsi and TURN With the Installer
If you do not have an existing TURN server or Jitsi server, our installer can configure these for you by following the extra steps in Setting Up Jitsi and TURN With the Installer
For specifics on configuring the Teams Bridge, please see Setting Up the Teams Bridge
For specifics on configuring the Telegram Bridge, please see Setting Up the Telegram Bridge
For specifics on configuring the IRC Bridge, please see Setting Up the IRC Bridge
For specifics on configuring the XMPP Bridge, please see Setting Up the XMPP Bridge
Once you have configured all of the integrations that you would like to configure, you can click "Continue" to head to the installation screen.
The Installation Screen
On the installation screen, you should see a blank console and a start button:
Click Start.
After a moment, you will notice that the installer appears to hang. If you go back to the prompt where you are running the installer, you will see that you are being asked for the sudo password:
Go ahead and enter the sudo password and the installation will continue.
On the very first time that you run the installer, you will be prompted to log out and back in again to allow Linux group membership changes to be refreshed. This means that you will need to issue a ctrl-C in the terminal running your installer and actually log all the way out of your Linux session, log back in, restart the installer, navigate back to the installer screen, click start again, and then re-enter your sudo password. You will only have to perform this step once per server.
Verifying Your Installation
Once the installation has finished, it can take as much as 15 minutes on a first run for everything to be configured and set up. If you use:
kubectl get pods -n element-onprem
You should see similar output to:
NAME READY STATUS RESTARTS AGE
app-element-web-c5bd87777-rqr6s 1/1 Running 1 29m
server-well-known-8c6bd8447-wddtm 1/1 Running 1 29m
postgres-0 1/1 Running 1 40m
instance-synapse-main-0 1/1 Running 2 29m
instance-synapse-haproxy-5b4b55fc9c-hnlmp 1/1 Running 0 20m
Once the admin console is up and running:
first-element-deployment-synapse-admin-ui-564cbf5665-dn8nv 1/1 Running 1 (4h4m ago) 3d1h
and synapse:
first-element-deployment-synapse-redis-59548698df-gqkcq 1/1 Running 1 (4h4m ago) 3d2h
first-element-deployment-synapse-haproxy-7587dfd6f7-gp6wh 1/1 Running 2 (4h3m ago) 2d23h
first-element-deployment-synapse-appservice-0 1/1 Running 3 (4h3m ago) 3d
first-element-deployment-synapse-main-0 1/1 Running 0 3h19m
then you should be able to log in at your admin panel (in our case https://admin.airgap.local/) with the onprem-admin-donotdelete user and the password that was specified on the "Synapse" screen.
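If a pod does not reach the Running state, the usual first steps are to check the namespace events and the pod's logs; the pod name below is taken from the example output above and will differ in your deployment.
# Show scheduling, image pull and crash events for the namespace
kubectl get events -n element-onprem --sort-by=.lastTimestamp
# Follow the logs of the Synapse main process
kubectl logs -n element-onprem -f first-element-deployment-synapse-main-0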
A word about Configuration Files
In the new installer, all configuration files are placed in the directory .element-enterprise-server, which can be found in your user's home directory. In this directory, you will find a subdirectory called config that contains the actual configurations.
Running the Installer without the GUI
It is possible to run the installer without using the GUI provided that you have a valid set of configuration files in the .element-enterprise-server/config directory. Directions on how to do this are available at: https://ems-docs.element.io/books/ems-knowledge-base/page/how-do-i-run-the-installer-without-using-the-gui. Using this method, you could use the GUI as a configuration editor and then take the resulting configuration and modify it as needed for further installations.
This method also makes it possible to set things up once and then run future updates without having to use the GUI.
End-User Documentation
After completing the installation you can share our User Guide to help orient and onboard your users to Element!
Kubernetes Installations - Quick Start
For testing and evaluation purposes - Element cannot guarantee production readiness with these sample configurations.
Requires Helm installed locally
Deploying PostgreSQL to Kubernetes
If you do not have a database present, it is possible to deploy PostgreSQL to your Kubernetes cluster.
This is great for testing and can also work in a production environment, but only for those with a high degree of comfort with PostgreSQL as well as the tradeoffs involved with k8s-managed databases.
There are many different ways to do this depending on your organization's preferences - as long as it can create an instance / database with the required locale and encoding, it will work just fine. For a simple non-production deployment, we will demonstrate deploying the bitnami/postgresql chart into your cluster using Helm.
You can add the bitnami repo with a few commands:
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
helm search repo bitnami/postgresql
NAME CHART VERSION APP VERSION DESCRIPTION
bitnami/postgresql 12.5.7 15.3.0 PostgreSQL (Postgres) is an open source object-...
bitnami/postgresql-ha 11.7.5 15.3.0 This PostgreSQL cluster solution includes the P...
Next, you'll need to create a values.yaml file to configure your PostgreSQL instance. This example is enough to get started, but please consult the chart's README and values.yaml for a full list of parameters and options.
auth:
  # This is the necessary configuration you will need for the Installer, minus the hostname
  database: "synapse"
  username: "synapse"
  password: "PleaseChangeMe!"
primary:
  initdb:
    # This ensures that the initial database will be created with the proper collation settings
    args: "--lc-collate=C --lc-ctype=C"
  persistence:
    enabled: true
    # Set this value if you need to use a non-default StorageClass for your database's PVC
    # storageClass: ""
    size: 20Gi
  # Optional - resource requests / requirements
  # These are sufficient for a 10 - 20 user server
  resources:
    requests:
      cpu: 500m
      memory: 512Mi
    limits:
      memory: 2Gi
This example values.yaml file is enough to get you started for testing purposes, but things such as TLS configuration, backups, HA, and maintenance tasks are outside the scope of the installer and this document.
Next, pick a namespace to deploy it to - this can be the same as the Installer's target namespace if you desire. For this example we'll use the postgresql namespace.
Then it's just a single Helm command to install:
# format:
# helm install --create-namespace -n <namespace> <helm-release-name> <repo/chart> -f <values file> (-f <additional values file>)
helm install --create-namespace -n postgresql postgresql bitnami/postgresql -f values.yaml
Which should output something like this when it is successful:
-- snip --
PostgreSQL can be accessed via port 5432 on the following DNS names from within your cluster:
postgresql.postgresql.svc.cluster.local - Read/Write connection
-- snip --
This is telling us that postgresql.postgresql.svc.cluster.local will be our hostname for PostgreSQL connections, which is the remaining bit of configuration required for the Installer in addition to the database/username/password set in values.yaml. This will differ depending on what namespace you deploy to, so be sure to check everything over.
If needed, this output can be re-displayed with helm get notes -n <namespace> <release name>, which for this example would be helm get notes -n postgresql postgresql.
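As a quick smoke test, you can open a one-off psql session against the new instance from inside the cluster and confirm the collation settings took effect; the namespace, hostname, and credentials match the example values.yaml above, so adjust them to your own deployment.
kubectl run pg-check -it --rm --restart=Never -n postgresql \
  --image=postgres:15 --env="PGPASSWORD=PleaseChangeMe!" -- \
  psql -h postgresql.postgresql.svc.cluster.local -U synapse -d synapse \
  -c "SHOW lc_collate;" -c "SHOW server_encoding;"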
Deploying ingress-nginx controller
Similar to the PostgreSQL quick start example, this requires Helm
The kubernetes/ingress-nginx chart is an easy way to get a cluster outfitted with Ingress capabilities.
In an environment where LoadBalancer services are handled transparently, such as in a simple test k3s environment with svclb enabled, there's a minimal amount of configuration.
This example values.yaml file will create an IngressClass named nginx that will be used by default for any Ingress objects in the cluster.
controller:
  ingressClassResource:
    name: nginx
    default: true
    enabled: true
However, depending on your cloud provider / vendor (i.e. AWS ALB, Google Cloud Load Balancing etc) the configuration for this can vary widely. There are several example configurations for many cloud providers in the chart's README
You can see what the resulting HTTP / HTTPS IP address for this ingress controller is by examining the service it creates - for example, in my test environment I have an installed release of the ingress-nginx chart called k3s under the ingress-nginx namespace, so I can run the following:
# format:
# kubectl get service -n <namespace> <release-name>-ingress-nginx-controller
$ kubectl get service -n ingress-nginx k3s-ingress-nginx-controller
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
k3s-ingress-nginx-controller LoadBalancer 10.43.254.210 192.168.1.129 80:30634/TCP,443:31500/TCP 79d
The value of EXTERNAL-IP will be the address that you'll need your DNS to point to (either locally via /etc/hosts or via LAN / WAN DNS configuration) to access your installer-provisioned services.
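For a quick test without real DNS, one option is to point the relevant hostnames at that EXTERNAL-IP in /etc/hosts on each client machine; the IP and hostnames below are examples only.
# /etc/hosts on the client machine
192.168.1.129  example.com element.example.com matrix.example.com admin.example.com integrator.example.com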
Single Node Installations
Installing a Standalone Server
Overview
Our installer can handle the installation of environments in which only one server is available. This environment consists of a single server with a microk8s deployment into which we deploy our Element Server Suite, resulting in a fully functioning version of our platform.
To get started with a standalone installation, there are several things that need to be considered and this guide will work through them:
- Operating System
- Postgresql Database
- TURN Server
- SSL Certificates
- Extra configuration items
Once these areas have been covered, you'll be ready to install your standalone server!
Server minimum requirements
CPU and Memory
The installer binary requires support for the x86_64 architecture. The standalone deployment will need 2 GiB of memory to run the OS and microk8s properly. The ESS deployment resource usage is described in ESS Sizing.
Disk size
It is crucial that your storage provider supports fsync for data integrity.
- /var: 50GB
- /data/element-deployment: will contain your Synapse media. The path can be adjusted in the UI. Please refer to the ESS Sizing page for an estimate of the expected size growth.
- /data/postgres: will contain your PostgreSQL server's data. The path can be adjusted in the UI. Please refer to the ESS Sizing page for an estimate of the expected size growth.
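You can confirm the relevant mount points have enough headroom before installing; create the /data paths first if they do not exist yet, and adjust them if you change the defaults in the UI.
df -h /var /data/element-deployment /data/postgres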
Operating System
We provide support for Ubuntu 20.04 and Red Hat Enterprise Linux (RHEL) versions 8 and 9 and suggest that you start there as well. Please note that the installer binary requires support for the x86_64 architecture.
You can grab an Ubuntu iso here:
https://releases.ubuntu.com/20.04.3/ubuntu-20.04.3-live-server-amd64.iso
You can get Red Hat Enterprise Linux 8 with a Developer Subscription at:
https://access.redhat.com/downloads/content/479/ver=/rhel---8/8.7/x86_64/product-software
Ubuntu Specific instructions
Make sure to select Docker as a package option and to set up SSH.
Once you log in, please run:
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install git
The installer requires that you run it as a non-root user who has sudo permissions. Please make sure that you have a user who can use sudo. If you wanted to make a user called element-demo that can use sudo, the following commands (run as root) would achieve that:
useradd element-demo
gpasswd -a element-demo sudo
The installer also requires that your non-root user has a home directory in /home.
RHEL Specific instructions
Make sure to select "Container Management" in the "Additional Software" section.
Once you log in, please run:
sudo yum update -y
sudo yum install python39-pip python39-devel make gcc git -y
sudo yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm -y
sudo update-alternatives --config python3
You should also follow the steps linked here to install microk8s on RHEL, or included below, if you run into Error: System does not fully support snapd: cannot mount squashfs image using "squashfs":
- Install the EPEL repository:
  - RHEL 9: sudo dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm && sudo dnf upgrade
  - RHEL 8: sudo dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm && sudo dnf upgrade
- Install Snap, enable the main snap communication socket, and enable classic snap support:
  sudo yum install snapd
  sudo systemctl enable --now snapd.socket
  sudo ln -s /var/lib/snapd/snap /snap
- Reboot
- (Optional) Install microk8s using sudo snap install microk8s --classic; the installer will do this for you otherwise.
On the update-alternatives command, if you see more than one option, select the option with a command string of /usr/bin/python3.9.
The installer requires that you run it as a non-root user who has sudo permissions. Please make sure that you have a user who can use sudo. If you wanted to make a user called element-demo that can use sudo, the following commands (run as root) would achieve that:
useradd element-demo
gpasswd -a element-demo wheel
The installer also requires that your non-root user has a home directory in /home.
Kernel modules
microk8s requires the kernel module nf_conntrack to be enabled.
if ! grep nf_conntrack /proc/modules; then
echo "nf_conntrack" | sudo tee --append /etc/modules
sudo modprobe nf_conntrack
fi
Migrating from our older installer
If you have previously used installer versions 2023-03.01 and earlier, you will need to run our migration script to convert your previous configuration to the new format that is used with our UI based installer. This script became available in 2023-03.02, so you must have at least that version or higher of the graphical installer for this to work.
NOTE: Before running the migration script, we highly recommend that you take a backup or snapshot of your working environment. While we have tested the migration script against several configurations at this point, we have not tested for all of the combinations of configuration that the previous installer allowed. We expect that migration will be a quick process for most customers, but in the event that something goes wrong, you'll want to be able to get back to a known good state through a backup or snapshot.
NB: If you are using group sync, you cannot presently migrate to the graphical installer. We are working to address the issues with migrating group sync and will remove this note once we have those addressed.
If you have not used our installer before, you may safely ignore this section.
To run the migration script, please do the following:
chmod +x ./element-enterprise-graphical-installer-YYYY-MM.VERSION-gui.bin
./element-enterprise-graphical-installer-YYYY-MM.VERSION-gui.bin --import ~/.element-onpremise-config
Make sure to replace ~/.element-onpremise-config with the path where your existing configuration actually lives. Further, replace YYYY-MM.VERSION with the appropriate tag for the installer you downloaded.
Once the import has finished, the GUI will start and you will be able to browse to the installer at one of the provided URLs, much as if you had started the installer without doing a migration as detailed in the following section.
Network Specifics
Element Enterprise On-Premise needs to bind and serve content over:
- Port 80 TCP
- Port 443 TCP
- Port 8443 TCP ( Installer GUI )
microk8s internally needs to bind and serve content over:
- Port 16443 TCP
- Port 10250 TCP
- Port 10255 TCP
- Port 25000 TCP
- Port 12379 TCP
- Port 10257 TCP
- Port 10259 TCP
- Port 19001 TCP
For more information, see https://microk8s.io/docs/ports.
In a default Ubuntu installation, these ports are allowed through the firewall; otherwise, you will need to ensure that these ports are passed through your firewall.
For RHEL instances with firewalld enabled, the installer will take care of opening these ports for you.
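If you want to check or manage this yourself, the following illustrative commands list what firewalld currently allows on RHEL and open the client-facing ports with ufw on Ubuntu; they only apply if the respective firewall is actually in use.
# RHEL / firewalld: list the currently allowed ports and services
sudo firewall-cmd --list-all
# Ubuntu with ufw enabled: allow the client-facing ports
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw allow 8443/tcp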
Further, you need to make sure that your host is able to access the following hosts on the internet:
- api.snapcraft.io
- *.snapcraftcontent.com
- gitlab.matrix.org
- gitlab-registry.matrix.org
- pypi.org
- docker.io
- *.docker.com
- get.helm.sh
- k8s.gcr.io
- cloud.google.com
- storage.googleapis.com
- registry.k8s.io
- fastly.net
- github.com
In addition, you will also need to make sure that your host can access your distribution's package repositories. As these hostnames can vary, it is beyond the scope of this documentation to enumerate them.
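A simple loop can confirm outbound reachability to a sample of the hosts above from the server; this is a sketch, not an exhaustive check.
for host in api.snapcraft.io gitlab.matrix.org pypi.org get.helm.sh registry.k8s.io; do
  curl -sI --max-time 10 "https://$host" > /dev/null && echo "$host reachable" || echo "$host NOT reachable"
done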
Network Proxies
We also cover the case where you need to use a proxy to access the internet. Please see this article for more information: Configuring a microk8s Single Node Instance to Use a Network Proxy
Postgresql Database
The installation requires that you have a PostgreSQL database with a locale of C and UTF8 encoding set up. See https://github.com/element-hq/synapse/blob/develop/docs/postgres.md#set-up-database for further details.
If you have this already, please make note of the database name, user, and password as you will need these to begin the installation.
If you do not already have a database, then the single node installer will set up PostgreSQL on your behalf.
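If you are bringing your own database, you can confirm its locale and encoding up front rather than discovering a mismatch during installation; the hostname and credentials below are placeholders.
psql "host=db.example.com user=synapse dbname=synapse" \
  -c "SELECT datcollate, datctype, pg_encoding_to_char(encoding) AS encoding FROM pg_database WHERE datname = current_database();"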
Beginning the Installation
Head to https://ems.element.io/on-premise/download and download the latest installer. The installer will be called element-enterprise-graphical-installer-YYYY-MM.VERSION-gui.bin
. You will take this file and copy it to the machine where you will be installing the Element Server Suite. Once you have this file on the machine in a directory accessible to your sudo-enabled user, you will run:
chmod +x ./element-enterprise-graphical-installer-YYYY-MM.VERSION-gui.bin
replacing YYYY-MM.VERSION with the appropriate tag for the installer you downloaded.
Once you have done this, you will run:
./element-enterprise-graphical-installer-YYYY-MM.VERSION-gui.bin
replacing YYYY-MM.VERSION with the appropriate tag for the installer you downloaded. This will start a web server with the installer loaded.
You will see a message similar to:
[user@element-demo ~]$ ./element-enterprise-graphical-installer-2023-02.02-gui.bin
Testing network...
Using self-signed certificate with SHA-256 fingerprint:
F3:76:B3:2E:1B:B3:D2:20:3C:CD:D0:72:A3:5E:EC:4F:BC:3E:F5:71:37:0B:D7:68:36:2E:2C:AA:7A:F2:83:94
To start configuration open:
https://192.168.122.47:8443 or https://10.1.185.64:8443 or https://127.0.0.1:8443
At this point, you will need to open a web browser and browse to one of these IPs. You may need to open port 8443 in your firewall to be able to access this address from a different machine.
If you are unable to open port 8443 or you are having difficulty connecting from a different machine, you may want to try ssh port forwarding in which you would run:
ssh <host> -L 8443:127.0.0.1:8443
replacing host with the IP address or hostname of the machine that is running the installer. At this point, with ssh connected in this manner, you should be able to use the https://127.0.0.1:8443 link as this will then forward that request to the installer box via ssh.
Upon loading this address for the first time, you may be greeted with a message informing you that your connection isn't private such as this:
In this case, you'll need to click "Advanced" and then "Continue to <IP> (unsafe)" in order to view the installer. As the exact button names and links can vary between browsers, it would be hard for us to document them all, so you may have slightly different wording depending on your browser.
The Hosts Screen
The very first page that you come to is the host screen.
The very next prompt that you come to is for an EMS Image Store Username and Token. These are provided to you by Element as access tokens for our enterprise container registries. If you have lost your token, you can always generate a new token at https://ems.element.io/on-premise/subscriptions.
The next option that you have is for microk8s. By default, microk8s will set up persistent volumes in /data/element-deployment and will allow 20GB of space to do this. For most installations, this is fine and can be left alone, but if you'd like to customize those options, you can do that here.
Next, we have DNS resolvers. The default DNS resolvers are Google (8.8.8.8 and 8.8.4.4). If you need to use your company's DNS servers, please change these values appropriately.
Next, we get the option either to have the installer install PostgreSQL in your cluster or to use an external PostgreSQL server. The in-cluster PostgreSQL option is only supported for our standalone installation, and you should read our storage and backup guidelines for this configuration. If you use the in-cluster PostgreSQL, you will see that the installer defaults to /data/postgres and has generated a random password for your PostgreSQL admin account. You can use the eye icon to view the password, and you can certainly change it to whatever you'd like.
The final options on the host page are related to connectivity. For this guide, we are assuming "Connected" and you can leave that be. If you are doing an airgapped installation, you would pick "Airgapped" at this point; please then see the section on airgapped installations.
You are presented with the option to provide docker hub credentials. These are optional, but if you do not provide them, you may be rate limited by Docker and this could cause issues pulling container images.
The Domains Screen
On this page, we get to specify the domains for our installation. In this example, we have a domain name of airgap.local and this would mean our MXIDs would look like @kabbott:airgap.local.
Our domains page checks that the host names resolve. Once you get green checks across the board, you can click continue.
The Certificates Screen
On the Certificates screen, you will provide SSL certificate information for well-known delegation, Synapse, Element Web, Synapse Admin, and Integrator.
There are two options.
Option 1: You already host the base domain example.com on a web server. In this case, Well-Known Delegation should be set to Externally Managed.
Element clients need to be able to request https://example.com/.well-known/matrix/client to work properly. The web server hosting the domain name should forward requests to .well-known/matrix/client to the Element Enterprise server so that the wellKnownPod can serve them to the clients.
If that's not possible, the alternative is to copy the well-known file directly onto the example.com web server. The wellKnownPod will still be present but won't be used by any system.
In this option, Well-Known Delegation cannot be set to Certmanager / Let's Encrypt.
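If you opt to copy the well-known file onto the existing example.com web server, the client file is a small JSON document along the lines of the sketch below; matrix.example.com stands in for your Synapse hostname and the web root path is an example, so check the contents served by the wellKnownPod for the exact values your deployment needs.
sudo mkdir -p /var/www/example.com/.well-known/matrix
cat <<'EOF' | sudo tee /var/www/example.com/.well-known/matrix/client
{
  "m.homeserver": {
    "base_url": "https://matrix.example.com"
  }
}
EOF
Web clients also expect this file to be served with an Access-Control-Allow-Origin: * header.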
Option 2: You don't already host the base domain example.com. In this case, the wellKnownPod hosts the well-known file and serves the base domain example.com.
You can choose between these three settings:
- Certmanager / Let's Encrypt: the certificate for the base domain is signed by Let's Encrypt
- Certificate File: the certificate is signed by your own CA or by a public CA (Verisign, Sectigo, ...)
- Existing TLS Certificate in the Cluster: the certificate has already been uploaded into a Secret in the cluster
If you are using Let's Encrypt, then each of the sections should look like:
If you are using certificate files, then you will see a screen like:
which allows you to upload a .crt and .key file for each host. These files must be in PEM encoding. Our installer does accept wildcard certificates.
Once you have completed the certificate section for each host on the page, you may click continue.
The Database Screen
If you have elected to have the installer configure PostgreSQL for you, then you will not see this screen and can skip this section.
If you are using an external database, then you will see this page, where we provide the option to specify the database name, the database host name, the port to connect to, the SSL mode to use, and finally, the username and password to connect with.
If your database is installed on the same server where Element is installed, you have to enter the server's public IP address, since the container does not share the host's network namespace. Entering 127.0.0.1 will resolve to the container itself and cause the installation to fail.
Once you have completed this section, you may click continue.
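To find the address to enter, list the host's non-loopback IP addresses and pick the one that your clients and the cluster can reach:
hostname -I
# or, for more detail:
ip -4 addr show scope global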
The Cluster Screen
Most deployments can ignore this; however, if you want to change any microk8s cluster parameters, this is where to do it.
If you are in an environment where you have self-signed certificates, you will want to disable TLS verification by clicking "Advanced", scrolling down, and unchecking Verify TLS:
Please bear in mind that disabling TLS verification and using self-signed certificates is not recommended for production deployments.
If your host names are not DNS-resolvable, you need to use host aliases, which can be set up here. Click "Advanced" and scroll down to the "Host Aliases" section in "k8s". In here, click "Add Host Aliases" and then specify an IP and the host names that resolve to that IP, as such:
When you are finished with this page, you can click continue.
The Synapse Screen
The first setting that you will come to is our built-in performance profiles. Select the appropriate answers for "Monthly Active Users" and "Federation Type" to apply our best practices based on years of running Matrix homeservers.
The next setting that you will see is whether you want to auto accept invites. The default of "Manual" will fit most use cases, but you are welcome to change this value.
The next setting is the maximum number of monthly active users (MAU) that you have purchased for your server. Your server will not allow you to go past this value. If you set this higher than your purchased MAU and you go over your purchased MAU, you will need to true up with Element to cover the cost of the unpaid users.
The next setting concerns registration. A server with open registration on the open internet can become a target, so we default to closed registration. You will notice that there is a setting called "Custom" and this requires explicit custom settings in the additional configuration section. Unless instructed by Element, you will not need the "Custom" option and should instead pick "Closed" or "Open" depending on your needs.
After this, you will see that the installer has picked an admin password for you. You will want to use the eye icon to view the password and copy it down, as you will use it with the user onprem-admin-donotdelete to log into the admin panel after installation.
Continuing, we see telemetry. You should leave this enabled as you are required to report MAU to Element. In the event that you are installing into an environment without internet access, you may disable this so that it does not continue to try talking to Element. That said, you are still required to generate an MAU report at regular intervals and share that with Element.
For more information on the data that Element collects, please see: What Telemetry Data is Collected by Element?
Next, we have an advanced button, which allows you to configure further settings about Synapse and the Kubernetes environment in which it runs. The additional configuration text box allows you to inject additional Synapse configuration (see homeserver.yaml and the Synapse Configuration Manual).
You can hit continue to go to the next screen.
The Element Web Screen
Most users will be able to simply click "Continue" here.
The Advanced section allows you to set any custom element web configurations you would like (config.json).
A common custom configuration would be configuring permalinks for Element, which we have documented here: Setting up Permalinks With the Installer
Further, it provides access to the k8s section, allowing you to explicitly set any microk8s cluster settings that you would like just for the element-web pod.
The Homeserver Admin
Most users will be able to simply click "Continue" here. The Advanced section allows you to explicitly set any microk8s cluster settings that you would like just for the synapse-admin-ui pod.
One thing to note here is that if you are not using delegated authentication, then the initial username that an administrator will use to log into this dashboard post-installation is onprem-admin-donotdelete. You can find the password for this user on the Synapse page in the "Admin Password" field.
If you are using delegated authentication, you will need to assign a user admin rights as detailed in this article: How do I give a user admin rights when I am using delegated authentication and cannot log into the admin console?
The Integrator Screen
On this page, you can set up Integrator, the integrations manager.
The first option allows you to choose whether users can add custom widgets to their rooms with the integrator or not.
The next option allows you to specify which Jitsi instance the Jitsi widget will create conferences on.
The verify TLS option allows you to set this specifically for Integrator, regardless of what you set on the cluster screen.
The logging section allows you to set the log level and whether the output should be structured or not.
The Advanced section allows you to explicitly set any microk8s cluster settings that you would like just for the integrator pods.
Click "Continue to go to the next screen".
The Integrations Screen
This screen is where you can install any available integrations.
Some of these integrations will have "YAML" next to them. When you see this designation, the integration requires configuration via YAML, much like the old installer. However, with this installer, these YAML files are pre-populated and often only require a few changes.
If you do not see a "YAML" designation next to the integration, this means that you will use regular GUI elements to configure it.
Over time, we will do the work required to move the integrations with "YAML" next to them to the new GUI format.
For specifics on configuring well known delegation, please see Setting Up Well Known Delegation
For specifics on setting up Delegated Authentication, please see Setting up Delegated Authentication With the Installer
For specifics on setting up Group Sync, please see Setting up Group Sync with the Installer
For specifics on setting up GitLab, GitHub, and JIRA integrations, please see Setting up GitLab, GitHub, and JIRA Integrations With the Installer
For specifics on setting up Adminbot and Auditbot, please see: Setting up Adminbot and Auditbot
For specifics on setting up Hydrogen, please see: Setting Up Hydrogen
For specifics on pointing your installation at an existing Jitsi instance, please see Setting Up Jitsi and TURN With the Installer
If you do not have an existing TURN server or Jitsi server, our installer can configure these for you by following the extra steps in Setting Up Jitsi and TURN With the Installer
For specifics on configuring the Teams Bridge, please see Setting Up the Teams Bridge
For specifics on configuring the Telegram Bridge, please see Setting Up the Telegram Bridge
For specifics on configuring the IRC Bridge, please see Setting Up the IRC Bridge
For specifics on configuring the XMPP Bridge, please see Setting Up the XMPP Bridge
Once you have configured all of the integrations that you would like to configure, you can click "Continue" to head to the installation screen.
The Installation Screen
On the installation screen, you should see a blank console and a start button:
Click Start.
After a moment, you will notice the installer hang. If you go back to the prompt where you are running the installer, you will see that you are being asked for the sudo password:
Go ahead and enter the sudo password and the installation will continue.
On the very first time that you run the installer, you will be prompted to log out and back in again to allow Linux group membership changes to be refreshed. This means that you will need to issue a ctrl-C in the terminal running your installer and actually log all the way out of your Linux session, log back in, restart the installer, navigate back to the installer screen, click start again, and then re-enter your sudo password. You will only have to perform this step once per server.
Verifying Your Installation
Once the installation has finished, it can take as much as 15 minutes on a first run for everything to be configured and set up. If you use:
kubectl get pods -n element-onprem
You should see similar output to:
NAME READY STATUS RESTARTS AGE
app-element-web-c5bd87777-rqr6s 1/1 Running 1 29m
server-well-known-8c6bd8447-wddtm 1/1 Running 1 29m
postgres-0 1/1 Running 1 40m
instance-synapse-main-0 1/1 Running 2 29m
instance-synapse-haproxy-5b4b55fc9c-hnlmp 1/1 Running 0 20m
Once the admin console is up and running:
first-element-deployment-synapse-admin-ui-564cbf5665-dn8nv 1/1 Running 1 (4h4m ago) 3d1h
and synapse:
first-element-deployment-synapse-redis-59548698df-gqkcq 1/1 Running 1 (4h4m ago) 3d2h
first-element-deployment-synapse-haproxy-7587dfd6f7-gp6wh 1/1 Running 2 (4h3m ago) 2d23h
first-element-deployment-synapse-appservice-0 1/1 Running 3 (4h3m ago) 3d
first-element-deployment-synapse-main-0 1/1 Running 0 3h19m
then you should be able to log in at your admin panel (in our case https://admin.airgap.local/) with the onprem-admin-donotdelete user and the password that was specified on the "Synapse" screen.
Manually create your first user
If you wish to create users from your terminal, run the following command:
$ kubectl --namespace element-onprem exec --stdin --tty \
first-element-deployment-synapse-main-0 \
-- register_new_matrix_user --config /config/homeserver.yaml
New user localpart: your_username
Password:
Confirm password:
Make admin [no]: yes
Sending registration request...
Success!
Make sure to enter yes on Make admin if you wish to use this user on the installer or standalone Admin page.
Please note, you should be using the Admin page or the Synapse Admin API instead of kubectl/register_new_matrix_user to create subsequent users.
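For reference, a hedged sketch of creating or updating a user through the Synapse Admin API (assuming your homeserver is reachable at https://<synapse_fqdn> and you already have an access token for an existing admin user; names in angle brackets are placeholders) might look like:
curl --request PUT 'https://<synapse_fqdn>/_synapse/admin/v2/users/@alice:example.com' \
  --header 'Authorization: Bearer <admin_access_token>' \
  --header 'Content-Type: application/json' \
  --data '{"password": "<initial password>", "admin": false}'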
A word about Configuration Files
In the new installer, all configuration files are placed in the directory .element-enterprise-server. This can be found in your user's home directory. In this directory, you will find a subdirectory called config that contains the actual configurations.
Running the Installer without the GUI
It is possible to run the installer without using the GUI provided that you have a valid set of configuration files in the .element-enterprise-server/config directory. Directions on how to do this are available at: https://ems-docs.element.io/books/ems-knowledge-base/page/how-do-i-run-the-installer-without-using-the-gui. Using this method, you could use the GUI as a configuration editor and then take the resulting configuration and modify it as needed for further installations.
This method also makes it possible to set things up once and then run future updates without having to use the GUI.
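For example, assuming the configuration under ~/.element-enterprise-server/config is already valid, a non-interactive run uses the same unattended mode referenced in the microk8s upgrade section below:
./<installer>.bin unattended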
Cleaning up images cache
The installer, from version 24.02, comes with the tool crictl, which lets you interact with the microk8s containerd daemon. After upgrading, once all pods are running, you might want to run the following command to clean up old images:
~/.element-enterprise-server/installer/.install-env/bin/crictl -r unix:///var/snap/microk8s/common/run/containerd.sock rmi --prune
Upgrading microk8s
Prior to versions 23.10.35 and 24.04.05
Using the installer to upgrade
Upgrading microk8s relies on uninstalling, rebooting the machine, and reinstalling ESS on the new version. It thus involves downtime.
To upgrade microk8s, please run the installer with: ./<installer>.bin --upgrade-cluster
The machine will reboot during the process. Once it has rebooted, log in as the same user and run: ./<installer>.bin unattended
ESS will be reinstalled on the upgraded microk8s cluster.
Manually upgrading microk8s
The first step in upgrading microk8s to the latest version deployed by the installer is to remove the existing microk8s installation. Given that all of microk8s is managed by a snap, we can do this without worrying about our Element Enterprise On-Premise installation. The important data for your installation is all stored outside of the snap space and will not be impacted by removing microk8s. Start by running:
sudo snap list
and verify that microk8s is installed:
[user@element2 element-enterprise-installer-2022-05.06]$ sudo snap list
Name Version Rev Tracking Publisher Notes
core 16-2.55.5 13250 - canonical✓ core
core18 20220428 2409 - canonical✓ base
microk8s v1.21.13 3410 1.21/stable canonical✓ classic
Once you've made sure that microk8s is installed, remove it by running:
sudo snap remove microk8s
Now at this point, you should be able to verify that microk8s is no longer installed by running:
sudo snap list
and getting output similar to:
[user@element2 element-enterprise-installer-2022-05.06]$ sudo snap list
Name Version Rev Tracking Publisher Notes
core 16-2.55.5 13250 - canonical✓ core
core18 20220428 2409 - canonical✓ base
Now that you no longer have microk8s installed, you are ready to run the latest installer. Once you run the latest installer, it will install the latest version of microk8s.
When the installer finishes, you should see an upgraded version of microk8s installed if you run sudo snap list
similar to:
Name Version Rev Tracking Publisher Notes
core18 20220706 2538 latest/stable canonical✓ base
microk8s v1.24.3 3597 1.24/stable canonical✓ classic
snapd 2.56.2 16292 latest/stable canonical✓ snapd
At this point, you will need to reboot the server to restore proper networking into the microk8s cluster. After a reboot, wait for your pods to start and your Element Enterprise On-Premise installation is now running a later version of microk8s.
From versions 23.10.35, 24.04.05 and 24.05.01 onwards
microk8s will be upgraded gracefully and automatically when the new installer is used. The upgrade involves upgrading the addons, and might involve a couple of minutes of downtime while it runs.
End-User Documentation
After completing the installation you can share our User Guide to help orient and onboard your users to Element!
Single Node Installs: Storage and Backup Guidelines
General storage recommendations for single-node instances
- /data is where the standalone deployment installs PostgreSQL data and Element Deployment data. It should be a distinct mount point.
  - Ideally this would have an independent lifecycle from the server itself
  - Ideally this would be easily snapshot-able, either at a filesystem level or with the backing storage
Adminbot storage:
- Files stored with uid=10006/gid=10006; the default config uses /data/element-deployment/adminbot for single-node instances
- Storage space required is proportional to the number of user devices on the server. 1GB is sufficient for most servers
Auditbot storage:
- Files stored with uid=10006/gid=10006; the default config uses /data/element-deployment/auditbot for single-node instances
- Storage space required is proportional to the number of events tracked.
Synapse storage:
- Media:
  - Files stored with uid=10991/gid=10991; the default config uses /data/element-deployment/synapse for single-node instances
  - Storage space required grows with the number and size of uploaded media. For more information, see ESS Sizing
Postgres (in-cluster) storage:
- Files stored with uid=999/gid=999, default config uses
/data/postgres
for single-node instances
Backup Guidance:
- Adminbot: Backups should be made by taking a snapshot of the PV (ideally) or rsyncing the backing directory to backup storage
- Auditbot: Backups should be made by taking a snapshot of the PV (ideally) or rsyncing the backing directory to backup storage
- Synapse Media: Backups should be made by taking a snapshot of the PV (ideally) or rsyncing the backing directory to backup storage
- Postgres (in-cluster): Backups should be made by running the following (a restore sketch follows this list):
  kubectl -n element-onprem exec -it postgres-synapse-0 -- sh -c 'pg_dump -U $POSTGRES_USER $POSTGRES_DB' > synapse_postgres_backup_$(date +%Y%m%d-%H%M%S).sql
- Postgres (external): Backup procedures as per your DBA
- Configuration: Please ensure that your entire configuration directory (that contains at least parameters.yml & secrets.yml but may also include other sub-directories & configuration files) is regularly backed up
  - The suggested configuration path in Element's documentation is ~/.element-onpremise-config but could be anything. It is whatever directory you used with the installer.
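For reference, a rough restore sketch for the in-cluster Postgres (assuming the same pod name as in the backup command above, an empty target database, and a backup file name of your own; test this in a non-production environment before relying on it):
kubectl -n element-onprem exec -i postgres-synapse-0 -- sh -c 'psql -U $POSTGRES_USER $POSTGRES_DB' < synapse_postgres_backup_<timestamp>.sql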
Configuring Element Desktop
Element Desktop is a Matrix client for desktop platforms with Element Web at its core.
You can download Element Desktop for Mac, Linux or Windows from the Element downloads page.
See https://web-docs.element.dev/ for the Element Web and Desktop documentation.
Aligning Element Desktop with your ESS deployed Element Web
By default, Element Desktop will be configured to point to the Matrix.org homeserver; however, this is configurable by supplying a user-specified config.json.
As Element Desktop is mainly Element Web packaged as a desktop application, this config.json is identical to the config.json ESS will configure and deploy for you at https://<element_web_fqdn>/config.json, so it is recommended to set up Element Desktop using that file directly.
How you do this will depend on your specific environment, but you will need to ensure the config.json is placed in the correct location to be used by Element Desktop:
- %APPDATA%\$NAME\config.json on Windows
- $XDG_CONFIG_HOME/$NAME/config.json or ~/.config/$NAME/config.json on Linux
- ~/Library/Application Support/$NAME/config.json on macOS
In the paths above, $NAME is typically Element, unless you use --profile $PROFILE, in which case it becomes Element-$PROFILE.
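For example, on Linux you could seed the default profile directly from your ESS deployment (a sketch assuming the default profile name Element and that your Element Web config is published at the usual ESS location):
mkdir -p ~/.config/Element
curl -o ~/.config/Element/config.json https://<element_web_fqdn>/config.json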
As Microsoft Windows File Explorer by default hides file extensions, please double check to ensure the config.json does indeed have the .json file extension, not .txt.
Customising your desktop configuration
You may wish to further customise Element Desktop. If the changes you wish to make should not also apply to your ESS deployed Element Web, you will need to add them in addition to your existing config.json.
You can find Desktop specific configuration options, or just customise using any options from the Element Web Config docs.
The Element Desktop MSI
Where to download
Customers who have a subscription to the Enterprise edition of the Element Server Suite (ESS) can download an MSI version of Element Desktop. This version of Element Desktop is by default installed into Program Files (instead of per user) and can be used to deploy into enterprise environments. To download, log in to your EMS Account and access it from the same download page where you'd find the enterprise installer, https://ems.element.io/on-premise/download.
Using the Element Desktop MSI
The Element Desktop MSI can be used to install Element Desktop on all desired machines in your environment. Unlike the usual installer, you can customise its install directory (which now defaults to Program Files).
You can customise the installation directory by setting the APPLICATIONFOLDER property when installing the MSI:
msiexec /i "Element 1.11.66.msi" APPLICATIONFOLDER="C:\Element"
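If you deploy silently through software distribution tooling, you could additionally pass msiexec's standard quiet switch, for example:
msiexec /i "Element 1.11.66.msi" /qn APPLICATIONFOLDER="C:\Element"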
MSI and config.json
Once users run Element for the first time, an Element folder will be created in their AppData profile specific to that user. By using Group Policy, logon scripts, SCCM or whatever other method you like, ensure the desired config.json is present within %APPDATA%\Element. (The config.json can be present prior to the directory's creation.)
Using the Installer in an Air-Gapped Environment
Defining Air-Gapped Environments
An air-gapped environment is any environment in which the running hosts will not have access to the greater internet. This creates a situation in which these hosts are unable to obtain various needed bits of software from Element and are also unable to share telemetry data back with Element.
Some of these environments can be connected to the internet from time to time and updated during those connection periods. In other environments, the hosts are never connected to the internet and everything must be moved over sneaker net.
This guide will cover running the microk8s installer when only sneaker net is available as that is the most restrictive of these environments.
Preparing the media to sneaker net into the air gapped environment
You will need our airgapped dependencies .tar.gz file which you can get from Element:
- element-enterprise-installer-airgapped-<version>-gui.tar.gz
Running the installer in the air gapped environment
Extract the airgapped dependencies to a directory on the machine you are installing from. You obtain the following directories:
- airgapped/pip
- airgapped/galaxy
- airgapped/snaps
- airgapped/containerd
- airgapped/images
Your airgapped machine will still require access to airgapped linux repositories depending on your OS. If using Red Hat Enterprise Linux, you will also need access to the EPEL repository in your airgapped environment.
When using the installer, select "Airgapped" on the first hosts screen.
The Local Registry parameter should be left alone unless you have a separate custom registry that you would like to use.
For the Source directory, you need to specify the absolute path to the airgapped
directory that was extracted from the tarball.
The installer will upload the images automatically to your local registry, and use these references to start the workloads.
If you are doing a kubernetes installation (instead of a single node installation), please note that once the image upload has been done, you will need to copy the airgapped/images/images_digests.yml file to the same path on the machine which will be used to render or deploy Element services. This ensures that the new image digests will be used correctly in the kubernetes manifests used for deployment.
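For example, a sketch of copying that file (assuming SSH access to the deployment machine and that the airgapped tarball was extracted to the same location there; the host name and path are placeholders):
scp airgapped/images/images_digests.yml <deploy-host>:/path/to/airgapped/images/images_digests.yml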
Troubleshooting
Introduction to Troubleshooting
Troubleshooting the Element Installer comes down to knowing a little bit about kubernetes and how to check the status of the various resources. This guide will walk you through some of the initial steps that you'll want to take when things are going wrong.
Known issues
Installer fails and asks you to start firewalld
The current installer will check if you have firewalld installed on your system. If firewalld is installed, the installer expects to find it started as a systemd service; if it is not started, the installer will terminate with a failure asking you to start it. We have noticed that some Linux distributions, such as SLES15P4, RHEL8 and AlmaLinux8, have firewalld installed as a default package but not enabled or started.
If you hit this issue, you don't need to enable and start firewalld. The workaround is to uninstall firewalld, if you are not planning on using it.
On SLES
zypper remove firewalld -y
On RHEL8
dnf remove firewalld -y
Airgapped installation does not start
If you are using element-enterprise-graphical-installer-2023-03.02-gui.bin and element-enterprise-installer-airgapped-2023-03.02-gui.tar.gz, you might run into an error looking like this:
Looking in links: ./airgapped/pip
WARNING: Url './airgapped/pip' is ignored. It is either a non-existing path or lacks a specific scheme.
ERROR: Could not find a version that satisfies the requirement wheel (from versions: none)
ERROR: No matching distribution found for wheel
The workaround for it is to copy the pip folder from the airgapped directory to ~/.element-enterprise-server/installer/airgapped/pip
Failure downloading https://..., An unknown error occurred: ''CustomHTTPSConnection'' object has no attribute ''cert_file''
Make sure you are using a supported operating system version. See https://ems-docs.element.io/books/element-on-premise-documentation-lts-2404/page/requirements-and-recommendations for more details.
install.sh problems
Sometimes there will be problems when running the ansible-playbook portion of the installer. When this happens, you can increase the verbosity of ansible logging by editing .ansible.rc in the installer directory and setting:
export ANSIBLE_DEBUG=true
export ANSIBLE_VERBOSITY=4
and re-running the installer. This will generate quite verbose output, but that typically will help pinpoint what the actual problem with the installer is.
Problems post-installation
Checking Pod Status and Getting Logs
- In general, a well-functioning Element stack has at a minimum the following containers (or pods in kubernetes language) running:
[user@element2 ~]$ kubectl get pods -n element-onprem
NAME READY STATUS RESTARTS AGE
first-element-deployment-element-web-6cc66f48c5-lvd7w 1/1 Running 0 4d20h
first-element-deployment-element-call-c9975d55b-dzjw2 1/1 Running 0 4d20h
integrator-postgres-0 3/3 Running 0 4d20h
synapse-postgres-0 3/3 Running 0 4d20h
first-element-deployment-integrator-59bcfc67c5-jkbm6 3/3 Running 0 4d20h
adminbot-admin-app-element-web-c9d456769-rpk9l 1/1 Running 0 4d20h
auditbot-admin-app-element-web-5859f54b4f-8lbng 1/1 Running 0 4d20h
first-element-deployment-synapse-redis-68f7bfbdc-wht9m 1/1 Running 0 4d20h
first-element-deployment-synapse-haproxy-7f66f5fdf5-8sfkf 1/1 Running 0 4d20h
adminbot-pipe-0 1/1 Running 0 4d20h
auditbot-pipe-0 1/1 Running 0 4d20h
first-element-deployment-synapse-admin-ui-564bb5bb9f-87zb4 1/1 Running 0 4d20h
first-element-deployment-groupsync-0 1/1 Running 0 20h
first-element-deployment-well-known-64d4cfd45f-l9kkr 1/1 Running 0 20h
first-element-deployment-synapse-main-0 1/1 Running 0 20h
first-element-deployment-synapse-appservice-0 1/1 Running 0 20h
The above kubectl get pods -n element-onprem is the first place to start. You'll notice that in the above, all of the pods are in the Running status, which indicates that all should be well. If the state is anything other than "Running" or "Creating", then you'll want to grab logs for those pods. To grab the logs for a pod, run:
kubectl logs -n element-onprem <pod name>
replacing <pod name> with the actual pod name. If we wanted to get the logs from Synapse, the specific syntax would be:
kubectl logs -n element-onprem first-element-deployment-synapse-main-0
and this would generate logs similar to:
2022-05-03 17:46:33,333 - synapse.util.caches.lrucache - 154 - INFO - LruCache._expire_old_entries-2887 - Dropped 0 items from caches
2022-05-03 17:46:33,375 - synapse.storage.databases.main.metrics - 471 - INFO - generate_user_daily_visits-289 - Calling _generate_user_daily_visits
2022-05-03 17:46:58,424 - synapse.metrics._gc - 118 - INFO - sentinel - Collecting gc 1
2022-05-03 17:47:03,334 - synapse.util.caches.lrucache - 154 - INFO - LruCache._expire_old_entries-2888 - Dropped 0 items from caches
2022-05-03 17:47:33,333 - synapse.util.caches.lrucache - 154 - INFO - LruCache._expire_old_entries-2889 - Dropped 0 items from caches
2022-05-03 17:48:03,333 - synapse.util.caches.lrucache - 154 - INFO - LruCache._expire_old_entries-2890 - Dropped 0 items from caches
- Again, for every pod not in the Running or Creating status, you'll want to use the above procedure to get the logs for Element to look at.
- If you don't have any pods in the element-onprem namespace as indicated by running the above command, then you should run:
[user@element2 ~]$ kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-node-2lznr 1/1 Running 0 8d
kube-system calico-kube-controllers-c548999db-s5cjm 1/1 Running 0 8d
kube-system coredns-5dbccd956f-glc8f 1/1 Running 0 8d
kube-system dashboard-metrics-scraper-6b6f796c8d-8x6p4 1/1 Running 0 8d
ingress nginx-ingress-microk8s-controller-w8lcn 1/1 Running 0 8d
cert-manager cert-manager-cainjector-6586bddc69-9xwkj 1/1 Running 0 8d
kube-system hostpath-provisioner-78cb89d65b-djfq5 1/1 Running 0 8d
kube-system kubernetes-dashboard-765646474b-5lhxp 1/1 Running 0 8d
cert-manager cert-manager-5bb9dd7d5d-cg9h8 1/1 Running 0 8d
container-registry registry-f69889b8c-zkhm5 1/1 Running 0 8d
cert-manager cert-manager-webhook-6fc8f4666b-9tmjb 1/1 Running 0 8d
kube-system metrics-server-5f8f64cb86-f876p 1/1 Running 0 8d
jitsi sysctl-jvb-vs9mn 1/1 Running 0 8d
jitsi shard-0-jicofo-7c5cd9fff5-qrzmk 1/1 Running 0 8d
jitsi shard-0-web-fdd565cd6-v49ps 1/1 Running 0 8d
jitsi shard-0-web-fdd565cd6-wmzpb 1/1 Running 0 8d
jitsi shard-0-prosody-6d466f5bcb-5qsbb 1/1 Running 0 8d
jitsi shard-0-jvb-0 1/2 Running 0 8d
operator-onprem element-operator-controller-manager-... 2/2 Running 0 4d
updater-onprem element-updater-controller-manager-... 2/2 Running 0 4d
element-onprem first-element-deployment-element-web-... 1/1 Running 0 4d
element-onprem first-element-deployment-element-call-... 1/1 Running 0 4d
element-onprem integrator-postgres-0 3/3 Running 0 4d
element-onprem synapse-postgres-0 3/3 Running 0 4d
element-onprem first-element-deployment-integrator-... 3/3 Running 0 4d
element-onprem adminbot-admin-app-element-web-... 1/1 Running 0 4d
element-onprem auditbot-admin-app-element-web-... 1/1 Running 0 4d
element-onprem first-element-deployment-synapse-redis-... 1/1 Running 0 4d
element-onprem first-element-deployment-synapse-haproxy-.. 1/1 Running 0 4d
element-onprem adminbot-pipe-0 1/1 Running 0 4d
element-onprem auditbot-pipe-0 1/1 Running 0 4d
element-onprem first-element-deployment-synapse-admin-ui-. 1/1 Running 0 4d
element-onprem first-element-deployment-groupsync-0 1/1 Running 0 20h
element-onprem first-element-deployment-well-known-... 1/1 Running 0 20h
element-onprem first-element-deployment-synapse-main-0 1/1 Running 0 20h
element-onprem first-element-deployment-synapse-appservice-0 1/1 Running 0 20h
- This is the output from a healthy system, but if you have any of these pods not in the Running or Creating state, then please gather logs using the following syntax:
kubectl logs -n <namespace> <pod name>
- So to gather logs for the kubernetes ingress, you would run:
kubectl logs -n ingress nginx-ingress-microk8s-controller-w8lcn
and you would see logs similar to:
I0502 14:15:08.467258 6 leaderelection.go:248] attempting to acquire leader lease ingress/ingress-controller-leader...
I0502 14:15:08.467587 6 controller.go:155] "Configuration changes detected, backend reload required"
I0502 14:15:08.481539 6 leaderelection.go:258] successfully acquired lease ingress/ingress-controller-leader
I0502 14:15:08.481656 6 status.go:84] "New leader elected" identity="nginx-ingress-microk8s-controller-n6wmk"
I0502 14:15:08.515623 6 controller.go:172] "Backend successfully reloaded"
I0502 14:15:08.515681 6 controller.go:183] "Initial sync, sleeping for 1 second"
I0502 14:15:08.515705 6 event.go:282] Event(v1.ObjectReference{Kind:"Pod", Namespace:"ingress", Name:"nginx-ingress-microk8s-controller-n6wmk", UID:"548d9478-094e-4a19-ba61-284b60152b85", APIVersion:"v1", ResourceVersion:"524688", FieldPath:""}): type: 'Normal' reason: 'RELOAD' NGINX reload triggered due to a change in configuration
Again, for all pods not in the Running or Creating state, please use the above method to get log data to send to Element.
Default administrator
The installer creates a default administrator, onprem-admin-donotdelete.
The Synapse admin user password is defined under the Synapse section in the installer.
Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress
Delete the updater namespace and Deploy again.
kubectl delete namespaces updater-onprem
microk8s takes a long time to become ready after system boot
See https://ems-docs.element.io/link/109#bkmrk-kernel-modules
Node-based pods failing name resolution
05:03:45:601 ERROR [Pipeline] Unable to verify identity configuration for bot-auditbot: Unknown errcode Unknown error
05:03:45:601 ERROR [Pipeline] Unable to verify identity. Stopping
matrix-pipe encountered an error and has stopped Error: getaddrinfo EAI_AGAIN synapse.prod.ourdomain
at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:84:26) {
errno: -3001,
code: 'EAI_AGAIN',
syscall: 'getaddrinfo',
hostname: 'synapse.prod.ourdomain'
}
To see what hosts are set, try:
kubectl exec -it -n element-onprem <pod name> -- getent hosts
So to do this on the adminbot-pipe-0 pod, it would look like:
kubectl exec -it -n element-onprem adminbot-pipe-0 -- getent hosts
and return output similar to:
127.0.0.1 localhost
127.0.0.1 localhost ip6-localhost ip6-loopback
10.1.241.27 adminbot-pipe-0
192.168.122.5 ems.onprem element.ems.onprem hs.ems.onprem adminbot.ems.onprem auditbot.ems.onprem integrator.ems.onprem hookshot.ems.onprem admin.ems.onprem eleweb.ems.onprem
Node-based pods failing SSL
2023-02-06 15:42:04 ERROR: IrcBridge Failed to fetch roomlist from joined rooms: Error: unable to verify the first certificate. Retrying
MatrixHttpClient (REQ-13) Error: unable to verify the first certificate
at TLSSocket.onConnectSecure (_tls_wrap.js:1515:34)
at TLSSocket.emit (events.js:400:28)
at TLSSocket.emit (domain.js:475:12)
at TLSSocket. finishInit (_tls_wrap.js:937:8),
at TLSWrap.ssl.onhandshakedone (_tls_wrap.js:709:12) {
code: 'UNABLE TO VERIFY LEAF SIGNATURE
Drop into a shell on the pod
kubectl exec -it -n element-onprem adminbot-pipe-0 -- /bin/sh
Check its ability to send a request to the Synapse server from the Node REPL:
node
const https = require("https");
https.get("https://synapse.server/", (res) => console.log(res.statusCode)).on("error", console.error);
Reconciliation failing / Enable enhanced updater logging
If your reconciliation is failing, a good place to start is with the updater logs
kubectl --namespace updater-onprem logs \
"$(kubectl --namespace updater-onprem get pods --no-headers \
--output=custom-columns="NAME:.metadata.name" | grep controller)" \
--since 10m
If that doesn't have the answers you seek, for example
TASK [Build all components manifests] ********************************
fatal: [localhost]: FAILED! => {"censored": "the output has been hidden due to
the fact that 'no_log: true' was specified for this result"}
You can enable debug logging by editing the updater deployment
kubectl --namespace updater-onprem edit \
deploy/element-updater-controller-manager
In this file, search for env and add this variable to all occurrences:
- name: DEBUG_MANIFESTS
value: "1"
Wait a bit for the updater to re-run and then fetch the updater logs again. Look for fatal, or to get the stdout from Ansible, look for Ansible Task StdOut. See also Unhealthy deployment below.
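For example, to pull just the Ansible task output out of the updater logs, you could filter the same log command through grep (grep -A prints a number of trailing context lines after each match):
kubectl --namespace updater-onprem logs \
  "$(kubectl --namespace updater-onprem get pods --no-headers \
  --output=custom-columns="NAME:.metadata.name" | grep controller)" \
  --since 10m | grep -A 20 'Ansible Task StdOut'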
Click for a specific example
I had this "unknown playbook failure"
After enabling debug logging for the updater, I found this error telling me that my Telegram bridge is misconfigured
--------------------------- Ansible Task StdOut -------------------------------
TASK [Build all components manifests] ********************************
fatal: [localhost]: FAILED! => {"msg": "The task includes an option with an
undefined variable. The error was: 'dict object' has no attribute
'telegramApiId'. 'dict object' has no attribute 'telegramApiId'. 'dict object'
has no attribute 'telegramApiId'. 'dict object' has no attribute
'telegramApiId'. 'dict object' has no attribute 'telegramApiId'. 'dict object'
has no attribute 'telegramApiId'. 'dict object' has no attribute
'telegramApiId'. 'dict object' has no attribute 'telegramApiId'\n\nThe error
appears to be in '/element.io/roles/elementdeployment/tasks/prepare.yml': line
21, column 3, but may\nbe elsewhere in the file depending on the exact syntax
problem.\n\nThe offending line appears to be:\n\n\n- name: \"Build all
components manifests\"\n ^ here\n"}
Unhealthy deployment
kubectl get elementdeployment --all-namespaces --output yaml
In the status you will see which component is having an issue. You can then do
kubectl --namespace element-onprem get <kind>/<name> --output yaml
And you would see the issue in the status.
Other Commands of Interest
Some other commands that may yield some interesting data while troubleshooting are:
Check list of active kubernetes events
kubectl get events -A
You will see a list of events or the message No resources found.
- Show the state of services in the element-onprem namespace:
kubectl get services -n element-onprem
This should return output similar to:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
postgres ClusterIP 10.152.183.47 <none> 5432/TCP 6d23h
app-element-web ClusterIP 10.152.183.60 <none> 80/TCP 6d23h
server-well-known ClusterIP 10.152.183.185 <none> 80/TCP 6d23h
instance-synapse-main-headless ClusterIP None <none> 80/TCP 6d23h
instance-synapse-main-0 ClusterIP 10.152.183.105 <none> 80/TCP,9093/TCP,9001/TCP 6d23h
instance-synapse-haproxy ClusterIP 10.152.183.78 <none> 80/TCP 6d23h
Connect to the Synapse Database
kubectl --namespace element-onprem exec --stdin --tty synapse-postgres-0 -- bash
psql "dbname=$POSTGRES_DB user=$POSTGRES_USER password=$POSTGRES_PASSWORD"
Excessive Synapse Database Space Usage
Connect to the Synapse database.
SQL queries are provided for reference only. Ensure you fully understand what they do before running them, and use at your own risk.
List tables ordered by size
SELECT
schemaname AS table_schema,
relname AS table_name,
pg_size_pretty(pg_relation_size(relid)) AS data_size
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_relation_size(relid) DESC;
Example output
table_schema | table_name | data_size
--------------+---------------------------------------+-----------
public | event_json | 2090 MB
public | event_auth | 961 MB
public | events | 399 MB
public | current_state_delta_stream | 341 MB
public | state_groups_state | 294 MB
public | room_memberships | 270 MB
public | cache_invalidation_stream_by_instance | 265 MB
public | stream_ordering_to_exterm | 252 MB
public | state_events | 249 MB
public | event_edges | 208 MB
(10 rows)
Count unique values in a table ordered by count
This example counts events per room from the event_json table (where all your messages etc. are stored). This may take a while to run and may use a lot of system resources.
SELECT
room_id,
COUNT(*) AS count
FROM event_json
GROUP BY room_id
ORDER BY count DESC
LIMIT 10;
Example output
room_id | count
---------------------------------+---------
!GahmaiShiezefienae:example.com | 1382242
!gutheetheixuFohmae:example.com | 1933
!OhnuokaiCoocieghoh:example.com | 357
!efaeMegazeeriteibo:example.com | 175
!ohcahTueyaesiopohc:example.com | 93
!ithaeTaiRaewieThoo:example.com | 43
!PhohkuShuShahhieWa:example.com | 39
!eghaiPhetahHohweku:example.com | 37
!faiLeiZeefirierahn:example.com | 29
!Eehahhaepahzooshah:example.com | 27
(10 rows)
In this instance, something unusual might be going on in !GahmaiShiezefienae:example.com that warrants further investigation.
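If you want more context on such a room, one option (a sketch, assuming you have an access token for a server admin and that Synapse is reachable at https://<synapse_fqdn>) is the Synapse Admin API room details endpoint:
curl --header 'Authorization: Bearer <admin_access_token>' \
  'https://<synapse_fqdn>/_synapse/admin/v1/rooms/!GahmaiShiezefienae:example.com'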
Export logs from all Synapse pods to a file
This will export logs from the last 5 minutes.
for pod in $(kubectl --namespace element-onprem get pods --no-headers \
--output=custom-columns="NAME:.metadata.name" | grep '\-synapse')
do
echo "$pod" >> synapse.log
kubectl --namespace element-onprem logs "$pod" --since 5m >> synapse.log
done
Grep all configmaps
for configmap in $(kubectl --namespace element-onprem get configmaps --no-headers --output=custom-columns="NAME:.metadata.name"); do
kubectl --namespace element-onprem describe configmaps "$configmap" \
| grep --extended-regex '(host|password)'
done
List Synapse pods, sorted by pod age/creation time
kubectl --namespace element-onprem get pods --sort-by 'metadata.creationTimestamp' | grep --extended-regex '(NAME|-synapse)'
Matrix Authentication Service admin
If your server uses Matrix Authentication Service (MAS), you might occasionally need to interact with it directly. This can be done either using the MAS Admin API or using mas-cli.
Here is a one-liner for connecting to mas-cli:
kubectl --namespace element-onprem exec --stdin --tty \
"$(kubectl --namespace element-onprem get pods \
--output=custom-columns='NAME:.metadata.name' \
| grep first-element-deployment-matrix-authentication-service)" \
-- mas-cli help
Alternately, to make it easier, you can create an alias:
alias mas-cli='kubectl --namespace element-onprem exec --stdin --tty \
"$(kubectl --namespace element-onprem get pods \
--output=custom-columns="NAME:.metadata.name" \
| grep first-element-deployment-matrix-authentication-service)" \
-- mas-cli '
Redeploy the microk8s setup
It is possible to redeploy microk8s by running the following command as root:
snap remove microk8s
This command removes all microk8s pods and related microk8s storage volumes. Once this command has been run, you need to reboot your server - otherwise you may have networking issues. Add the --purge flag to remove the data if disk usage is a concern.
After the reboot, you can re-run the installer and have it re-deploy microk8s and Element Enterprise On-Premise for you.
Show all persistent volumes and persistent volume claims for the element-onprem namespace
kubectl get pv -n element-onprem
This will give you output similar to:
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-fc3459f0-eb62-4afa-94ce-7b8f8105c6d1 20Gi RWX Delete Bound container-registry/registry-claim microk8s-hostpath 8d
integrator-postgres 5Gi RWO Recycle Bound element-onprem/integrator-postgres microk8s-hostpath 8d
synapse-postgres 5Gi RWO Recycle Bound element-onprem/synapse-postgres microk8s-hostpath 8d
hostpath-synapse-media 50Gi RWO Recycle Bound element-onprem/first-element-deployment-synapse-media microk8s-hostpath 8d
adminbot-bot-data 10M RWO Recycle Bound element-onprem/adminbot-bot-data microk8s-hostpath 8d
auditbot-bot-data 10M RWO Recycle Bound element-onprem/auditbot-bot-data microk8s-hostpath 8d
Show deployments in the element-onprem namespace
kubectl get deploy -n element-onprem
This will return output similar to:
NAME READY UP-TO-DATE AVAILABLE AGE
app-element-web 1/1 1 1 6d23h
server-well-known 1/1 1 1 6d23h
instance-synapse-haproxy 1/1 1 1 6d23h
Show hostname to IP mappings from within a pod
Run:
kubectl exec -n element-onprem <pod_name> -- getent hosts
and you will see output similar to:
127.0.0.1 localhost
127.0.0.1 localhost ip6-localhost ip6-loopback
10.1.241.30 instance-hookshot-0.instance-hookshot.element-onprem.svc.cluster.local instance-hookshot-0
192.168.122.5 ems.onprem element.ems.onprem hs.ems.onprem adminbot.ems.onprem auditbot.ems.onprem integrator.ems.onprem hookshot.ems.onprem admin.ems.onprem eleweb.ems.onprem
This will help you troubleshoot host resolution.
Show the Element Web configuration
kubectl describe cm -n element-onprem app-element-web
and this will return output similar to:
config.json:
----
{
"default_server_config": {
"m.homeserver": {
"base_url": "https://synapse2.local",
"server_name": "local"
}
},
"dummy_end": "placeholder",
"integrations_jitsi_widget_url": "https://dimension.element2.local/widgets/jitsi",
"integrations_rest_url": "https://dimension.element2.local/api/v1/scalar",
"integrations_ui_url": "https://dimension.element2.local/element",
"integrations_widgets_urls": [
"https://dimension.element2.local/widgets"
]
}
Show the nginx configuration for Element Web (if using nginx as your ingress controller in production or using the PoC installer):
kubectl describe cm -n element-onprem app-element-web-nginx
and this will return output similar to:
server {
listen 8080;
add_header X-Frame-Options SAMEORIGIN;
add_header X-Content-Type-Options nosniff;
add_header X-XSS-Protection "1; mode=block";
add_header Content-Security-Policy "frame-ancestors 'self'";
add_header X-Robots-Tag "noindex, nofollow, noarchive, noimageindex";
location / {
root /usr/share/nginx/html;
index index.html index.htm;
charset utf-8;
}
}
Show the status of all namespaces
kubectl get namespaces
which will return output similar to:
NAME STATUS AGE
kube-system Active 20d
kube-public Active 20d
kube-node-lease Active 20d
default Active 20d
ingress Active 6d23h
container-registry Active 6d23h
operator-onprem Active 6d23h
element-onprem Active 6d23h
Show the status of the stateful sets in the element-onprem namespace
kubectl get sts -n element-onprem
This should return output similar to:
NAME READY AGE
postgres 1/1 6d23h
instance-synapse-main 1/1 6d23h
Show the Synapse configuration
Click to see commands for installers prior to version 2023-05.05
For installers prior to 2022-05.06, use:
kubectl describe cm -n element-onprem first-element-deployment-synapse-shared
For the 2022-05.06 installer and later, use:
kubectl -n element-onprem get secret synapse-secrets -o yaml 2>&1 | grep shared.yaml | awk -F 'shared.yaml: ' '{print $2}' - | base64 -d
For the 2023-05.05 installer and later, use:
kubectl --namespace element-onprem get \
secrets first-element-deployment-synapse-main --output yaml | \
grep instance_template.yaml | awk '{print $2}' | base64 --decode
Verify DNS names and IPs in certificates
In the certs directory under the configuration directory, run:
for i in $(ls *crt); do echo $i && openssl x509 -in $i -noout -text | grep DNS; done
This will give you output similar to:
local.crt
DNS:local, IP Address:192.168.122.118, IP Address:127.0.0.1
synapse2.local.crt
DNS:synapse2.local, IP Address:192.168.122.118, IP Address:127.0.0.1
and this will allow you to verify that you have the right host names and IP addresses in your certificates.
View the MAU Settings in Synapse
kubectl get -n element-onprem secrets/synapse-secrets -o yaml | grep -i shared.yaml -m 1| awk -F ': ' '{print $2}' - | base64 -d
which will return output similar to:
# Local custom settings
mau_stats_only: true
limit_usage_by_mau: False
max_mau_value: 1000
mau_trial_days: 2
mau_appservice_trial_days:
chatterbox: 0
enable_registration_token_3pid_bypass: true
Integration issues
GitHub not sending events
You can trace webhook calls from your GitHub application under Settings / Developer settings / GitHub Apps.
Select your GitHub App, click on Advanced, and you should see queries issued by your app under Recent Deliveries.
Updater and Operator in ImagePullBackOff state
Check EMS Image Store Username and Token
Check to see if you can pull the Docker image:
kubectl get pods -l app.kubernetes.io/instance=element-operator-controller-manager -n operator-onprem -o yaml | grep 'image:'
Grab the entry like image: gitlab-registry.matrix.org/ems-image-store/standard/kubernetes-operator@sha256:305c7ae51e3b3bfbeff8abf2454b47f86d676fa573ec13b45f8fa567dc02fcd1 and then try pulling it manually, which should look like:
microk8s.ctr image pull gitlab-registry.matrix.org/ems-image-store/standard/kubernetes-operator@sha256:305c7ae51e3b3bfbeff8abf2454b47f86d676fa573ec13b45f8fa567dc02fcd1 -u <EMS Image Store username>:<EMS Image Store token>
Setting up Permalinks With the Installer
Element Extra Configurations
Please go to the "Element Web" page of the installer, click on "Advanced" and add the following to "Additional Configuration":
{
"permalinkPrefix": "https://<element fqdn>"
}
- Re-run the installer.
Setting Up Well Known Delegation
Well Known Delegation Configuration
From the Installer's Integrations page, click "Install" under "Well-Known Delegation".
Add any client configuration here:
A sample client configuration might look like:
{
"im.vector.riot.jitsi": {
"preferredDomain": "jitsi.dev.local"
}
}
Add any server configuration here:
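For example, a typical server configuration (served at /.well-known/matrix/server and used to delegate federation to your Synapse host) might look like this, using the same base domain naming as the troubleshooting examples below:
{
    "m.server": "matrix.my.base.domain:443"
}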
- Re-run the installer for the changes to take effect.
Troubleshooting the Well Known config
The clients and servers will need to be able to access these configuration settings. You can check if everything is in place with curl. The following request is useful if your base domain is actually the same as your main webserver. This curl goes directly to the ingress of the Kubernetes cluster, which is implemented with nginx. Keeping the request header as "my.base.domain" allows nginx to route the request to the correct pod.
$ curl -X GET --header "HOST: my.base.domain" "https://matrix.my.base.domain/.well-known/matrix/client"
{
"io.element.e2ee": {
"default": false
},
"m.homeserver": {
"base_url": "https://matrix.my.base.domain"
}
}
The above shows a correctly set up well-known response for the direct request to the cluster. In some setups there is a web server in front of your Element installation. In these cases the main web server should implement a reverse proxy for everything under https://my.base.domain/.well-known/matrix/ and forward all of these requests to https://matrix.my.base.domain/.well-known/matrix/ instead. If the main web server runs Apache, the config would look like this:
ProxyPass /.well-known/matrix/ https://matrix.MYBASEDOMAIN/.well-known/matrix/
ProxyPassReverse /.well-known/matrix/ https://matrix.MYBASEDOMAIN/.well-known/matrix/
ProxyPreserveHost On
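If your front web server is nginx rather than Apache, an equivalent reverse proxy block (a sketch to adapt to your existing server block) might look like:
location /.well-known/matrix/ {
    proxy_pass https://matrix.MYBASEDOMAIN/.well-known/matrix/;
    proxy_set_header Host $host;
}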
This is the check:
$ curl -X GET https://my.base.domain/.well-known/matrix/client
{
"io.element.e2ee": {
"default": false
},
"m.homeserver": {
"base_url": "https://matrix.my.base.domain"
}
}
You can check the ingress logs. Verify that the request reaches nginx and check for the correct path. Replace ${XXXX} with the actual name in your deployment ($ kubectl get pods -A will reveal that name).
$ kubectl logs nginx-ingress-microk8s-controller-${XXXX} -n ingress
...
Setting up Delegated Authentication With the Installer
Delegated Authentication
At present, we support delegating the authentication of users to the following provider interfaces:
- LDAP
- SAML
- OIDC
- CAS
When enabling Delegated Auth, you can still allow local users managed by Element to connect to the instance. When Allow Local Users Login is set to Enabled, you can connect to your instance using both your IdP and the local database.
Different options are offered by the installer and you can combine two or more options on the same instance, for example enabling both SAML and OIDC delegated authentication.
Setting up Delegated Authentication with LDAP on Windows AD
Setting up Delegated Authentication with OpenID on Microsoft Azure
Setting up Delegated Authentication with OpenID on Microsoft AD FS
Setting up Delegated Authentication with SAML on Microsoft Azure
Note: We are rapidly working to expand and improve this documentation. For now, we are providing screenshots of working configurations, but in the future, we will better explain the options as well. If you do not see your provider listed below, please file a support ticket or reach out to your Element representative and we will work to get you connected and our documentation updated.
Troubleshooting
Redirection loop on SSO
Synapse needs to have the X-Forwarded-For
and X-Forwarded-Proto
headers set by the reverse proxy doing the TLS termination. If you are using a Kubernetes installation with your own reverse proxy terminating TLS, please make sure that the appropriate headers are set.
Integrations and Add-Ons
Setting Up Jitsi and TURN With the Installer
Configure the Installer to install Jitsi and TURN
Prerequisites
Firewall
You will have to open the following ports to your microk8s host (or k8s cluster) to enable coturn and jitsi.
For jitsi:
- 30301/tcp
- 30300/udp
For coturn, allow the following ports:
- 3478/tcp
- 3478/udp
- 5349/tcp
- 5349/udp
You will also have to allow the following port range, depending on the settings you define in the installer (see below):
- <coturn min port>-<coturn max port>/udp
DNS
The jitsi and coturn domain names must resolve to the VM access IP. You must not use host_aliases for these hosts to resolve to the private IP locally on your setup.
Coturn
From the Installer's Integrations page, click "Install" under "Coturn".
For the coturn.yml presented by the installer, edit the file and ensure the following values are set:
- coturn_fqdn: The access address to coturn. It should match something like coturn.<fqdn.tld>. It must resolve to the public-facing IP of the VM.
- shared_secret: A random value; you can generate it with pwgen 32
- min_port: The minimal UDP port used by coturn for relaying UDP packets, in range 32769-65535
- max_port: The maximum UDP port used by coturn for relaying UDP packets, in range 32769-65535
Further, if you are using your own certificates instead of letsencrypt for the coturn_fqdn, you will need to provide certificates to the installer outside of the GUI. Please find your ~/.element-enterprise-server/config directory and create a directory called ~/.element-enterprise-server/config/legacy/certs under which to put a .crt/.key PEM encoded certificate for this fqdn. If your fqdn was coturn.airgap.local, your filenames would need to be coturn.airgap.local.crt and coturn.airgap.local.key. You will need to have these certificate files in place before running the installer.
Jitsi
From the Installer's Integrations page, click "Install" under "Jitsi".
For the jitsi.yml presented by the installer, edit the file and ensure the following values are set:
- jitsi_fqdn: The access address to jitsi. It should match something like jitsi.<fqdn.tld>. It must resolve to the public-facing IP of the VM.
- jicofo_auth_password: a secret internal password for jicofo auth
- jicofo_component_secret: a secret internal password for the jicofo component
- jvb_auth_password: a secret internal password for jvb
- helm_override_values: {} # if needed, to override helm settings automatically set by the installer. For Helm values that can be overridden, see https://vector-im.github.io/jitsi-helm/ ; for environment variables that can be passed in via Helm overrides, see https://jitsi.github.io/handbook/docs/devops-guide/devops-guide-docker/
- timezone: Europe/Paris # The timezone in TZ format
- stun_servers: Needed if you don't set up coturn using the installer. Should be a YAML list of server:port entries, for example:
  stun_servers:
    - ip:port
    - ip:port
Further, for the jitsi_fqdn, you will need to provide .crt/.key PEM encoded certificates. These can be entered in the installer UI. If your fqdn was jitsi.airgap.local, your filenames would need to be jitsi.airgap.local.crt and jitsi.airgap.local.key. You will need to edit the file name field in the UI before pressing the "Choose File" button when selecting the certificates.
If your network does not have any NAT, Jitsi cannot use the local coturn server to determine the IP it should advertise to the users. In this case, you might have issues with your calls and video. To work around this, you can use the following configuration:
provide_node_address_as_public_ip: true
helm_override_values:
jvb:
extraEnvs:
- name: JVB_ADVERTISE_IPS
value: "public ip of jitsi"
- name: JVB_ADVERTISE_PRIVATE_CANDIDATES
value: "true"
Element
Please go to the "Element Web" page of the installer, click on "Advanced" and add the following to "Additional Configuration":
{
"jitsi": {
"preferred_domain": "<jitsi_fqdn>"
}
}
In the above text, you will want to replace <jitsi_fqdn> with the actual fqdn.
Configure the installer to use an existing Jitsi instance
Please go to the "Element Web" page of the installer, click on "Advanced" and add the following to "Additional Configuration":
{
"jitsi": {
"preferred_domain": "your.jitsi.example.org"
}
}
replacing your.jitsi.example.org with the hostname of your Jitsi server.
You will need to re-run the installer for this change to take effect.
Setting up Group Sync with the Installer
What is Group Sync?
Group Sync allows you to use the ACLs from your identity infrastructure in order to set up permissions on Spaces and Rooms in the Element Ecosystem. Please note that the initial version we are providing only supports a single node, non-federated configuration.
Configuring Group Sync
From the Installer's Integrations page, click "Install" under "Group Sync".
- Leaving Dry Run checked in combination with Logging Level set to Debug gives you the ability to visualize in the pod's log file what result Group Sync will produce, without actually creating spaces and potentially corrupting your database. Otherwise, uncheck Dry Run to create spaces according to the space mappings defined in the Space mapping section.
- Auto invite groupsync users to public room determines whether users will be automatically invited to rooms (default, public and space-joinable). Users will still get invited to spaces regardless of this setting.
Configuring the source
LDAP Servers
- You should create an LDAP account with read access.
- This account should use password authentication.
- LDAP Base DN: the distinguished name of the root level Org Unit in your LDAP directory. In our example, Demo Corp is our root level; spaces are mapped against Org Units, but you can map a space against any object (groups, security groups, ...) belonging to this root level. The root level must contain all the Users, Groups and OUs used in the space mapping.
The distinguished name can be displayed by selecting View / Advanced Features in the Active Directory console and then right-clicking on the object and selecting Properties / Attributes Editor.
The DN is OU=Demo corp,DC=olivier,DC=sales-demos,DC=element,DC=io.
-
Mapping attribute for room name
: LDAP attribute used to give an internal ID to the space (visible when setting the log in debug mode) -
Mapping attribute for username
: LDAP attribute likesAMAccountName
used to map the localpart of the mxid against the value of this attribute.If
@bob:my-domain.org
is the mxid,bob
is the localpart and groupsync expects to match this value in the LDAP attributesAMAccountName
. -
LDAP Bind DN
: the distinguished name of the LDAP account with read access. -
Check interval in seconds
: the frequency Group sync refreshes the space mapping in Element. -
LDAP Filter
: an LDAP filter to filter out objects under the LDAP Base DN. The filter must be able to capture Users, Groups and OUs used in the space mapping. -
LDAP URI
: the URI of your LDAP server. -
LDAP Bind Password
: the password of the LDAP account with read access.
MS Graph (Azure AD)
-
You need to create an
App registration
. You'll need theTenant ID
of the organization, theApplication (client ID)
and a secret generated fromCertificates & secrets
on the app. -
For the bridge to be able to operate correctly, navigate to API permissions and ensure it has access to Group.Read.All, GroupMember.Read.All and User.Read.All. Ensure that these are Application permissions (rather than Delegated).
-
Remember to grant the admin consent for those.
-
To use MSGraph source, select MSGraph as your source.
-
msgraph_tenant_id
: This is the "Tenant ID" from your Azure Active Directory Overview -
msgraph_client_id
: Register your app in "App registrations". This will be its "Application (client) ID" -
msgraph_client_secret
: Go to "Certificates & secrets", and click on "New client secret". This will be the "Value" of the created secret (not the "Secret ID").
-
Space Mapping
The space mapping mechanism allows us to configure spaces that Group Sync will maintain, beyond the ones that you can create manually.
It is optional – the configuration can be skipped – but if you enable Group Sync, you have to edit the Space mapping by clicking on the EDIT button and rename the (unnamed space) to something meaningful.
Include all users in the directory in this space: all available users, regardless of group memberships, join the space. This option is convenient when creating a common subspace shared between all users.
With Add new space, you can leave the space as a top level space or you can drag and drop this space onto an existing space, making it a subspace of the existing space. You can then map an external ID (the LDAP distinguished name) against a power level. Every user belonging to this external ID is granted the power level set in the interface. This external ID can be any LDAP object like an OrgUnit, a Group or a Security Group. The external ID is case-sensitive.
A power level 0 is a default user that can write messages, react to messages and delete their own messages.
A power level 50 is a moderator that can create rooms and delete messages from members.
A power level 100 is an administrator, but since Group Sync manages spaces and invitations to the rooms, it does not make sense to map a group against a power level of 100.
Custom power levels other than 0 and 50 are not supported yet.
Users allowed in every GroupSync room
A list of userid patterns that will not get kicked from rooms even if they don't belong to them according to LDAP.
This is useful for things like the auditbot, if Auditbot has been enabled.
Patterns listed here will be wrapped in ^ and $ before matching.
Default Rooms
Setting up GitLab, GitHub, JIRA and Webhooks Integrations With the Installer
In Element Server Suite, our GitLab, GitHub, and JIRA extensions are provided by the hookshot package. This documentation explains how to configure hookshot.
Configuring Hookshot with the Installer
From the Installer's Integrations page, click "Install" under "Hookshot: Github, Gitlab, Jira, and Custom Webhooks."
On the first screen here, we can set the logging level and a hookshot-specific verify TLS setting. Most users can leave these alone.
To use hookshot, you will need to generate a hookshot password key, which can be done by running the following command on a Linux command line:
openssl genpkey -out passkey.pem -outform PEM -algorithm RSA -pkeyopt rsa_keygen_bits:4096
which will generate output similar to this:
..................................................................................................................................................................++++
......................................................................................++++
Once this has finished, you will have a file called passkey.pem that you can upload as the "Hookshot Password key".
If you wish to change the hookshot provisioning secret, you can, but you can also leave it alone as it is randomly generated by the installer.
Next, we get to a set of settings that allow us to make changes to the Hookshot bot's appearance.
There is also a button to show widget settings, which brings up these options:
In this form, we have the ability to control how widgets are incorporated into rooms (the defaults are usually fine) and to set a list of disallowed IP ranges wherein widgets will not load if the homeserver IP falls in the range. If your homeserver's IP falls in any of these ranges, you will want to remove that range so that the widgets will load!
Next, we have the option to enable Gitlab, which shows us the following settings:
The webhook secret is randomly generated and does not need to be changed. You can also add Gitlab instances by specifying an instance name and pasting the URL.
Next, we have the option to enable Jira, which shows us the following settings:
In here, we can specify the OAuth Client ID and the OAuth client secret to connect to Jira. To obtain this information, please follow these steps:
The JIRA service currently only supports atlassian.com (JIRA SaaS) when handling user authentication. Support for on-premise deployments is hoped to land soon.
- You'll first need to head to https://developer.atlassian.com/console/myapps/create-3lo-app/ to create a "OAuth 2.0 (3LO)" integration.
- Once named and created, you will need to:
- Enable the User REST, JIRA Platform REST and User Identity APIs under Permissions.
- Use rotating tokens under Authorisation.
- Set a callback url. This will be the public URL to hookshot with a path of /jira/oauth.
- Copy the client ID and Secret from Settings
Once you've set these, you'll notice that a webhook secret has been randomly generated for you. You can leave this alone or edit it if you desire.
Next, let's look at configuring Webhooks:
You can set whether or not webhooks are enabled and whether they allow JS transformation functions. It is good to leave these enabled per the defaults. You can also specify the user ID prefix for the creation of custom webhooks. If you set this to webhook_ then each new webhook will appear in a room with a username starting with webhook_.
Next, let's look at configuring Github:
This bridge requires a GitHub App, which you will need to create. Once you have created it, you'll be able to fill in the Auth ID and OAuth Client ID. You will also need to generate a "Github application key file" and upload it here. Further, you will need to specify a "Github OAuth client secret" and a "Github webhook secret", both of which can be found on your newly created GitHub App page.
On this screen, we have the option to change how we call the bot and other minor settings. We also have the ability to select which hooks we provide notifications for, what labels we wish to exclude, and then which hooks we will ignore completely.
Now we have the ability to add a list of labels that we want to match. This means the integration will only notify you of issues with a specific set of labels.
We then have the ability to add a list of labels that all newly created issues through the bot should be labeled with.
Then we have the ability to enable showing diffs in the room when a PR is created.
Moving along, we can configure how workflow run results are configured in the bot, including matching specific workflows and including or excluding specific workflows.
Finishing Configuration
You further have the ability to click "Advanced" and set any Kubernetes-specific settings for how this pod is run. Once you have set everything up on this page, you can click "Continue" to go back to the Integrations page.
When you have finished running the installer and the hookshot pod is up and running, there are some configurations to handle in the Element client itself, in the rooms in which you wish the integration to be present.
As an admin, you will need to enable hookshot in the rooms using the "Add widgets, bridges, & bots" functionality to add the "Hookshot" widget to the room and finish the setup.
Setting up Adminbot and Auditbot
Overview
Adminbot allows for an Element Administrator to become admin in any existing room or space on a managed homeserver. This enables you to delete rooms for which the room administrator has left your company and other useful administration actions.
Auditbot allows you to have the ability to export any communications in any room that the auditbot is a member of, even if encryption is in use. This is important in enabling you to handle compliance requirements that require chat histories be obtainable.
On using Admin Bot and Audit Bot
Currently, we deploy a special version of Element Web to allow you to log in as the adminbot and auditbot. Given this, please do not make changes to widgets in rooms while logged in as the adminbot or the auditbot. This special Element Web does not have any of the custom settings that you have applied to the main Element Web used by your users, so working with widgets as the adminbot or auditbot can cause problems. We are working to provide custom interfaces for these bots in the future.
Configuring Admin Bot
From the Installer's Integrations page, click "Install" under "Admin Bot"
You will then see the following:
Your first choice is to configure adminbot or enable this server as part of a federated adminbot cluster. For most cases, you'll want to select "Configure Adminbot".
Below this, we have a checkbox to either allow the adminbot to participate in DM rooms (rooms with 1-2 people) or not.
We also have a checkbox to join local rooms only. You probably want to leave this on. If you turn it off, the adminbot will try to join any federated rooms that your server is joined to.
Moving on, we also have the ability to change the logging level and set the username of the bot.
After this, we have the ability to set the "Backup Passphrase" which is used to gain access to the key backup store.
Two settings that need to be set in the "Advanced" section are the FQDN for the adminbot Element Web access point and its certificates. These settings can be found by clicking "Advanced" and scrolling to:
and then:
Configuring Audit Bot
From the Installer's Integrations page, click "Install" under "Audit Bot".
You will then see the following:
Your first choice is to configure auditbot or enable this server as part of a federated auditbot cluster. For most cases, you'll want to select "Configure Auditbot".
Below this, we have a checkbox to either allow the auditbot to participate in DM rooms (rooms with 1-2 people) or not.
We also have a checkbox to join local rooms only. You probably want to leave this on. If you turn it off, the auditbot will try to join any federated rooms that your server is joined to.
Moving on, we also have the ability to change the logging level and set the username of the bot.
After this, we have the ability to set the "Backup Passphrase" which is used to gain access to the key backup store.
You can also configure an S3 bucket to log to and you can configure how many logfiles should be kept and how large a log file should be allowed to grow to. By default, the auditbot will log to the storage that has been attached by the cluster (check the storage settings under the "Advanced" tab).
Two settings that need to be set in the "Advanced" section are the FQDN for the auditbot Element Web access point and its certificates. These settings can be found by clicking "Advanced" and scrolling to:
Adminbot Federation
On the central admin bot server
You will pick "Configure Admin Bot" and will fill in everything from the above Adminbot configuration instructions, but you will also add Remote Federated Homeservers in this interface:
You will need to fill out this form for each remote server that will join the federation. You will need to set the domain name and the matrix server for each to get started.
You will also need to grab the admin user authentication token for each server and specify that here. You can get this by running the following command against the specific server: kubectl get synapseusers/adminuser-donotdelete -n element-onprem -o yaml. You are looking for the value of the field status.accessToken.
Then, in the app service section, you can leave "Automatically compute the appservice tokens" set. You will also need to get the generic shared secret from that server and specify it here. You can get this value by running kubectl get -n element-onprem secrets first-element-deployment-synapse-secrets -o yaml | grep registration and looking at the value of registrationSharedSecret.
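For reference, the admin token you are after is nested under status: in the synapseusers output. A trimmed, illustrative sketch is shown below; the value is a placeholder, not a real token.
status:
  accessToken: syt_placeholder_access_token   # copy this value into the installer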
On the remote admin bot server
Instead of selecting "Configure Adminbot", you will pick "Enable Central Adminbot Access" and will then be presented with this UI:
You will then specify the FQDN of the central adminbot server.
Auditbot Federation
On the central auditbot server
You will pick "Configure Audit Bot" and will fill in everything from the above Auditbot configuration instructions, but you will also add Remote Federated Homeservers in this interface:
You will need to fill out this form for each remote server that will join the federation. You will need to set the domain name and the matrix server for each to get started.
You will also need to grab the admin user authentication token for each server and specify that here. You can get this by running the following command against the specific server: kubectl get synapseusers/adminuser-donotdelete -n element-onprem -o yaml. You are looking for the value of the field status.accessToken.
Then, in the app service section, you can leave "Automatically compute the appservice tokens" set. You will also need to get the generic shared secret from that server and specify it here. You can get this value by running kubectl get -n element-onprem secrets first-element-deployment-synapse-secrets -o yaml | grep registration and looking at the value of registrationSharedSecret.
On the remote audit bot server
Instead of selecting "Configure Auditbot", you will pick "Enable Central Auditbot Access" and will then be presented with this UI:
You will then specify the FQDN of the central auditbot server.
Setting Up Hydrogen
Configuring Hydrogen
From the Installer's Integrations page, click "Install" under "Hydrogen".
For the hydrogen.yml presented by the installer, edit the file and ensure the following values are set:
- hydrogen_fqdn: the FQDN that will be used for accessing Hydrogen. It must have a PEM formatted SSL certificate as mentioned in the introduction. The crt/key pair must be in the CONFIG_DIRECTORY/certs directory.
- extra_config: extra JSON config that should be injected into the Hydrogen client configuration.
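As a minimal sketch, assuming a hypothetical FQDN of hydrogen.example.com (with a matching crt/key pair in CONFIG_DIRECTORY/certs), the resulting hydrogen.yml might look like:
hydrogen_fqdn: hydrogen.example.com   # hypothetical FQDN
extra_config: {}                      # any extra JSON injected into the Hydrogen client configuration (left empty here)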
You will need to re-run the installer after making these changes for them to take effect.
Setting up On-Premise Metrics
Setting up VictoriaMetrics and Grafana
From the Installer's Integrations page, click "Install" under "Monitoring"
For the provided prom.yml, see the following descriptions of the parameters:
- If you want to write Prometheus data to a remote Prometheus instance, define these four variables:
  - remote_write_url: The URL of the endpoint to which to push remote writes.
  - remote_write_external_labels: The labels to add to your data, to identify the writes from this cluster.
  - remote_write_username: The username to use to push the writes.
  - remote_write_password: The password to use to push the writes.
- You can configure which Prometheus components you want to deploy:
  - deploy_vmsingle, deploy_vmagent and deploy_vmoperator: true to deploy VictoriaMetrics.
  - deploy_node_exporter: requires the Prometheus deployment. Set to true to gather data about the k8s nodes.
  - deploy_kube_control_plane_monitoring: requires the Prometheus deployment. Set to true to gather data about the kube control plane.
  - deploy_kube_state_metrics: requires the Prometheus deployment. Set to true to gather data about kube metrics.
  - deploy_element_service_monitors: Set to true to create ServiceMonitor resources in the K8s cluster; you need this if you want to monitor your Element services stack using Prometheus.
- You can choose to deploy Grafana on the cluster:
  - deploy_grafana: true
  - grafana_fqdn: The FQDN of the Grafana application.
  - grafana_data_path: /mnt/data/grafana
  - grafana_data_size: 1G
- For the specified grafana_fqdn, you will need to provide a crt/key PEM encoded key pair in ~/.element-enterprise-server/config/legacy/certs prior to running the installer. If your hostname were metrics.airgap.local, the installer would expect to find metrics.airgap.local.crt and metrics.airgap.local.key in the ~/.element-enterprise-server/config/legacy/certs directory. If you are using Let's Encrypt, you do not need to add these files.
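Putting these parameters together, a minimal prom.yml sketch might look like the following; the Grafana FQDN and the commented remote-write values are hypothetical placeholders.
deploy_vmsingle: true
deploy_vmagent: true
deploy_vmoperator: true
deploy_node_exporter: true
deploy_kube_control_plane_monitoring: true
deploy_kube_state_metrics: true
deploy_element_service_monitors: true
deploy_grafana: true
grafana_fqdn: metrics.example.com        # hypothetical FQDN
grafana_data_path: /mnt/data/grafana
grafana_data_size: 1G
# Optional remote write to another Prometheus-compatible endpoint (placeholders):
# remote_write_url: https://prometheus.example.com/api/v1/write
# remote_write_external_labels:
#   cluster: ess-deployment
# remote_write_username: prom-writer
# remote_write_password: some_password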
After running the installer, open the FQDN of Grafana. The initial login user is admin and the password is the value of admin_password. You'll be required to set a new password; please choose a secure one and keep it in a safe place.
Setting Up the Telegram Bridge
Configuring Telegram bridge
On Telegram platform
- Log in to my.telegram.org to get a Telegram app ID and hash. You should use a phone number associated with your company.
Basic config
From the Installer's Integrations page, click "Install" under "Telegram Bridge".
For the provided telegram.yml file, please see the following options:
- postgres_create_in_cluster: true to create the postgres db in the k8s cluster. On a standalone deployment, it is necessary to define postgres_data_path.
- postgres_fqdn: The FQDN of the postgres server. If using postgres_create_in_cluster, you can choose the name of the workload.
- postgres_data_path: "/mnt/data/telegram-postgres"
- postgres_port: 5432
- postgres_user: The user to connect to the db.
- postgres_db: The name of the db.
- postgres_password: A password to connect to the db.
- telegram_fqdn: The FQDN of the bridge for communicating with Telegram and using the public login user interface.
- max_users: The maximum number of users enabled on the bridge.
- bot_username: The username of the bot for users to manage their bridge connectivity.
- bot_display_name: The display name of the bot.
- bot_avatar: An mx content URL to the bot avatar.
- admins: The list of admins of the bridge.
- enable_encryption: true to allow e2e encryption in the bridge.
- enable_encryption_by_default: true to enable e2e encryption by default on every chat created by the bridge.
- enable_public_portal: true to give users the possibility to log in using the bridge portal UI.
- telegram_api_id: The Telegram API ID you got from the Telegram platform.
- telegram_api_hash: The Telegram API hash you got from the Telegram platform.
For the specified telegram_fqdn, you will need to provide a crt/key PEM encoded key pair in ~/.element-enterprise-server/config/legacy/certs prior to running the installer. If your hostname were telegram.airgap.local, the installer would expect to find telegram.airgap.local.crt and telegram.airgap.local.key in the ~/.element-enterprise-server/config/legacy/certs directory. If you are using Let's Encrypt, you do not need to add these files.
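Putting the options above together, a minimal telegram.yml sketch might look like the following; the FQDN, admin MXID and API credentials are hypothetical placeholders.
postgres_create_in_cluster: true
postgres_fqdn: telegram-postgres
postgres_data_path: "/mnt/data/telegram-postgres"
postgres_port: 5432
postgres_user: telegram
postgres_db: telegram
postgres_password: some_password
telegram_fqdn: telegram.example.com        # hypothetical FQDN
max_users: 100
bot_username: telegrambot
bot_display_name: Telegram Bridge Bot
admins:
  - "@admin:example.com"                   # hypothetical admin MXID
enable_encryption: true
enable_encryption_by_default: false
enable_public_portal: true
telegram_api_id: "1234567"                 # placeholder; use the app ID from my.telegram.org
telegram_api_hash: "0123456789abcdef"      # placeholder; use the hash from my.telegram.org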
You will need to re-run the installer after making changes for these to take effect.
Usage
- Talk to the Telegram bot to log in to the bridge. See the Telegram Bridge documentation starting at "Bridge Telegram to your Element account". Instead of addressing the bot as that document explains, use "@bot_username:domain" as per your setup.
Setting Up the Teams Bridge
Configuring Teams Bridge
Register with Microsoft Azure
You will first need to generate an "Application" to connect your Teams bridge with Microsoft.
- Connect to Azure on https://portal.azure.com/#blade/Microsoft_AAD_IAM/ActiveDirectoryMenuBlade/Overview to go to the Active Directory.
- Go to "Register an application screen" and register an application.
- Supported account types can be what fits your needs, but do not select "Personal Microsoft accounts"
- The Redirect URI must be https://<teams_fqdn>/authenticate. You must use the type Desktop and Mobile apps. You don't need to check any of the suggested redirection URIs.
- You should be taken to a general configuration page. Click Certificates & secrets.
- Generate a Client Secret and copy the resulting value. The value will be your teams_client_secret.
Permissions
You will need to set some API permissions.
For each item in the list below, click Add permission > Microsoft Graph and then set the Delegated permissions:
- ChannelMessage.Read.All - Delegated
- ChannelMessage.Send - Delegated
- ChatMessage.Read - Delegated
- ChatMessage.Send - Delegated
- ChatMember.Read - Delegated
- ChatMember.ReadWrite - Delegated
- Group.ReadWrite.All - Delegated
- offline_access - Delegated
- profile - Delegated
- Team.ReadBasic.All - Delegated
- User.Read - Delegated
- User.Read.All - Delegated
For each item in the list below, click Add permission > Microsoft Graph and then set the Application permissions:
- ChannelMember.Read.All - Application
- ChannelMessage.Read.All - Application
- Chat.Create - Application
- Chat.Read.All - Application
- Chat.ReadBasic.All - Application
- Chat.ReadWrite.All - Application
- ChatMember.Read.All - Application
- ChatMember.ReadWrite.All - Application
- ChatMessage.Read.All - Application
- Group.Create - Application
- Group.Read.All - Application
- Group.ReadWrite.All - Application
- GroupMember.Read.All - Application
- GroupMember.ReadWrite.All - Application
- User.Read.All - Application
Once you are done, click Grant admin consent.
- Go to Overview.
- Copy the "Application (client) ID" as your teams_client_id in the config.
- Copy the "Directory (tenant) ID" as the teams_tenant_id in the config.
Setting up the bot user
The bridge requires a Teams user to be registered as a "bot" to send messages on behalf of Matrix users. You just need to allocate one user from the Teams interface to do this.
- First, you must go to the Azure Active Directory page.
- Click users.
- Click New user.
- Ensure Create user is selected.
- Enter a User name ex. "matrixbridge".
- Enter a Name ex. "Matrix Bridge".
- Enter an Initial password.
- Create the user.
- Optionally, set more profile details like an avatar.
- You will now need to log in as this new bot user to set a permanent password (Teams requires you to reset the password on login).
- After logging in you should be prompted to set a new password.
- Enter the bot username and password into the config under teams_bot_username and teams_bot_password.
Getting the groupId
The groupId can be found by opening Teams, clicking ... on a team, and clicking "Get link to team". The groupId is included in the URL; it is 12345678-abcd-efgh-ijkl-lmnopqrstuvw in the example below.
https://teams.microsoft.com/l/team/19%3XXX%40thread.tacv2/conversations?groupId=12345678-abcd-efgh-ijkl-lmnopqrstuvw&tenantId=87654321-dcba-hgfe-lkji-zyxwvutsrqpo
On the hosting machine
Generate teams registration keys
openssl genrsa -out teams.key 1024
openssl req -new -x509 -key teams.key -out teams.crt -days 365
These keys need to be placed in ~/.element-enterprise-server/config/legacy/certs/teams
on the machine that you are running the installer on.
Configure Teams Bridge
From the Installer's Integrations page, click "Install" under "Microsoft Teams Bridge"
For the provided teams.yml, please see the following documentation of the parameters:
teams_client_id: # teams app client id
teams_client_secret: # teams app secret
teams_tenant_id: # teams app tenant id
teams_bot_username: # teams bot username
teams_bot_password: # teams bot password
teams_cert_file: teams.crt
teams_cert_private: teams.key
teams_fqdn: <teams bridge fqdn>
teams_bridged_groups:
- group_id: 218b0bfe-05d3-4a63-8323-846d189f1dc1 #change me
properties:
autoCreateRooms:
public: true
powerLevelContent:
users:
"@alice:example.com": 100 # This will add <alice> account as admin
"@teams-bot:example.com": 100 # the Teams bot mxid <bot_sender_localpart>:<domain_name>
autoCreateSpace: true
limits:
maxChannels: 25
maxTeamsUsers: 25
# repeat "- group_id:" section above for each Team you want to bridge
bot_display_name: Teams Bridge Bot
bot_sender_localpart: teams-bot
enable_welcome_room: true
welcome_room_text: |
  Welcome, your Element host is configured to bridge to a Teams instance.
  This means that Microsoft Teams messages will appear on your Element
  account and you can send messages in Element rooms to have them appear
  on teams.
  To allow Element to access your Teams account, please say `login` and
  follow the steps to get connected. Once you are connected, you can open
  the 🧭 Explore Rooms dialog to find your Teams rooms.
# namespaces_prefix_user: OPTIONAL: default to _teams_
# namespaces_prefix_aliases: OPTIONAL: default to teams_
- For each Bridged Group, you will need to set a group_id and some properties found in the config sample.
You will need to re-run the installer for changes to take effect.
Setting Up the IRC Bridge
Matrix IRC Bridge
The Matrix IRC Bridge is an IRC bridge for Matrix that passes all IRC messages through to Matrix, and all Matrix messages through to IRC. Please also refer to the bridge's own documentation for additional guidance.
For usage of the IRC Bridge via its bot user, see the Using the Matrix IRC Bridge documentation.
Installation and Configuration
From the Installer's Integrations page, find the IRC Bridge entry and click Install. This will set up the IRC Bridge's config directory, which by default is located at:
~/.element-enterprise-server/config/legacy/ircbridge
You will initially be taken to the bridge's configuration page; for any subsequent edits, the Install button will be replaced with Configure, indicating the bridge is installed.
There are two sections of the Matrix IRC Bridge configuration page: the Bridge.yml section, and a section to Upload a Private Key. We'll start with the latter as it's the simpler of the two, and it is referenced in the first.
Upload a Private Key
As the bridge needs to send plaintext passwords to the IRC server (it cannot send a password hash), those passwords are stored encrypted in the bridge database. When a user specifies a password to use, via the admin room command !storepass server.name passw0rd, the password is encrypted using an RSA PEM-formatted private key. When a connection is made to IRC on behalf of the Matrix user, this password will be sent as the server password (PASS command).
Therefore you will need a private key file, by default called passkey.pem:
- If you already have a private key file, simply upload it using this section's Upload File button, supplying an RSA PEM-formatted private key.
- If you don't already have one, per the instructions provided in the section itself, you can generate this file by running the following command from within the IRC Bridge's config directory:
openssl genpkey -out passkey.pem -outform PEM -algorithm RSA -pkeyopt rsa_keygen_bits:2048
The Bridge.yml Section
The Bridge.yml is the complete configuration of the Matrix IRC Bridge. It points to a private key file (Private Key Settings), configures the bridge's own settings and functionality (Bridge Settings), and defines the specific IRC services you want it to connect with (IRC Settings).
Private Key Settings
key_file: passkey.pem
By default this is the first line in the Bridge.yml config. It refers to the file either moved into the IRC Bridge's config directory or generated there using openssl. If moved into the directory, ensure the file was correctly renamed to passkey.pem.
Bridge Settings
The rest of the configuration sits under the bridged_irc_servers:
section:
bridged_irc_servers:
You'll notice all entries within are indented under this key, so all code blocks below include this indentation. Focusing on settings relating to the bridge itself (and not any specific IRC connection) covers everything except the address: and associated parameters: sections, which by default are found at the end of the Bridge.yml.
Postgres
If you are using postgres_create_in_cluster you can leave this section as-is; the default ircbridge-postgres / ircbridge / postgres_password values will ensure your setup works correctly.
- postgres_fqdn: ircbridge-postgres
  postgres_user: ircbridge
  postgres_db: ircbridge
  postgres_password: postgres_password
Otherwise you should edit as needed to connect to your existing Postgres setup:
- postgres_fqdn: Provide the URL of your Postgres setup.
- postgres_user: Provide the user that will be used to connect to the database.
- postgres_db: Provide the database you will connect to.
- postgres_password: Provide the password of the user specified above.
You can uncomment the following to use as needed. Note that if unspecified some of these will default to the advised values, so you do not need to uncomment them if you are happy with the defaults.
- postgres_data_path: Can be used to specify the path to the postgres db on the host machine.
- postgres_port: Can be used to specify a non-standard port; this defaults to 5432.
- postgres_sslmode: Can be used to specify the sslmode for the Postgres connection; this defaults to 'disable', however 'no-verify' and 'verify-full' are available options.
For example, your Postgres section might instead look like the below:
- postgres_fqdn: https://db.example.com
  postgres_user: example-user
  postgres_db: matrixircbridge
  postgres_password: example-password
  # postgres_data_path: "/mnt/data/<bridged>-postgres"
  postgres_port: 2345
  postgres_sslmode: 'verify-full'
IRC Bridge Admins
Within the admins: section you will need to list the Matrix User IDs of all users who should be admins of the IRC Bridge. List one Matrix User ID per line, using the full Matrix User ID formatted like @USERNAME:HOMESERVER
admins:
- "@user-one:example.com"
- "@user-two:example.com"
Provisioning
Provisioning allows you to set specified rules about existing rooms when bridging those rooms to IRC channels.
- enable_provisioning: Set this to true to enable the use of provisioning_rules:
- provisioning_rules: -> userIds: Use regex to specify which user IDs to check for in existing rooms that are trying to be bridged.
  - exempt: List any user IDs that should not block the bridging of a room, even though they would otherwise match conflict:
  - conflict: Specify individual user IDs, or use regex.
- provisioning_room_limit: Specify the number of channels allowed to be bridged.
So the example bridge.yml
config below will block the bridging of a room if it has any User IDs within it from the badguys.com
homeserver except @doubleagent:badguys.com
, and limit the number of bridged rooms to 50.
enable_provisioning: true
provisioning_rules:
  userIds:
    exempt:
      - "@doubleagent:badguys.com"
    conflict:
      - "@.*:badguys.com"
provisioning_room_limit: 50
IRC Ident
If you are using the Ident protocol you can enable its usage with the following config:
- enable_ident: Set this to true to enable the use of IRC Ident.
- ident_port_type: Specify either 'HostPort' or 'NodePort' depending on your setup.
- ident_port_number: Specify the port number that should be used.
enable_ident: false
ident_port_type: 'HostPort'
ident_port_number: 10230
Miscellaneous
Finally there are a few additional options to configure:
- logging_level: Specifies how detailed the logs should be for the bridge; by default this is info, but error, warn and debug are available.
  - You can see the bridge logs using kubectl logs IRC_POD_NAME -n element-onprem
- enable_presence: Set to true if presence is required.
  - This should be kept as false if presence is disabled on the homeserver, to avoid excess traffic.
- drop_matrix_messages_after_seconds: Specify after how many seconds the bridge should drop Matrix messages; by default this is 0, meaning no messages will be dropped.
  - If the bridge is down for a while, the homeserver will attempt to send all missed events on reconnection. These events may be hours old, which can be confusing to IRC users if they are then bridged. This option allows these old messages to be dropped.
  - CAUTION: This is a very coarse heuristic. Federated homeservers may have different clock times which may be old enough to cause all events from the homeserver to be dropped.
- bot_username: Specify the Matrix User ID of the bridge bot that will facilitate the creation of rooms and can be messaged by admins to perform commands.
- rmau_limit: Set this to the maximum number of remote monthly active users that you would like to allow in a bridged IRC room.
- users_prefix: Specify the prefix to be used on the Matrix User IDs created for users who are communicating via IRC.
- alias_prefix: Specify the prefix to be used on room aliases created via the !join command.
The defaults are usually best left as-is unless a specific need requires changing them; however, for troubleshooting purposes, switching logging_level to debug can help identify issues with the bridge.
logging_level: debug
enable_presence: false
drop_matrix_messages_after_seconds: 0
bot_username: "ircbridgebot"
rmau_limit: 100
users_prefix: "irc_"
alias_prefix: "irc_"
Advanced Additional Configuration
You can find more advanced configuration options by checking the config.yaml sample provided in the Matrix IRC Bridge repository.
You can ignore the servers: block, as config in that section should be added under the parameters: section associated with the address: that will be set up per the below section. If you copy any config, ensure the indentation is correct; as above, all entries sit under the bridged_irc_servers: section.
IRC Settings
This is the final section of Bridge.yml. Here you specify the IRC network(s) you want the bridge to connect with; this is done using address: and parameters:, formatted like so:
- address: Specify your desired IRC network.
address: irc.example.com
parameters:
Aside from the address of the IRC network, everything is configured within the parameters: section, and so is indented accordingly; all code blocks below include this indentation.
Basic IRC Network Configuration
At a minimum, you will need to specify the name: of your IRC network, as well as some details for the bot's configuration on the IRC side of the connection. You can use the below to get up and running.
- name: The server name to show on the bridge.
- botConfig:
  - enabled: Keep this set as true.
  - nick: Specify the nickname of the bot user within IRC.
  - username: Specify the username of the bot user within IRC.
  - password: Optionally specify the password of the bot to give to NickServ or the IRC server for this nick. You can generate this using the pwgen 32 1 command.
name: "Example IRC"
botConfig:
  enabled: true
  nick: "MatrixBot"
  username: "matrixbot"
  password: "some_password"
Advanced IRC Network Configuration (Load Balancing, SSL, etc.)
For more fine-grained control of the IRC connection, there are some additional configuration lines you may wish to make use of. As these are not required, if unspecified some of these will default to the advised values, you do not need to include any of these if you are happy with the defaults. You can use the below config options, in addition to those in the section above, to get more complex setups up and running.
- additionalAddresses: Specify any additional addresses to connect to that can be used for load balancing between IRCDs.
  - Specify each additional address within the [] as comma-separated values, for example: [ "irc2.example.com", "irc3.example.com" ]
- onlyAdditionalAddresses: Set to true to exclusively use additional addresses to connect to servers while reserving the main address for identification purposes; this defaults to false.
- port: Specify the exact port to use for the IRC connection.
- ssl: Set to true to require the use of SSL; this defaults to false.
- sslselfsign: Set to true if the IRC network is using a self-signed certificate; this defaults to false.
- sasl: Set to true if the connection should attempt to identify via SASL; this defaults to false.
- allowExpiredCerts: Set to true to allow expired certificates when connecting to the IRC server; this defaults to false.
- botConfig:
  - joinChannelsIfNoUsers: Set to false to prevent the bot from joining channels even if there are no Matrix users on the other side of the bridge; this defaults to true, so it doesn't need to be specified unless false is required.
If you end up needing any of these additional configuration options, your parameters: section may look like the below example:
name: "Example IRC"
additionalAddresses: [ "irc2.example.com" ]
onlyAdditionalAddresses: false
port: 6697
ssl: true
sslselfsign: false
sasl: false
allowExpiredCerts: false
botConfig:
enabled: true
nick: "MatrixBot"
username: "matrixbot"
password: "some_password"
joinChannelsIfNoUsers: true
Mapping IRC user modes to Matrix power levels
You can use the configuration below to map the conversion of IRC user modes to Matrix power levels. This enables bridging of IRC ops to Matrix power levels only; it does not enable the reverse. If a user has been given multiple modes, the one that maps to the highest power level will be used.
- modePowerMap: Populate with a list of IRC user modes and their respective Matrix power level, in the format of IRC_USER_MODE: MATRIX_POWER_LEVEL
modePowerMap:
  o: 50
  v: 1
Configuring DMs between users
By default private messaging is enabled via the bridge, and Matrix Direct Message rooms can be federated. You can customise this behaviour using the privateMessages: config section.
- enabled: Set to false to prevent private messages being sent to/from IRC/Matrix; defaults to true.
- federate: Set to false so that only users on the homeserver attached to the bridge are able to use private message rooms; defaults to true.
privateMessages:
  enabled: true
  federate: true
Mapping IRC Channels to Matrix Rooms
Whilst a user can use the !join command (if Dynamic Channels are enabled) to manually connect to IRC channels, you can also specify mappings of IRC channels to Matrix rooms up-front; one channel can be mapped to multiple Matrix rooms. The Matrix room must already exist, and you will need to include its Room ID within the configuration - you can get this ID by using the 3-dot menu next to the room and opening Settings.
- mappings: Under here you will need to specify an IRC channel, then within that you will need to list out the required roomIds: in [] as a comma-separated list, and provide a key: if there is a channel key / password to use. If provided, Matrix users do not need to know the channel key in order to join the channel.
mappings:
  "#IRC_CHANNEL_NAME":
    roomIds: ["!ROOM_ID_THREE:HOMESERVER", "!ROOM_ID_TWO:HOMESERVER"]
    key: "secret"
See the below example configuration for mapping the #welcome IRC Channel:
mappings:
  "#welcome":
    roomIds: ["!exampleroomidhere:example.com"]
Allowing !join with Dynamic Channels
If you would like for users to be able to use the !join command to join any allowed IRC Channel you will need to configure dynamicChannels:.
You may remember you set an alias prefix in the Miscellaneous section above. If you wish to fully customise the format of aliases of bridged rooms you should remove that `alias_prefix:` line. However the only benefit to this would be to add a suffix to the Matrix Room alias so is not recommended.
- enabled: Set to true to allow users to use the !join command to join any allowed IRC channel; defaults to false.
- createAlias: Set to false if you do not want an alias to be created for any new Matrix rooms created using !join; defaults to true.
- published: Set to false to prevent rooms created via !join from being published to the public room list; defaults to true.
- useHomeserverDirectory: Set to true to publish rooms to your homeserver's directory instead of one created for the IRC Bridge; defaults to false.
- joinRule: Set to "invite" so only users with an invite can join the created room; otherwise this defaults to "public", so anyone can join the room.
- whitelist: Only used if joinRule: is set to invite. Populate with a list of Matrix User IDs that the IRC bot will send invites to in response to a !join.
- federate: Set to false so that only users on the homeserver attached to the bridge are able to use these rooms; defaults to true.
- aliasTemplate: Only used if createAlias: is set to true. Set this to specify the alias for rooms newly created via the !join command; defaults to "#irc_$CHANNEL".
  - You should not include this line if you do not need to add a suffix to your Matrix room alias. Using alias_prefix:, this will default to #PREFIX_CHANNEL_NAME:HOMESERVER
  - If you are specifying this line, you can use the following variables within the alias:
    - $SERVER => the IRC server address (e.g. "irc.example.com")
    - $CHANNEL => the IRC channel (e.g. "#python"); this must be used within the alias
- exclude: Provide a comma-separated list of IRC channels within [] that should be prevented from being mapped under any circumstances.
In addition you could also specify the below, though it is unlikely you should need to specify the exact Matrix room version to use.
- roomVersion: Set to specify the desired Matrix room version; if unspecified, no specific room version is requested.
  - If the homeserver doesn't support the room version then the request will fail.
dynamicChannels:
  enabled: true
  createAlias: true
  published: true
  useHomeserverDirectory: true
  joinRule: invite
  federate: true
  aliasTemplate: "#irc_$CHANNEL"
  whitelist:
    - "@foo:example.com"
    - "@bar:example.com"
  exclude: ["#foo", "#bar"]
Exclude users from using the bridge
Using the excludedUsers: configuration you can specify regex to identify users to be kicked from any IRC-bridged rooms.
- regex: Set this to any regex that should match on users' Matrix User IDs.
- kickReason: Set this to specify the reason given to users when they are kicked from IRC-bridged rooms.
excludedUsers:
  - regex: "@.*:evilcorp.com"
    kickReason: "We don't like Evilcorp"
Syncing Matrix and IRC Membership lists
To manage and control how Matrix and IRC membership lists are synced, you will need to include membershipLists: within your config.
- enabled: Set to true to enable the syncing of membership lists between IRC and Matrix; defaults to false.
  - This can have a significant effect on performance at startup, as the lists are synced.
- floodDelayMs: Syncing membership lists at startup can result in hundreds of members to process all at once. This timer drip-feeds membership entries at the specified rate; defaults to 10000 (10 seconds).
Within membershipLists: are the following sections: global:, rooms:, channels: and ignoreIdleUsersOnStartup:. For global:, rooms: and channels: you can specify initial:, incremental: and requireMatrixJoined:, which all default to false. You can configure settings globally using global:, or specific to Matrix rooms with rooms:, or to IRC channels via channels:.
- What does setting initial: to true do?
  - For ircToMatrix: this gets a snapshot of all real IRC users on a channel (via NAMES) and joins their virtual Matrix clients to the room.
  - For matrixToIrc: this gets a snapshot of all real Matrix users in the room and joins all of them to the mapped IRC channel on startup.
- What does setting incremental: to true do?
  - For ircToMatrix: this makes virtual Matrix clients join and leave rooms as their real IRC counterparts join/part channels.
  - For matrixToIrc: this makes virtual IRC clients join and leave channels as their real Matrix counterparts join/leave rooms.
- What does setting requireMatrixJoined: to true do?
  - This controls whether the bridge should check that all Matrix users are connected to IRC and joined to the channel before relaying messages into the room. This is considered a safety net to avoid any leakages by the bridge to unconnected users, but given it ignores all IRC messages while users are still connecting, it's likely not required.
The last section is ignoreIdleUsersOnStartup:, which determines whether the bridge should ignore users who are not considered active on the bridge during startup.
- enabled: Set to true to allow ignoring of idle users during startup.
- idleForHours: Set this to configure how many hours a user has to have been idle for before they can be ignored.
- exclude: Provide regex matching on Matrix User IDs that should be excluded from being marked as ignorable.
membershipLists:
  enabled: false
  floodDelayMs: 10000
  global:
    ircToMatrix:
      initial: false
      incremental: false
      requireMatrixJoined: false
    matrixToIrc:
      initial: false
      incremental: false
  rooms:
    - room: "!fuasirouddJoxtwfge:localhost"
      matrixToIrc:
        initial: false
        incremental: false
  channels:
    - channel: "#foo"
      ircToMatrix:
        initial: false
        incremental: false
        requireMatrixJoined: false
  ignoreIdleUsersOnStartup:
    enabled: true
    idleForHours: 720
    exclude: "foobar"
Configuring how IRC users appear in Matrix
As part of the bridge, IRC users and their messages will appear in Matrix as Matrix users; you will be able to click on their profiles and perform actions just like with any other user. You can configure how they are displayed using matrixClients:.
You may remember you set a user name prefix in the Miscellaneous section above. If you wish to fully customise the format of your IRC users' Matrix User IDs you should remove that `users_prefix:` line. However the only benefit to this would be to add a suffix to the Matrix User ID so is not recommended.
- userTemplate: Specify the template Matrix User ID that IRC users will appear as; it must start with an @ and feature $NICK within it. $SERVER is also usable.
  - You should not include this line if you do not need to add a suffix to your IRC users' Matrix IDs. Using users_prefix:, this will default to @PREFIX_NICKNAME:HOMESERVER
- displayName: Specify the display name of IRC users as they appear within Matrix; it must contain $NICK within it. $SERVER is also usable.
- joinAttempts: Specify the number of times a client can attempt to join a room before the request is discarded. Set to -1 to never retry or 0 to never give up; defaults to -1.
matrixClients:
  userTemplate: "@irc_$NICK"
  displayName: "$NICK"
  joinAttempts: -1
Configuring how Matrix users appear in IRC
As part of the bridge, Matrix users and their messages will appear in IRC as IRC users, and you will be able to perform IRC actions on them like any other user. You can configure how this functions using ircClients:.
- nickTemplate: Set this to the template for how Matrix users' IRC client nicknames are generated; defaults to "$DISPLAY[m]".
  - You can use the following variables within the template; you must use at least one of them:
    - $LOCALPART => the user ID localpart (e.g. "alice" in @alice:localhost)
    - $USERID => the user ID (e.g. @alice:localhost)
    - $DISPLAY => the display name of this user, with excluded characters (e.g. space) removed. If the user has no display name, this falls back to $LOCALPART.
- allowNickChanges: Set to true to allow users to use the !nick command to change their nick on the server.
- maxClients: Set the maximum number of IRC clients that will connect.
  - If the limit is reached, the client that spoke the longest time ago will be disconnected and replaced; defaults to 30.
- idleTimeout: Set the maximum amount of time in seconds that a client can exist without sending another message before being disconnected.
  - Use 0 to not apply an idle timeout; defaults to 172800 (48 hours).
  - This value is ignored if this IRC server is mirroring Matrix membership lists to IRC.
- reconnectIntervalMs: Set the number of milliseconds to wait between consecutive reconnections if a client gets disconnected.
  - Set to 0 to disable scheduling, i.e. reconnection will be scheduled immediately; defaults to 5000 (5 seconds).
- concurrentReconnectLimit: Set the number of concurrent reconnects if a user has been disconnected unexpectedly.
  - Set this to a reasonably high number so that bridges are not waiting an eternity to reconnect all their clients if there is a massive number of disconnects.
  - Set to 0 to immediately try to reconnect all users; defaults to 50.
- lineLimit: Set the number of lines of text allowed to be sent from Matrix to IRC; defaults to 3.
  - If the number of lines that would be sent is greater than lineLimit, the text will instead be uploaded to Matrix and the resulting URI treated as a file; a link will be sent to IRC instead, to avoid spamming IRC.
- realnameFormat: Set to either "mxid" or "reverse-mxid" to define the format used for the IRC realname.
- kickOn:
  - channelJoinFailure: Set to true to kick a Matrix user from a bridged room if they fail to join the IRC channel.
  - ircConnectionFailure: Set to true to kick a Matrix user from ALL rooms if they are unable to get connected to IRC.
  - userQuit: Set to true to kick a Matrix user from ALL rooms if they choose to QUIT the IRC network.
You can also optionally configure the following; these do not need to be included in your config if you are not changing their default values.
- ipv6:
  - only: Set to true to force IPv6 for outgoing connections; defaults to false.
- userModes: Specify the IRC user modes to set when connecting, e.g. "RiG" to set +R, +i and +G; defaults to "" (no user modes).
- pingTimeoutMs: Set the minimum time to wait between connection attempts if the bridge is disconnected due to throttling.
- pingRateMs: Set the rate at which to send pings to the IRCd if the client has been quiet for a while.
  - Whilst the IRCd should send pings to the bridge to keep the connection alive, sometimes it doesn't, and the bridge ends up being ping timed out.
ircClients:
  nickTemplate: "$DISPLAY[m]"
  allowNickChanges: true
  maxClients: 30
  # ipv6:
  #   only: false
  idleTimeout: 10800
  reconnectIntervalMs: 5000
  concurrentReconnectLimit: 50
  lineLimit: 3
  realnameFormat: "mxid"
  # pingTimeoutMs: 600000
  # pingRateMs: 60000
  kickOn:
    channelJoinFailure: true
    ircConnectionFailure: true
    userQuit: true
Deploying the IRC Bridge
Once you have made the required changes to your Bridge.yml configuration, make sure you find and click the Save button at the bottom of the IRC Bridge configuration page to ensure your changes are saved.
You will then need to re-deploy for any changes to take effect; as above, ensure all changes made are saved, then click Deploy.
Using the Bridge
For usage of the IRC Bridge via its bot user, see the Using the Matrix IRC Bridge documentation, or for end-user-focused documentation see Using the Matrix IRC Bridge as an End User.
If you have set up mapping of rooms in your Bridge.yml, some rooms will already be connected to IRC; users need only join the bridged room and start messaging. IRC users should see Matrix users in the channel and be able to communicate with them like any other IRC user.
Setting Up the SIP Bridge
Configuring SIP bridge
Basic config
From the Installer's Integrations page, click "Install" under "SIP Bridge"
For the provided sipbridge.yml, please see the following documentation:
- `postgres_create_in_cluster`: `true` to create the postgres db into the k8s cluster. On a standalone deployment, it is necessary to define the `postgres_data_path`.
- `postgres_fqdn`: The fqdn of the postgres server. If using `postgres_create_in_cluster`, you can choose the name of the workload.
- `postgres_data_path`: "/mnt/data/sipbridge-postgres"
- `postgres_port`: 5432
- `postgres_user`: The user to connect to the db.
- `postgres_db`: The name of the db.
- `postgres_password`: A password to connect to the db.
- `port_type`: `HostPort` or `NodePort` depending on which kind of deployment you want to use. On standalone deployment, we advise you to use `HostPort` mode.
- `port`: The port on which to configure the SIP protocol. In `NodePort` mode, it should be within the Kubernetes NodePort range (30000-32767 by default).
- `enable_tcp`: `true` to enable TCP SIP.
- `pstn_gateway`: The hostname of the PSTN Gateway.
- `external_address`: The external address of the SIP Bridge
- `proxy` : The address of the SIP Proxy
- `user_agent`: A user agent for the sip bridge.
- `user_avatar`: An MXC url to the sip bridge avatar. Don't define it if you have not uploaded any avatar.
- `encryption_key`: A 32 character long secret used for encryption. Generate this with `pwgen 32 1`
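Putting the options above together, a minimal sipbridge.yml sketch might look like the following; the gateway, proxy, addresses and secrets are hypothetical placeholders.
postgres_create_in_cluster: true
postgres_fqdn: sipbridge-postgres
postgres_data_path: "/mnt/data/sipbridge-postgres"
postgres_port: 5432
postgres_user: sipbridge
postgres_db: sipbridge
postgres_password: some_password
port_type: HostPort
port: 5060                                 # standard SIP port; in NodePort mode pick a port in the NodePort range
enable_tcp: true
pstn_gateway: pstn.example.com             # hypothetical PSTN gateway hostname
external_address: 192.0.2.10               # hypothetical external address of the SIP bridge
proxy: sip.example.com                     # hypothetical SIP proxy address
user_agent: ESS SIP Bridge
encryption_key: pahngeiFohth3Eechiquiet7saequoht   # 32 characters, generate with pwgen 32 1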
Setting Up the XMPP Bridge
Configuring the XMPP Bridge
The XMPP bridge relies on the XMPP "component" feature, which is the equivalent of Matrix application services. You need to configure an XMPP component on an XMPP server that the bridge will use to bridge Matrix and XMPP users.
On the hosting machine
From the Installer's Integrations page, click "Install" under "XMPP Bridge".
Examples
In all the examples below the following are set:
- The domain_name is your homeserver domain (the part after : in your MXID): example.com
- XMPP Server FQDN: xmpp.example.com
- XMPP External Component / xmpp_domain: matrix.xmpp.example.com
Prosody Example
If you are configuring prosody, you need the following component configuration (for the sample xmpp server, matrix.xmpp.example.com
):
Component "matrix.xmpp.example.com"
ssl = {
certificate = "/etc/prosody/certs/tls.crt";
key = "/etc/prosody/certs/tls.key";
}
component_secret = "eeb8choosaim3oothaeGh0aequiop4ji"
And then with that configured, you would pass the following into xmpp.yml:
xmpp_service: xmpp://xmpp.example.com:5347
xmpp_domain: "matrix.xmpp.example.com" # external component subdomain
xmpp_component_password: eeb8choosaim3oothaeGh0aequiop4ji # xmpp component password
Note: We've used pwgen 32 1 to generate the component_secret.
Joining an XMPP Room
Once you have the XMPP bridge up, you need to map an XMPP room to a Matrix ID. For example, if the room on XMPP is named #welcome@conference.xmpp.example.com, where conference.xmpp.example.com is the FQDN of the component hosting rooms for your XMPP instance, then on Matrix you would join:
#_xmpp_welcome_conference.xmpp.example.com:example.com
So you can simply send the following command in your Element client to jump into the XMPP room via Matrix
/join #_xmpp_welcome_conference.xmpp.example.com:example.com
Joining a Matrix room from XMPP
If the Element/Matrix room is public, you should be able to query the room list at the external component server address (e.g. matrix.xmpp.example.com).
The Matrix room at alias #roomname:example.com maps to #roomname#example.com@matrix.xmpp.example.com on the XMPP server xmpp.example.com if your xmpp_domain is matrix.xmpp.example.com.
Note: If the Matrix room has users with the same name as your XMPP account, you will need to edit your XMPP nickname to be unique in the room.
| Element | | XMPP |
|---|---|---|
| #roomname:element.local (native Matrix room) | → | #roomname#element.local@element.xmpp.example.com (bridged into XMPP) |
| #_xmpp_roomname_conference.xmpp.example.com:element.local (bridged into Matrix/Element) | ← | #roomname@conference.xmpp.example.com (native XMPP room) |
Using the bridge as an end user
For end user documentation you can visit the Using the Matrix XMPP Bridge as an End User documentation.
Setting up Location Sharing
Overview
The ability to send a location share, whether static or live, is available without any additional configuration.
However, when receiving a location share, in order to display it on a map, the client must have access to a tile server. If it does not, the location will be displayed as text with coordinates.
By default, location sharing uses a MapTiler instance and API key that is sourced and paid for by Element. This is provided free, primarily for personal EMS users and those on Matrix.org.
If no alternate tileserver is configured either on the HomeServer or client then the mobile and desktop applications will fall back to Element's MapTiler instance. Self-hosted instances of Element Web will not fall back, and will show an error message.
Using Element's MapTiler instance
Customers should be advised that our MapTiler instance is not intended for commercial use, it does not come with any uptime or support SLA, we are not under any contractual obligation to provide it or continue to provide it, and for the most robust privacy customers should either source their own cloud-based tileserver or self-host one on-premises.
However, if they wish to use our instance with Element Web for testing, demonstration or POC purposes, they can configure the map_style_url by adding extra configurations in the advanced section of the Element Web page in the installer:
{
"map_style_url": "https://api.maptiler.com/maps/streets/style.json?key=fU3vlMsMn4Jb6dnEIFsx"
}
Using a different tileserver
If the customer sources an alternate tileserver, whether from MapTiler or elsewhere, you should enter the tileserver URL in the extra_client section of the Well-Known Delegation integration, accessed from the Integrations page in the installer:
{
  ... other info ...
  "m.tile_server": {
    "map_style_url": "http://mytileserver.example.com/style.json"
  }
}
Self-hosting a tileserver
Customers can also host their own tileserver if they wish to dedicate the resources to doing so. Detailed information on how to do so is available here.
Changing permissions for live location sharing
By default, live location sharing is restricted to moderators of rooms. In direct messages, both participants are admins by default, so this isn't a problem; however, it does impact public and private rooms. To change the default permissions for new rooms, the following Synapse additional configuration should be set:
default_power_level_content_override:
  private_chat:
    events:
      "m.beacon_info": 0
      "org.matrix.msc3672.beacon_info": 0
      "m.room.name": 50
      "m.room.power_levels": 100
      "m.room.history_visibility": 100
      "m.room.canonical_alias": 50
      "m.room.avatar": 50
      "m.room.tombstone": 100
      "m.room.server_acl": 100
      "m.room.encryption": 100
  # Not strictly necessary as this is used for direct messages, however if additional users are later invited into the room they won't be administrators
  trusted_private_chat:
    events:
      "m.beacon_info": 0
      "org.matrix.msc3672.beacon_info": 0
      "m.room.name": 50
      "m.room.power_levels": 100
      "m.room.history_visibility": 100
      "m.room.canonical_alias": 50
      "m.room.avatar": 50
      "m.room.tombstone": 100
      "m.room.server_acl": 100
      "m.room.encryption": 100
  public_chat:
    events:
      "m.beacon_info": 0
      "org.matrix.msc3672.beacon_info": 0
      "m.room.name": 50
      "m.room.power_levels": 100
      "m.room.history_visibility": 100
      "m.room.canonical_alias": 50
      "m.room.avatar": 50
      "m.room.tombstone": 100
      "m.room.server_acl": 100
      "m.room.encryption": 100
Removing Legacy Integrations
Currently, if you remove a YAML integration's config, its components are not removed from the cluster automatically; you will need to manually remove the custom resources from the Kubernetes cluster.
Removing Monitoring stack
You first need to delete the VMSingle and the VMAgent from the namespace:
kubectl delete vmsingle/monitoring -n <monitoring ns>
kubectl delete vmagent/monitoring -n <monitoring ns>
Once done, you can delete the namespace: kubectl delete ns/<monitoring ns>
Setting up Sliding Sync
Introduction to Sliding Sync
Sliding Sync is a backend component required by the Element X client beta. It provides a mechanism for the fast synchronisation of Matrix rooms. It is not recommended for production use and is only provided to enable the usage of the Element X client. The current version does not support SSO (OIDC/SAML/CAS). If you wish to try out the Element X client, then you need to be using password-based auth to allow Sliding Sync to work. SSO support (OIDC/SAML/CAS) will be added in a later version of the Sliding Sync tooling.
Installing Sliding Sync
From the integrations page, simply click the install button next to Sliding Sync:
This will take you to the following page:
You should be able to ignore both the sync secret and the logging, but if you ever wanted to change them, you can do that here.
If you are using an external PostgreSQL database, then you will need to create a new database for sliding sync and configure that here:
You will also need to set two values in the "Advanced" section -- the FQDN for sliding sync:
and the certificates for serving that FQDN over SSL:
Setting up Element Call
Introduction
Element Call is Element's next generation of video calling, set to replace Jitsi in the future. Element Call is currently an experimental feature so please use it accordingly; it is not expected to replace Jitsi yet.
How to set up Element Call
Required domains
In addition to the core set of domains for any ESS deployment, an Element Call installation on ESS uses the following domains:
- Required:
- Element Call Domain: the domain of the Element Call web client.
- Element Call SFU Domain: the domain of the SFU (Selective Forwarding Unit) for forwarding media streams between call participants.
- Optional:
- Coturn Domain: the domain of a Coturn instance hosted by your ESS installation. Required for airgapped environments.
Ensure you have acquired DNS records for these domains before installing Element Call on your ESS instance.
Required ports
Ensure that any firewalls in front of your ESS instance allow external traffic on the following ports:
- Required:
  - 443/tcp for accessing the Element Call web client.
  - 30881/tcp and 30882/udp for exposing the self-hosted LiveKit SFU.
- Optional:
  - 80/tcp for acquiring Let's Encrypt certificates for Element Call domains.
  - The UDP (and possibly TCP) ports you choose for STUN, TURN and/or the UDP relay of a self-hosted Coturn.
Basic installation
In the Admin Console, visit the Configure page, select Integrations on the left sidebar, and select Element Call (Experimental).
On the next page, the SFU > Networking section must be configured. Read the descriptions of the available networking modes to decide which is appropriate for your ESS instance.
Next, click the Advanced button at the bottom of the page, then, to reveal the Kubernetes section, click the Show button in that section.
In the section that appears, configure the Ingress and Ingresses > SFU sections with the Element Call Domain and Element Call SFU Domain (respectively) that you acquired earlier, as well as their TLS sections, to associate those domain names with an SSL certificate for secure connections.
Other settings on the page may be left at their defaults, or set to your preference.
How to set up Element Call for airgapped environments
Your ESS instance must host Coturn in order for Element Call to function in airgapped environments. To do this, click Install next to Coturn on the Integrations page.
On the Coturn integration page, set the External IP of your ESS instance that clients should be able to reach it at, the Coturn Domain, and at least STUN TURN.
Then, within the Element Call integration page, ensure SFU Networking has no STUN Servers defined. This will cause the deployed Coturn to be used by connecting users as the STUN server to discover their public IP address.
Element Call with guest access
By default, Element Call shares the same user access restrictions as the Synapse homeserver. This means that unless Synapse has been configured to allow guest users, calls on Element Call are accessible only to Matrix users registered on the Synapse homeserver. However, enabling guest users in Synapse to allow unregistered access to Element Call opens up the entire homeserver to guest account creation, which may be undesirable.
To meet the need of allowing guest access to Element Call while blocking guest account creation on the homeserver, it is possible to grant guest access via federation with an additional dedicated homeserver, managed by an additional ESS instance. This involves a total of two ESS instances:
- The main instance: an existing fully-featured ESS instance where registered accounts are homed & all integrations, including Element Call, are installed. Has Synapse configured with closed or restricted registration.
- The guest instance: an additional ESS instance used only to host guest accounts, and to provide its own deployment of Element Call for unregistered/guest access. Has Synapse configured with open registration.
Guest access to Element Call is achieved via a closed federation between the two instances: the main instance federates with the guest instance and any other homeservers it wishes to federate with, and the guest homeserver federates only with the main instance. This allows unregistered users to join Element Call on the main instance by creating an account on the guest instance with open registration, while preventing these guest accounts from being used to reach any other homeservers.
How to set up Element Call with guest access
- Install Element Call on your existing ESS instance by following the prior instructions on this page. This will be your main instance.
- Prepare another ESS instance, then follow the prior instructions to install Element Call on it. This will be your guest instance.
- Set custom images for Element Web and Element Call:
  - Log into each instance via SSH and follow these steps:
    - Save a file with the following content:
      - on the main instance:

        apiVersion: v1
        kind: ConfigMap
        metadata:
          name: element-call-main-overrides
          namespace: element-onprem
        data:
          images_digests: |
            element_web:
              element_web:
                image_repository_server: docker.io
                image_repository_path: vectorim/element-web
                image_tag: develop
                image_digest: sha256:0c5a025a4097a14f95077befad417f4a5af501cc2bc1dbda5ce0b055af0514eb

      - on the guest instance:

        apiVersion: v1
        kind: ConfigMap
        metadata:
          name: element-call-guest-overrides
          namespace: element-onprem
        data:
          images_digests: |
            element_call:
              element_call:
                image_repository_server: ghcr.io
                image_repository_path: element-hq/element-call
                image_tag: latest-ci
                image_digest: sha256:a9fbf8049567c2c11b4ddf8afbf98586a528e799d7f95266c7ae2ed16f250a56

    - Run kubectl -n element-onprem apply -f <path-to-saved-file>
- In the admin console of each instance:
  - Set Cluster > Advanced > Config > Image Digests Config Map to:
    - element-call-main-overrides on the main instance
    - element-call-guest-overrides on the guest instance
  - In Synapse > Advanced > Additional, add this YAML content:

    experimental_features:
      msc3266_enabled: true
- In the admin console of the main instance:
  - In Element Web > Advanced > Additional configuration, add this JSON content:

    {
      "features": {
        "feature_new_room_decoration_ui": true,
        "feature_ask_to_join": true
      },
      "element_call": {
        "guest_spa_url": "https://<guest-instance-element-call-domain>"
      }
    }

  - To limit federation to only the guest instance, apply these settings in the Synapse section:
    - Set Profile > Federation Type to Limited
    - Set Config > Registration to Closed
    - Set Advanced > Allow List to include the guest instance's Synapse Domain
- In the admin console of the guest instance:
  - To limit federation to only the main instance, apply these settings in the Synapse section:
    - Set Profile > Federation Type to Limited
    - Set Config > Registration to Closed
    - Set Advanced > Allow List to include the main instance's Synapse Domain
  - In Integrations > Element Call > Additional configuration, add this JSON content:

    {
      "livekit": {
        "livekit_service_url": "https://<main-instance-sfu-domain>"
      }
    }
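After completing the steps above, a quick way to sanity-check each instance from its SSH session is to confirm the override ConfigMap exists and to see which images the Element Web and Element Call pods are actually running; a rough sketch (pod names vary between deployments):

# Shown for the main instance; use element-call-guest-overrides on the guest instance
kubectl -n element-onprem get configmap element-call-main-overrides -o yaml
# List the images in use and filter for the components overridden above
kubectl -n element-onprem get pods -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].image}{"\n"}{end}' | grep -i -e element-web -e element-call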
Setting Up the Skype for Business Bridge
Configuring the Skype for Business Bridge
Domains and certificates
The first step in preparing a Skype for Business (S4B) Bridge is to assign it a hostname that other S4B Server deployments can connect to via SIP federation. This requires configuring DNS records and obtaining a TLS certificate for that hostname, which can be any name of your choosing.
The hostname assigned to a S4B Bridge is also known as its "SIP domain", as it serves as the domain name of the virtual SIP server managed by the bridge for federating with S4B Servers. The rest of this guide refers to a bridge's SIP domain as <bridge-sipdomain>
.
Once you've chosen a hostname to assign to your bridge, other S4B Servers must be able to resolve that hostname to the bridge's public IP address via DNS. The most straightforward way to achieve this is to obtain public DNS records for <bridge-sipdomain>
. If obtaining public records is not an option, an S4B Server administrator may configure it with internal records instead (which is outside the scope of this guide).
The DNS records to obtain are as follows:
- A/AAAA <bridge-sipdomain> <bridge-public-ip-address>
- SRV _sipfederationtls._tcp.<bridge-sipdomain> <any-priority> <any-weight> 5061 <bridge-sipdomain> (optional, but recommended)
You must also obtain a TLS certificate for <bridge-sipdomain>. It may be obtained either from a public CA such as Let's Encrypt, or via any PKI scheme shared between the bridge & any S4B Servers it must connect with.
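Once DNS and the certificate are in place (and the bridge is installed, as described below), you can check from another host that the bridge presents the expected certificate on its SIPS port; a minimal sketch, assuming port 5061 is reachable from where you run it:

# Print the subject, issuer and validity of the certificate served on the SIPS port
openssl s_client -connect <bridge-sipdomain>:5061 -servername <bridge-sipdomain> </dev/null 2>/dev/null | openssl x509 -noout -subject -issuer -dates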
Basic config
From the Installer's Integrations page, click "Install" under "Skype for Business Bridge".
The most important configuration options are under Advanced > Exposed Services, which is where to set the SIP domain & TLS certificates of the bridge:
- Skype for Business Bridge Domain: set this to <bridge-sipdomain>
- SIP:
  - If your ESS deployment allows for the usage of Host Ports, set "Port" to 5060 and "Port Type" to "Host Port".
  - Otherwise, you must configure a reverse proxy to redirect inbound traffic for port 5060 to the port you assign to this setting.
- SIPS: Same as above, but with a port of 5061.
- TLS: Choose "Certificate File" and upload the certificate & private key obtained for <bridge-sipdomain>.
Configuring Skype for Business Server
In order for a S4B Server deployment to connect to your bridge, the deployment must first be configured with an Edge Server to support SIP federation & to explicitly allow federation with the SIP domain of the bridge.
This section describes how to modify an existing S4B Server deployment to federate with the bridge. It assumes that a functional S4B Server deployment has already been prepared; details on how to install a S4B Server deployment from scratch are out-of-scope of this guide.
Overview
To support SIP federation, a S4B Server deployment uses a pool of one or more Edge Servers to relay traffic from external SIP domains to the pool of internal servers that provide the core functionality of the deployment, known as Front End Servers. This design is necessary because Front End Servers are meant to be run within the private network of a deployment, without access to external networks.
Edge Servers are also used as a proxy for allowing native S4B users to log in from outside the deployment's private network. Users who connect in this manner are known as "remote users".
Once equipped with an Edge Server, a S4B Server deployment must then be configured with which external SIP domains it may federate with. By default, traffic from all external SIP domains is blocked.
The S4B Bridge acts as a SIP endpoint with its own SIP domain. Thus, for it to connect to a S4B Server deployment, the deployment must not only be equipped with an Edge Server, but it must set the bridge's SIP domain as an "allowed" domain.
Below is a simple diagram of the network topology of a S4B Server deployment federated with a S4B Bridge:
external S4B clients <───> Edge Pool <───> S4B Bridge <~~~> Matrix homeserver <═══> Matrix clients
                              A                                    A
                              │                                    ╏
                              V                                    V
internal S4B clients <─> Front End Pool                     Matrix homeserver <═══> Matrix clients
<───>: SIP
<~~~>: Matrix Application Service API
<═══>: Matrix Client-Server API
<╍╍╍>: Matrix Federation API
This guide covers only the use case of a single Front End Server and a single Edge Server. It is expected that similar instructions apply for multi-server pools, but that has not been tested.
Prerequisites
A S4B Server deployment must be prepared with at least the following components in order for it to be capable of adding an Edge Server:
- A Windows Server host running a Skype for Business 2019 Standard Edition Front End Server
- A Windows Server host acting as a Domain Controller for all hosts in the deployment, and also acting as an internal Certificate Signing Authority (CSA) & DNS server for all hosts
- If a Domain Controller is not available to act as a CSA, you may use any alternative/custom PKI scheme of your choosing, as long as the root CA certificate is mutually trusted by all hosts.
- If a Domain Controller is not available to act as a DNS server, custom hostname mappings may instead be applied in the "hosts" file of all hosts, located at
C:\Windows\System32\drivers\etc\hosts
.
Such a deployment will have set some hostnames, which are referred to elsewhere in this guide as follows:
- <s4b-intdomain>: The domain name / Primary DNS Suffix of the S4B Server deployment
- <frnt>.<s4b-intdomain>: The internal FQDN of the Front End Server, where <frnt> is its host name
- <s4b-sipdomain>: The default SIP domain of the deployment (visible in the Topology Builder on the Front End Server)
Deploying the Edge Server
An Edge Server must be deployed on a standalone host within the private network of the S4B Server deployment. It cannot be co-located on the same host as the Front End Server (source).
The OS to install on the Edge Server's host must be either Windows Server 2019 or 2016. Other versions of Windows Server, even newer versions, will not work (source). It should also be the same version of Windows Server that is installed on the host running the Front End Server. The host must also be outside of the Active Directory domain of the deployment.
Assign the host with a name of your choosing, which will be referred to elsewhere in this guide as <edge>
. The internal FQDN of the host is therefore <edge>.<s4b-intdomain>
.
After installing the OS, ensure Internet connectivity and perform Windows Update. Then, use the Server Manager desktop app (which can be found in Windows Search) to install the prerequisites listed by the official S4B documentation. Do not install any components needed for a Front End Server, as they may interfere with Edge Server components. It is also recommended not to install IIS on the Edge Server, despite the official documentation, as it interferes with VoIP functionality.
Next, install the Skype for Business Administrative Tools. You may use the same installation media that was used for installing the Front End Server. Otherwise, it may be obtained from this download link.
Running the installation media will install two programs, known as the Core Components: the Deployment Wizard and the Management Shell. When using the Deployment Wizard on the Edge Server's host, do not run any tasks related to Active Directory, which should have already been run on the Front End Server, and must be run only once for the entire deployment. It is also unnecessary to install the rest of the Administrative Tools, such as the Topology Builder, on the Edge Server host.
Network topology
The network interfaces of hosts within the deployment must be configured such that inbound external SIP traffic is handled solely by one interface of the Edge Server, and that traffic between the Edge and Front End Servers remains within the private network of the deployment.
The Edge Server needs at least two network interfaces:
- an external-facing interface for accepting inbound SIP traffic
- Its default gateway must at least have a route to the IP address of your S4B Bridge instance.
- If the Edge Server host is behind NAT, inbound traffic must be routed to this interface.
- an internal-facing interface for reaching hosts within the private subnet of the deployment
- Its DHCP Server must be set to the internal IP address of the deployment's Domain Controller.
- This interface must not be routable to the public Internet.
Also, the firewall of the Edge Server must at least leave port 5061 open, and have it accessible to either the public Internet, or to the public IP address of your S4B Bridge host.
The Front End Server needs at least one network interface, which must be an internal-facing interface with the same properties as the Edge Server's internal-facing interface. If Internet connectivity is desired (for example, to facilitate Phone Access & Meeting URLs), add a separate external-facing interface for handling external traffic, instead of making the internal-facing interface publicly routable.
The IP addresses of these interfaces are referred to elsewhere in this guide as follows:
- <edge-extaddr>: the address of the Edge Server's external-facing interface
- <edge-intaddr>: the address of the Edge Server's internal-facing interface
- <frnt-intaddr>: the address of the Front End Server's internal-facing interface
DNS records
Internal records
The deployment needs an internal DNS record for the Edge Server's internal-facing interface in order to identify it by name. To add this record, open the DNS Manager on the Domain Controller host, and add an A/AAAA record for <edge>.<s4b-intdomain>
, the FQDN of the Edge Server host, with the target address set to <edge-intaddr>
.
External records
In order for your S4B Bridge to reach your Edge Server, acquire these public DNS records for advertising the SIP domain of your S4B Server deployment:
- A/AAAA <edge>.<s4b-sipdomain> <edge-extaddr>
- CNAME sip.<s4b-sipdomain> <edge>.<s4b-sipdomain>
- SRV _sipfederationtls._tcp.<s4b-sipdomain> <any-priority> <any-weight> 5061 <edge>.<s4b-sipdomain>
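From the bridge host (or any machine using public DNS), you can verify these records before continuing; a quick sketch using the placeholders above:

dig +short A <edge>.<s4b-sipdomain>
dig +short CNAME sip.<s4b-sipdomain>
dig +short SRV _sipfederationtls._tcp.<s4b-sipdomain>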
Topology configuration
The topology of your S4B Server deployment may now be updated to include the Edge Server.
On the Front End Server, open the Topology Builder. Choose the option to download the current topology to a file, as this will ensure that you will edit an up-to-date version of the topology in the following steps.
Once the topology is loaded, navigate through the tree list on the left of the window to find the "Edge pools" entry (under "Skype for Business" > "site" > "Skype for Business Server 2019" > "Edge Pools"), right click it, select "New Edge Pool...", and apply the following settings in the wizard that appears:
- Pool FQDN: set to <edge>.<s4b-intdomain>
- Enable "This pool has one server"
- Enable federation (port 5061)
- Use a single FQDN and IP address
- Apply IPv4/6 settings so that you will be able to use the Edge Server's internal & external interface addresses later.
- External FQDN: set to <edge>.<s4b-sipdomain>
- Leave service ports at their defaults of 5061, 444, and 443 for Access, Web Conferencing, and A/V Edge Services respectively
- Internal & external IPv4/6 addresses: set these to the addresses of the internal & external interfaces you set up earlier. The internal interface is never 127.0.0.1.
- Next hop pool & media association: set this to the Front End Server (which should be the only choice)
Next, in the settings for your site (available by right-clicking the tree entry immediately below the top-level "Skype for Business Server" item and choosing "Edit Properties"), enable:
- Apply federation route assignments to all sites
- Enable SIP federation, and choose your Edge Server
All required topology changes have now been set and are applied to the Front End Server by publishing the updated topology from the Topology Builder.
The topology must next be published onto the Edge Server. To do so:
- On the Front End Server, open the S4B Management Shell, and export the topology to a file with this command:
  - Export-CsConfiguration -FileName <path\to\file>
- Copy that file onto the Edge Server. Ideally export the file to a shared drive so that a manual copy is unnecessary.
- On the Edge Server, open the Deployment Wizard, click "Install or Update Skype for Business Server System", and execute the "Install Local Configuration Store" step. Choose the option to "import from a file (recommended for Edge Servers)", and select the file for the exported topology configuration.
- While still in the Deployment Wizard, execute the "Setup or Remove Skype for Business Server Components" step.
Certificates
S4B sends/receives all SIP traffic over TLS; thus, the Edge Server needs its own set of certificates, both internal & external to the S4B Server deployment.
To obtain all required certificates, open the Deployment Wizard on the Edge Server, click "Install or Update Skype for Business Server System", and execute the "Request, Install or Assign Certificates" task. This will display the Certificate Wizard, which shows a list of all required certificates, and which services they must contain the domain names of. Only two certificates should be listed: "Edge internal" and "External Edge certificate (public Internet)".
The "Edge internal" certificate should be obtained by sending a certificate signing request to the Domain Controller in your deployment, which acts as an internal Certificate Signing Authority. To do so, click the "Edge internal" entry in the list, then click the Request button on the right edge of the window. This will display a dialog that guides you through the steps of sending the request. Once the request is sent, enter the Domain Controller, accept the request, and then go back to the Edge Server to assign the approved certificate.
In contrast, the "External Edge certificate" must be provided by a Certificate Authority that is trusted by the host running the S4B Bridge. This may be a public CA such as Let's Encrypt, or any custom PKI scheme of your choosing. If using the latter, ensure that the root CA's certificate is installed on both the Edge Server host and the S4B Bridge host.
The "External Edge certificate" must contain these names:
- Subject Name: <edge>.<s4b-sipdomain>
- Subject Alternative Names:
  - DNS Name: <edge>.<s4b-sipdomain>
  - DNS Name: sip.<s4b-sipdomain>
- DNS Name:
Once the certificate is obtained, use the Certificate Wizard on the Edge Server to assign it.
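To confirm from the bridge host that the Edge Server presents this certificate with the expected names, you can inspect the Access Edge port; a sketch, assuming OpenSSL 1.1.1 or newer and that port 5061 is reachable:

# Print the subject and Subject Alternative Names of the External Edge certificate
openssl s_client -connect <edge>.<s4b-sipdomain>:5061 -servername sip.<s4b-sipdomain> </dev/null 2>/dev/null | openssl x509 -noout -subject -ext subjectAltName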
Restart to apply changes
Changes to the server topology require restarting system services on both the Front End Server and Edge Server. To do so, open the Management Shell on each server, and run these commands:
- Run Stop-CsWindowsService on the Edge Server, and wait for it to complete.
- Run Stop-CsWindowsService on the Front End Server, and wait for it to complete.
- Run Start-CsWindowsService on the Front End Server, and wait for it to complete.
- Run Start-CsWindowsService on the Edge Server, and wait for it to complete.
Federation settings
With the topology in place, the S4B Server deployment may now be configured to allow federation with your S4B Bridge. Federation settings may be applied on the Front End Server either in the web admin panel at https://<frnt>.<s4b-intdomain>/macp
, or via Powershell commands in the Management Shell. This section lists each setting that must be applied in the web admin panel, followed by its equivalent Powershell in the Management Shell.
Log into the admin panel using the credentials of your Windows account on the Front End Server, and expand the "Federation and External Access" section on the left sidebar. Then, navigate to the following sections and apply these settings:
- External Access Policy:
  - In either the Global policy or a site-level policy for your S4B site:
    - "Enable communications with federated users"
  - Powershell:
    - To edit the Global policy: Set-CsExternalAccessPolicy -Identity Global -EnableFederationAccess $True
    - To create & configure a site-level policy: New-CsExternalAccessPolicy -Identity Site:<your_site_name> -EnableFederationAccess $True
- Access Edge Configuration:
  - In the Global policy (the only option available):
    - "Enable federation and public IM connectivity"
    - Optional: "Enable partner domain discovery": Enable this if you would rather have federation be managed dynamically instead of having to explicitly add the SIP domain of your bridge to your S4B Server's allowlist of federated domains. For this to work, you must register a DNS SRV record for your bridge's SIP domain (see the section on bridge domains and certificates). However, adding the bridge's domain to your S4B Server's allowlist is still necessary to prevent the bridge's traffic from being rate-limited.
  - Powershell: Set-CsAccessEdgeConfiguration -AllowFederatedUsers $True [-EnablePartnerDiscovery $True -DiscoveredPartnerVerificationLevel "AlwaysVerifiable"]
- SIP Federated Domains:
  - Add your S4B Bridge's SIP domain as an Allowed Domain:
    - Domain name (or FQDN): <bridge-sipdomain>
    - Access Edge service (FQDN):
      - If you registered a DNS SRV record of _sipfederationtls._tcp.<bridge-sipdomain>, leave this blank.
      - Otherwise, set this to <bridge-sipdomain>.
  - Powershell: New-CsAllowedDomain -Identity "<bridge-sipdomain>" -Comment "<any-name-of-your-choice>"
To verify any of these settings in Powershell, replace New- or Set- in any of the issued commands with Get-. To unapply a setting, use Remove-.
These changes may take some time before they get applied. When in doubt, restart all services by running Stop-CsWindowsService then Start-CsWindowsService in the S4B Server Management Shell on both the Front End Server and the Edge Server.
Contact mapping
Matrix users in S4B
Once a S4B Server is connected to an instance of the bridge, a Matrix user may be added to a S4B user's contact list as a "Contact Not in My Organization". The S4B desktop client provides this action via the "Add a contact" button, which is on the right edge of the main window just below the contact search bar.
Proceeding will display a prompt to set the IM Address of the contact to be added. Technically, an IM Address is a SIP address without the leading sip: scheme.
The IM Address of a Matrix user managed by the bridge is derived from the user's MXID, and has the following mapping:
@username:matrixdomain → username+homeserver@bridge-sipdomain

- username: the "localpart" of the MXID.
- matrixdomain: the domain name of the Matrix user's homeserver.
- bridge-sipdomain: the SIP domain of the bridge (which may differ from the homeserver domain).
S4B users in Matrix
S4B users are represented in Matrix by virtual "ghost" users managed by the bridge. The MXID of a virtual S4B user is derived from the "Bridge > User Prefix" setting (from the bridge's Integrations configuration page in the Installer) and the IM Address (i.e. the SIP Address) of the virtual user's corresponding S4B user, and has the following mapping:
username@s4b-sipdomain → @<user-prefix>sip=3ausername=40s4b-sipdomain:matrixdomain

- <user-prefix>: the value of "Bridge > User Prefix" from the bridge's configuration. The default value is _s4b_.
- sip=3a: the URL encoding of the sip: scheme of an IM Address (with an escape character of = instead of the typical %), encoded so as to not conflict with the : belonging to the MXID.
  - Note that despite S4B using TLS for all SIP traffic, the IM Addresses of S4B contacts never use the sips: scheme.
- username: the "localpart" of the IM Address.
- =40: the URL encoding of the @ character of the IM Address, encoded so as to not conflict with the @ belonging to the MXID.
- s4b-sipdomain: the SIP domain of the S4B Server.
- matrixdomain: the domain name of the homeserver that the bridge is registered with.
Thus, with a <user-prefix> of _s4b_, the IM Address to MXID mapping is:

username@s4b-sipdomain → @_s4b_sip=3ausername=40s4b-sipdomain:matrixdomain
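As a purely illustrative sketch of this mapping (the address alice@contoso.example and the homeserver domain matrix.example are made up), the following bash snippet derives the ghost MXID for an S4B contact with the default prefix:

im_address="alice@contoso.example"   # IM Address of the S4B user (placeholder)
matrixdomain="matrix.example"        # homeserver domain the bridge is registered with (placeholder)
# Replace the @ with =40 and prepend the encoded sip: scheme and the default user prefix
echo "@_s4b_sip=3a${im_address/@/=40}:${matrixdomain}"
# prints: @_s4b_sip=3aalice=40contoso.example:matrix.example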
Migration from self-hosted to ESS On-Premise
Notes
Migrate from self-hosted to Element Server Suite (On-Premise)
Preparation
This section outlines what you should do ahead of the migration in order to ensure the migration goes as quickly as possible and without issues.
- At the latest 48 hours before your migration is scheduled, set the TTL on any DNS records that need to be updated to the lowest allowed value.
- Upgrade your Synapse to the same version as ESS is running. Generally this will be the latest stable release. https://element.ems.host/_matrix/federation/v1/version is a good indicator.
- This is not required, but if your Synapse version is not the same as the EMS version, your migration will take longer.
- Check the size of your database:
  - PostgreSQL: Connect to your database and issue the command \l+
- Check the size of your media repository and report to your EMS contact.
  - Synapse Media Store: du -hs /path/to/synapse/media_store/
  - Matrix Media Repo: https://github.com/turt2live/matrix-media-repo/blob/master/docs/admin.md#per-server-usage
- If you are using SQLite instead of PostgreSQL, you should port your database to PostgreSQL by following this guide before dumping your database
SSH to your matrix server
You might want to run everything in a tmux
or a screen
session to avoid disruption in case of a lost SSH connection.
Upgrade Synapse to the same version ESS is running
Follow https://element-hq.github.io/synapse/latest/upgrade.html
Start Synapse, make sure it's happy, then stop Synapse
Create a folder to store everything
mkdir -p /tmp/synapse_export
cd /tmp/synapse_export
The guide from here on assumes your current working directory is /tmp/synapse_export
.
Set restrictive permissions on the folder
If you are working as root: (otherwise set restrictive permissions as needed):
chmod 000 /tmp/synapse_export
Copy Synapse config
Get the following files:
- Your Synapse configuration file (usually homeserver.yaml)
- Your message signing key.
  - This is stored in a separate file. See the Synapse config file [homeserver.yaml] for the path. The variable is signing_key_path: https://element-hq.github.io/synapse/latest/usage/configuration/config_documentation.html#signing_key_path
- Grab macaroon_secret_key from homeserver.yaml and place it in "Secrets \ Synapse \ Macaroon"
Stop Synapse
DO NOT START IT AGAIN AFTER THIS
Doing so can cause issues with federation and inconsistent data for your users.
While you wait for the database to export or files to transfer, you should edit or create the well-known files and DNS records to point to your new ESS host. This can take a while to update, so it should be done as soon as possible in order to ensure your server will function properly when the migration is complete.
Database export
PostgreSQL
Dump, compress
Replace:
- <dbhost> (IP or FQDN of your database server)
- <dbusername> (username for your Synapse database)
- <dbname> (the name of the Synapse database)
pg_dump -Fc -O -h <dbhost> -U <dbusername> -d <dbname> -W -f synapse.dump
Setup new host
[LINK TO ON_PREM SETUP DOCS]
Import DB
Enter a bash shell on the Synapse postgres container:
kubectl exec -it -n element-onprem synapse-postgres-0 --container postgres -- /bin/bash
Then, on the postgres container shell, connect to the database:
psql -U synapse_user synapse
THE FOLLOWING COMMAND WILL ERASE THE EXISTING SYNAPSE DATABASE WITHOUT WARNING OR CONFIRMATION. PLEASE ENSURE THAT IT IS THE CORRECT DB AND THERE IS NO PRODUCTION DATA ON IT
DO $$ DECLARE
r RECORD;
BEGIN
FOR r IN (SELECT tablename FROM pg_tables WHERE schemaname = current_schema()) LOOP
EXECUTE 'DROP TABLE ' || quote_ident(r.tablename) || ' CASCADE';
END LOOP;
END $$;
DROP sequence cache_invalidation_stream_seq;
DROP sequence state_group_id_seq;
DROP sequence user_id_seq;
DROP sequence account_data_sequence;
DROP sequence application_services_txn_id_seq;
DROP sequence device_inbox_sequence;
DROP sequence event_auth_chain_id;
DROP sequence events_backfill_stream_seq;
DROP sequence events_stream_seq;
DROP sequence presence_stream_sequence;
DROP sequence receipts_sequence;
DROP sequence un_partial_stated_event_stream_sequence;
DROP sequence un_partial_stated_room_stream_sequence;
\q to quit
on host:
gzip -d synapse_export.sql.gz
sudo cp synapse_export.sql /data/postgres/synapse/
on pod:
cd /var/lib/postgresql/data
pg_restore <connection> --no-owner --role=<new role> -d <new db name> dump.sql
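The file names above are left as they appear in the original runbook. As a hedged end-to-end sketch that matches the pg_dump command earlier in this guide (a custom-format dump named synapse.dump, and the in-cluster Postgres data directory mounted from /data/postgres/synapse on the host, as implied by the commands above), the restore could look like this:

# On the host: place the dump where the postgres pod can see it
sudo cp synapse.dump /data/postgres/synapse/
# From the host: run pg_restore inside the postgres container against the emptied database
kubectl exec -it -n element-onprem synapse-postgres-0 --container postgres -- \
  pg_restore -U synapse_user --no-owner -d synapse /var/lib/postgresql/data/synapse.dump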
Configuring Synapse workers
From the Installer's Synapse page, scroll down to Synapse workers view.
Click on Add Workers
You have to select a Worker Type. Here are the workers which can be useful to you:
- Pushers: if you experience slowness with notifications being sent to clients
- Client-Reader: if you experience slowness when clients log in and sync their chat rooms
- Synchrotron: if you experience slowness when rooms are active
- Federation-x: if you are working in a federated setup, you might want to dedicate federation to workers.
If you are experiencing resource congestion, you can try to reduce the resources requested by each worker. Be aware that:
- if the node runs out of memory, it will try to kill containers which are consuming more than what they requested
- if a container consumes more than its memory limit, it will be automatically killed by the node, even if there is free memory left.
You will need to re-run the installer after making these changes for them to take effect.
Setting up Delegated Authentication with LDAP on Windows AD
In the installer, set the following fields:
- Base: the distinguished name of the root level Org Unit in your LDAP directory. The distinguished name can be displayed by selecting View / Advanced Features in the Active Directory console and then, right-clicking on the object, selecting Properties / Attributes Editor.
- Bind Dn: the distinguished name of the LDAP account with read access.
- Filter: an LDAP filter to filter out objects under the LDAP Base DN.
- Uri: the URI of your LDAP server (often your Domain Controller). You can use ldaps:// for SSL connectivity. The following are the typical ports for Windows AD LDAP servers:
  - ldap://ServerName:389
  - ldaps://ServerName:636
- LDAP Bind Password: the password of the AD account with read access.
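Before saving, it can help to test the Base, Bind Dn, Filter and Uri values from any Linux host with the OpenLDAP client tools installed; a sketch with made-up distinguished names:

# -H/-D/-b should mirror the Uri, Bind Dn and Base fields; -W prompts for the LDAP Bind Password
ldapsearch -H ldaps://ServerName:636 \
  -D "CN=element-bind,OU=Service Accounts,DC=example,DC=com" -W \
  -b "OU=Users,DC=example,DC=com" "(objectClass=user)" sAMAccountName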
Setting up Delegated Authentication with OpenID on Microsoft Azure
Before setting up the installer, you have to configure Microsoft Azure Active Directory.
Set up Microsoft Azure Active Directory
- You need to create an
App registration
. - You have to select
Redirect URI (optional)
and set it to https://matrix.your-domain.com/_synapse/client/oidc/callback
For the bridge to be able to operate correctly, navigate to API permissions, add Microsoft Graph APIs, choose Delegated Permissions and add
- openid
- profile
Remember to grant the admin consent for those.
To setup the installer, you'll need
- the
Application (client) ID
- the
Directory (tenant) ID
- a secret generated from
Certificates & secrets
on the app.
Configure the installer
Add an OIDC provider in the 'Synapse' configuration after enabling Delegated Auth
and set the following fields in the installer:
-
Allow Existing Users
: if checked, it allows a user logging in via OIDC to match a pre-existing account instead of failing. This could be used if switching from password logins to OIDC. -
Authorization Endpoint
: the oauth2 authorization endpoint. Required if provider discovery is disabled.https://login.microsoftonline.com/<Directory (tenant) ID>/oauth2/v2.0/authorize
-
Backchannel Logout Enabled: Synapse supports receiving OpenID Connect Back-Channel Logout notifications. This lets the OpenID Connect Provider notify Synapse when a user logs out, so that Synapse can end that user's session. For this to work, https://your-domain/_synapse/client/oidc/backchannel_logout has to be set as the back-channel logout URL in your identity provider.
-
Client Auth Method
: auth method to use when exchanging the token. Set it toClient Secret Post
or any method supported by your Idp -
Client ID
: yourApplication (client) ID
-
Discover
: enable/disable the use of the OIDC discovery mechanism to discover endpoints -
Idp Brand
: an optional brand for this identity provider, allowing clients to style the login flow according to the identity provider in question -
Idp ID
: a string identifying your identity provider in your configuration -
Idp Name
: A user-facing name for this identity provider, which is used to offer the user a choice of login mechanisms in the Element UI. In the screenshot below, Idp Name is set to Azure AD
-
Issuer
: the OIDC issuer. Used to validate tokens and (if discovery is enabled) to discover the provider's endpointshttps://login.microsoftonline.com/<Directory (tenant) ID>/v2.0
-
Token Endpoint
: the oauth2 token endpoint. Required if provider discovery is disabled. -
Client Secret
: your secret value defined under "Certificates and secrets"
- Scopes: add every scope on a different line
- The openid scope is required which translates to the Sign you in permission in the consent UI
- You might also include other scopes in this request for requesting consent.
-
User Mapping Provider: Configuration for how attributes returned from an OIDC provider are mapped onto a Matrix user.
-
Localpart Template
: Jinja2 template for the localpart of the MXID. Set it to{{ user.preferred_username.split('@')[0] }}
for Azure AD -
Display Name Template
: Jinja2 template for the display name to set on first login. If unset, no displayname will be set. Set it to{{ user.name }}
for Azure AD
Other configurations are documented here.
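If Discover is enabled, Synapse reads the endpoints from the issuer's discovery document; you can fetch it yourself to double-check the tenant ID and endpoints. Replace <tenant-id> with your Directory (tenant) ID:

curl -s "https://login.microsoftonline.com/<tenant-id>/v2.0/.well-known/openid-configuration"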
Setting up Delegated Authentication with OpenID on Microsoft AD FS
Install Microsoft AD FS
Before starting the installation, make sure:
- your Windows computer name is correct since you won't be able to change it after having installed AD FS
- you configured your server with a static IP address
- your server joined a domain and your domain is defined under Server Manager > Local server
- you can resolve your server FQDN like computername.my-domain.com
You can find a checklist here.
Steps to follow:
- Install AD CS (Certificate Server) to issue valid certificates for AD FS. AD CS provides a platform for issuing and managing public key infrastructure [PKI] certificates.
- Install AD FS (Federation Server)
Install AD CS
You need to install the AD CS Server Role.
- Follow this guide.
Obtain and Configure an SSL Certificate for AD FS
Before installing AD FS, you are required to generate a certificate for your federation service. The SSL certificate is used for securing communications between federation servers and clients.
- Follow this guide.
- Additionally, this guide provides more details on how to create a certificate template.
Install AD FS
You need to install the AD FS Role Service.
- Follow this guide.
Configure the federation service
AD FS is installed but not configured.
- Click on
Configure the federation service on this server
underPost-deployment configuration
in theServer Manager
- Ensure Create the first federation server in a federation server farm is selected
- Click
Next
- Select the SSL Certificate and set a Federation Service Display Name
- On the Specify Service Account page, you can either Create a Group Managed Service Account (gMSA) or Specify an existing Service or gMSA Account
- Choose your database
- Review Options , check prerequisites are completed and click on
Configure
- Restart the server
Add AD FS as an OpenID Connect identity provider
To enable sign-in for users with an AD FS account, create an Application Group in your AD FS.
To create an Application Group, follow theses steps:
- In
Server Manager
, selectTools
, and then selectAD FS Management
- In AD FS Management, right-click on
Application Groups
and selectAdd Application Group
- On the Application Group Wizard
Welcome
screen- Enter the Name of your application
- Under
Standalone applications
section, selectServer application
and clickNext
- Enter https://<matrix domain>/_synapse/client/oidc/callback (e.g. https://matrix.domain.com/_synapse/client/oidc/callback) in the Redirect URI: field and click Add. Save the Client Identifier somewhere, as you will need it when setting up Element, then click Next
- Select
Generate a shared secret
checkbox and make a note of the generated Secret and pressNext
(Secret needs to be added in the Element Installer GUI in a later step)
- Right click on the created Application Group and select Properties
- Select
Add application...
button. - Select
Web API
- In the
Identifier
field, type in theclient_id
you saved before and clickNext
- Select
Permit everyone
and clickNext
- Under Permitted scopes: select
openid
andprofile
and clickNext
- On
Summary
page, click Next
- Click
Close
and thenOK
Export Domain Trusted Root Certificate
- Run
mmc.exe
- Add the
Certificates
snap-in- File/Add snap-in for
Certificates
,Computer account
- File/Add snap-in for
- Under
Trusted Root Certification Authorities
/Certificates
, select your DC cert - Right click and select
All Tasks
/Export...
and export asBase-64 encoded X 509 (.CER)
- Copy file to local machine
Configure the installer
Add an OIDC provider in the 'Synapse' configuration after enabling Delegated Auth
and set the following fields in the installer:
-
Allow Existing Users
: if checked, it allows a user logging in via OIDC to match a pre-existing account instead of failing. This could be used if switching from password logins to OIDC. -
Authorization Endpoint
: the oauth2 authorization endpoint. Required if provider discovery is disabled. For AD FS this is typically https://<your-adfs.domain.com>/adfs/oauth2/authorize/
-
Backchannel Logout Enabled
: Synapse supports receiving OpenID Connect Back-Channel Logout notifications. This lets the OpenID Connect Provider notify Synapse when a user logs out, so that Synapse can end that user session. -
Client Auth Method
: auth method to use when exchanging the token. Set it toClient Secret Basic
or any method supported by your Idp -
Client ID
: theClient ID
you saved before -
Discover
: enable/disable the use of the OIDC discovery mechanism to discover endpoints -
Idp Brand
: an optional brand for this identity provider, allowing clients to style the login flow according to the identity provider in question -
Idp ID
: a string identifying your identity provider in your configuration -
Idp Name
: A user-facing name for this identity provider, which is used to offer the user a choice of login mechanisms in the Element UI. In the screenshot below, Idp Name is set to Azure AD
-
Issuer
: the OIDC issuer. Used to validate tokens and (if discovery is enabled) to discover the provider's endpointshttps://<your-adfs.domain.com>/adfs/
-
Token Endpoint
: the oauth2 token endpoint. Required if provider discovery is disabled. -
Client Secret
: your client secret you saved before. - Scopes: add every scope on a different line
- The openid scope is required which translates to the Sign you in permission in the consent UI
- You might also include other scopes in this request for requesting consent.
- User Mapping Provider: Configuration for how attributes returned from a OIDC provider are mapped onto a matrix user.
-
Localpart Template
: Jinja2 template for the localpart of the MXID. Set it to{{ user.upn.split('@')[0] }}
for AD FS
-
Other configurations are documented here.
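As with Azure, you can fetch the AD FS discovery document to confirm the issuer and endpoints before filling in the installer (use -k, or --cacert with your exported root certificate, if the AD CS root is not trusted by the machine you run this from):

curl -sk https://<your-adfs.domain.com>/adfs/.well-known/openid-configuration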
Getting Started with the Enterprise Helm Charts
Introduction
This document will walk you through how to get started with our Element Server Suite Helm Charts. These charts are provided to be used in environments which typically deploy applications by helm charts. If you are unfamiliar with helm charts, we'd highly recommend that you start with our Enterprise Installer.
General concepts
ESS deployments rely on the following components to deploy the workloads on a kubernetes cluster:
- Updater: reads an ElementDeployment CRD manifest, and generates the associated individual Element CRD manifests linked together
- Operator: reads the individual Element CRD manifests to generate the associated kubernetes workloads
- ElementDeployment: this CRD is a simple structure following the pattern:
spec:
global:
k8s:
# Global settings that will be applied by default to all workloads if not forced locally. This is where you will be able to configure a default ingress certificate, default number of replicas on the deployments, etc.
config:
# Global configuration that can be used by every element component
secretName: # The global secret name. Required secrets keys can be found in the description of this field using `kubectl explain`. Every config named `<foo>SecretKey` will point to a secret key containing the secret targetted by this secret name.
components:
<component name>:
k8s:
# Local kubernetes configuration of this component. You can override here the global values to force a certain behaviour for each components.
config:
# This component configuration
secretName: # The component secret name containing secret values. Required secrets keys can be found in the description of this field using `kubectl explain`. Every config named `<foo>SecretKey` will point to a secret key containing the secret targetted by this secret name.
<another component>:
...
Any change to the ElementDeployment manifest deployed in the namespace will trigger a reconciliation loop. This loop will update the Element manifests read by the Operator. It will again trigger a reconciliation loop in the Operator process, which will update kubernetes workloads accordingly.
If you manually change a workload, it will trigger a reconciliation loop and the Operator will override your change on the workload.
The deployment must be managed only through the ElementDeployment CRD.
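In practice this means configuration changes are made by editing or re-applying the CR itself rather than the generated workloads; a brief sketch using the deployment name and namespace used later in this document:

# Edit the CR in place and let the Updater/Operator reconcile the change
kubectl -n element-onprem edit elementdeployment first-element-deployment
# Or re-apply a manifest kept under version control
kubectl -n element-onprem apply -f elementdeployment.yaml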
Installing the Operator and the Updater helm charts
We advise you to deploy the helm charts in one of the following deployment models:
- Cluster-wide deployment: in this mode, the CRDs Conversion Webhook and the controller managers are deployed in their own namespace, separated from ESS deployments. They are able to manage ESS deployments in any namespace of the cluster. The install and the upgrade of the helm chart require cluster admin permissions.
- Namespace-scoped deployment: in this mode, only the CRDs conversion webhooks require cluster admin permissions. The controller managers are deployed directly in the namespace of the element deployment. The install and the upgrade of ESS do not require cluster admin permissions if the CRDs do not change.
All-in-one deployment (Requires cert-manager)
When cert-manager is present in the cluster, it is possible to use the all-in-one ess-system
helm chart to deploy the operator and the updater.
First, let's add the ess-system repository to helm, replace ems_image_store_username
and ems_image_store_token
with the values provided to you by Element.
helm repo add ess-system https://registry.element.io/helm/ess-system --username <ems_image_store_username> --password '<ems_image_store_token>'
Cluster-wide deployment
When deploying ESS-System as a cluster-wide deployment, updating ESS requires ClusterAdmin permissions.
Create the following values file :
emsImageStore:
username: <username>
password: <password>
element-operator:
clusterDeployment: true
deployCrds: true # Deploys the CRDs and the Conversion Webhooks
deployCrdRoles: true # Deploys roles to give permissions to users to manage specific ESS CRs
deployManager: true # Deploys the controller managers
element-updater:
clusterDeployment: true
deployCrds: true # Deploys the CRDs and the Conversion Webhooks
deployCrdRoles: true # Deploys roles to give permissions to users to manage specific ESS CRs
deployManager: true # Deploys the controller managers
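With this values file saved (for example as values.yaml), the chart can then be installed. The chart name ess-system inside the repository is an assumption here, so verify it first with helm search:

# List the charts published in the repository added above (chart name assumed to be ess-system)
helm search repo ess-system
helm install ess-system ess-system/ess-system --namespace ess-system --create-namespace -f values.yaml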
Namespace-scoped deployment
When deploying ESS-System as a namespace-scoped deployment, you have to deploy ess-system
in two parts :
- One for the CRDs and the conversion webhooks. This part will be managed with ClusterAdmin permissions. These update less often.
- One for the controller managers. This part will be managed with namespace-scoped permissions.
In this mode, the ElementDeployment
CR is deployed in the same namespace as the controller-managers.
Create the following values file to deploy the CRDs and the conversion webhooks :
emsImageStore:
username: <username>
password: <password>
element-operator:
clusterDeployment: true
deployCrds: true # Deploys the CRDs and the Conversion Webhooks
deployCrdRoles: false # Deploys roles to give permissions to users to manage specific ESS CRs
deployManager: false # Deploys the controller managers
element-updater:
clusterDeployment: true
deployCrds: true # Deploys the CRDs and the Conversion Webhooks
deployCrdRoles: false # Deploys roles to give permissions to users to manage specific ESS CRs
deployManager: false # Deploys the controller managers
Create the following values file to deploy the controller managers in their namespace :
emsImageStore:
username: <username>
password: <password>
element-operator:
clusterDeployment: false
deployCrds: false # Deploys the CRDs and the Conversion Webhooks
deployCrdRoles: false # Deploys roles to give permissions to users to manage specific ESS CRs
deployManager: true # Deploys the controller managers
element-updater:
clusterDeployment: false
deployCrds: false # Deploys the CRDs and the Conversion Webhooks
deployCrdRoles: false # Deploys roles to give permissions to users to manage specific ESS CRs
deployManager: true # Deploys the controller managers
Without cert-manager present on the cluster
First, let's add the element-updater and element-operator repositories to helm, replace ems_image_store_username
and ems_image_store_token
with the values provided to you by Element.
helm repo add element-updater https://registry.element.io/helm/element-updater --username <ems_image_store_username> --password '<ems_image_store_token>'
helm repo add element-operator https://registry.element.io/helm/element-operator --username <ems_image_store_username> --password '<ems_image_store_token>'
Now that we have the repositories configured, we can verify this by:
helm repo list
and should see the following in that output:
NAME URL
element-operator https://registry.element.io/helm/element-operator
element-updater https://registry.element.io/helm/element-updater
N.B. This guide assumes that you are using the element-updater and element-operator namespaces. You can call them whatever you want, and if they don't exist yet, you can create them with: kubectl create ns <name>.
Generating an image pull secret with EMS credentials
To generate an ems-credentials secret to be used by your helm chart deployment, you will need to generate an authentication token and place it in a secret.
kubectl create secret -n element-updater docker-registry ems-credentials --docker-server=registry.element.io --docker-username=<EMSusername> --docker-password=<EMStoken>
kubectl create secret -n element-operator docker-registry ems-credentials --docker-server=registry.element.io --docker-username=<EMSusername> --docker-password=<EMStoken>
The conversion webhooks need their own self-signed CA and TLS certificate to be integrated into kubernetes.
For example using easy-rsa
:
easyrsa init-pki
easyrsa --batch "--req-cn=ESS-CA`date +%s`" build-ca nopass
easyrsa --subject-alt-name="DNS:element-operator-conversion-webhook.element-operator"\
--days=10000 \
build-server-full element-operator-conversion-webhook nopass
easyrsa --subject-alt-name="DNS:element-updater-conversion-webhook.element-updater"\
--days=10000 \
build-server-full element-updater-conversion-webhook nopass
Create a secret for each of these two certificates :
kubectl create secret tls element-operator-conversion-webhook --cert=pki/issued/element-operator-conversion-webhook.crt --key=pki/private/element-operator-conversion-webhook.key --namespace element-operator
kubectl create secret tls element-updater-conversion-webhook --cert=pki/issued/element-updater-conversion-webhook.crt --key=pki/private/element-updater-conversion-webhook.key --namespace element-updater
Installing the helm chart for the element-updater
and the element-operator
Create the following values file to deploy the controller managers in their namespace :
values.element-operator.yml
:
clusterDeployment: true
deployCrds: true # Deploys the CRDs and the Conversion Webhooks
deployCrdRoles: true # Deploys roles to give permissions to users to manage specific ESS CRs
deployManager: true # Deploys the controller managers
crds:
conversionWebhook:
caBundle: # Paste here the content of `base64 pki/ca.crt -w 0`
tlsSecretName: element-operator-conversion-webhook
imagePullSecret: ems-credentials
operator:
imagePullSecret: ems-credentials
values.element-updater.yml
:
clusterDeployment: true
deployCrds: true # Deploys the CRDs and the Conversion Webhooks
deployCrdRoles: true # Deploys roles to give permissions to users to manage specific ESS CRs
deployManager: true # Deploys the controller managers
crds:
conversionWebhook:
caBundle: # Paste here the content of `base64 pki/ca.crt -w 0`
tlsSecretName: element-updater-conversion-webhook
updater:
imagePullSecret: ems-credentials
Run the helm install commands:
helm install element-operator element-operator/element-operator --namespace element-operator -f values.element-operator.yml
helm install element-updater element-updater/element-updater --namespace element-updater -f values.element-updater.yml
Now at this point, you should have the following 4 containers up and running:
[user@helm ~]$ kubectl get pods -n element-operator
NAMESPACE NAME READY STATUS RESTARTS AGE
element-operator element-operator-controller-manager-c8fc5c47-nzt2t 2/2 Running 0 6m5s
element-operator element-operator-conversion-webhook-7477d98c9b-xc89s 1/1 Running 0 6m5s
[user@helm ~]$ kubectl get pods -n element-updater
NAMESPACE NAME READY STATUS RESTARTS AGE
element-updater element-updater-controller-manager-6f8476f6cb-74nx5 2/2 Running 0 106s
element-updater element-updater-conversion-webhook-65ddcbb569-qzbfs 1/1 Running 0 81s
Generating the ElementDeployment CR to Deploy Element Server Suite
Using the ess-stack helm-chart
The ess-stack
helm chart is available in the ess-system
repository :
helm repo add ess-system https://registry.element.io/helm/ess-system --username <ems_image_store_username> --password '<ems_image_store_token>'
It will deploy an ElementDeployment CR and its associated secrets from the chart values file.
The values file will contain the following structure :
- Available Components & Global settings can be found under https://ess-schemas-docs.element.io
- For each SecretKey variable, the value will point to a secret key under secrets. For example, components.synapse.config.macaroonSecretKey is macaroon, so a macaroon secret must exist under secrets.synapse.content.
emsImageStore:
username: <username>
password: <password>
secrets:
global:
content:
genericSharedSecret: # generic shared secret
synapse:
content:
macaroon: # macaroon
postgresPassword: # postgres password
registrationSharedSecret: # registration shared secret
# globalOptions contains the global properties of the ELementDeployment CRD
globalOptions:
config:
domainName: # your base domain
k8s:
ingresses:
tls:
mode: certmanager
certmanager:
issuer: letsencrypt
workloads:
replicas: 1
components:
elementWeb:
k8s:
ingress:
fqdn: # element web fqdn
synapse:
config:
media:
volume:
size: 5Gi
postgresql:
database: # postgres database
host: # postgres host
port: 5432
user: # postgres user
k8s:
ingress:
fqdn: # synapse fqdn
wellKnownDelegation:
config: {}
k8s: {}
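Once your values file is ready (for example values.yaml), the stack can be installed into the target namespace; the element-onprem namespace below is an assumption, and you can confirm the chart name with helm search first:

helm search repo ess-system
helm install ess-stack ess-system/ess-stack --namespace element-onprem --create-namespace -f values.yaml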
Writing your own ElementDeployment CR
Here is a small sample to deploy the basic components using your own certificate files. This is provided as an example, as ElementDeployment supports a whole range of configuration options that you can explore in:
- The documentation website at https://ess-schemas-docs.element.io
- the GUI
- through
kubectl explain
command :kubectl explain elementdeployment.matrix.element.io.spec.components
apiVersion: matrix.element.io/v1alpha1
kind: ElementDeployment
metadata:
name: <element_deployment_name>
namespace: <target namespace>
spec:
global:
k8s:
ingresses:
ingressClassName: "public"
workloads:
dockerSecrets:
- name: dockerhub
url: docker.io
- name: element-registry
url: registry.element.io
storage:
storageClassName: "standard"
secretName: global
config:
genericSharedSecretSecretKey: genericSharedSecret
domainName: "deployment.tld"
components:
elementWeb:
secretName: external-elementweb-secrets
k8s:
ingress:
tls:
mode: certfile
certificate:
certFileSecretKey: eleweb.tls
privateKeySecretKey: eleweb.crt
fqdn: element-web.tld
synapse:
secretName: external-synapse-secrets
config:
maxMauUsers: 100
media:
volume:
size: 1
postgresql:
host: "<postgresql server>"
user: "<user>"
database: "<db>"
passwordSecretKey: pgpassword
sslMode: disable
k8s:
ingress:
fqdn: synapse.tld
tls:
mode: certfile
certificate:
certFileSecretKey: synapse.tls
privateKeySecretKey: synapse.crt
wellKnownDelegation:
secretName: external-wellknowndelegation-secrets
k8s:
ingress:
tls:
mode: certfile
certificate:
certFileSecretKey: wellknown.tls
privateKeySecretKey: wellknown.crt
To inject secret values in the CR, you will have to create the following secrets:
- name: global with data key genericSharedSecret containing any random value. It will be used as a seed for all secrets generated by the updater.
- name: external-elementweb-secrets with data keys eleweb.tls containing the element web private key and eleweb.crt containing the element web certificate.
- name: external-synapse-secrets with data keys synapse.tls containing the synapse private key and synapse.crt containing the synapse certificate. You will also need pgpassword with the postgres password. All attributes pointing to Secret Keys have a default value, and in this example we are relying on the default values of config.macaroonSecretKey: macaroon, config.registrationSharedSecretSecretKey: registrationSharedSecret, config.signingKeySecretKey: signingKey and the config.adminPasswordSecretKey pointing to adminPassword in the secret key.
- name: external-wellknowndelegation-secrets with data keys wellknown.tls containing the well known delegation private key and wellknown.crt containing the well known delegation certificate.
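As an illustrative sketch of creating these secrets with kubectl (the file names and literal values are placeholders; the data keys follow the list above):

kubectl -n <target namespace> create secret generic global \
  --from-literal=genericSharedSecret="$(openssl rand -hex 32)"
kubectl -n <target namespace> create secret generic external-synapse-secrets \
  --from-file=synapse.tls=<file-for-synapse.tls> \
  --from-file=synapse.crt=<file-for-synapse.crt> \
  --from-literal=pgpassword='<postgres password>'
# Repeat the same pattern for external-elementweb-secrets and external-wellknowndelegation-secrets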
Once the CRD and the Secrets are deployed to the namespace, the Updater will be able to create all the resources handled by the Operator, which will then deploy the workloads on your kubernetes cluster.
Loading docker secrets into kubernetes in preparation of deployment
N.B. This guide assumes that you are using the element-onprem
namespace for deploying Element. You can call it whatever you want and if it doesn't exist yet, you can create it with: kubectl create ns element-onprem
.
Now we need to load secrets into kubernetes so that the deployment can access them. If you built your own CRD from scratch, you will need to follow our Element Deployment CRD documentation.
kubectl create secret -n element-onprem docker-registry ems-image-store --docker-server=registry.element.io --docker-username=<EMSusername> --docker-password=<EMStoken>
Checking deployment progress
To check on the progress of the deployment, you will first watch the logs of the updater:
kubectl logs -f -n element-updater element-updater-controller-manager-<rest of pod name>
You will have to tab complete to get the correct hash for the element-updater-controller-manager pod name.
Once the updater is no longer pushing out new logs, you can track progress with the operator or by watching pods come up in the element-onprem
namespace.
Operator status:
kubectl logs -f -n element-operator element-operator-controller-manager-<rest of pod name>
Watching reconciliation move forward in the element-onprem
namespace:
kubectl get elementdeployment -n element-onprem -w -o yaml | grep dependentCRs -A20
Watching pods come up in the element-onprem
namespace:
kubectl get pods -n element-onprem -w
Automating ESS Deployment
The .element-enterprise-server
Directory
When you first run the installer binary, it will create a directory in your home folder, ~/.element-enterprise-server
. This is where you'll find everything the installer uses / generates as part of the installation including your configuration, the installer itself and logs.
As you run through the GUI, it will output config files within ~/.element-enterprise-server/config that will be used when you deploy. This is the best way to get started: before any automation effort, you should run through the installer and get a working config that suits your requirements.
This will generate the config files, which can then be modified as needed for your automation efforts. In order to understand how deployments can be automated, you should first understand what config is stored where.
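For orientation, the configuration files discussed below live under ~/.element-enterprise-server/config. A simplified layout (directory names other than config/ may vary between installer versions) looks like this:
~/.element-enterprise-server/
└── config/
    ├── cluster.yml
    ├── deployment.yml
    ├── secrets.yml
    └── legacy/
        └── ircbridge/bridge.yml   (one folder per legacy component, if configured)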
The cluster.yml
Config File
The Cluster YAML configuration file is populated with information used by all aspects of the installer. To start, you'll find apiVersion:, kind: and metadata, which are used by the installer itself to identify the version of your configuration file. When you switch to a new version of the installer, it will upgrade this config in line with the latest version's requirements.
apiVersion: ess.element.io/v1alpha1
kind: InstallerSettings
metadata:
  annotations:
    k8s.element.io/version: 2023-07.09-gui
  name: first-element-cluster
The configuration information is then stored in the spec:
section, for instance you'll see; your Postgres in cluster information; DNS Resolvers; EMS Token; etc. See the example below:
spec:
  connectivity:
    dockerhub: {}
  install:
    certManager:
      adminEmail: admin@example.com
    emsImageStore:
      password: examplesubscriptionpassword
      username: examplesubscriptionusername
    microk8s:
      dnsResolvers:
      - 8.8.8.8
      - 8.8.4.4
      postgresInCluster:
        hostPath: /data/postgres
    passwordsSeed: examplepasswordsseed
The deployment.yml
Config File
The Deployment YAML configuration file is populated with the bulk of the configuration for your deployment. As above, you'll find apiVersion:, kind: and metadata, which are used by the installer itself to identify the version of your configuration file. When you switch to a new version of the installer, it will upgrade this config in line with the latest version's requirements.
apiVersion: matrix.element.io/v1alpha1
kind: ElementDeployment
metadata:
  name: first-element-deployment
  namespace: element-onprem
The configuration is again found within the spec: section of this file, which itself has two main sections:
- components: which contains the configuration for each individual component, i.e. Element Web or Synapse
- global: which contains configuration required by all components, i.e. the root FQDN and Certificate Authority information
components:
First, each component has a named section, such as elementWeb, integrator, synapseAdmin, or in this example synapse:
synapse:
Within each component, there are two sections to organise the configuration:
- config: which is configuration of the component itself, i.e. whether Synapse registration is open or closed:

config:
  acceptInvites: manual
  adminPasswordSecretKey: adminPassword
  externalAppservices:
    configMaps: []
    files: {}
  federation:
    certificateAutoritiesSecretKeys: []
    clientMinimumTlsVersion: '1.2'
    trustedKeyServers: []
  log:
    rootLevel: Info
  macaroonSecretKey: macaroon
  maxMauUsers: 250
  media:
    maxUploadSize: 100M
    volume:
      size: 50Gi
  postgresql:
    passwordSecretKey: postgresPassword
    port: 5432
    sslMode: require
  registration: closed
  registrationSharedSecretSecretKey: registrationSharedSecret
  security:
    defaultRoomEncryption: not_set
  signingKeySecretKey: signingKey
  telemetry:
    enabled: true
    passwordSecretKey: telemetryPassword
    room: '#element-telemetry'
  urlPreview:
    config:
      acceptLanguage:
      - en
  workers: []

- k8s: which is configuration of the pod itself in Kubernetes, i.e. CPU and memory resource limits or the FQDN:

k8s:
  common:
    annotations: {}
  haproxy:
    workloads:
      annotations: {}
      resources:
        limits:
          memory: 200Mi
        requests:
          cpu: 1
          memory: 100Mi
      securityContext:
        fsGroup: 10001
        runAsUser: 10001
  ingress:
    annotations: {}
    fqdn: synapse.example.com
    services: {}
    tls:
      certmanager:
        issuer: letsencrypt
      mode: certmanager
  redis:
    workloads:
      annotations: {}
      resources:
        limits:
          memory: 50Mi
        requests:
          cpu: 200m
          memory: 50Mi
      securityContext:
        fsGroup: 10002
        runAsUser: 10002
  synapse:
    common:
      annotations: {}
    monitoring:
      serviceMonitor:
        deploy: auto
    storage: {}
    workloads:
      annotations: {}
      resources:
        limits:
          memory: 4Gi
        requests:
          cpu: 1
          memory: 2Gi
      securityContext:
        fsGroup: 10991
        runAsUser: 10991
secretName: synapse
global:
The global: section works just like components: above, split into two sections, config: and k8s:. It sets the default settings for all components; you can see an example below:
global:
  config:
    adminAllowIps:
    - 0.0.0.0/0
    - ::/0
    certificateAuthoritySecretKey: ca.pem
    domainName: example.com
    genericSharedSecretSecretKey: genericSharedSecret
    supportDnsFederationDelegation: false
    verifyTls: true
  k8s:
    common:
      annotations: {}
    ingresses:
      annotations: {}
      services:
        type: ClusterIP
      tls:
        certmanager:
          issuer: letsencrypt
        mode: certmanager
    monitoring:
      serviceMonitor:
        deploy: auto
    workloads:
      annotations: {}
      hostAliases: []
      replicas: 2
      securityContext:
        forceUidGid: auto
        setSecComp: auto
  secretName: global
The secrets.yml
Config File
The Secrets YAML configuration file is populated, as expected, with the secrets used for your configuration. It consists of multiple entries, separated by lines of ---, each following the format below:
apiVersion: v1
data:
  genericSharedSecret: Q1BoVmNIaEIzWUR6VVZjZXpkMXhuQnNubHhLVVlM
kind: Secret
metadata:
  name: global
  namespace: element-onprem
The main section of interest for automation purposes is the data: section. Here you will find a dictionary of secrets; in the example above you can see a genericSharedSecret and its value opposite.
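The values in the data: section are base64-encoded, as is standard for Kubernetes Secrets. A quick sketch of how to decode an existing value, or encode a new one before placing it in secrets.yml:
# decode an existing value
echo 'Q1BoVmNIaEIzWUR6VVZjZXpkMXhuQnNubHhLVVlM' | base64 -d
# encode a new value (note -n to avoid a trailing newline)
echo -n 'my-new-secret-value' | base64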
The legacy
Directory
The legacy directory stores configuration for specific components not yet updated to the new format within the components: section of the deployment.yml. Work is steadily progressing on updating these legacy components to the new format; in the meantime, you will find a folder for each legacy component here.
Within each component's folder, you will see a .yml file, which is where the configuration of that component is stored. For instance, if you set up the IRC Bridge, it will create ~/.element-enterprise-server/config/legacy/ircbridge with bridge.yml inside. You can use the Integrations and Add-Ons chapter of our documentation for guidance on how these files are configured. Using the IRC Bridge example, you would have a bridge.yml like so:
key_file: passkey.pem
bridged_irc_servers:
- postgres_fqdn: ircbridge-postgres
  postgres_user: ircbridge
  postgres_db: ircbridge
  postgres_password: postgres_password
  admins:
  - "@user:example.com"
  logging_level: debug
  enable_presence: true
  drop_matrix_messages_after_seconds: 0
  bot_username: "ircbridgebot"
  provisioning_room_limit: 50
  rmau_limit: 100
  users_prefix: "irc_"
  alias_prefix: "irc_"
  address: irc.example.com
  parameters:
    name: "Example IRC"
    port: 6697
    ssl: true
    botConfig:
      enabled: true
      nick: "MatrixBot"
      username: "matrixbot"
      password: "some_password"
    dynamicChannels:
      enabled: true
    mappings:
      "#welcome":
        roomIds: ["!MLdeIFVsWCgrPkcYkL:example.com"]
    ircClients:
      allowNickChanges: true
There is also another important folder in legacy: the certs directory. Here you will need to add any CA.pem file and the certificates for the FQDN of any legacy components. As part of any automation, you will need to ensure these files are correct for each setup and named correctly; the certificates in this directory should be named using the fully qualified domain name (.key and .crt).
Automating your deployment
Once you have a working set of configuration, you should make a backup of your ~/.element-enterprise-server/config directory. Through whatever form of automation you choose, automate the modification of your cluster.yml, deployment.yml, secrets.yml and any legacy *.yml files to adjust values as needed.
For instance, perhaps you need 6 identical homeservers, each with its own domain name. You would need to edit the fqdn of each component and the domainName in deployment.yml. You'd then have 6 config directories, each differing only in domain, ready to be used by an installer binary.
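As a rough sketch of what such automation could look like, the loop below copies a baseline config and substitutes the domain with sed. It assumes the baseline uses example.com everywhere and that a plain text substitution is safe for your files; a YAML-aware tool would be more robust.
mkdir -p ~/ess-configs
for domain in host1.example.org host2.example.org; do
  # one config directory per target host, derived from the baseline
  cp -r ~/.element-enterprise-server/config ~/ess-configs/"$domain"
  # replace the baseline domain in deployment.yml and any other config files
  grep -rl 'example\.com' ~/ess-configs/"$domain" | xargs sed -i "s/example\.com/$domain/g"
done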
On each of the 6 hosts, create the ~/.element-enterprise-server directory and copy that host's specific config to ~/.element-enterprise-server/config. Copy the installer binary to the host, ensuring it is executable.
Running the installer unattended
Once the host system is set up, you can use How do I run the installer without using the GUI? to run the installer unattended. It will pick up the configuration and start the deployment installation without needing to use the GUI to get it started.
Wiping all user data and start fresh with an existing config
On a standalone deployment you can wipe and start fresh by running:
sudo snap remove microk8s --purge && sudo rm -rf /data && sudo reboot
then run ./<element-installer>.bin unattended
(this will require passwordless sudo to run noninteractively)
Kubernetes : namespace-scoped deployments
Prepare the cluster - Admin side
Installing the Helm Chart Repositories
The first step is to start on a machine with helm v3 installed and configured with your kubernetes cluster and pull down the two charts that you will need.
First, let's add the element-updater repository to helm:
helm repo add element-updater https://registry.element.io/helm/element-updater --username <ems_image_store_username> --password '<ems_image_store_token>'
Replace ems_image_store_username
and ems_image_store_token
with the values provided to you by Element.
Secondly, let's add the element-operator repository to helm:
helm repo add element-operator https://registry.element.io/helm/element-operator --username <ems_image_store_username> --password '<ems_image_store_token>'
Replace ems_image_store_username
and ems_image_store_token
with the values provided to you by Element.
Now that we have the repositories configured, we can verify this by:
helm repo list
and you should see the following in the output:
NAME URL
element-operator https://registry.element.io/helm/element-operator
element-updater https://registry.element.io/helm/element-updater
Deploy the CRDs
Write the following values.yaml
file :
clusterDeployment: true
deployCrds: true
deployCrdRoles: true
deployManager: false
To install the CRDs with the helm charts, simply run:
helm install element-updater element-updater/element-updater -f values.yaml
helm install element-operator element-operator/element-operator -f values.yaml
Now at this point, you should have the following two CRDs available:
[user@helm ~]$ kubectl get crds | grep element.io
elementwebs.matrix.element.io 2023-10-11T13:23:14Z
wellknowndelegations.matrix.element.io 2023-10-11T13:23:14Z
elementcalls.matrix.element.io 2023-10-11T13:23:14Z
hydrogens.matrix.element.io 2023-10-11T13:23:14Z
mautrixtelegrams.matrix.element.io 2023-10-11T13:23:14Z
sydents.matrix.element.io 2023-10-11T13:23:14Z
synapseusers.matrix.element.io 2023-10-11T13:23:14Z
bifrosts.matrix.element.io 2023-10-11T13:23:14Z
lowbandwidths.matrix.element.io 2023-10-11T13:23:14Z
synapsemoduleconfigs.matrix.element.io 2023-10-11T13:23:14Z
matrixauthenticationservices.matrix.element.io 2023-10-11T13:23:14Z
ircbridges.matrix.element.io 2023-10-11T13:23:14Z
slidingsyncs.matrix.element.io 2023-10-11T13:23:14Z
securebordergateways.matrix.element.io 2023-10-11T13:23:14Z
hookshots.matrix.element.io 2023-10-11T13:23:14Z
matrixcontentscanners.matrix.element.io 2023-10-11T13:23:14Z
sygnals.matrix.element.io 2023-10-11T13:23:14Z
sipbridges.matrix.element.io 2023-10-11T13:23:14Z
livekits.matrix.element.io 2023-10-11T13:23:14Z
integrators.matrix.element.io 2023-10-11T13:23:14Z
jitsis.matrix.element.io 2023-10-11T13:23:14Z
mautrixwhatsapps.matrix.element.io 2023-11-15T09:03:48Z
synapseadminuis.matrix.element.io 2023-10-11T13:23:14Z
synapses.matrix.element.io 2023-10-11T13:23:14Z
groupsyncs.matrix.element.io 2023-10-11T13:23:14Z
pipes.matrix.element.io 2023-10-11T13:23:14Z
elementdeployments.matrix.element.io 2023-10-11T13:34:25Z
chatterboxes.matrix.element.io 2023-11-21T15:55:59Z
Namespace-scoped role
In the namespace where the ESS deployment will happen, create the following role and role bindings to give a user permission to deploy ESS:
User role:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: ess-additional
rules:
- apiGroups:
  - apiextensions.k8s.io
  resources:
  - customresourcedefinitions
  verbs:
  - list
  - watch
  - get
- apiGroups:
  - project.openshift.io
  resources:
  - projects
  verbs:
  - get
  - list
  - watch
User role bindings:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ess-additional
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: ess-additional
subjects:
  <role subjects which map to the user or their groups>

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ess
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: edit
subjects:
  <role subjects which map to the user or their groups>
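As an illustration only, the subjects placeholder could map to a user or a group like this (the names are hypothetical; adjust them to your identity provider):
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: alice@example.com
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: ess-deployers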
Using the installer in namespace-scoped mode
In the installer UI, on the cluster configuration screen, the user can now use the following values :
- Skip Operator Setup: unchecked
- Skip Updater Setup: unchecked
- Skip Element Crds Setup: checked
- Cluster Deployment: unchecked
- Kube Context Name:
- namespaces:
- Create Namespaces: unchecked
- operator:
- updater: <same as operator, namespace to deploy ess>
- element deployment: <same as operator, namespace to deploy ess>
See the following screenshot which describes the options. The webhooks CA passphrase value will be present in version 23.11.X and above, and will be randomly-generated on your deployment :
Customize containers ran by ESS
Issue
- In some deployments, you might want to customize the containers, because you want to add contents, change the web clients features, etc.
Environment
- Element Server Suite (ESS) On-Premise
Resolution
The steps are:
- Create a new configmap definition with the overrides you need to configure and inject it into the cluster.
- Configure the installer to use the new Images Digests Config Map.
- Generate secret for the registry (if it requires authentication) and add it to ESS.
Creating the new Images Digests Config Map
In order to override images used by ESS during the install, you will need to inject a new ConfigMap which specifies the image to use for each component. Its structure maps the components of ESS, and all of them can be overridden:
data:
  images_digests: |
    # Copyright 2023 New Vector Ltd
    adminbot:
      access_element_web:
      haproxy:
      pipe:
    auditbot:
      access_element_web:
      haproxy:
      pipe:
    element_call:
      element_call:
      sfu:
      jwt:
      redis:
    element_web:
      element_web:
    groupsync:
      groupsync:
    hookshot:
      hookshot:
    hydrogen:
      hydrogen:
    integrator:
      integrator:
      modular_widgets:
      appstore:
    irc_bridges:
      irc_bridges:
    jitsi:
      jicofo:
      jvb:
      prosody:
      web:
      sysctl:
      prometheus_exporter:
      haproxy:
      user_verification_service:
    matrix_authentication_service:
      init:
      matrix_authentication_service:
    secure_border_gateway:
      secure_border_gateway:
    sip_bridge:
      sip_bridge:
    skype_for_business_bridge:
      skype_for_business_bridge:
    sliding_sync:
      api:
      poller:
    sydent:
      sydent:
    sygnal:
      sygnal:
    synapse:
      haproxy:
      redis:
      synapse:
    synapse_admin:
      synapse_admin:
    telegram_bridge:
      telegram_bridge:
    well_known_delegation:
      well_known_delegation:
    xmpp_bridge:
      xmpp_bridge:
Each container in this tree needs at least the following properties to override the download source:
image_repository_path: elementdeployment/vectorim/element-web
image_repository_server: localregistry.local
You can also override the image tag and the image digest if you want to enforce using digests in your deployment :
image_digest: sha256:ee01604ac0ec8ed4b56d96589976bd84b6eaca52e7a506de0444b15a363a6967
image_tag: v0.2.2
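If you do want to pin digests, one way to obtain the digest of an image you have pushed to your own registry is shown below. The registry and path reuse the example values above and should be replaced with your own.
docker pull localregistry.local/elementdeployment/vectorim/element-web:v0.2.2
docker inspect --format='{{index .RepoDigests 0}}' localregistry.local/elementdeployment/vectorim/element-web:v0.2.2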
For example, to override the element_web/element_web container source path, the required ConfigMap manifest (e.g. images_digest_configmap.yml) would be:
apiVersion: v1
kind: ConfigMap
metadata:
  name: <config map name>
  namespace: <namespace of your deployment>
data:
  images_digests: |
    element_web:
      element_web:
        image_repository_path: mycompany/custom-element-web
        image_repository_server: docker.io
        image_tag: v2.1.1-patched
Notes:
- The image_digest: may need to be regenerated, or it can be removed.
- The image_repository_path needs to reflect the path in your local repository.
- The image_repository_server should be replaced with your local repository URL.
The new ConfigMap can then be injected into the cluster with:
kubectl apply -f images_digest_configmap.yml -n <namespace of your deployment>
You will also need to configure the ESS Installer to use the new Images Digests Config Map by adding the <config map name> into the Cluster advanced section.
If your registry requires authentication, you will need to create a new secret. So for example, if your registry is called myregistry
and the URL of the registry is myregistry.tld
, the command would be:
kubectl create secret docker-registry myregistry --docker-username=<registry user> --docker-password=<registry password> --docker-server=myregistry.tld -n <your namespace>
The new secret can then be added into the ESS Installer GUI advanced cluster Docker Secrets:
Handling new releases of ESS
If you are overriding image, you will need to make sure that your images are compatbile with the new releases of ESS. You can use a staging environment to tests the upgrades for example.
Support Policies
On-Premise Support Scope of Coverage
For Element Enterprise On-Premise, we support the following:
- Installation and Operation (Configuring the Installer, Debugging Issues)
- Synapse Usage/Configuration/Prioritised Bug Fixes
- Element Web Usage/Configuration/Prioritised Bug Fixes
- Integrations
- Delegated Auth (e.g. OIDC/SAML/LDAP) (Add-on)
- Group Sync (LDAP, AD Graph API, SCIM supported) (Add-on)
- Our Monitoring Stack (Prometheus, Grafana, ...)
- Github / Gitlab
- JIRA
- Webhooks
- Jitsi
- Sliding Sync Proxy
- Adminbot (Add-on)
- Auditbot (Add-on)
- XMPP, IRC and Telegram Bridges
For Element On-Premise, we support the following:
- Installation and Operation (Configuring the Installer, Debugging Issues)
- Synapse Usage/Configuration/Prioritised Bug Fixes
- Element Web Usage/Configuration/Prioritised Bug Fixes
- Integrations
- Github / Gitlab
- JIRA
- Webhooks
- Jitsi
The following items are not included in support coverage:
- General Infrastructure Assistance
- K8s Assistance
- Operating System Support
- Postgresql Database Support
For single node setups, the following also applies:
- Element does not support deployment to a microk8s that was not installed by our installer.
- Element does not provide a backup solution.
- Element does not provide support for any underlying storage.
For kubernetes deployments, the following also applies:
- Element does not support deploying the installer created postgresql in a kubernetes environment.
- Element requires that you deploy postgresql separately in a kubernetes environment, external to your Element deployment.
Single Node Scope of Coverage Addendum
- What is supported for this setup:
- Element supports the installation of microk8s using our installer on all installer supported platforms.
- Element supports upgrading microk8s using our installer on all installer supported platforms.
- Element supports the installation, configuration, and maintenance of our on-premise software delivered by the installer running on microk8s.
- Element provides diagnosis and bug fixes for our on-premise software delivered by the installer running on microk8s.
- What is not supported in this setup:
- Element does not support the underlying operating system.
- Element does not support deployment to microk8s that was installed separately from our installer.
- Element does not provide a backup solution.
- Element does not provide support for any underlying storage.
- Checklist of what makes a single node production workload:
- RHEL 8+ or Ubuntu 20.04+
- microk8s installed and running.
- Customer provided backup solution in place.
- Customer managed storage in place.
- synapse, haproxy, and element-web all running.
- Optionally, dimension, adminbot, auditbot, group sync, and hookshot may be running as well.
Appendices
Preparing Element Server Suite PoC
Please reach out to our Element Sales Team if you want to run a Proof of Concept for Element Server Suite.
Note: This guide is for running Proofs of Concept. We don't aim to show every feature here; we want to get you up and running as quickly as possible. This guide currently focuses on connected standalone installations. Some scenarios are not covered by this guide, such as installing into airgapped / disconnected environments or testing our Cloud-based offering.
A Proof-of-Concept is done in preparation of a subscription sale with the goal of demonstrating the required capabilities.
Create an account on element.io
Please create an account on element.io. We will enable this account as part of the PoC process and grant you access to the Element Server Suite software packages.
Communication via matrix room
The account team will create a matrix room to improve communication and invite you. To do this, we will need your Matrix ID (MXID).
If you don't already have an MXID, you can create one here by signing up. This will create an account on matrix.org; you can authenticate via several identity providers.
When you have a MXID, we recommend adding it to your EMS Account via Your Account
, Account
. You should then send this to the account team so they can add you to the room. You could use the Element Web Client that you used to create the account or install one of the Element Mobile apps from the App or Playstore.
PoC preparation
Element Server Suite can be installed in a Kubernetes Cluster or as a standalone installation on top of an Operating System (RHEL 8/9 or Ubuntu 20.04/22.04). Most Proof-of-Concept installations will select the Standalone Installation on top of a VM which we recommend for speed and ease of operation.
Click here for an overview of the Element Server Suite. Here is the link detailing the single node installation.
Preparation of the VM and Ports
Please set up a VM with 8 vCPUs and 32GB RAM and 100 GB Storage. If this sounds like a lot of resources to you, the requirements do in fact vary and could be scaled down later if required. Install Ubuntu 20.04 LTS or RHEL8. Update the system to the latest available patches and create a user to be used for maintaining the Element Server Suite. See our documentation for this step here.
You will need to be able to reach the VM on Ports 80, 443 and 8443.
DNS Names and Certificates
You need to select a base domain for the Server. This can differ from the base domain of the matrix IDs but is often the same. Read more about this in the section on Matrix IDs and Well Known delegation below.
You have chosen eng.acme.com. The following DNS entries must be prepared and point to the external IP of the VM.
This results in the following hostnames for you :
- eng.acme.com (base domain - might already exist )
- matrix.eng.acme.com (the synapse homeserver)
- element.eng.acme.com (element web)
- admin.eng.acme.com (admin dashboard)
- integrator.eng.acme.com (integration manager)
- hookshot.eng.acme.com (Our integrations)
Optional for Monitoring and Integrations :
- grafana.eng.acme.com (Our Grafana server)
Optional for Video Chat with Jitsi :
- jitsi.eng.acme.com (Our VoIP platform)
- coturn.eng.acme.com (Our TURN server)
Optional for Video Chat with Element Call :
- call.acme.com (Element Call)
- sfu.acme.com (Selective Forwarding Unit)
Optional for Element X support :
- sliding-sync.acme.com
Optional for the Admin / Audit functionality :
- roomadmin.eng.acme.com
- audit.eng.acme.com
We require certificates for all these hostnames, including the base domain, to enable SSL/TLS encryption. The quick and easy way is to use the embedded Let's Encrypt; this is only available if you are in a connected environment. Alternatively, you can provide and use your own certificates.
Matrix IDs & Well Known delegation
Matrix IDs have the following format :
@USER:SERVER
In our example case the matrix server is matrix.eng.acme.com. If a user Tom Maier has a username tmaier in your LDAP, this would lead to an MXID @tmaier:matrix.eng.acme.com. This is often not desired as we like to keep the MXIDs short. It is more elegant to drop the "matrix" host name from the MXIDs. Tom's MXID would look like this @tmaier:eng.acme.com .
In order to be able to offer matrix IDs with the base domain, we recommend setting up a reverse proxy on eng.acme.com, which forwards https://eng.acme.com/.well-known/matrix/ to the matrix/synapse server on https://matrix.eng.acme.com/.well-known/matrix . Or you shorten the hostname part of your MXIDs even more to acme.com, this would require you to put the reverse proxy onto acme.com.
The configuration on your Apache WebServer should be similar to this :
ProxyPass /.well-known/matrix/ https://matrix.eng.acme.com/.well-known/matrix/
ProxyPassReverse /.well-known/matrix/ https://matrix.eng.acme.com/.well-known/matrix/
ProxyPreserveHost On
More about well-known and MXIDs can be found in our Upstream Documentation here and here. Further configurations can be made using the well-known mechanism. An example is documented here.
Authentication and Postgres DB
The quickest setup is using local authentication and users only; this is what we recommend in a Proof-of-Concept situation. User accounts are created in the local PostgreSQL DB (recommended only up to 300 users) through our Admin UI, or through API scripts for automation. We support many mechanisms for authentication, like LDAP, SAML2 and OIDC. We recommend configuring these as a second step, only if required.
You have the option to use an internal or external Postgres DB. We do recommend to use the internal Postgres DB for Proof-of-Concept installations. The internal Postgres DB is only available when you are opting for the Standalone Installation on top of an Operating System. You will need an external Postgres DB when installing into an existing Kubernetes cluster.
Checklist before starting the installation
Please prepare the above items before starting the installation. Make sure you have :
- created and communicated your MXID to the Element Sales Team
- registered an account on element.io
- created and prepared your vm / machine with enough resources
- created DNS entries
- decided on letsencrypt / created host certificates for your hostnames
- installed the reverse proxy on the webserver of your MXID URL e.g. eng.acme.com or acme.com
Don't hesitate to reach out to your Element Sales Team for support. We are here to guide you.
How to run a Webserver on Standalone Deployments
This guide does not come with support from Element. It is not part of the Element Server Suite (ESS) product. Use at your own risk. Remember that you are responsible for maintaining this software stack yourself.
Some config options require web content to be served. For example:
- Changing Element Web appearance with custom background pictures.
- Providing a HomePage for display in Element Web.
- Providing a Guide PDF from your server in an airgapped environment.
One way to provide this content is to run a web server in the same Kubernetes Cluster as the Element Enterprise Suite.
Please consider other options before installing and maintaining yet another webserver for this; consider using an existing web server first.
The following guide describes the steps to set up the Bitnami Apache helm chart in the standalone microk8s cluster set up by Element Server Suite.
You need:
- a DNS entry pages.BASEDOMAIN.
- a Certificate (private key + certificate) for pages.BASEDOMAIN
- an installed standalone Element Server Suite setup
- access to the server on the command line
You get:
- a web server that runs in the microk8s cluster
- a directory /var/www/apache-content to place and modify web content like homepage, backgrounds and guides.
You can deploy a Webserver to the same Kubernetes cluster that Element Server Suite is using. This guide is applicable to the Single Node deployment of Element Server Suite but can be used for guidance on how to host a webserver in other Kubernetes Clusters as well.
You can use any webserver that you like; in this example we will use the Bitnami Apache chart.
We need helm version 3. You can follow this Guide or ask microk8s to install helm3.
Enabling Helm3 with microk8s
$ microk8s enable helm3
Infer repository core for addon helm3
Enabling Helm 3
Fetching helm version v3.8.0.
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 12.9M 100 12.9M 0 0 17.4M 0 --:--:-- --:--:-- --:--:-- 17.4M
Helm 3 is enabled
Let's check if it is working
$ microk8s.helm3 version
version.BuildInfo{Version:"v3.8.0", GitCommit:"d14138609b01886f544b2025f5000351c9eb092e", GitTreeState:"clean", GoVersion:"go1.17.5"}
Create an Alias for helm
echo alias helm=microk8s.helm3 >> ~/.bashrc
source ~/.bashrc
Enable the Bitnami Helm Chart repository
Add the bitnami repository
helm repo add bitnami https://charts.bitnami.com/bitnami
Update the repo information
helm repo update
Prepare the Web-Server Content
Create a directory to supply content :
sudo mkdir /var/www/apache-content
Put your content, e.g. a homepage, into the apache-content directory:
cp /tmp/background.jpg /var/www/apache-content/
cp /tmp/home.html /var/www/apache-content/
There are multiple ways to provide this content to the Apache pod. The Bitnami helm chart can use ConfigMaps, Persistent Volumes or a Git repository.
ConfigMaps are a good choice for smaller amounts of data. There is a hard limit of 1 MiB on ConfigMaps, so if all your data is no more than 1 MiB, a ConfigMap is a good choice for you.
Persistent Volumes are a good choice for larger amounts of data. There are several choices of backing storage available; in the context of the standalone deployments of ESS, a hostPath volume is the most practical. HostPath is not a good solution for multi-node k8s clusters, unless you pin a pod to a certain node. Pinning the pod to a single node would put the workload at risk, should that node go down.
A Git repository is a favourite as it versions the content, and you can track and revert to earlier states easily. The Bitnami Apache helm chart is built in a way that updates at regular intervals to your latest changes.
We are selecting the Persistent Volume option to serve content in this case. Our instance of microk8s comes with the hostpath storage addon enabled.
Define the physical volume:
cat <<EOF>pv-volume.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: apache-content-pv
  labels:
    type: local
spec:
  storageClassName: microk8s-hostpath
  persistentVolumeReclaimPolicy: Retain
  capacity:
    storage: 100Mi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/var/www/apache-content"
EOF
Apply to the cluster
kubectl apply -f pv-volume.yaml
Next we need a Physical Volume Claim:
cat <<EOF>pv-claim.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: apache-content-pvc
spec:
  volumeName: apache-content-pv
  storageClassName: microk8s-hostpath
  accessModes: [ReadWriteOnce]
  resources: { requests: { storage: 100Mi } }
EOF
Apply to the cluster to create the pvc
kubectl apply -f pv-claim.yaml
Configure the Helm Chart
We need to add configurations to adjust the apache deployment to our needs. The K8s service should be switched to ClusterIP. The Single Node deployment includes an Ingress configuration through nginx that we can use to route traffic to this webserver. The name of the ingressClass is "public". We will need to provide a hostname. This name needs to be resolvable through DNS. This could be done through the wildcard entry for *.$BASEDOMAIN that you might already have. You will need a certificate and certificate private key to secure this connection through TLS.
The full list of configuration options of this chart is explained in the bitnami repository here
Create a file called apache-values.yaml in the home directory of your element user.
Remember to replace BASEDOMAIN with the correct value for your deployment.
cat <<EOF>apache-values.yaml
service:
  type: ClusterIP
ingress:
  enabled: true
  ingressClassName: "public"
  hostname: pages.BASEDOMAIN
htdocsPVC: apache-content-pvc
EOF
Deploy the Apache Helm Chart
Now we are ready to deploy the apache helm chart
helm install myhomepage -f apache-values.yaml oci://registry-1.docker.io/bitnamicharts/apache
Manage the deployment
List the deployed helm charts:
$ helm list
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
myhomepage default 1 2023-09-06 14:46:33.352124975 +0000 UTC deployed apache-10.1.0 2.4.57
Get more details:
$ helm status myhomepage
NAME: myhomepage
LAST DEPLOYED: Wed Sep 6 14:46:33 2023
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
CHART NAME: apache
CHART VERSION: 10.1.0
APP VERSION: 2.4.57
** Please be patient while the chart is being deployed **
1. Get the Apache URL by running:
You should be able to access your new Apache installation through:
- http://pages.lutz-gui.sales-demos.element.io
If you need to update the deployment, modify the required apache-values.yaml and run :
helm upgrade myhomepage -f apache-values.yaml oci://registry-1.docker.io/bitnamicharts/apache
If you don't want the deployment any more, you can remove it.
helm uninstall myhomepage
Secure the deployment with certificates
If you are in a connected environment, you can rely on cert-manager to create certificates and secrets for you.
Cert-manager with letsencrypt
If you have cert-manager enabled, you will just need to add the right annotations to the ingress of your deployment. Modify your apache-values.yaml and add these lines to the ingress block:
tls: true
annotations:
  cert-manager.io/cluster-issuer: letsencrypt
  kubernetes.io/ingress.class: public
You will need to upgrade your deployment to reflect these changes:
helm upgrade myhomepage -f apache-values.yaml oci://registry-1.docker.io/bitnamicharts/apache
Custom Certificates
There are situations in which you want custom certificates instead. These can be used by modifying your apache-values.yaml. Add the following lines to the ingress block in the apache-values.yaml. Take care to get the indentation right. Replace the ... with your data.
tls: true
extraTls:
- hosts:
  - pages.lutz-gui.sales-demos.element.io
  secretName: "pages.lutz-gui.sales-demos.element.io-tls"
secrets:
- name: pages.lutz-gui.sales-demos.element.io-tls
  key: |-
    -----BEGIN RSA PRIVATE KEY-----
    ...
    -----END RSA PRIVATE KEY-----
  certificate: |-
    -----BEGIN CERTIFICATE-----
    ...
    -----END CERTIFICATE-----
You will need to upgrade your deployment to reflect these changes:
helm upgrade myhomepage -f apache-values.yaml oci://registry-1.docker.io/bitnamicharts/apache
Tips and Tricks
You can make your life easier by using bash completion and an alias for kubectl. You will need to have the bash-completion package installed as a prerequisite.
For all users on the system:
kubectl completion bash | sudo tee /etc/bash_completion.d/kubectl > /dev/null
Set an alias for kubectl for your user:
echo 'alias k=kubectl' >>~/.bashrc
Enable auto-completion for your alias
echo 'complete -o default -F __start_kubectl k' >>~/.bashrc
After reloading your Shell, you can now enjoy auto completion for your k ( kubectl ) commands.
Notifications, MDM & Push Gateway
The stock Android and iOS apps will use an Element-owned Push Gateway to send notifications via the Apple or Google notification services.
The URL of our push gateway is https://matrix.org/_matrix/push/v1/notify
On startup, the apps register with the Google or Apple notification services and request a push_notification_client_identifier. If notifications need sending, the homeserver will use the configured Push Gateway to send notifications through these services.
What is a Notification?
A notification will not contain sensitive content. This is what notifications actually look like:
▿ 5 elements
▿ 0 : 2 elements
▿ key : AnyHashable("unread_count")
- value : "unread_count"
- value : 1
▿ 1 : 2 elements
▿ key : AnyHashable("pusher_notification_client_identifier")
- value : "pusher_notification_client_identifier"
- value : ad0bd22bb90fabde45429b3b79cdbba12bd86f3dafb80ea22d2b1343995d8418
▿ 2 : 2 elements
▿ key : AnyHashable("aps")
- value : "aps"
▿ value : 2 elements
▿ 0 : 2 elements
- key : alert
▿ value : 2 elements
▿ 0 : 2 elements
- key : loc-key
- value : Notification
▿ 1 : 2 elements
- key : loc-args
- value : 0 elements
▿ 1 : 2 elements
- key : mutable-content
- value : 1
▿ 3 : 2 elements
▿ key : AnyHashable("room_id")
- value : "room_id"
- value : !vkibNVqwhZVOaNskRU:matrix.org
▿ 4 : 2 elements
▿ key : AnyHashable("event_id")
- value : "event_id"
- value : $0cTr40iZmOd3Aj0c65e_7F6NNVF_BwzEFpyXuMEp29g
We recommend that you use the stock Element Apps from PlayStore or Applestore together with the Push Gateway that we as Element host.
Mobile Device Management (MDM)
You can use Mobile Device Management (MDM) to configure and roll out mobile applications. To be able to configure mobile apps this way, the app needs to implement certain interfaces in a standard way. This is called AppConfig.
The Android Element app does not support AppConfig currently. You will need to rebuild the APK to include changes like a different homeserver or a different pusher URL.
The iOS Element app was enabled for AppConfig in version 1.11.2. This allows changing the following parameters and keys without the need to recompile the app:
- im.vector.app.serverConfigDefaultHomeserverUrlString
- im.vector.app.clientPermalinkBaseUrl
- im.vector.app.serverConfigSygnalAPIUrlString
If you employ a Mobile Device Management solution like e.g. VmWare Workspace One, you will need to configure your iOS Element app with these keys as documented here in section Publish and update Managed AppConfig for your app in Workspace ONE.
Depending on the brand of MDM you are using, you can create the required keys manually, or enable these settings with an XML file. The XML file might look like this:
<managedAppConfiguration>
<version>1</version>
<bundleId>im.vector.app</bundleId>
<dict>
<string keyName="im.vector.app.serverConfigDefaultHomeserverUrlString">
<defaultValue>
<value>https://matrix.BASEDOMAIN</value>
</defaultValue>
</string>
<string keyName="im.vector.app.clientPermalinkBaseUrl">
<defaultValue>
<value>https://messenger.BASEDOMAIN</value>
</defaultValue>
</string>
</dict>
</managedAppConfiguration>
Using your own Push Gateway ( Sygnal )
Some organizations still feel uncomfortable with using our Push Gateway. You can use your own push gateway (e.g. Sygnal) if you want.
You can install Sygnal as an integration with the Element Server Suite.
During the App Upload process a private key is created. We as Element Company retain and use that key on our Push infrastructure. This is why you can not use the stock Element Apps, but will need to upload your own version of the Element App. This will give you access to your own private notification key that is bound to the app you uploaded.
You will need to configure your Sygnal with the private key of your Element App.
You will need to set "im.vector.app.serverConfigSygnalAPIUrlString" for the iOS app, or the equivalent in the Android app source code.
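For reference, a Sygnal apps section typically looks roughly like the sketch below (token-based APNs for iOS, FCM/GCM for Android). The app IDs and paths are hypothetical; consult the Sygnal documentation and the ESS Sygnal integration settings for the exact fields your setup needs.
apps:
  com.example.yourapp.ios:          # bundle id of your rebuilt iOS app (hypothetical)
    type: apns
    keyfile: /path/to/apns_key.p8   # the private key associated with your uploaded app
    key_id: ABC123DEFG
    team_id: TEAM123456
    topic: com.example.yourapp.ios
  com.example.yourapp.android:      # package name of your rebuilt Android app (hypothetical)
    type: gcm
    api_key: your_fcm_server_key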
Verifying ESS releases against Cosign
Cosign ESS Verification Key
ESS does not use the Cosign transaction log, in order to support airgapped deployments. We instead rely on a public key, which you can request if you need to run image verification in your cluster.
The ESS Cosign public key is the following:
-----BEGIN PUBLIC KEY-----
MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAE1Lc+7BqkqD+0XYft05CeXto/Ga1Y
DKNk3o48PIJ2JMrq3mzw13/m5rzlGjdgJCs6yctf4+UdACZx5WSiIWTFbQ==
-----END PUBLIC KEY-----
Verifying manually
To verify a container against ESS Keys, you will have to run the following command :
- Operator :
cosign verify registry.element.io/ess-operator:<version> --key cosign.pub
- Updater :
cosign verify registry.element.io/ess-updater:<version> --key cosign.pub
If you are running in an airgapped environment, then you will need to append --insecure-ignore-tlog=true
to the above commands
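Putting it together, you could save the public key shown above to a local file and verify a pinned version, adding the airgapped flag if needed:
cat > cosign.pub <<'EOF'
-----BEGIN PUBLIC KEY-----
MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAE1Lc+7BqkqD+0XYft05CeXto/Ga1Y
DKNk3o48PIJ2JMrq3mzw13/m5rzlGjdgJCs6yctf4+UdACZx5WSiIWTFbQ==
-----END PUBLIC KEY-----
EOF
cosign verify registry.element.io/ess-operator:<version> --key cosign.pub --insecure-ignore-tlog=true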
Verifying automatically
You will have to setup and configure your SIGStore Admission Policy to use ESS Public Key.
ESS CRDs support in ArgoCD
ArgoCD can surface the ESS CRDs status as resource health using Custom Health Checks.
You need to configure the following under the argocd-cm configmap of ArgoCD:
data:
  resource.customizations: |
    matrix.element.io/*:
      health.lua: |
        hs = {}
        if obj.status ~= nil then
          if obj.status.conditions ~= nil then
            for i, condition in ipairs(obj.status.conditions) do
              if condition.type == "Failure" and condition.status == "True" then
                hs.status = "Degraded"
                hs.message = condition.message
                return hs
              end
              if condition.type == "Running" and condition.status == "True" and condition.reason ~= "Successful" then
                hs.status = "Progressing"
                hs.message = condition.message
                return hs
              end
              if condition.type == "Available" and condition.status == "True" then
                hs.status = "Healthy"
                hs.message = condition.message
                return hs
              end
              if condition.type == "Available" and condition.status == "False" then
                hs.status = "Degraded"
                hs.message = condition.message
                return hs
              end
              if condition.type == "Successful" and condition.status == "True" then
                hs.status = "Healthy"
                hs.message = condition.message
                return hs
              end
            end
          end
        end
        hs.status = "Progressing"
        hs.message = "Waiting for the CR to start to converge..."
        return hs
Synapse database troubleshooting
Room Retention policy enabled causes Synapse database to consume a lot of disk space
- Run the following command against the Synapse Postgres database: \d+. On an installer-managed PostgreSQL, you can access the psql command using:
kubectl exec -it -n element-onprem synapse-postgres-0 -- bash -c 'psql "dbname=$POSTGRES_DB user=$POSTGRES_USER password=$POSTGRES_PASSWORD"'
- Check the space taken by the table state_groups_state. For example, here it is consuming 244 GB:
public | state_groups_state | table | synapse_user | permanent | 244 GB |
- If you have a room retention policy enabled, there is a bug which causes some state groups to be orphaned; as a consequence they are not cleaned up from the database automatically.
- Follow the instructions from the page synapse-find-unreferenced-state-groups. The tool is available for download at the following link: rust-synapse-find-unreferenced-state-groups.
- On Standalone and installer-managed PostgreSQL databases, you can use the following script to do it automatically. It involves downtime, because Synapse is stopped before cleaning up orphaned state groups. Please make sure that you have appropriate disk space before running the script, because the script generates a backup of the database before cleaning up the tables. If you need to restore, the command is:
kubectl exec -it pods/synapse-postgres-0 -n element-onprem -- psql "dbname=synapse user=synapse_user password=$POSTGRES_PASSWORD" < /path/to/dump.sql
#!/bin/sh
set -e
echo "Stopping Operator..."
kubectl scale deploy/element-operator-controller-manager -n operator-onprem --replicas=0
echo "Stopping Synapse..."
kubectl delete synapse/first-element-deployment -n element-onprem
while kubectl get statefulsets -n element-onprem --no-headers -o custom-columns=":metadata.labels" | grep -q "matrix-server"; do
echo "Waiting for synapse StatefulSets to be deleted..."
sleep 2
done
echo "Forwarding postgresql port..."
kubectl port-forward pods/synapse-postgres-0 -n element-onprem 15432:5432 &
port_forward_pid=$!
sleep 1s
POSTGRES_PASSWORD=`echo $(kubectl get secrets/first-element-deployment-synapse-secrets -n element-onprem -o yaml | grep postgresPassword | cut -d ':' -f2) | base64 -d`
POSTGRES_USER=synapse_user
POSTGRES_DB=synapse
echo "Find unreferenced state groups..."
./rust-synapse-find-unreferenced-state-groups -p postgres://$POSTGRES_USER:$POSTGRES_PASSWORD@localhost:15432/$POSTGRES_DB -o ./sgs.txt
kill -9 $port_forward_pid
echo "Copy unreferenced state groups list to postgres pod..."
kubectl cp ./sgs.txt -n element-onprem synapse-postgres-0:/tmp/sgs.txt
echo "Backing up postgres database..."
kubectl exec -it pods/synapse-postgres-0 -n element-onprem -- bash -c 'pg_dump "dbname=$POSTGRES_DB user=$POSTGRES_USER password=$POSTGRES_PASSWORD"' > backup-unreferenced-state-groups-$(date '+%Y-%m-%d-%H:%M:%S').sql
echo "Cleanup postgres database..."
kubectl exec -it pods/synapse-postgres-0 -n element-onprem -- psql "dbname=$POSTGRES_DB user=$POSTGRES_USER password=$POSTGRES_PASSWORD" -c "CREATE TEMPORARY TABLE unreffed(id BIGINT PRIMARY KEY); COPY unreffed FROM '/tmp/sgs.txt' WITH (FORMAT 'csv'); DELETE FROM state_groups_state WHERE state_group IN (SELECT id FROM unreffed); DELETE FROM state_group_edges WHERE state_group IN (SELECT id FROM unreffed); DELETE FROM state_groups WHERE id IN (SELECT id FROM unreffed);"
echo "Starting Operator..."
kubectl scale deploy/element-operator-controller-manager -n operator-onprem --replicas=1
Running the script should look like this :
bash cleanup-sgs.sh
Stopping Operator...
deployment.apps/element-operator-controller-manager scaled
Stopping Synapse...
synapse.matrix.element.io "first-element-deployment" deleted
Waiting for synapse StatefulSets to be deleted...
Waiting for synapse StatefulSets to be deleted...
Waiting for synapse StatefulSets to be deleted...
Waiting for synapse StatefulSets to be deleted...
Waiting for synapse StatefulSets to be deleted...
Waiting for synapse StatefulSets to be deleted...
Waiting for synapse StatefulSets to be deleted...
Forwarding postgresql port...
Forwarding from 127.0.0.1:15432 -> 5432
Forwarding from [::1]:15432 -> 5432
Find unreferenced state groups...
Handling connection for 15432
[0s] 741 rows retrieved
Fetched 725 state groups from DB
Total state groups: 725
Found 2 unreferenced groups
Copy unreferenced state groups list to postgres pod...
Defaulted container "postgres" out of: postgres, postgres-init-password, postgres-exporter
cleanup-sgs.sh: line 28: 428645 Killed kubectl port-forward pods/synapse-postgres-0 -n element-onprem 15432:5432
Backing up postgres database...
Cleanup postgres database...
Defaulted container "postgres" out of: postgres, postgres-init-password, postgres-exporter
DELETE 2
Starting Operator...
deployment.apps/element-operator-controller-manager scaled
Auditbot troubleshooting
Auditbot Viewing Error - Bad MAC
This is a symptom that the Auditbot Secure Storage got corrupted. It can happen if you try to change the passphrase of Auditbot through the UI, for example.
This procedure will make room history unable to decrypt in the Auditbot UI. The room history will still be available in the audit logs generated by Auditbot, in the S3 or file storage.
To resolve this, you will need to reset the 4S passphrase of Auditbot.
- Stop the operator and edit the auditbot pipe:
kubectl scale deploy/element-operator-controller-manager -n operator-onprem --replicas=0
kubectl edit statefulsets.apps first-element-deployment-auditbot-pipe -n element-onprem
- Add the following under env:
- name: AUDIT_FORCE_NEW_SSSS
  value: "true"
Wait for the pipe to restart, check its logs, and check that you can log in through the Admin Console.
- Edit the StatefulSet again:
kubectl edit statefulsets.apps first-element-deployment-auditbot-pipe -n element-onprem
- Remove the env variable you added:
- name: AUDIT_FORCE_NEW_SSSS
  value: "true"
- Restart the operator:
kubectl scale deploy/element-operator-controller-manager -n operator-onprem --replicas=1
This will restart Auditbot and its normal functionality should be restored.
Archived Documentation Repository
Documentation covering v1 and installers prior to 2022-07.03
element-on-premise-documentation-july28-2022.pdf
Documentation Covering Installers From 2022.07.03 to 2022.09.05
element-on-premise-documentation-0703-0905.pdf
Documentation Covering Installers From 2022.10.01 to 2023.02.01
element-on-premise-documentation.pdf
Documentation Covering Installer 2023-02.02 CLI Only.
element-on-premise-documentation (2).pdf
Documentation Covering Installers from 2023-03.01 to 2023-05.04
element-on-premise-documentation.pdf
ESS Sizing
The values provided below are indicative and might vary a lot depending on your setup, the volume of federation traffic, active usage, bridged use-cases, integrations enabled, etc.
CPU & Memory
Synapse Homeserver
The installer comes with default installation profiles which configure workers depending on your setup. For each profile :
- CPU is the maximum cpu cores the Homeserver can request
- Memory is the average memory the Homeserver will require
|  | 1–500 users | 501–2500 users | 2501–10000 users |
|---|---|---|---|
| unfed | 2 CPU, 2000 MiB RAM | 6 CPU, 5650 MiB RAM | 10 CPU, 8150 MiB RAM |
| small fed | 2 CPU, 2000 MiB RAM | 6 CPU, 5650 MiB RAM | 10 CPU, 8150 MiB RAM |
| open fed | 5 CPU, 4500 MiB RAM | 9 CPU, 8150 MiB RAM | 15 CPU, 11650 MiB RAM |
PostgreSQL
Synapse Postgres Server
Synapse postgres server will require the following resources :
|  | 1–500 users | 501–2500 users | 2501–10000 users |
|---|---|---|---|
| unfed | 1 CPU, 4 GiB RAM | 2 CPU, 12 GiB RAM | 4 CPU, 16 GiB RAM |
| small fed | 2 CPU, 6 GiB RAM | 4 CPU, 18 GiB RAM | 8 CPU, 28 GiB RAM |
| open fed | 3 CPU, 8 GiB RAM | 5 CPU, 24 GiB RAM | 10 CPU, 32 GiB RAM |
Operator & Updater
The Updater memory usage remains at 256Mi. At least 1 CPU should be provisioned for the Operator and the Updater.
The Operator memory usage scales linearly with the number of integrations you deploy with ESS. Its memory usage will remain low, but might spike up to 256Mi x number of integrations during deployment and configuration changes.
Volumes
Synapse Medias
The disk usage to expect after a year can be calculated using the following formula: average media size x average number of media uploaded per day x active users x 365.
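As a purely illustrative calculation: with an average media size of 1 MiB, 5 media uploaded per active user per day and 500 active users, this gives 1 MiB x 5 x 500 x 365 = 912,500 MiB, i.e. roughly 890 GiB after a year.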
Media retention can be configured with the configuration option in Synapse/Config/Data Retention of the installer.
Postgres DB size
The disk usage to expect after a year can be calculated using the following formula:
- If Federation is enabled: active users x 0.9 GB.
- If Federation is disabled or limited: active users x 0.6 GB.
ESS - Backup & Restore Guide
Introduction
Welcome, ESS Administrators. This guide is crafted for your role, focusing on the pragmatic aspects of securing crucial data within the Element Server Suite (ESS). ESS integrates with external PostgreSQL databases and persistent volumes and is deployable in standalone or Kubernetes mode. To ensure data integrity, we recommend including valuable, though not strictly consistent, data in backups. The guide also addresses data restoration and a straightforward disaster recovery plan.
Software Overview
ESS provides Synapse and Integrations which require an external PostgreSQL and persistent volumes. It offers standalone or Kubernetes deployment.
- If you are using Standalone deployment, please refer to Single-node Storage & Backup Guidelines.
- If you are using Kubernetes deployment, we strongly recommend to leverage your own cluster backup solutions for effective data protection.
You'll find below a description of the content of each component data and db backup.
Synapse
- Synapse deployments create a PVC named <element deployment cr name>-synapse-media. It contains all user media (avatars, photos, videos, etc.). It does not need strict consistency with the database content, but the more in sync they are, the more media can be correctly matched with room state in case of a restore.
- Synapse requires an external PostgreSQL database, which contains all the server state.
Adminbot
- Adminbot integration creates a PVC named
<element deployment cr name>-adminbot
. It contains the bot decryption keys, and a cache of the adminbot logins.
Auditbot
- The Auditbot integration creates a PVC named <element deployment cr name>-auditbot. It contains the bot decryption keys and a cache of the auditbot logins.
- Auditbot stores the room logs of your organization either in an S3 bucket or in the aforementioned PVC. Depending on how critical it is to be able to provide room logs for audit, you need to properly back up your S3 bucket or the PVC.
Matrix Authentication Service
- Matrix Authentication Service requires an external postgresql database. It contains the homeserver users, their access tokens and their Sessions/Devices.
Sliding Sync
- Sliding Sync requires an external postgresql database. It contains Sliding Sync running state, and data cache. The database backup needs to be properly secured. This database needs to be backed-up to be able to avoid UTDs and initial-syncs on a disaster recovery.
Sydent
- The Sydent integration creates a PVC named <element deployment cr name>-sydent. It contains the integration's SQLite database.
Integrator
- Integrator requires an external postgresql database. It contains information about which integration was added to each room.
Bridges (XMPP, IRC, Whatsapp, SIP, Telegram)
- The bridges each require an external PostgreSQL database. It contains mapping data between Matrix rooms and the channels on the other side of the bridge.
Backup Policy & Backup Procedure
There are no particular prerequisites before executing an ESS backup. Only the Synapse and MAS databases should be backed up in sync and stay consistent with each other. All other individual components can be backed up on their own lifecycle.
Backup frequency and retention periods must be defined according to your own SLAs and SLIs.
Data restoration
The following ESS components should be restored first in case of a complete restoration. Other components can be restored separately, in their own time:
- Synapse PostgreSQL database
- Synapse media
- Matrix Authentication Service database (if installed)
- Restart Synapse & MAS (if installed)
- Restore and restart each individual component
Disaster Recovery Plan
In case of disaster recovery, the following components are critical for your system recovery :
- Synapse Postgresql database is critical for Synapse to send consistent data to other servers, integrations and clients.
- Synapse keys configured in ESS configuration (Signing Key, etc) are critical for Synapse to start and identify itself as the same server as before.
- Matrix Authentication Service Postgresql database is critical for your system to recover your user accounts, their devices and sessions.
The following systems will recover feature subsets, and might involve resets & data loss if not recovered:
- Synapse media storage: users will lose their avatars, and all photos, videos and files uploaded to rooms won't be available anymore.
- Adminbot & Auditbot data: the bots will need to be renamed for them to start joining all rooms and logging events again.
- Sliding Sync: users will have to do an initial sync again, and their encrypted messages will display as "Unable to decrypt" if its database cannot be recovered.
- Integrator: integrations will have to be added back to the rooms where they were configured. Their configuration will be desynced from the Integrator, and they might need to be reconfigured from scratch to have them synced with the Integrator again.
Security Considerations
Some backups will contain sensitive data. Here is a description of the types of data and the risks associated with them. When available, make sure to enable encryption for your stored backups. You should use appropriate access controls and authentication for your backup processes.
Synapse
Synapse media and db backups should be considered sensitive.
Synapse media backups will contain all user medias (avatar, photos, video, files). If your organization is enforcing encrypted rooms, these medias will be stored encrypted with each user e2ee keys. If you are not enforcing encryption, you might have media stored in cleartext here, and appropriate measures should be taken to ensure that the backups are safely secured.
Synapse postgresql backups will contain all user key backup storage, where their keys are stored safely encrypted with each user passphrase. Synapse DB will also store room states and events. If your organization is enforcing encrypted rooms, these will be stored encrypted with each user e2ee keys.
Adminbot
Adminbot PV backup should be considered sensitive.
Any user accessing it could read the content of your organization rooms. Would such an event occur, revoking the bot tokens would prevent logging in as the adminbot and stop any pulling of the room messages content.
Auditbot
Auditbot PV backup should be considered sensitive.
Any user accessing it could read the content of your organization rooms. Would such an event occur, revoking the bot tokens would prevent logging in as the auditbot and stop any pulling of the room messages content.
Logs stored by the auditbot for audit capabilities are not encrypted, so any user able to access it will be able to read any logged room content.
Sliding Sync
Sliding-Sync DB Backups should be considered sensitive.
Sliding Sync database backups will contain user access tokens, which are encrypted with the Sliding Sync secret key. These tokens are only refreshed regularly if you are using Matrix Authentication Service. They give access to users' message-sending capabilities, but cannot be used to read encrypted messages without the users' keys.
Sydent
Sydent DB Backups should be considered sensitive.
Sydent DB Backups contain the associations between users' Matrix accounts and their external identifiers (email addresses, phone numbers, external social networks, etc.).
Matrix Authentication Service
Matrix Authentication Service DB Backups should be considered sensitive.
Matrix Authentication Service database backups will contain user access tokens, so they give access to user accounts. They will also contain the OIDC provider and confidential OAuth 2.0 client configurations, with secrets stored encrypted using the MAS encryption key.
IRC Bridge
IRC Bridge DB Backups should be considered sensitive.
IRC Bridge DB Backups contain users' IRC passwords. These passwords give access to the users' IRC accounts, and should be reset in case of an incident.
Guidance on High Availability
ESS makes use of Kubernetes for deployment, so most guidance on high availability is tied directly to general Kubernetes guidance on high availability.
Kubernetes
Essential Links
- Options for Highly Available Topology
- Creating Highly Available Clusters with kubeadm
- Set up a High Availability etcd Cluster with kubeadm
- Production environment
High-Level Overview
It is strongly advised to make use of the Kubernetes documentation to ensure your environment is set up for high availability; see the links above. At a high level, Kubernetes achieves high availability through:
- Cluster Architecture.
  - Multiple Masters: In a highly available Kubernetes cluster, multiple master nodes (control plane nodes) are deployed. These nodes run the critical components such as etcd, the API server, scheduler, and controller-manager. By using multiple master nodes, the cluster can continue to operate even if one or more master nodes fail.
  - Etcd Clustering: etcd is the key-value store used by Kubernetes to store all cluster data. It can be configured as a cluster with multiple nodes to provide data redundancy and consistency. This ensures that if one etcd instance fails, the data remains available from other instances.
- Pod and Node Management.
  - Replication Controllers and ReplicaSets: Kubernetes uses replication controllers and ReplicaSets to ensure that a specified number of pod replicas are running at any given time. If a pod fails, the ReplicaSet automatically replaces it, ensuring continuous availability of the application.
  - Deployments: Deployments provide declarative updates to applications, allowing rolling updates and rollbacks. This ensures that application updates do not cause downtime and can be rolled back if issues occur.
  - DaemonSets: DaemonSets ensure that a copy of a pod runs on all (or a subset of) nodes. This is useful for deploying critical system services across the entire cluster.
- Service Discovery and Load Balancing.
  - Services: Kubernetes Services provide a stable IP and DNS name for accessing a set of pods. Services use built-in load balancing to distribute traffic among the pods, ensuring that traffic is not sent to failed pods.
  - Ingress Controllers: Ingress controllers manage external access to the services in a cluster, typically HTTP. They provide load balancing, SSL termination, and name-based virtual hosting, enhancing the availability and reliability of web applications.
- Node Health Management.
  - Node Monitoring and Self-Healing: Kubernetes continuously monitors the health of nodes and pods. If a node fails, Kubernetes can automatically reschedule the pods from the failed node onto healthy nodes. This self-healing capability ensures minimal disruption to the running applications.
  - Pod Disruption Budgets (PDBs): PDBs allow administrators to define the minimum number of pods that must be available during disruptions (such as during maintenance or upgrades), ensuring application availability even during planned outages (see the PodDisruptionBudget sketch after this list).
- Persistent Storage.
  - Persistent Volumes and Claims: Kubernetes provides abstractions for managing persistent storage. Persistent Volumes (PVs) and Persistent Volume Claims (PVCs) decouple storage from the pod lifecycle, ensuring that data is preserved even if pods are rescheduled or nodes fail.
  - Storage Classes and Dynamic Provisioning: Storage classes allow administrators to define different storage types (e.g., SSDs, network-attached storage) and enable dynamic provisioning of storage resources, ensuring that applications always have access to the required storage.
- Geographical Distribution.
  - Multi-Zone and Multi-Region Deployments: Kubernetes supports deploying clusters across multiple availability zones and regions. This geographical distribution helps in maintaining high availability even in the event of data center or regional failures.
- Network Policies and Security.
  - Network Policies: These policies allow administrators to control the communication between pods, enhancing security and ensuring that only authorized traffic reaches critical applications.
  - RBAC (Role-Based Access Control): RBAC restricts access to cluster resources based on roles and permissions, reducing the risk of accidental or malicious disruptions to the cluster's operations.
- Automated Upgrades and Rollbacks.
  - Cluster Upgrade Tools: Tools like kubeadm and managed Kubernetes services (e.g., Google Kubernetes Engine, Amazon EKS, Azure AKS) provide automated upgrade capabilities, ensuring that clusters can be kept up-to-date with minimal downtime.
  - Automated Rollbacks: In the event of a failed update, Kubernetes can automatically roll back to a previous stable state, ensuring that applications remain available.
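As referenced in the Node Health Management item above, a PodDisruptionBudget is a small Kubernetes object applied next to your workloads. The sketch below is illustrative only: the namespace and label selector are assumptions and must match the pods you want to protect during voluntary disruptions such as node drains and upgrades.

```bash
# Hedged sketch of a PodDisruptionBudget keeping at least one matching pod
# available during voluntary disruptions. Namespace and labels are assumptions.
kubectl apply -f - <<'EOF'
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: synapse-pdb
  namespace: element-onprem
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: synapse
EOF
```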
How does this tie into ESS
As ESS is deployed into a Kubernetes cluster, if you are looking for high availability you should ensure your environment is configured with that in mind. One important factor is to deploy using the Kubernetes deployment option: while Standalone mode also deploys to a Kubernetes cluster, that cluster by definition exists solely on a single node, so options for high availability are limited.
PostgreSQL
Essential links
- PostgreSQL - High Availability, Load Balancing, and Replication
- PostgreSQL - Different replication solutions
High-Level Overview
To ensure a smooth failover process for ESS, it is crucial to prepare a robust database topology. The following list outlines the necessary elements to take into consideration:
- Database replicas
  - Location: Deploy the database replicas in a separate data center from the primary database to provide geographical redundancy.
  - Replication: Configure continuous replication from the primary database to the secondary database. This ensures that the secondary database has an up-to-date copy of all data (see the sketch at the end of this section).
- Synchronization and Monitoring
  - Synchronization: Ensure that the secondary database is consistently synchronized with the primary database. Use reliable replication technologies and monitor for any lag or synchronization issues.
  - Monitoring Tools: Implement monitoring tools to keep track of the replication status and performance metrics of both databases. Set up alerts for any discrepancies or failures in the replication process.
- Data Integrity and Consistency
  - Consistency Checks: Periodically perform consistency checks between the primary and secondary databases to ensure data integrity.
  - Backups: Maintain regular backups of both the primary and secondary databases. Store backups in a secure, redundant location to prevent data loss.
- Testing and Validation
  - Failover Testing: Conduct regular failover drills to test the transition from the primary to the secondary database. Validate that the secondary database can handle the load and that the failover process works seamlessly.
  - Performance Testing: Evaluate the performance of the secondary database under expected load conditions to ensure it can maintain the required service levels.
By carefully preparing the database topology as described, you can ensure that the failover process for ESS is efficient and reliable, minimizing downtime and maintaining data integrity.
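The hedged sketch below shows one common way to bootstrap a streaming-replication standby and to watch its lag from the primary; the hostnames, replication role and data directory are assumptions, and the exact procedure depends on your PostgreSQL version and tooling (see the PostgreSQL documentation linked above).

```bash
# Create a standby from the primary; --write-recovery-conf configures
# primary_conninfo and marks the instance as a standby.
pg_basebackup --host=primary.example.com --username=replicator \
    --pgdata=/var/lib/postgresql/data --write-recovery-conf \
    --wal-method=stream --progress

# On the primary, check which standbys are connected and how far behind they are.
psql --host=primary.example.com --username=postgres \
    -c "SELECT client_addr, state, replay_lag FROM pg_stat_replication;"
```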
How does this tie into ESS
As ESS relies on PostgreSQL for its database, if you are looking for high availability you should ensure your environment is configured with that in mind. Database replicas can be set up the same way in both Kubernetes and Standalone deployments, as the database is not managed by ESS.
ESS failover plan
This document outlines a high-level, semi-automatic, failover plan for ESS. The plan ensures continuity of service by switching to a secondary data center (DC) in the event of a failure in the primary data center.
Prerequisites
- Database Replica: A replica of the main database, located in a secondary data center, continuously reading from the primary database.
- Secondary ESS Deployment: An instance of the ESS deployment, configured in a secondary data center.
- Signing Keys Synchronization: The signing keys stored in ESS secrets need to be kept synchronized between the primary and secondary data centers (see the sketch after this list).
- Media Repository: Media files are stored on a redundant S3 bucket accessible from both data centers.
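As noted in the prerequisite above, secrets such as the Synapse signing key and TLS certificates must exist in both clusters. The sketch below copies one secret from the primary cluster to a secondary one; the kubectl context names, namespace and secret name are assumptions, and cluster-specific metadata is stripped with jq before applying.

```bash
# Copy an ESS secret from the DC1 cluster to the DC2 cluster (names assumed).
kubectl --context dc1 -n element-onprem get secret synapse-signing-key -o json \
  | jq 'del(.metadata.resourceVersion, .metadata.uid, .metadata.creationTimestamp, .metadata.ownerReferences)' \
  | kubectl --context dc2 -n element-onprem apply -f -
```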
ESS Architecture for failover capabilities based on 3 datacenters
DC1 (Primary)
- ElementDeployment Manifest:
  - Manifest points to addresses in DC1.
  - TLS Secrets managed by ACME.
- TLS Secrets:
  - Replicated to DC2 and DC3.
- Operator:
  - 1 replica.
- Updater:
  - 1 replica.
- PostgreSQL:
  - Primary database.
DC2
- ElementDeployment Manifest:
  - Manifest points to addresses in DC2.
  - TLS Secrets pointing to existing secrets, replicated locally from DC1.
- Operator:
  - 0 replicas; this prevents the deployment of the Kubernetes workloads.
- Updater:
  - 1 replica; the base Element manifests are ready for the Operator to deploy the workloads.
- PostgreSQL:
  - Hot standby, replicating from DC1.
DC3
- ElementDeployment Manifest:
  - Manifest points to addresses in DC3.
  - TLS Secrets pointing to existing secrets, replicated locally from DC1.
- Operator:
  - 0 replicas; this prevents the deployment of the Kubernetes workloads.
- Updater:
  - 1 replica; the base Element manifests are ready for the Operator to deploy the workloads.
- PostgreSQL:
  - Hot standby, replicating from DC1.
Failover Process
When DC1 experiences downtime and needs to be failed over to DC2, follow these steps:
- Disable DC1:
  - Firewall outbound traffic to prevent federation/outbound requests such as push notifications.
  - Scale down the Operator to 0 replicas and remove workloads from DC1.
- Activate DC2:
  - Promote the PostgreSQL instance in DC2 to the primary role.
  - Set Operator Replicas:
    - Increase the Operator replicas to 1.
    - This starts the Synapse workloads in DC2.
  - Update the DNS to point the ingress to DC2.
  - Open the firewall if it was closed to ensure proper network access.
- Synchronize DC3:
  - Ensure PostgreSQL Replication:
    - Make sure that the PostgreSQL instance in DC3 is properly replicating from the new primary in DC2.
    - Adjust the PostgreSQL topology if necessary to ensure proper synchronization.
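The hedged sketch below maps the steps above to concrete commands, assuming a plain streaming-replication standby, kubectl contexts named `dc1`/`dc2`, and an Operator deployment reachable under the name used here; adapt the names, namespaces and your DNS/firewall tooling to your environment.

```bash
# 1. Disable DC1: stop the Operator so it no longer reconciles workloads there.
kubectl --context dc1 -n element-onprem scale deployment element-operator --replicas=0

# 2. Activate DC2: promote the standby PostgreSQL instance to the primary role...
pg_ctl promote -D /var/lib/postgresql/data

# ...then start the workloads by scaling the Operator up in DC2.
kubectl --context dc2 -n element-onprem scale deployment element-operator --replicas=1

# 3. Update DNS to point the ingress at DC2 and re-open the firewall
#    (tooling-specific, not shown here).
```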
You should derive your own failover procedure from this high-level overview. By doing so, you can ensure that ESS continues to operate smoothly and with minimal downtime, maintaining service availability even when the primary data center goes down.