# Production
NOTE
This documentation is a work in progress and may be incomplete.
WARNING
This guide does not include setting up SSL. For proper security, ELLA should either be run in a secured environment (internal network, "walled garden", etc.) or with the nginx config modified to handle remote connections securely.
WARNING
ELLA relies on a separate annotation service, ella-anno, to annotate and import data. The documentation for this service is also a work in progress. Please contact ella-support for details on how to configure it for your own production setup.
# Requirements
- A PostgreSQL server
- For development purposes, a postgres container is provided. However, for a production environment, a managed PostgreSQL server is highly recommended.
- Recommended version: ≥ v14.0 (ELLA should function with PostgreSQL versions ≥ v9.6, but this is not tested)
- Docker/Podman and docker-compose v2
- Running ELLA without using containers is possible, but this is not documented and generally not recommended.
# Background
# Mount points
There are several directories you will want to mount from the host OS into the container.
| Destination | Description |
| --- | --- |
| data | Data files (details below) |
| logs | Log files (set env variable $LOGPATH) |
| /tmp | Using host /tmp can increase performance (optional) |
# Data directory
ELLA needs access to various types of data, and while it is possible to mount them all individually, using a single unified data directory is strongly recommended for both clarity and simplicity.
The following steps assume this folder structure and these config locations:
```
data/
├── analyses/
│   ├── imported/                  # Analyses that have already been imported ($ANALYSES_PATH)
│   └── incoming/                  # Analyses to be imported automatically by the analysis watcher ($ANALYSES_INCOMING)
├── attachments/                   # Storage of user attachments ($ATTACHMENT_STORAGE)
├── fixtures/                      # Configuration data to be imported into the database
│   ├── genepanels/
│   ├── users.json
│   ├── usergroups.json
│   ├── references.json
│   └── filterconfigs.json
├── igv-data/                      # IGV resources, global and usergroup tracks ($IGV_DATA)
│   ├── tracks/
│   └── track_config_default.json
└── ella_config.yml                # $ELLA_CONFIG
```
# Docker Compose
ELLA is most easily run with Docker Compose. The provided docker-compose.yml is suitable for a development environment, while docker-compose-prod.yml serves as a good starting point for running ELLA in production. The latter typically only needs additional volume mounts, which can easily be added by specifying volumes in a separate file, e.g. docker-compose-volumes.yml, and running ELLA with docker compose -f docker-compose-prod.yml -f docker-compose-volumes.yml up -d.
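As a rough illustration, such an override file could look like the sketch below. The service names are taken from the docker-compose ps output later in this guide, and the host paths are assumptions; adjust both to match docker-compose-prod.yml and your own environment.

```yaml
# docker-compose-volumes.yml -- a sketch only; service names and host paths are assumptions
x-ella-volumes: &ella-volumes
  volumes:
    - /opt/ella/data:/data
    - /opt/ella/logs:/logs

services:
  apiv1:
    <<: *ella-volumes
  analysis-watcher:
    <<: *ella-volumes
  polling:
    <<: *ella-volumes
```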
# First time setup
The ELLA release page has information on past and current releases, including the Docker and Singularity images. This information can also be quickly gathered by running make latest-release-info.
# Fetching the release image
Determine the latest Docker and Singularity image by whichever method you prefer, and then pull them to the local server.
```bash
# Docker
docker pull registry.gitlab.com/alleles/ella:${TAG}

# Singularity - download
wget https://gitlab.com/alleles/ella/-/releases/${TAG}/downloads/ella-release-${TAG}.sif

# Singularity - build from Docker
singularity pull docker://registry.gitlab.com/alleles/ella:${TAG}
```
# Define the environment
There are a few environment variables that should be set, and many others that can be modified to fit the production environment. The easiest approach is to use an environment file (dotenv). See .env in the root folder of the project for a detailed description of all environment variables.
See Application Configuration for all variables related to the setup of the ELLA application.
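As a minimal sketch, a production env file might set the variables referenced throughout this guide. The variable names appear elsewhere in this guide, but the values below are assumptions; see .env for the authoritative list.

```bash
# prod.env -- minimal sketch; all values are assumptions, adapt to your environment
ELLA_CONFIG=/data/ella_config.yml
ANALYSES_PATH=/data/analyses/imported
ANALYSES_INCOMING=/data/analyses/incoming
ATTACHMENT_STORAGE=/data/attachments
IGV_DATA=/data/igv-data
LOGPATH=/logs
DB_URL=postgresql://ella:<password>@db.example.com:5432/ella
```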
# Create ELLA application config
Many of the settings can be adjusted as you go, but a basic application config is required for all additional steps. It must be available inside the container at the path given by ELLA_CONFIG in the env file.
# Using ella-cli
The following steps use ella-cli from inside the ELLA container. You can do this either by starting a shell session inside a running container or by using an alias/function wrapper to run commands directly from the host OS. The latter makes each command take longer, but lets you interact with the host OS and the container from a single terminal.
```bash
ELLA_DIR=$PWD
ELLA_IMAGE=registry.gitlab.com/alleles/ella:v1.16.4
ENV_FILE=prod.env

# start an interactive bash shell
ella-shell() {
    docker run -it --rm \
        --name ella-shell \
        --env-file "$ENV_FILE" \
        -v "$ELLA_DIR/data:/data" \
        -v "$ELLA_DIR/logs:/logs" \
        -v /tmp \
        "$ELLA_IMAGE" bash
}

# run a single command in a throwaway container
ella-cli() {
    docker run -it --rm \
        --env-file "$ENV_FILE" \
        -v "$ELLA_DIR/data:/data" \
        -v "$ELLA_DIR/logs:/logs" \
        -v /tmp \
        "$ELLA_IMAGE" ella-cli "$@"
}
```
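With these wrappers defined in your shell (they are a convenience sketch; adjust paths and image tag to your setup), the remaining commands in this guide can be run directly from the host, for example:

```bash
# open an interactive shell in a throwaway ELLA container
ella-shell

# or run a single ella-cli command from the host; using --help here is an assumption,
# intended only as a harmless smoke test of the wrapper
ella-cli --help
```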
# Initializing the database
ELLA relies on an external PostgreSQL database, using the default "public" schema.
The following assumes that an empty database, e.g. created with createdb, is available at $DB_URL.
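For reference, creating that empty database could look roughly like this. The role, host and database names are assumptions; the connection string must match whatever you set as $DB_URL.

```bash
# on the PostgreSQL server (role and database names are assumptions)
createuser --pwprompt ella
createdb --owner=ella ella

# corresponding connection string in the env file (placeholder host and credentials)
# DB_URL=postgresql://ella:<password>@db.example.com:5432/ella
```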
Run the following command:
```bash
ella-cli database make-production
```
This will:
- Set up the database from the migration base.
- Run all the migration scripts.
- Run the database refresh command to set up JSON schemas and various triggers.
Once this is complete, you can start a persistent ELLA container and it will stay running.
# Define/Fetch configuration fixtures
ELLA is very configurable, and as a result there are several configuration files that need to be prepared before the database can be created and populated. Each file has documentation containing the details of its contents, as well as a sample config in the test data that can be used as a reference.
See Configuration for in-depth information on all configuration options.
Some configs rely on each other, so when initially populating the database they must be loaded in the following order.
# Gene Panels
Gene panels are a core part of ELLA and must be loaded first. It is not currently possible to bulk load gene panels, so using a loop is recommended.
```bash
# adjust path to genepanels as needed
for gp_dir in /data/fixtures/genepanels/*/; do
    ella-cli deposit genepanel --folder "$gp_dir"
done
```
See also:
- Gene Panel Configuration
- testdata/clinicalGenePanels
- alleles/genepanel-store
# User Groups
User groups are used to determine who can see what as well as which filters are available and used by default.
```bash
ella-cli users add_groups /data/fixtures/usergroups.json
```
# Filter Configs
In addition to gene panels, ELLA has highly configurable and extendable filters that make ignoring technical and known-but-uninteresting variants much simpler.
```bash
ella-cli filterconfigs update /data/fixtures/filterconfigs.json
```
# Users
Users can be added one by one or as a bulk action. It is currently only possible to add users via the CLI.
```bash
ella-cli users add_many /data/fixtures/users.json
```
# Annotation
ELLA can filter or modify the annotation on a variant when it is imported. This allows you to import only the information you want, without cluttering the database with information you don't.
```bash
# deposit normal annotation
ella-cli deposit annotation /data/fixtures/annotation-config.json

# Optional: deposit custom, user-specific annotation
ella-cli deposit custom_annotation /data/fixtures/custom_annotation.json
```
See also:
- Annotation Configuration
- annotation-config.yml
- custom_annotation_test.json
# IGV
IGV.js is used in visual mode for examining variants in more detail. Its configuration is dynamic, so it does not need to be loaded into the database. The config and any files needed by the tracks must be available in $IGV_DATA and $IGV_DATA/tracks, and have the correct permissions.
```bash
# Download the default IGV data
ella-cli igv-download "$IGV_DATA"

# If running ELLA in an air-gapped network, you can download the files directly for a manual
# transfer afterwards. This does not need to be run inside an ELLA container.
mkdir igv-data
./src/cli/commands/fetch-igv-data.sh igv-data
tar cvf igv_data.tar igv-data/
```
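After transferring the archive to the air-gapped host, it just needs to be unpacked into the directory mounted as $IGV_DATA. A sketch, where the destination path is an assumption based on the layout above:

```bash
# unpack the archive and sync its contents into the IGV data directory
tar xvf igv_data.tar
rsync -a igv-data/ /opt/ella/data/igv-data/
```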
# Populate reference table
The references table in the database can be populated with PubMed IDs using a JSON file generated by the ella-cli:
1. Gather all the PubMed IDs you want to populate the database with in a line-separated file (see the sketch after this list).
2. Run this command to create a file named references-YYMMDD.json:

   ```bash
   ella-cli references fetch <path to file with PubMed ids>
   ```

3. Create or update references in the database with this command:

   ```bash
   ella-cli deposit references <path to references-YYMMDD.json>
   ```
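The input file is simply one PubMed ID per line. As a sketch, using placeholder IDs and an assumed file name:

```bash
# create a line-separated file of PubMed IDs (the IDs below are placeholders)
cat > pubmed_ids.txt <<'EOF'
12345678
23456789
EOF

# generate references-YYMMDD.json from the ID list
ella-cli references fetch pubmed_ids.txt
```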
# Log in / verify
1. If you haven't already, start the ELLA services:

   ```bash
   >> docker-compose --env-file env -f docker-compose.yml -f docker-compose-volumes.yml up -d
   [+] Running 6/6
    ✔ Network ella-network              Created  0.0s
    ✔ Container ella-apiv1              Started  0.1s
    ✔ Container ella-analysis-watcher   Started  0.1s
    ✔ Container ella-frontend           Started  0.1s
    ✔ Container ella-polling            Started  0.1s
    ✔ Container ella-nginx              Started  0.0s
   ```

2. Verify the services are running:

   ```bash
   >> docker-compose --env-file env -f docker-compose.yml -f docker-compose-volumes.yml ps -a
   NAME                    IMAGE                                                       COMMAND                  SERVICE            CREATED         STATUS         PORTS
   ella-analysis-watcher   registry.gitlab.com/alleles/ella/ella-backend:86f6c0a37     ""                       analysis-watcher   3 seconds ago   Up 2 seconds
   ella-apiv1              registry.gitlab.com/alleles/ella/ella-backend:86f6c0a37     ""                       apiv1              3 seconds ago   Up 2 seconds
   ella-frontend           registry.gitlab.com/alleles/ella/ella-frontend:86f6c0a37    ""                       frontend           3 seconds ago   Up 2 seconds
   ella-nginx              localhost/nginx:1.25.3-alpine                               "nginx -g daemon off;"   nginx              3 seconds ago   Up 2 seconds   80/tcp
   ella-polling            registry.gitlab.com/alleles/ella/ella-backend:86f6c0a37     ""                       polling            3 seconds ago   Up 2 seconds
   ```

3. Check for unexpected error messages:

   ```bash
   docker-compose --env-file env -f docker-compose.yml -f docker-compose-volumes.yml logs
   ```

4. Go to the appropriate URL/port and log in.
5. Success!
# Upload an analysis
A new analysis can be uploaded either automatically, via the analysis watcher, or manually, via ella-cli. In both cases, the files/directories must follow the naming format {analysis_name}-{genepanel_name}-v{genepanel_version}.
# Deposit with ella-cli
If you have a single VCF without any accompanying files, it can easily be deposited:

```bash
ella-cli deposit analysis sample_123-Mendeliome-v01.vcf
```
# Automatic import
To automatically import an analysis, have the final steps of your variant calling pipeline write the output to its own directory in $ANALYSES_INCOMING. Once all the files have been written/copied, create a file named READY in the directory, and the watcher will add the analysis to the queue the next time it scans the directory.
```bash
# pipeline finishes, copy the output to watched location
rsync -a output/ /host/path/to/analyses/incoming/sample_123-Mendeliome-v01/

# check that all desired files are there
ls -1 /host/path/to/analyses/incoming/sample_123-Mendeliome-v01/
# sample_123-Mendeliome-v01.analysis
# sample_123-Mendeliome-v01.cnv.vcf
# sample_123-Mendeliome-v01.vcf

# add READY to indicate it can be imported
touch /host/path/to/analyses/incoming/sample_123-Mendeliome-v01/READY

# watch log to verify it was imported successfully
tail -f /opt/ella/logs/analysis-watcher.log
# 2022-06-19 13:12:23,542 [info] vardb.deposit.deposit_analysis: Importing sample_123-Mendeliome-v01
# ...
# 2022-06-19 13:12:24,677 [info] vardb.deposit.deposit_analysis: All done!
# 2022-06-19 13:12:24,695 [info] __main__: Analysis sample_123-Mendeliome-v01 successfully imported!
```