(see https://github.com/genouest/biomaj-docker)

Run docker-compose (and all services of BioMAJ with it):

docker-compose up -d

Do not forget to install the BioMAJ client:

pip install biomaj-cli

Create a .env file:

You will need to create a .env file in the biomaj-docker directory (where docker-compose.yml is located):


echo "BIOMAJ_DIR=/<path to>/biomaj-docker/" > .env

echo "BIOMAJ_DATA_DIR=/..path_to_biomaj_data_dir_in_container" >> .env
# BIOMAJ_HOST_DATA_DIR: location of the data directory on the host (not in the container), /db for example
# It is used by biomaj-process with the Docker executor
# Ideally BIOMAJ_HOST_DATA_DIR should equal BIOMAJ_DATA_DIR
echo "BIOMAJ_HOST_DATA_DIR=/..path_to_biomaj_data_dir_on_host" >> .env

echo "BIOMAJ_USER_PASSWORD=biomaj_user_default_password" >> .env
echo "DOCKER_URL=tcp://x.y.z:2375" >> .env # if you wish to execute processes in Docker containers, give the IP of the host where docker is running (or swarm)
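
The same file can also be written in one step. The following is only a sketch, not something shipped with biomaj-docker: the function name write_biomaj_env is hypothetical, all values are placeholders, and it assumes the data directory uses the same path on the host and in the container (as recommended above). DOCKER_URL is left out because it is only needed for Docker-based processes.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of biomaj-docker): writes a .env file
# containing the variables listed above. All values are placeholders.
write_biomaj_env() {
    local compose_dir="$1"   # directory that contains docker-compose.yml
    local data_dir="$2"      # data directory, same path on host and in container
    cat > "${compose_dir}/.env" <<EOF
BIOMAJ_DIR=${compose_dir}
BIOMAJ_DATA_DIR=${data_dir}
BIOMAJ_HOST_DATA_DIR=${data_dir}
BIOMAJ_USER_PASSWORD=biomaj_user_default_password
EOF
}

# Example call (commented out so nothing is written by accident):
# write_biomaj_env "$PWD" /db
```

Writing the file from a single heredoc avoids the risk of mixing up > (overwrite) and >> (append) between lines.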

It is possible to override environment variables (and to override global.properties variables in the same way), for example to send mails with the microservice version of BioMAJ:

echo "BIOMAJ_MAIL_SMTP_HOST=<xx>" >> .env
echo "BIOMAJ_MAIL_ADMIN=<xx>" >> .env
echo "BIOMAJ_MAIL_FROM=<xx>" >> .env

Inside a running container, all BioMAJ directories are located under /var/lib/biomaj/.
Do not forget to place all your new bank.properties files on your host in BIOMAJ_DIR/biomaj/conf.

How do I use a different directory for config (conf, log, etc.) than for data (bank directories)?

Use the compose template docker-compose-otherdb.yml instead of docker-compose.yml.

Bank directories will be placed in the directory set by the BIOMAJ_DATA_DIR variable in .env, both on the host and in the container (/db for example).
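
Since this template depends on BIOMAJ_DATA_DIR, it can help to check the .env file before starting the services. A minimal sketch, assuming it is run from the biomaj-docker directory; env_defines and start_with_separate_data_dir are hypothetical names, and -f is the standard docker-compose option for selecting a compose file:

```shell
#!/usr/bin/env bash
# Returns 0 when variable $2 has a non-empty value in env file $1.
env_defines() {
    local file="$1" var="$2"
    grep -Eq "^${var}=.+" "$file"
}

# Hypothetical wrapper: refuse to start if BIOMAJ_DATA_DIR is missing from .env.
start_with_separate_data_dir() {
    if ! env_defines .env BIOMAJ_DATA_DIR; then
        echo "BIOMAJ_DATA_DIR is not set in .env" >&2
        return 1
    fi
    docker-compose -f docker-compose-otherdb.yml up -d
}
```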

How to modify and use your own global.properties with Docker?

Here.

You can check the contents of the global.properties file used directly in a container: cat /etc/biomaj/global.properties

Create a user for BioMAJ:

Run the user container in interactive mode:

docker exec -it <name of the biomaj-user-web container> /bin/bash

How to find the name of each container?

docker ps
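
To avoid copying the name by hand, the list can be filtered. This sketch uses the standard --filter and --format options of docker ps; the function name container_name is hypothetical, and it assumes the container name contains the text "biomaj-user-web". If several containers match, all of their names are printed.

```shell
#!/usr/bin/env bash
# Prints the names of running containers whose name contains the given text.
container_name() {
    docker ps --filter "name=$1" --format '{{.Names}}'
}

# Example (assumes exactly one container name contains "biomaj-user-web"):
# docker exec -it "$(container_name biomaj-user-web)" /bin/bash
```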

In the user container, create a user:

python3 biomaj-user/bin/biomaj-users.py -A create -U biomaj -P biomajmdp

It will give you an API key for your user:

+--------+------------+------------+
| User   | Password   | API Key    |
|--------+------------+------------|
| biomaj | biomajmdp  | M1A15AAER5 |
+--------+------------+------------+
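
If you want to reuse the key in a script, it can be extracted from this table. A small sketch, assuming the three-column layout shown above; api_key_for is a hypothetical helper that reads the table on standard input:

```shell
#!/usr/bin/env bash
# Reads the table printed by biomaj-users.py on stdin and prints the API key
# of the given user. Assumes the User / Password / API Key layout shown above.
api_key_for() {
    awk -F'|' -v user="$1" '
        NF >= 4 {
            name = $2; key = $4
            gsub(/^ +| +$/, "", name)   # trim padding around the user name
            gsub(/^ +| +$/, "", key)    # trim padding around the key
            if (name == user) print key
        }'
}

# Example, inside the user container:
# python3 biomaj-user/bin/biomaj-users.py -A create -U biomaj -P biomajmdp | api_key_for biomaj
```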

Copy your bank.properties files

Copy all your bank.properties files to biomaj/conf.

Get the alu.properties example:

wget -P <your path>/biomaj/conf https://raw.githubusercontent.com/genouest/biomaj/master/tests/alu.properties

Run an update

biomaj-cli.py --proxy http://biomaj-public-proxy --api-key XYZ --update --bank alu

For a microservices deployment, you have to use the IP of your host machine and the biomaj-public-proxy port, for example:

http://127.0.0.1
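
To avoid retyping the proxy and API key for every bank, the update command can be wrapped in a function. This is a sketch only: biomaj_update and the variable names BIOMAJ_CLI, BIOMAJ_PROXY and BIOMAJ_API_KEY are chosen for this example, and the biomaj-cli.py options are the ones shown above.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the update command shown above.
# BIOMAJ_CLI, BIOMAJ_PROXY and BIOMAJ_API_KEY are names chosen for this sketch.
BIOMAJ_CLI="${BIOMAJ_CLI:-biomaj-cli.py}"

biomaj_update() {
    local bank="$1"
    if [ -z "${BIOMAJ_PROXY:-}" ] || [ -z "${BIOMAJ_API_KEY:-}" ]; then
        echo "BIOMAJ_PROXY and BIOMAJ_API_KEY must be set" >&2
        return 1
    fi
    "$BIOMAJ_CLI" --proxy "$BIOMAJ_PROXY" --api-key "$BIOMAJ_API_KEY" --update --bank "$bank"
}

# Example:
# BIOMAJ_PROXY=http://127.0.0.1 BIOMAJ_API_KEY=XYZ biomaj_update alu
```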

How to get the status of a bank?

biomaj-cli.py --proxy http://172.18.0.8 --api-key M1A15AAER5 --status --bank alu 
+--------+-----------------+----------------------+---------------------+
| Name   | Type(s)         | Last update status   | Published release   |
|--------+-----------------+----------------------+---------------------|
| alu    | nucleic_protein | 2017-06-12 14:22:17  | 2003-11-26          |
+--------+-----------------+----------------------+---------------------+
+---------------------+------------------+------------+-----------------------------------------------------------+----------+
| Session             | Remote release   | Release    | Directory                                                 | Freeze   |
|---------------------+------------------+------------+-----------------------------------------------------------+----------|
| 2017-06-12 14:22:17 | 2003-11-26       | 2003-11-26 | /var/lib/biomaj/data/db/ncbi/blast/alu_ori/alu_2003-11-26 | no       |
+---------------------+------------------+------------+-----------------------------------------------------------+----------+

How to check the log file of a databank?

You can consult the logs of a bank by reading the file:

cat biomaj/log/<bank>/<XXXXXXX>/bank.log

<XXXXXXX> is the number of the download session.
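
To open the most recent log without looking up the session number, something like the following can be used. It is a sketch under one assumption that is not confirmed here: session directories are named with numbers that increase over time, so the largest number is the latest session. latest_bank_log is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Prints the path of bank.log for the highest-numbered session of a bank.
# Assumption: session directory names are numeric and increase over time.
latest_bank_log() {
    local log_root="$1" bank="$2" latest
    latest=$(ls -1 "${log_root}/${bank}" 2>/dev/null | sort -n | tail -n 1)
    [ -n "$latest" ] || return 1
    printf '%s\n' "${log_root}/${bank}/${latest}/bank.log"
}

# Example:
# cat "$(latest_bank_log biomaj/log alu)"
```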

If your log reports an InfluxDB error, first start the influx CLI in the InfluxDB container, then create a biomaj database:

docker exec -it biomaj-docker_biomaj-influxdb_1 influx
#Connected to http://localhost:8086 version 1.3.0
#InfluxDB shell version: 1.3.0
create database biomaj
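
The same step can be done without the interactive shell by using the -execute option of the InfluxDB 1.x influx CLI. A sketch, with a hypothetical function name; the default container name is the one used above and may differ on your system:

```shell
#!/usr/bin/env bash
# Creates the "biomaj" database without opening the interactive influx shell.
# Pass your own container name as $1 if it differs from the default.
create_biomaj_influx_db() {
    local container="${1:-biomaj-docker_biomaj-influxdb_1}"
    docker exec "$container" influx -execute 'CREATE DATABASE biomaj'
}
```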

More information here.

How to create a specific process instance?

To preinstall software for process handlers, or to mount a local directory containing tools, you can extend or replace the existing Docker container.

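Extending
FROM osallou/biomaj-docker
# Refresh the package index first; -y avoids the interactive prompt during the build
RUN apt-get update && apt-get install -y …
# Build with: docker build -t my/biomaj-process .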

Replacing
Create a new Dockerfile that installs the biomaj-process Python package and the files/directories as specified here.

More information here.

How to simplify tools installation with Conda?

Conda is a package management and work environment management system. It makes it easier to install tools/software, especially in bioinformatics. Many tools are available on bioconda (https://bioconda.github.io/conda-recipe_index.html).

Conda is installed directly with biomaj-docker.

A Python script (available here with its wrapper) installs your package(s) of interest through the BioMAJ post process. You just have to:

  • Add a special block (in your bank.properties file: example here) to install your package(s) of interest
  • Download the conda wrapper in biomaj/process
  • Download the python script in biomaj/process
  • Make them executable: chmod 755 biomaj/process/*

####################
### Post Process ###
####################
# The files should be located in the projectfiles/process directory.


BLOCKS=BLOCK1,BLOCK2
BLOCK1.db.post.process=META0
META0=conda

#wrapper_install_conda.sh + conda_install_multi.py
conda.name=conda
conda.type=install
conda.exe=wrapper_install_conda.sh
conda.args=blast $processdir/packageblast.txt $processdir
conda.cluster=false

BLOCK2.db.post.process=META1
META1=makeblastdb


#makeblastdb.sh
makeblastdb.desc=Index blast
makeblastdb.type=index
makeblastdb.cluster=
makeblastdb.name=makeblastdb
makeblastdb.args="flat/swissprot" "blast/" "-dbtype prot" "swissprot"
makeblastdb.exe=makeblastdb.sh

How to specify the packages to install?

Create a <list>.txt file listing all the conda packages you want to install (version numbers are available on bioconda, for example here), and place it in your biomaj/process directory:

bwa=0.7.8
blast=2.5.0
bowtie2=2.3.4.1

Then complete the conda.args line of your bank.properties file with your environment name and the name of your <list>.txt file.

conda.args=<environment name> $processdir/<list of packages>.txt $processdir
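
Before running an update, the package list can be checked for typos. This sketch assumes every non-blank line must have the name=version form used in the example above; whether the install script accepts other forms is not covered here. check_package_list is a hypothetical helper and the accepted character set is an assumption.

```shell
#!/usr/bin/env bash
# Prints (with line numbers) the lines that are neither blank nor of the form
# name=version, and returns 1 if any are found.
check_package_list() {
    local file="$1" bad
    bad=$(grep -nvE '^([A-Za-z0-9_.-]+=[A-Za-z0-9_.]+|[[:space:]]*)$' "$file" || true)
    if [ -n "$bad" ]; then
        echo "Invalid lines in ${file}:" >&2
        echo "$bad" >&2
        return 1
    fi
}

# Example:
# check_package_list biomaj/process/packageblast.txt
```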

How to use your conda environment?

In your post process script, do not forget to activate your environment (in our example, the makeblastdb.sh file):

source activate blast

Do not forget to deactivate your environment at the end of your script:

source deactivate

To delete your environment:

rm -r <path to processes>/process/blast

Advanced configuration (metrics, monitoring, administration …)

Here.