Global properties
The global.properties file is mandatory. You can start directly from the example given at the end of this page or from the one on the BioMAJ GitHub repository. If no path is specified, global.properties is searched for in the current directory or at the path given by the BIOMAJ_CONF environment variable (export BIOMAJ_CONF=/xx/yy/global.properties).
The main configuration, shared by all banks, is in the global.properties file. It can also be superseded by an optional file in the user's home directory, ~/.biomaj.cfg.
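The lookup described above can be sketched in Python as follows. This is a minimal illustration, not BioMAJ's actual loader: the helper names `find_global_properties` and `load_config` are hypothetical, and both the precedence (explicit path, then current directory, then BIOMAJ_CONF) and the per-key override by ~/.biomaj.cfg are assumptions.

```python
import configparser
import os
from pathlib import Path


def find_global_properties(explicit=None, cwd=None, env=None):
    """Return the path of global.properties, or None if nothing is found.

    Hypothetical helper: the precedence used here is an assumption.
    """
    env = os.environ if env is None else env
    if explicit:
        return Path(explicit)
    local = Path(cwd or Path.cwd()) / "global.properties"
    if local.is_file():
        return local
    from_env = env.get("BIOMAJ_CONF")
    return Path(from_env) if from_env else None


def load_config(main_file, user_file=None):
    """Read the global file, then let the optional user file supersede it."""
    if user_file is None:
        user_file = Path.home() / ".biomaj.cfg"
    parser = configparser.ConfigParser()
    # Files are read in order: values from later files replace earlier ones,
    # and files that do not exist are silently skipped.
    parser.read([str(main_file), str(user_file)])
    return parser
```

Reading both files with a single parser means the user file only needs to contain the keys it changes.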
Set root.dir according to your settings.
Mandatory parameters
BioMAJ needs to know:
- [GENERAL] : header of the global.properties file
- root.dir : path to directory that contains all files of BioMAJ (database, log, properties files)
- conf.dir=%(root.dir)s/conf: directory for all bank.properties files
- log.dir=%(root.dir)s/log: directory of all log files for each bank update
- process.dir=%(root.dir)s/process: directory to store all process files
- cache.dir=%(root.dir)s/cache
- lock.dir=%(root.dir)s/lock: Directory where the bank lock files are stored
- data.dir=%(root.dir)s/db: The root directory where all databases are stored.
- db.url=mongodb://localhost:27017
- db.name=biomaj
If your data is not stored under one directory hierarchy, you can override data.dir in the database properties file, for example:
data.dir=/var/lib/biomaj
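The `%(root.dir)s` references in the list above use the interpolation syntax of Python's `configparser`. The sketch below shows how such values expand; the sample `root.dir` value is only an illustration, and reading the file with a plain `ConfigParser` is an assumption about how the file is consumed.

```python
import configparser

# Sample content; root.dir is an illustrative value, not a required path.
SAMPLE = """
[GENERAL]
root.dir=/var/lib/biomaj
conf.dir=%(root.dir)s/conf
log.dir=%(root.dir)s/log
data.dir=%(root.dir)s/db
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE)
general = parser["GENERAL"]

# Each lookup replaces %(root.dir)s with the value of root.dir
# found in the same section.
for key in ("conf.dir", "log.dir", "data.dir"):
    print(key, "=", general[key])
```

Changing `root.dir` alone is therefore enough to relocate every directory that is defined relative to it.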
Optional parameters:
Reporting
- mail.from: sender’s email address
- mail.smtp.host: SMTP server address
- mail.admin: list of email addresses to send reports to
Options
- use_ldap=0: LDAP authentication
- use_elastic=1: use ElasticSearch to index and search
- historic.logfile.level=DEBUG: level of detail of the log output
- bank.num.threads=4: number of threads for bank management
- files.num.threads=4: number of threads for downloading
- visibility.default=public: default access level of banks
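All values in the file are stored as strings, so flags and counters have to be converted when they are read. The following sketch shows one way to do this with `configparser`; the variable names and the `INFO` fallback are assumptions made for the example, not BioMAJ defaults.

```python
import configparser

parser = configparser.ConfigParser()
parser.read_string("""
[GENERAL]
use_ldap=0
use_elastic=1
bank.num.threads=4
files.num.threads=4
visibility.default=public
""")
general = parser["GENERAL"]

# 0/1 flags are accepted by getboolean(); thread counts by getint().
use_ldap = general.getboolean("use_ldap")
use_elastic = general.getboolean("use_elastic")
bank_threads = general.getint("bank.num.threads")

# A fallback covers optional keys that are absent from the file.
# "INFO" is an illustrative fallback only.
log_level = general.get("historic.logfile.level", fallback="INFO")
```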
Parsing of HTTP server
It is possible to extract the following bank information from a URL:
- Date
- Name
- Size
Example of the regular expression used by default in BioMAJ:
http.parse.dir.line=<img[\s]+src="[\S]+"[\s]+alt="\[DIR\]"[\s]*/?>[\s]*<a[\s]+href="([\S]+)/"[\s]*>.*([\d]{2}-[\w\d]{2,5}-[\d]{4}\s[\d]{2}:[\d]{2})
http.parse.file.line=<img[\s]+src="[\S]+"[\s]+alt="\[[\s]+\]"[\s]*/?>[\s]<a[\s]+href="([\S]+)".*([\d]{2}-[\w\d]{2,5}-[\d]{4}\s[\d]{2}:[\d]{2})[\s]+([\d\.]+[MKG]{0,1})
http.group.dir.name=1
http.group.dir.date=2
http.group.file.name=1
http.group.file.date=2
http.group.file.size=3
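To see how the http.group.* values relate to the expressions, the sketch below applies the default http.parse.file.line pattern to one line of a directory listing with Python's `re` module. The sample HTML line is hypothetical, and using `re.search` is an assumption about how the pattern is applied.

```python
import re

# Default http.parse.file.line pattern, split over two raw strings.
FILE_LINE = re.compile(
    r'<img[\s]+src="[\S]+"[\s]+alt="\[[\s]+\]"[\s]*/?>[\s]<a[\s]+href="([\S]+)"'
    r'.*([\d]{2}-[\w\d]{2,5}-[\d]{4}\s[\d]{2}:[\d]{2})[\s]+([\d\.]+[MKG]{0,1})'
)

# Capture group numbers, as given by http.group.file.name/date/size.
GROUP_NAME, GROUP_DATE, GROUP_SIZE = 1, 2, 3

# Hypothetical line from an Apache-style directory listing.
line = ('<img src="/icons/unknown.gif" alt="[   ]"> '
        '<a href="file1.txt">file1.txt</a>  12-Mar-2015 10:30  1.2M')

match = FILE_LINE.search(line)
if match:
    name = match.group(GROUP_NAME)
    date = match.group(GROUP_DATE)
    size = match.group(GROUP_SIZE)
    print(name, date, size)
```

If a server formats its listing differently, both the expression and the group numbers have to be adapted together.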
More options here.
How to modify and use your own global.properties with Docker?
In the Docker version, BioMAJ uses a standard global.properties file that cannot be modified in place. To use your own global.properties file, mount it in every service of the docker-compose.yml file, as follows:
volumes:
  - <path_to_global_file>:/etc/biomaj/global.properties
For example in the biomaj-public-proxy service of the docker-compose.yml file:
biomaj-public-proxy:
  image: osallou/biomaj-proxy
  volumes:
    - ${BIOMAJ_DIR}/proxy/public:/proxy:ro
    - ${BIOMAJ_DIR}/biomaj:/var/lib/biomaj/data
    - <path_to_global_file>:/etc/biomaj/global.properties
  ports:
    - "5000:80"
  depends_on:
    - biomaj-consulv
Remember to add it to each service: biomaj-mongo, biomaj-redis, biomaj-elasticsearch, biomaj-download-message, etc.
Example of a global.properties file:
[GENERAL]
test=1
root.dir=/<path>/<to>/<biomaj>/<file>
conf.dir=%(root.dir)s/conf
log.dir=%(root.dir)s/log
process.dir=%(root.dir)s/process
cache.dir=%(root.dir)s/cache
# Directory where the bank lock files are stored
lock.dir=%(root.dir)s/lock
data.dir=%(root.dir)s/db
db.url=mongodb://localhost:27017
db.name=biomaj_test
use_ldap=1
ldap.host=localhost
ldap.port=389
ldap.dn=nodomain
# Use ElasticSearch for index/search capabilities
use_elastic=0
#Comma separated list of elasticsearch nodes host1,host2:port2
elastic_nodes=localhost
elastic_index=biomaj_test
celery.queue=biomaj
celery.broker=mongodb://localhost:27017/biomaj_celery
# Get directory stats (can be time consuming depending on number of files etc...)
data.stats=1
# List of user admin (linux user id, comma separated)
admin=
# Auto publish on updates (no publish flag needed, can be overridden in the bank property file)
auto_publish=0
########################
# Global properties file
#To override these settings for a specific database go to its
#properties file and uncomment or add the specific line you want
#to override.
#----------------
# Mail Configuration
#---------------
#Uncomment these lines if you want to receive mail when the workflow is finished
#mail.smtp.host=
mail.admin=
mail.from=
#mail.user=
#mail.password=
#mail.tls=
#---------------------
#Proxy authentication
#---------------------
#proxyHost=
#proxyPort=
#proxyUser=
#proxyPassword=
#Number of threads for processes
bank.num.threads=2
#Number of threads to use for downloading
files.num.threads=4
#to keep more than one release increase this value
keep.old.version=0
#----------------------
# Release configuration
#----------------------
release.separator=_
#The historic log file is generated in log/
#define level information for output : DEBUG,INFO,WARN,ERR
historic.logfile.level=DEBUG
#http.parse.dir.line=<a[\s]+href="([\S]+)/".*alt="\[DIR\]">.*([\d]{2}-[\w\d]{2,5}-[\d]{4}\s[\d]{2}:[\d]{2})
http.parse.dir.line=<img[\s]+src="[\S]+"[\s]+alt="\[DIR\]"[\s]*/?>[\s]*<a[\s]+href="([\S]+)/"[\s]*>.*([\d]{2}-[\w\d]{2,5}-[\d]{4}\s[\d]{2}:[\d]{2})
http.parse.file.line=<img[\s]+src="[\S]+"[\s]+alt="\[[\s]+\]"[\s]*/?>[\s]<a[\s]+href="([\S]+)".*([\d]{2}-[\w\d]{2,5}-[\d]{4}\s[\d]{2}:[\d]{2})[\s]+([\d\.]+[MKG]{0,1})
http.group.dir.name=1
http.group.dir.date=2
http.group.file.name=1
http.group.file.date=2
http.group.file.size=3
# Bank default access
visibility.default=public
[loggers]
keys = root, biomaj
[handlers]
keys = console
[formatters]
keys = generic
[logger_root]
level = INFO
handlers = console
[logger_biomaj]
level = DEBUG
handlers = console
qualname = biomaj
propagate=0
[handler_console]
class = StreamHandler
args = (sys.stderr,)
level = DEBUG
formatter = generic
[formatter_generic]
format = %(asctime)s %(levelname)-5.5s [%(name)s][%(threadName)s] %(message)s
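The [loggers], [handlers] and [formatters] sections at the end of the example follow the file format read by Python's `logging.config.fileConfig`. Assuming the file is loaded that way, the sketch below feeds the same sections to the standard library from an in-memory copy; the log message is only an illustration.

```python
import io
import logging
import logging.config

# In-memory copy of the logging sections from the example file.
LOGGING_SECTIONS = """
[loggers]
keys = root, biomaj

[handlers]
keys = console

[formatters]
keys = generic

[logger_root]
level = INFO
handlers = console

[logger_biomaj]
level = DEBUG
handlers = console
qualname = biomaj
propagate = 0

[handler_console]
class = StreamHandler
args = (sys.stderr,)
level = DEBUG
formatter = generic

[formatter_generic]
format = %(asctime)s %(levelname)-5.5s [%(name)s][%(threadName)s] %(message)s
"""

# fileConfig accepts a file name or a file-like object.
logging.config.fileConfig(io.StringIO(LOGGING_SECTIONS))

log = logging.getLogger("biomaj")
# Emitted on stderr: the biomaj logger is set to DEBUG and does not
# propagate to the root logger, so the message is printed once.
log.debug("bank update started")
```

Raising the level in [logger_biomaj] to INFO or WARN reduces the console output without touching historic.logfile.level.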