How to run BioMAJ?

Definitions
  • Bank: Any file or set of files located on a remote server. In BioMAJ, a bank corresponds to a <bank>.properties file, which describes all properties of the bank (server, protocol, location, processes, etc.).
  • Process: Any transformation (script) applied before or after the bank download.
  • Workflow: how does it work? Here.
BioMAJ needs to know:
Directory structure

How will banks be stored?

  • /db/data
    • /offlinedir
    • /bank_1
      • /version_1
      • /version_2
      • current => version_2: current is a symlink to the bank version in use
    • /bank_2
      • /version_1
      • /version_2
      • current => version_2
  • <rootdir>/conf: configuration files (<bank>.properties)
  • <rootdir>/lock: bank locks (to avoid errors during download, processing…)
  • <rootdir>/process: where processes are stored (they can also be on the PATH); every process can be called by BioMAJ or by a wrapper.
  • <rootdir>/cache
  • <rootdir>/log: where to store the logs of each bank and process (/bank/version/execution)
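
The layout above can be reproduced by hand to see how the pieces fit together. The following is a minimal sketch, not what BioMAJ itself runs: it builds the tree in a temporary directory, and the names ROOT, DATA and make_bank are hypothetical.

```shell
#!/bin/sh
# Illustrative only: BioMAJ creates and manages these directories itself.
# ROOT, DATA and make_bank are hypothetical names used for this sketch.
ROOT=$(mktemp -d)          # stands in for <rootdir>
DATA="$ROOT/db/data"       # stands in for /db/data

# Configuration, lock, process, cache and log directories under <rootdir>.
mkdir -p "$ROOT/conf" "$ROOT/lock" "$ROOT/process" "$ROOT/cache" "$ROOT/log"
mkdir -p "$DATA/offlinedir"

# Create a bank with two versions and point "current" at the newest one.
make_bank() {
    bank_dir="$DATA/$1"
    mkdir -p "$bank_dir/version_1" "$bank_dir/version_2"
    # -s: symbolic link, -f: replace an existing link, -n: do not follow it
    ln -sfn version_2 "$bank_dir/current"
}

make_bank bank_1
make_bank bank_2

# Show where "current" points for bank_1.
readlink "$DATA/bank_1/current"
```

Because the link target is relative, the whole bank directory can be moved without breaking current.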

First, the pre-processes are applied; then the files are downloaded and uncompressed. Files can be selected via the local.files variable; the selected files are stored in the flat/ directory, and the post-processes are then applied to them.
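
As a rough illustration of the selection step, the sketch below filters a list of downloaded file names with a regular expression. It assumes the selection variable is interpreted as a regular expression matched against file names; the pattern, the file names and the select_files helper are made up for the example.

```shell
#!/bin/sh
# Hypothetical pattern and file names; assumes selection works like a
# regular expression matched against each downloaded file name.
PATTERN='^alu\.(a|n).*'

select_files() {
    # Read file names on stdin, print only those matching PATTERN.
    grep -E "$PATTERN"
}

# Only the names matching the pattern are kept.
SELECTED=$(printf '%s\n' alu.a alu.n README alu.tmp.gz | select_files)
echo "$SELECTED"
```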

Some generalities
  • Only one remote location per bank (it is not possible to mix protocols)
  • All execution logs are written to the log directory, organized per bank/version/execution, including per-process logs.
  • If a workflow step fails, the update stops. At the next update, the workflow restarts at the failed stage.
  • The bank is usable when the entire workflow has been successfully completed.
  • In case of failure, only files whose download was incomplete or failed are downloaded again.
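
The restart behaviour described above can be pictured with a small simulation. This is a hypothetical sketch, not BioMAJ's implementation: the stage names, STATE_FILE, FAIL_AT, run_stage and run_workflow are all invented for the example, and the "work" done by each stage is a stand-in.

```shell
#!/bin/sh
# Hypothetical sketch of "restart at the failed stage"; this is not
# BioMAJ's code. STAGES, STATE_FILE, FAIL_AT and the functions are made up.
STAGES="preprocess download uncompress postprocess"
STATE_FILE=$(mktemp)       # remembers which stages already succeeded
FAIL_AT=""                 # name of a stage to fail on purpose, if any

run_stage() {
    # Stand-in for real work: fail only when asked to.
    [ "$1" != "$FAIL_AT" ]
}

run_workflow() {
    for stage in $STAGES; do
        # Skip stages that a previous run already completed.
        if grep -qx "$stage" "$STATE_FILE"; then
            echo "skip $stage"
            continue
        fi
        if run_stage "$stage"; then
            echo "$stage" >> "$STATE_FILE"
            echo "done $stage"
        else
            echo "failed $stage"
            return 1       # stop the update at the first failure
        fi
    done
}

# First update fails during uncompress; the second resumes from there.
FAIL_AT=uncompress
FIRST=$(run_workflow) || true
FAIL_AT=""
SECOND=$(run_workflow)
printf '%s\n--\n%s\n' "$FIRST" "$SECOND"
```

The second run skips the stages recorded in the state file and picks up at the stage that failed, which is the behaviour the list describes.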
Then you can start to use the BioMAJ client:

The global.properties file is mandatory. If it is not specified with --config, BioMAJ looks for global.properties in the current directory or at the path given by the BIOMAJ_CONF environment variable (export BIOMAJ_CONF=/xx/yy/global.properties).
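
The lookup can be sketched as a small shell function. The function name find_global_properties is hypothetical, and the precedence shown (explicit path, then current directory, then BIOMAJ_CONF) is an assumption based on the order in which the text lists the locations; the real order may differ.

```shell
#!/bin/sh
# Sketch of the lookup described above. The function name and the order
# (explicit argument, then current directory, then BIOMAJ_CONF) are
# assumptions, not BioMAJ's documented precedence.
find_global_properties() {
    if [ -n "${1:-}" ]; then
        echo "$1"                       # path given explicitly
    elif [ -f ./global.properties ]; then
        echo "./global.properties"      # file in the current directory
    elif [ -n "${BIOMAJ_CONF:-}" ]; then
        echo "$BIOMAJ_CONF"             # path from the environment
    else
        echo "global.properties not found" >&2
        return 1
    fi
}
```

Calling find_global_properties with no argument prints the file that would be picked up under these assumptions, or fails if none is found.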

Need help?

biomaj-cli.py -h

How to check a bank's status?

biomaj-cli.py --config global.properties --status --bank alu

How to check if your bank file is OK?

biomaj-cli.py --config global.properties --check --bank alu

How to update a bank?

biomaj-cli.py --config global.properties --bank alu --update
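
The check, update and status commands above can be chained in a small wrapper. This is only a sketch: BIOMAJ, CONFIG and refresh_bank are hypothetical names, and it assumes biomaj-cli.py exits with a non-zero status when a step fails.

```shell
#!/bin/sh
# Sketch of a wrapper chaining the commands shown above. BIOMAJ, CONFIG and
# refresh_bank are hypothetical names, and the sketch assumes biomaj-cli.py
# exits with a non-zero status when a step fails.
BIOMAJ=${BIOMAJ:-biomaj-cli.py}
CONFIG=${CONFIG:-global.properties}

refresh_bank() {
    # Stop at the first command that fails.
    "$BIOMAJ" --config "$CONFIG" --check --bank "$1" &&
    "$BIOMAJ" --config "$CONFIG" --bank "$1" --update &&
    "$BIOMAJ" --config "$CONFIG" --status --bank "$1"
}

# Example call (requires a working BioMAJ installation):
# refresh_bank alu
```

Setting BIOMAJ=echo before calling refresh_bank prints the command arguments instead of running them, which is a convenient dry run.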

How to publish a bank and what is a published bank?

Publishing a bank creates a symbolic link named current that points to the specified release. This lets users always access the bank through the same path (/../mybank/current). You can manage publishing at update time or later on with the --publish or --publish-version options. One and only one release can be published for each bank.

biomaj-cli.py --config global.properties --bank alu --publish
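
On disk, publishing amounts to repointing the current link. The sketch below is a hypothetical illustration of that effect in a temporary directory; BioMAJ does this for you, and publish_release and BANK_DIR are made-up names.

```shell
#!/bin/sh
# Hypothetical illustration of what "publishing" amounts to on disk;
# BioMAJ does this for you. publish_release and BANK_DIR are made-up names.
BANK_DIR=$(mktemp -d)/mybank
mkdir -p "$BANK_DIR/version_1" "$BANK_DIR/version_2"

publish_release() {
    # Refuse to publish a release that is not on disk.
    [ -d "$BANK_DIR/$1" ] || { echo "no such release: $1" >&2; return 1; }
    # Replacing the link means only one release is published at a time.
    ln -sfn "$1" "$BANK_DIR/current"
}

publish_release version_1
publish_release version_2

# The link now points at the last release published.
readlink "$BANK_DIR/current"
```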

See more options.

How to run BioMAJ with Docker? Here.

How does it work? Here.