Production deployment
The getting started section is designed for quick local testing with minimal configuration. For a production deployment intended for external users, you should customize the settings to fit your specific needs.
To help you with this,
the Neurobagel recipes repository
includes dedicated production deployment recipes that should
cover the most common use cases.
Start with a fresh recipe for each deployment
Make a fresh clone of the
recipes repository for each
deployment you want to launch.
For example, hosting multiple Neurobagel nodes,
or a single node alongside a containerized proxy server,
means creating a separate clone of the recipes repo for each node or proxy server.
This will make it easier to update and maintain each deployment.
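As a rough sketch of this layout, the snippet below prints one `git clone` command per deployment so you can review the plan before running anything. The directory labels (`proxy`, `node-site1`, `portal`) and the helper `clone_cmd` are hypothetical; only the repository URL comes from this guide.

```shell
# Hypothetical helper: prints (does not run) the clone command for one deployment.
RECIPES_URL="https://github.com/neurobagel/recipes.git"

clone_cmd() {
  # $1: a label of your choice for the deployment, used in the directory name
  printf 'git clone %s recipes-%s\n' "$RECIPES_URL" "$1"
}

# One fresh clone per deployment (example labels, adjust to your setup).
for label in proxy node-site1 portal; do
  clone_cmd "$label"
done
```

Once the printed commands look right, you could run them by piping the output to `sh`.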
Deployable services
All Neurobagel services are containerized. The production deployment recipes use Docker Compose profiles to group related services and allow them to be launched together.
Neurobagel services that are included in the Docker Compose deployment recipes:
- Neurobagel node API/n-API (api): The API that communicates with a single graph store and determines how detailed the response to a query should be from that graph.
- Graph store (graph): A third-party RDF store that stores Neurobagel-harmonized data to be queried. At the moment our recipe uses the free tier of GraphDB for this.
- Neurobagel federation/f-API (federation): A special API that can federate over one or more Neurobagel nodes to provide a single point of access to multiple distributed databases. By default it will federate over all public nodes and any local nodes you specify.
- Neurobagel query tool (query_federation): A web app that provides a graphical interface for users to query a federation API and view the results from one or more nodes. Because the query tool is a static app and is run locally in the user's browser, this service simply hosts the app.
Two additional, third-party services are part of production deployment recipes:
- NGINX reverse proxy (nginx-proxy): A containerized version of the popular proxy server that lets you serve your Neurobagel services under custom URLs, handling routes automatically.
- Automatic SSL certificate service (acme-companion): A containerized companion tool for nginx that automatically provisions SSL certificates for the routes created by nginx so users can communicate with your services via encrypted HTTPS connections.
Deployment profiles
Neurobagel offers different deployment profiles that allow you to launch specific combinations of services (listed below), depending on your use case.
- node: Deploys an individual Neurobagel node. A Neurobagel node includes an internal graph database and a node API that handles all incoming queries and talks to the graph database. You can run several nodes on the same machine. Services: api, graph
- portal: Deploys the federation engine and a connected web query interface. Use this profile only if you need to host your own federated query tool, e.g. to federate over nodes that are not in the list of public Neurobagel nodes. Services: federation, query_federation
- proxy: Deploys pre-configured, containerized reverse-proxy services that will automatically set up routes to your Neurobagel services under your desired URLs.
You can also use an existing proxy server
Our deployment instructions assume that there is no existing proxy server set up on the machine that will host your Neurobagel services. If you already have a proxy server set up, follow the slightly modified steps described in deploying with an existing proxy server.
In this case, you can ignore the proxy deployment recipe.
Setting up a production deployment
Common setup for all deployment profiles
Do these steps first for each deployment profile you set up
Each production deployment profile requires a fresh deployment recipe and begins with the same initial steps. Complete these steps before following the profile-specific setup instructions.
Clone the recipe repository
Make a fresh clone of the recipes repository in a location of your choice.
git clone https://github.com/neurobagel/recipes.git recipes
Consider changing recipes to a name you will recognize in the future
Then navigate into this directory for the remaining steps.
cd recipes
Copy the template configuration files
The recipe repository includes templates of files for configuring your deployment: .env and
local_nb_nodes.json. Copy and rename these templates, but do not edit
the templates themselves.
cp template.env .env
cp local_nb_nodes.template.json local_nb_nodes.json
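If you rerun the setup in a directory you have already configured, a plain `cp` would overwrite your edited files. The sketch below copies a template only when the target does not exist yet; `copy_template` is a hypothetical helper name, not part of the recipes.

```shell
# Hypothetical helper: copy a template to its target unless the target already exists.
copy_template() {
  src=$1
  dest=$2
  if [ -e "$dest" ]; then
    echo "keeping existing $dest"
  else
    cp "$src" "$dest" && echo "created $dest from $src"
  fi
}

# Run from inside your clone of the recipes repository:
# copy_template template.env .env
# copy_template local_nb_nodes.template.json local_nb_nodes.json
```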
Proxy server
Skip if you already have a reverse proxy server
If you already have a reverse proxy server set up and want to continue using it, do not follow this section and instead continue with our guide on production deployment with an existing proxy server.
Always launch the proxy server first
Our reverse proxy recipe is set up to automatically configure routes to the Neurobagel services you launch. In order to do that, the proxy server must already be running when you launch a new Neurobagel service. If you have already launched Neurobagel services (e.g. node or portal), shut them down again, launch the proxy server, and then relaunch the services.
Start from a fresh deployment recipe!
To host your Neurobagel node services under a custom URL
(e.g. https://www.myfirstnode.org/query) rather than a server IP address and port
(e.g. http://192.168.0.1:3000), we provide a recipe for you to easily set up an
NGINX reverse proxy
alongside your Neurobagel services.
Make sure that:
- you have access to the domain you want to host your services at, so that you can create a DNS entry that points at the machine that will host your Neurobagel services.
- your machine firewall allows incoming connections on ports 80 (HTTP) and 443 (HTTPS). Both are necessary to enable HTTPS connections.
Launch the proxy server using the corresponding deployment recipe:
docker compose -f docker-compose.proxy.yml up -d
You should now see the nginx-proxy and acme-companion services running:
docker ps
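To check this from a script rather than by eye, one option is to compare the container names reported by `docker ps` against the two expected service names. This is a sketch: `missing_services` is a hypothetical helper, and it matches by substring because the exact container names depend on how Docker Compose names them on your machine.

```shell
# Hypothetical helper: reads container names on stdin (one per line) and
# prints each expected name (given as arguments) that matches none of them.
missing_services() {
  names=$(cat)
  for svc in "$@"; do
    printf '%s\n' "$names" | grep -q -F -- "$svc" || printf '%s\n' "$svc"
  done
}

# With the proxy running, this should print nothing:
# docker ps --format '{{.Names}}' | missing_services nginx-proxy acme-companion
```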
Node
Start from a fresh deployment recipe!
Make sure the proxy service is already running
The default node deployment recipe requires that you have already
deployed the proxy server.
Set graph store credentials
This is a security relevant section!
The graph store (GraphDB instance) in a Neurobagel node
is secured with password-based access and includes two users:
an admin superuser and a regular database user,
both of which are automatically configured by the Neurobagel deployment recipe.
Passwords for both users are defined via files in the ./secrets directory
of the recipes repository, while the regular database username is set
through an environment variable in the .env file.
Changing user credentials after the first launch requires a hard reset
If you've previously launched a Neurobagel deployment (Docker Compose stack), you'll need to reset your graph store for any changes you have made to user credentials to take effect. Any other configuration changes you've already made will be applied when you re-launch your node.
- In your .env, set a custom username and database name for your graph store by editing the following variables:
  - NB_GRAPH_USERNAME
  - NB_GRAPH_DB
- In the ./secrets directory, change the default passwords by replacing the contents of the file NB_GRAPH_ADMIN_PASSWORD.txt for the admin superuser, and the file NB_GRAPH_PASSWORD.txt for the graph database user (corresponding to NB_GRAPH_USERNAME).
  - To generate a random password in the terminal, you can use:
    openssl rand -hex 16
  - (Optional) You can change the directory where your password files are stored by editing the NB_GRAPH_SECRETS_PATH variable in .env.
Graph store passwords are only for administrator use!
The admin user and graph database user credentials are intended solely for internal use by the deployment recipe scripts that automatically set up and update the graph store, or for a node administrator to interact directly with the graph store. These credentials also secure internal communication between your graph store and its node API, ensuring that node users cannot query your graph directly. GraphDB user credentials are not intended for use by a general node query user.
Passwords are handled as Docker secrets
The contents of NB_GRAPH_ADMIN_PASSWORD.txt and NB_GRAPH_PASSWORD.txt are passed to Neurobagel containers as Docker secrets. This ensures that your passwords are not exposed in the container logs or in the docker-compose.yml file.
Do not share your password files with others.
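The password step can also be scripted. The sketch below writes a fresh random password to a file and restricts the file permissions to the current user. It assumes the default ./secrets directory; `write_secret` is a hypothetical helper, the fallback to `/dev/urandom` is only there for machines without `openssl`, and writing the password without a trailing newline is a cautious choice rather than a documented requirement.

```shell
# Hypothetical helper: write a random 32-character hex password to a file.
write_secret() {
  file=$1
  if command -v openssl >/dev/null 2>&1; then
    password=$(openssl rand -hex 16)
  else
    # Fallback (assumption): read 16 random bytes and print them as hex.
    password=$(od -An -N16 -tx1 /dev/urandom | tr -d ' \n')
  fi
  # Create the file with owner-only permissions, then write the password
  # without a trailing newline.
  ( umask 077 && printf '%s' "$password" > "$file" )
  chmod 600 "$file"
}

# Run from inside your clone of the recipes repository:
# write_secret ./secrets/NB_GRAPH_ADMIN_PASSWORD.txt
# write_secret ./secrets/NB_GRAPH_PASSWORD.txt
```

If the node has been launched before, new passwords only take effect after the graph store is reset.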
Add data to the node
By default,
any JSONLD data in the
./data
subdirectory of your deployment recipe directory
will be automatically uploaded to the graph store.
To add the dataset JSONLD files for your node, you can either:
- place the JSONLD files inside ./data (remember to delete the example JSONLD), OR
- change the path where the deployment recipe will look for JSONLD files by editing the variable LOCAL_GRAPH_DATA in your .env file
Set node response granularity
This is a security relevant section!
Review and change as needed the following variables in .env
based on your data sharing requirements:
- NB_RETURN_AGG: whether to return aggregate counts only, instead of subject-level records
- NB_MIN_CELL_SIZE: minimum matching subject threshold for dataset visibility in queries
For more details on all available environment variables, check the Environment variable reference.
Set node domain
In your .env file,
uncomment the variable NB_NAPI_DOMAIN and set it to the domain
(including any subdomain) that you want to use for your node API
(the web-accessible part of your node).
NB_NAPI_DOMAIN=node.mydomain.org
Do not include the protocol (http:// or https://) in the domain name
Set node deployment profile
In your .env file, ensure that COMPOSE_PROFILES is set to
the node profile. This is the default value.
COMPOSE_PROFILES=node
Set node subdirectory path
This is an optional step
To host your node API on a subdirectory of your domain (e.g. mydomain.org/node),
uncomment and set NB_NAPI_BASE_PATH
to the desired subdirectory path in the .env file.
NB_NAPI_BASE_PATH="/node"
This is useful if you want to serve several nodes (or services)
on the same domain, because you can use a different subdirectory for each
(e.g. mydomain.org/node1, mydomain.org/node2, ...).
Custom paths must include a leading slash /
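Two formatting rules from this section are easy to get wrong: the domain must not include a protocol, and a custom path must start with a slash. A small pre-launch check along these lines can catch both. The function names are hypothetical, and the checks cover only these two rules.

```shell
# Hypothetical checks for two .env values; each returns non-zero on a problem.
check_domain() {
  case $1 in
    "")
      echo "domain is empty" >&2
      return 1 ;;
    http://*|https://*)
      echo "domain must not include the protocol: $1" >&2
      return 1 ;;
  esac
}

check_base_path() {
  case $1 in
    /*) return 0 ;;
    *)
      echo "path must start with a slash: $1" >&2
      return 1 ;;
  esac
}

# Example values taken from this guide.
check_domain "node.mydomain.org" && check_base_path "/node" && echo "values look fine"
```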
Launch node
Save the changes to your .env file and launch your node:
docker compose -f docker-compose.prod.yml up -d
Portal
Start from a fresh deployment recipe!
Make sure the proxy server is already running
The default portal deployment recipe requires that you have already
deployed the proxy server.
Set nodes to federate
To host your own query portal that federates over a set of nodes,
use your local_nb_nodes.json to configure the nodes of interest.
Each node to be federated over is defined using a dictionary with two required keys:
- NodeName: Display name of the node, which will be shown in the query portal
- ApiURL: URL of the node API for the node
Example:
[
{
"NodeName": "Parkinson's Disease Data - Site 1",
"ApiURL": "https://mydomain.org/site1"
},
{
"NodeName": "Parkinson's Disease Data - Site 2",
"ApiURL": "https://mydomain.org/site2"
}
]
ApiURL must include the protocol (http:// or https://)
Public Neurobagel nodes do not need to be included
We maintain a list of publicly accessible Neurobagel nodes
here.
By default, every new f-API will look up this list
on startup and include it in its internal list of nodes to
federate over (this can be disabled using the environment variable NB_FEDERATE_REMOTE_PUBLIC_NODES).
This means that you do not have to manually add these public nodes to your local_nb_nodes.json file.
Avoid creating an infinite loop
Make sure you do not include your own f-API in the list of nodes to federate over. This will cause an infinite request loop that will likely overload your service, as an f-API will be repeatedly making requests to itself.
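Before launching the portal, it can help to sanity-check local_nb_nodes.json against the two rules above: every ApiURL carries a protocol, and your own f-API is not in the list. The sketch below is a rough text-based check using `grep` and `sed` rather than a real JSON parser, so it assumes each ApiURL key and its value sit on one line, as in the example. `check_nodes_file` is a hypothetical helper; its second argument is the address you plan to use for your own f-API, without the protocol (domain plus any subdirectory path).

```shell
# Hypothetical helper: rough, text-based check of a local_nb_nodes.json file.
# $1: path to the JSON file
# $2: address of your own f-API without protocol (domain plus any base path)
check_nodes_file() {
  file=$1
  fapi=$2
  status=0
  # Extract the value of every "ApiURL" entry, one per line.
  urls=$(grep -o '"ApiURL"[[:space:]]*:[[:space:]]*"[^"]*"' "$file" |
    sed 's/^"ApiURL"[[:space:]]*:[[:space:]]*"//; s/"$//')
  while IFS= read -r url; do
    [ -n "$url" ] || continue
    case $url in
      http://*|https://*) ;;
      *) echo "missing protocol: $url" >&2; status=1 ;;
    esac
    # Strip the protocol, then compare against your own f-API address.
    address=${url#*://}
    case $address in
      "$fapi"|"$fapi"/) echo "own f-API listed: $url" >&2; status=1 ;;
    esac
  done <<EOF
$urls
EOF
  return $status
}

# check_nodes_file local_nb_nodes.json mydomain.org/federate
```

The function returns a non-zero status if it finds a problem, so it can gate a launch command with `&&`.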
Set portal domains
In your .env file, set the domain names for your
web query tool and federation API by uncommenting and setting the following variables:
- NB_QUERY_DOMAIN for the query tool
- NB_FAPI_DOMAIN for the federation API
Do not include the protocol (http:// or https://) in the domain name
Set portal subdirectory paths
This is an optional step
To host one or more of your portal services on a subdirectory
of your domain (e.g. mydomain.org/federate),
uncomment and set the following variable(s) to the desired
subdirectory path(s) in the .env file:
- NB_QUERY_APP_BASE_PATH for the query tool
- NB_FAPI_BASE_PATH for the federation API
Custom paths must include a leading slash /
This is useful if you want to serve both services on the same domain,
because you can use a different subdirectory for each
(e.g. mydomain.org/federate, mydomain.org/query).
Set portal deployment profile
In your .env file, set COMPOSE_PROFILES to
the portal profile.
COMPOSE_PROFILES=portal
Launch portal
Save the changes to your .env and local_nb_nodes.json files and launch your portal:
docker compose -f docker-compose.prod.yml up -d
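Putting the launch order together: the proxy has to be running before any node or portal is started. The sketch below encodes that order for a set of deployment directories. The directory names are hypothetical, as are the helpers `compose_up` and `launch_all`. By default the script only prints the commands it would run (`DRY_RUN=1`); set `DRY_RUN=0` to actually run them.

```shell
# Hypothetical helper: run (or, by default, just print) `docker compose up`
# for one compose file inside one deployment directory.
compose_up() (
  dir=$1
  file=$2
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '(cd %s && docker compose -f %s up -d)\n' "$dir" "$file"
  else
    cd "$dir" && docker compose -f "$file" up -d
  fi
)

# First argument: the proxy deployment directory.
# Remaining arguments: node and portal deployment directories.
launch_all() {
  proxy_dir=$1
  shift
  # The proxy must be up before any other service is launched.
  compose_up "$proxy_dir" docker-compose.proxy.yml || return 1
  for dir in "$@"; do
    compose_up "$dir" docker-compose.prod.yml || return 1
  done
}

# Example directory names (assumed); prints three commands in dry-run mode.
launch_all recipes-proxy recipes-node-site1 recipes-portal
```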
Making your node publicly discoverable
The public Neurobagel query tool (https://query.neurobagel.org) provides query federation to all publicly accessible Neurobagel nodes.
To make your node queryable at https://query.neurobagel.org, it simply needs to be added to our public federation index on GitHub.
If you have a GitHub Account
- Fork the neurobagel/menu repository.
- Add your node name and node API URL to the public federation index JSON file, using the following format:
  { "NodeName": "NAME OF YOUR NODE", "ApiURL": "https://URL-OF-YOUR-NODE-API" }
  NodeName defines the display name of your node as it will appear in the Neurobagel query tool.
- Open a pull request in the neurobagel/menu repository.
If you do not have a GitHub Account
Join the Neurobagel Discord server and message @neurobagel/dev with your node information, so that a maintainer can add your node to the public federation index.