
Maintaining a node

Once your Neurobagel node is up and running, there are some recurring tasks you may need to perform to keep it operating correctly.

Updating the Neurobagel services

Updating the Neurobagel Docker images

We are continuously improving Neurobagel tools and services, so you may want to update your Neurobagel node to the latest version to benefit from new features and bug fixes. We always publish our tools as Docker images on Docker Hub.

Each Docker image has a semantic version tag (vX.Y.Z), and also two rolling tags:

  • latest (the latest stable release). This is the default tag used in the Neurobagel docker-compose.yml file.
  • nightly (the latest build from the main branch). This tag is only used for compatibility testing and should not be used in production.
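If you prefer to pin your node to a specific release rather than track latest, the image tags can be overridden through the *_TAG variables listed in the environment variables reference below. A sketch of the relevant .env lines (the version numbers here are placeholders; check Docker Hub for the actual releases):

```shell
# .env fragment: pin Neurobagel images to specific releases instead of "latest".
# Version numbers below are hypothetical examples.
NB_NAPI_TAG=v0.3.1
NB_FAPI_TAG=v0.2.0
NB_QUERY_TAG=v0.5.0
```

Pinning versions makes updates explicit: `docker compose pull` will only fetch new images after you bump the tags in .env.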

You can pull the most recent Docker images for Neurobagel tools by running:

docker compose pull
Not sure what version you have?

Since latest is a rolling tag, each latest Docker image for a Neurobagel tool includes its corresponding semantic version number (vX.Y.Z) as part of its Docker image labels.

You can find the labels for an image you have pulled in the image metadata, e.g.:

docker image inspect neurobagel/api:latest
or, to view only the labels:
docker image inspect --format='{{json .Config.Labels}}' neurobagel/api:latest
In either case, you should see something like this in the output:

    "Labels": {
        "org.opencontainers.image.source": "https://github.com/neurobagel/api",
        "org.opencontainers.image.revision": "01530f467e163f3dff595d3327bc60ba453de47d",
        "org.opencontainers.image.version": "v0.3.1"
    }
where "org.opencontainers.image.version" refers to the version number.
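To print only the version string, Docker's Go-template `--format` flag can index a single label key directly (this requires the image to already be pulled locally):

```shell
# Print just the semantic version label of the pulled image
docker image inspect \
  --format='{{index .Config.Labels "org.opencontainers.image.version"}}' \
  neurobagel/api:latest
```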

Restarting services after an update

Whether you have updated the Docker images, the configuration, or the data of your Neurobagel node, you will need to restart the services to apply the changes.

To shut down your Neurobagel services, navigate to the directory containing your deployment recipe and run:

docker compose down

Then, to start the services again:

docker compose up -d

For production deployments, you must specify the recipe filename

To relaunch services for a node or portal deployment, you must provide the production Docker Compose recipe filename explicitly using the -f option:

docker compose -f docker-compose.prod.yml up -d
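Putting the update and restart steps together, a typical update cycle for a node looks like the following (run from the directory containing your deployment recipe; add `-f docker-compose.prod.yml` to each command for a production deployment):

```shell
# Typical update cycle for a Neurobagel node
docker compose pull    # fetch the newest images
docker compose down    # stop the running services
docker compose up -d   # relaunch with the updated images
```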

Updating the data in your graph

The Neurobagel deployment recipe launches a dedicated graph database that stores the datasets for a single node. The data in this graph database is loaded from the path specified in the LOCAL_GRAPH_DATA environment variable, and can be changed at any time.

By default, the graph database will only contain an example dataset called BIDS synthetic.

If you have followed the initial setup for deploying a Neurobagel node from our Docker Compose recipe, replacing the existing data in your graph database with your own data (or updated data) is a straightforward process.

Once you have generated or updated the JSONLD files you want to upload, to update the data in your graph:

  1. Shut down the Neurobagel node, if it is already running

    docker compose down
    
  2. Update the data files in the directory specified by the LOCAL_GRAPH_DATA variable in .env, or simply change the path to a directory containing your JSONLD files.

  3. (Re)start the Neurobagel node

    docker compose up -d
    
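The file swap in step 2 amounts to replacing the contents of the directory that LOCAL_GRAPH_DATA points to. A minimal sketch, using temporary directories as stand-ins for the real paths on your system:

```shell
# Stand-ins for real locations; replace with your actual paths.
SRC_DIR=$(mktemp -d)          # where the CLI wrote the regenerated .jsonld files
GRAPH_DATA_DIR=$(mktemp -d)   # the directory LOCAL_GRAPH_DATA points to

# Simulate two regenerated graph-ready files
touch "$SRC_DIR/dataset1.jsonld" "$SRC_DIR/dataset2.jsonld"

# Replace the old graph-ready files with the regenerated ones
rm -f "$GRAPH_DATA_DIR"/*.jsonld
cp "$SRC_DIR"/*.jsonld "$GRAPH_DATA_DIR"/

ls "$GRAPH_DATA_DIR"
```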

For production deployments, you must specify the recipe filename

To relaunch services for a production node deployment, you must provide the production Docker Compose recipe filename explicitly using the -f option:

docker compose -f docker-compose.prod.yml up -d

Here are some other common scenarios where you might need to update the data in your graph:

Following a change in my dataset

When using Neurobagel tools on a dataset that is still undergoing data collection, you may need to update the Neurobagel annotations and/or graph-ready data for the dataset when you want to add new subjects or measurements or to correct mistakes in prior data versions.

For any of the below types of changes, you will need to regenerate a graph-ready .jsonld file for the dataset which reflects the change.

If the phenotypic (tabular) data have changed

If new variables have been added to the dataset such that there are new columns in the phenotypic TSV you previously annotated using Neurobagel's annotation tool, you will need to:

  1. Generate an updated data dictionary by annotating the new variables in your TSV following the annotation workflow

  2. Generate a new graph-ready data file for the dataset by re-running the CLI on your updated TSV and data dictionary

If only the imaging data have changed

If the BIDS data for a dataset have changed without changes in the corresponding phenotypic TSV (e.g., if new modalities or scans have been acquired for a subject), you have two options:

  • If you still have access to the dataset's phenotypic JSONLD generated from the pheno command of the bagel-cli (step 1), you may choose to rerun only the bids CLI command on the updated BIDS directory. This will generate a new graph-ready data file with updated imaging metadata for the subjects.

  • Otherwise, rerun both CLI commands (pheno followed by bids) on your existing TSV and data dictionary and the updated BIDS directory.

When in doubt, rerun both CLI commands.

If only the subjects have changed

If subjects have been added to or removed from the dataset but the phenotypic TSV is otherwise unchanged (i.e., only new or removed rows, without changes to the available variables), you will need to:

  • Generate a new graph-ready data file for the dataset by re-running the CLI (pheno and bids steps) on your updated TSV and existing data dictionary

Following a change in the Neurobagel data model

As Neurobagel continues developing the data model, new tool releases may introduce breaking changes to the data model for subject-level information in a .jsonld graph data file. Breaking changes will be highlighted in the release notes.

If you have already created .jsonld files for a Neurobagel graph database but want to update your graph data to the latest Neurobagel data model following such a change, you can easily do so by rerunning the CLI on the existing data dictionaries and phenotypic TSVs for the dataset(s) in the graph. This will ensure that if you use the latest version of the Neurobagel CLI to process new datasets (i.e., generate new .jsonld files) for your database, the resulting data will not have conflicts with existing data in the graph.

Note that if upgrading to a newer version of the data model, you should regenerate the .jsonld files for all datasets in your existing graph.

Re-uploading a modified dataset

To allow easy (re-)uploading of the updated .jsonld files for your dataset(s) to a graph database, we recommend keeping a copy of them in a central directory on your research data fileserver dedicated to storing local Neurobagel JSONLD datasets. Then, simply follow the steps for uploading/updating a dataset in the graph database.

Updating your graph backend configuration

Updating existing database user permissions

If you want to change database access permissions (e.g., adding or removing access to a database) for an existing user in your GraphDB instance, you must do so manually.

Of note, in GraphDB, there is no straightforward REST API call to update a user's database access permissions without replacing the list of their existing database permissions ("grantedAuthorities") entirely.

Tip

You can verify a user's settings at any time with the following:

curl -u "admin:NewAdminPassword" http://localhost:7200/rest/security/users/DBUSER

Example: if user DBUSER was granted read/write access to database my_db1 with the following command (this command is run by default as part of graphdb_setup.sh):

curl -X PUT --header 'Content-Type: application/json' -d '
{"grantedAuthorities": ["WRITE_REPO_my_db1","READ_REPO_my_db1"]}' http://localhost:7200/rest/security/users/DBUSER -u "admin:NewAdminPassword"

To grant DBUSER read/write access to a second database my_db2 (while keeping the existing access to my_db1), you would rerun the above curl command with all permissions (existing and new) specified since the existing permissions list will be overwritten:

curl -X PUT --header 'Content-Type: application/json' -d '
{"grantedAuthorities": ["WRITE_REPO_my_db1","READ_REPO_my_db1", "WRITE_REPO_my_db2","READ_REPO_my_db2"]}' http://localhost:7200/rest/security/users/DBUSER -u "admin:NewAdminPassword"

Similarly, to revoke my_db1 access so DBUSER only has access to my_db2:

curl -X PUT --header 'Content-Type: application/json' -d '
{"grantedAuthorities": ["WRITE_REPO_my_db2","READ_REPO_my_db2"]}' http://localhost:7200/rest/security/users/DBUSER -u "admin:NewAdminPassword"
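Because GraphDB replaces the whole grantedAuthorities list on every PUT, it can help to construct the payload from the complete set of databases the user should end up with. A small sketch of this idea (the build_granted_authorities helper and the credentials are illustrative, not part of any Neurobagel tooling):

```shell
# Build the grantedAuthorities JSON payload for a list of databases,
# granting read/write access on each one.
build_granted_authorities() {
  local entries=""
  for db in "$@"; do
    entries="${entries:+$entries,}\"WRITE_REPO_${db}\",\"READ_REPO_${db}\""
  done
  printf '{"grantedAuthorities": [%s]}' "$entries"
}

# Example: payload granting read/write on both my_db1 and my_db2
payload=$(build_granted_authorities my_db1 my_db2)
echo "$payload"

# Then send the complete list in a single PUT
# (uncomment to run against a live GraphDB instance):
# curl -X PUT --header 'Content-Type: application/json' \
#   -d "$payload" http://localhost:7200/rest/security/users/DBUSER \
#   -u "admin:NewAdminPassword"
```

Listing every database the user should retain in one call avoids accidentally revoking existing permissions.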
Managing user permissions using the GraphDB Workbench

If you are managing multiple GraphDB databases, the web-based administration interface for a GraphDB instance, the Workbench, might be an easier way to manage user permissions than the REST API. More information is available in the official GraphDB Workbench documentation.

Resetting your GraphDB instance

Each Neurobagel node has its own GraphDB instance, which is used to store the graph data for the node. If you want to reset your graph database and start again from scratch, follow these steps:

  1. Ensure that your Neurobagel node is not running (i.e., shut down the Docker containers for the node).

    docker compose down
    
  2. Delete the Docker volume that contains the GraphDB data for your node.

    docker volume rm neurobagel_node_graphdb_home
    

    Replace neurobagel_node_graphdb_home with the name of the volume created for your node. It is usually named <project_name>_graphdb_home where <project_name> is the name of your Docker Compose stack as defined in COMPOSE_PROJECT_NAME in your .env file.

    docker volume ls lists all volumes on your system

    You can use the docker volume ls command to list all volumes on your system. This will help you identify the name of the volume that was created for your Neurobagel node.

  3. Launch your Neurobagel node again.

    docker compose up -d
    

    For production deployments, you must specify the recipe filename

    To relaunch services for a production node deployment, you must provide the production Docker Compose recipe filename explicitly using the -f option:

    docker compose -f docker-compose.prod.yml up -d
    
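When identifying the volume in step 2, docker volume ls also supports a name filter, which narrows the listing to likely candidates:

```shell
# List only volumes whose name contains "graphdb_home"
docker volume ls --filter name=graphdb_home
```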

Some examples of when you might want to do this:

  • You started but did not complete Neurobagel node setup previously and want to ensure you are using up-to-date instructions and recommended configuration options
  • Your local node has stopped working after a configuration change to your graph database (e.g., your Neurobagel node API no longer starts or responds with an error, but you have confirmed all environment variables you have set should be correct)
  • You need to modify credentials for your graph store

Warning

This action will wipe any graph databases and users you previously created!

Environment variables reference

Ensure that shell variables do not clash with .env file

If the shell you run docker compose from already has any shell variable of the same name set, the shell variable will take precedence over the configuration of .env! In this case, make sure to unset the local variable first.

For more information, see Docker's environment variable precedence.
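To check for such a clash before launching, you can test whether the variable is already set in your shell (NB_GRAPH_DB is used here as an arbitrary example; the same check applies to any variable from .env):

```shell
# If the variable is set in the shell, it will shadow the value in .env
if [ -n "${NB_GRAPH_DB+x}" ]; then
  echo "NB_GRAPH_DB is set in the shell and will override .env; unsetting it"
  unset NB_GRAPH_DB
fi
```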

Tip

Double check that any environment variables you have customized in .env are resolved with your expected values using the command docker compose config.

Below are all the possible Neurobagel environment variables that can be set in .env.

| Environment variable | Default needs change? | Description | Default value if not set | Used in these installation modes |
| --- | --- | --- | --- | --- |
| COMPOSE_PROJECT_NAME | No | (Used only by Docker Compose) Prefix for container names in the deployment; useful when running multiple deployments on the same machine. | neurobagel_node | Docker |
| NB_GRAPH_USERNAME | Yes | Username to set for the graph database user. | - | Docker, Python |
| NB_GRAPH_SECRETS_PATH | Yes | Path to files containing the secure passwords to set for the admin user (NB_GRAPH_ADMIN_PASSWORD.txt) and graph database user (NB_GRAPH_PASSWORD.txt). | ./secrets | Docker |
| NB_GRAPH_DB | Yes | Name to give your graph database (e.g., for a GraphDB database, use the format repositories/{database_name}) | repositories/my_db | Docker, Python |
| NB_GRAPH_MEMORY | No | The maximum amount of memory that can be used by the graph database. Equivalent to setting the -Xmx parameter on the JVM. Value should be a number followed directly by a letter denoting the size, e.g., 264m for 264 MB, 2g for 2 GB. (For more info, see https://graphdb.ontotext.com/documentation/10.8/requirements.html#hardware-sizing.) | 2g | Docker |
| LOCAL_GRAPH_DATA | Yes | Path on your filesystem to the JSONLD files you want to upload to the graph database | ./data | Docker |
| NB_GRAPH_PORT_HOST | No | Port number on the host machine to map the graph server container port to | 7200 | Docker |
| NB_NAPI_ALLOWED_ORIGINS | No | Origins allowed to make cross-origin resource sharing requests. Multiple origins must be separated with spaces in a single string enclosed in quotes. | "" | Docker, Python |
| NB_RETURN_AGG | No | Whether to return only aggregate, dataset-level query results (excluding subject/session-level attributes). One of [true, false] | true | Docker, Python |
| NB_MIN_CELL_SIZE | No | Minimum number of matching subjects required for a dataset to be returned as a query match. Datasets with matching subjects <= this number will be excluded from query results. | 0 | Docker, Python |
| NB_NAPI_TAG | No | Docker image tag for the Neurobagel node API | latest | Docker |
| NB_NAPI_PORT_HOST | No | Port number on the host machine to map the Neurobagel node API container port to | 8000 | Docker |
| NB_NAPI_BASE_PATH | No | (If using reverse proxy) The URL path where the node API is served from. Do not include a trailing slash. | "" | Docker |
| NB_NAPI_DOMAIN | Yes | (Production only) The domain name where the n-API will be hosted from | "" | Docker |
| NB_FAPI_TAG | No | Docker image tag for the Neurobagel federation API | latest | Docker |
| NB_FAPI_PORT_HOST | No | Port number on the host machine to map the Neurobagel federation API container port to | 8080 | Docker |
| NB_FEDERATE_REMOTE_PUBLIC_NODES | No | If "True", include public nodes in federation. If "False", only locally specified nodes in local_nb_nodes.json are queried. | true | Docker, Python |
| NB_FAPI_BASE_PATH | No | (If using reverse proxy) The URL path where the federation API is served from. Do not include a trailing slash. | "" | Docker |
| NB_FAPI_DOMAIN | Yes | (Production only) The domain name where the f-API will be hosted from | "" | Docker |
| NB_QUERY_TAG | No | Docker image tag for the query tool | latest | Docker |
| NB_QUERY_PORT_HOST | No | Port number used by the query_tool on the host machine | 3000 | Docker |
| NB_API_QUERY_URL | No | (Testing only) URL of the Neurobagel f-API that the query tool will send its requests to. The query tool sends requests from a user's machine, so ensure the API URL is provided as a user would access it from their own machine. See also the query tool README. In a production deployment this variable is ignored. | http://localhost:8080 | Docker |
| NB_QUERY_APP_BASE_PATH | No | (If using reverse proxy) The URL path for the query tool; determines the specific URL at which the app should be rendered for users to access it | / | Docker |
| NB_QUERY_DOMAIN | Yes | (Production only) The domain name where the query tool will be hosted from | "" | Docker |
| NB_CONFIG | No | (Experimental) Name of the Neurobagel community configuration to load, if the node uses a community's custom vocabularies. | Neurobagel | Docker, Python |
| NB_QUERY_HEADER_SCRIPT | No | (Experimental, for development environments only) Custom script to add to the header section of the query tool site, such as for a GDPR-aware analytics tool. | "" | Docker |
| NB_ENABLE_AUTH | No | (Experimental, for development environments only) Whether to enable authentication for cohort queries. One of [true, false] | false | Docker, Python |
| NB_QUERY_CLIENT_ID | No | (Experimental, for development environments only) OAuth client ID for the query tool. Required if NB_ENABLE_AUTH is set to true. | - | Docker, Python |
| COMPOSE_PROFILES | No | (Production only) Selects the production launch profile to use: "node" or "portal". See the documentation for details. | node | Docker |