User guide

Parameters

To view command-line parameters type brooklyn_plot -h:

usage: brooklyn_plot [options]

Brooklyn (Gene co-expression and transcriptional bursting pattern recognition tool in single cell/nucleus RNA-sequencing data)

optional arguments:
  -h, --help  show this help message and exit
  --version   show program's version number and exit

Options:

  -h5,   --h5ad               input file in .h5ad format (accepts .h5ad)
  -ba,   --biomart            the reference gene annotations (in .csv format)
  -od,   --outDir             the directory of the outputs (Default: brooklyn-date-hh-mm-ss)
  -of,   --outFile            the name of summarized brooklyn file as CSV file and a brooklyn plot in PDF (Default: brooklyn)
  -ql,   --query              the list of genes to be queried upon (one gene per line and in .csv format)
  -sl,   --subject            the list of genes to be compared with (one gene per line and in .csv format)
  -cpu,  --threads            the number of processors to use for trimming, qc, and alignment (Default: 1)
  

CLI - Example usage

Example command usage:

brooklyn_plot -hd <input h5ad file> -ba <biomart_annotations.csv> -od <output_brooklyn> -of <brooklyn_plt>  -ql <queryGeneList.csv> -sl <GeneSearchSpace.csv> -cpu 12  
example: brooklyn_plot -h5 subset_AD_AT8_C3_Excit_EX-1-2-7.h5ad -ba AD_Exci_biomart_AT8_Ex-1-2-7.csv -of brooklyn_pkg -ql genelist_AT8_Ex-1-2-7_head.csv -sl againstlist_AT8_Ex-1-2-7_head.csv -cpu 10

Output command line:

Entering parallel mode with 10 CPU's.

With chunk size of 1, 9 chunks are created

The brooklyn_arch execution is completed in -8.1604 second(s)

The summary is completed in 0.1124 second(s)

The path to ourput directory: test_brooklyn/brooklyn_2023-03-31_16-50-23

The analysis completed in 10.1727 second(s)

Test

The test case illustrates the usage of brooklyn_plot with the cardiac cells - dataset

Download the required files from Source Forge, DCM_data

You can download to your working directory as shown below:
wget -O subset_seidman_TTN.h5ad "https://sourceforge.net/projects/brooklyn/files/data/subset_seidman_TTN.h5ad/download"
wget -O genelist.csv "https://sourceforge.net/projects/brooklyn/files/data/genelist.csv/download"
wget -O againstlist.csv "https://sourceforge.net/projects/brooklyn/files/data/againstlist.csv/download"
wget -O seidmanttn_var_biomart.csv "https://sourceforge.net/projects/brooklyn/files/data/seidmanttn_var_biomart.csv/download"

Run basic brooklyn_plot command:

brooklyn_plot -h5 subset_seidman_TTN.h5ad  -ba seidmanttn_var_biomart.csv -od results_ttn -ql genelist.csv -sl againstlist.csv -cpu 10

Entering parallel mode with 10 CPU's.

With chunk size of 35, 10 chunks are created

The brooklyn_arch execution is completed in -7923.1266 second(s)

The summary is completed in 4.2357 second(s)

The path to ourput directory: results_ttn/brooklyn_2023-04-05_14-25-57

The analysis completed in 7932.9977 second(s)

The output folder generated here is uploaded as zip file, you can download the same from here.

How to build required input files, such as gene list, h5ad and biomart annotations?

The .h5ad file can be obtained from a singel cell or single nuclei RNA sequencing datasets for example, cellxgene-collections. The users are recommended to select a single cell type corresponding to diagnosis/disease/normal state of interest. A detailed example of how we selected cell types from DCM/ACM heart cell atlas: Cardiomyocytes and other annotations such as biomart, gene lists are described here Tutorial-notebook—TTN.

Need assistance/have issues:

Please report any issues/concerns/suggestions here. Click create new issue and in Title: “Please describe the error you think is obvious and will be general for the scientific community to recognize”, and Comment: “Give us the maximum information possible regarding the error that you can see on the standard output/terminal”.

Resources

The h5ad AnnData was downloaded from cellxgene, Reichart et al. (2022) Science