BRC Analytics

Roadmap

BRC Analytics plans to deliver powerful new analysis functionality while providing access to some features of the VEuPathDb system. This will be an iterative process involving multiple steps in several areas.

Our plan

This set of tasks deals with transferring data from VEuPathDb infrastructure and creating a list of genomes that will be initially maintained within BRC Analytics. The number of taxa included in our system will

  1. Transferring databases and associated data from VEuPathDb servers to TACC infrastructure.
  2. Understanding the structure of the VEuPathDb databases and deciding which data will be ingested as custom tracks for the BRC Analytics instance of the UCSC Genome Browser.
  3. Uncovering how JBrowse instances were created within VEuPathDb gene pages and replicating this data in the BRC Analytics instance of the UCSC Genome Browser.

This set of tasks ensures that BRC Analytics provides access to the latest versions of all 785 VEuPathDb taxa and associated annotations.

  1. Build UCSC Genome Browsers for all 785 taxa by matching genomes found in VEuPathDb against NCBI Datasets.
  2. Add annotations that can be extracted from the VEuPathDb databases and JBrowse instances (see items 2 and 3 of the previous list).
  3. Create links from UCSC Gene Pages to NCBI Gene pages.
  4. Work with NCBI on evolving gene pages so that they contain information valuable for understanding the structure and function of genes from a wide range of eukaryotes, including eukaryotic pathogens and vectors.
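
The genome-matching step in item 1 above can be illustrated with a small sketch. It assumes, hypothetically, that each VEuPathDb genome record carries a name and an INSDC assembly accession (for example `GCA_000002765.3`) and that a list of accessions has already been retrieved from NCBI Datasets; the record fields and the function names are illustrative only and do not describe the project's actual pipeline.

```python
import re
from typing import Optional

# Assembly accessions have the form GCA_/GCF_ + 9 digits + "." + version.
ACCESSION_RE = re.compile(r"^(GC[AF]_\d{9})\.(\d+)$")


def split_accession(accession: str) -> Optional[tuple[str, int]]:
    """Split 'GCA_000002765.3' into ('GCA_000002765', 3); None if malformed."""
    m = ACCESSION_RE.match(accession.strip())
    if m is None:
        return None
    return m.group(1), int(m.group(2))


def match_genomes(veupath_records, ncbi_accessions):
    """Classify VEuPathDb records against a list of NCBI accessions.

    veupath_records: dicts with hypothetical keys 'name' and 'accession'.
    Returns a dict with three lists: 'matched', 'outdated', 'unmatched'.
    """
    # Highest version seen at NCBI for each versionless accession.
    latest = {}
    for acc in ncbi_accessions:
        parsed = split_accession(acc)
        if parsed is None:
            continue
        base, version = parsed
        latest[base] = max(version, latest.get(base, 0))

    result = {"matched": [], "outdated": [], "unmatched": []}
    for rec in veupath_records:
        parsed = split_accession(rec.get("accession") or "")
        if parsed is None or parsed[0] not in latest:
            result["unmatched"].append(rec["name"])
            continue
        base, version = parsed
        if version >= latest[base]:
            result["matched"].append(rec["name"])
        else:
            # NCBI holds a newer version of the same assembly.
            result["outdated"].append((rec["name"], f"{base}.{latest[base]}"))
    return result
```

Records that land in `unmatched` would need manual curation or a fallback match (for instance on organism name), which this sketch does not attempt.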

  1. Create a dedicated Galaxy instance at https://brc.usegalaxy.org
  2. Prepare reference data for all 785 genomes that will initially be served by the system. This includes creation of indices for bowtie, bowtie2, bwa-mem, bwa-mem2, hisat2, STAR, and SnpEff.
  3. Create a single sign-on so that user accounts are shared between BRC Analytics and Galaxy. When creating an account, the user should be able to specify the default Galaxy instance (.org, .eu, .org.au, .fr ...).
  4. Develop functionality for "focusing" Galaxy on a particular genome build selected by the user (see galaxy#18882, "Focusing" Galaxy on a particular genome).
  5. Develop a "simplified workflow" interface (see galaxy#18883, Simplified workflow run interface).
  6. Develop and polish workflows for variant discovery, epigenetics, transcriptomics, assembly, and protein folding that were previously available to users on the VEuPathDb Galaxy instance.
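
As an illustration of the reference-data step in item 2 above, the sketch below assembles the index-building command line for each aligner for a single genome. The directory layout, the function name `index_commands`, and the thread count are assumptions made for the example; the production setup may well use a different mechanism (for example Galaxy data managers). SnpEff is left out because building its database also requires an annotation file and a configuration entry.

```python
from pathlib import Path


def index_commands(fasta: str, out_dir: str, name: str, threads: int = 4) -> dict:
    """Return one index-building command (as an argument list) per tool.

    fasta:   path to the genome FASTA file
    out_dir: hypothetical root directory; each tool gets its own subdirectory
    name:    index prefix, e.g. a genome build identifier
    """
    out = Path(out_dir)

    def prefix(tool: str) -> str:
        return str(out / tool / name)

    return {
        "bowtie": ["bowtie-build", fasta, prefix("bowtie")],
        "bowtie2": ["bowtie2-build", "--threads", str(threads),
                    fasta, prefix("bowtie2")],
        "bwa-mem": ["bwa", "index", "-p", prefix("bwa-mem"), fasta],
        "bwa-mem2": ["bwa-mem2", "index", "-p", prefix("bwa-mem2"), fasta],
        "hisat2": ["hisat2-build", "-p", str(threads),
                   fasta, prefix("hisat2")],
        # STAR writes into a directory rather than using a file prefix.
        "STAR": ["STAR", "--runMode", "genomeGenerate",
                 "--runThreadN", str(threads),
                 "--genomeDir", str(out / "STAR"),
                 "--genomeFastaFiles", fasta],
    }
```

Each command could then be executed with `subprocess.run(cmd, check=True)` once the per-tool output directories exist. For small genomes, STAR typically also needs a reduced `--genomeSAindexNbases` value, which is not handled here.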

  1. Create dedicated LinkedIn, Mastodon, and Bluesky accounts.
  2. Create a dedicated support channel using Discourse infrastructure.
  3. Connect with as many VEuPathDb users as possible to solicit their feedback across multiple areas, including which features are needed, which genomes should be integrated, and which key datasets need to be re-analyzed and put in the context of available genomic data.
  4. Beginning on Oct 1st, post regular updates via social media channels.