Seamless sharing of childhood cancer data and analysis between researchers across international borders

Friday 14 February 2020

The development of personalised treatments that target rare paediatric cancer subtypes can be enhanced through global collaboration. Comparing an Australian patient's tumour to a larger group of other tumours allows insights that lead to better outcomes. But geography and rules to protect personal data in different jurisdictions can make the sharing and comparing of essential data difficult or even impossible.

In an effort to fix this, the Australian BioCommons is part of an international collaboration that came together in Sydney this month. Members of the partnership between the Australian BioCommons, BioPlatforms Australia, ARDC, Children’s Cancer Institute, D3b and Seven Bridges have been working to provide an integrated bioinformatics research platform with compute, storage, and file metadata tagging all in one place.

This multinational project is establishing internationally federated computational infrastructure to enable the harmonisation of paediatric cancer genomic data from Australia’s ZERO Childhood Cancer Program and the Gabriella Miller Kids First Data Resource Center in the United States.

Australian initiatives such as Zero Childhood Cancer will leverage the benefits provided by Cavatica, through its expansion to AWS Sydney. Cavatica is a cloud-based platform for collaboratively accessing, sharing, and analysing cancer data.

Cavatica was launched in 2016 as a partnership between the Center for Data Driven Discovery in Biomedicine (D3b) at the Children’s Hospital of Philadelphia (CHOP), Seven Bridges, the Children’s Brain Tumor Tissue Consortium (CBTTC) and the Pacific Pediatric Neuro-Oncology Consortium (PNOC). Since then, it has expanded to being a collaborative platform for a number of initiatives, including the Gabriella Miller Kids First Data Resource Center (KFDRC).

AWS compute is leveraged to allow high-throughput analysis, while workflows are written in Common Workflow Language (CWL) with Docker to maximise portability and reusability. Additionally, via Cavatica’s Data Cruncher, analyses using various open-source R and Python packages can be shared through Jupyter Notebook. The KFDRC has used this platform to harmonise and process over 15,000 whole genome, whole exome, and RNA-seq samples, including alignment, somatic variant calling, copy number calling, structural variant detection, and analysis of RNA expression and fusions.

During the visit, key members of the D3b team provided training in using Cavatica. Dr Allison Heath, Director of Data Technology and Innovation at D3b, kindly delivered an overview of Cavatica's features for the Australian BioCommons webinar series while she was in our time zone!

The intensive days together brought the platform closer to being ready for use by Australian researchers in the coming months. Stay tuned as Cavatica will soon be enabling seamless sharing of data and analysis methods between researchers in Australia and the United States.

See also: https://ardc.edu.au/news/developing-personalised-treatment-for-kids-with-cancer/

Watch the webinar: Cavatica - the cloud-based platform for collaboratively accessing, sharing, and analysing cancer data