ABACBS-2018 Program
Monday 26th November
8:00am: Registration desk opens for Student Symposium delegates
9:00am-5:00pm: COMBINE Student Symposium
Morning tea, lunch and afternoon tea provided.
2:00pm-5:30pm: BioIT & Research Computing Workshop.
Afternoon tea provided.
5:30pm: Registration desk opens
6:00pm-7:00pm: ABACBS Awards Ceremony
7:30pm: Invited speakers dinner
Tuesday 27th November
8:00am: Registration desk opens
9:00am: Welcome to Country & Conference Opening
9:15am-10:30am: Session 1 - Evolution
9:15am: Maitreya Dunham — International keynote speaker (45min)
Drivers of aneuploidy and adaptation in yeast
Whole chromosome aneuploidy and large segmental copy number variants (CNVs) have been implicated in a wide array of diseases in humans as well as proliferative defects in single cells. Yet, paradoxically, CNVs are routinely observed in laboratory evolution experiments with microbes and contribute to the aberrant overgrowth of cancer cells. The defects associated with aneuploidy are often attributed to the imbalance of many differentially expressed genes on the affected chromosomes, with many genes each contributing incremental effects. An alternate hypothesis is that a small number of individual genes are large effect ‘drivers’ of these fitness changes when present in an altered copy number. To test these two views, we have engineered a library of strains with ~2000 synthetic chromosome arm amplifications tiled across the genome and competed them in nutrient limitation, a condition known to select for aneuploidies. Comparing the fitness of strains with amplifications differing only by a few genes, we found that many of the fitness effects seen in nutrient limitation can be attributed to a small number of driver genes that disproportionately affect fitness. With this collection, we have expanded our focus to conditions with variable and even detrimental effects on aneuploids – temperature stress, extended stationary phase, and treatment with radicicol and benomyl. By comparing the fit quality of piecewise constant and linear models to our relative fitness data, we find that most chromosomes are better modeled as stepwise patterns in nutrient limitation and extended stationary phase, suggesting that individual genes may disproportionately drive fitness in these conditions. Using these models, we have identified breakpoint regions for further validation. Candidate driver genes will be validated in the context of both euploid and aneuploid cells to determine whether fitness effects of amplifying these genes are dependent on the presence of large amplifications.
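As a reader's illustration of the model comparison mentioned in this abstract, the sketch below fits a straight line and a single-step piecewise constant model to synthetic fitness values along a chromosome arm and compares them by BIC. It is a minimal sketch under stated assumptions, not the speaker's method: the data are simulated, the single-breakpoint restriction, the BIC criterion and all function names (`rss_linear`, `rss_one_step`, `bic`) are choices made for this example.

```python
import numpy as np


def rss_linear(x, y):
    """Residual sum of squares of an ordinary least-squares line."""
    slope, intercept = np.polyfit(x, y, 1)
    return float(np.sum((y - (slope * x + intercept)) ** 2))


def rss_one_step(x, y):
    """Best piecewise constant fit with exactly one breakpoint.

    Returns (rss, breakpoint); the breakpoint is the midpoint between
    the two neighbouring x values that the step separates.
    """
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    best_rss, best_bp = np.inf, None
    for i in range(1, len(xs)):
        left, right = ys[:i], ys[i:]
        rss = np.sum((left - left.mean()) ** 2) + np.sum((right - right.mean()) ** 2)
        if rss < best_rss:
            best_rss = float(rss)
            best_bp = float((xs[i - 1] + xs[i]) / 2)
    return best_rss, best_bp


def bic(rss, n, k):
    """Gaussian BIC up to an additive constant; lower is better."""
    return n * np.log(rss / n) + k * np.log(n)


# Synthetic data (assumption): amplification end points along one arm,
# with a fitness jump once a hypothetical driver gene at position 40 is included.
rng = np.random.default_rng(0)
position = np.linspace(0, 100, 50)
fitness = np.where(position < 40, 0.0, 0.1) + rng.normal(0, 0.01, position.size)

n = position.size
linear_bic = bic(rss_linear(position, fitness), n, k=2)   # slope + intercept
step_rss, breakpoint_pos = rss_one_step(position, fitness)
step_bic = bic(step_rss, n, k=3)                          # two levels + breakpoint

print("linear BIC:", linear_bic)
print("step BIC:", step_bic, "breakpoint near:", breakpoint_pos)
```

A lower BIC for the step model on data like these is what a "stepwise pattern" would look like; real fitness profiles may need several breakpoints and a different noise model.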
10:00am: Stephen Bent, Mutation signatures in the mitochondria of moth specimens preserved for up to 100 years (10min)
10:10am: Yi Jin Liew, Epigenetic adaptation of a coral to ocean acidification: non-standard genomics in a non-model organism (20min)
10:30am-11:00am: Morning tea
11:00am-1:00pm: Session 2 - Modelling
11:00am: Michael Stumpf — National Keynote speaker (30min)
Reconstructing Gene Regulatory Networks from Single Cell Data
Gene expression is controlled by networks of transcription factors and regulators, but the structure of these networks is as yet poorly understood and is thus inferred from data. Recent work has shown the efficacy of information theoretical approaches for network reconstruction from single cell transcriptomic data. Here I discuss methods for performing large scale hypothesis testing on putative network edges derived from information theory, bringing together empirical Bayes approaches and work on theoretical null distributions for information measures. Crucially, our approach also allows us to use available prior data. I discuss the application of these approaches using single cell data from mouse pluripotent stem cells and demonstrate that the accuracy of networks inferred from single cell data can sometimes be improved using priors from literature or population-level ChIP-Seq and qPCR data.
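To make the idea of testing putative network edges with an information measure concrete, the sketch below estimates mutual information between two genes from binned expression values and attaches a permutation p-value. This is only an illustration under stated assumptions: the abstract describes empirical Bayes and theoretical null distributions, whereas this example swaps in a simple permutation null, uses simulated expression values, and the function names (`mutual_information`, `permutation_pvalue`) are hypothetical.

```python
import numpy as np


def mutual_information(x, y, bins=8):
    """Plug-in mutual information (in nats) from a 2D histogram."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))


def permutation_pvalue(x, y, n_perm=200, bins=8, seed=0):
    """Observed MI and a permutation p-value for the edge x -- y."""
    rng = np.random.default_rng(seed)
    observed = mutual_information(x, y, bins)
    null = np.array(
        [mutual_information(x, rng.permutation(y), bins) for _ in range(n_perm)]
    )
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, float(p_value)


# Simulated expression for 500 cells (assumption): gene_b depends on gene_a,
# gene_c is independent of both.
rng = np.random.default_rng(1)
gene_a = rng.normal(size=500)
gene_b = gene_a + 0.5 * rng.normal(size=500)
gene_c = rng.normal(size=500)

mi_ab, p_ab = permutation_pvalue(gene_a, gene_b)
mi_ac, p_ac = permutation_pvalue(gene_a, gene_c)
print("a-b:", mi_ab, p_ab)
print("a-c:", mi_ac, p_ac)
```

With thousands of genes the number of candidate edges grows quadratically, which is why large scale multiple testing control of the kind the abstract discusses matters; the permutation approach here would be slow and is used only for clarity.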
11:30am: Belinda Phipson, Kidneys in a dish: examining the reproducibility of organoid differentiation using transcriptomics (20min)
11:50am: Yu Wan, A network approach for detecting horizontal co-transfer of antimicrobial resistance genes in bacteria (10min)
12:00pm: Kim-An Do — Special invited speaker (25min)
12:25pm: Alex Tokolyi, Plasmid classification and investigation through network analysis of co-occurring genes (5min)
12:30pm: Ralph Patrick, Decoding the identity and flux of cardiac cells in injury and homeostasis at single-cell resolution (20min)
12:50pm: Kirsti Paulsen, Optimising intrinsic protein disorder prediction for short linear motif discovery (5min)
12:55pm: Luke Zappia, Visualising trees to choose clusters for scRNA-seq data (5min)
1:00pm-2:00pm: Lunch
1:00pm-2:00pm: Poster session
1:15pm-1:45pm: Illumina presentation
2:00pm-4:00pm: Session 3 - Genomics
2:00pm: International keynote speaker: Leming Shi (45min)
Quality control and standardization of omics and bioinformatics for precision medicine
Realization of precision medicine depends on reliable and reproducible tools of genomics and bioinformatics for accurately characterizing patients at the genome scale. The MicroArray and Sequencing Quality Control (MAQC/SEQC) project is a community-wide effort to address concerns about the reliability of microarrays and next-generation sequencing. The first three phases of the MAQC/SEQC project focused on quality control and standardization of the generation and analysis of microarray and RNA-seq data, leading to the publication of three special issues by Nature Publishing Group. The ongoing MAQC-IV (SEQC2) project is developing standard operating procedures for DNA-seq, aiming to improve the reliability of sequence variant calls and to increase the predictive power of genomics-based models using large cohorts of cancer patients. The MAQC/SEQC project is expected to build the foundation for the development of new diagnostic techniques and methods and foster close collaboration among international communities of precision medicine. This decade-long effort has led to the formation of a new international society, the Massive Analysis and Quality Control (MAQC) Society (http://www.maqcsociety.org; Shi L et al., Nature Biotechnology, 2017), which is dedicated to quality control and analysis of massive data generated from high-throughput technologies for enhanced reproducibility.
The Chinese Quartet Project is a joint project between Fudan University, the National Institute of Metrology, the National Center for Clinical Laboratories, and the CFDA, aiming to establish a set of national reference materials and reference datasets for objectively evaluating the pros and cons of each step in clinical multi-omics studies. Specifically, we are generating reference materials at the levels of DNA, RNA, proteins, and metabolites simultaneously from the same set of immortalized cell lines from a “Chinese Quartet” including father, mother, and two monozygotic twin daughters. Qualitative and quantitative “genetic ground truths” are being established for the “Chinese Quartet” family. We are generating multi-omics datasets with multiple platforms, and the multi-omics integrative analysis will also improve the validity of the multi-omics reference datasets. We will share the reference materials and reference datasets with the community to help improve the quality of clinical multi-omics studies, which is essential for translating multi-omics technologies into clinical use.
2:45pm: Rebecca Poulos, Associations between mutational signatures and driver mutations in cancer reveal pathways toward cancer pathogenesis (20min)
3:05pm: COMBINE Prize Talk (20min)
3:25pm: Richard Edwards, Pseudodiploid pseudo-long-read whole genome sequencing and assembly of Pseudonaja textilis (eastern brown snake) and Notechis scutatus (mainland tiger snake) (20min)
3:45pm: Woo Jun Shim, Cell identity genes are predicted by absence of broad H3K27me3 domains (10min)
3:55pm: Taiyun Kim, Impact of similarity metrics on single-cell RNA-seq data clustering (5min)
4:00pm-4:30pm: Afternoon tea
4:30pm-6:00pm: Session 4 - Cancer
4:30pm: National Keynote Speaker: Ann-Maree Patch (30min)
Identifying intra-tumor heterogeneity and mechanisms of therapy resistance from cancer sample sequencing
Realizing the promise of precision medicine to treat cancer depends on i) correctly identifying patients who will benefit from a targeted therapy and ii) avoiding treatment resistance. Intra-tumor heterogeneity impairs our capacity to carry out these tasks; therefore, there is much interest in the detection and interpretation of sub-clonal events that affect patient outcomes. Although our ability to detect sub-populations of genetically, epigenetically or phenotypically distinct tumor cells is limited, recently developed methods and technologies are showing promise.
In this talk, I will present the methods we have been using to identify sub-clonal somatic variants, illustrated by examples across different cancer types: from ovarian cancer, sub-clonal reversion of DNA repair deficiency and upregulation of cellular defenses; in colorectal cancer, a comparison of clonal representation between primary and metastatic disease; and from mesothelioma, exploration of tumor cell heterogeneity in pleural effusion samples. I will also discuss how these findings could affect patients’ responses to therapy, and the challenges and limitations of these approaches, in particular relating to the annotation and interpretation of structural variants.
5:00pm: Joseph Cursons, Methylation-induced silencing of tumour suppressor genes in liver cancer (20min)
5:20pm: Vivian Yeung, Primary and Metastatic Tumour Evolution (10min)
5:30pm: Christoffer Flensburg, Calling Somatic Copy Number Alterations From RNA-Seq (10min)
5:40pm: Anna Trigos, Genomic drivers of the fragmentation of co-expression modules regulating multicellularity in cancer (10min)
5:50pm: Stefano Mangiola, Allowing differential tissue composition analyses with ARMET (5min)
5:55pm: Anna Quaglieri, Correcting unwanted variation in RNA sequencing data derived from a multi-centre study of leukemia (5min)
6:30pm-10:00pm: Conference Dinner, Level 6 tea room at the Walter and Eliza Hall Institute
Wednesday 28th November
8:00am: Registration desk opens
9:00am-10:30am: Session 5 - Bioinformatics Methods
9:00am: Chris Saunders — International Keynote Speaker (45min)
Improving sequence analysis to increase the clinical value of whole genome sequencing
To extract greater clinical value from whole genome sequencing (WGS) data, analysis methods must be extended to detect a greater variety of genomic variation, and to do so using methods which are computationally efficient and robust to assay variations such that they can be more reliably scaled. Towards this goal we have developed methodological improvements applicable to both rare/undiagnosed genetic disease and cancer, including highly accurate and performant small variant calling which can adaptively accommodate factors such as PCR amplification error and tumor contamination of a normal control. Additional improvements have been made to increase the accuracy of structural variant (SV) calls, reconstruct longer insertions, better integrate copy number and SV inferences, describe chromosomal aberrations such as mosaicism and uniparental disomy, and use sequence graphs to more accurately characterize clinically important repeat expansions. These and other methodological improvements are contributing to an ongoing increase in solved clinical cases, bolstering the prospect of WGS as a first line diagnostic.
9:45am: Amali Thrimawithana, Manuka genome and beyond - a case of hybrid technologies and methodologies to unravel the plant genome (20min)
10:05am: James Hogan, Fast Clustering of Very Large Sequence Collections (10min)
10:15am: Charity Law, A data-driven approach to characterising intron signal in RNA-seq data (10min)
10:25am: Momeneh Foroutan, Single sample scoring of molecular phenotypes (5min)
10:30am-11:30am: Morning tea
10:30am-11:30am: Poster session 2
11:30am-12:55pm: Session 6 - Imaging Data
11:30am: Nick Hamilton — National Keynote Speaker (30min)
Modelling, predicting and understanding kidney development using microscopy imaging
As attention turns to the functions and interactions of the tens of thousands of genes found in the genomics revolution, a second complex wave of data is arriving in the emerging field of high-throughput microscopy imaging of proteins in their cellular and organ contexts, often live in real time and in 3D. And while many areas of mathematical modelling of biological systems suffer from a sparsity of data, the new technologies provide extraordinarily data dense imaging that can be used as a foundation for robust quantification, modelling and prediction.
In recent work, my computational group, in collaboration with the developmental biology groups of Ian Smyth (Monash) and Melissa Little (MCRI), created a high-throughput large-scale pipeline for the multi-dimensional analysis and modelling of kidney imaging. Kidneys of equal size can vary 10-fold in the number of nephrons at birth. Discovering what regulates such variation has been hampered by a lack of quantitative analysis to define kidney development, and factors leading to the formation of the ureteric tree and nephrons are still poorly understood. Advances in microscopy such as Optical Projection Tomography and high resolution 3D confocal imaging enabled our collaborators to image large numbers of mouse kidneys at high resolution across multiple time points. This has created a massively dense dataset that visualises the multiple stages of kidney development. For example, in one data set we have some 32 ureteric trees of normal mouse kidneys imaged in 3D at 6 distinct stages of development, from which some 90,000 measurements have been extracted. In this talk I will outline the mathematical models we created towards making sense of these key structures of the kidney and how they develop. The aim is to be able to answer questions such as: is the ureteric branch formation stereotypic or is there a “random” element?; if there is a pattern, what is the nature of the pattern and what drives its formation?; and does patterning vary in mutants? In this presentation our analysis pipeline and algorithms will be described, as well as recent results we have obtained towards answering these questions in kidney patterning.
12:00pm: Thomas Boudier — Special Invited Speaker (25min)
12:25pm: Damien Hicks, Maps of variability in cell lineage trees (20min)
12:45pm: Clare Sloggett, Reduct: interactive visualisation of high-dimensional data (10min)
12:55pm-2:30pm: Lunch
1:30pm-2:30pm: ABACBS AGM
2:30pm-4:00pm: Session 7 - Statistical Bioinformatics
2:30pm: International Keynote Speaker: Nancy Zhang (45min)
Transfer Learning in Single Cell Transcriptomics
Cells are the basic biological units of multicellular organisms. The development of single-cell RNA sequencing (scRNA-seq) technologies has enabled us to study the diversity of cell types in tissue and to elucidate the roles of individual cell types in disease. Yet, scRNA-seq data are noisy and sparse, with only a small proportion of the transcripts that are present in each cell represented in the final data matrix. We propose a transfer learning framework to borrow information across related single cell data sets for de-noising and expression recovery. Our goal is to leverage the expanding resources of publicly available scRNA-seq data, for example, the Human Cell Atlas, which aims to be a comprehensive map of cell types in the human body. Our method is based on a Bayesian hierarchical model coupled to a deep autoencoder, the latter trained to extract transferable gene expression features across studies coming from different labs, generated by different technologies, and/or obtained from different species. Through this framework, we explore the limits of data sharing: How much can be learned across cell types, tissues, and species? How useful are data from other technologies and labs in improving the estimates from your own study? If time allows, I will also discuss the implications of technical batch artifacts in the joint analysis of multiple data sets, and propose strategies for alignment of data across batches.
3:15pm: Victoria Jackson, Uncovering the genetic basis of speech through sequencing of probands with childhood apraxia of speech (20min)
3:35pm: Yingxin Lin, scMerge: Integration of multiple single-cell transcriptomics datasets leveraging stable expression and pseudo-replication (10min)
3:45pm: Kevin Wang, RUV-Pro: Remove Unwanted Variation in prospective omics experiments (5min)
3:50pm: Regan Hayward, Epigenetic changes in Chlamydia-infected host cells (10min)
4:00pm-4:30pm: Afternoon tea
4:30pm-6:00pm: Session 8 - Genomics of Complex Diseases
4:30pm: Eleni Giannoulatou — National Invited Keynote (30min)
Improving the genetic diagnosis of congenital heart disease
Congenital heart disease (CHD) defines a large set of structural and functional deficits that arise during cardiac embryogenesis. CHD affects up to 1% of live births. However, in most cases a genetic diagnosis is not made. We have performed the first study to assess the outcomes of whole genome sequencing in a heterogeneous cohort of CHD patients. Ninety-seven families with probands born with CHD requiring surgical correction were recruited for whole genome sequencing. At minimum, a proband–parents trio was sequenced per family. By applying state-of-the-art bioinformatics pipelines, we identified variants with high clinical utility in 31% of our heterogeneous cohort. Currently, a large proportion of CHD cases cannot be attributed to a single DNA mutation and remain “unsolved”. We have been developing computational methods to increase the molecular diagnosis rate in CHD. Our novel bioinformatics approaches include: a customisable variant prioritisation tool, an integrative, scalable tool for the discovery of splice-altering variants, a visualisation tool for structural variation and a family-based variant caller built upon Google’s DeepVariant method. We also consider alternative models of the disease. Using burden testing we have identified alleles, genes and gene networks that harbour rare coding variants of moderate effect in cases versus ethnically matched controls. Our methods can be applied to other sequencing studies that aim to increase the diagnostic rate of genetic diseases as well as explore alternative genetic architectures.
5:00pm: David Humphreys, Ularcirc: Visualisation and enhanced analysis of circular RNAs via back and canonical forward splicing (20min)
5:20pm: Matthew Field, Comparison of predicted and actual consequences of missense mutations (10min)
5:30pm: Eleni-Maria Michanetzi, GidB: A Mutational Hotspot for Streptomycin Resistance in Tuberculosis (10min)
5:40pm: Ellis Patrick, Differential correlation across ranked samples for single-cell RNA-sequencing data (10min)
5:50pm: Conference Close (10min)
Thursday 29th November
9:00am-5:30pm: BioC Asia Workshop. Morning and afternoon tea provided
9:00am-12:30pm: Clinical Bioinformatics Symposium. Morning tea & lunch provided
9:00am-12:30pm: Docker and Singularity Workshop. Morning tea provided
2:00pm-5:30pm: Machine Learning Workshop. Afternoon tea provided
Friday 30th November
9:00am-5:30pm: BioC Asia Symposium. Morning and afternoon tea provided
9:00am-5:30pm: Best practices in Bioinformatics Software Development. Morning and afternoon tea provided