ABACBS National Seminar Series

The ABACBS national seminar series aims to highlight the work of bioinformaticians across the spectrum of career stages, located in both urban and regional universities.

The seminars are held from 12-1pm Eastern Time on the third Tuesday of each month, via Zoom. Each seminar features two speakers, with each talk running for approximately 25 minutes, followed by 5 minutes of Q&A time.

Free registration to attend is required via https://abacbs.org/seminarzoom

The ABACBS National seminar series is organised by: Matt Field, Nicola Armstrong, A.J. Sethi, Aaron Darling and Tingting Gong.

2021 seminar program

Tuesday November 16 2021, 12pm-1pm AEST

Speaker: Dr. Heejung Shim

Title: NanoSplicer: Accurate identification of splice junctions from Oxford Nanopore sequencing

Link to recording

+ Abstract

Nanopore sequencing by Oxford Nanopore Technologies is a long-read sequencing method that has considerable advantages for characterising mRNA isoforms. It works by recording changes in electrical current when a DNA or RNA molecule traverses through a pore. However, basecalling of this raw signal (known as a squiggle) is error prone, making it challenging to accurately identify splice junctions. Existing strategies include using matched short-read data and/or annotated splice junctions to correct splice junctions from mapped nanopore reads, but add expense or limit junctions to known (incomplete) annotations. Therefore, a method that could accurately identify splice junctions solely from nanopore data would have numerous advantages.

In this talk, I will present a method, NanoSplicer, that exploits the information in raw nanopore signal squiggles to improve splice junction identification. The key idea is to identify, for each splice junction, which of the squiggles predicted from potential splice junction sequences best matches the observed junction squiggle. This enables NanoSplicer to identify splice junctions solely from the nanopore data and makes its performance independent of other reads or read depth, giving it the potential to better identify rare splice junctions. Using both synthetic and biological data, we demonstrate that NanoSplicer improves splice junction identification, especially when the basecalling error rate near the splice junction is elevated.

This is joint work with Yupei You and Michael Clark at the University of Melbourne.

+ About the speaker

Dr. Shim is a Group Leader at Melbourne Integrative Genomics (MIG) and a Lecturer (equivalent to Assistant Professor in the US) in the School of Mathematics and Statistics at the University of Melbourne. She completed her BS in Mathematics (with a double major in Computer Science and Engineering) at POSTECH and her PhD in Statistics at the University of Wisconsin at Madison, advised by Dr. Bret Larget. Dr. Shim then completed a postdoc at the University of Chicago, working with Dr. Matthew Stephens. Prior to her current position, she was a tenure-track Assistant Professor in the Department of Statistics at Purdue University for two years, and she retains an affiliation with Purdue as an Adjunct Assistant Professor.

More information

Speaker: Professor Tony Papenfuss

Title: Unravelling genomic instability in cancer

Link to recording

+ Abstract

Structural variants (SVs) are large-scale polymorphisms or mutations in DNA (typically >50nt by convention). There are different kinds of SVs, including translocations, duplications, inversions, deletions, and insertions (of many different types of sequence). SVs can also co-occur in complex or catastrophic events. Copy number changes can also be considered SVs, and focal copy number variants are necessarily associated with breakpoints and rearrangements.

Analysis of SVs is an important part of whole genome sequencing analysis in healthy organisms and disease, especially in cancer. SVs are the key driver mutations in some cancers. They can tell us about the evolutionary history of a tumour and are frequently also the scars of mutational processes that can be diagnostic. Yet, there are many aspects of their analysis that remain challenging.

In this talk, I’ll discuss how we identify SVs using short-read sequencing data in cancer, covering progress by us and others over the last few years, some of the major ongoing challenges, and where things might be going. Finally, I’ll talk about some approaches to making sense of the output of SV callers.

+ About the speaker

My research spans computational cancer biology and bioinformatics methods development. My team develops novel mathematical, statistical and computational methods and applies these to make sense of cancer "omics" data. We particularly focus on rare cancers, melanoma, prostate cancer and myeloma, but also work on some other diseases.

More information

Past sessions:

Tuesday October 19 2021, 12pm-1pm AEST

Speaker: A/Prof Kim-Anh Lê Cao

Title: Using community-wide data to address (some) challenges in single cell

Link to recording

+ Abstract

Cell identity classification is an ongoing challenge for the analysis of single cell RNA-seq (scRNA-seq) data. One solution is to use bulk transcriptome atlases as references, as these often contain detailed phenotype data.

We propose a new computational and statistical framework, Sincast*, to project and query scRNA-seq data to bulk RNA-seq data. We solve structural discrepancies between bulk and single cell data by either aggregating or imputing single cells and discuss the most beneficial approach depending on the data context. Sincast can also be used to reveal intermediate single cell states when projected against bulk data.

I will also briefly discuss the benefit of using hackathon data to advance methods development in multi-omics single cell.

*joint work with Yidi Deng and Jarny Choi.

+ About the speaker

A/Prof Kim-Anh Lê Cao develops computational methods, software and tools to interpret big biological data and answer research questions efficiently. Kim-Anh has a mathematical engineering background and graduated with a PhD in statistics from the Université de Toulouse, France. She then moved to Australia to forge her own non-linear career path, first working as a biostatistician consultant at QFAB Bioinformatics, then as a research group leader at the biomedical University of Queensland Diamantina Institute.

She currently continues her strong research focus at the University of Melbourne. Kim-Anh has secured two consecutive NHMRC fellowships since 2014. In 2019 she received the Australian Academy of Science's Moran Medal for her contributions to Applied Statistics. She was selected for the international HomewardBound leadership program for women in STEMM, culminating in a trip to Antarctica in 2019, and for the Superstars of STEM program from Science & Technology Australia.

More information

Speaker: Professor Gordon Smyth

Title: P-values, false discovery rates and fold-change cutoffs

Link to recording

+ Abstract

This talk will discuss some issues relating to p-values, multiple testing and effect sizes: how 0.05 became the commonly accepted p-value cutoff; how the false discovery rate has a Bayesian interpretation and how it relates to p-values; why Benjamini-Hochberg FDR control breaks down if the genes are reordered; why this has implications for fold-change cutoffs in gene expression analyses; and why fold-change cutoffs should not be needed in an empirical Bayes testing framework.
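As background for readers new to the Benjamini-Hochberg procedure named in the abstract, the following is a minimal sketch of how BH-adjusted p-values are usually computed. It is not taken from the talk; the function name `benjamini_hochberg` and the example p-values are illustrative assumptions.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original input order.

    Illustrative helper (hypothetical name), not code from the talk.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to the smallest so that the
    # adjusted values stay monotone in the rank of the raw p-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values for four hypothetical genes.
example_p = [0.01, 0.04, 0.03, 0.005]
example_adjusted = benjamini_hochberg(example_p)
# Genes called significant when controlling the FDR at 5%.
example_calls = [q <= 0.05 for q in example_adjusted]
```

Each adjusted value depends on the rank of its p-value among all tests, so the result for one gene can change when the set of tested genes changes.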

+ About the speaker

Professor Gordon Smyth is joint Head of WEHI's Bioinformatics Division and has published over 300 refereed articles. He is well known for creating statistical methods for analysing and interpreting genomic data, especially data arising from designed biomedical experiments. His lab has developed some widely used software packages for differential analyses of genomic data including limma, edgeR, csaw, diffHic and Rsubread. Together with collaborators, he works to increase our understanding of cancer biology and immunological diseases.

More information

Tuesday September 21 2021, 12pm-1pm AEST

Speaker: Dr. Rebecca Poulos

Title: Generating reproducible and large-scale proteomic data

Link to recording

+ Abstract

Proteomic data can reveal novel associations between genotype and phenotype, beyond what is apparent from genomics or transcriptomics alone. However, generating reproducible and large-scale proteomic data is made more challenging by the need for such data to be obtained over an extended period of time and across multiple instruments. Here I will describe a novel computational pipeline (ProNorM) that was designed to mitigate unwanted variation and reduce missing values in large-scale proteomic datasets. We demonstrate how to improve the quantitative analysis of large-scale DIA-MS data, providing a pathway toward clinical proteomics.

+ About the speaker

Dr Rebecca Poulos is an NHMRC Early Career Fellow, Senior Research Officer in ProCan at the Children’s Medical Research Institute, and Conjoint Lecturer at the University of Sydney. She has expertise in cancer genomics, proteomics and data science. Dr Poulos is using proteogenomics to understand cancer biology, predict drug response and develop methods for large-scale multi-omic data analysis.

More information

Speaker: Associate Professor Melissa Davis

Title: Differential co-expression derived gene networks in cancer

Link to recording

+ Abstract

The networks that control gene regulation are critical to the normal function of cells, and are frequently disrupted in disease states. Co-expression analysis is often used to reconstruct these gene regulatory networks, and a variety of methods have been proposed for this purpose. The analysis of differential networks poses a different problem – rather than seeking to reconstruct the regulatory network of a particular biological state, the differential analysis problem seeks to identify regulatory relationships that are different between two states. I will describe our work establishing a simulation framework for generating synthetic data under conditions of conditional regulation, our benchmarking study of differential co-expression methods, our findings regarding the interpretation of derived networks, and our application of the best approach to the differential analysis of breast cancer data.

+ About the speaker

The Davis Laboratory studies the regulatory networks that control the behaviour of cells in normal and cancerous tissues. We approach these questions with a suite of computational techniques that include classical bioinformatics methods, knowledge-based modelling, machine learning, and network analysis.

Our current focus is understanding the signalling and regulatory networks that underpin epithelial-mesenchymal transitions in breast cancer. Our work in this space is part of the EMPathy Breast Cancer Network, and is funded by the National Breast Cancer Foundation.

More information

Tuesday August 17 2021, 12pm-1pm AEST

Speaker: Dr. Anne Trinh

Title: Comprehensive cancer ‘omics profiling in N=1 studies

Link to recording

+ Abstract

Despite the challenges associated with low sample sizes, in-depth genomic, transcriptomic and microenvironmental profiling of a single tumor has the potential to reveal (i) biological insights on its evolutionary trajectory and (ii) vulnerabilities that can be therapeutically targeted. In this talk, I will showcase comprehensive profiling of early to invasive breast cancer, focusing on how the tumor and immune system could shape each other and patient outcome.

+ About the speaker

Dr Anne Trinh is a senior research fellow in the Kinghorn Centre for Clinical Genomics Computational Biology group.

More information

Speaker: Associate Professor Eleni Giannoulatou

Title: Prioritising disease candidate genes using powerSFS

Link to recording

+ Abstract

Elucidating the genetics of human disease is essential to improve disease screening, diagnosis, prognosis and therapy. However, it is often complex and closely tied to technological and methodological advances in genomics. We apply and develop bioinformatics methodology to analyse whole exome or genome sequencing data to identify disease-causing mutations and increase the current diagnostic rate of rare genetic diseases.

In this talk, I will focus on our recent work on the development of powerSFS, a gene intolerance score, that can be used to prioritise disease candidate genes. powerSFS differs from previous scores by modelling the signature that purifying selection leaves in the site frequency spectrum of a gene. I will highlight the advantages such an approach gives and how this can be used in a genomics pipeline to improve variant prioritisation.
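As background only, a site frequency spectrum (SFS) tallies how many variant sites carry the alternate allele on exactly k of the sampled chromosomes. Purifying selection tends to keep damaging variants rare, which shifts the spectrum towards low counts. The sketch below shows just the tallying step. It does not reproduce powerSFS, and the function name and the example counts are hypothetical.

```python
from collections import Counter


def site_frequency_spectrum(allele_counts, n_chromosomes):
    """Tally variant sites by alternate-allele count k, for k = 1 .. n_chromosomes - 1.

    Illustrative helper (hypothetical name); sites where the allele is
    absent (k = 0) or fixed (k = n_chromosomes) are not polymorphic and
    are skipped.
    """
    counts = Counter(k for k in allele_counts if 0 < k < n_chromosomes)
    return [counts.get(k, 0) for k in range(1, n_chromosomes)]


# Made-up alternate-allele counts at eight sites in one gene,
# observed in a sample of six chromosomes.
example_counts = [1, 1, 2, 5, 1, 3, 0, 6]
example_sfs = site_frequency_spectrum(example_counts, 6)
```

In this made-up example, three of the six polymorphic sites are singletons, so the spectrum is weighted towards rare variants.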

+ About the speaker

Eleni Giannoulatou graduated with a Masters of Computer Engineering and Informatics from the University of Patras, Greece in 2004. She next received her Master of Philosophy in Computational Biology from the University of Cambridge, UK and her Doctor of Philosophy in Bioinformatics from the University of Oxford, UK in 2011. She undertook postdoctoral work at the Wellcome Trust Centre for Human Genetics in Oxford as part of the Wellcome Trust Case Control Consortium and the Weatherall Institute of Molecular Medicine in Oxford.

A/Prof Giannoulatou joined the Victor Chang Cardiac Research Institute in 2013 as a member of the Bioinformatics and Systems Medicine Laboratory and in 2016 she started an independent research group.

A/Prof Giannoulatou’s research focuses on the development and application of statistical methods to answer genetic questions using high-throughput genomics data. Using the latest next-generation sequencing technologies, her team develops quantitative approaches to identify disease-causing DNA mutations and increase the current genetic diagnostic rate of cardiovascular disease.

Tuesday July 20 2021, 12pm-1pm AEST

Speaker: Dr. Timo Lassmann

Title: Interpretable machine learning for rare disease diagnostics

Link to recording

+ Abstract

In this talk I will outline our approaches to identify 'the' causative variant in rare disease patients. We utilize, or better test the utility of, public omics data sets (including single cell) in this context. Our models derive human interpretable rules and thereby meet the requirements of the 2018 European Union’s General Data Protection Regulation (GDPR) laws, specifically the “right to explanation”, whereby an individual has the right to request an explanation for the output of an algorithm.

+ About the speaker

Timo Lassmann completed his PhD in functional genomics at the Karolinska Institute, Sweden, then moved to Japan, where he worked on the Functional Annotation of the Mammalian Genome (FANTOM) and ENCODE projects. He currently leads the precision health program, the rare disease program and the computational biology team at the Telethon Kids Institute. He has published about 130 papers. His current focus is on employing and developing modern machine learning/AI techniques to utilize the wealth of public omics data in clinical and translational contexts.

Speaker: Dr. Michael Charleston

Title: Disease and despair in the quest for coevolutionary dynamics of pathogens & parasites

Link to recording

+ Abstract

The idea that parasites and pathogens co-speciate with their host species is appealing, powerful, and mostly wrong.

Since Fahrenholz' (1913) "rule" -- often quoted as "parasite phylogeny mirrors host phylogeny" -- there have been many attempts to find evidence of cospeciation, where one group of host species drives the speciation events of another group. Finding episodes of cospeciation enables us to co-locate speciation events in time, to identify length of association in evolutionary arms races, to estimate relative evolutionary rates, and to detect host switching events.

This attractive idea was based on some beautiful, and as it turns out, atypical, examples of phylogenetic agreement between hairy hosts and chewing lice infecting them.

However, it hasn't stopped us trying, and this talk will describe some of the hilarious escapades and encounters I have had so far in this perilous, complex, and ultimately underwhelming odyssey: from the thrill of codivergence in the jungle to the ABCs of host switching, turmoil in tanglegrams, and whispers in the dark.

There might even be science!

+ About the speaker

Michael Charleston completed his PhD in phylogenetics in 1994 at Massey University in New Zealand. Since then he has worked at UT Austin, University of Glasgow, Oxford University and University of Sydney, and has been at the University of Tasmania since 2015. This journey has taken him from a mathematics department to three departments of zoology (evolutionary biology), one school of information technologies (computer science and software), and now a discipline of mathematics. He has published about 100 papers across a broad range of subjects; his favourites are those on cophylogenetics, epidemiology, modelling, and the ones with the most citations. He's currently trying to learn more about machine learning. Michael's outside interests include hiking, music, roleplaying and board games, table tennis and general geekdom.

Tuesday June 15 2021, 12pm-1pm AEST

Speaker: Dr. Jimmy Breen

Title: Bioinformatics Advocacy through ABACBS

Link to recording

+ Abstract

Bioinformaticians and computational biologists are becoming increasingly important to the way modern biological research is conducted, yet their contributions can often be overlooked and sometimes ignored. In this talk, I will give a brief overview of ABACBS, outlining where we have come from and how you can get involved.

About the speaker: Jimmy Breen is the head of the bioinformatics hub at the Robinson Research Institute and the current president of ABACBS.

More information

Speaker: Dr. Min Zhao

Title: Multi-omics integration in cancers

Link to recording

+ Abstract

Genomic-based personalized medicine is at the forefront of innovation in cancer therapy. With the development of technologies to rapidly sequence DNA/RNA from tumor cells, an exponential increase in the amount of data lends personalized medicine to big data or “informatics” approaches. One of the key engineering challenges for enabling personalized medicine is to develop rapid analysis and effective diagnosis tools so that a variety of genetic variations can be quickly screened and proper treatments can be promptly applied. In my talk, I will briefly introduce the data integration challenge in cancer genomics and cover the following topics: 1) Moore's Law and DNA sequencing technology; 2) how cancer genomes differ from normal genomes, and general features of cancer genomic data; 3) a multi-dimensional data mining approach to identify driver genetic events using public cancer genomics data.

About the speaker: Dr. Min Zhao is a Senior Research Fellow focusing on bioinformatics and genomics at the University of the Sunshine Coast.

More information

Tuesday April 20 2021, 12pm-1pm AEST

Speaker: Professor Alicia Oshlack

Title: New methods for analysing cancer transcriptomes

Link to recording

+ Abstract

Cancer is a disease of the genome which arises from an accumulation of mutations at a range of scales from single nucleotides to chromosomal rearrangements. The functional consequences of mutations can be transcribed into RNA and detected through transcriptome sequencing. We use transcriptome sequencing to discover the causes and consequences of cancer in a variety of contexts.

We can use gene expression profiles to classify tumours into prognostic groups to inform clinical treatments. Events that alter the function of genes by driving novel transcript structures can also be detected using RNA sequencing and we have been working on methods and approaches for this with traditional RNA-seq. In addition, we are working on the analysis of the transcriptome with long read sequencing to give deeper insights into cancer specific transcripts such as fusion genes.

About the speaker: Prof. Oshlack is Co-Head of the Computational Biology Program at the Peter MacCallum Cancer Centre in Melbourne, Victoria, Australia.
More information

Speaker: A/Prof Matt Ritchie

Title: Tools for analysing nanopore long-read data

Link to recording

+ Abstract

The application of Oxford Nanopore Technologies long-read sequencing in genomics research continues to expand. In this presentation I will highlight a number of tools we have developed to keep track of the myriad of analysis tools that are becoming available for long-read data (https://long-read-tools.org) along with software for visualising DNA methylation signal (NanoMethViz) and performing isoform analysis (FLAMES) in both bulk and single-cell transcriptomics data.

About the speaker: A/Prof Ritchie is a laboratory head in the Epigenetics and Development Division at the Walter and Eliza Hall Institute of Medical Research.
More information

Tuesday May 18 2021, 12pm-1pm AEST

Speaker: A/Prof Nicola Armstrong

Title: Investigating epigenetic clocks

Link to recording

Abstract: An epigenetic clock uses methylation levels to estimate an individual’s age. There is currently a lot of interest in these clocks, including the potential to use them to estimate age in forensics. In this presentation, I will give an overview of how these clocks are developed, introduce several well-known epigenetic clocks and discuss how they perform in practice.

About the speaker: A/Prof Armstrong is discipline leader in maths and statistics at Murdoch University.

More information

Speaker: Ramil Mauleon (presenter) and Graham J. King

Title: Harvesting and Harnessing Information for Plant and Crop Science

Link to recording

Abstract: In order to feed the projected 9.7 billion people by 2050, there is a need to diversify the range of crops in the world’s food basket, as well as intensify production, concurrent with improving resilience to extreme environments and the nutritional quality of selected crops comprising 90% of the global food production. At Southern Cross Plant Science, Southern Cross University, we utilize ‘omics technologies to characterize the genomes, proteomes and metabolomes of several global niche crops for the discovery of candidate genes underpinning important crop traits affecting production and end-use quality. We also develop bioinformatics tools and platforms that allow us and others to add value to these datasets beyond the intent of the original studies through development of controlled vocabularies and data integration, ultimately leading to better-informed crop breeding design. In this talk I shall present some of the ‘omics work and bioinformatics tools for genome signal analyses, knowledge representation, and steps towards an integrated ‘omics platform.

About the speaker: Dr. Ramil Mauleon is a Senior Research Fellow at Southern Cross University. Prof. Graham King is Professor of Recombination & Crop Genomics at Southern Cross Plant Science, Southern Cross University.

More information on Ramil Mauleon
More information on Graham King

Tuesday March 16 2021, 12pm-1pm AEDT

Speaker: Dr Daniel MacArthur
Title: Filling the gaps: how can we ensure that genomic medicine in Australia is equitable?

Link to recording

+ Abstract

Rapid technological change in genomics has resulted in a historically unprecedented transformation of biology, and is increasingly changing the practice of medicine. However, there are ominous signs that the current promising trajectory of genomic medicine will not provide benefit to everyone equally. In addition to creating potentially expensive diagnostics and therapies that may exacerbate existing inequities in medical practice, genomic medicine relies fundamentally on the existence of genomic reference data to improve variant interpretation, and existing reference data is currently very deep for European populations and shallow or non-existent for many other groups. These discrepancies are particularly marked in Australia, where the lack of local large-scale population genomics efforts means that Aboriginal and Torres Strait Islander communities, as well as many of the largest non-European immigrant communities, are currently effectively missing from available public reference databases. This in turn means that rare disease patients from these communities are less likely to be able to obtain an accurate diagnosis for their condition.

In this talk I'll discuss what's needed to address this imbalance and ensure an equitable future for genomic medicine in Australia.

About the speaker: Dr Daniel MacArthur is the Director of the Centre for Population Genomics.
More information

Speaker: A/Prof Mark Cowley
Title: Translating data into clinical impact, for children with high-risk cancers

Link to recording

+ Abstract

Childhood cancer remains the leading cause of disease-related death in children in Australia, with ~1000 new diagnoses per annum. Precision medicine offers a promising opportunity to substantially improve patient outcomes, through the molecular characterisation of inherited and somatically acquired genetic changes or perturbed epigenetic and transcriptional phenotypes. This information can guide treatment recommendations, change patient diagnosis, and inform family cancer risk.

In Australia, the Children’s Cancer Institute and the Kid’s Cancer Centre are leading the national ZERO Childhood Cancer program, which has recruited >450 patients with high-risk or rare cancers since 2015. ZERO, supported by the Lions Kids Cancer Genome Project (2015-2020), uses whole genome (WGS) and transcriptome sequencing (RNA-Seq), and methylome profiling in real-time to provide targeted treatment recommendations for patients. The interim findings from this research program were published in late 2020 (PMID: 33020650).

Analysing a tumour with multiple omic profiling platforms provides the most comprehensive insights into its molecular state. These data include germline and somatic single nucleotide variants (SNV), indel, copy number and structural variants (CNV, SV), gene expression levels, isoform usage, fusions, pathways, methylation, and mutational signatures. Interpreting this information from a single patient forms a critical component of precision medicine but requires significant expertise and is highly labour intensive. We have developed a suite of computational approaches to support the delivery of ZERO, with real-time analysis (~5 patients/week).

Furthermore, we continue to improve the clinical utility of RNA-Seq through analysing outliers as well as expressed mutations and their allele-specificity. The integration of WGS with RNA-Seq has resulted in 64% of patients receiving at least one treatment recommendation within 7 weeks (minimum 8 days). Promisingly, for 43 patients with follow-up data who took a treatment recommendation, 31% had an objective response (tumour shrinkage) and another 40% achieved stabilisation of their aggressive disease. This demonstrates the significant potential for precision medicine to improve outcomes in paediatric cancer.

In this seminar, I will focus on the bioinformatics approaches that we have developed to deliver a national precision medicine program built upon WGS, RNAseq and methylation profiling.

About the speaker: A/Prof Mark Cowley is the head of the Computational Biology Group at the Children's Cancer Institute.

More information

Tuesday February 16 2021, 12pm-1pm AEDT

Speaker: Dr Philipp Bayer
Title: Interpretable machine learning in bioinformatics

Link to recording

+ Abstract

Machine learning methodologies are now part of the bioinformatics toolkit. I will talk about how we use interpretable machine learning in studies of plant gene presence/absence mechanisms. I will discuss our toolkit and how we run these tools at UWA. I’ll also talk about drawbacks and stumbling blocks when using machine learning tools in bioinformatics.

About the speaker: Dr Philipp Bayer is a research fellow at the University of Western Australia.
More information

Speaker: Dr Tom Schmidt
Title: Spatial population genomics of a recent mosquito invasion

Link to recording

+ Abstract

Population genomic approaches can characterize dispersal across a single generation through to many generations in the past, bridging the gap between individual movement and intergenerational gene flow. These approaches are particularly useful when investigating dispersal in recently altered systems, such as human‐mediated biological invasions. This talk presents the results of an investigation of past and present dispersal patterns in a 2004 invasion of Aedes albopictus (the Asian tiger mosquito) in the Torres Strait Islands (TSI) of Australia. This involved sampling mosquitoes from 13 TSI villages simultaneously and using double-digest RAD sequencing to genotype 373 mosquitoes, including 331 from the TSI, 36 from Papua New Guinea and four incursive mosquitoes detected in uninvaded regions. This talk details our findings, including movement of mosquitoes within and between islands and the spread of putatively adaptive alleles from Papua New Guinea into the Torres Strait Islands.

About the speaker: Dr Tom Schmidt is a research fellow at the University of Melbourne.
More information