KCCG Genomics Summer Scholarship @ Sydney

Employer: Garvan Institute of Medical Research

Closing date: 07-12-2019

Brief position description: The Garvan Institute of Medical Research (Garvan) is one of Australia’s leading medical research institutes, with over 600 scientists, students and support staff. Its mission is to make significant contributions to medical research that will change the directions of science and medicine and have major impacts on human health. Garvan’s research encompasses immunology, bone biology, neuroscience, diabetes and metabolism, cancer, and genomics and epigenetics. These research activities are organised within four themes: Cancer, Inflammatory Diseases, Healthy Aging, and Genomics, and two Centres: the Kinghorn Centre for Clinical Genomics and the Garvan-Weizmann Centre for Cellular Genomics.

The Kinghorn Centre for Clinical Genomics (KCCG), established by the Garvan Institute, is an Australian research and sequencing centre delivering genomic information for clinical use. The KCCG facilitates genome-based research, particularly in cancer and monogenic diseases, but also in complex diseases such as diabetes, osteoporosis and immunological disease. Our vision is to translate medical research into clinical care in Australia and beyond by integrating sequencing, bioinformatics and data management in a cutting-edge genomics research environment.

The Opportunities
The KCCG is offering currently enrolled undergraduate students opportunities to carry out projects during summer 2019/2020. These projects provide hands-on research experience in the following topics:

1. Integration of “real” real-time nanopore analysis
State-of-the-art nanopore sequencers produce terabytes of data over a sequencing run of 48 hours. To improve turnaround times, we have designed a proof-of-concept system that processes the data from the sequencer on the fly. In this project, you will go one step further by interfacing the computing system with the sequencer at the software level and adding failure handling so the system can be used in a production workflow.

2. Signal analysis algorithm and parameter optimisation
Nanopore sequencing offers the possibility of portable genome sequencing, but analysing the terabytes of data generated is both compute- and memory-intensive. We have heavily optimised gold-standard "nanopore event alignment" software to make efficient use of embedded CPU-GPU heterogeneous computing systems. You will work on improvements such as optimising runtime parameters (data chunk size, number of threads, etc.) and accelerating algorithm performance using SIMD (Single Instruction Multiple Data).

3. Developing a National Genomics Resource Centre at KCCG
KCCG is developing a National Genomics Resource Centre (NGRC) to help researchers develop sound research proposals with ethically defensible plans, to partner with researchers in notifying study participants of clinically actionable findings, and to provide comprehensive follow-up support for such participants. In this project, you will help set up various aspects of the NGRC, which will comprise a web-based platform, a tailored messaging system and a 1800 telephone line staffed by appropriately trained, dedicated genetic counsellors accessible to both researchers and research participants.

4. Generating pedigrees from patient information
Entering pedigree information into a computer-readable format is difficult, as mistakes are easily made when entering large numbers of people. The current version of the pedigree-management software uses a graphical interface and can predict some information when people are added to a pedigree. In this project, you will expand this software to include further validation and visualisation, as well as saving additional phenotype information. You will gain experience developing software using Python.

5. Detecting and interpreting genomic structural variants in 4000 Australians
At KCCG we perform genetic testing for disease-causing genomic variants. We also have the largest cohort of healthy individuals with whole genome sequencing data in Australia, which has allowed us to estimate the frequencies of single nucleotide variants in the general population. In this project, you will use our accredited detection pipeline and high-performance compute infrastructure to analyse this cohort for structural variants, summarising the detected variants, determining their frequencies and interpreting biological findings.

6. Identifying tandem repeats from nanopore sequencing for FSHD diagnostics
Help develop the code to better identify tandem repeats, then optimise the code for the types of repeats relevant to FSHD diagnostics, including the 3.3 kb macrosatellite and the 68 bp beta-satellite repeats on chr4q/10q. You will gain experience in programming for a biological application as part of a bioinformatics team. Exact details will vary depending on the state of the project at the time.

7. Characterising detailed methylation patterns across the D4Z4 repeat in FSHD patients
You will analyse a collection of samples sequenced using nanopore sequencing, including unaffected people and patients with FSHD. You will gain an understanding of the biology of FSHD and experience handling biological data.

8. Clinical diagnosis using Whole Genome Analysis
In clinical practice, Whole Genome Analysis has traditionally been difficult and expensive due to extended manual variant curation. Restricting analysis to panels of curated genes reduces complexity and improves turnaround times but may miss novel pathogenic variants. KCCG has developed Orrery, a variant prioritisation framework that enables rapid analysis of whole genomes, and which has recently been validated and accredited for clinical applications. In this project, you will use Orrery to revisit samples from the cardiogenomics flagship, comparing diagnostic yields and clinical significance with previous panel-based analyses. This work will help determine the optimal diagnostic workflow.

9. Exploring the transcriptional signatures of skeletal dysplasia genes
Skeletal dysplasias are a group of rare genetic disorders characterised by severe deformities of the skeleton. Our group aims to identify the genetic defects responsible for these disorders. In this project, you will help explore the expression patterns of known skeletal dysplasia genes across various tissues and cell types to identify new skeletal dysplasia genes. Identifying these genes is an essential first step in understanding what is causing these disabling diseases and developing a possible cure.

10. Developing an app for “Dynamic Consent” for use of one’s genome and DNA information
We are now able to sequence the entire genome of an individual within hours. This is very valuable data: it can help not only the individual but also their relatives, and it can advance medical diagnosis and scientific research. The concept of “Dynamic Consent” allows individuals to specify who can, and who can’t, access their genetic information, and to review and change their consent at any time. This project involves a mock-up (and a basic working prototype) of a website or a mobile app that implements Dynamic Consent for genomic testing.

11. Using text mining to extract knowledge from clinical trial literature for oncology
The literature on oncological clinical trials is expanding so rapidly that there is now an urgent need for methods of extracting key knowledge automatically. In this project, you will apply text mining techniques to public repositories to extract and cross-check information such as therapeutic combinations, indications and outcomes. We will then compare the performance and utility of different text mining methods, including machine learning-based methods, in a small case study. The goal is to produce a database of trial results that will help oncology researchers ask more sophisticated questions and thereby design better interventional trials.

12. Using deep learning to extract insights from biological data sets
Skymapper is a new deep learning tool that uses Convolutional Neural Networks to derive insights from large biological data sets, including DNA and RNA sequences, methylation and multi-omics data. You will be involved in testing and documenting Skymapper with the goal of completing the documentation. You can also contribute to projects implemented using Skymapper.

The position will be offered full-time for 10 weeks and provides an allowance of $5000 as a tax-free scholarship.

How to Apply
All applications must be submitted via the Garvan Careers site. Applications from other sites/channels will not be considered.

Your application should include:
Your CV with a cover letter stating which project(s) you are applying for, and
Academic transcripts

Closing Date

The position will remain open until filled. We will be reviewing applications as they are received, and so we encourage you to submit your application as soon as possible.

Job website: http://garvan.wd3.myworkdayjobs.com/en-US/garvan_institute/job/Sydney/KCCG-Genomics-summer-scholarship_PRF5295-1

Contact name: Shree Chattopadhyay

Contact email: shreec@garvan.org.au