Automating High Content Screening with Deep Learning
Author | : Oren Zeev Kraus |
Release | : 2018 |
OCLC | : 1333980377 |
Book excerpt: High-content screening (HCS) combines automated microscopy with high-throughput technologies for assaying genetic or environmental perturbations. HCS systems have enabled experiments designed to infer relationships between genetic mutations, compound libraries, and cellular phenotypes. Despite the benefits of image-based screens, analyzing large amounts of microscopy data remains a bottleneck. Complex patterns, such as subcellular protein localizations, are difficult to quantify, and several research groups have resorted to scoring these phenotypes manually by eye. Other researchers have developed customized computational analysis pipelines that need to be re-engineered for every screen. In this thesis, I apply supervised deep learning to HCS datasets as a computational approach that can be used to analyze a wide range of microscopy screens. First, I show that a deep convolutional neural network (CNN) outperforms a previously published computational pipeline at classifying the subcellular localization of fluorescently tagged proteins in genome-wide yeast screens. I further show that this CNN can be applied to, and updated on, other protein localization datasets, including one generated by a different lab on different imaging equipment. The CNN was used to identify ~300 proteins that change localization in response to mating pheromone exposure. I trained a similar CNN to assign single cells in HCS images to one of six cell cycle stage categories, enabling in silico synchronization of single cells. I combined the outputs from these two CNNs with data from RNA sequencing and ribosome profiling experiments to generate a comprehensive database of protein dynamics throughout the cell cycle.
Additionally, I developed a novel CNN architecture to train models on HCS images without requiring cell segmentation or annotations for individual cells. For training, this architecture requires only a list of the phenotypes evident in a handful of images. It can then predict which phenotypes are visible in other images and indicate which cells in those images exhibit them. This model takes a major step toward automating the analysis of HCS images, because it needs minimal input to be applied to different screens.
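The idea of learning from image-level phenotype lists while still pointing at individual cells can be illustrated with a small sketch. The excerpt does not describe how the architecture aggregates information across an image, so the code below makes its own assumptions: a hypothetical network has already produced one score (logit) per image location for each phenotype, the image-level prediction is taken as the maximum over locations, and locations above a threshold are reported as the cells showing the phenotype. Max pooling is only one common choice for this kind of weak supervision and is not taken from the excerpt. The phenotype names and score values are invented toy data.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def image_level_probs(score_maps):
    """Aggregate per-location logits into one probability per phenotype.

    score_maps: dict mapping phenotype name -> 2D list of logits, one per
    image location. Max pooling over locations is an assumption of this
    sketch, not the method described in the excerpt.
    """
    probs = {}
    for phenotype, grid in score_maps.items():
        best_logit = max(max(row) for row in grid)
        probs[phenotype] = sigmoid(best_logit)
    return probs


def localize(score_maps, phenotype, threshold=0.5):
    """Return (row, col) locations whose probability meets the threshold."""
    grid = score_maps[phenotype]
    return [
        (r, c)
        for r, row in enumerate(grid)
        for c, logit in enumerate(row)
        if sigmoid(logit) >= threshold
    ]


# Toy 2x2 score maps for two hypothetical phenotype labels.
toy_maps = {
    "nucleus": [[-3.0, 2.5], [-2.0, -4.0]],
    "vacuole": [[-3.0, -2.5], [-2.0, -4.0]],
}

probs = image_level_probs(toy_maps)
hits = localize(toy_maps, "nucleus")
```

During training, only the image-level probabilities in `probs` would be compared against the phenotype list for that image, so no per-cell annotation is needed. The per-location scores come for free and can be read out afterwards with `localize`.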