Lecturer: Dave Cash (UCL DRC)
In the past few years, the amount of medical imaging data available across the whole spectrum of health and bioscience research has skyrocketed, enabling many new applications and discoveries. Larger sample sizes help characterize the clinical heterogeneity of the population and identify small but meaningful changes, particularly in individuals who are at risk or showing the earliest signs of disease. They are also often needed to train and test machine learning and deep learning algorithms. However, the logistics of managing and analysing imaging data grow in complexity as the size of the data increases. Cloud-based and federated solutions can alleviate some of these issues but also bring new challenges. In addition to data management issues, funders are increasingly requiring researchers to share their data. While these open science mandates are crucial for the field, researchers must balance data sharing requirements with data protection law. Appropriate anonymisation strategies are important to remove all personal information from the image metadata, and in some cases from the images themselves (e.g. a face can be reconstructed from a head MRI), whilst not being so stringent that the data becomes unusable or the reproducibility of the analysis is affected. This talk will discuss considerations and strategies for data management and analysis of large-scale imaging studies, including multi-user access, multi-site studies, and public data sharing.