Bachelor thesis: Data Synthesis for Real Clinical Tabular Data in Orthopaedics
20.11.2024, Theses, Bachelor's and Master's theses
Abstract
This thesis aims to develop and evaluate machine learning techniques for synthesizing real clinical tabular data, specifically from the orthopaedics department. The main goal is to generate high-quality synthetic datasets that can be used for research while safeguarding patient privacy. By applying state-of-the-art models, the project will contribute to the understanding of how well synthetic data can replicate the properties of real patient data in support of clinical decision-making and research initiatives. The models will be tested on both standard patient datasets and more complex nested datasets to determine their robustness and accuracy in diverse settings.
Background
In clinical research, access to high-quality data is often constrained by privacy regulations, which limits the availability of datasets that are crucial for developing new medical technologies and decision-support systems. Synthetic data is increasingly seen as a viable solution to this challenge, because it allows data to be shared and analyzed without compromising patient confidentiality. This project focuses on generating realistic synthetic data from tabular patient records in orthopaedics. The goal is to test multiple AI models to determine which best captures the statistical properties and complexity of real clinical data while maintaining patient privacy.
Tasks
- Literature Review: Conduct an extensive review of existing methodologies for synthetic data generation, with a focus on healthcare applications.
- Model Evaluation and Testing: Implement and evaluate 2-3 different models to synthesize clinical tabular data. The models will be tested on both a standard patient dataset and more complex, nested data to assess their performance in different contexts. Conduct a quantitative and qualitative evaluation of the results.
- Results Presentation: Analyze and present the results, focusing on how well the synthesized data captures the characteristics of the original patient data, and on the models' applicability to diverse clinical scenarios.
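As an illustration of what the quantitative part of the evaluation could look like, the following minimal sketch compares each numeric column of a real and a synthetic table using the two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical distribution functions; 0 means identical distributions, 1 means no overlap). This is only one common fidelity measure and is an assumption here, not the metric prescribed for the project. The function names `ks_statistic` and `column_fidelity`, the dict-of-lists table layout, and the toy values are all hypothetical.

```python
from bisect import bisect_right


def ks_statistic(real, synthetic):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between empirical CDFs."""
    real_sorted = sorted(real)
    synth_sorted = sorted(synthetic)
    n, m = len(real_sorted), len(synth_sorted)
    # The empirical CDFs only change at observed values, so it is
    # enough to evaluate the gap at every value seen in either sample.
    points = set(real_sorted) | set(synth_sorted)
    return max(
        abs(bisect_right(real_sorted, x) / n - bisect_right(synth_sorted, x) / m)
        for x in points
    )


def column_fidelity(real_table, synthetic_table):
    """Return {column name: KS statistic} for tables stored as dicts of lists."""
    return {
        col: ks_statistic(real_table[col], synthetic_table[col])
        for col in real_table
    }


# Hypothetical toy values for illustration only, not real patient data.
real = {"age": [54, 61, 67, 70, 72], "bmi": [24.1, 27.3, 29.8, 31.0, 33.5]}
synthetic = {"age": [55, 60, 66, 71, 73], "bmi": [40.0, 41.2, 42.5, 43.1, 44.0]}

scores = column_fidelity(real, synthetic)
for column, score in scores.items():
    print(f"{column}: KS = {score:.2f}")
```

In this toy example the synthetic `age` column tracks the real one closely (low statistic), while the synthetic `bmi` column does not overlap with the real values at all (statistic of 1). A per-column measure like this ignores correlations between columns and says nothing about privacy, so it would need to be complemented by other checks.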
Prerequisites
Preliminary experience with and a solid understanding of machine learning and deep learning techniques, particularly in data synthesis.
References
- https://arxiv.org/abs/2209.15421
- https://arxiv.org/abs/2410.20626
- https://link.springer.com/article/10.1007/s00167-022-06957-w
To apply, please send a short email with your CV and transcript of records to florian.hinterwimmer@tum.de.
Contact: florian.hinterwimmer@tum.de