Technische Universität München



Master Thesis: Vision-Language Pretraining for Bone Tumor Classification

04.12.2024, Theses, Bachelor's and Master's Theses

Abstract:
Bone tumor classification presents significant challenges due to the subtle visual differences among tumor entities, even for expert radiologists. This thesis aims to enhance diagnostic capabilities using vision-language pretraining to classify bone tumors from X-ray images. By pretraining on large public datasets such as MURA and incorporating anatomical context through captions, this thesis seeks to address key limitations posed by data scarcity and anatomical heterogeneity in the field of bone tumors.

Methodology:
- Review the literature on state-of-the-art techniques for bone tumor classification and self-supervised vision-language pretraining.
- Implement a supervised model for bone tumor classification from X-ray images to serve as a baseline.
- Pretrain a vision-language model in a self-supervised manner, which will serve as a general-purpose model for downstream tasks.
- Evaluate several fine-tuning strategies for bone tumor classification and assess the model's zero-shot capabilities.
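
As a rough illustration of the pretraining and zero-shot steps above, the following minimal sketch shows a symmetric contrastive loss in the style of CLIP (Radford et al., listed under References) together with a similarity-based zero-shot prediction. It is not the project's implementation: the function names, the temperature of 0.07, the toy 2-D embeddings, and the class labels are all assumptions made for illustration, and plain Python is used only to keep the example dependency-free.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def similarity_matrix(image_embs, text_embs, temperature=0.07):
    """Temperature-scaled cosine similarities: rows are images, columns are captions."""
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    return [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
            for i in imgs]

def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct column for row k is column k."""
    total = 0.0
    for k, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[k]
    return total / len(logits)

def clip_style_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss over a batch of matched image/caption pairs."""
    logits = similarity_matrix(image_embs, text_embs, temperature)
    transposed = [list(col) for col in zip(*logits)]
    # Average the image-to-text and text-to-image directions.
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(transposed))

def zero_shot_predict(image_emb, prompt_embs, class_names):
    """Return the class whose prompt embedding is most similar to the image."""
    scores = similarity_matrix([image_emb], prompt_embs)[0]
    best = max(range(len(scores)), key=scores.__getitem__)
    return class_names[best]

# Toy 2-D embeddings standing in for encoder outputs (hypothetical values).
image_embs = [[1.0, 0.1], [0.1, 1.0]]
matched_text = [[0.9, 0.2], [0.2, 0.9]]
swapped_text = [matched_text[1], matched_text[0]]

# Correctly paired captions should give a lower loss than mismatched ones.
loss_matched = clip_style_loss(image_embs, matched_text)
loss_swapped = clip_style_loss(image_embs, swapped_text)

# Zero-shot: compare a new image embedding against one prompt per class.
prediction = zero_shot_predict([0.2, 1.0], matched_text, ["class A", "class B"])
```

In an actual setup the embeddings would come from trained image and text encoders, and the class prompts would be captions describing the tumor entities of interest. The loss only rewards pairs on the diagonal of the similarity matrix, which is why swapping the captions raises it.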

Prerequisites:
- Advanced knowledge of deep learning with imaging data.
- Beneficial but not necessary: experience in medicine/oncology.
- Preferred starting date: January-February 2025 (with flexibility).

What we offer:
- Very rare medical data with high potential for publication.
- Highly educated & interdisciplinary environment.
- Top-level hardware for scientific computing.
- Constant feedback from medical and computer science experts.

How to apply:
Send an email to anna.curto-vilalta@tum.de with your CV and a short introduction about yourself and your motivation.

References:
A. Radford et al., "Learning Transferable Visual Models From Natural Language Supervision," arXiv:2103.00020, Feb. 26, 2021. doi: 10.48550/arXiv.2103.00020.
H. Q. Vo et al., "Frozen Large-scale Pretrained Vision-Language Models are the Effective Foundational Backbone for Multimodal Breast Cancer Prediction," IEEE Journal of Biomedical and Health Informatics. doi: 10.1109/JBHI.2024.3507638.

Contact: anna.curto-vilalta@tum.de