Master Thesis: Integrating World Models with Vision‑Language‑Action (VLA) Models for Autonomous Driving
20.02.2026, Theses, Bachelor's and Master's Theses
This thesis project aims to evaluate and quantify a robot's action‑generation capability when a VLA model is combined with different world models, and to develop a clearer understanding of how various world‑model properties contribute to more stable and capable long‑horizon action generation. Additionally, if feasible, the project will devise, test, and validate an effective world‑model implementation method tailored to the nature of the task and the environment.
Research Goals
- Understanding state‑of‑the‑art VLA models and world‑model approaches
- Implementing and benchmarking multiple world‑model variants within a VLA framework
- Developing simulation‑based setups for model evaluation
- Exploring and validating improved integration strategies based on findings (if feasible)
- Identifying and designing task‑specific fine‑tuning strategies to enhance model performance (if feasible)
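To make the "world‑model variants within a VLA framework" goal concrete, here is a minimal, hypothetical sketch of one common integration pattern: the VLA model proposes candidate actions, and a learned world model rolls each candidate forward in imagination so the best one can be selected. All class and function names (`ToyWorldModel`, `ToyVLAPolicy`, `plan_with_world_model`) are illustrative placeholders, not part of any specific framework; real components would be learned neural networks rather than the toy dynamics and random proposals used here.

```python
import numpy as np

rng = np.random.default_rng(0)

class ToyWorldModel:
    """Placeholder for a learned dynamics model.

    predict(state, action) returns the imagined next state; a real
    world model would be a learned (often latent) dynamics network.
    """
    def predict(self, state, action):
        # Toy linear dynamics, purely for illustration.
        return state + action

class ToyVLAPolicy:
    """Placeholder for a VLA model: proposes candidate actions."""
    def propose_actions(self, state, n=8):
        # A real VLA would condition on images and language; here we
        # just sample random action candidates of matching dimension.
        return rng.normal(size=(n, state.shape[0]))

def plan_with_world_model(policy, world_model, state, goal, horizon=5):
    """Score each candidate action by its imagined rollout's distance to goal."""
    candidates = policy.propose_actions(state)
    best_action, best_cost = None, float("inf")
    for action in candidates:
        s = state.copy()
        for _ in range(horizon):
            s = world_model.predict(s, action)  # imagined rollout step
        cost = float(np.linalg.norm(s - goal))
        if cost < best_cost:
            best_action, best_cost = action, cost
    return best_action, best_cost
```

This is only one of several possible integration strategies (others include using world‑model rollouts as training data or as an auxiliary loss); comparing such strategies is exactly what the benchmarking goal above covers.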
Prerequisites:
- Basic understanding of, and interest in, robotics
- Basic understanding of computer vision
- Ability to understand and run modern deep learning and robotics codebases
- Proficiency in Python or C/C++
- Experience with reinforcement learning and simulation environments (e.g., MuJoCo, Isaac) is a plus
Contact: Christian.Prehofer@tum.de


