Master’s Thesis: Data-Efficient Language Model Alignment Techniques for Text Generation Tasks

15.08.2025, Diploma, Bachelor's and Master's Theses

This thesis explores data-efficient preference alignment techniques for text generation. The goal is to understand and extend hybrid approaches where models leverage limited human supervision alongside their own output feedback. By analyzing double-gradient architectures and iterative refinement strategies, this work aims to improve the efficiency and reliability of LLM alignment with reduced dependency on large, annotated datasets.

Project Background

Aligning large language models (LLMs) with human preferences is essential for safe and useful text generation. Traditional approaches such as Reinforcement Learning from Human Feedback (RLHF) are data- and resource-intensive. Recent research explores hybrid alignment methods that combine small amounts of annotated data with model-generated supervision. Techniques such as SPPO, I-SHEEP, and RS-DPO show that models can iteratively refine themselves by generating, evaluating, and learning from their own outputs, often through mechanisms that allow gradients to pass through the model multiple times. This project investigates such techniques to achieve efficient alignment with minimal human input.
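
To make the iterative-refinement idea concrete, below is a minimal, illustrative sketch of one self-refinement round with a DPO-style preference update, in the spirit of methods like SPIN, SPPO, and RS-DPO. All specifics are assumptions for illustration: the gpt2 checkpoint, the hyperparameters, and the judge() preference function (in practice a sparse human label or a learned judge model) are placeholders, not part of the project description.

    import copy
    import torch
    import torch.nn.functional as F
    from transformers import AutoModelForCausalLM, AutoTokenizer

    def seq_logprob(model, prompt_ids, response_ids):
        """Sum of token log-probabilities of the response given the prompt."""
        input_ids = torch.cat([prompt_ids, response_ids], dim=-1)
        logits = model(input_ids).logits[:, :-1, :]          # next-token logits
        targets = input_ids[:, 1:]
        logps = F.log_softmax(logits, dim=-1).gather(-1, targets.unsqueeze(-1)).squeeze(-1)
        return logps[:, prompt_ids.shape[-1] - 1:].sum(-1)   # response tokens only

    def judge(prompt_ids, candidates):
        """Placeholder preference signal: a toy length heuristic here; in
        practice a limited human label or a learned reward/judge model."""
        return 0 if candidates[0].shape[-1] >= candidates[1].shape[-1] else 1

    tok = AutoTokenizer.from_pretrained("gpt2")              # placeholder model
    policy = AutoModelForCausalLM.from_pretrained("gpt2")
    ref = copy.deepcopy(policy).eval()                       # frozen reference copy
    for p in ref.parameters():
        p.requires_grad_(False)

    opt = torch.optim.AdamW(policy.parameters(), lr=1e-6)
    beta = 0.1                                               # DPO temperature

    prompt = tok("Explain gradient descent:", return_tensors="pt").input_ids
    # 1) The policy generates two candidate responses (self-generated data).
    cands = [policy.generate(prompt, do_sample=True, max_new_tokens=64,
                             pad_token_id=tok.eos_token_id)[:, prompt.shape[-1]:]
             for _ in range(2)]
    # 2) A limited preference signal ranks the candidates.
    w = judge(prompt, cands)
    win, lose = cands[w], cands[1 - w]
    # 3) DPO-style update: widen the policy's log-probability margin on the
    #    preferred response relative to the frozen reference model.
    pi_w, pi_l = seq_logprob(policy, prompt, win), seq_logprob(policy, prompt, lose)
    with torch.no_grad():
        ref_w, ref_l = seq_logprob(ref, prompt, win), seq_logprob(ref, prompt, lose)
    loss = -F.logsigmoid(beta * ((pi_w - ref_w) - (pi_l - ref_l))).mean()
    opt.zero_grad(); loss.backward(); opt.step()

Iterating this loop, regenerating candidates from the updated policy each round, is the self-play pattern the methods above share; they differ mainly in how candidates are sampled, how preferences are obtained, and how the reference model is refreshed.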

Your Tasks

  • Survey and categorize hybrid alignment methods, focusing on those enabling double gradient flow and iterative refinement (e.g., SPIN, SPPO, I-SHEEP, SPO, RS-DPO).
  • Develop and implement a comparative framework for evaluating selected alignment strategies on text generation tasks (a minimal sketch of such a harness follows this list).
  • Conduct empirical analysis on alignment quality, data efficiency, and training stability using open datasets and benchmarks (e.g., AlpacaEval, OpenAssistant).
  • Investigate the impact of limited supervision on preference model quality and generalization.
  • Propose and evaluate refinements or novel techniques where applicable.
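
As a starting point for the comparative framework mentioned above, here is a hedged sketch of a pairwise win-rate harness; the gen_a/gen_b generation callables and the judge preference function are assumed interfaces for illustration, not prescribed by the project:

    from collections import Counter

    def win_rate(prompts, gen_a, gen_b, judge):
        """Fraction of prompts on which strategy A's output is preferred.
        gen_a/gen_b map a prompt string to a response string; judge returns
        0 if the first response is preferred, 1 otherwise."""
        tally = Counter()
        for p in prompts:
            a, b = gen_a(p), gen_b(p)
            tally["a" if judge(p, a, b) == 0 else "b"] += 1
        return tally["a"] / max(1, len(prompts))

    # Usage (hypothetical): compare an SPPO-trained checkpoint against an
    # RS-DPO one on a shared evaluation prompt set.
    # rate = win_rate(eval_prompts, sppo_model.respond, rsdpo_model.respond, judge)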

What We Offer

  • Access to computing resources (GPU clusters) and relevant datasets.
  • Regular supervision and support from researchers with expertise in language model training and alignment.
  • Opportunity to collaborate on publications or contribute to open-source implementations.
  • A focused and relevant research environment in alignment and generative AI.

Project Details

  • Prerequisites: Strong background in deep learning and NLP; experience with PyTorch/JAX and the Hugging Face ecosystem. Familiarity with large language model fine-tuning and alignment is a plus.
  • Preferred Start Date: October 2025 (flexible).
  • How to Apply: Submit a CV, a brief statement of motivation highlighting relevant background/experience, and transcripts to marton.szep@tum.de. Please indicate any prior experience with language models or alignment methods.

Contact: marton.szep@tum.de
