MSc Thesis presentation of Mr. Giorgos Petsangourakis, Tuesday, July 7, 2026
Student: “Giorgos Petsangourakis”
Program: “Data Science and Information Technologies”
Title: “Semantically-Guided Image Synthesis: Augmenting VAE Decoders with Foundation Model Representations”
Abstract: Variational Autoencoders (VAEs) serve as the critical first stage in modern latent generative modeling, yet they often face an inherent trade-off between latent compression and reconstruction fidelity.
Moreover, standard VAE architectures frequently struggle to recover fine-grained textures and complex semantic structures from the low-dimensional latent bottleneck z.
In this work, we propose an architectural enhancement to the VAE decoder that leverages high-level semantic information from Vision Foundation Models (VFMs). Central to our approach is the adaptation of the lightweight convolutional semantic compressor introduced in the REGLUE framework. While REGLUE utilizes this module to entangle semantic features within a diffusion process, we apply it to the VAE decoder, where it non-linearly aggregates multi-layer DINO VFM features into a spatially structured, low-dimensional representation that directly conditions decoding.
By injecting these “semantic maps” into the decoder’s upsampling blocks during finetuning, we provide the model with a structural signal that supplements the information in the primary latent space. Our experimental results demonstrate that this semantically-guided decoding strategy outperforms baseline VAEs across key metrics.
Most notably, we observe a substantial improvement in rFID (Reconstruction FID), indicating a superior ability to synthesize images that are both distributionally and structurally faithful to the ground truth. Furthermore, improvements in standard generative FID suggest that the augmented decoder provides a more robust foundation for downstream synthesis tasks.
Our findings highlight that the non-linear compression of VFM features is not only beneficial for diffusion backbones but also a transformative tool for overcoming the fundamental reconstruction bottlenecks of autoencoder architectures.
Date/Time: July 7, 2026 – 13:00.
Examination Committee:
Dr. Bill Psomas
Dr. Stavros Perantonis
Dr. Ioannis Kakogeorgiou
Presentation link: https://meet.google.com/vsy-kkih-qah
—
Bill Psomas
MSCA Postdoctoral Fellow
VRG, FEE, Czech Technical University in Prague
Karlovo nám. 13, 120 00 Nové Město, Czech Republic
MSc Thesis presentation of Vasileios Klearchos Chatzitolios, Tuesday, 7/7/2026 at 13:00
On Tuesday, 7/7/2026 at 13:00, Vasileios Klearchos Chatzitolios, a graduate
student of the program “Data Science and Information Technologies,
Biomedical Data Science – Bioinformatics”, will present their MSc thesis
titled:
Fine-Tuning a Pretrained Actigraphy Transformer for Mild Cognitive
Impairment Classification
Abstract
Mild Cognitive Impairment (MCI) represents a prodromal stage of dementia,
impacting approximately 15–20% of adults over the age of 60. Currently,
scalable and low-burden assessment tools remain scarce. Wrist actigraphy
emerges as a promising modality, offering continuous and unobtrusive
measurement of rest–activity rhythms in real-world environments. Although
disrupted circadian rhythms have been linked to cognitive decline,
classical machine learning approaches that use human-selected actigraphy
features have generally demonstrated only moderate efficacy in
distinguishing cognitively normal (NC) individuals from those with MCI.
This thesis investigates whether a pretrained actigraphy foundation model
can be transferred to MCI classification without the need for human
feature engineering, and whether its learned representations can be
interpreted in terms of circadian patterns relevant to cognitive decline.
The study involves fine-tuning the Pretrained Actigraphy Transformer
(PAT-Large), which was pretrained via masked patch reconstruction on
29,307 NHANES participants, for binary classification between normal
control (NC) and MCI. It employs six-day wrist actigraphy recordings from
150 clinically characterized participants in the ALBION Greek cohort (98
NC, 52 MCI).
Through a systematic configuration search, two Pareto-optimal fine-tuning
strategies were identified: a fully frozen encoder configuration
(Head-Only) and a ULMFiT-style two-phase partial adaptation strategy.
These approaches were evaluated against deep learning baselines trained
from scratch and classical machine learning models utilizing feature
engineering, all using identical stratified cross-validation splits.
Among models trained exclusively on actigraphy data, the fine-tuned PAT
models achieved the best overall performance. PAT ULMFiT achieved
the highest Matthews Correlation Coefficient (MCC ≈ 0.49), F1 score, and
balanced accuracy. It outperformed both raw-sequence deep learning
baselines and feature-engineered machine learning pipelines on these
summary metrics. Performance was also competitive with age-augmented
machine learning models, despite PAT-based solutions not receiving
participant age, suggesting that pretrained actigraphy representations
capture discriminative temporal structure beyond what circadian summaries
alone can.
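
For readers unfamiliar with the summary metrics named above, the following is a minimal sketch of how the Matthews Correlation Coefficient and balanced accuracy are computed from binary confusion-matrix counts. The function names and the toy counts are illustrative assumptions and are not taken from the thesis or its results.

```python
import math


def matthews_corrcoef(tp, fp, fn, tn):
    """MCC from binary confusion-matrix counts (0.0 when undefined)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Convention: report 0.0 when any marginal total is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom


def balanced_accuracy(tp, fp, fn, tn):
    """Mean of sensitivity (positive-class recall) and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Toy counts for illustration only (NOT results from the thesis).
tp, fn = 3, 1   # positive class (e.g. MCI): detected / missed
tn, fp = 4, 2   # negative class (e.g. NC): correct / false alarms

print(round(matthews_corrcoef(tp, fp, fn, tn), 3))
print(round(balanced_accuracy(tp, fp, fn, tn), 3))
```

Both metrics account for class imbalance, which matters in a cohort where one class is roughly twice the size of the other.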
A multi-method interpretability framework combining Integrated
Gradients, learned attention pooling, and occlusion analysis
identified a circadian sensitivity pattern characterized by nocturnal
activity and daytime hypoactivity. The model attributes MCI-supporting
evidence to heightened nocturnal activity and persistent daytime
hypoactivity, with the attribution polarity shifting near the sleep–wake
boundary. Furthermore, occlusion analysis independently underscores the
importance of nocturnal periods and complements the observed phase lag
between peak model sensitivity and the peak of MCI-related nocturnal
fragmentation.
As an initial proof of concept in a single cohort, these findings indicate
that pretrained actigraphy representations can be transferred to the
classification of clinician-diagnosed MCI and produce interpretable
circadian signatures consistent with established rest–activity
disturbances associated with cognitive decline.
Examination Committee:
Prof. Em. Elias S. Manolakos (advisor), Dept. of Informatics and
Telecommunications, NKUA
Prof. Nikolaos Scarmeas, Dept. of Neurology, NKUA Medical School
Dr. Kostas Vekrellis, Research Director, Biomedical Research Foundation of
the Academy of Athens
Zoom Invitation
Elias Manolakos is inviting you to a scheduled Zoom meeting.
Topic: Vassilis Chatzitolios Msc Thesis presentation
Time: Jul 7, 2026 01:00 PM Athens
Join Zoom Meeting
https://us02web.zoom.us/j/82215719630?pwd=BBQtwyYDSWd9PMRc8YYazaVizWtQ15.1
Meeting ID: 822 1571 9630
Passcode: 148342
One tap mobile
+302111984488,,82215719630#,,,,*148342# Greece
+302311180599,,82215719630#,,,,*148342# Greece
Join by SIP
• 82215719630@zoomcrc.com
Join instructions
https://us02web.zoom.us/meetings/82215719630/invitations?signature=9_nYE0dEbcBExmyX3pRO_zAAJu7s8_Np6c2k7AyAF8Q
MSc Thesis presentation of Ms. Vasso Strouthopoulou, Thursday, July 2, 2026
Student: “Vasso Strouthopoulou”
Program: “Data Science and Information Technologies”
Title: “Transformer-Based Architectures for Financial Time Series Forecasting: From Single-Stock Learning to Cross-Asset Generalization”
Abstract:
This thesis investigates the application of Transformer-based neural architectures to short-term stock price prediction. The study examines three model configurations — a Seq2One Transformer encoder, an Attention-Only model, and a Seq2Seq encoder–decoder architecture — evaluated on daily open prices from NASDAQ-listed stocks across varying sequence lengths and training regimes.
Results demonstrate a clear performance progression across models. The Seq2One architecture establishes a stable baseline but accumulates error in autoregressive forecasts. The Attention-Only model achieves competitive short-term accuracy by capturing localized temporal dependencies. The Seq2Seq Transformer outperforms both predecessors, generating consistent five-day forecasts with the lowest mean absolute and mean squared errors. A multi-stock experiment further demonstrates that Transformers can generalize learned patterns to unseen assets. Comparative evaluation against an LSTM baseline reveals that the LSTM consistently matches or outperforms the Transformer while training 3–4× faster, highlighting practical trade-offs between recurrent and attention-based architectures.
The findings confirm the flexibility of attention-based models for financial forecasting and underscore the importance of sequence length selection, with 10–30 day input windows offering the best trade-off between responsiveness and noise reduction. Future work may incorporate multi-feature inputs, adaptive fine-tuning, and volatility-aware embeddings to enhance model robustness and interpretability.
Date/Time: July 2, 2026 – 12:00 PM.
Examination Committee:
Assoc. Prof. Aggelos Pikrakis
Dr. Kosmas Kritsis
Dr. Konstantinos Koutroumbas
Presentation link:
https://unipi.webex.com/meet/webex-host2
MSc Thesis presentation of Eirini Baltzi – Thursday, 2/7/2026
Student: “Eirini Baltzi”
Program: “Data Science and Information Technologies”
Title: “Transformer-Based Modeling of Irregular Clinical Time Series: Bi-Axial Attention and Temporal Encoding”
Abstract:
In this work, we study transformer-based modeling of irregular clinical time series derived from Electronic Health Records (EHR), with particular focus on the Bi-Axial Transformer (BAT) architecture. Clinical prediction from EHR data remains challenging due to irregular sampling, sparse observations, informative missingness, and the coexistence of dynamic and static patient information. Within this setting, our goal is twofold: first, to reproduce the BAT architecture and verify that the reconstructed implementation is consistent with the original paper; second, to investigate a modification of its temporal encoding mechanism through Rotary Positional Encoding (RoPE). The experimental evaluation is conducted on three clinical benchmarks: PhysioNet 2019 for sepsis prediction, PhysioNet 2012 for in-hospital mortality prediction, and MIMIC-III, where additional experiments are performed under a separate preprocessing pipeline. In addition to the reproduced BAT and the proposed BAT-RoPE variant, two baseline models are considered, namely GRU-D and a standard Transformer. The results show that BAT is a strong and competitive architecture across both datasets. The original BAT achieves the best AUPRC on PhysioNet 2019, while BAT-RoPE performs best on PhysioNet 2012. Additional attention and ablation analyses suggest that BAT captures structured temporal and inter-variable relationships and relies primarily on the dynamic clinical time series, while focal loss does not provide a consistent improvement.
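
As background for the temporal encoding mentioned above, here is a minimal sketch of the standard rotary encoding idea: each consecutive pair of vector components is rotated by an angle proportional to the position, so the dot product of two rotated vectors depends only on the difference of their positions. The function name `rope`, the adjacent-pair layout, and the use of real-valued timestamps as positions are assumptions made for illustration; the abstract does not specify how BAT-RoPE applies the encoding.

```python
import math


def rope(vec, position, base=10000.0):
    """Rotate consecutive pairs of `vec` by position-dependent angles.

    `position` may be any real number, e.g. an irregular timestamp
    (an assumption here, not a detail taken from the thesis).
    """
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("vector length must be even")
    out = []
    for i in range(0, d, 2):
        # Pair j = i / 2 uses frequency base ** (-2j / d) = base ** (-i / d).
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Illustrative query/key vectors (hypothetical values).
q = [1.0, 2.0, 3.0, 4.0]
k = [0.5, -1.0, 2.0, 0.0]

# Same time gap (2.5) at different absolute times gives the same score.
score_a = dot(rope(q, 1.5), rope(k, 4.0))
score_b = dot(rope(q, 10.0), rope(k, 12.5))
print(abs(score_a - score_b) < 1e-9)
```

Because the rotation preserves vector norms and encodes only relative offsets in the attention score, it is a natural candidate for sequences whose observations are unevenly spaced in time.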
Date-Time: 2/7/2026 – 11:00 AM
Examination Committee:
Associate Professor Aggelos Pikrakis, University of Piraeus
Dr. Kosmas Kritsis, ILSP/Athena Research Center
Dr. Vassilis Psomas, Czech Technical University, Prague
MSc Thesis Presentation
Thursday, July 2 · 11:00am – 12:00pm
Time zone: Europe/Athens
Webex joining info
Video call link: https://unipi.webex.com/meet/webex-host2
MSc Thesis presentation of Georgios Xydias – Thursday 25/06/2026
Title: “Modular LLM Pipeline for Task-Oriented Customer Service Dialogue”
Abstract:
Task-oriented dialogue (TOD) systems are a cornerstone of modern customer-service
automation, promising to handle structured user goals such as searching, booking, and
information requests through natural conversation. The recent rise of instruction-tuned Large
Language Models (LLMs) has reopened a fundamental design question: should a single
general-purpose LLM handle the entire turn, or should the responsibilities of a dialogue agent
be split into a modular pipeline of smaller, role-specific components? Existing work explores
both ends of this spectrum, but rarely compares them head-to-head under a setup that uses the
same dataset, the same evaluation procedure, and the same model families across
architectures, so the practical trade-offs between architectural simplicity and architectural
decomposition have not been examined under a single controlled comparison.
In this work, we introduce a controlled comparative study of three task-oriented dialogue
architectures for hotel and restaurant customer service, evaluated on the MultiWOZ 2.2
benchmark. The first architecture is a single-LLM baseline that performs dialogue state tracking,
database lookup, and response generation in one prompt. The second is a zero-shot modular
pipeline that separates the dialogue state tracker from the response generator, connected by a
deterministic database lookup and a policy gate, with a lightweight guardrail on the generated
response. The third applies role-specific LoRA fine-tuning to the two LLM-based modules of the
pipeline (the dialogue state tracker, DST, and the response generator, ResponseGen) by
training small open-source models for their assigned role via QLoRA. All three architectures are
evaluated with the standardized MultiWOZ evaluator, reporting both dialogue-state metrics
(Joint Goal Accuracy, Slot F1) and end-to-end metrics (Inform, Success, BLEU, Combined),
complemented by a custom set of operational metrics covering hallucination, policy violations,
system correctness, latency, and cost.
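
To make the dialogue-state metric concrete, the following is a minimal sketch of Joint Goal Accuracy: a turn counts as correct only if the entire predicted state matches the gold state exactly. The function name, the dictionary representation, and the example slot values are illustrative assumptions; the standardized MultiWOZ evaluator applies its own value normalization and matching rules, which this sketch omits.

```python
def joint_goal_accuracy(predicted_states, gold_states):
    """Fraction of turns whose full predicted state equals the gold state."""
    if len(predicted_states) != len(gold_states):
        raise ValueError("need one predicted state per gold state")
    if not gold_states:
        return 0.0
    correct = sum(
        1 for pred, ref in zip(predicted_states, gold_states) if pred == ref
    )
    return correct / len(gold_states)


# Hypothetical three-turn dialogue (slot values invented for illustration).
gold = [
    {"hotel-area": "north"},
    {"hotel-area": "north", "hotel-stars": "4"},
    {"hotel-area": "north", "hotel-stars": "4", "hotel-parking": "yes"},
]
predicted = [
    {"hotel-area": "north"},
    {"hotel-area": "north", "hotel-stars": "3"},  # one wrong slot fails the turn
    {"hotel-area": "north", "hotel-stars": "4", "hotel-parking": "yes"},
]

print(joint_goal_accuracy(predicted, gold))
```

A single wrong slot makes the whole turn count as incorrect, which is why Joint Goal Accuracy is usually much lower than Slot F1 for the same tracker.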
Our findings show that a) architectural decomposition is not universally beneficial: it
substantially helps weaker models that struggle to manage every responsibility in a single
prompt, but it can hurt the strongest open-source models that already coordinate all
responsibilities reliably, and b) role-specific LoRA fine-tuning lifts dialogue-state tracking of
small open-source models close to the level of much larger commercial APIs, making them
a cost-efficient, fast, and competitive alternative for slot tracking. These results show that
architectural decomposition and role-specific fine-tuning are effective tools, but only under the
right conditions, for building reliable LLM-based customer-service dialogue systems.
Date-Time: Thursday 25/06/2026 – 10:00 AM
Examination Committee:
Dr. Kosmas Kritsis
Prof. Aggelos Pikrakis
Dr. Georgios Paraskevopoulos
MSc Thesis Presentation
Thursday, June 25 · 10:00am – 11:00am
Time zone: Europe/Athens
Microsoft Teams meeting
Join:
https://teams.microsoft.com/meet/318082679942220?p=ne3PcOyBsgIcxTjyGu