Dorian Joubaud

PhD candidate, Machine Learning

SerVal group, SnT, University of Luxembourg

Luxembourg

I'm a PhD candidate in Machine Learning at the SnT of the University of Luxembourg, supervised by Prof. Yves Le Traon and Dr. Sylvain Kubler. My research focuses on synthetic data generation for time series: producing, selecting, and evaluating augmented samples that help downstream models when real data is scarce, imbalanced, or drawn from a different distribution.

In BALANCER (IEEE Access 2024), we trained a model to pick the right augmentation method for each imbalanced dataset. DADA (PHM 2024) uses a Wasserstein CycleGAN to translate labeled source-domain samples into synthetic target-domain samples, allowing a remaining-useful-life predictor to be trained for a target domain that has no labels of its own. ASCENSION (under review at TMLR) expands class regions inside a VAE latent space so a classifier stays confident with little training data. Reach out if any of this is relevant to your work.

News

  • May 2026 ASCENSION resubmitted to TMLR; Action Editor assigned.
  • Dec 2024 BALANCER published in IEEE Access.
  • Nov 2024 DADA presented at the Annual Conference of the PHM Society 2024.
  • Nov 2022 Started my PhD at the University of Luxembourg, SnT.

Publications

ASCENSION: VAE-based Latent Space Class Expansion for Time-Series Data Augmentation (under review)

Transactions on Machine Learning Research (TMLR) 2026

D. Joubaud, M. Olekhnovitch, A. Bolling, E. Zotov, S. Kubler, M. Cordy, et al.

ASCENSION augments data in the VAE latent space via class-conditional contrastive expansion with an α-scaling control. Evaluated across the 102 UCR datasets, the framework consistently improves downstream classification accuracy over standard augmentation baselines.

CycleGAN-based Data Augmentation for Enhanced Remaining Useful Life Prediction Under Unsupervised Domain Adaptation

Annual Conference of the PHM Society 2024

D. Joubaud, E. Zotov, O. Bektaş, S. Kubler, Y. Le Traon

A Wasserstein CycleGAN with gradient penalty (W-CycleGAN-GP) generates synthetic samples in an unlabeled target domain from labeled source-domain samples. Combined with adversarial and correlation-alignment domain-adaptation losses, the augmented data improves remaining-useful-life prediction on NASA's C-MAPSS turbofan benchmark.

Decision Support Model for Time Series Data Augmentation Method Selection

IEEE Access 2024

D. Joubaud, S. Kubler, R. Lourenção, M. Cordy, Y. Le Traon

BALANCER is a machine-learning framework that recommends the most effective augmentation technique for an imbalanced time-series classification task. It is validated in an empirical study of 720 datasets, and its recommendations are explained with SHAP-based interpretability.

Talks & Events

Nov 2024 DADA: CycleGAN-based Data Augmentation for RUL under UDA. Annual Conference of the PHM Society 2024, conference paper presentation.

Teaching & Supervisions

2023 Introduction to Machine Learning, Space Master, University of Luxembourg. Practical sessions covering synthetic image generation (Hubble galaxies and nebulae) and pulsar classification under class imbalance.
2024 Internship supervision: Matthieu Olekhnovitch. Latent-space class expansion for time-series data augmentation. Co-author on ASCENSION. Awarded the Louis-Édouard Rivot Medal by the Académie des sciences (Institut de France) for Best Computer Science Research Internship at École Polytechnique.
2023 Internship supervision: Quentin Lao. Data augmentation techniques for time-series classification (École Polytechnique).

If you're a student interested in time series, augmentation, or synthetic data generation, feel free to reach out.

Education

2022 – PhD in Machine Learning, University of Luxembourg (SnT). Data augmentation for time series. Supervised by Prof. Yves Le Traon and Dr. Sylvain Kubler (SerVal group).
2021 – 22 M.Sc. TRIED, Institut Polytechnique de Paris. Data science and big-data track at Paris-Saclay (Telecom SudParis · ENSIIE · UVSQ · CNAM).
2019 – 22 Engineering Diploma, ENSIIE. Specialisation in data science and applied machine learning.