Learning factorised representation via generative models.

This thesis was published by Zezhen Zeng in August 2022 at the University of Southampton.

Abstract:

Deep learning has been widely used in real-life applications over the last few decades, such as face recognition, machine translation, object detection and classification. Representation learning is an important part of deep learning and can be understood, simply, as a method for dimensionality reduction. However, the representation learned by a task-specific model is difficult to apply to other tasks without parameter tuning, since it discards information irrelevant to the original task. Generative models, by contrast, learn a joint distribution over all variables, so their latent space can retain almost all of the information in the dataset rather than only task-specific information. Vanilla generative models, however, learn an entangled representation that cannot be used efficiently, so a factorised representation is needed in most cases. Focusing on images, this thesis proposes new methods for learning factorised representations.

The thesis starts by visually examining the quality of the representation learned by the backbone model, the Variational Autoencoder (VAE). The proposed tool alleviates the blurriness of the vanilla VAE by introducing a discriminator. The potential of the VAE for transfer learning is then explored: collecting data, especially labelled data, is expensive, and transfer learning is one way to address this. The results show that the VAE generalises strongly, producing reasonable results even without parameter tuning.

For factorised representation learning, the thesis proceeds from a shallow level to a deeper one. We propose a VAE-based model that learns a latent space factorising the foreground and background of images, where the foreground is defined in the experiments as the objects inside the given bounding-box labels. This factorised latent space allows the model to perform conditional generation, and the results achieve a state-of-the-art Fréchet inception distance (FID) score. We then investigate unsupervised object-centric representation learning, which can be seen as a deeper level of foreground representation. Observing that object regions tend to contain more information than the background in a multi-object scene, the model is designed to discover objects according to this difference. With the learned representation, better results are obtained on the downstream task compared to other related models.
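To make the first idea concrete, below is a minimal sketch of a VAE trained alongside a discriminator, in the spirit of discriminator-augmented VAEs (VAE-GAN). It is not the thesis's actual model: the fully connected architecture, dimensions, learning rates and the 0.1 adversarial weight are illustrative assumptions, and the batch of random tensors stands in for real image data.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class VAE(nn.Module):
        def __init__(self, x_dim=784, z_dim=32, h_dim=256):
            super().__init__()
            self.enc = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
            self.mu = nn.Linear(h_dim, z_dim)
            self.logvar = nn.Linear(h_dim, z_dim)
            self.dec = nn.Sequential(
                nn.Linear(z_dim, h_dim), nn.ReLU(),
                nn.Linear(h_dim, x_dim), nn.Sigmoid())

        def forward(self, x):
            h = self.enc(x)
            mu, logvar = self.mu(h), self.logvar(h)
            # Reparameterisation trick: sample z while keeping gradients.
            z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
            return self.dec(z), mu, logvar

    # Discriminator: scores whether an input looks like a real image
    # or a (typically blurry) VAE reconstruction.
    disc = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))

    vae = VAE()
    opt_vae = torch.optim.Adam(vae.parameters(), lr=2e-4)
    opt_disc = torch.optim.Adam(disc.parameters(), lr=2e-4)

    x = torch.rand(64, 784)  # stand-in batch of flattened images in [0, 1]

    # 1) Discriminator step: tell real images apart from reconstructions.
    x_rec, mu, logvar = vae(x)
    d_loss = (F.binary_cross_entropy_with_logits(disc(x), torch.ones(64, 1)) +
              F.binary_cross_entropy_with_logits(disc(x_rec.detach()), torch.zeros(64, 1)))
    opt_disc.zero_grad(); d_loss.backward(); opt_disc.step()

    # 2) VAE step: the usual ELBO terms plus an adversarial term that
    #    pushes reconstructions towards the real-image distribution,
    #    which is what counteracts the characteristic VAE blurriness.
    opt_vae.zero_grad()
    x_rec, mu, logvar = vae(x)
    recon = F.binary_cross_entropy(x_rec, x, reduction='sum') / x.size(0)
    kl = -0.5 * torch.mean(torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1))
    adv = F.binary_cross_entropy_with_logits(disc(x_rec), torch.ones(64, 1))
    (recon + kl + 0.1 * adv).backward()
    opt_vae.step()

The design point of the adversarial term is that a pixel-wise reconstruction loss alone averages over plausible outputs and so favours blur, whereas the discriminator penalises reconstructions that are distinguishable from real images.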


