Best use cases of t-SNE 2023, Part 2 (Machine Learning)


Author : Okan Düzyel

Abstract : The quality of GAN-generated images on the MNIST dataset was explored in this paper by comparing them to the original images using t-distributed stochastic neighbor embedding (t-SNE) visualization. A GAN was trained on the dataset to generate images, and after all synthetic images were generated, they were saved together with their corresponding labels. The dimensionality of the generated images and of the original MNIST dataset was reduced using t-SNE, and the resulting embeddings were plotted. The quality of the GAN-generated images was then assessed by comparing the t-SNE plots of the generated images and the original MNIST images. It was found that the GAN-generated images were similar to the originals but differed somewhat in the distribution of their features. This study is believed to provide a useful evaluation method for assessing the quality of GAN-generated images and may help to improve their generation in the future.
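
The comparison step described in the abstract is straightforward to reproduce. Below is a minimal sketch (not the author's code) of a joint t-SNE embedding of real and generated samples using scikit-learn; the arrays `real` and `fake` are hypothetical stand-ins for the flattened original MNIST digits and the saved GAN outputs.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Hypothetical stand-ins: in the paper, `real` would hold flattened 28x28
# MNIST digits and `fake` the GAN-generated samples.
real = rng.normal(loc=0.0, scale=1.0, size=(500, 784))
fake = rng.normal(loc=0.2, scale=1.1, size=(500, 784))

# Embed both sets jointly so their 2-D coordinates are directly comparable.
X = np.vstack([real, fake])
emb = TSNE(n_components=2, perplexity=30, init="pca",
           random_state=0).fit_transform(X)

plt.scatter(emb[:500, 0], emb[:500, 1], s=5, label="original MNIST")
plt.scatter(emb[500:, 0], emb[500:, 1], s=5, label="GAN-generated")
plt.legend()
plt.title("Joint t-SNE of real vs. generated samples")
plt.show()
```

Embedding both sets in a single t-SNE run matters: separate runs would produce incomparable coordinate systems, whereas a joint embedding makes overlap (or divergence) between the two distributions visible directly.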

2. Revised Conditional t-SNE: Looking Beyond the Nearest Neighbors (arXiv)

Author : Edith Heiter, Bo Kang, Ruth Seurinck, Jefrey Lijffijt

Abstract : Conditional t-SNE (ct-SNE) is a recent extension to t-SNE that allows removal of known cluster information from the embedding, to obtain a visualization revealing structure beyond label information. This is useful, for example, when one wants to factor out unwanted differences between a set of classes. We show that ct-SNE fails in many realistic settings, namely if the data is well clustered over the labels in the original high-dimensional space. We introduce a revised method by conditioning the high-dimensional similarities instead of the low-dimensional similarities and storing within- and across-label nearest neighbors separately. This also enables the use of recently proposed speedups for t-SNE, improving the scalability. From experiments on synthetic data, we find that our proposed method resolves the considered problems and improves the embedding quality. On real data containing batch effects, the expected improvement is not always there. We argue revised ct-SNE is preferable overall, given its improved scalability. The results also highlight new open questions, such as how to handle distance variations between clusters.
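
One concrete ingredient of the revised method is keeping within-label and across-label nearest neighbors in separate structures. The sketch below is an illustration of that idea, not the authors' implementation; the toy data, labels, and dictionary names are assumptions for the example.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 50))       # toy high-dimensional data
y = rng.integers(0, 3, size=300)     # known cluster labels

k = 10
within_nn, across_nn = {}, {}
for label in np.unique(y):
    same = np.flatnonzero(y == label)
    other = np.flatnonzero(y != label)

    # k nearest neighbors among points sharing the label
    # (query k+1 and drop index 0, which is the point itself).
    nn_same = NearestNeighbors(n_neighbors=k + 1).fit(X[same])
    _, idx = nn_same.kneighbors(X[same])
    for row, i in enumerate(same):
        within_nn[i] = same[idx[row, 1:]]

    # k nearest neighbors among points with a different label.
    nn_other = NearestNeighbors(n_neighbors=k).fit(X[other])
    _, idx = nn_other.kneighbors(X[same])
    for row, i in enumerate(same):
        across_nn[i] = other[idx[row]]
```

Splitting the neighbor sets this way is what, per the abstract, lets the revised ct-SNE reuse the approximate-nearest-neighbor speedups developed for plain t-SNE.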


