Yu-Ying Yeh

Hi. I'm Yu-Ying Yeh (葉鈺濙).


About Me

I am a Ph.D. student at UC San Diego and a member of the Center for Visual Computing. My research focuses on deep learning and computer vision. I am currently working on projects related to inverse rendering, such as shape reconstruction and material enhancement. My goal is to leverage computer vision and graphics techniques to enable realistic content creation for AR/VR applications. I am also interested in representation learning, feature disentanglement, and video prediction, and have worked on generative models for videos as well as domain adaptation.


University of California San Diego

Ph.D. student in Computer Science and Engineering

National Tsing Hua University

Non-degree, Computer Science

National Chiao Tung University

Non-degree, Computer Science

National Taiwan University

B.S. in Physics & B.A. in Economics

Work Experience

Research Internship, Adobe Inc.

Adobe Research

Graduate Student Researcher, University of California, San Diego

Center for Visual Computing

Supervised by Prof. Manmohan Chandraker

Research Assistant, National Taiwan University

Vision and Learning Lab

Supervised by Prof. Yu-Chiang Frank Wang

Research Assistant, Academia Sinica

Multimedia and Machine Learning Lab

Supervised by Dr. Yu-Chiang Frank Wang


Here are my recent research projects.

Through the Looking Glass: Neural 3D Reconstruction of Transparent Shapes

Yu-Ying Yeh*, Zhengqin Li*, Manmohan Chandraker (*indicates equal contributions)
(CVPR 2020 Oral Presentation)

Full paper: [ArXiv] / [Project Page] / Code and Dataset: Coming Soon.

Recovering the 3D shape of transparent objects from a small number of unconstrained natural images is an ill-posed problem. Complex light paths induced by refraction and reflection have prevented both traditional and deep multiview stereo from solving this challenge. We propose a physically-based network to recover the 3D shape of transparent objects using a few images acquired with a mobile phone camera, under a known but arbitrary environment map. Our novel contributions include a normal representation that enables the network to model complex light transport through local computation, a rendering layer that models refractions and reflections, a cost volume specifically designed for normal refinement of transparent shapes, and a feature mapping based on predicted normals for 3D point cloud reconstruction. We render a synthetic dataset to encourage the model to learn refractive light transport across different views. Our experiments show successful recovery of high-quality 3D geometry for complex transparent shapes using as few as 5-12 natural images.
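The rendering layer above models refraction and reflection at the transparent surface. As general background (this is standard Snell's-law geometry, not the paper's implementation), the refracted direction at an interface can be sketched in NumPy as follows:

```python
import numpy as np

def refract(incident, normal, eta):
    """Refract a unit incident direction through a surface with unit outward
    normal, for relative index of refraction eta = n1 / n2 (Snell's law).
    Returns None on total internal reflection."""
    cos_i = -np.dot(incident, normal)
    sin2_t = eta**2 * (1.0 - cos_i**2)
    if sin2_t > 1.0:
        return None  # total internal reflection: no transmitted ray
    cos_t = np.sqrt(1.0 - sin2_t)
    return eta * incident + (eta * cos_i - cos_t) * normal

# Example: a ray entering glass (n = 1.5) from air at 45 degrees incidence
d = np.array([np.sin(np.pi / 4), -np.cos(np.pi / 4), 0.0])  # unit incident ray
n = np.array([0.0, 1.0, 0.0])                               # unit surface normal
t = refract(d, n, 1.0 / 1.5)                                # bends toward the normal
```

The returned direction satisfies Snell's law (its tangential component equals sin(45°)/1.5), and the same formula applied with eta > 1 reproduces total internal reflection beyond the critical angle.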

Static2Dynamic: Video Inference from a Deep Glimpse

Yu-Ying Yeh, Yen-Cheng Liu, Wei-Chen Chiu, Yu-Chiang Frank Wang
(IEEE Transactions on Emerging Topics in Computational Intelligence)

Full paper: [PDF]

In this paper, we address the novel and challenging task of video inference, which aims to infer video sequences from given non-consecutive video frames. Taking such frames as anchor inputs, our focus is to recover possible video sequence outputs based on the observed anchor frames at the associated times. With the proposed Stochastic and Recurrent Conditional GAN (SR-cGAN), we are able to preserve visual content across video frames with the additional ability to handle possible temporal ambiguity. In the experiments, we show that our SR-cGAN not only produces preferable video inference results, but can also be applied to the related tasks of video generation, video interpolation, video inpainting, and video prediction.

A Unified Feature Disentangler for Multi-Domain Image Translation and Manipulation

Alexander H. Liu, Yen-Cheng Liu, Yu-Ying Yeh, Yu-Chiang Frank Wang (NeurIPS 2018)

Full paper: [PDF] / Code: [Github]

We present a novel and unified deep learning framework which is capable of learning domain-invariant representations from data across multiple domains. Realized by adversarial training with the additional ability to exploit domain-specific information, the proposed network is able to perform continuous cross-domain image translation and manipulation, and produces desirable output images accordingly. In addition, the resulting feature representation exhibits superior performance in unsupervised domain adaptation, which also verifies the effectiveness of the proposed model in learning disentangled features for describing cross-domain data.

Detach and Adapt: Learning Cross-Domain Disentangled Deep Representation

Yen-Cheng Liu, Yu-Ying Yeh, Tzu-Chien Fu, Wei-Chen Chiu, Sheng-De Wang, Yu-Chiang Frank Wang (CVPR 2018 Spotlight)

Full paper: [PDF] / Code: [Github] / Presentation Video: [CVPR]

While representation learning aims to derive interpretable features for describing visual data, representation disentanglement further results in such features so that particular image attributes can be identified and manipulated. However, one cannot easily address this task without observing ground truth annotation for the training data. To address this problem, we propose a novel deep learning model of Cross-Domain Representation Disentangler (CDRD). By observing fully annotated source-domain data and unlabeled target-domain data of interest, our model bridges the information across data domains and transfers the attribute information accordingly. Thus, cross-domain joint feature disentanglement and adaptation can be jointly performed. In the experiments, we provide qualitative results to verify our disentanglement capability. Moreover, we further confirm that our model can be applied for solving classification tasks of unsupervised domain adaptation, and performs favorably against state-of-the-art image disentanglement and translation methods.

Adaptation and Re-Identification Network: An Unsupervised Deep Transfer Learning Approach to Person Re-Identification

Yu-Jhe Li, Fu-En Yang, Yen-Cheng Liu, Yu-Ying Yeh, Xiaofei Du, Yu-Chiang Frank Wang (CVPR 2018 workshop)

Full paper: [ArXiv] / Code: To be updated soon.

Person re-identification (Re-ID) aims at recognizing the same person from images taken across different cameras. To address this task, one typically requires a large amount of labeled data for training an effective Re-ID model, which might not be practical for real-world applications. To alleviate this limitation, we choose to exploit a sufficient amount of pre-existing labeled data from a different (auxiliary) dataset. By jointly considering such an auxiliary dataset and the dataset of interest (but without label information), our proposed adaptation and re-identification network (ARN) performs unsupervised domain adaptation, which leverages information across datasets and derives domain-invariant features for Re-ID purposes. In our experiments, we verify that our network performs favorably against state-of-the-art unsupervised Re-ID approaches, and even outperforms a number of baseline Re-ID methods which require fully supervised data for training.



Isospectralization

PyTorch implementation of "Isospectralization, or How to Hear Shape, Style, and Correspondence," IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.

Recurrent Denoising Autoencoder

TensorFlow implementation of "Interactive Reconstruction of Monte Carlo Image Sequences using a Recurrent Denoising Autoencoder," SIGGRAPH 2017.

Generative Models

TensorFlow implementations of the Variational Autoencoder (VAE) and Generative Adversarial Networks (GAN).
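Independent of the repository above, the two ideas at the heart of a VAE can be sketched in a few lines: the reparameterization trick, which makes sampling differentiable, and the negative ELBO objective (reconstruction error plus a KL term). This is a generic NumPy illustration, not the repository's code; all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var):
    """Sample z = mu + sigma * eps with eps ~ N(0, I), so the draw is a
    deterministic function of (mu, log_var) plus external noise and
    gradients can flow through the encoder outputs."""
    eps = rng.standard_normal(np.shape(mu))
    return mu + np.exp(0.5 * log_var) * eps

def vae_loss(x, x_recon, mu, log_var):
    """Negative ELBO: squared-error reconstruction term (Gaussian decoder)
    plus the closed-form KL divergence KL(q(z|x) || N(0, I))."""
    recon = np.sum((x - x_recon) ** 2)
    kl = -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var))
    return recon + kl
```

With a perfect reconstruction and a posterior equal to the standard normal prior (mu = 0, log_var = 0), the loss is exactly zero; a GAN replaces this likelihood-based objective with a discriminator-driven adversarial one.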