This project focuses on understanding the content and meaning of static images. It covers a wide range of topics, including semantic segmentation (assigning a semantic category to every pixel in an image), image captioning (generating a sentence describing the entire image or a specific region), edge detection, and depth estimation. In doing so, we also collect large-scale datasets with detailed annotations to facilitate computer vision research. We mainly study mid-level to high-level computer vision problems, with possible connections and extensions to natural language understanding. We build models with structure, 3D, and interpretability in mind, and test them on challenging real-world images. Our long-term goal is holistic, human-like understanding of objects and scenes.
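The per-pixel labeling that defines semantic segmentation can be sketched in a few lines. This is a toy illustration, not any specific model from the publications below: the array shapes, class names, and random scores are all hypothetical stand-ins for a real network's output.

```python
import numpy as np

# Hypothetical setup: a segmentation model produces per-pixel class scores
# of shape (H, W, num_classes); the predicted label map then assigns one
# semantic category to every pixel by taking the highest-scoring class.
H, W, num_classes = 4, 6, 3  # e.g. 0 = background, 1 = person, 2 = car
rng = np.random.default_rng(0)
scores = rng.random((H, W, num_classes))  # stand-in for network logits

# One class index per pixel: shape (H, W), values in {0, ..., num_classes-1}
label_map = scores.argmax(axis=-1)

assert label_map.shape == (H, W)
assert label_map.min() >= 0 and label_map.max() < num_classes
```

In a real system the scores would come from a deep network (as in DeepLab below) and would typically be refined, e.g. with a fully connected CRF, before the final per-pixel assignment.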
 Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, Alan L. Yuille, DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs, PAMI 2017.
 Junhua Mao, Wei Xu, Yi Yang, Jiang Wang, Zhiheng Huang, Alan L. Yuille, Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN), ICLR 2015.
 Roozbeh Mottaghi, Xianjie Chen, Xiaobai Liu, Nam-Gyu Cho, Seong-Whan Lee, Sanja Fidler, Raquel Urtasun, Alan L. Yuille, The Role of Context for Object Detection and Semantic Segmentation in the Wild, CVPR 2014.