

Cognitive Science

Yan Wang (postdoc, contact person), Hongru Zhu, Weichao Qiu, Chenxi Liu, Qing Liu, Huiyu Wang

Deep nets do very well on specific types of visual tasks and on specific benchmark datasets. For cognitive science, Deep Nets offer the possibility of developing computational theories that can be tested on natural, or realistically synthetic, images. Many topics are covered in this project, including but not limited to comparing the performance of various Deep Net models with humans (or primates), designing computational algorithms that exhibit the robustness of biological vision [1,2], vision-and-text analogy, and analysis by synthesis [3]. We are collaborating with other groups at JHU and MIT.

[1] Jianyu Wang, Cihang Xie, Zhishuai Zhang, Jun Zhu, Lingxi Xie, Alan Yuille, Detecting Semantic Parts on Partially Occluded Objects, BMVC 2017
[2] Boyang Deng, Qing Liu, Siyuan Qiao, Alan Yuille, Few-shot Learning by Exploiting Visual Concepts within CNNs
[3] Alan Yuille, Daniel Kersten, Vision as Bayesian Inference: Analysis by Synthesis? Trends in Cognitive Sciences 2006

Deep Networks and Beyond

Wei Shen, Siyuan Qiao (contact person), Lingxi Xie, Chenxi Liu, Zhuotun Zhu, Zhishuai Zhang, Cihang Xie, Huiyu Wang, Qing Liu, Yan Wang (visiting Ph.D. student), Yan Zheng

This project includes our research on deep neural networks and beyond. The topics are broad, including but not limited to neural architecture search [1], visual concepts [2], adversarial examples and defense [3], neural architecture design [4], object detection, deep forest, and few-shot and large-scale image recognition. The goal of the project is to develop interpretable and effective algorithms and systems for various computer vision tasks.

[1] Lingxi Xie, Alan Yuille, Genetic CNN, ICCV 2017
[2] Jianyu Wang, Zhishuai Zhang, Cihang Xie, Vittal Premachandran, Alan Yuille, Unsupervised learning of object semantic parts from internal states of CNNs by population encoding
[3] Cihang Xie, Jianyu Wang, Zhishuai Zhang, Yuyin Zhou, Lingxi Xie, Alan Yuille, Adversarial Examples for Semantic Segmentation and Object Detection, ICCV 2017
[4] Yan Wang, Lingxi Xie, Chenxi Liu, Siyuan Qiao, Ya Zhang, Wenjun Zhang, Qi Tian, Alan Yuille, SORT: Second-Order Response Transform for Visual Recognition, ICCV 2017
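To make the adversarial-examples topic concrete, here is a minimal sketch of the classic single-step fast gradient sign method (FGSM) on a toy linear model. This is the standard attack from the literature, not the dense segmentation/detection attack of [3], and the weights and numbers are purely illustrative.

```python
import numpy as np

def fgsm_perturb(x, grad, eps):
    """Fast Gradient Sign Method step: move every input dimension by
    eps in the direction that increases the loss (sign of the gradient)."""
    return x + eps * np.sign(grad)

# Toy linear classifier: score = w . x for the true class, so the
# loss gradient w.r.t. x is simply -w.
w = np.array([0.5, -1.0, 2.0])
x = np.array([1.0, 1.0, 1.0])
grad = -w

x_adv = fgsm_perturb(x, grad, eps=0.1)
print(x_adv)  # [0.9 1.1 0.9] -- each coordinate moves eps against the score
```

The same one-line update, applied with gradients from a deep network rather than a linear model, is the usual starting point for studying both attacks and defenses.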

Synthetic World

Weichao Qiu (contact person), Yi Zhang, Siyuan Qiao, Zihao Xiao

The advancement of computer graphics and VR provides numerous opportunities for AI researchers. In this project, we are building realistic virtual worlds using computer graphics and generative models in order to develop and evaluate computer vision models. Our work includes developing software infrastructure [1], training computer vision models on synthetic images [2], stress-testing vision algorithms [3,4], and domain adaptation. This project is supported by DIVA and Visual Cortex On Silicon.

[1] Weichao Qiu, Alan Yuille, UnrealCV: Connecting Computer Vision to Unreal Engine, ECCV Workshop VARVAI 2016
[2] Siyuan Qiao, Wei Shen, Weichao Qiu, Chenxi Liu, and Alan Yuille. ScaleNet: Guiding Object Proposal Generation in Supermarkets and Beyond, ICCV 2017
[3] Yi Zhang, Weichao Qiu, Qi Chen, Xiaolin Hu, and Alan Yuille. UnrealStereo: A Synthetic Dataset for Analyzing Stereo Vision, arXiv preprint 2016
[4] Xiaohui Zeng, Chenxi Liu, Yu-Siang Wang, Weichao Qiu, Lingxi Xie, Yu-Wing Tai, Chi Keung Tang, Alan Yuille. Adversarial Attacks Beyond the Image Space, arXiv preprint 2017
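One thing a synthetic world provides for free is exact ground truth. For instance, when stress-testing stereo algorithms as in [3], per-pixel ground-truth disparity can be derived directly from the renderer's depth buffer via the standard rectified pinhole-stereo relation. A minimal sketch, with hypothetical camera parameters:

```python
import numpy as np

def depth_to_disparity(depth, focal_px, baseline_m):
    """Ground-truth disparity (in pixels) for a rectified stereo pair:
    disparity = focal_length * baseline / depth."""
    return focal_px * baseline_m / depth

# Hypothetical camera: 320 px focal length, 10 cm baseline.
depth = np.array([[1.0, 2.0],
                  [4.0, 8.0]])  # per-pixel depth in metres
disp = depth_to_disparity(depth, focal_px=320.0, baseline_m=0.1)
print(disp)  # [[32. 16.]
             #  [ 8.  4.]]
```

Because the depth buffer is exact, the resulting disparity maps are noise-free, which is what makes controlled stress tests of stereo matching possible.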

Medical Imaging Analysis

Seyoun Park, Wei Shen, Lingxi Xie (contact person), Yan Wang (postdoc), Yuyin Zhou, Yan Wang (visiting Ph.D. student), Zhuotun Zhu, Yingda Xia, Fengze Liu, Qihang Yu

The medical imaging analysis project is mainly supported by the FELIX project, long-term funding from the Lustgarten Foundation aimed at detecting pancreatic neoplasms using deep learning techniques. The overall approach is to ask professional radiologists to annotate medical data (such as CT scans), and to train deep networks to learn from these annotations. In the first year, we mainly worked on segmenting normal pancreases from abdominal CT scans, achieving state-of-the-art accuracy [1,3,4] on a public dataset. We also carried out preliminary studies on detecting pancreatic cysts [2]. In the current (second) year, we are moving on to abnormal pancreases, in particular the most common pancreatic cancer, pancreatic ductal adenocarcinoma (PDAC). On the fundamental side, we are also interested in the advantages and disadvantages of 2D versus 3D segmentation approaches.

[1] Qihang Yu, Lingxi Xie, Yan Wang, Yuyin Zhou, Elliot K. Fishman, Alan L. Yuille, Recurrent Saliency Transformation Network: Incorporating Multi-Stage Visual Cues for Small Organ Segmentation, arXiv preprint arXiv:1709.04518, 2017 (submitted to CVPR 2018).
[2] Yuyin Zhou, Lingxi Xie, Elliot K. Fishman, Alan L. Yuille, Deep Supervision for Pancreatic Cyst Segmentation in Abdominal CT Scans, MICCAI 2017.
[3] Yuyin Zhou, Lingxi Xie, Wei Shen, Yan Wang, Elliot K. Fishman, Alan L. Yuille, A Fixed-Point Model for Pancreas Segmentation in Abdominal CT Scans, MICCAI 2017 (project page).
[4] Zhuotun Zhu, Yingda Xia, Wei Shen, Elliot K. Fishman, Alan L. Yuille, A 3D Coarse-to-Fine Framework for Automatic Pancreas Segmentation, arXiv preprint arXiv:1712.00201, 2017 (submitted to CVPR 2018).
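The coarse-to-fine fixed-point idea behind [1,3] can be sketched schematically: segment, crop a bounding region around the current prediction, re-segment on the crop, and stop when the mask no longer changes (a fixed point). In the sketch below a simple threshold stands in for the CNN segmenter, and all helper names are ours, not the papers':

```python
import numpy as np

def bounding_box(mask, margin=2):
    """Smallest box (plus margin, clipped to the image) containing all
    positive pixels of a binary mask."""
    ys, xs = np.where(mask)
    y0, y1 = max(ys.min() - margin, 0), min(ys.max() + margin + 1, mask.shape[0])
    x0, x1 = max(xs.min() - margin, 0), min(xs.max() + margin + 1, mask.shape[1])
    return y0, y1, x0, x1

def fixed_point_segment(image, segment_fn, init_mask, max_iter=10):
    """Iterate: crop around the current mask, re-run the segmenter on
    the crop, stop when the mask stabilizes."""
    mask = init_mask
    for _ in range(max_iter):
        y0, y1, x0, x1 = bounding_box(mask)
        new_mask = np.zeros_like(mask)
        new_mask[y0:y1, x0:x1] = segment_fn(image[y0:y1, x0:x1])
        if np.array_equal(new_mask, mask):
            break  # reached a fixed point
        mask = new_mask
    return mask

segment = lambda patch: patch > 0.5        # stand-in for the CNN
image = np.zeros((16, 16)); image[5:9, 6:10] = 1.0
coarse = np.ones((16, 16), dtype=bool)     # coarse stage: whole image
final = fixed_point_segment(image, segment, coarse)
print(final.sum())  # 16 -> the 4x4 bright region
```

The point of the iteration is that cropping to the organ's neighbourhood removes most of the confusing background, which is what makes small-organ segmentation tractable.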

Image Understanding

Wei Shen, Chenxi Liu (contact person), Qi Chen, Peng Tang, Zhe Ren, Yuma Matsuoka

This project covers our efforts toward understanding the content and meaning of static images. A wide range of topics is covered, including but not limited to semantic segmentation (assigning a semantic category to every pixel in an image) [1], image captioning (generating a sentence describing the entire image or a specific region) [2], edge detection, and depth estimation. In doing so, we also collect large-scale datasets [3] with detailed annotations to facilitate computer vision research. We mainly study mid-level to high-level computer vision problems, with possible connections and extensions to natural language understanding. We build models with structure, 3D, and interpretability in mind, and test them on challenging real-world images. Our long-term goal is holistic, human-like understanding of objects and scenes.

[1] Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, Alan L. Yuille, DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs, PAMI 2017.
[2] Junhua Mao, Wei Xu, Yi Yang, Jiang Wang, Zhiheng Huang, Alan L. Yuille, Deep captioning with multimodal recurrent neural networks (m-rnn), ICLR 2015.
[3] Roozbeh Mottaghi, Xianjie Chen, Xiaobai Liu, Nam-Gyu Cho, Seong-Whan Lee, Sanja Fidler, Raquel Urtasun, Alan L. Yuille, The role of context for object detection and semantic segmentation in the wild, CVPR 2014.
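Atrous convolution, the core operation in DeepLab [1], spaces the kernel taps `rate` samples apart, enlarging the receptive field without adding parameters or reducing resolution. A minimal 1-D NumPy sketch of the idea (illustrative only, not the DeepLab implementation):

```python
import numpy as np

def atrous_conv1d(x, kernel, rate):
    """1-D atrous (dilated) convolution with 'valid' output: kernel
    taps are spaced `rate` samples apart, so a k-tap kernel covers an
    effective span of (k - 1) * rate + 1 input samples."""
    k = len(kernel)
    span = (k - 1) * rate + 1
    n_out = len(x) - span + 1
    return np.array([
        sum(kernel[j] * x[i + j * rate] for j in range(k))
        for i in range(n_out)
    ])

x = np.arange(8, dtype=float)  # [0, 1, ..., 7]
print(atrous_conv1d(x, [1.0, 1.0, 1.0], rate=1))  # [ 3.  6.  9. 12. 15. 18.]
print(atrous_conv1d(x, [1.0, 1.0, 1.0], rate=2))  # [ 6.  9. 12. 15.]
```

With rate=1 this reduces to ordinary convolution; with rate=2 the same three taps see a span of five samples, which is how DeepLab keeps dense feature maps while growing context.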

Human Pose Estimation

Chenxu Luo (contact person), Zihao Xiao, Yi Zhang

This project is about understanding humans in images and videos. Currently, we are working on both 2D and 3D human pose estimation. We have developed models for single-person pose estimation [1], joint human part segmentation and pose estimation [2], and 3D human pose estimation from monocular images [3].

[1] Xiao Chu*, Wei Yang*, Wanli Ouyang, Cheng Ma, Alan L. Yuille, Xiaogang Wang. Multi-Context Attention for Human Pose Estimation. CVPR 2017.
[2] Fangting Xia, Peng Wang, Xianjie Chen, Alan L. Yuille. Joint Multi-Person Pose Estimation and Semantic Part Segmentation. CVPR 2017.
[3] Chenxu Luo, Xiao Chu, Alan L. Yuille. OriNet for 3D Human Pose Estimation.