2025 Summer Intern Project at the CCVL Lab

Lab Page

https://ccvl.jhu.edu/

Advisor Profiles

Professor Alan Yuille is a Bloomberg Distinguished Professor in the Department of Computer Science and the Department of Cognitive Science at Johns Hopkins. He has published many influential papers in computer vision, cognitive science, and related fields. He has won the ICCV Marr Award and is an IEEE Fellow.

Professor Tianmin Shu is an Assistant Professor in the Department of Computer Science at Johns Hopkins University. He also holds a joint appointment with the Department of Cognitive Science. His research goal is to advance human-centered AI by engineering machine social intelligence to build socially intelligent systems that can understand, reason about, and interact with humans in real-world settings. Tianmin approaches this from an interdisciplinary perspective, connecting machine learning, computer vision, robotics, and social cognition to study machine social intelligence.

Overall Information

We are seeking several summer research interns for 2025. If interested, please email Professor Alan Yuille (ayuille1@jhu.edu) with your resume attached. Interns will collaborate with Professor Alan Yuille, Professor Tianmin Shu, and their research teams at Johns Hopkins. The internship starts in May, and the duration is flexible (between 6 months and 1 year). Exceptional interns from previous years have published first-author papers at top conferences in computer vision or medical image processing, such as CVPR, ICLR, and MICCAI. Exceptional interns will be given priority in Ph.D. applications.

Research Directions

Our lab’s research focuses on computer vision and machine learning. Our research groups include:
  • 3D generative models
  • 3D datasets
  • Medical image analysis
  • Transformers
  • Vision and language
  • Embodied AI (mentored by Prof. Tianmin Shu)

Requirements

Applicants are expected to meet the requirements of at least one of the following groups. Please specify which group you are interested in when submitting your application. We strongly encourage you to read our group's related papers and pick up some preliminary knowledge from our publication list: https://ccvl.jhu.edu/publication/

The requirements for each group are as follows:
  • 3D generative models:
    • Basic skills in using Python, PyTorch, and other machine-learning libraries;
    • Understanding of recent 3D vision or reconstruction techniques, covering at least one of the following topics:
      • 3D from images (pose and shape, 3D detection)
      • Differentiable rendering (e.g., PyTorch3D, Gaussian Splatting)
      • Other 3D-related topics
    • Publications or submissions in related conferences and journals, e.g., CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR, TPAMI, IJCV, JMLR.
  • 3D datasets:
    • Basic skills in using Python, PyTorch, and other machine-learning libraries;
    • Basic skills in using 3D tools, e.g., Blender.
  • Medical image analysis:
    • Proficiency in computer vision and image analysis concepts;
    • Proficiency in Python programming to use prevalent frameworks (such as nnU-Net and MONAI);
    • Prior experience with the analysis of radiological image datasets for AI applications is preferred;
    • Relevant publications or submissions in conferences or journals (such as MICCAI, TMI, and MedIA) are preferred.
  • Transformers:
    • Basic skills in using Python, PyTorch, and other machine-learning libraries;
    • Basic mathematics foundations in related areas, e.g., statistical learning and optimization;
    • Knowledge of the basic concepts of the Transformer architecture;
    • Publications or submissions in related conferences and journals, e.g., CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR, TPAMI, IJCV, JMLR.
  • Vision and language:
    • Proficiency in using Python, PyTorch, and other machine-learning libraries;
    • Basic knowledge of common deep learning methods in image understanding, language modeling, and multimodal learning (e.g., CNN, LSTM, Transformer);
    • Understanding of the concepts of generative learning and the attention mechanism in transformers;
    • Hands-on experience with vision-language models or large language models (e.g., CLIP, GPT, LLAMA, BLIP, Flamingo, StableDiffusion…);
    • Publications or submissions in related conferences or journals, e.g., CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR, TPAMI, IJCV, TMLR.
  • Embodied AI (mentored by Prof. Tianmin Shu):
    • Basic skills in using Python, PyTorch, and other machine-learning libraries;
    • Experience or interest in the following topics:
      • Generative AI for developing embodied simulators with diverse and realistic human behaviors, including but not limited to synthesizing human-object interactions in household environments, human-vehicle interactions, and physically grounded social interactions.
      • Multimodal theory of mind reasoning for embodied agents.
      • Embodied human-AI cooperation and communication.

    Copyright © 2025 Johns Hopkins University