
video diffusion model, multimodal model with <5 years of experience

Completed · 18 qualified · 1 run · Apr 22, 6:55 AM
Run: video-diffusion-model-multimodal-model-with-5-years-of-exper-1776840919
Parsed: 2 topics · Junior · Hybrid
    1. Generating seed nodes · 0 proposed
    2. Explored 0 queries · 0/0 done
    3. Expanding nodes · queued
    4. Qualifying candidates · queued

    Qualified Candidates (18)


    David Junhao Zhang

    high hireability

    Ph.D. student @ National University of Singapore

    Previously: Student Researcher @ Google

    US

    • Final-year PhD student at NUS Show Lab, core contributor to video diffusion and multimodal LLM research
    • Co-authored Show-1 (294 citations, pixel+latent T2V diffusion), Show-o (367 citations, unified multimodal understanding+generation transformer), and MotionDirector (165 citations, T2V motion customization); also curates Awesome-Video-Diffusion. h-index 19
    • Hireability: HIGH — explicitly listed as final-year PhD on website, prime transition window for industry roles

    Jinheng Xie

    high hireability

    PhD student @ National University of Singapore

    Previously: Intern @ Google Research

    Singapore, SG

    • Lead author of Show-o (ICLR 2025, 357 citations) — unified multimodal understanding+generation transformer, and Show-o2 (NeurIPS 2025)
    • Also has video+multimodal work ('Learning Video Context as Interleaved Multimodal Sequences', 2025) and diffusion model papers
    • PhD at NUS ShowLab focused on 'unifying multimodal understanding and generation'
    • Hireability: HIGH — final-year PhD (graduation expected 2025-2026), currently only a Google internship not a full-time role, prime transition window

    Liangbin Xie

    high hireability

    Ph.D. Student @ University of Macau

    Previously: MS Student @ Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences

    Montreal, CA

    • Strong video diffusion + multimodal fit: first author on Image Conductor (AAAI 2025, video synthesis), Videopainter (2025, video inpainting/editing), SmartEdit (CVPR 2024 Highlight, multimodal MLLM-based image editing), and T2I-Adapter (1401 citations, controllable generation)
    • 3rd-year PhD at University of Macau / SIAT, with internships at Adobe Research and Kuaishou KLING
    • Well within <5 years experience
    • Hireability: HIGH — website explicitly states 'I am looking for full-time positions after summer 2026'; cv_update + new papers added Mar–Jun 2025

    Anil Kag

    medium hireability

    Research Scientist @ Snap

    Previously: Research Assistant @ Boston University

    Los Angeles, US

    • Senior Research Scientist leading the Efficient Generative AI team at Snap; co-authored Snap Video (text-to-video, 102 citations, 2024), SF-V (single-forward video generation, 2024), and DenseDPO (video diffusion preference optimization, 2025) — an exact video diffusion match
    • Note: overall ML career 10+ years, but specialization in video/multimodal diffusion started at Snap in 2023 (~3 years)
    • Hireability: MEDIUM — ~3 years into current role at Snap, no open-to-work signals on GitHub/LinkedIn/website (pipeline shows no changes), but within typical transition window

    Bohan Zeng

    medium hireability

    PhD student @ Peking University

    • Strong video diffusion + multimodal AI researcher with NeurIPS 2022 (FNeVR face animation), AAAI 2024 (Controllable Mind Visual Diffusion Model), ICLR 2025 (IPDreamer 3D generation), and 2025 papers on video diffusion (spatio-temporal zero-shot synthesis) and multimodal LLMs (Mavors, VersaVid-R1)
    • Actively contributing to OpenDCAI/OpenWorldLib (unified world models framework) as of April 2026. h-index 12
    • PhD student at Peking University, ~3-4 years in program, <5 years experience
    • Hireability: MEDIUM — active PhD student, no explicit open-to-work signals, but likely approaching final year based on 2022 first publication; prime transition window

    Dazhong Shen

    medium hireability

    Associate Professor @ Nanjing University of Aeronautics and Astronautics

    Previously: Researcher @ Shanghai Artificial Intelligence Laboratory

    Nanjing, CN

    • Strong diffusion model + multimodal researcher: video outpainting (Be-Your-Outpainter, 2024), MoVA mixture-of-vision-experts (80 citations), Phased Consistency Models
    • PhD USTC 2023, explicit expertise in Diffusion Models. ~3 years post-PhD
    • Hireability: MEDIUM — postdoc at NUAA since March 2025 (~13 months in, Jiangsu Postdoc Fellowship 2025), no explicit open-to-work signals but postdoc track typically has 2-3yr horizon

    Guocheng Gordon Qian

    medium hireability

    Research Scientist @ Snap

    Previously: Research Intern @ Snap

    San Francisco, US

    • Senior Research Scientist at Snap (since Dec 2023, ~1.5 years) directly working on video diffusion transformers and multimodal generation
    • Key papers: VD3D and AC3D (camera control in video diffusion transformers, 2024), 'I Think, Therefore I Diffuse' (multimodal in-context reasoning in diffusion models, 2025)
    • Also building VLM-diffusion models (5B–30B scale) and self-forcing for real-time streamable video at Snap
    • PhD KAUST 2023, ~2.5 years post-PhD — well within <5 years
    • Located in Palo Alto, CA
    • Hireability: MEDIUM — 1.5 years at current role (relatively new), no explicit open-to-work signal, but active website (48 updates, most recent Feb 2026) and personal site collaboration invite suggest engagement

    Hangjie Yuan

    medium hireability

    AI Research Scientist @ Alibaba DAMO Academy

    Previously: Research Intern @ Alibaba DAMO Academy

    Hangzhou, CN

    • Core video diffusion researcher at Alibaba DAMO — lead/co-lead on VideoComposer (453 citations, NeurIPS 2023), ModelScopeT2V (610 citations), I2VGen-XL (299 citations), VGen, Lumos-1 (ICLR 2026), and UniLumos (NeurIPS 2025)
    • Strong multimodal work via RLIP/RLIPv2 series (relational language-image pre-training)
    • PhD from ZJU, graduated Summer 2024; h-index 18
    • Hireability: MEDIUM — ~1.5-2 years post-PhD at DAMO, actively publishing (ICLR 2026 accepted), no explicit open-to-work signal but within the transition window

    Han Shi

    medium hireability

    Software Programmer @ Alibaba

    Previously: Software Engineer @ Huawei

    Beijing, CN

    • Strong diffusion model expertise (DiffFit 104 cites, DiM: Diffusion Mamba 63 cites) and directly relevant multimodal LLM work (Visual Token Grouping for efficient MLLMs, 2025 CoT RL for multimodal LLMs)
    • No explicit video diffusion papers, but image diffusion + multimodal background is highly transferable
    • Principal Researcher at Huawei Noah's Ark Lab, Hong Kong; ~4 years post-PhD (HKUST, Twitter handle 'HanShi96' consistent with ~2022 graduation)
    • Hireability: MEDIUM — stable senior Huawei role, no job-seeking signals, website/GitHub dormant since Dec 2024, positions self as recruiting interns rather than seeking work

    Hanxue Liang

    medium hireability

    PhD student @ University of Cambridge

    Previously: Research Scientist Intern @ NVIDIA

    GB

    • Direct match: first-authored/co-authored Diffusion4D (64 citations, 2024) — a video diffusion model framework for 4D content generation, plus L4GM (82 citations), Comp4D (LLM-guided 4D scene generation), and Feed-Forward Bullet-Time Reconstruction of dynamic scenes
    • Research expertise: 3D vision, neural representation learning, 4D reconstruction
    • H-index 14 as a Cambridge PhD student
    • Hireability: MEDIUM — PhD student at Cambridge, actively publishing through Oct 2025, no explicit job-seeking signals. Publication history suggests ~4-5 years into PhD (papers since 2021), likely nearing completion, prime transition window

    Junli Cao

    medium hireability

    Research Engineer @ Snap

    Previously: Machine Learning Engineer @ Snap

    Los Angeles, US

    • Strong video diffusion researcher at Snap with directly relevant work: '4Real: Photorealistic 4D Scene Generation via Video Diffusion Models' (CVPR 2024, 48 cites), 'SF-V: Single Forward Video Generation Model' (2024, 20 cites), and 2025 paper on physical understanding in video generation. h-index 10, active through 2025
    • PhD student at UCLA CS (started ~2019)
    • Hireability: MEDIUM — likely late-stage PhD (~5-6 years in program as of 2026), no explicit job market signals but within typical completion/transition window; currently at Snap as Research Engineer

    Lingmin Ran

    medium hireability

    PhD student @ National University of Singapore

    Singapore, SG

    • Core video diffusion researcher at Show Lab NUS (advisor: Mike Zheng Shou)
    • Authored Show-1 (text-to-video, pixel+latent diffusion hybrid), TPDiff (Temporal Pyramid Video Diffusion, ICLR 2026), X-Adapter (CVPR 2024, diffusion plugin compatibility), and EvolveDirector (NeurIPS 2024, VLM-guided text-to-image) — directly on-target for video diffusion + multimodal
    • PhD started 2023, <5 years experience
    • Hireability: MEDIUM — year 3 of 4-year PhD program (2023-2027), still ~1 year from typical graduation window, but active and productive

    Alexander William Bergman

    low hireability

    Principal Researcher @ Hedra

    Previously: PhD student @ Stanford University

    • CTO/Co-founder of Hedra Labs (expressive portrait video generation), now founding Rhoda AI
    • Papers include 'Phased Consistency Models' (NeurIPS 2024, text-to-video diffusion acceleration) and 'Real-time One-Step Diffusion-based Expressive Portrait Videos Generation' (2024)
    • Stanford PhD
    • Core video diffusion expertise directly matching query
    • Hireability: LOW — serial founder actively building new startup (Rhoda AI) after leaving Hedra as co-founder; unlikely to leave own company

    Chaoyang Wang

    low hireability

    Research Scientist @ Snap

    Previously: Graduate Research Assistant @ Carnegie Mellon University

    Los Angeles, US

    • PhD from CMU, 2023 (~3 years post-PhD)
    • Led video diffusion research at Snap Creative Vision Team — VD3D (ICLR 2025, camera-controllable video diffusion), 4Real-Video (CVPR 2025 Highlight, 4D video diffusion), 4Real (4D scene generation via video diffusion)
    • Research expertise explicitly includes video generation and 4D generation; h-index 19
    • Now building world models for robotics at Tesla AI
    • Hireability: LOW — joined Tesla AI ~4 months ago (Dec 2025 website update), no signals of active job searching

    Chieh Hubert Lin

    low hireability

    Research Scientist @ Stealth AI Startup

    Previously: Intern @ RealityLabs - Surreal @ Meta

    US

    • Strong video diffusion researcher — published 'Motion-Conditioned Diffusion Model for Controllable Video Synthesis' (2023, 89 citations) and 'Taming Latent Diffusion Model for NeRF Inpainting' (2024); PhD at UC Merced completed May 2025
    • Now Research Scientist at Stealth AI Startup
    • Meets <5 years experience threshold
    • Hireability: LOW — recently started at the stealth startup (<1 year in role per pipeline signal), consistent with the typical low-mobility window for new hires

    Dongyang Liu

    low hireability

    Assistant Professor @ The Chinese University of Hong Kong, Shenzhen

    Previously: Research Assistant Professor @ The Chinese University of Hong Kong

    Shenzhen, CN

    • Direct hit for both query dimensions: video diffusion (VEnhancer, Lumina-Video) and multimodal models (LLaMA-Adapter 962 citations ICLR 2024, SPHINX 315 citations, Lumina-T2X)
    • H-index 11 with core contributions to the Lumina series (video+image generation via flow-based diffusion transformers) and VEnhancer (video enhancement via diffusion). 2nd-year PhD student at MMLab/CUHK since 2024.09, supervised by Hongsheng Li; prior Master's at ICT/CAS (2021-2024) where most of this work was done
    • Hireability: LOW — only ~1.5 years into PhD (started Sept 2024), typically 3+ years before graduation window

    Jay Zhangjie Wu

    low hireability

    Research Scientist @ NVIDIA

    Previously: Research Scientist Intern @ NVIDIA

    CA

    • Core video diffusion researcher — first author of Tune-A-Video (1116 citations, ICCV 2023), Show-1 (294 citations), and MotionDirector (165 citations); contributed to NVIDIA Cosmos world foundation model (215 citations)
    • PhD NUS (Dec 2025), research focus squarely on generative models for images, videos, 3D and 4D — a near-perfect match for the video diffusion/multimodal query
    • Hireability: LOW — joined NVIDIA Spatial Intelligence Lab in Dec 2025, already promoted to Senior Research Scientist by March 2026 (~4.5 months in); no open-to-work signals detected

    Jia-Wei Liu

    low hireability

    Research Scientist @ Meta

    Previously: Research Scientist @ Meta

    San Francisco, US

    • Core video diffusion researcher: MagicAnimate (324 cites), Show-1 (288 cites), MotionDirector (165 cites) — textbook on-query for video diffusion
    • VideoLLM-online (96 cites) adds multimodal video depth; h-index 18, PhD NUS, now Research Scientist at Meta Superintelligence Lab (ex-FAIR) in Menlo Park
    • Hireability: LOW — recently settled at Meta with no job-seeking signals; internal move from FAIR to Superintelligence Lab is a career progression signal, not an exit signal

    Runs

    #1 · completed · 0 qualified / 0 found · Apr 22, 6:55 AM