
video diffusion model, multimodal model with <5 years of experience

Completed · 18 qualified · 1 run · Apr 22, 6:55 AM
Run: video-diffusion-model-multimodal-model-with-5-years-of-exper-1776840919
Parsed: 2 topics · Junior · Hybrid
    1. Generating seed nodes · 0 proposed
    2. Explored 0 queries · 0/0 done
    3. Expanding nodes · queued
    4. Qualifying candidates · queued

    Qualified Candidates (18)


    David Junhao Zhang

    high hireability

    Ph.D. student @ National University of Singapore

    Previously: Student Researcher @ Google

    US

    • Final-year PhD student at NUS Show Lab, core contributor to video diffusion and multimodal LLM research
    • Co-authored Show-1 (294 citations, pixel+latent T2V diffusion), Show-o (367 citations, unified multimodal understanding+generation transformer), and MotionDirector (165 citations, T2V motion customization); also curates Awesome-Video-Diffusion. h-index 19
    • Hireability: HIGH — explicitly listed as final-year PhD on website, prime transition window for industry roles

    Jinheng Xie

    high hireability

    PhD student @ National University of Singapore

    Previously: Intern @ Google Research

    Singapore, SG

    • Lead author of Show-o (ICLR 2025, 357 citations) — unified multimodal understanding+generation transformer, and Show-o2 (NeurIPS 2025)
    • Also has video+multimodal work ('Learning Video Context as Interleaved Multimodal Sequences', 2025) and diffusion model papers
    • PhD at NUS ShowLab focused on 'unifying multimodal understanding and generation'
    • Hireability: HIGH — final-year PhD (graduation expected 2025-2026), currently only a Google internship not a full-time role, prime transition window

    Liangbin Xie

    high hireability

    Ph.D. Student @ University of Macau

    Previously: MS Student @ Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences

    Montreal, CA

    • Strong video diffusion + multimodal fit: first author on Image Conductor (AAAI 2025, video synthesis), Videopainter (2025, video inpainting/editing), SmartEdit (CVPR 2024 Highlight, multimodal MLLM-based image editing), and T2I-Adapter (1401 citations, controllable generation)
    • 3rd-year PhD at University of Macau / SIAT, with internships at Adobe Research and Kuaishou KLING
    • Well within <5 years experience
    • Hireability: HIGH — website explicitly states 'I am looking for full-time positions after summer 2026'; cv_update + new papers added Mar–Jun 2025

    Anil Kag

    medium hireability

    Research Scientist @ Snap

    Previously: Research Assistant @ Boston University

    Los Angeles, US

    • Senior Research Scientist leading the Efficient Generative AI team at Snap; co-authored Snap Video (text-to-video, 102 citations, 2024), SF-V (single-forward video generation, 2024), and DenseDPO (video diffusion preference optimization, 2025) — an exact video diffusion match
    • Note: overall ML career 10+ years, but specialization in video/multimodal diffusion started at Snap in 2023 (~3 years)
    • Hireability: MEDIUM — ~3 years into current role at Snap, no open-to-work signals on GitHub/LinkedIn/website (pipeline shows no changes), but within typical transition window

    Bohan Zeng

    medium hireability

    PhD student @ Peking University

    • Strong video diffusion + multimodal AI researcher with NeurIPS 2022 (FNeVR face animation), AAAI 2024 (Controllable Mind Visual Diffusion Model), ICLR 2025 (IPDreamer 3D generation), and 2025 papers on video diffusion (spatio-temporal zero-shot synthesis) and multimodal LLMs (Mavors, VersaVid-R1)
    • Actively contributing to OpenDCAI/OpenWorldLib (unified world models framework) as of April 2026. h-index 12
    • PhD student at Peking University, ~3-4 years in program, <5 years experience
    • Hireability: MEDIUM — active PhD student, no explicit open-to-work signals, but likely approaching final year based on 2022 first publication; prime transition window

    Dazhong Shen

    medium hireability

    Associate Professor @ Nanjing University of Aeronautics and Astronautics

    Previously: Researcher @ Shanghai Artificial Intelligence Laboratory

    Nanjing, CN

    • Strong diffusion model + multimodal researcher: video outpainting (Be-Your-Outpainter, 2024), MoVA mixture-of-vision-experts (80 citations), Phased Consistency Models
    • PhD USTC 2023, explicit expertise in Diffusion Models. ~3 years post-PhD
    • Hireability: MEDIUM — postdoc at NUAA since March 2025 (~13 months in, Jiangsu Postdoc Fellowship 2025), no explicit open-to-work signals but postdoc track typically has 2-3yr horizon

    Guocheng Gordon Qian

    medium hireability

    Research Scientist @ Snap

    Previously: Research Intern @ Snap

    San Francisco, US

    • Senior Research Scientist at Snap (since Dec 2023, ~1.5 years) directly working on video diffusion transformers and multimodal generation
    • Key papers: VD3D and AC3D (camera control in video diffusion transformers, 2024), 'I Think, Therefore I Diffuse' (multimodal in-context reasoning in diffusion models, 2025)
    • Also building VLM-diffusion models (5B–30B scale) and self-forcing for real-time streamable video at Snap
    • PhD KAUST 2023, ~2.5 years post-PhD — well within <5 years
    • Located in Palo Alto, CA
    • Hireability: MEDIUM — 1.5 years at current role (relatively new), no explicit open-to-work signal, but active website (48 updates, most recent Feb 2026) and personal site collaboration invite suggest engagement

    Hangjie Yuan

    medium hireability

    AI Research Scientist @ Alibaba DAMO Academy

    Previously: Research Intern @ Alibaba DAMO Academy

    Hangzhou, CN

    • Core video diffusion researcher at Alibaba DAMO — lead/co-lead on VideoComposer (453 citations, NeurIPS 2023), ModelScopeT2V (610 citations), I2VGen-XL (299 citations), VGen, Lumos-1 (ICLR 2026), and UniLumos (NeurIPS 2025)
    • Strong multimodal work via RLIP/RLIPv2 series (relational language-image pre-training)
    • PhD from ZJU, graduated Summer 2024; h-index 18
    • Hireability: MEDIUM — ~1.5-2 years post-PhD at DAMO, actively publishing (ICLR 2026 accepted), no explicit open-to-work signal but within the transition window

    Han Shi

    medium hireability

    Software Programmer @ Alibaba

    Previously: Software Engineer @ Huawei

    Beijing, CN

    • Strong diffusion model expertise (DiffFit 104 cites, DiM: Diffusion Mamba 63 cites) and directly relevant multimodal LLM work (Visual Token Grouping for efficient MLLMs, 2025 CoT RL for multimodal LLMs)
    • No explicit video diffusion papers, but image diffusion + multimodal background is highly transferable
    • Principal Researcher at Huawei Noah's Ark Lab, Hong Kong; ~4 years post-PhD (HKUST, Twitter handle 'HanShi96' consistent with ~2022 graduation)
    • Hireability: MEDIUM — stable senior Huawei role, no job-seeking signals, website/GitHub dormant since Dec 2024, positions self as recruiting interns rather than seeking work

    Hanxue Liang

    medium hireability

    PhD student @ University of Cambridge

    Previously: Research Scientist Intern @ NVIDIA

    GB

    • Direct match: first-authored/co-authored Diffusion4D (64 citations, 2024) — a video diffusion model framework for 4D content generation, plus L4GM (82 citations), Comp4D (LLM-guided 4D scene generation), and Feed-Forward Bullet-Time Reconstruction of dynamic scenes
    • Research expertise: 3D vision, neural representation learning, 4D reconstruction
    • H-index 14 as a Cambridge PhD student
    • Hireability: MEDIUM — PhD student at Cambridge, actively publishing through Oct 2025, no explicit job-seeking signals. Publication history suggests ~4-5 years into PhD (papers since 2021), likely nearing completion, prime transition window

    Junli Cao

    medium hireability

    Research Engineer @ Snap

    Previously: Machine Learning Engineer @ Snap

    Los Angeles, US

    • Strong video diffusion researcher at Snap with directly relevant work: '4Real: Photorealistic 4D Scene Generation via Video Diffusion Models' (CVPR 2024, 48 cites), 'SF-V: Single Forward Video Generation Model' (2024, 20 cites), and 2025 paper on physical understanding in video generation. h-index 10, active through 2025
    • PhD student at UCLA CS (started ~2019)
    • Hireability: MEDIUM — likely late-stage PhD (~5-6 years in program as of 2026), no explicit job market signals but within typical completion/transition window; currently at Snap as Research Engineer

    Lingmin Ran

    medium hireability

    PhD student @ National University of Singapore

    Singapore, SG

    • Core video diffusion researcher at Show Lab NUS (advisor: Mike Zheng Shou)
    • Authored Show-1 (text-to-video, pixel+latent diffusion hybrid), TPDiff (Temporal Pyramid Video Diffusion, ICLR 2026), X-Adapter (CVPR 2024, diffusion plugin compatibility), and EvolveDirector (NeurIPS 2024, VLM-guided text-to-image) — directly on-target for video diffusion + multimodal
    • PhD started 2023, <5 years experience
    • Hireability: MEDIUM — year 3 of 4-year PhD program (2023-2027), still ~1 year from typical graduation window, but active and productive

    Alexander William Bergman

    low hireability

    Principal Researcher @ Hedra

    Previously: PhD student @ Stanford University

    • CTO/Co-founder of Hedra Labs (expressive portrait video generation), now founding Rhoda AI
    • Papers include 'Phased Consistency Models' (NeurIPS 2024, text-to-video diffusion acceleration) and 'Real-time One-Step Diffusion-based Expressive Portrait Videos Generation' (2024)
    • Stanford PhD
    • Core video diffusion expertise directly matching query
    • Hireability: LOW — serial founder actively building new startup (Rhoda AI) after leaving Hedra as co-founder; unlikely to leave own company

    Chaoyang Wang

    low hireability

    Research Scientist @ Snap

    Previously: Graduate Research Assistant @ Carnegie Mellon University

    Los Angeles, US

    • PhD from CMU, 2023 (~3 years post-PhD)
    • Led video diffusion research at Snap Creative Vision Team — VD3D (ICLR 2025, camera-controllable video diffusion), 4Real-Video (CVPR 2025 Highlight, 4D video diffusion), 4Real (4D scene generation via video diffusion)
    • Research expertise explicitly includes video generation and 4D generation; h-index 19
    • Now building world models for robotics at Tesla AI
    • Hireability: LOW — joined Tesla AI ~4 months ago (Dec 2025 website update), no signals of active job searching

    Chieh Hubert Lin

    low hireability

    Research Scientist @ Stealth AI Startup

    Previously: Intern @ RealityLabs - Surreal @ Meta

    US

    • Strong video diffusion researcher — published 'Motion-Conditioned Diffusion Model for Controllable Video Synthesis' (2023, 89 citations) and 'Taming Latent Diffusion Model for NeRF Inpainting' (2024); PhD at UC Merced completed May 2025
    • Now Research Scientist at Stealth AI Startup
    • Meets <5 years experience threshold
    • Hireability: LOW — recently started at the stealth startup (<1 year in role per pipeline signal), consistent with the typical low-mobility window for new hires

    Dongyang Liu

    low hireability

    Assistant Professor @ The Chinese University of Hong Kong, Shenzhen

    Previously: Research Assistant Professor @ The Chinese University of Hong Kong

    Shenzhen, CN

    • Direct hit for both query dimensions: video diffusion (VEnhancer, Lumina-Video) and multimodal models (LLaMA-Adapter 962 citations ICLR 2024, SPHINX 315 citations, Lumina-T2X)
    • H-index 11 with core contributions to the Lumina series (video+image generation via flow-based diffusion transformers) and VEnhancer (video enhancement via diffusion). 2nd-year PhD student at MMLab/CUHK since 2024.09, supervised by Hongsheng Li; prior Master's at ICT/CAS (2021-2024) where most of this work was done
    • Hireability: LOW — only ~1.5 years into PhD (started Sept 2024), typically 3+ years before graduation window

    Jay Zhangjie Wu

    low hireability

    Research Scientist @ NVIDIA

    Previously: Research Scientist Intern @ NVIDIA

    CA

    • Core video diffusion researcher — first author of Tune-A-Video (1116 citations, ICCV 2023), Show-1 (294 citations), and MotionDirector (165 citations); contributed to NVIDIA Cosmos world foundation model (215 citations)
    • PhD NUS (Dec 2025), research focus squarely on generative models for images, videos, 3D and 4D — a near-perfect match for the video diffusion/multimodal query
    • Hireability: LOW — joined NVIDIA Spatial Intelligence Lab in Dec 2025, already promoted to Senior Research Scientist by March 2026 (~4.5 months in); no open-to-work signals detected

    Jia-Wei Liu

    low hireability

    Research Scientist @ Meta

    Previously: Research Scientist @ Meta

    San Francisco, US

    • Core video diffusion researcher: MagicAnimate (324 cites), Show-1 (288 cites), MotionDirector (165 cites) — textbook on-query for video diffusion
    • VideoLLM-online (96 cites) adds multimodal video depth; h-index 18, PhD NUS, now Research Scientist at Meta Superintelligence Lab (ex-FAIR) in Menlo Park
    • Hireability: LOW — recently settled at Meta with no job-seeking signals; internal move from FAIR to Superintelligence Lab is a career progression signal, not an exit signal

    Runs

    #1 · completed · 0 qualified / 0 found · Apr 22, 6:55 AM