
Video ai researchers with experience in multimodal learning, diffusion models, a…

Completed · 49 qualified · 1 run · Apr 22, 1:25 AM · video-ai-researchers-with-experience-in-multimodal-learning
Parsed 4 topics · Researcher

    Qualified Candidates (49)

    CY

    Ceyuan Yang

    medium hireability

Research Scientist@ByteDance (Seed team)

Previously: Research Scientist @ Shanghai Artificial Intelligence Laboratory

    San Francisco, US

    • Core video diffusion researcher — AnimateDiff (1.2k+ citations, ICLR 2024 Spotlight), LaVie, CameraCtrl, and SparseCtrl are all landmark video diffusion papers directly matching the query
    • Published 'Diffusion Adversarial Post-Training for One-Step Video Generation' (ICLR 2025) showing direct inference optimization expertise
    • Now Research Scientist at ByteDance Seed team (DB outdated: shows Shanghai AI Lab) building flagship video models Seedance 1.0 and Seaweed-7B
    • H-index 29
    • Hireability: MEDIUM — deeply embedded at ByteDance with extensive 2025 output and no open-to-work signals (his website advertises intern hiring, not a personal job search), but he is within the tenure window and ByteDance faces US regulatory headwinds
    CW

    Chaoyang Wang

    medium hireability

    Research Scientist@Snap

    Previously: Graduate Research Assistant @ Carnegie Mellon University

    Los Angeles, US

    • Research Scientist at Snap Creative Vision Team leading 4D reconstruction and generation
    • Strong match across all three query dimensions: video diffusion (VD3D — video diffusion transformers, ICLR 2025, 82 citations; 4Real — 4D scene generation via video diffusion, 48 citations; 4Real-Video 2025), inference/efficiency (DELTAv2: Accelerating Dense 3D Tracking; DELTA: Dense Efficient Long-Range 3D Tracking; LightSpeed: Light and Fast Neural Light Fields on Mobile), and multimodal/text-guided work (language-guided 3D scene editing, text-to-character blendshapes)
    • PhD CMU Robotics Institute, h-index 19
    • Hireability: MEDIUM — Research Scientist at Snap for ~4+ years, active publication output through Dec 2025, no explicit mobility signals but well within the 3-5 year transition window
    DH

    De-An Huang

    medium hireability

Research Scientist@NVIDIA

Previously: PhD student @ Stanford University

    US

    • Research Scientist at NVIDIA with near-perfect match to query
    • Leads NVILA (efficient frontier VLMs), Eagle 2/2.5 (long-context multimodal learning for video), STORM (token-efficient long video understanding), T-Stitch (diffusion sampling acceleration via trajectory stitching), and Efficient Video Diffusion Models
    • PhD Stanford, h-index 38, with 8+ highly-cited 2025 publications across all three query pillars
    • Hireability: MEDIUM — confirmed Research Scientist at NVIDIA (pipeline signals updated LinkedIn title to 'AI Research Scientist' in Jan 2026); no explicit open-to-work signals but active publication cadence and role identity update suggest continued engagement in research market
    DX

    Dejia Xu

    medium hireability

    Research Scientist@Luma AI

    Previously: Research Assistant @ University of Texas at Austin

    San Francisco, US

    • Research Scientist at Luma AI (video generation company)
    • Strong query match: Diffusion4D (video diffusion models, 2024), CamCo (image-to-video generation, 2025 CVPR Highlight), LightGaussian (15x 3DGS compression for 200+ FPS — direct inference optimization, 340 citations). h-index 28, PhD UT Austin (VITA Group)
    • Hireability: MEDIUM — estimated ~2-3 years at Luma AI based on paper timeline (started ~2023-2024), within typical transition window; no explicit open-to-work signals but website states 'open to coffee chats'
    DG

    Denis A Gudovskiy

    medium hireability

    Senior Deep Learning Researcher@Panasonic

    Previously: Senior Wireless Engineer @ Intel

    San Francisco, US

    • Strong multimodal and inference optimization researcher at Panasonic AI Lab (Mountain View, CA)
    • SparseVLM (ICML'25, 113 cites) on visual token sparsification for efficient VLM inference; 2025 paper on shortcutting diffusion/flow models for faster sampling; DFM: Dual Flow Matching (2024); CFLOW-AD (WACV'22, 719 cites) on flow-based generative models
    • Strong match on multimodal learning, diffusion/flow models, and inference optimization — video-specific work not evident. h-index 14
    • Hireability: MEDIUM — long tenure at Panasonic (~7+ years), Jan 2026 LinkedIn title change appears internal (still at Panasonic), no open-to-work signals
    DG

    Difei Gao

    medium hireability

    National U. of Singapore; Institute of Computing Technology, Chinese Academy of Sciences

    Previously: Postdoc @ National U. of Singapore

    SG

    • Strong match: published 'Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation' (292 citations, 2023) directly covering diffusion models for video; 'VideoLLM-online' (CVPR 2024, 97 citations) covers online/streaming video inference optimization; 'Egocentric Video-Language Pretraining' (NeurIPS 2022, 266 citations) covers multimodal video learning
    • H-index 20, based in Singapore at Show Lab NUS
    • Hireability: MEDIUM — long publication history since 2015 (likely postdoc or senior researcher at NUS), not listed in current Show Lab PhD roster, no explicit availability signals, but active publishing through 2025 and within typical transition window for academics
    FI

    Forrest Iandola

    medium hireability

    AI Research Scientist@Meta

    Previously: Head of Perception @ Anduril Industries

    San Francisco, US

    • Video AI researcher at Meta Reality Labs (2022-present) with strong alignment to all three query dimensions: published VEditBench (text-guided video editing, 2025) and EfficientSAM (CVPR 2024); co-authored score distillation/diffusion papers (SteinDreamer, Taming Mode Collapse 2024); inference optimization is his core brand — creator of SqueezeNet (50x parameter reduction), SqueezeBERT, MobileLLM (on-device LLMs)
    • PhD in EECS from UC Berkeley, h-index 26
    • Hireability: MEDIUM — ~4 years at Meta within typical transition window, but no active job-seeking signals (no LinkedIn changes, no website CV updates, no open-to-work bio)
    GL

    Guilin Liu

    medium hireability

    Research Scientist@NVIDIA

    Previously: Research Intern @ Adobe

    San Francisco, US

    • Research Scientist at NVIDIA with deep expertise across all three query dimensions: video AI (Video-to-Video Synthesis 2018, 1351 citations; slow-fast video multimodal LLM 2025), multimodal learning (Eagle/Eagle 2/Eagle 2.5 VLM series), and diffusion models (DiffiT, PYOCO video diffusion, 318 citations). h-index 24, strong publication record at NeurIPS/ECCV/ICCV
    • Hireability: MEDIUM — ~9 years at NVIDIA (long tenure), but pipeline signals show a position_update on website June 2025 suggesting possible role change; no explicit open-to-work signals
    GQ

    Guocheng Gordon Qian

    medium hireability

    Research Scientist@Snap

    Previously: Research Intern @ Snap

    San Francisco, US

    • Senior Research Scientist at Snap Research leading pretraining/post-training for 5B-30B VLM-diffusion models
    • Covers all three query dimensions: video generation with diffusion models (VD3D and AC3D — video diffusion transformers, 80+ citations each), multimodal learning (VLM-diffusion integration, Canvas-to-Image with multimodal controls), and diffusion models broadly (Magic123, ICLR24, 432 citations)
    • Some inference optimization work (GES efficient radiance field rendering, DELTAv2 accelerating 3D tracking, Dr2Net memory-efficient finetuning) but not his primary focus
    • PhD KAUST 2023, h-index 18
    • Hireability: MEDIUM — ~2-3 years at Snap (within transition window); website had moderately recent CV updates (56 days ago); no explicit open-to-work signal on GitHub or LinkedIn
    HW

    Haofan Wang

    medium hireability

    Member of Technical Staff@Lovart AI

    Previously: Senior Research Engineer @ Xiaohongshu

    Singapore, SG

    • Leading diffusion model researcher — InstantID (378 citations, 2024), InstantStyle (172 citations, 2024), EasyControl (2025 FLUX DiT control); founded InstantX open-source generative models team
    • MTS at Lovart AI (video generative AI startup)
    • Multimodal work includes video-language pre-training and CLIP/ECLIP
    • Inference optimization is implicit (InstantStyle 'free lunch' efficiency, EasyControl 'efficient' control) but not a primary focus
    • No PhD (MS CMU)
    • Hireability: MEDIUM — ~2 years at Lovart AI (joined 2024), within transition window; website actively updated through March 2026 with no explicit open-to-work signals
    HS

    Harry Saini

    medium hireability

    Weaver (Founding Research Engineer)@Black Forest Labs

    Previously: Research Engineer @ Stability AI

San Francisco, US

    • Founding Research Engineer at Black Forest Labs; co-authored 'Scaling Rectified Flow Transformers for High-Resolution Image Synthesis' (SD3, 2024) and FLUX.1 Kontext (2025) — core diffusion/flow-matching and multimodal image-text generation work
    • Strong match on diffusion models and multimodal learning; video AI is indirect (image generation background highly transferable to video diffusion); no direct inference optimisation evidence
    • Hireability: MEDIUM — founding team member at BFL (~2 years in role, within transition window), recently relocated from India to San Francisco per pipeline signals, no open-to-work signals detected
    HY

    Hongxu Yin

    medium hireability

    Principal Research Scientist@NVIDIA

    Previously: Senior II / Staff Research Scientist @ NVIDIA

    San Francisco, US

    • Principal Research Scientist & Research Lead at NVIDIA leading the VILA multimodal LLM series — video AI (AutoGaze CVPR 2026: 100x token reduction, 19x speedup on long-form video), multimodal learning (VILA/NVILA/LongVILA, 654+ citations), diffusion-guided generation (Loss-Guided Diffusion Models, 157 citations), and inference optimization (NVILA efficiency work). h-index 38
    • Hireability: MEDIUM — pipeline shows position_update signals Jul 2025 (likely internal promotion at NVIDIA), currently recruiting for own NVIDIA team; no open-to-work signals, but senior enough to be worth a targeted approach
    HL

    Huan Ling

    medium hireability

Research Manager@NVIDIA

Previously: Senior Research Scientist @ NVIDIA

    Toronto, CA

    • Core contributor to video diffusion research: co-authored 'Align your Latents' (1519 citations, foundational video LDM), Sana-Video (ICLR 2026 Oral, efficient Block Linear Diffusion Transformer — strong inference optimization angle), NVIDIA Cosmos/Cosmos-Transfer1 (multimodal physical AI world models), and Gen3C (video generation with camera control). h-index 21, Research Manager at NVIDIA
    • Website says 'Working on a startup and building a research team. We're hiring' — strong signal of career transition
    • Hireability: MEDIUM — promoted to Research Manager at NVIDIA in July 2025 (~9 months in role), but active startup-building signals suggest potential openness to conversations
    KK

    Karsten Kreis

    medium hireability

    Principal Research Scientist@NVIDIA

    Previously: Senior Research Scientist II @ NVIDIA

    Vancouver, CA

    • Principal Research Scientist at NVIDIA Research (Vancouver, H-index 32) with landmark contributions across all three query dimensions: authored 'Align Your Latents: High-Resolution Video Synthesis with Latent Diffusion Models' (2023, 1507 citations), 'Align Your Steps: Optimizing Sampling Schedules in Diffusion Models' (2024, inference optimization), and 'eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers' (984 citations, multimodal)
    • Deep expertise in score-based and denoising diffusion models
    • Hireability: MEDIUM — senior/stable role at NVIDIA, but pipeline shows a position_update in March 2025 and recent GitHub activity (April 2026), signalling career motion; recent pivot toward protein/molecular design may indicate openness to new directions
    KA

    Kelsey R Allen

    medium hireability

    Senior Research Scientist@DeepMind

    Previously: Research Assistant @ UC Davis/Stanford

    London, GB

    • Senior RSc at DeepMind (London) with recent pivot toward diffusion models (Neural Assets: 3D-Aware Multi-Object Scene Synthesis with Image Diffusion Models, ICLR 2024) and video AI (Scaling 4D Representations 2024; Direct Motion Models for Assessing Generated Videos 2025)
    • Also has VLM/multimodal papers in 2025
    • Primary background is cognitive science + physics simulation; inference optimisation is absent
    • Co-author with Aleksander Holynski (video generation researcher)
    • Hireability: MEDIUM — Senior RSc at DeepMind with no pipeline mobility signals, established role, no open-to-work indicators
    RG

    Ruiqi Gao

    medium hireability

    Staff Research Scientist@DeepMind

    Previously: Research Scientist @ Google

    San Francisco, US

    • Co-author of Imagen Video (video + diffusion models, 2022) and CAT4D (4D multi-view video diffusion, CVPR 2025)
    • Key inference optimization work: EM Distillation for one-step diffusion (NeurIPS 2024) and On Distillation of Guided Diffusion Models
    • Research at Google DeepMind spans multimodal learning across language, image, video, and 3D
    • Staff RS, h-index 24, based in SF
    • Hireability: MEDIUM — Staff RS at Google DeepMind (4+ years), but removed CV link from public website on April 10, 2026 (12 days ago) after a July 2025 cv_update — a subtle career-motion signal worth noting
    YL

    Yam Levi

    medium hireability

    Research Engineer@Black Forest Labs

    Previously: Data Research @ Stability AI

    Vancouver, CA

    • Founding Research Engineer at Black Forest Labs; co-authored 'Scaling Rectified Flow Transformers for High-Resolution Image Synthesis' (SD3/FLUX foundation paper, ICML 2024 Oral, 2896 citations) — direct evidence of diffusion model expertise and multimodal text-image architecture (MM-DiT bidirectional image+text token flow)
    • Works on the FLUX model series at BFL, which has expanded into video generation
    • Hireability: MEDIUM — founding team member at BFL (~1.5-2 years), title recently upgraded to 'Founding Research Engineer', no open-to-work signals detected
    ZE

    Zion English

    medium hireability

    Research Scientist@Black Forest Labs

    Previously: Machine Learning Engineer @ Stability AI

    Irvine, US

    • Core author on SDXL and SD3 (Scaling Rectified Flow Transformers) at Stability AI — two landmark diffusion model papers
    • Now Research Scientist at Black Forest Labs (co-author with BFL co-founder Andreas Blattmann), working on the FLUX models, which cover text-to-image and inference-optimised generation (FLUX.2 [klein] sub-second inference, 2026)
    • SD3 architecture uses bidirectional multimodal attention between image and text tokens
    • Based in Irvine, US
    • Hireability: MEDIUM — ~1.5-2 years into current role at BFL (founded mid-2024), no explicit open-to-work signals, within typical transition window
    AD

    Alex Dimakis

    low hireability

    Co-Founder and Chief Scientist@Bespoke Labs

    Previously: Professor @ University of Texas at Austin

    San Francisco, US

    • Tenured Full Professor at UC Berkeley EECS + Co-Founder/Chief Scientist at Bespoke Labs
    • Strong diffusion models researcher (h-index 74): Soft Diffusion, Ambient Diffusion, multiple NeurIPS papers on diffusion for inverse problems + DDIM-type sampler analysis
    • Multimodal: DataComp (611 citations) and DataComp-LM (155 citations)
    • Recent video AI work: Warped Diffusion (solving video inverse problems with image diffusion models, 2024), ego-exo viewpoint video papers (2024-2025)
    • Inference optimization is weakest pillar
    • Hireability: LOW — tenured professor and actively running his own startup (Bespoke Labs), no pipeline signals of career motion
    AS

    Alex Schwing

    low hireability

    Associate Professor@University of Illinois Urbana-Champaign

    Previously: Assistant Professor @ University of Illinois Urbana-Champaign

    Urbana-Champaign, US

    • Strong match across all three query dimensions: video AI (CVPR 2024 video object segmentation, ICCV 2023 Tracking Anything), multimodal learning (MMAudio CVPR 2025 multimodal video-to-audio synthesis), and diffusion/inference efficiency (DiT-Air 2025 diffusion architecture efficiency, Variational Rectified Flow ICML 2025)
    • H-index 74, leading vision/ML researcher at UIUC ECE
    • Hireability: LOW — tenured Associate Professor with AMD named fellowship, NSF CAREER Award, and active industry research collaborations (NVIDIA, Adobe, Microsoft); no job-seeking signals
    AH

    Ali Hatamizadeh

    low hireability

    Research Scientist@NVIDIA

    Previously: PhD student @ University of California, Los Angeles

    San Francisco, US

    • Research Scientist at NVIDIA with direct relevance across all 3 query dimensions: diffusion models (DiffiT, ECCV 2024, 122 citations), inference optimization (FasterViT, ICLR 2024, 148 citations), and multimodal/video-applicable vision backbones (MambaVision, CVPR 2025, 333 citations)
    • Also contributing to Mamba/SSM architecture research (Gated Delta Networks, ICLR 2025)
    • H-index 31, based in SF
    • Hireability: LOW — extremely productive at NVIDIA with back-to-back top venue papers (CVPR 2025, ICLR 2025, ICLR 2026), zero mobility signals in pipeline or GitHub, no open-to-work indicators anywhere. Very settled
    AK

    Angjoo Kanazawa

    low hireability

    Assistant Professor@University of California, Berkeley

    Previously: Research Scientist @ Google

    San Francisco, US

    • Top-tier video AI researcher at UC Berkeley (KAIR lab, h-index 55)
    • Directly relevant work across all three query dimensions: video AI (Shape of Motion, MegaSaM CVPR 2025 Best Paper HM, Segment Any Motion), diffusion models (Decentralized Diffusion Models CVPR 2025, Rethinking Score Distillation, State of the Art on Diffusion Models for Visual Computing survey), and inference optimization (NerfAcc efficient NeRF sampling, PlenOctrees real-time rendering)
    • Also Amazon Scholar on Frontier AI & Robotics team, ex-Google Research, ex-Luma AI CTA
    • Hireability: LOW — tenured-track professor at UC Berkeley, Sloan Fellow 2023, PAMI Young Researcher 2024, no pipeline signals of career movement. Exceptional candidate to reach out to but unlikely to move
    AS

    Axel Sauer

    low hireability

    Co-Founder@Black Forest Labs

    Previously: Research Scientist @ Stability AI

    Freiburg, DE

    • Co-Founder of Black Forest Labs (FLUX image generation models) and core author of Stable Diffusion 3 (2273 citations, ICML 2024 Best Paper)
    • ADD (Adversarial Diffusion Distillation) and LADD papers are directly relevant to inference optimisation in diffusion models
    • Image-focused, not video-specific, but his diffusion + inference optimisation expertise is highly query-relevant
    • Hireability: LOW — Co-Founder at BFL, actively building the company; very unlikely to be seeking new roles
    BM

    Ben Mildenhall

    low hireability

    Co-Founder@World Labs

    Previously: Research Scientist @ Google

    San Francisco, US

    • NeRF inventor and co-author of DreamFusion (text-to-3D via diffusion, 3.3K citations) and ReconFusion (3D reconstruction with diffusion priors) — directly relevant to diffusion models and multimodal learning
    • Inference optimization evidenced by MERF (memory-efficient radiance fields for real-time view synthesis) and Baking NeRF
    • Currently Co-Founder at World Labs building 3D world models
    • Hireability: LOW — co-founder of a well-funded startup (World Labs, led by Fei-Fei Li); no open-to-work signals; pipeline shows a single cv_update 8 months ago (Aug 2025), which is not a strong mobility signal
    BP

    Ben Poole

    low hireability

    Senior Staff Research Scientist@DeepMind

    Previously: Research Scientist @ Google

    San Francisco, US

    • Core video AI + diffusion researcher at Google DeepMind: leads GenMedia 3D team, senior author on Veo 3, Imagen Video (2022, 2034 citations), CAT4D (CVPR 2025 Oral), DreamFusion (text-to-3D, 3287 citations), Variational Diffusion Models (1546 citations), and EM Distillation for One-step Diffusion Models (2024, inference optimization)
    • PhD Stanford. h-index 45
    • Hireability: LOW — Senior Staff at DeepMind leading their flagship video generation program (Veo), no LinkedIn or website activity signals, just shipped Veo 3, deeply embedded
    BG

    Bernard Ghanem

    low hireability

    Advisor@CAMEL-AI.org

    Previously: Deputy Director of AI Initiative @ KAUST

    London, GB

    • Full Professor at KAUST and Chair of Center of Excellence for Generative AI; leads IVUL (Image and Video Understanding Lab)
    • Strong match on all three query dimensions: video AI (temporal action detection, video generation), multimodal learning (multimodal egocentric datasets, BOLT for long-form video), and diffusion/inference acceleration (Vivid-ZOO NeurIPS 2024, Adaptive Guidance for diffusion)
    • H-index 82
    • Hireability: LOW — tenured Full Professor at KAUST with multiple senior leadership roles (Chair of CoE for Generative AI, PI of IVUL); no pipeline signals of career transition; DB pre-computed hireability also rated low
    BZ

    Bolei Zhou

    low hireability

    Associate Professor@University of California, Los Angeles

    Previously: Chief AI Scientist @ Coco Robotics

    Los Angeles, US

    • Strong video AI and multimodal researcher at UCLA (h-index 78): Temporal Relation Networks for video understanding, audio-driven video portrait generation, diffusion-based scene generation ('Urban Scene Diffusion' 2024, 'Ctrl-X' 2024), and VLA/multimodal learning work (X-fusion 2025, co-speech gesture generation)
    • Inference optimization is a minor thread (QuantV2X quantization)
    • Hireability: LOW — recently promoted to Associate Professor at UCLA (pipeline shows position_updates May-June 2025), holds NSF CAREER + ONR Young Investigator + Intel Rising Star awards; well-entrenched in academia with no signals of seeking industry roles
    BC

    Brian Curless

    low hireability

    Researcher@Google

    Previously: Professor @ University of Washington

    Seattle, US

    • Active video AI researcher with strong recent output on diffusion models and video synthesis: 'MusicInfuser' (multimodal audio+video diffusion, 2025), 'Generative Inbetweening' (image-to-video keyframe interpolation, 2025), 'ExtraNeRF' (NeRF + diffusion models, 2024), 'HumanNeRF' (free-viewpoint video rendering, 2022, 665 citations), 'FILM' (video frame interpolation, 2022)
    • No inference optimization papers found
    • Hireability: LOW — tenured Professor at UW Allen School (h-index 67), Google Research collaborator; no signals of job search activity
    BC

    Bryan Catanzaro

    low hireability

    Vice President, Applied Deep Learning Research@NVIDIA

    Previously: Senior Researcher @ Baidu

    San Francisco, US

    • VP of Applied Deep Learning Research at NVIDIA (h-index 73); leads multimodal LLM work (NVLM, Eagle VLMs), pioneered Few-shot Video-to-Video Synthesis (2019), authored DiffWave (diffusion model), and co-created cuDNN — the foundational GPU inference library
    • Hits all three pillars: multimodal, diffusion/generative, and inference optimization
    • Hireability: LOW — long-tenured VP at NVIDIA with no pipeline or open-to-work signals
    DL

    Dahua Lin

    low hireability

    Associate Professor@The Chinese University of Hong Kong

    Previously: Assistant Professor @ The Chinese University of Hong Kong

    Hong Kong, HK

    • Prolific video AI + multimodal researcher (h-index 118) with directly relevant work: LaVie (cascaded latent diffusion for video generation, 417 citations), Vchitect-2.0 (parallel transformer for video diffusion, 2025), InternVL3 (open-source multimodal models, 370 citations), PyramidDrop (inference acceleration for vision-language models, 86 citations)
    • Director of CUHK-SenseTime Joint Laboratory
    • Hireability: LOW — tenured Associate Professor at CUHK since 2020 with no signals of career transition; typically this profile stays in academia
    DJ

    David Jacobs

    low hireability

    Professor@University of Maryland

    Previously: Engineering Manager @ Meta

    Bethesda, US

    • Tenured CS professor at UMD (h-index 68) with strong video AI output directly relevant to query: 'Preserve your own correlation: noise prior for video diffusion models' (2023, 318 cites), 'Long video generation with VQGAN+transformer' (2022, 293 cites), CinePile video QA benchmark (2024, 77 cites), plus 2025 papers on multimodal agentic control and video captioning
    • No specific inference optimisation work
    • Hireability: LOW — tenured professor with no pipeline signals of career transition or open-to-work indicators
    DL

    Dominik Lorenz

    low hireability

Researcher@Black Forest Labs

Previously: Researcher @ Stability AI

    Karlsruhe, DE

    • Core member of Robin Rombach's group — co-authored Stable Video Diffusion, Adversarial Diffusion Distillation (1–4 step inference), Latent Diffusion Models (Stable Diffusion), SD3, and multi-modal flow matching (image, video, audio)
    • Now at Black Forest Labs publishing FLUX.1 Kontext and self-supervised multi-modal synthesis (2026)
    • Directly matches query across video AI, diffusion models, and inference optimisation
    • Hireability: LOW — recently joined BFL (2024), actively embedded in the founding group, has followed Rombach/Blattmann/Esser through KIT → Stability AI → BFL; no open-to-work signals
    ER

    Elisa Ricci

    low hireability

    Head of Research Unit@Fondazione Bruno Kessler

    Previously: Associate Professor @ Università di Trento

    Trento, IT

    • Full Professor at UniTrento / Head of Research Unit at FBK (h-index 64)
    • Strong video AI profile: co-authored 'Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis' (2024, 102 cites), multiple diffusion-for-video papers (2023-2024), and 2025 work on multimodal LLMs for video
    • Diffusion models + video + multimodal trifecta clearly hit
    • Inference optimisation less explicit but test-time/training-free methods present
    • Hireability: LOW — Full Professor (confirmed OpenReview) + Head of Research Unit at FBK, no LinkedIn/website activity or job-change signals; tenured academic unlikely to leave
    HL

    Hsin-Ying Lee

    low hireability

    Researcher@stealth mode startup

    Previously: Research Scientist @ Snap

    • Senior researcher (h-index 34, PhD UC Merced 2020) with direct match on all 3 query dimensions: video AI (Panda-70m 2024, VD3D 2025, 4Real 2024), multimodal learning (cross-modality video captioning, Show Me What and Tell Me How), and diffusion models (video diffusion priors, 4D scene generation). 5 years at Snap Research
    • Hireability: LOW — recently left Snap to found or join a stealth startup focused on AI + 3D + architecture; likely building her own venture and unlikely to be open to external roles
    IG

    Igor Gilitschenski

    low hireability

    Assistant Professor@University of Toronto

    Previously: Research Scientist (visiting) @ Toyota Research Institute

    Toronto, CA

    • Lab director at UofT TISL with prolific, current output in video generation (SG-I2V image-to-video 2025, DenseDPO video diffusion 2025, Mind the Time 2025) and diffusion models (SlotDiffusion 68 cites, SPAD 48 cites)
    • Multimodal work via Vid2Robot (video-conditioned cross-attention transformers, 49 cites) and EventCLIP (event-camera + CLIP, 30 cites)
    • Some inference efficiency work (neural pruning 19 cites, efficient latent-space NeRF)
    • H-index 40
    • Hireability: LOW — actively running own research lab as Assistant Professor at UofT, no pipeline signals of career movement
    IK

    Ira Kemelmacher-Shlizerman

    low hireability

    Principal Scientist, Director@Google

    Previously: Senior Staff Research Scientist @ Google

    Seattle, US

    • Prolific video AI and diffusion model researcher (h-index 36): led TryOnDiffusion (2023, 189 cites), Fashion-VDM video diffusion model (2024), MusicInfuser multimodal video+audio generation (2025), and Generative Inbetweening for video models (2024)
    • Directly covers all three query pillars — video AI, diffusion models, and multimodal learning — with active 2024-2025 output
    • Hireability: LOW — tenured professor at UW + Principal Scientist/Director at Google with no mobility signals (no LinkedIn changes, no website activity detected)
    IS

    Ivan Skorokhodov

    low hireability

Researcher@RhodaAI

Previously: Research Scientist @ Snap

    San Francisco, US

    • Core video AI researcher and author of Snap Video (text-to-video spatiotemporal transformers, CVPR 2024, 95 cites), SF-V (single-forward-pass inference optimisation for video diffusion, 2024), Hierarchical Patch Diffusion Models for high-res video (CVPR 2024), and VIMI (multimodal instruction grounding for video generation)
    • Research expertise per DB: 'diffusion models, video generation, autoencoders, generative models'. h-index 21, PhD KAUST 2023
    • Based in Palo Alto, US
    • Hireability: LOW — GitHub bio updated to 'Research at RhodaAI' with pipeline position_update on Jan 15 2026; ~3 months into a new startup role, unlikely to move so soon
    JH

    Jia-Bin Huang

    low hireability

Associate Professor@University of Maryland

Previously: Research Scientist @ Meta

    College Park, US

    • Capital One endowed Associate Professor at UMD with strong video AI + diffusion model track record: PYoCo (noise prior for video diffusion, 318 citations), Latent-Shift (efficient latent diffusion for text-to-video, 151 citations), FlowVid (video-to-video synthesis via optical flow + diffusion)
    • Teaches a Multimodal Foundation Models course
    • Inference-efficiency angle present (Latent-Shift). h-index 69
    • Hireability: LOW — endowed associate professorship at UMD, actively recruiting PhD students for Fall 2026; no open-to-work signals; career firmly in academia
    JR

    Jian Ren

    low hireability

Head of Creative Tech Research@Netflix

Previously: Principal Research Scientist @ Snap Inc.

    Los Angeles, US

    • Former Principal Research Scientist at Snap Inc., now Head of Creative Tech Research at Netflix (joined via Netflix's acquisition of InterPositive, where he was founding CSO)
    • H-index 34
    • Covers all three query dimensions: SnapFusion (text-to-image diffusion on mobile in 2s — inference optimization), Panda-70M (70M video dataset with cross-modality captioning — video AI + multimodal), Snap Video (spatiotemporal transformers for text-to-video), plus BitsFusion (1.99-bit diffusion weight quantization) and EfficientFormer (mobile-speed vision transformers)
    • Hireability: LOW — joined Netflix through acquisition ~11 months ago, now in research leadership role; no open-to-work signals
    JZ

    Jun-Yan Zhu

    low hireability

    Michael B. Donohue Assistant Professor of Computer Science and Robotics@Carnegie Mellon University

    Previously: Research Scientist @ Adobe

    Pittsburgh, US

    • Exceptional match across all three query dimensions: (1) Video AI — MotionStream (ICLR 2026, real-time video generation), Video-to-Video Synthesis (NeurIPS 2018), multi-subject video personalization (CVPR 2025); (2) Diffusion models — SVDQuant (4-bit diffusion quantization, ICLR 2025), Custom Diffusion (CVPR 2023), SDEdit (2K+ citations), img2img-turbo (one-step SD); (3) Inference optimization — SVDQuant (4-bit quantization), GAN Compression, Efficient Spatially Sparse Inference for GANs/diffusion
    • H-index 59 with foundational work on CycleGAN and pix2pix
    • Hireability: LOW — tenure-track assistant professor at CMU running the Generative Intelligence Lab with active PhD students; no job-seeking signals anywhere (no LinkedIn changes, no website activity, neutral GitHub bio)
    KM

    Kevin Patrick Murphy

    low hireability

    Principal Scientist@DeepMind

    Previously: Senior Staff Research Scientist @ Google

    San Francisco, US

    • Exceptional fit: authored VideoBERT (1678 citations, video + multimodal), multiple 2024-2025 diffusion papers including EM Distillation for one-step diffusion (inference optimization), and 'Direct Motion Models for Assessing Generated Videos' (2025)
    • H-index 108
    • Principal Scientist at Google DeepMind, SF
    • Hireability: LOW — long-tenured, extremely established senior researcher at DeepMind with no pipeline signals of career motion (no LinkedIn changes, no website activity)
    KA

    Kfir Aberman

    low hireability

    Founding Member, US office@Decart

    Previously: Principal Research Scientist @ Snap

    San Francisco, US

    • World-class diffusion models researcher (DreamBooth 3.8K citations; HyperDreamBooth for fast inference-time personalization; VideoAlchemy + Multi-subject video personalization 2025)
    • MyVLM shows multimodal experience
    • H-index 30, SF-based, now Founding Member at Decart (video AI company)
    • Hireability: LOW — joined Decart ~3 months ago as Founding Member (high equity/commitment), DB pipeline confirms tenure_months=3 with high confidence
    MR

    Michael Rubinstein

    low hireability

    Principal Scientist@DeepMind

    Previously: Research Scientist @ Google

    Boston, US

    • Senior video AI researcher at Google DeepMind (Principal Scientist / Director)
    • Led Lumiere (2024, space-time video diffusion model), DreamBooth (CVPR 2023 Best Student Paper HM), Muse (text-to-image masked generative transformer), and StyleDrop — strong alignment with video generation, diffusion models, and multimodal learning
    • H-index 48
    • No inference optimization work specifically, but very strong on the video and diffusion dimensions
    • Hireability: LOW — entrenched Principal Scientist/Director at DeepMind, no LinkedIn changes or website activity detected, no open-to-work signals
    NW

    Neal Wadhwa

    low hireability

    Staff Software Engineer@Google

    Previously: Staff Software Engineer @ Google

    New York, US

    • PhD from MIT; H-index 30; Staff SWE at Google Research (NYC)
    • Strong fit on video AI (Phase-based Video Motion Processing, ReCapture 2024/2025 generative video camera controls) and diffusion models (HyperDreamBooth, 273 citations; RealFill image completion)
    • Weaker on inference optimization — no published work there, though his llama.cpp fork suggests practical interest
    • Hireability: LOW — ~9.5 years at Google with no open-to-work signals, no LinkedIn or website activity changes detected; actively publishing in 2025, which suggests he's comfortable where he is
    PE

    Patrick Esser

    low hireability

    PhD student@Heidelberg University

    • Co-founder of Black Forest Labs and co-creator of Stable Diffusion, Latent Diffusion Models, and FLUX.1
    • Directly relevant to all three query dimensions: video synthesis ("Structure and Content-Guided Video Synthesis with Diffusion Models", ICCV 2023), diffusion models (foundational LDM/SD work, SD3 rectified-flow paper, 2024), and inference optimization (LADD — adversarial diffusion distillation for 4-step synthesis, 2024)
    • H-index 19; 4,000+ citations for Taming Transformers alone
    • Hireability: LOW — co-founder of BFL (~50 people, founded 2024), actively shipping FLUX.1 Kontext (2025); unlikely to leave his own company
    RE

    Rahim Entezari

    low hireability

    Applied Scientist@Wayve

    Previously: Research Scientist @ Stability AI

    London, GB

    • SD3 co-author (2592 citations, ICML 2024 best paper) with strong diffusion model experience
    • Also on SD3.5-Flash (fast inference via flow distillation) and Stable Cinemetrics (professional video generation eval, NeurIPS 2025)
    • Multimodal background via DataComp (663 citations)
    • Research expertise: text-to-image/text-to-video, diffusion models, inference optimization, multimodal data curation
    • Based in London at Wayve
    • Hireability: LOW — joined Wayve ~Jan 2026 (~3-4 months ago), already promoted to Senior Applied Scientist; too new to be moving again
    RR

    Robin Rombach

    low hireability

    Researcher@Black Forest Labs

    Previously: Researcher @ Stability AI

    • Original creator of Stable Diffusion / Latent Diffusion Models (LDM) and SDXL; co-authored image-to-video synthesis (CVPR 2021); Stability AI generative-models repo includes Stable Video Diffusion
    • Strong match on diffusion models + multimodal (text-to-image conditioning)
    • Hireability: LOW — co-founder of Black Forest Labs (est. 2024, building FLUX frontier generative models), actively building own company
    SK

    Sumith Kulal

    low hireability

    co-founder and research scientist@Black Forest Labs

    Previously: research scientist @ Stability AI

    • Co-founder of Black Forest Labs and core contributor to FLUX.1 and SD3 (Scaling Rectified Flow Transformers)
    • Strong diffusion model background from Stability AI and Black Forest Labs; PhD from Stanford
    • Primarily image generation (FLUX.1), not video specifically, but foundational to the broader diffusion+multimodal domain the query targets
    • Hireability: LOW — co-founder actively building Black Forest Labs, no open-to-work signals, no LinkedIn changes or recent CV updates
    TD

    Tim Dockhorn

    low hireability

    Co-Founder@Black Forest Labs

    Previously: Research Scientist @ Stability AI

    Waterloo, CA

    • Co-founder & research scientist at Black Forest Labs (FLUX.1)
    • Deep diffusion model expertise: co-authored SDXL (ICLR 2024 Spotlight), Stable Video Diffusion, FLUX/SD3 (ICML 2024 Best Paper), and GENIE (higher-order denoising diffusion solvers — inference optimization)
    • PhD from the University of Waterloo; H-index 12
    • Hits all three query dimensions: video AI (SVD), diffusion models (SDXL/FLUX/SD3), and inference optimization (GENIE, LADD distillation)
    • Hireability: LOW — co-founder of a well-funded AI startup (BFL), actively building FLUX.1; no job-seeking signals and no website activity in 6+ months

    Runs

    #1 · completed · 0 qualified / 0 found · Apr 22, 1:25 AM