
Video ai researchers with experience in multimodal learning, diffusion models, a…

Status: completed · 9 qualified · 1 run · Apr 22, 1:47 AM
Run ID: video-ai-researchers-with-experience-in-multimodal-learning-1776822446
Parsed: OpenAI · 4 topics · Junior · Researcher · United States
Pipeline stages:
    1. Generating seed nodes — 0 proposed
    2. Exploring queries — 0/0 done
    3. Expanding nodes — queued
    4. Qualifying candidates — queued
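The four-stage run view above can be sketched as a small state model. This is a hypothetical reconstruction, not the tool's actual code: the `Stage`/`StageStatus` names and the `advance` helper are assumptions made for illustration, matching only the stage names and statuses shown in the dashboard.

```python
from dataclasses import dataclass
from enum import Enum

class StageStatus(Enum):
    QUEUED = "queued"
    RUNNING = "running"
    DONE = "done"

@dataclass
class Stage:
    name: str
    status: StageStatus = StageStatus.QUEUED
    done: int = 0      # progress counter, e.g. "0/0 done"
    total: int = 0

# The four stages shown in the run view, in order
pipeline = [
    Stage("Generating seed nodes", StageStatus.RUNNING),
    Stage("Exploring queries"),
    Stage("Expanding nodes"),
    Stage("Qualifying candidates"),
]

def advance(stages: list[Stage]) -> list[Stage]:
    """Mark the first not-yet-done stage as running; later stages stay queued."""
    for stage in stages:
        if stage.status is not StageStatus.DONE:
            stage.status = StageStatus.RUNNING
            break
    return stages
```

Under this sketch, "Expanding nodes" and "Qualifying candidates" remain `queued` until the earlier stages report `done`, which matches the widget states captured above.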

    Qualified Candidates (9)


    Soo Min Kwon

    high hireability

Graduate Research Assistant @ University of Michigan, Ann Arbor

    Previously: Student Researcher @ Google

    Ann Arbor, US

    • Final-year PhD student at UMich (ECE) specializing in diffusion models for inverse problems and efficient neural network inference
    • Strong query-relevant work: ICLR 2024 Spotlight on latent diffusion models (hard data consistency), NeurIPS 2024 BLAST paper on block-level structured matrices for efficient DNN inference
    • No explicit video AI or multimodal work, but diffusion model expertise directly transfers to video generation; inference optimization work (BLAST) is directly relevant
    • US-based (Ann Arbor), not from excluded companies
    • Hireability: HIGH — final-year PhD student in the prime transition window; Google Research NYC internship completed Nov 2025; 20 recent website changes, including multiple cv_update and position_update signals, indicating active job-market engagement

    Bikram Boote

    medium hireability

Graduate Research Assistant @ University of Illinois Urbana-Champaign

    Previously: Software Development Engineer @ Amazon

    Champaign, US

    • Strong video AI + multimodal researcher at UIUC Rehg Lab: Ego-Exo4D (CVPR 2024, 333 citations), CVPR 2024 Oral on multimodal social interaction modeling, ECCV 2024 point tracking
    • Egocentric vision and hand-object interaction are core expertise
    • No diffusion model or inference optimization work evident
    • US-based (Champaign, IL), PhD student ~yr 2-3, <3 years industry experience fits query
    • Hireability: MEDIUM — 2-3 years into PhD at UIUC, not yet in final-year transition window, no open-to-work signals detected

    Bipasha Sen

    medium hireability

Founder @ Stealth Startup

    Previously: Graduate Research Assistant @ MIT CSAIL

    San Francisco, US

    • Video AI researcher with INR-V (video generation, TMLR 2022), FaceOff (video-to-video face swapping, WACV 2023), diffusion models work (EDMP, 2024), and multimodal research (ConceptGraphs, lipreading)
    • MIT PhD, early-career, no OpenAI/DeepMind/xAI
    • Inference optimization not clearly evidenced
    • Hireability: MEDIUM — currently Founder at stealth startup (low by default), but GitHub bio explicitly states 'looking to work on the next big challenge!', signaling openness to new roles

    Daksh Aggarwal

    medium hireability

AI Research Summer Associate @ Balyasny Asset Management

    Previously: Undergraduate Researcher @ The Fields Institute For Research In Mathematical Sciences

    Austin, US

    • Published 'Force Prompting: Video Generation Models Can Learn and Generalize Physics-based Control Signals' (NeurIPS 2025) and a follow-up 'Goal Force' (2026), both directly on video generation
    • Also co-authored 'Self-Correcting Self-Consuming Loops For Generative Model Training' (ICML 2024 + ICLR 2025) on stabilizing generative model training
    • Math/AI PhD student at Brown with clear pivot to video generation and generative models — 4 top-venue papers in 3 years
    • Not at excluded companies (summer intern at Balyasny hedge fund, not OpenAI/DeepMind/xAI)
    • Inference optimization is not covered, but video generation + diffusion training is strong
    • Hireability: MEDIUM — PhD student at Brown, started publishing 2022, likely 3rd-4th year; no explicit job-seeking signals but within normal transition window

    Fiona Ryan

    medium hireability

Graduate Researcher @ Georgia Institute of Technology

    Previously: Student Researcher @ Meta

    Atlanta, US

    • Strong video AI and multimodal learning researcher (Ego4D + Ego-Exo4D co-author, egocentric gaze estimation, audio-visual gaze anticipation, CVPR 2025 x3 including two Highlights)
    • No clear diffusion model work; inference optimization absent — Polar-VL uses LoRA-style parameter updates (adjacent to, but not, inference optimization)
    • US-based (Atlanta), <3 years industry experience (PhD student throughout)
    • Hireability: MEDIUM — just defended PhD dissertation (April 2026), but removed 'looking for postdoc opportunities' statement from website in Nov 2025, suggesting she may already have a role lined up

    Nate Gillman

    medium hireability

Research Intern @ Google

    Previously: Research Intern @ Amazon

    New York, US

    • Video generative modeling PhD student at Brown with NeurIPS 2025 paper on physics-based video generation (Force Prompting), ICML 2024 paper on generative model training (Self-Correcting Self-Consuming Loops), and ICLR 2025 paper on LLM distributions
    • Strong fit for video AI + diffusion models
    • No explicit inference optimization work
    • US-based (NY), <3 years work exp (internships at Google Research 2025, Amazon Science 2024)
    • Not at excluded companies
    • Hireability: MEDIUM — still active PhD student at Brown (latest commit Feb 2026), graduation timeline unclear from available signals

    Xinyu Hu

    medium hireability

Applied Scientist @ Microsoft

    Previously: MS student @ Stanford University

    • Currently building agentic RL for video AI independently (nomadic, SF-based); formerly Applied Scientist & Tech Lead at Microsoft AI on long-horizon multimodal reasoning
    • Strong diffusion models paper (Solving Inverse Problems with Latent Diffusion Models, 157 citations, ICLR adjacent), video generation metric paper (WYSIWYM 2024), and multimodal evaluation work
    • Core expertise: diffusion models, multimodal foundation models, LLMs
    • Stanford MS CME, CA-based
    • Work experience borderline: ~3-4 years post-MS (nominally above the <3-year requirement)
    • Hireability: MEDIUM — building independently (genmini-ai/OpenCanvas startup mode), website quiet for 7+ months; not explicitly on market but may be open given solo-building phase

    Zhengxu Tang

    medium hireability
    • PhD student at UMich (Liyue Shen lab) with strong diffusion model and video AI work: CCS controllable diffusion sampling (NeurIPS 2025 poster), latent space disentanglement in diffusion transformers, and SeqBench benchmarking text-to-video models
    • Solid multimodal learning background via vision-language pre-training
    • Inference optimization is weak (no dedicated speed/efficiency work)
    • Based in Ann Arbor, MI
    • Hireability: MEDIUM — appears 2-3 years into PhD with no explicit graduation signal or job market activity detected

    Zilai Zeng

    medium hireability

Ph.D. Student @ Brown University

    Previously: Research Intern @ ByteDance

    US

    • Video-centric ML researcher at Brown University; uses diffusion models for policy learning (NeurIPS 2024 'Text-Aware Diffusion for Policy Learning') and internet video knowledge for robotic tasks (ICLR 2025)
    • Strong fit on video AI + diffusion models, moderate on multimodal; no inference optimization work visible
    • Interned at ByteDance Seed
    • Hireability: MEDIUM — active PhD student (~year 4-5 based on 2023-2026 publication range), ByteDance internship shows industry interest, but no explicit job-seeking signals detected
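The hireability labels above (HIGH for a final-year student with active website signals, MEDIUM for mid-PhD or founder profiles) suggest a simple signal-weighting rubric. The sketch below is a toy reconstruction under assumed weights — the signal names (`final_year_phd`, `cv_update`, etc.) mirror terms used in the notes, but the scoring logic is hypothetical, not the tool's actual algorithm.

```python
def hireability(signals: dict) -> str:
    """Toy rubric: combine career-stage and job-market signals into a label.

    Weights are illustrative assumptions, not the tool's real scoring.
    """
    score = 0
    if signals.get("final_year_phd"):
        score += 2  # prime transition window
    if signals.get("cv_update") or signals.get("position_update"):
        score += 2  # active job-market engagement on personal site
    if signals.get("open_to_work_statement"):
        score += 2  # explicit openness, e.g. in a GitHub bio
    if signals.get("founder_at_startup"):
        score -= 1  # founders are low-hireability by default
    if score >= 4:
        return "high"
    if score >= 1:
        return "medium"
    return "low"
```

Under this rubric, a final-year PhD with cv_update signals scores high (as with Soo Min Kwon's profile), while a stealth-startup founder with an open-to-work bio nets out to medium (as with Bipasha Sen's).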

    Runs

    #1 · completed · 0 qualified / 0 found · Apr 22, 1:47 AM