Junior GPU kernel engineers in the US with CUDA/Triton experience

completed · 3 qualified · 1 run · Apr 21, 8:56 PM · junior-gpu-kernel-engineers-in-the-us-with-cudatriton-experi-1776804972
Parsed: 3 topics · Junior · Engineer · US
Generating seed nodes: 0 proposed
Explored 0 queries: 0/0 done
Expanding nodes: queued
Qualifying candidates: queued

    Qualified Candidates (3)

    Gabriele Oliaro

    medium hireability

    CS PhD Student @ Snowflake AI Research

    Previously: Research Scientist Intern @ Snowflake

    • Strong GPU kernel match — owns CUDA kernel repo (softmax-argmax fused kernel, not a fork), co-authored Korch paper on optimal kernel orchestration for tensor programs (ASPLOS 2024), forks flashinfer CUDA kernel library, and has multiple CUDA stream/kernel PRs in FlexFlow
    • Research focus is ML systems + parallel computing + GPU kernel optimization at CMU CATALYST lab
    • US-based (Pittsburgh, PA)
    • Hireability: MEDIUM — 4th year CMU PhD with expected graduation 2027, currently Research Intern at Snowflake AI Research; ~1.5 years from graduation, signaling industry interest but not yet in final-push job market mode

    Zhihao Zhang

    medium hireability

    Ph.D. student @ Carnegie Mellon University

    Previously: MS student @ Carnegie Mellon University

    Pittsburgh, US

    • PhD student at CMU Catalyst (advised by Zhihao Jia) building LLM inference systems
    • Pinned FlashInfer (CUDA kernel library for LLM serving) on GitHub — direct hands-on CUDA kernel work
    • OSDI 2025 paper on Mirage tensor program superoptimizer (GPU kernel optimization), plus papers at ASPLOS 2024 (SpecInfer), ICLR 2025 (TidalDecode)
    • Research expertise: ML Systems
    • Pittsburgh, US
    • Hireability: MEDIUM — advanced PhD student with strong publication record (ASPLOS/ICLR/ICML/OSDI), website says 'open to collaboration'; LinkedIn profile went private in Jan 2026 (ambiguous signal, possibly nearing graduation)

    Zhuoming Chen

    medium hireability

    Ph.D. student @ Carnegie Mellon University

    Previously: Research Intern @ Meta

    New York, US

    • LLM inference systems researcher at CMU (Beidi Chen + Zhihao Jia lab)
    • Core papers: SpecInfer (381 citations), MagicPIG LSH attention (46 citations), Sequoia speculative decoding (69 citations)
    • Deep GPU memory systems expertise; forks flash-linear-attention and flex-block-attn suggesting Triton familiarity, but public repos are Python-only — no explicit CUDA/Triton kernel code
    • Adjacent to kernel engineering rather than a direct kernel author
    • Based in US
    • Hireability: MEDIUM — 3rd-year PhD at CMU (started 2023), not yet in the final-year transition window; CV updated 67 days ago; active, with a Meta FAIR internship in 2025

    Runs

    #1 · completed · 0 qualified / 0 found · Apr 21, 8:56 PM