Shen Zheng

Hi! I am a Senior Research Scientist on the ByteDance Seed team, working on code LLMs and agents. I received my MS in CS from UIUC and chose to move into industry rather than pursue a PhD. Before that, I earned my bachelor's degree at Zhejiang University.

My research work includes:

  • Automated curation for scaling post-training data
  • Test-time scaling
  • LLM pretraining
  • LLMs for coding and math [1] [2]

Previously in academia, I worked on:

selected publications & models

  1. Model
    Seed-1.6-thinking
    ByteDance Seed Team
    Bytedance Seed, 2025
  2. Tech Report
    Seed-Coder: Let the Code Model Curate Data for Itself
    ByteDance Seed Team (Core Contributor)
    Bytedance Technical Report, 2025
  3. Model
    Doubao-1.5-Pro
    ByteDance Seed Team
    Bytedance Seed, 2025
  4. NAACL
    GPT-Fathom: Benchmarking Large Language Models to Decipher the Evolutionary Path towards GPT-4 and Beyond
    Shen Zheng, Yuyu Zhang, Yijie Zhu, and 4 more authors
    Findings of the Association for Computational Linguistics: NAACL 2024, 2024
  5. ACL
    BFS-Prover: Scalable Best-First Tree Search for LLM-based Automatic Theorem Proving
    Ran Xin, Chenguang Xi, Jie Yang, and 6 more authors
    ACL 2025, 2025