xiaoli.cst@gmail.com
Google Scholar
Dinghao Building, Tower B, Beijing, China
NOTE: Our team (Seed LLM) is hiring interns and full-time researchers. If you are interested in LLM pretraining/reasoning/agents, feel free to contact me!
I am a researcher at ByteDance Seed, as a member of the TopSeed Program, working with Ke Shen since 2024. I received my Ph.D. in 2025 from the Department of Computer Science and Technology at Tsinghua University (THU). I was a member of the TSAIL Group, led by Prof. Bo Zhang and Prof. Jun Zhu, where I worked closely with Prof. Xiaolin Hu and Prof. Bo Zhang. I also obtained my Bachelor’s degree from Tsinghua University.
My long-term goal is to build scalable, robust, and generalizable autonomous agents that operate reliably in the real world and fundamentally relieve humans from tedious labor. In the near term, I focus on building world-class multimodal foundation models that create real economic value and measurably improve productivity (not models optimized merely for leaderboard performance). I believe accelerating progress toward this goal requires rethinking the current paradigm along these directions:
- End-to-end system optimization: Re-examining pretraining, post-training, and evaluation from a unified perspective, jointly optimizing for scalability and generalization rather than treating stages independently.
- Predictable scaling ladders: Developing principled, systematic scaling strategies to accelerate model iteration while improving reliability and reducing empirical trial-and-error.
- Diving deeper: Exploring new training paradigms (objectives, optimizers, data strategies, etc.) that better leverage pretraining data, RL signals, and human supervision while enabling continuous knowledge acquisition.
Current Work & Progress
- Core contributor to the development of flagship foundation models (Seed 1.6, Seed 1.8, and the upcoming Seed 2.0) as well as open-source models (Seed-OSS).
- Research on the emergence and enhancement of reasoning patterns in pretrained models, and on end-to-end strategies to extend their capability ceilings (some findings are temporarily confidential due to company policy).
Current Interns
Huanran Chen (Tsinghua University, LLM pretraining dynamics)
news
| Date | News |
|---|---|
| Jan 26, 2026 | We propose the COD framework that accurately predicts LLM downstream performance before training, achieving a 1.36% average prediction error on a 70B-parameter model. This work has been accepted at ICLR 2026. Read more |
| Dec 18, 2025 | We released Seed 1.8. I was responsible for improving the model’s generalizable reasoning capabilities. |
| Aug 21, 2025 | We released the Seed-OSS model. I was responsible for enhancing the model’s reasoning density, enabling it to maintain longer chains of thought and tackle more challenging problems. |
| Jun 25, 2025 | We released Seed 1.6. I was responsible for the multimodal mixed continual training (MMCT) stage, enabling the model to achieve strong native multimodal capabilities without sacrificing its text performance. |
| Aug 30, 2024 | We propose the Faster-GCG algorithm, a fundamental and efficient discrete optimization approach for jailbreak attacks against large language models. Read more |
selected publications
2026
- [ICLR] Unveiling Downstream Performance Scaling of LLMs: A Clustering-Based Perspective. International Conference on Learning Representations (ICLR), 2026.
2025
- [ICLR] ADBM: Adversarial Diffusion Bridge Model for Reliable Adversarial Purification. International Conference on Learning Representations (ICLR), 2025.
2024
- [arXiv] Faster-GCG: Efficient Discrete Optimization Jailbreak Attacks against Aligned Large Language Models. arXiv preprint arXiv:2410.15362, 2024.
2023
- [TPAMI] Recognizing Object by Components With Human Prior Knowledge Enhances Adversarial Robustness of Deep Neural Networks. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023.