xiaoli.cst@gmail.com
Google Scholar
Dinghao Building, Tower B, Beijing, China
NOTE: Our team (Seed LLM) is hiring interns and full-time researchers. If you are interested in LLM pretraining, reasoning, or agents, feel free to contact me!
I am a researcher at ByteDance Seed and a member of the TopSeed Program, where I have been working with Ke Shen since 2024. I received my Ph.D. in 2025 from the Department of Computer Science and Technology at Tsinghua University (THU). I was a member of the TSAIL Group, led by Prof. Bo Zhang and Prof. Jun Zhu, where I worked closely with Prof. Xiaolin Hu and Prof. Bo Zhang. I obtained my Bachelor’s degree from Tsinghua University.
My long-term goal is to build scalable, robust, and generalizable autonomous agents that operate reliably in both the digital and physical worlds and fundamentally relieve humans of tedious labor. In the near term, I focus on building world-class foundation models that create real economic value and measurably improve productivity, rather than models optimized merely for leaderboard performance. I believe accelerating progress toward this goal requires rethinking the current paradigm along the following directions:
- End-to-end system optimization: Re-examining pretraining, post-training, and evaluation from a unified perspective, jointly optimizing for scalability and generalization rather than treating stages independently.
- Predictable scaling ladders: Developing principled, systematic scaling strategies to accelerate model iteration while improving reliability and reducing empirical trial-and-error.
- Diving deeper: Exploring better training paradigms (objectives, optimizers, data strategies, etc.) that fully leverage pretraining data, RL signals, and human supervision while enabling continuous knowledge and skill acquisition.
Current Work & Progress
- Core contributor to the development of flagship foundation models (Seed 1.6, Seed 1.8, and Seed 2.0) as well as open-source models (Seed-OSS).
- Research on the emergence and enhancement of reasoning patterns in pretrained models and end-to-end strategies to extend their capability ceilings.
Current Interns
Huanran Chen (Tsinghua University, LLM pretraining dynamics)
Previous Experiences
- 2024.6 - 2024.9, 01.AI, VLM Pretraining Group
- 2023.5 - 2024.4, Huawei 2012 Lab, Foundation Model Research Group
news
| Apr 13, 2026 | We introduce Nexus, a novel optimizer that achieves better downstream generalization at the same pretraining loss by aligning gradients and encouraging convergence to shared minima. |
|---|---|
| Feb 14, 2026 | We released Seed 2.0. Building on our efforts to improve generalizable reasoning in Seed 1.8, I was responsible for extending these capabilities across domains and modalities. |
| Jan 26, 2026 | We propose the COD framework, which accurately predicts LLM downstream performance before training, achieving a 1.55% average prediction error on a 70B-parameter model. This work has been accepted at ICLR 2026. Read more |
| Dec 18, 2025 | We released Seed 1.8. I was responsible for improving the model’s generalizable reasoning capabilities. |
| Aug 21, 2025 | We released the Seed-OSS model. I was responsible for enhancing the model’s reasoning density, enabling it to maintain longer chains of thought and tackle more challenging problems. |
selected publications
2026
- Nexus: Same Pretraining Loss, Better Downstream Generalization via Common Minima. arXiv preprint arXiv:2604.09258, 2026.
- Unveiling Downstream Performance Scaling of LLMs: A Clustering-Based Perspective. International Conference on Learning Representations (ICLR), 2026.
2025
- ADBM: Adversarial Diffusion Bridge Model for Reliable Adversarial Purification. International Conference on Learning Representations (ICLR), 2025.
2024
- Faster-GCG: Efficient Discrete Optimization Jailbreak Attacks against Aligned Large Language Models. arXiv preprint arXiv:2410.15362, 2024.
2023
- Recognizing Object by Components With Human Prior Knowledge Enhances Adversarial Robustness of Deep Neural Networks. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023.