
Li Xiao (李潇)

ByteDance Seed Team

Member of TopSeed Talent Program

xiaoli.cst@gmail.com
Google Scholar
  Dinghao Building, Tower B, Beijing, China

NOTE: Our team (Seed LLM) is hiring interns and full-time researchers. If you are interested in LLM pretraining, reasoning, or agents, feel free to contact me!


I am a researcher at ByteDance Seed and a member of the TopSeed Talent Program, where I have been working with Ke Shen since 2024. I received my Ph.D. in 2025 from the Department of Computer Science and Technology at Tsinghua University (THU). I was a member of the TSAIL Group, led by Prof. Bo Zhang and Prof. Jun Zhu, where I worked closely with Prof. Xiaolin Hu and Prof. Bo Zhang. I obtained my Bachelor’s degree from Tsinghua University.

My long-term goal is to build scalable, robust, and generalizable autonomous agents that operate reliably in the real world and fundamentally free humans from tedious labor. In the near term, I focus on building world-class multimodal foundation models that create real economic value and measurably improve productivity (not models optimized merely for leaderboard performance). I believe accelerating progress toward this goal requires rethinking the current paradigm along the following directions:

  • End-to-end system optimization: Re-examining pretraining, post-training, and evaluation from a unified perspective, jointly optimizing for scalability and generalization rather than treating stages independently.
  • Predictable scaling ladders: Developing principled, systematic scaling strategies to accelerate model iteration while improving reliability and reducing empirical trial-and-error.
  • Diving deeper: Exploring new training paradigms (objectives, optimizers, data strategies, etc.) that better leverage pretraining data, RL signals, and human supervision while enabling continuous knowledge acquisition.

Current Work & Progress

  • Core contributor to the development of flagship foundation models (Seed 1.6, Seed 1.8, and the upcoming Seed 2.0) as well as open-source models (Seed-OSS).
  • Research on the emergence and enhancement of reasoning patterns in pretrained models, and on end-to-end strategies to extend their capability ceilings (some findings are temporarily confidential due to company policy).

Current Interns

Huanran Chen (Tsinghua University, LLM pretraining dynamics)

news

Jan 26, 2026 We propose the COD framework, which accurately predicts LLM downstream performance before training, achieving a 1.36% average prediction error on a 70B-parameter model. This work has been accepted at ICLR 2026. Read more
Dec 18, 2025 We released Seed 1.8. I was responsible for improving the model’s generalizable reasoning capabilities.
Aug 21, 2025 We released the Seed-OSS model. I was responsible for enhancing the model’s reasoning density, enabling it to maintain longer chains of thought and tackle more challenging problems.
Jun 25, 2025 We released Seed 1.6. I was responsible for the multimodal mixed continual training (MMCT) stage, enabling the model to achieve strong native multimodal capabilities without sacrificing its text performance.
Aug 30, 2024 We propose the Faster-GCG algorithm, a fundamental and efficient discrete optimization approach for jailbreak attacks against large language models. Read more

selected publications

2026

  1. ICLR
    Unveiling Downstream Performance Scaling of LLMs: A Clustering-Based Perspective
    Chengyin Xu, Kaiyuan Chen, Xiao Li, Ke Shen, and Chenggang Li
    International Conference on Learning Representations (ICLR), 2026

2025

  1. Report
    Seed1.8 Model Card: Towards Generalized Real-World Agency
    ByteDance Seed
    Model Card, 2025
  2. Report
    Technical Introduction to the Seed1.6 Model Series
    ByteDance Seed
    Technical Report, 2025
  3. ICLR
    ADBM: Adversarial Diffusion Bridge Model for Reliable Adversarial Purification
    Xiao Li, Wenxuan Sun, Huanran Chen, Qiongxiu Li, Yining Liu, Yingzhe He, Jie Shi, and Xiaolin Hu
    International Conference on Learning Representations (ICLR), 2025

2024

  1. arXiv
    Faster-GCG: Efficient Discrete Optimization Jailbreak Attacks against Aligned Large Language Models
    Xiao Li, Zhuhong Li, Qiongxiu Li, Bingze Lee, Jinghao Cui, and Xiaolin Hu
    arXiv preprint arXiv:2410.15362, 2024

2023

  1. TPAMI
    Recognizing Object by Components With Human Prior Knowledge Enhances Adversarial Robustness of Deep Neural Networks
    Xiao Li, Ziqi Wang, Bo Zhang, Fuchun Sun, and Xiaolin Hu
    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023