Xin Dong
*(Photo: Cambridge, MA, May 2020)*
I am a research scientist at NVIDIA Research. I received my Ph.D. in Computer Science from Harvard University in 2023, advised by H.T. Kung. My research interests lie broadly in deep learning, with a focus on designing accurate, efficient, and trustworthy systems for autonomous machines, LLMs, and GenAI. Prior to Harvard, I was a research assistant at Nanyang Technological University and UC San Diego. I am a recipient of the Harvard James Miller Peirce Fellowship. Email: xind [at] nvidia.com, xindong [at] alumni.harvard.edu
News
- Jan 2025 » Gave a talk at Scale ML + MLSys @ MIT on our Hymba work. Thanks for the invite.
- Dec 2024 » 🏆 Join our Data Filtering Challenge for Edge LLMs and help shape the future of language models. Solve real-world problems, showcase your skills, and win amazing prizes!
- Nov 2024 » We released the first hybrid-head model, Hymba-1.5B (accepted to ICLR 2025 as a spotlight), which outperforms LLaMA 3.2-3B despite being trained on 7× fewer tokens, while achieving a 12× cache reduction. Try it on Hugging Face for your on-device LLM applications (a minimal loading sketch follows this list). 🔥
- Jan 2024 » Serving as Editor, Area Chair, and Reviewer for ICLR, NeurIPS, ACM MM, and Electronics.
- Oct 2022 » Our Additive Power-of-Two (APoT) Quantization (ICLR'20) is now supported by official PyTorch APIs. It is a non-uniform quantization scheme that fits weight distributions well and offers great hardware efficiency. Try it out! (A sketch of the APoT level construction follows this list.)
- Oct 2022 » Our Direct Model Inversion is accepted to BMVC 2022 and featured by MIT Technology Review and SingularityHub. Thanks to collaborators from NVIDIA and Harvard.
- Jul 2022 » Our paper on federated learning is accepted to ECCV 2022. Thanks to collaborators from Harvard and DeepMind.
- Mar 2022 » Our co-organized workshop on Practical Deep Learning in the Wild (PracticalDL-22) at AAAI 2022 is now online!
- Mar 2022 » Two first-author papers are accepted to CVPR 2022. Thanks to collaborators from Meta, DeepMind, Harvard, and UTD.
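
For the Hymba release above, here is a minimal, hedged sketch of loading the model with Hugging Face `transformers`. The repo id `nvidia/Hymba-1.5B-Base` and the `trust_remote_code=True` flag are assumptions; consult the model card on Hugging Face for the exact usage.

```python
# A hedged sketch of loading Hymba-1.5B via Hugging Face transformers.
# NOTE: the repo id "nvidia/Hymba-1.5B-Base" and the need for
# trust_remote_code=True are assumptions; consult the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "nvidia/Hymba-1.5B-Base"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    torch_dtype=torch.bfloat16,  # half precision suits on-device budgets
    trust_remote_code=True,
)

prompt = "Hybrid-head models combine attention and state-space layers to"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```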
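And for the APoT item above, a small sketch of how additive power-of-two levels can be built: each level is a sum of power-of-two terms drawn from disjoint exponent sets, which yields a nonuniform grid that is denser near zero, matching the bell-shaped distribution of trained weights. The `(k, n)` parameterization below is illustrative, not the exact configuration from the ICLR'20 paper.

```python
# A minimal sketch of Additive Power-of-Two (APoT) quantization levels.
# Each level is a sum of k power-of-two terms drawn from disjoint
# exponent sets, giving a nonuniform grid that is denser near zero.
# The (k, n) parameterization is illustrative, not the paper's exact setup.
import itertools

def apot_levels(k=2, n=3):
    """Normalized APoT levels: term j draws from {0} plus {2^-(j + k*i)}."""
    groups = [[0.0] + [2.0 ** -(j + k * i) for i in range(n)] for j in range(k)]
    sums = {sum(combo) for combo in itertools.product(*groups)}
    top = max(sums)
    return sorted(s / top for s in sums)

def quantize(w, levels):
    """Round a (normalized, non-negative) weight to the nearest level."""
    return min(levels, key=lambda lv: abs(lv - w))

levels = apot_levels()
print(levels)                 # denser near zero than a uniform grid
print(quantize(0.3, levels))  # nearest representable value
```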
Experiences
- NVIDIA
- Sony
- Meta Reality Lab
- NVIDIA
- Tencent America
Publications
- A Deeper Look at Depth Pruning of LLMs
  ICML 2024 Workshop on Theoretical Foundations of Foundation Models (ICML Workshop 2024)
- Is Heterogeneity Notorious? Taming Heterogeneity to Handle Test-Time Shift in Federated Learning
  Conference on Neural Information Processing Systems (NeurIPS 2023)
- Training for Multi-resolution Inference Using Reusable Quantization Terms
  The 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2021)
- exBERT: Extending Pre-trained Models with Domain-specific Vocabulary Under Constrained Training Resources
  The 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020)
- Differentiable Dimension Search for Binary Neural Networks
  1st Workshop on Neural Architecture Search at ICLR 2020 (ICLR 2020 Workshop)
- Learning to Prune Deep Neural Networks via Layer-wise Optimal Brain Surgeon
  Thirty-first Conference on Neural Information Processing Systems (NeurIPS 2017)
Academic Services
- Reviewer/Area Chair for ICML, NeurIPS, AAAI, IJCAI, CVPR, ICCV, ECCV, EMNLP, ACL
- Co-organizer of the 1st International Workshop on Practical Deep Learning in the Wild (PracticalDL-22) at AAAI 2022
- Teaching Fellow for Harvard CS242: Computing at Scale