Jue Wang
Hello, I am currently a postdoc/research scientist at Together Computer, under the guidance of Prof. Ce Zhang. Before that, I received my Ph.D. from Zhejiang University, advised by Prof. Lidan Shou.
My current research interests lie in Distributed Systems and Effective and Efficient Algorithms for NLP (both training and inference). I am also interested in NLP in a broad sense. If you want to get in touch, please send me an email.
Updates
- Sep 2023: We had a paper accepted to NeurIPS.
- Aug 2023: LLaMA-7B-32K and LLaMA-7B-32K-Instruct have been released.
- Jun 2023: RedPajama-7B-v1 has been released.
- Apr 2023: We got two papers accepted to ICML 2023!
- Mar 2023: OpenChatKit has been released, cheers!
- Nov 2022: Check out our demo of GPT-JT!
- Nov 2022: We had a paper accepted to AAAI 2023. Congratulations to the collaborators!
- Nov 2022: Check out our benchmark on LLMs!
- Sep 2022: We had one paper accepted to NeurIPS 2022. Congratulations and thanks to all the collaborators!
- Apr 2022: We got a paper accepted to IJCAI 2022.
- Mar 2022: I visited ETH Zurich.
- Feb 2022: As the first author, I had one long paper accepted to ACL 2022.
- Jun 2021: I graduated from CentraleSupélec with a diplôme d’Ingénieur (master's degree), cheers!
- Dec 2020: As the first author, I had one long paper accepted to AAAI 2021.
- Sep 2020: As the first author, I had one long paper accepted to EMNLP 2020.
- Apr 2020: As the first author, I had one long paper accepted to ACL 2020.
Education
- Zhejiang University, PhD in Computer Science, Sep 2018 - Jun 2023
- Université Paris Saclay (CentraleSupélec), Master (Engineer) in General Engineering, Sep 2016 - Jun 2018
- Zhejiang University, Bachelor in Electrical Engineering, Sep 2014 - Jun 2018
Manuscripts
- Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt
Zhaozhuo Xu, Zirui Liu, Beidi Chen, Yuxin Tang, Jue Wang, Kaixiong Zhou, Xia Hu, Anshumali Shrivastava
ArXiv preprint.
[Paper]
Publications
Skill-it! A Data-Driven Skills Framework for Understanding and Training Language Models
Mayee F. Chen, Nicholas Roberts, Kush Bhatia, Jue Wang, Ce Zhang, Frederic Sala, Christopher Ré
To appear at NeurIPS 2023.
[Paper]
CocktailSGD: Fine-tuning Foundation Models over 500Mbps Networks
Jue Wang$^{*}$, Yucheng Lu$^{*}$, Binhang Yuan, Beidi Chen, Percy Liang, Christopher De Sa, Christopher Ré, Ce Zhang.
In Proc. of ICML 2023.
[Paper] [Code]
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
Zichang Liu, Jue Wang, Tri Dao, Tianyi Zhou, Binhang Yuan, Zhao Song, Anshumali Shrivastava, Ce Zhang, Yuandong Tian, Christopher Ré, Beidi Chen.
In Proc. of ICML 2023.
[Paper] [Code]
Holistic Evaluation of Language Models
Percy Liang, Rishi Bommasani, Tony Lee, Dimitris Tsipras, Dilara Soylu, Michihiro Yasunaga, Yian Zhang, Deepak Narayanan, Yuhuai Wu, Ananya Kumar, Benjamin Newman, Binhang Yuan, Bobby Yan, Ce Zhang, Christian Cosgrove, Christopher D. Manning, Christopher Ré, Diana Acosta-Navas, Drew A. Hudson, Eric Zelikman, Esin Durmus, Faisal Ladhak, Frieda Rong, Hongyu Ren, Huaxiu Yao, Jue Wang, Keshav Santhanam, Laurel Orr, Lucia Zheng, Mert Yuksekgonul, Mirac Suzgun, Nathan Kim, Neel Guha, Niladri Chatterji, Omar Khattab, Peter Henderson, Qian Huang, Ryan Chi, Sang Michael Xie, Shibani Santurkar, Surya Ganguli, Tatsunori Hashimoto, Thomas Icard, Tianyi Zhang, Vishrav Chaudhary, William Wang, Xuechen Li, Yifan Mai, Yuhui Zhang, Yuta Koreeda.
TMLR.
[Paper] [Code]
Effective Continual Learning for Text Classification with Lightweight Snapshots
Jue Wang$^{*}$, Dajie Dong$^{*}$, Lidan Shou, Ke Chen, Gang Chen.
In Proc. of AAAI 2023.
[Paper]
Fine-tuning Language Models over Slow Networks using Activation Compression with Guarantees
Jue Wang$^{*}$, Binhang Yuan$^{*}$, Luka Rimanic$^{*}$, Yongjun He, Tri Dao, Beidi Chen, Christopher Ré, Ce Zhang.
In Proc. of NeurIPS 2022.
[Paper] [Code]
SkipBERT: Efficient Inference with Shallow Layer Skipping
Jue Wang, Ke Chen, Gang Chen, Lidan Shou, and Julian McAuley.
In Proc. of ACL 2022.
[Paper] [Code]
Continual Federated Learning Based on Knowledge Distillation
In Proc. of IJCAI 2022.
[Paper]
Effective Slot Filling via Weakly-Supervised Dual-Model Learning
Jue Wang, Ke Chen, Lidan Shou, Sai Wu, and Gang Chen.
In Proc. of AAAI 2021.
[Paper] [Code] [Video]
Two are Better than One: Joint Entity and Relation Extraction with Table-Sequence Encoders
Jue Wang and Wei Lu.
In Proc. of EMNLP 2020.
[Paper] [Code] [Video]
Pyramid: A Layered Model for Nested Named Entity Recognition
Jue Wang, Lidan Shou, Ke Chen, and Gang Chen.
In Proc. of ACL 2020.
[Paper] [Code] [Video]
Contact
College of Computer Science and Technology, Zhejiang University
38 Zheda Rd, Xihu Qu, Hangzhou, Zhejiang, 310027
Email: [email protected]