2024-06-02 05:15
[CL]《Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training》W Du, T Luo, Z Qiu, Z Huang… [The University of Hong Kong & Hong Kong University of Science and Technology] (2024) web link #MachineLearning# #ArtificialIntelligence# #Papers#