Article Preview
Categories: LG - Machine Learning; CV - Computer Vision; CL - Computation and Language; AS - Audio and Speech; RO - Robotics

1. [LG] TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters
2. [LG] Scalable watermarking for identifying large language model outputs
3. [CL] $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources
4. [LG] In-context learning and Occam's razor
5. [LG] Accelerating Direct Preference Optimization with Prefix Sharing

Summary: rethinking Transformer scaling with tokenized model parameters; scalable watermarking for identifying large language model outputs; trade-offs when pre-training with academic resources; in-context learning and Occam's razor; accelerating Direct Preference Optimization with prefix sharing.

1. [LG] TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters
H Wang, Y Fan, M F Naeem, Y Xian... [Max Planck Institute for Informatics & Google]
Key points:
Tokenformer, by incrementally adding key-value parameter pairs,
………………………………
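The key point above is cut off by the preview, but the title names the core idea: model parameters are stored as tokens that input tokens attend to, so the model can be scaled by appending more parameter tokens. Below is a minimal, hypothetical PyTorch sketch of that idea, not the paper's implementation: the class name TokenParameterAttention and the grow method are invented for illustration, plain softmax stands in for the paper's modified normalization, and zero-initializing the new pairs is only an assumption about how growth keeps the pretrained pairs usable.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TokenParameterAttention(nn.Module):
    # Hypothetical sketch: the layer's weights live as learnable key/value
    # "parameter tokens"; input tokens attend over them instead of passing
    # through a fixed linear projection.
    def __init__(self, dim: int, num_param_tokens: int):
        super().__init__()
        self.param_keys = nn.Parameter(torch.randn(num_param_tokens, dim) * 0.02)
        self.param_values = nn.Parameter(torch.randn(num_param_tokens, dim) * 0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, dim); scores: (batch, seq, num_param_tokens)
        scores = x @ self.param_keys.T
        weights = F.softmax(scores / x.shape[-1] ** 0.5, dim=-1)
        return weights @ self.param_values  # (batch, seq, dim)

    @torch.no_grad()
    def grow(self, extra_tokens: int) -> None:
        # Scale the layer by appending new key/value parameter pairs while
        # leaving the already-trained pairs untouched (zero-init is an
        # assumption meant to keep the grown layer close to the original).
        dim = self.param_keys.shape[1]
        new_k = torch.zeros(extra_tokens, dim, device=self.param_keys.device)
        new_v = torch.zeros(extra_tokens, dim, device=self.param_values.device)
        self.param_keys = nn.Parameter(torch.cat([self.param_keys, new_k]))
        self.param_values = nn.Parameter(torch.cat([self.param_values, new_v]))

Usage, under the same assumptions: layer = TokenParameterAttention(dim=512, num_param_tokens=1024) builds the layer, and a later layer.grow(512) enlarges its capacity without discarding or reshaping the existing parameter tokens, which is the scaling property the paper's title refers to.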