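LG - Machine Learning, CV - Computer Vision, CL - Computation and Language, AS - Audio and Speech, RO - Robotics

1. [LG] What Do Learning Dynamics Reveal About Generalization in LLM Reasoning?
2. [LG] LAUREL: Learned Augmented Residual Layer
3. [CL] Controllable Context Sensitivity and the Knob Behind It
4. [LG] Model Stealing for Any Low-Rank Language Model
5. [CL] Language Models as Causal Effect Generators

Summary: the link between learning dynamics and generalization in LLM reasoning; learned augmented residual layers; controllable context sensitivity and the "knob" mechanism behind it; model stealing for any low-rank language model; language models as causal effect generators

1. [LG] What Do Learning Dynamics Reveal About Generalization in LLM Reasoning?
K Kang, A Setlur, D Ghosh, J Steinhardt… [UC Berkeley & CMU]
The link between learning dynamics and generalization in LLM reasoning
Key points: The authors find that a model's pre-memorization train accuracy, that is, its training accuracy before it memorizes the exact reasoning steps in the training data, is a strong predictor of its test accuracy, far exceeding…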