Article preview
LG - Machine Learning, CV - Computer Vision, CL - Computation and Language, AS - Audio and Speech, RO - Robotics

1. [CV] CROME: Cross-Modal Adapters for Efficient Multimodal LLM
2. [CL] Does Liking Yellow Imply Driving a School Bus? Semantic Leakage in Language Models
3. [CL] LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
4. [CL] Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents
5. [AS] Music2Latent: Consistency Autoencoders for Latent Audio Compression

Summary: efficient multimodal LLMs based on cross-modal adapters; semantic leakage in language models; unleashing 10,000+ word generation from long-context LLMs; integrating the expertise of software engineering agents; latent audio compression based on consistency autoencoders.

1. [CV] CROME: Cross-Modal Adapters for Efficient Multimodal LLM
S Ebrahimi, S O. Arik, T Nama, T Pfister [Google Cloud AI Research]
Key points:
Proposes CROME, a new, lightweight gated cross-modal adapter-based vision
………………………………