Beyond SOTA! A key component for RAG! OCR 2.0 is here

GitHubStore · WeChat Official Account · 2024-09-24 09:47

Article preview

Project overview

GOT, a 580M-parameter end-to-end OCR model built on GenAI and a multimodal LLM, has been officially released. It handles complex tasks such as tables, formulas, and even geometric shapes, with a reported BLEU score of 0.972, and it supports Arxiv formula extraction, multi-page OCR, and recognition at 1024*1024 resolution.

Installation

Our environment is CUDA 11.8 + torch 2.0.1.

Clone this repository and navigate to the GOT folder:

    git clone https://github.com/Ucas-HaoranWei/GOT-OCR2.0.git
    cd 'the GOT folder'

Install the package:

    conda create -n got python=3.10 -y
    conda activate got
    pip install -e .

Install Flash-Attention:

    pip install ninja
    pip install flash-attn --no-build-isolation

Demo

Plain-text OCR:

    python3 GOT/demo/run_ocr_2.0.py --model-name /GOT_weights/ --image-file /an/image/file.png --type ocr

Formatted-text OCR:

    python3 GOT/demo/run_ocr_2.0.py --model-name /GOT_weights/ --image-file /an/image/file.png --type format

Fine-grained OCR:

    python3 GOT/demo/run_ocr_2.0.py --model-name /GOT_weights/ --image-file /an/image/file ………………………………
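The two demo commands that appear in full above (plain-text and formatted-text OCR) differ only in the value passed to --type. As a minimal sketch, the helper below assembles that command line for either mode so it can be reused in a batch script. The function name build_demo_command and the KNOWN_TYPES tuple are hypothetical and not part of the GOT repository; the script path and the three flags are copied from the commands in the article. The fine-grained mode is left out because its command is truncated in the preview.

```python
import shlex

# Script path and flag names are copied from the demo commands in the article.
DEMO_SCRIPT = "GOT/demo/run_ocr_2.0.py"

# Only the two --type values that appear in full in the article preview.
KNOWN_TYPES = ("ocr", "format")


def build_demo_command(model_dir, image_file, ocr_type="ocr"):
    """Return the argv list for one demo run (hypothetical helper, not a GOT API)."""
    if ocr_type not in KNOWN_TYPES:
        raise ValueError(
            f"unsupported --type {ocr_type!r}; expected one of {KNOWN_TYPES}"
        )
    return [
        "python3", DEMO_SCRIPT,
        "--model-name", model_dir,
        "--image-file", image_file,
        "--type", ocr_type,
    ]


if __name__ == "__main__":
    # Print the command line for each mode; the paths are the article's placeholders.
    for mode in KNOWN_TYPES:
        argv = build_demo_command("/GOT_weights/", "/an/image/file.png", mode)
        print(shlex.join(argv))
```

The returned list can be passed directly to subprocess.run from inside the cloned repository once the environment described above is installed; the sketch itself only builds the arguments and does not invoke the model.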

