Article preview
Previous posts: [Learning LangChain] 04. RAG Basics; [Learning LangChain] 05. Advanced RAG - Embeddings & Loaders

Advanced RAG: Splitter

CharacterTextSplitter

Having looked closely at loaders, let's see how the splitter affects the way RAG runs. In the earlier examples, we always used CharacterTextSplitter to split the documents:

    import os

    from dotenv import load_dotenv
    from langchain.text_splitter import CharacterTextSplitter
    from langchain_community.document_loaders import TextLoader
    from langchain_openai import ChatOpenAI, OpenAIEmbeddings

    # load environment variables (e.g. API keys) from a .env file
    load_dotenv()

    # set the directory for vector store
    cur_dir = os.getcwd()
    file_name = "Top 20+ RAG Interview Questions.txt"
    file_path = os.path.join(cur_dir, "articles/RAG", file_name)
    vdb_dir = os.path.join(cur_dir, "db", "chroma_db")
    os.makedirs(vdb_dir, exist_ok=True)

    # load the source article as LangChain documents
    loader = TextLoader(file_path)
    docs = loader.load()

    # split into chunks of roughly 500 characters with no overlap
    text_splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=0)
    chunks = text_splitter.split_d
………………………………
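To get a feel for what chunk_size=500 with chunk_overlap=0 means in the snippet above, here is a minimal pure-Python sketch of separator-based splitting. It is not LangChain's implementation: the function name split_by_separator is hypothetical, and the "\n\n" separator and the choice to keep an oversized piece whole are assumptions made to mirror how I understand CharacterTextSplitter to behave by default.

```python
def split_by_separator(text: str, separator: str = "\n\n", chunk_size: int = 500) -> list[str]:
    """Greedily merge separator-delimited pieces into chunks of at most chunk_size characters.

    Hypothetical helper for illustration only; no overlap between chunks.
    """
    # Break the text on the separator and drop empty or whitespace-only pieces.
    pieces = [p for p in text.split(separator) if p.strip()]

    chunks: list[str] = []
    current = ""
    for piece in pieces:
        # Try to append the next piece to the chunk being built.
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size:
            current = candidate
        else:
            # The chunk is full: emit it and start a new one with this piece.
            if current:
                chunks.append(current)
            # Assumption: a single piece longer than chunk_size is kept whole.
            current = piece
    if current:
        chunks.append(current)
    return chunks


# Four "paragraphs" of 300, 150, 400 and 700 characters.
sample = "\n\n".join(["a" * 300, "b" * 150, "c" * 400, "d" * 700])
chunks = split_by_separator(sample, chunk_size=500)
print([len(c) for c in chunks])  # [452, 400, 700]
```

The first two paragraphs fit together under the limit, the third starts a new chunk, and the last one exceeds 500 characters on its own because this kind of splitter only cuts at the separator. That is one way chunk sizes can end up uneven and influence what the retriever later returns.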