ICS Docs Chat Bot
Goal
To create a repeatable solution to enable RAG enabled LLM endpoints within Databricks. This should include a vector store and search endpoint, as well as a model serving endpoint.
As a POC we will create a solution to enable a chat bot on the i2dbdocs site. We will then use this to design a pattern/tools to allow us to create a similar solution for customers as needed.
Steps
-
Pull docs from the Git repo
-
Chunk the files into a table
- langchain?
-
Create a vector store from the document chunks
- Create store using code
-
Create a vector search endpoint
- Create endpoint using code
-
Create a model serving endpoint that uses the vector search
- Add instructions to the system message to instruct the LLM to return a response of "i dont know" if there is no related information in the vector store.
-
Create a notebook to test the model endpoint responses
-
Create a web front end to enable a user to chat with the endpoint
- embedded chat window
Models
- DBRX?
- Llama3
- Qwen
- Embedding
- databricks
Notes
- Limitations on number of deployments
- 100 endpoints / workspace
- 20 indexes per search endpoint