Objective: learn and try out basic text chunking, embeddings, and search against a vector store. I want results that are accurate (no hallucination) and that show the source document.
I wanted to chat with my house, and after a few months of brewing the idea at the back of my mind, I've cooked up a prototype of an AI bot that can answer questions based on instruction manuals for our appliances.
I asked it how, and how often, to care for the little humidifier in the bedroom. The hard truth is that most weeks taking the garbage out can feel like a chore. I ain't washing the humidifier every week, even if the manual tells me to. Garbage in, garbage out.
The code and setup are quite simple. My notebook is on GitHub:
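For a rough feel of the three moving parts (chunking, embeddings, vector search with sources), here is a minimal sketch. It is not the notebook's code. To stay dependency-free it swaps in a toy bag-of-words "embedding" where a real setup would call an embedding model, and a plain Python list where a real setup would use a vector store. The manual filenames, the manual text, and every function and class name are made up for illustration.

```python
import math
import re
from collections import Counter


def chunk_text(text, size=40, overlap=10):
    """Split text into overlapping windows of `size` words."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text):
    """Toy embedding (assumption): word counts instead of a learned vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TinyVectorStore:
    """In-memory stand-in for a vector store; keeps the source with each chunk."""

    def __init__(self):
        self.rows = []  # list of (vector, chunk_text, source_name)

    def add_document(self, source, text, size=40, overlap=10):
        for chunk in chunk_text(text, size, overlap):
            self.rows.append((embed(chunk), chunk, source))

    def search(self, query, k=2):
        q = embed(query)
        scored = [(cosine(q, vec), chunk, source) for vec, chunk, source in self.rows]
        scored.sort(key=lambda row: row[0], reverse=True)
        return [{"score": s, "text": c, "source": src} for s, c, src in scored[:k]]


# Hypothetical manuals, invented for this example.
store = TinyVectorStore()
store.add_document(
    "humidifier_manual.txt",
    "Clean the humidifier water tank once a week with white vinegar. "
    "Rinse the tank and refill with fresh water.",
)
store.add_document(
    "dishwasher_manual.txt",
    "Run the dishwasher filter under warm water every month. "
    "Check the spray arms for blockage.",
)

for hit in store.search("how often should I clean the humidifier tank"):
    print(f"{hit['score']:.2f}  {hit['source']}: {hit['text']}")
```

With these two made-up manuals, the humidifier chunk ranks first for the humidifier question, and every hit carries the filename it came from, which is the "show the source document" part. Returning the retrieved text alongside its source is also what lets you check an answer against the manual instead of trusting the model.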
Results: I'm generally happy with the results, and I'm impressed at how fast and easy this is to set up.
Next steps: I will try this with a much larger set of documents, try alternative models, and build a UI for it.