Build Multi-Document QnA App with Streamlit and OpenAI
Table of Contents
- Introduction
- Demonstration of the Application
- Getting Started with Streamlit
- Creating the Main Python File
- Setting up the Requirements
- Connecting the GitHub Repository
- Deploying the Streamlit Application
- Uploading and Managing Documents
- Asking Questions from Uploaded Documents
- Dynamic Updating of Vector DB
- Customizing the Application
- Conclusion
Introduction
In this article, we will discuss how to Create an end-to-end Streamlit application that can communicate with other documents. The application will have the functionality to talk to multiple documents, add or remove documents, and alter the vector DB directly from the application itself. We will be using Streamlit, Lagrange, ChromaDB, and OpenAI to build this application.
Demonstration of the Application
Before diving into the details of creating the application, let's first take a look at a quick demo of what We Are going to build. The demo showcases a file upload box where users can drag and drop documents. Once the documents are uploaded, users can ask questions about the content of the documents. The application will extract answers from the documents and display them along with the corresponding sources. Users can also remove or add documents on the fly, and the vector DB will be dynamically updated accordingly.
Getting Started with Streamlit
To start building the application, we need to first set up Streamlit. Streamlit is a super easy way to build and host Python applications. To get started, You can sign up for Streamlit on their Website. Once signed up and logged in, you will be redirected to a page where you can view and create applications.
Creating the Main Python File
In order to create the application, we need to create a main.py file that contains the Python code for the application. The main file will import all the necessary libraries, define functions to extract text from documents, set the page layout, create file upload functionality, and handle user questions and answers. We will also need to initialize the OpenAI embedding extractor and the vector store.
Setting up the Requirements
Apart from the main Python file, we also need to create a requirements.txt file that lists all the libraries required for our application. This file is essential for the deployment of our application on Streamlit. The requirements.txt file should include all the necessary library dependencies.
Connecting the GitHub Repository
To connect our application with Streamlit, we need to link our GitHub repository. This can be done by providing the repository URL, the branch, and the main file path in the Streamlit settings. Additionally, we also need to provide any secret keys required for our application, such as the OpenAI API Key.
Deploying the Streamlit Application
Once we have set up the GitHub connection and provided the necessary settings, we can deploy our Streamlit application. This process may take some time. Once deployed, we can access the application using the provided URL. The application can be accessed from anywhere in the world using any device.
Uploading and Managing Documents
In the deployed Streamlit application, users can drag and drop multiple documents into the file upload box. The application will keep track of the uploaded documents and allow users to remove or add documents as needed. The vector DB will be dynamically constructed and updated Based on the uploaded documents.
Asking Questions from Uploaded Documents
Users can ask questions about the uploaded documents using the input text box provided in the application. Once a question is asked, the application will leverage the OpenAI GPT model to extract and provide Relevant answers from the uploaded documents. The answers will be displayed along with the sources.
Dynamic Updating of Vector DB
As users modify the uploaded documents by removing or adding new documents, the vector DB will be dynamically updated to reflect these changes. This ensures that the application can provide accurate and up-to-date answers based on the available documents.
Customizing the Application
The Streamlit application provides various customization options. Users can change the appearance of the application by switching between light and dark themes. These settings can be accessed from the application's settings menu.
Conclusion
In this article, we have explored how to create an end-to-end Streamlit application that can communicate with other documents. We have covered the steps involved in setting up Streamlit, creating the main Python file, connecting the GitHub repository, deploying the application, and managing documents and user questions. With the use of Streamlit, Lagrange, ChromaDB, and OpenAI, we have built a flexible and interactive application for document-based question answering.