Retrieval-augmented generation (RAG) is often used to develop customized AI applications, including chatbots, recommendation systems and other personalized tools. It combines the strengths of vector databases and large language models (LLMs): the database retrieves relevant context, and the LLM uses that context to generate high-quality results.
Selecting the right LLM for a RAG application is important and requires weighing factors like cost, privacy and scalability. Commercial LLMs like OpenAI’s GPT-4 and Google’s Gemini are effective but can be expensive and raise data privacy concerns. Some users prefer open source LLMs for their flexibility and cost savings, but these models require substantial resources for fine-tuning and deployment, including GPUs and specialized infrastructure. Managing model updates and scalability can also be challenging with local setups.
A better solution is to select an open source LLM and deploy it on the cloud. This approach provides the necessary computational power and scalability without the high costs and complexities of local hosting. It not only saves on initial infrastructure costs but also minimizes maintenance concerns.
Let’s use this approach to develop an application with a cloud-hosted open source LLM and a scalable vector database.
Several tools are required to develop this RAG-based AI application. These include BentoML for deploying the models, MyScaleDB for vector storage and LangChain for loading and splitting the source data.
In this tutorial, we will extract data from Wikipedia using LangChain’s WikipediaLoader module and build a RAG application on that data.
Set up your environment to use BentoML, MyScaleDB and LangChain by opening your terminal and installing the required Python packages with pip.
Begin by importing the WikipediaLoader from the langchain_community.document_loaders.wikipedia module. You’ll use this loader to fetch documents related to “Albert Einstein” from Wikipedia.
Call the loader’s load method to retrieve the “Albert Einstein” documents, then use the print function to display the contents of the first document and verify the loaded data.
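A minimal sketch of this loading step might look like the following. It assumes the langchain-community and wikipedia packages are installed. The helper name load_pages and its loader_cls parameter are hypothetical, added only so the function can be tried offline with a stand-in loader.

```python
def load_pages(query, loader_cls=None):
    """Fetch Wikipedia pages for `query` as LangChain documents."""
    if loader_cls is None:
        # Imported lazily so this file can be loaded without LangChain installed.
        from langchain_community.document_loaders.wikipedia import WikipediaLoader
        loader_cls = WikipediaLoader
    loader = loader_cls(query=query)
    return loader.load()


# Usage (requires network access):
#   pages = load_pages("Albert Einstein")
#   print(pages[0].page_content)  # inspect the first document
```

Each returned document exposes its text through the page_content attribute, which is what the later steps work with.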
Import the CharacterTextSplitter from langchain_text_splitters, join the contents of all pages into a single string, and then split the text into manageable chunks.
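The splitting step could be sketched as follows, assuming the langchain-text-splitters package is installed. The chunk_size and chunk_overlap values are illustrative assumptions, not values prescribed by this tutorial, and the splitter parameter is a hypothetical hook for supplying a different splitter.

```python
def split_pages(pages, chunk_size=400, chunk_overlap=100, splitter=None):
    """Join the text of all pages and split it into chunks."""
    # Combine every page into one string before splitting.
    text = " ".join(page.page_content for page in pages)
    if splitter is None:
        # Imported lazily; requires the langchain-text-splitters package.
        from langchain_text_splitters import CharacterTextSplitter
        splitter = CharacterTextSplitter(
            chunk_size=chunk_size,
            chunk_overlap=chunk_overlap,
        )
    return splitter.split_text(text)


# Usage: splits = split_pages(pages)
```

Overlapping chunks repeat a little text at each boundary, which helps keep sentences that straddle two chunks retrievable.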
Your data is ready, and the next step is to deploy the models on BentoML and use them in your RAG application. Deploy the LLM first. You’ll need a free BentoML account, and you can sign up for one on BentoCloud if needed. Next, navigate to the Deployments section and click the Create Deployment button in the top-right corner. A new page will open.
Select the bentoml/bentovllm-llama3-8b-instruct-service model from the drop-down menu and click “Submit” in the bottom-right corner. This starts the deployment, and a new page will open.
The deployment can take some time. Once it is deployed, copy the endpoint.
Note: BentoML’s free tier only allows the deployment of a single model. If you have a paid plan and can deploy more than one model, follow the steps below. If not, don’t worry — we will use an open source model locally for embeddings.
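Deploying the embedding model follows the same steps you took to deploy the LLM: create a new deployment, select an embedding model from the drop-down menu, submit it and copy the endpoint once the deployment is running.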
Next, go to the API Tokens page and generate a new API key. Now you are ready to use the deployed models in your RAG application.
You will define a function called get_embeddings to generate embeddings for the provided text. This function takes three arguments: the text to embed, the BentoML endpoint and the BentoML API token. If the endpoint and API token are provided, the function uses BentoML’s embedding service; otherwise, it uses the local transformers and torch libraries to load the sentence-transformers/all-MiniLM-L6-v2 model and generate embeddings.
This setup allows flexibility for free-tier BentoML users, who can deploy only one model at a time. If you have a paid version of BentoML and can deploy two models, you can pass the BentoML endpoint and Bento API token to use the deployed embedding model.
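One way get_embeddings might be written is sketched below. The remote branch assumes the deployed embedding service exposes an encode method that accepts a sentences argument; check your own deployment’s API, since that name is an assumption. The local branch uses attention-mask-aware mean pooling, a common way to obtain sentence embeddings from this model.

```python
def get_embeddings(texts, endpoint=None, token=None):
    """Return one embedding (a list of floats) per input text."""
    if endpoint and token:
        import bentoml  # remote path: embedding service hosted on BentoCloud
        client = bentoml.SyncHTTPClient(endpoint, token=token)
        # ASSUMPTION: the service exposes `encode(sentences=...)`.
        result = client.encode(sentences=texts)
        return result.tolist() if hasattr(result, "tolist") else list(result)

    # Local fallback: requires the transformers and torch packages.
    import torch
    from transformers import AutoModel, AutoTokenizer

    name = "sentence-transformers/all-MiniLM-L6-v2"
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModel.from_pretrained(name)
    inputs = tokenizer(
        texts, padding=True, truncation=True, max_length=512, return_tensors="pt"
    )
    with torch.no_grad():
        outputs = model(**inputs)
    # Average the token vectors, ignoring padding positions.
    mask = inputs["attention_mask"].unsqueeze(-1).float()
    summed = (outputs.last_hidden_state * mask).sum(dim=1)
    counts = mask.sum(dim=1).clamp(min=1e-9)
    return (summed / counts).tolist()
```

This sketch reloads the tokenizer and model on every local call, which keeps it short but is slow; caching them would be a natural refinement.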
Iterate over the text chunks (splits) in batches of 25 to generate embeddings using the get_embeddings function defined above.
This prevents overloading the embedding model with too much data at once, which can be particularly useful for managing memory and computational resources.
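The batching loop can be sketched in a few lines. Here embed_fn stands in for the get_embeddings function described above, and embed_in_batches is a hypothetical helper name.

```python
def embed_in_batches(texts, embed_fn, batch_size=25):
    """Embed `texts` batch by batch, preserving the original order."""
    all_embeddings = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        # Each call only sees `batch_size` texts at most.
        all_embeddings.extend(embed_fn(batch))
    return all_embeddings


# Usage: all_embeddings = embed_in_batches(splits, get_embeddings)
```

Because results are appended in order, the embedding at position i always belongs to the text chunk at position i.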
Now, create a pandas DataFrame to store the text chunks and their corresponding embeddings.
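A sketch of this step, assuming pandas is installed, is shown below. The helper names are hypothetical, and the id column is an addition made here so the rows line up with a table that has an id field.

```python
def build_columns(splits, embeddings):
    """Pair each text chunk with its embedding, column by column."""
    if len(splits) != len(embeddings):
        raise ValueError("each text chunk needs exactly one embedding")
    return {
        "id": list(range(len(splits))),
        "page_content": list(splits),
        "embeddings": list(embeddings),
    }


def to_dataframe(splits, embeddings):
    import pandas as pd  # imported lazily; requires pandas
    return pd.DataFrame(build_columns(splits, embeddings))


# Usage: df = to_dataframe(splits, all_embeddings)
```

The length check catches the case where a batch failed silently and left the two lists misaligned.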
The knowledge base is complete, and now it’s time to save the data to the vector database. This demo uses MyScaleDB for vector storage. Start a MyScaleDB cluster in a cloud environment by following the quickstart guide. Then you can establish a connection to the MyScaleDB database using the clickhouse_connect library.
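Connecting with clickhouse_connect might look like the sketch below. The environment variable names are hypothetical; substitute the host, username and password shown in your cluster’s connection details. Port 443 is assumed here as the default for a cloud-hosted cluster.

```python
import os


def connect_myscale():
    """Open a client connection using credentials from the environment."""
    import clickhouse_connect  # imported lazily; requires clickhouse-connect

    return clickhouse_connect.get_client(
        host=os.environ["MYSCALE_HOST"],
        port=int(os.environ.get("MYSCALE_PORT", "443")),
        username=os.environ["MYSCALE_USER"],
        password=os.environ["MYSCALE_PASSWORD"],
    )


# Usage: client = connect_myscale()
```

Reading credentials from the environment keeps them out of notebooks and version control.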
Create a table in MyScaleDB to store the text chunks and embeddings. The table schema includes an id, the page_content and the embeddings.
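The table creation and data load could be sketched as follows. The column types and the length constraint are assumptions chosen to match the 384-dimensional output of all-MiniLM-L6-v2; adjust the dimension if you use a different embedding model. The client argument is expected to be a clickhouse_connect client.

```python
EMBEDDING_DIM = 384  # output size of all-MiniLM-L6-v2

CREATE_TABLE_SQL = f"""
CREATE TABLE IF NOT EXISTS default.RAG (
    id UInt64,
    page_content String,
    embeddings Array(Float32),
    CONSTRAINT check_data_length CHECK length(embeddings) = {EMBEDDING_DIM}
) ENGINE = MergeTree()
ORDER BY id
"""


def create_and_fill_table(client, ids, texts, embeddings):
    """Create the RAG table if needed, then insert one row per chunk."""
    client.command(CREATE_TABLE_SQL)
    rows = [list(row) for row in zip(ids, texts, embeddings)]
    client.insert(
        "RAG",
        rows,
        column_names=["id", "page_content", "embeddings"],
        database="default",
    )
    return len(rows)
```

The constraint rejects any row whose vector has the wrong length, which surfaces embedding-model mismatches at insert time rather than at query time.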
The next step is to add a vector index to the embeddings column in the RAG table. The vector index allows for efficient similarity searches, which are essential for retrieval-augmented generation tasks.
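Adding the index is a single SQL statement. The sketch below assumes MyScaleDB’s ALTER TABLE ... ADD VECTOR INDEX syntax with the MSTG index type and a cosine metric; the index name vector_index is hypothetical, so confirm the exact options against the MyScaleDB documentation for your version.

```python
def add_vector_index(client, table="default.RAG", column="embeddings",
                     index_name="vector_index"):
    """Build an MSTG vector index on the embeddings column."""
    sql = (
        f"ALTER TABLE {table} "
        f"ADD VECTOR INDEX {index_name} {column} "
        "TYPE MSTG('metric_type=Cosine')"  # ASSUMPTION: cosine metric
    )
    client.command(sql)
    return sql


# Usage: add_vector_index(client)
```

Index construction runs in the background on the server, so the first queries after this call may be slower until the build finishes.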
Define a function to retrieve relevant documents based on a user query. The query embeddings are generated using the get_embeddings function, and a SQL vector query is executed to find the closest matches in the database.
Note: The distance function takes an embedding column and the embedding vector of the user query to find similar documents by applying cosine similarity.
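A retrieval function along these lines is sketched below. The client argument is a clickhouse_connect client, embed_fn stands in for get_embeddings, and the top_k and table parameters are hypothetical conveniences.

```python
def get_relevant_docs(client, user_query, embed_fn, top_k=3, table="default.RAG"):
    """Return the text of the `top_k` chunks closest to the query."""
    # Embed the query the same way the stored chunks were embedded.
    query_embedding = embed_fn([user_query])[0]
    vector = "[" + ", ".join(repr(float(x)) for x in query_embedding) + "]"
    sql = (
        f"SELECT page_content, distance(embeddings, {vector}) AS dist "
        f"FROM {table} ORDER BY dist ASC LIMIT {int(top_k)}"
    )
    result = client.query(sql)
    return [row["page_content"] for row in result.named_results()]


# Usage: docs = get_relevant_docs(client, "Who is Albert Einstein?", get_embeddings)
```

Sorting by ascending distance puts the most similar chunks first, and the vector is built only from numbers, so no user text is interpolated into the SQL.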
Establish a connection to your hosted LLM on BentoML. The llm_client object will be used to interact with the LLM for generating responses based on the retrieved documents.
Define a function to perform RAG. The function takes a user question and the retrieved context as input. It constructs a prompt for the LLM, instructing it to answer the question based on the provided context. The response from the LLM is then returned as the answer.
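The function could be sketched as follows. It assumes the deployed LLM service exposes a generate method that accepts prompt and max_tokens and streams back text fragments; that interface is an assumption about the deployment, so adjust it to match your service. Passing llm_client as a parameter is a choice made here to keep the sketch self-contained.

```python
def build_prompt(question, context):
    """Combine the retrieved context and the user question into one prompt."""
    return (
        "You are a helpful assistant. Answer the user's question based only "
        f"on the following context:\n{context}\n\n"
        f"The user's question is: {question}"
    )


def dorag(question, context, llm_client, max_tokens=1024):
    """Ask the hosted LLM to answer `question` from `context`."""
    # ASSUMPTION: `generate` yields the response as a stream of strings.
    fragments = llm_client.generate(
        prompt=build_prompt(question, context),
        max_tokens=max_tokens,
    )
    return "".join(fragments)


# Usage (llm_client created with bentoml.SyncHTTPClient(endpoint, token=token)):
#   answer = dorag("Who is Albert Einstein?", "\n".join(docs), llm_client)
```

Telling the model to answer only from the supplied context is what keeps the response grounded in the retrieved documents rather than in the model’s own memory.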
Finally, test the RAG application by making a query. Ask the question “Who is Albert Einstein?” and use the dorag function to get the answer based on the relevant documents retrieved earlier.
If you ask the RAG model about Albert Einstein’s death, the response should look like this:
BentoML stands out as an excellent platform for deploying machine learning models, including LLMs, without the hassle of managing resources. With BentoML, you can quickly deploy and scale your AI applications on the cloud, ensuring they are production-ready and highly accessible. Its simplicity and flexibility make it an ideal choice for developers, enabling them to focus more on innovation and less on deployment complexities.
MyScaleDB, for its part, is built specifically for RAG applications and offers a high-performance SQL vector database. Its familiar SQL syntax keeps the learning curve minimal, making it easy for developers to integrate MyScaleDB into their applications. MyScaleDB’s Multi-Scale Tree Graph (MSTG) algorithm is designed to deliver faster and more accurate vector search than competing approaches. Additionally, MyScaleDB offers each new user free storage for up to 5 million vectors, making it a desirable option for developers looking to implement efficient and scalable AI solutions.
What do you think about this project? Share your thoughts on Twitter and Discord.
More here:
Develop a Cloud-Hosted RAG App With an Open Source LLM - The New Stack