URL Retriever

Left to their training data alone, large language models can give vague, outdated, or simply false answers. To keep answers up to date and relevant, generative models must be pointed to the right knowledge.

To do this, FlowHunt uses the Retrieval-Augmented Generation (RAG) approach, which supplies generative models with user-specified knowledge sources. You can apply this method using the Retriever components, including the URL Retriever. This way, your Chatbots and AI Tools will have constant access to relevant information.

What is the URL Retriever?

This component allows the flow to retrieve knowledge from a specific URL or a set of URLs. It’s great for linking external knowledge to Chatbots and grabbing real-time information from frequently updated pages.

Enabling your Flow to analyze the content of various URLs also opens the door to creating a wide array of content, SEO, and productivity tools.

FlowHunt's URL retriever component

Advanced Settings

  • Load from pointer: Only load the exact part of the article related to the query. If unchecked, the whole article gets loaded.
  • Skip the last Heading: Leave out the last heading section of the article. This section is often the conclusion, the FAQ, or even the author bio, which is usually unnecessary and may even harm the results.
  • Use Metadata: You can control whether the bot should grab metadata, such as pictures or product listings. Without it, only plain text gets grabbed. By clicking the “Select…” button, you can choose which metadata you want to include.
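
To make the Skip the last Heading option more concrete, here is a minimal Python sketch of the idea. It assumes the article has already been converted to text with Markdown-style headings (lines starting with "#"). The helper name skip_last_heading is hypothetical, and this is not FlowHunt's actual implementation, only an illustration of the behavior described above.

```python
import re

def skip_last_heading(article: str) -> str:
    """Drop the final heading and everything below it (hypothetical helper)."""
    # Find the start offset of every Markdown-style heading line.
    starts = [m.start() for m in re.finditer(r"^#{1,6} .*$", article, flags=re.MULTILINE)]
    if len(starts) < 2:
        # Zero or one heading: there is nothing sensible to cut, keep it whole.
        return article
    # Keep everything before the last heading.
    return article[: starts[-1]].rstrip()

article = "# Guide\nIntro text.\n## Steps\nDo this.\n## FAQ\nQ and A."
print(skip_last_heading(article))
```

In this example the "FAQ" section is removed, while the introduction and the "Steps" section are kept.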

Max Tokens

AI Tokens are like digital credits you use to access and run different AI-powered features or tasks. They allow for the retrieval and processing of information, as well as generating the answers. This setting limits the total number of tokens used to perform these tasks, ensuring the process doesn’t become too expensive or take too long.

Strategy

The bot might crawl many documents to create the final output. The Strategy setting allows you to control how it utilizes these documents while staying within the token limit.

Currently, there are two possible strategies:

  • Include equal size from each document: Utilize all found documents equally.
  • Concat documents, fill from first up to tokens limit: Link the documents together while prioritizing the information by relevance to the query.
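
The difference between the two strategies is easier to see in code. The Python sketch below is an illustration under stated assumptions: tokens are approximated by whitespace-separated words, the documents are assumed to arrive ordered by relevance, and the function names equal_size and concat_fill are made up for this example. FlowHunt's real tokenizer and ranking are not shown here.

```python
def equal_size(docs: list[str], max_tokens: int) -> list[str]:
    """Give every document the same share of the token budget."""
    if not docs:
        return []
    share = max_tokens // len(docs)
    # Each document is cut to its share; unused share is not redistributed.
    return [" ".join(doc.split()[:share]) for doc in docs]

def concat_fill(docs: list[str], max_tokens: int) -> list[str]:
    """Take documents in order (assumed most relevant first) until the budget is used up."""
    out, remaining = [], max_tokens
    for doc in docs:
        if remaining <= 0:
            break
        words = doc.split()[:remaining]
        out.append(" ".join(words))
        remaining -= len(words)
    return out

docs = ["a b c d e f", "g h i j", "k l"]
print(equal_size(docs, 8))   # ['a b', 'g h', 'k l']
print(concat_fill(docs, 8))  # ['a b c d e f', 'g h']
```

With a budget of 8 tokens, the first strategy keeps a little of every document, while the second keeps the first document whole and only part of the next one.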

How to connect the URL Retriever to your Flow

The URL Retriever offers a variety of input and output options that cover most use cases.

Input

  • Text URLs – Input URLs as plain text by connecting Chat Input or a similar component.
  • URL Records – Connect a component that outputs URLs, such as the GoogleSearch component.
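
As a rough illustration of the Text URLs input, the Python sketch below pulls URLs out of a plain-text chat message. The regular expression, the trailing-punctuation cleanup, and the extract_urls name are assumptions made for this example. They are not FlowHunt's actual parsing logic.

```python
import re
from urllib.parse import urlparse

# Simple pattern: anything starting with http:// or https:// up to whitespace or a quote.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")

def extract_urls(message: str) -> list[str]:
    """Return the unique http(s) URLs found in a chat message, in order."""
    urls = []
    for match in URL_PATTERN.findall(message):
        # Remove punctuation that usually belongs to the sentence, not the URL.
        candidate = match.rstrip(".,;:!?)")
        parsed = urlparse(candidate)
        if parsed.scheme in ("http", "https") and parsed.netloc and candidate not in urls:
            urls.append(candidate)
    return urls

print(extract_urls("Summarize https://example.com/post, then https://example.org/a?b=1."))
```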

Output

  • Documents: Outputs the content as plain text for chat answers or further processing.
  • Raw Documents: Outputs documents along with the media and metadata. It is used for components that need the data, such as Widgets.
  • Documents as Tool: Turns the retrieved document into a tool that autonomous Agents can use.

How to use the URL Retriever

The multiple input and output options make the URL Retriever one of the most versatile components available. Let’s keep this example simple and look at one specific use of the URL Retriever: content tools.

We’ll build a simple Flow that can serve as many different SEO and content tools just by changing the Prompt message. For example, you can use it to turn YouTube videos into articles or to summarize articles, and all you have to do is send a URL in chat.

For our example, we’ll create a Google Ads Generator. The Flow will analyze your articles for key points and use ChatGPT-4o to create catchy and effective Google Ads according to our instructions:

  1. Drag in the Chat Input and the URL Retriever.
  2. Connect Chat Input to the Text URLs input option. This allows you to simply input URLs instead of sending the whole text.
  3. Add the Prompt. You want the Flow to use the URL as a source of information. In other words, the context. Connect the URL Retriever as the Prompt’s Context.
  4. The Flow needs to understand what to do with the input. That’s where the Prompt template message comes in:

“You are a professional Google ads Ad copywriter.
Analyze input document representing URL {input} and generate 10 versions of Ad for Google Ads.
Ad text should have the same language as the analyzed document. Titles must be up to 30 characters long. Descriptions must be up to 90 characters long.

— DOCUMENT START
{context}
— DOCUMENT END

10 versions of google ads ad:”

  5. Now you just add the AI Generator with built-in ChatGPT-4o and send it all to Chat Output.
  6. This is what the resulting Flow should look like:
Flow using URL Retriever component
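
The Prompt template from step 4 uses two placeholders: {input} for the URL sent in chat and {context} for the text the URL Retriever returns. The Python sketch below shows how such a template could be filled in and how a generated ad could be checked against the length limits stated in the prompt. It is only an illustration: FlowHunt does the substitution for you inside the Prompt component, the function names build_prompt and within_limits are hypothetical, and the delimiter lines are written with ASCII hyphens here.

```python
# The template from step 4, with plain ASCII hyphens for the delimiter lines.
PROMPT_TEMPLATE = """You are a professional Google ads Ad copywriter.
Analyze input document representing URL {input} and generate 10 versions of Ad for Google Ads.
Ad text should have the same language as the analyzed document. Titles must be up to 30 characters long. Descriptions must be up to 90 characters long.

--- DOCUMENT START
{context}
--- DOCUMENT END

10 versions of google ads ad:"""

# Limits taken from the prompt text above.
MAX_TITLE_CHARS = 30
MAX_DESCRIPTION_CHARS = 90

def build_prompt(url: str, document: str) -> str:
    """Fill {input} with the URL and {context} with the retrieved text."""
    return PROMPT_TEMPLATE.format(input=url, context=document)

def within_limits(title: str, description: str) -> bool:
    """Check one generated ad against the length limits stated in the prompt."""
    return len(title) <= MAX_TITLE_CHARS and len(description) <= MAX_DESCRIPTION_CHARS

prompt = build_prompt("https://example.com/article", "Retrieved article text goes here.")
print(prompt)
print(within_limits("Build AI Chatbots Fast", "Create no-code AI flows in minutes."))
```

A check like within_limits can be handy when you review the generated ads, since language models do not always respect character limits exactly.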

Let’s test it out. Input an article URL. The Flow will crawl the URL to understand the key points. Then it will have ChatGPT-4o generate 10 Google Ads according to our instructions:

FlowHunt's chatbot answers using URL Retriever component
