Our Guide to Browserbear Data Extraction (Data Types & Examples)

Browserbear powers a wide range of web scraping possibilities, with data types including text, links, images, and tables. Learn about our data extraction actions and how to use them in this informative guide.
by Julianne Youngberg · April 2024

Contents

    Browser automation tools such as Browserbear can perform practically any action taken when manually browsing a website—clicking, loading URLs, and even filling out forms. While there are so many use cases for browser automation, it is perhaps most often used for data extraction.

    You can use browser automation tools to capture data for various uses, such as:

    When choosing a web scraping tool, it’s important to consider the data types you’re working with. Selecting something that can extract a wide variety of data types allows you to work with many websites and use cases.

    Screenshot of Browserbear web scraping landing page

    Browserbear powers a wide range of data extraction possibilities, even integrating AI to detect and extract data matching your description. Let’s walk through all of the extraction types and how you can expect them to look in your output log.

    Types of Data You Can Extract with Browserbear

    Browserbear has 10 different actions you can use to extract various pieces of data from a webpage. The majority rely on Helper config (CSS selectors, XPath, JS selectors, etc.) to locate elements, but there are also AI-powered actions that use simple instruction labels to find the most relevant snippets.

    Let’s walk through the data extraction actions one by one:

    Save Attribute

    The Save Attribute action is a versatile option that saves any attribute of one or multiple elements. HTML attributes provide additional information about an element, such as titles, alt text, links, and more.

    When setting up this action, you will need to provide Helper config to identify the element as well as specify the attribute you want to scrape (href, src, alt, etc.).

    Screenshot of Browserbear save attribute step

    Checking All will extract the attributes of all elements matching the config.

    Screenshot of Browserbear save attribute step with all checked

    Running a task with this action will yield an output log that looks something like this:

    Screenshot of Browserbear save attribute output log

    Save Clipboard

    Save Clipboard instructs Browserbear to save the contents of your clipboard to the output feed. This can save a lot of time building a longer task that pastes text, images, or files into an app to save it another way. Because the clipboard is only used for temporary storage, saving it to your output feed ensures you can continue accessing it for your workflows.

    Setting it up is simple and only involves adding a Save Clipboard action following an action that copies something to your clipboard, such as clicking a Copy button.

    Screenshot of Browserbear save clipboard step

    Running the task will show clipboard contents saved to your log:

    Screenshot of Browserbear save clipboard output log outlined in red

    Bear Tip 🐻: Another way to access clipboard contents is with the Paste interaction, which will paste clipboard content into a selected text field.

    Save HTML

    The Save HTML action allows you to save the entire webpage as an HTML file, which can be accessed by link. The file will be hosted on Browserbear servers for 24 hours, after which you need to store it elsewhere for continued access.

    To save the HTML of your current webpage, add the action to your task following a step that loads your page to the state you want it in.

    Screenshot of Browserbear save html step

    Your output log should return a link that leads to the HTML of that page.

    Screenshot of Browserbear save html output log outlined in red

    Save Image

    You can save photos on a webpage with the Save Image action. This returns link or paths which you can access in your output log, which can then be stored in a database or sent to other workflows.

    Setting up a step to save images involves adding the action to your task, inserting Helper config, then clicking Save.

    Screenshot of Browserbear save image step

    Checking All will extract the attributes of all elements matching the config.

    Screenshot of Browserbear save image step with all checked

    Running the task should return one or more links to the specified images.

    Screenshot of Browserbear save image output log

    Bear Tip 🐻: This step returns links to images from the website being scraped. If you want to host these images independently, consider using Puppeteer or a tool like Zapier to create a process that downloads and saves image files according to your preferences.

    Save Structured Data

    The Save Structured Data action saves multiple elements from a webpage into a JSON object. It’s ideal for scraping multiple elements within identically structured parent containers, as is often the case for product pages and other types of lists.

    To set up the step, you’ll need to insert Helper config that identifies a parent container, then add a label, config, and attribute to each individual child element using the Data Picker.

    Screenshot of Browserbear save structured data step

    Browserbear will scrape data from all parent containers containing the same child elements, then return it in an array:

    Screenshot of Browserbear save structured data output log

    Save Table Data

    Save Table Data enables you to automatically save structured data from a table, organized by column or heading. This is especially helpful for HTML tables that would be difficult to work with if copied as plain text.

    Setting up the action involves inserting config for the entire table, then specifying headings.

    Screenshot of Browserbear save table data step

    Running the task should yield an array of structured data, which you can then save to a database or table of your choice:

    Screenshot of Browserbear save table data output log

    Save Text

    This action saves text from one or multiple elements specified with config. It’s best used when you need small snippets of text and not any other types of data, in which case the Save Structured Data tool might be more appropriate.

    To set it up, add a Save Text action to your task and insert config for the element.

    Screenshot of Browserbear save text step

    You can also generalize the config and check All to extract multiple elements matching your identifier.

    Screenshot of Browserbear save text step with all checked

    Your output log should return one or multiple lines of text:

    Screenshot of Browserbear save text output log

    Save Window Location

    Save Window Location saves the URL of your current webpage, making it easy to come back to. This can be quite helpful when you’re logging where you left off or picking up from the latest update.

    Minimal setup is needed for this action, and you simply need to add it to your task at the right step of the process.

    Screenshot of Browserbear save window location step

    The link is saved in your output log, and it can then be stored in a database or used to initiate other workflows.

    Screenshot of Browserbear save window location output log outlined in red

    Bear Tip 🐻: To access the dynamic URL in later steps of your Browserbear task, use a Go action combined with a variable URL.

    AI - Save Data

    This variation of the Save Data action uses AI to locate and save snippets of data from webpages. While this may not be ideal for every scenario, it can save you a lot of time in many simple use cases and situations when page structure varies.

    Setting it up involves inserting some instruction labels, which may only include letters and underscores.

    Screenshot of Browserbear ai save data step

    Running the task should yield output that best matches your labels:

    Screenshot of Browserbear ai save data output log

    Keep in mind that AI is not completely accurate, and it may struggle to locate all the information you need, especially if it is located in a table or multiple containers.

    The AI-powered Save Links action stores URLs from a webpage, most often for looping through later on. This is helpful in cases where page structure varies.

    To set it up, add the action to your task and specify the type of links you want the AI to find.

    Screenshot of Browserbear ai save links step

    The scraper should return a list of links matching your description:

    Screenshot of Browserbear ai save links output log

    Following this up with a Save Data step will cause Browserbear to loop through each link and extract the data you want to save.

    Nocode Web Scraping Made Easier

    Choosing a web scraping tool that can work with many types of data will open up the most possibilities in terms of websites and types of information you can extract. Browserbear offers a versatile array of data extraction capabilities, empowering users to efficiently gather information with ease.

    Whether your purpose is to collect mission-critical industry information or maintain an archive for documentation purposes, our nocode web scraping features make the extraction of data more accessible and streamlined than ever before.

    About the authorJulianne Youngberg@paradoxicaljul
    Julianne is a technical content specialist fascinated with digital tools and how they can optimize our lives. She enjoys bridging product-user gaps using the power of words.

    Automate & Scale
    Your Web Scraping

    Browserbear helps you get the data you need to run your business, with our nocode task builder and integrations

    Our Guide to Browserbear Data Extraction (Data Types & Examples)
    Our Guide to Browserbear Data Extraction (Data Types & Examples)