Extracting data

Use Apify actors or scrapers to extract any data you need from any website.

9 articles in this collection
Written by Jan Čurn, Marek Trunkát, Lukáš Křivka and 2 others

Scraping a list of URLs from Google Spreadsheet

Learn how to crawl a list of URLs specified in a Google Spreadsheet using one of the Apify web scraping actors.
Written by Jan Čurn
Updated this week

How to scrape pages with shadow DOM

Shadow DOM enables the isolation of web components, but causes problems for web scrapers. Here's an easy workaround.
Written by Marek Trunkát
Updated over a week ago

Scraping iframes with Puppeteer

How to get information from inside iframes using Puppeteer
Written by Lukáš Křivka
Updated over a week ago

Scraping using sitemap.xml

A website with a proper sitemap.xml is a free jackpot for every web scraper: the file lists the site's page URLs in one place.
Written by Marek Trunkát
Updated over a week ago
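
As a rough illustration of why a sitemap is so convenient for scraping, here is a minimal sketch that pulls every page URL out of a sitemap document using only the Python standard library. The function name `extract_sitemap_urls` and the sample URLs are made up for this example; the namespace is the one defined by the sitemaps.org protocol. The article itself may use a different approach.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol for <urlset>, <url>, and <loc>.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def extract_sitemap_urls(xml_text: str) -> list[str]:
    """Return the text of every <loc> element found in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc") if loc.text]


# Hypothetical sitemap content, inlined so the sketch runs without network access.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/products/1</loc></url>
</urlset>"""

urls = extract_sitemap_urls(SAMPLE)
for url in urls:
    print(url)
```

In a real crawl, the resulting list would be fed to the scraper as its start URLs instead of being printed.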

Request labels and how to pass data to request in Puppeteer

How to handle request labels in Apify actors with Puppeteer
Written by Václav Růt
Updated over a week ago

Submit form with file attachment using Puppeteer

How to submit a form with a file attachment using Puppeteer.
Written by Petr Čermák
Updated over a week ago

Crawl multiple pages with the same URL and different POST data

Crawlers skip pages that have the same URL but only differ in POST data. Learn how to make the crawler visit all such pages.
Written by Jan Čurn
Updated over a week ago
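
To make the problem concrete, here is a minimal sketch of the general idea: a queue that deduplicates requests by URL alone would drop the second POST, while a key that also covers the method and the POST data keeps both. The `RequestQueue` class and `unique_key` function are hypothetical names written for this illustration, not the Apify API, and the article may solve the problem differently.

```python
import hashlib
import json


def unique_key(url, method="GET", payload=None):
    """Build a deduplication key from the method, the URL, and the POST data."""
    if payload is None:
        return f"{method}:{url}"
    # Sort keys so the same data always yields the same key.
    body = json.dumps(payload, sort_keys=True)
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()[:16]
    return f"{method}:{url}:{digest}"


class RequestQueue:
    """Toy in-memory queue that skips requests whose key was already seen."""

    def __init__(self):
        self._seen = set()
        self.pending = []

    def add(self, url, method="GET", payload=None):
        key = unique_key(url, method, payload)
        if key in self._seen:
            return False  # Duplicate request, skipped.
        self._seen.add(key)
        self.pending.append((url, method, payload))
        return True


queue = RequestQueue()
# Same URL three times; only the POST data differs between the first two.
added = [
    queue.add("https://example.com/search", "POST", {"page": n})
    for n in (1, 2, 2)
]
print(added)
```

Only the third request is rejected, because its URL, method, and POST data all match an earlier one.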

Submitting a form with file attachment

How to submit a form with a file attachment using request-promise.
Written by Petr Čermák
Updated over a week ago

Scraping data from websites using schema.org Microdata

JavaScript code to automatically extract data using schema.org tags
Written by Jan Čurn
Updated over a week ago