This article relates to the legacy Apify Crawler product, which is being retired in favor of the apify/legacy-phantomjs-crawler actor. All the information in this article is still valid, and applies to both the legacy Crawler product and the new actor. For more information, please read this blog post.

For new projects, we recommend using the newer apify/web-scraper actor that is based on the modern headless Chrome browser.

Some web pages do not load all their content immediately, but only fetch it in the background using AJAX, while pageFunction might be executed before the content has actually been loaded. You can wait for dynamic content to load using the following code:

function pageFunction(context) {
    var $ = context.jQuery;
    var startedAt = Date.now();

    var extractData = function() {
        // timeout after 10 seconds
        if( Date.now() - startedAt > 10000 ) {
            context.finish("Timed out before #my_element was loaded");
            return;
        }

        // if my element still hasn't been loaded, wait a little more
        if( $('#my_element').length === 0 ) {
            setTimeout(extractData, 500);
            return;
        }

        // refresh page screenshot and HTML for debugging
        context.saveSnapshot();

        // save a result
        context.finish({
            value: $('#my_element').text()
        });
    };

    // tell the crawler that pageFunction will finish asynchronously
    context.willFinishLater();

    extractData();
}
Did this answer your question?