Apify Just Became Much More Powerful

If you've read my previous article about Apify, you'll know that I'm a big fan of all its features. I already consider it the best web scraping platform available. But web scraping is admittedly less common in the automation world than API integration and RPA are right now. And I won't lie, when I wrote that last article, a small part of me wished that Apify could be my one-stop shop for both RPA and web scraping.

While RPA with Node.js is certainly possible, as far as I can tell from my research there aren't any npm packages built specifically for RPA development. You need to do the legwork of finding relevant npm packages yourself. If your process retrieves Excel files, you'll need to find an npm package for Excel. If you create a PDF later in the process, you'll need to find another npm package for PDF files. Granted, there are over 1,000,000 npm packages, so there's no shortage to pick from, but right now there's no npm equivalent to rpaframework, the Python library that helps automation developers spin up quick, reliable RPA solutions without the busywork of shopping around for pip packages.

So for a while I waited and wished that Apify would make their own npm version of rpaframework. Much to my surprise, Apify fulfilled my wish, but not by making the npm package I wanted. Instead, they created a new SDK that lets Apify actors be built in Python in addition to Node.js.

Since Apify is a web scraping platform first and foremost, they marketed this new SDK as a way to create Python web scrapers with Beautiful Soup, Scrapy, and Selenium. Python fans will certainly love this new functionality, but the first thing my mind went to when I heard Apify + Python was "please tell me I'm able to use rpaframework in Python actors." And after some testing, I'm pleased to report the answer is yes! Apify now has full RPA capability, and it only requires one package.
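To give a rough idea of what this can look like, here is a minimal sketch of a Python actor that reads an Excel file with rpaframework, pushes the rows to the actor's dataset, and saves a PDF summary to the key-value store. This is an illustration, not the code from my own testing. It assumes the `apify` and `rpaframework` packages are installed in the actor's image, and the `workbookPath` input field, the `orders.xlsx` and `report.pdf` file names, and the `build_report_html` helper are all hypothetical names made up for the example.

```python
from html import escape


def build_report_html(rows):
    """Render a list of dicts as a minimal HTML table (hypothetical helper)."""
    if not rows:
        return "<p>No rows found.</p>"
    headers = list(rows[0].keys())
    head = "".join(f"<th>{escape(str(h))}</th>" for h in headers)
    body = "".join(
        "<tr>"
        + "".join(f"<td>{escape(str(row.get(h, '')))}</td>" for h in headers)
        + "</tr>"
        for row in rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"


async def main():
    # Imported here so the helper above works even where these packages
    # are not installed. Both are assumed to be in the actor's requirements.
    from apify import Actor
    from RPA.Excel.Files import Files
    from RPA.PDF import PDF

    async with Actor:
        actor_input = await Actor.get_input() or {}
        # "workbookPath" is a hypothetical input field for this sketch.
        workbook_path = actor_input.get("workbookPath", "orders.xlsx")

        # Read the first worksheet as a list of dicts keyed by the header row.
        excel = Files()
        excel.open_workbook(workbook_path)
        try:
            rows = excel.read_worksheet(header=True)
        finally:
            excel.close_workbook()

        # Store the rows in the dataset (assumes the cell values are
        # JSON-serializable, e.g. strings and numbers).
        await Actor.push_data(rows)

        # Build a PDF with rpaframework and keep it in the key-value store.
        PDF().html_to_pdf(build_report_html(rows), "report.pdf")
        with open("report.pdf", "rb") as pdf_file:
            await Actor.set_value(
                "report.pdf", pdf_file.read(), content_type="application/pdf"
            )
```

In an actor's entry point, this would typically be started with `asyncio.run(main())`. The point of the sketch is that the Excel and PDF steps come from the same package, so there is no shopping around for a separate library per file type.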

Before the Python SDK announcement, I considered Apify and Robocorp to be two distinct platforms serving different needs, but after this announcement I consider them to be competitors. In some ways they're very similar now, but each still has unique advantages:

  • Shared Features
    • Academy courses, communities, and developer environments (Apify has a web IDE, and Robocorp has VS Code extensions)
    • Easy management and retrieval of output files (key-value stores in Apify and artifacts in Robocorp)
    • Helpful monitoring features and visualizations
    • Integrations to other platforms such as GitHub and Slack (Apify offers a few more than Robocorp)
    • Unattended automation using rpaframework (and process scheduling)
  • Apify's Advantages
    • Built to natively support web scraping, in case you have any web scraping use cases
    • Issues can be created and tracked for each actor
    • More developer power to use Node.js or Python, create Dockerfiles and view build logs, and have README files appear in the Information tab
    • More granular control on how to treat output data (datasets can be used as an alternative to output files), and all actor data (datasets, key-value stores, and request queues) can be named (kept) or unnamed (deleted after 7 days)
    • Pricing is much more flexible. The major drawback to Robocorp's pricing is that Robocorp limits how many processes and workspaces you can have unless you pay for the Flex tier ($400/month) or higher. Apify allows unlimited actors in every tier
  • Robocorp's Advantages
    • Attended automation (Apify offers saved tasks that let you run actors in a semi-attended fashion, but not true attended automation)
    • Credential vaults to store passwords in each workspace
    • Multiple workspaces (e.g. for test vs production)
    • The ability to run unattended automations on VMs in addition to containers
    • Work item queues are great for transactional processes (transactions could be managed in Apify using request queues, but this is very unconventional)

Both Apify and Robocorp are worth keeping an eye on, and it will be interesting to see what each one comes up with next. I recommend trying both to find out which one best fits your requirements (pricing, use cases, etc.). I also recommend learning from my mistake of assuming that certain platforms won't offer the features we want. One of the best parts of automation is being surprised by new features that we never expected.