Recently, I ran into issues when trying to download shared folders from Google Drive. There is a built-in option to download complete folders (and there has been for at least the last few years). However, this option does not seem to be bug-free; this may or may not be by design:
Before the download starts, the web interface indicates that the files are being compressed. The download itself works, but the downloaded zip file does not contain all of the JPEG photos that are in the shared folder. This somewhat strange behavior has unfortunately been very reproducible (both when logged out and when logged in).
The workaround: download each file individually. For small shared folders this is doable, although the zip files are probably fine in that case anyway. For larger shared folders, this process should be automated.
First approach: Selenium IDE (failed)
My first idea was to use Selenium IDE, as I had gained some experience with it in the past. Sadly, Selenium IDE currently isn’t supported in Firefox Quantum. (See “Firefox 57+ Support #24”; there’s even an xpi file in the /releases, but it didn’t work for me, see issue #61.)
Second approach: Selenium Server Standalone (failed)
As the first approach did not work for me, I stumbled across the Selenium Server Standalone and found some old docs about using the htmlSuite option:
java -jar selenium-server-standalone-3.11.0.jar -htmlSuite "*firefox" "http://www.google.com" "HTMLSuite.html" "Results.html"
Download the Selenium HTML Runner from http://www.seleniumhq.org/download/ and use that to run your HTML suite.
Okay, so those docs seem to be outdated.
Third approach: Selenium HTML Runner (failed)
I couldn’t get it working, even after trying to fix the geckodriver issue.
java -jar selenium-html-runner-3.10.0.jar -htmlSuite "*firefox" "http://www.google.com" "HTMLSuite.html" "Results.html"
Multi-window mode is longer used as an option and will be ignored.
Mar 18, 2018 4:38:09 PM org.openqa.selenium.server.htmlrunner.HTMLLauncher mainInt
WARNING: Test of browser failed: *firefox
java.lang.IllegalStateException: The path to the driver executable must be set by the webdriver.gecko.driver system property; for more information, see https://github.com/mozilla/geckodriver. The latest version can be downloaded from https://github.com/mozilla/geckodriver/releases
    at com.google.common.base.Preconditions.checkState(Preconditions.java:847)
    at org.openqa.selenium.remote.service.DriverService.findExecutable(DriverService.java:124)
    at org.openqa.selenium.firefox.GeckoDriverService.access$100(GeckoDriverService.java:41)
    at org.openqa.selenium.firefox.GeckoDriverService$Builder.findDefaultExecutable(GeckoDriverService.java:141)
    at org.openqa.selenium.remote.service.DriverService$Builder.build(DriverService.java:339)
    at org.openqa.selenium.firefox.FirefoxDriver.toExecutor(FirefoxDriver.java:158)
    at org.openqa.selenium.firefox.FirefoxDriver.<init>(FirefoxDriver.java:120)
    at org.openqa.selenium.server.htmlrunner.HTMLLauncher.createDriver(HTMLLauncher.java:297)
    at org.openqa.selenium.server.htmlrunner.HTMLLauncher.runHTMLSuite(HTMLLauncher.java:106)
    at org.openqa.selenium.server.htmlrunner.HTMLLauncher.mainInt(HTMLLauncher.java:253)
    at org.openqa.selenium.server.htmlrunner.HTMLLauncher.main(HTMLLauncher.java:281)
Anyway, I didn’t find good docs on what the HTML file should look like, as the Selenium IDE I had fiddled around with just gave me a .side script whose content looks like JSON.
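For reference, the exception in the log above asks for the webdriver.gecko.driver system property. A JVM system property is normally passed with -D before -jar, so the call could be assembled roughly as in the sketch below. The helper names find_geckodriver and build_runner_command are made up for illustration, and this only shows what the error message asks for; as noted above, the runner still did not work for me.

```python
import shutil


def find_geckodriver():
    """Return the full path of geckodriver if it is on the PATH, else None."""
    return shutil.which("geckodriver")


def build_runner_command(jar, base_url, suite, results, geckodriver=None):
    """Build the java command line for the HTML runner (hypothetical helper).

    If a geckodriver path is given, it is passed as the JVM system property
    named in the error message. -D options must come before -jar.
    """
    command = ["java"]
    if geckodriver:
        command.append(f"-Dwebdriver.gecko.driver={geckodriver}")
    command += ["-jar", jar, "-htmlSuite", "*firefox", base_url, suite, results]
    return command
```

The resulting list can be handed to subprocess.run(); find_geckodriver() returns None when the driver is not installed, in which case the property is simply left out.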
Final approach: use Selenium Client with WebDriver Python Bindings
I am not a Python programmer, but my working script ended up looking like this:
How to call the script:
How the script works:
- start an instance of the Firefox browser and open the given URL
- send the ‘v’ key press to change from grid to list view (no image preview!)
- send several ‘arrow down’ key presses to skip the given number of files
- send the ‘Enter’ key press to open a preview of the first photo
- for each photo:
  - click the download button (the file is then automatically downloaded in the background)
  - click the next button to get to the next photo
  - click the download button
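The steps above can be sketched roughly as follows. This is a minimal illustration and not the original script from the repository. Several things are assumptions: that the buttons can be found via an aria-label attribute, that a missing button means the end of the folder, and the names button_xpath, download_photos and the limit parameter. Firefox may also need to be configured to save files without asking.

```python
import time

NEXT_LABEL = "Weiter"             # German UI label for "next"
DOWNLOAD_LABEL = "Herunterladen"  # German UI label for "download"


def button_xpath(label):
    """XPath for any element with this aria-label (an assumed locator)."""
    return f'//*[@aria-label="{label}"]'


def download_photos(url, skip=0, delay=2.0, limit=None):
    """Open the folder, skip some files, then download photo after photo."""
    # Imported here so button_xpath() works without Selenium installed.
    from selenium import webdriver
    from selenium.common.exceptions import NoSuchElementException
    from selenium.webdriver.common.action_chains import ActionChains
    from selenium.webdriver.common.by import By
    from selenium.webdriver.common.keys import Keys

    driver = webdriver.Firefox()  # needs geckodriver on the PATH
    driver.get(url)
    time.sleep(delay)

    # 'v' switches to list view, arrow-down skips files, Enter opens the preview.
    keys = ActionChains(driver).send_keys("v")
    for _ in range(skip):
        keys.send_keys(Keys.ARROW_DOWN)
    keys.send_keys(Keys.ENTER).perform()
    time.sleep(delay)

    downloaded = 0
    try:
        # limit (hypothetical) caps the number of files; None means no cap.
        while limit is None or downloaded < limit:
            driver.find_element(By.XPATH, button_xpath(DOWNLOAD_LABEL)).click()
            downloaded += 1
            time.sleep(delay)
            driver.find_element(By.XPATH, button_xpath(NEXT_LABEL)).click()
            time.sleep(delay)
    except NoSuchElementException:
        pass  # a button was not found: assume the end of the folder
    finally:
        driver.quit()
    return downloaded
```

A call could then look like download_photos("<URL of the shared folder>", skip=1749), with the labels adjusted to the language of the web interface.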
Changes you need to make:
- there’s an option to skip photos at the beginning – replace the magic number “1749” with your value (however, the skipped number may be a little smaller due to loading delays)
- change the words “Weiter” (for “next”) and “Herunterladen” (for “download”) to match your language
- every now and then I got a selenium.common.exceptions.ElementNotInteractableException, which I haven’t handled and which crashes the script; I just called the script again with an incremented “skip number”
- the script may be quite slow as I added a delay after clicking the next and download buttons
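Instead of restarting the script by hand after an ElementNotInteractableException, the click could be retried a few times. The helper below is a generic sketch, not part of the original script; the name retry and its parameters are made up for illustration.

```python
import time


def retry(action, exceptions, attempts=3, delay=1.0):
    """Call action() until it succeeds, retrying on the given exception types.

    The last failure is re-raised once all attempts are used up.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except exceptions:
            if attempt == attempts:
                raise  # give up and let the caller see the error
            time.sleep(delay)  # give the page some time before trying again
```

With Selenium this could be used as retry(button.click, ElementNotInteractableException), where button is an element found beforehand and the exception is imported from selenium.common.exceptions. Whether waiting actually resolves the error on Google Drive is something I have not verified.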
Room for improvements:
- print the current file names to the console; I couldn’t manage to do this reliably, as the elements on the website seem to become invisible when not hovered
- check downloaded files against files on the web (and check if we really got all the files)
- improved exception handling
- pass the URL and options (e.g. destination folder) as command line arguments
- fix tabs and spaces 😉
- comments in the code 😉 😉
- <add yours>
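Two of these improvements are easy to sketch with the standard library: command line options and a check for missing files. The option names --skip and --dest and the function names below are assumptions, not something the current script offers. How to obtain the list of expected file names from the web page is left open, since I couldn’t read them reliably.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse the folder URL and options from the command line."""
    parser = argparse.ArgumentParser(
        description="Download all files of a shared Google Drive folder.")
    parser.add_argument("url", help="URL of the shared folder")
    parser.add_argument("--skip", type=int, default=0,
                        help="number of files to skip at the beginning")
    parser.add_argument("--dest", type=Path, default=Path.cwd(),
                        help="folder the browser downloads into")
    return parser.parse_args(argv)


def missing_files(expected_names, dest):
    """Return the expected file names that are not present in dest."""
    present = {path.name for path in Path(dest).iterdir() if path.is_file()}
    return sorted(set(expected_names) - present)
```

After a run, missing_files(names, args.dest) would list everything that still needs to be downloaded; an empty list means all expected files arrived.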
Feel free to contribute: https://github.com/maehw/DownloadGoogleDrive
Get Selenium here: https://www.seleniumhq.org/download/