Facebook Gallery Hate

Dear Lazyweb, how do I get the URLs of all of the photos in a Facebook gallery?

I have this script, galdown, that is capable of bulk-downloading galleries from Facebook, Flickr, etc. However, a couple of months ago, Facebook broke: now it will only download the first 28 images in a gallery, because they changed their galleries to use that bullshit "we only load the images after you've scrolled the mouse to the bottom of the screen" trick that is so popular with the web-breaking microcephalics these days.

How do I find the URLs of photos 29 and later?

I haven't found a user-agent that makes a difference. Adding &_fb_noscript=1 to the URL seems to do nothing. I can't even tell what URL it's loading to get (the presumably JSON list of) the remainder of the images.

6 Responses:

  1. Thomas says:

    You may have better luck scraping their mobile site (m.facebook.com), as it appears to do things the old-fashioned way of static html.

    (PS: your blog keeps erroring with "We were unable to authenticate your claimed OpenID" when commenting with a LJ account)

    • jwz says:

      Good idea, I got it working again with the mobile site.

      I assume your OpenID error translates to "Livejournal shat the bed again".
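A rough sketch of the mobile-site approach Thomas suggests, assuming the static HTML served by m.facebook.com carries photos as plain `<img>` tags and paging as plain `<a>` links. The actual markup is not described in this thread, so the sketch only collects every image and link URL and leaves the filtering to the caller. `LinkCollector` and `collect_links` are hypothetical names made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs of every <img src> and <a href> in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []
        self.page_links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and attrs.get("src"):
            # Resolve relative paths against the page the HTML came from.
            self.image_urls.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "a" and attrs.get("href"):
            self.page_links.append(urljoin(self.base_url, attrs["href"]))


def collect_links(html, base_url):
    """Return (image_urls, page_links) found in an HTML string."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.image_urls, parser.page_links
```

Fetching the page (with whatever cookies a logged-in session needs) and deciding which links are "next page" links are left out, since both depend on details not given here.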

  2. krv says:

    So much of the web is now busted with web 2.0 crap I just reach for WWW::Mechanize::Firefox and MozRepl first. PhantomJS is the other option.

  3. dzek says:

    Ever heard of Firebug (for Firefox), Dragonfly (Opera, built-in), Developer Tools (IE8+, built-in, not sure if it can do this), or Developer Tools for Chrome (not sure if built-in)? Probably all of them (I'm using Opera) have a way to see the requests sent by the browser.
    Example request for Facebook gallery:
    http://pastebin.com/sYBCGQz5
    (I've trimmed off cookies and other info that could be exploitable or private. It's just an example; you will figure it out.)

    • florin says:

      I've also found the same request with Chrome (it helps if you select Documents under Developer Tools/Network); the fetch_size parameter does what it says on the tin.

  4. Otto says:

    You can get the photo listings via the Graph API, along with all the pain that entails. But really, it's not hard.

    First get an auth token with the user_photos permission. This is the hard part.

    After that, https://graph.facebook.com/USER-ID/albums will list the photo albums, and https://graph.facebook.com/ALBUM-ID/photos will list the photos. It's paged at 25 per request, though, so the paging data in the response gives you the next page to fetch. The whole thing comes back as JSON.

    Downside is that initial token BS along with the need to have an "app" on Facebook to get the auth token. I can work up some example code if you like. Email me.
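The paging loop Otto describes could look something like the sketch below. It assumes each JSON response has a `data` list and, when more results exist, a `paging` object whose `next` field is the full URL of the following page. `fetch_json` and `iter_paged` are hypothetical helper names, and getting the auth token (the hard part) is not covered.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """Fetch a URL and decode the response body as JSON."""
    with urlopen(url) as resp:
        return json.load(resp)


def iter_paged(first_url, fetch=fetch_json):
    """Yield every item from a paged listing, following paging.next.

    Assumes each page looks like {"data": [...], "paging": {"next": url}},
    with "next" absent on the last page.
    """
    url = first_url
    while url:
        page = fetch(url)
        yield from page.get("data", [])
        url = page.get("paging", {}).get("next")
```

Usage would be roughly `for photo in iter_paged("https://graph.facebook.com/ALBUM-ID/photos?access_token=TOKEN"): ...`, with ALBUM-ID and TOKEN filled in by hand. The `fetch` argument is there only so the loop can be exercised with canned pages instead of live requests.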