this post is about teh pr0n

Good golly, there sure is a lot of porn on webcollage now that I've updated it to use Alta Vista's new random link URL. Apparently that link is supposed to be returning pages only from the non-porn part of their database, but there's a lot of porn in their porn-filtered searches these days due to some porn spammers having figured out how to weasel in under the radar, and countermeasures have not yet been applied.

Anyway, I've never done anything in webcollage to try and filter out porn, my thinking being, it's supposed to show what the web looks like, and if the web looks like throbbing cock, well, there you have it. So I was initially somewhat irritated to learn that the random-link URL was supposed to be content-filtered.

But you know, I'm getting really tired of looking at junkies giving blowjobs. There are fewer pictures of text recently, but it's still way less interesting than before. Blah!

I never really understood why so little porn had been showing up in webcollage. I had been getting random pages by feeding random words from the dictionary into various search engines, and the percentage of porn was pretty low. But Alastair (the Alta Vista guy who implemented the random link) had a plausible theory: he said, "when people search our site for porn, they don't use big words."

Most of the words in the dictionary? On the big side.
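
For the curious, the random-word trick can be sketched roughly like this. This is not webcollage's actual code, just a minimal illustration: the word list and the search endpoint (`SEARCH_URL`) are made-up stand-ins, and a real run would read something like a system dictionary file and point at a real search engine.

```python
import random
from urllib.parse import urlencode

# Hypothetical stand-ins, not taken from webcollage itself.
SAMPLE_WORDS = ["antediluvian", "perspicacious", "sesquipedalian",
                "obstreperous", "tintinnabulation", "defenestrate"]
SEARCH_URL = "http://search.example.invalid/query"  # assumed endpoint


def random_query_url(words, count=2, rng=random):
    """Build a search URL from `count` distinct, randomly chosen words."""
    picked = rng.sample(words, count)
    return SEARCH_URL + "?" + urlencode({"q": " ".join(picked)})


if __name__ == "__main__":
    print(random_query_url(SAMPLE_WORDS))
```

Whatever pages come back for a query like that are the "random" pages; the bigger the words, the less the results look like what people actually search for.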


It used to be possible to pull random images out of the various image-hosting services by generating the right random URLs, and I got a lot of good webcollage mileage out of those until they caught on. (Several of them used to compose their URLs of: user=X, album=Y, photo=Z; but all you really needed was Z, which was world-unique. But then they changed it so that it would give you an error if you didn't get a matching Y for the Z, which made it nearly impossible to guess right the first time.)
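
A toy model of why that change killed the trick, under heavy assumptions: the `PHOTOS` table, the ID range, and the album names are all invented, and real photo IDs would be far sparser than this. It only compares guessing Z alone against having to guess a matching Y as well.

```python
import random

# Toy stand-in for an image host: photo ID (Z) -> album (Y). All made up.
PHOTOS = {z: f"album{z % 13}" for z in range(1, 1001)}


def lookup_old(photo):
    """Old behavior: the world-unique photo ID is all that matters."""
    return photo in PHOTOS


def lookup_new(album, photo):
    """New behavior: an error unless the album matches the photo."""
    return PHOTOS.get(photo) == album


def guess_old(rng):
    """Guess a random photo ID only."""
    return lookup_old(rng.randint(1, 1000))


def guess_new(rng):
    """Guess a random album and a random photo ID independently."""
    return lookup_new(f"album{rng.randrange(13)}", rng.randint(1, 1000))


def hit_rate(guess, guesses, rng):
    """Fraction of random guesses that return a picture."""
    return sum(guess(rng) for _ in range(guesses)) / guesses


if __name__ == "__main__":
    rng = random.Random()
    print("photo ID only:  ", hit_rate(guess_old, 2000, rng))
    print("album must match:", hit_rate(guess_new, 2000, rng))
```

In this toy setup every photo-only guess lands, while the album-matching version succeeds only about one time in thirteen; with a realistic number of albums the odds get far worse.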

Hey brad, how many photos are in FotoBilder these days? Enough for me to resume bugging you for a random-picture link yet?


Do any of you know anyone who works at Google? Apparently Ray is never gonna answer my mail again, so maybe I can track down someone else who has the ability to add a random-link to their site.

12 Responses:

  1. brad says:

    FotoBilder's kinda on hold while I wrap up a bunch of LiveJournal projects. I'll resume work on it in late January and make you that link.

  2. uke says:

    You know that Frederick works there now, right?

  3. brad says:

    What about a random LJ image?

    We can just keep track of all referenced images in public posts, and make a "last 1000" sort of index, then make a URL to return one of those.

    • jwz says:

      SWEET! That would so rule.

    • eqe says:

      Damn, I tend to watch my logs after posting images to LJ to keep track of who's reading my journal. It would be disconcerting to suddenly have people in Bratislava hitting my machine.

      Could be fun, though. Hmm, randomize image returned based on referer... or based on user-agent. What does webcollage masquerade as, again?

    • compwiz says:

      I'd say that would be a somewhat better idea than getting from fotobilder (the pictures would be huge, getting rid of the whole "collage" effect), but you'd also have to exclude people who don't allow their site to be indexed by robots, since technically webcollage would be a robot of sorts.

      As for the porn on webcollage, I'd think the reasoning would be pretty simple (other than the "people don't search for porn with big words" excuse) - out of all the words in the English dictionary, I'd think less than 0.5% of them are things that can be related to pornography, whereas something like two-thirds (last time I heard?) of the Internet is made up of pornography. So if you say hypothetically that altavista indexed every site on the Internet, a random page from their database would have a 2/3 chance of turning up porn, whereas a page explicitly searched for using 4 dictionary words would have an even lower chance, under 0.5%, of turning up porn.

  4. mendel says:

    But webcollage's ability to mystically avoid porn is the stuff of legend. Or, at least, of some mystery between myself and a coworker.

    A little while ago, I (re-)added it to my xscreensaver rotation, and Mike pointed out that that might not be a good idea, because then people walking through the area might find the resulting throbbing cock objectionable. So I left it running over lunch with the screen turned off, Just To Be Safe.

    We get back from lunch.

    "Hey, there's no porn."

    "Where the fuck is the porn?"

    "I don't know how it manages to get no porn. *I* can't avoid porn."

    "Yeah, I was looking for some advisory the other day and got porn. You can't avoid porn."

    "But observe that it has avoided porn."

    "Where the fuck is the porn?"

    We still marvel at its porn-avoidance abilities whenever it comes up. Pulling down images from the web and not getting porn is a way cooler hack than drawing fractals.

  5. compwiz says:

    Oh, and I have a friend who started working at Google. I'll see what I can do.