Redacting the Redactors

Timothy B. Lee: Studying the Frequency of Redaction Failures in PACER

I wrote software to detect redaction rectangles—it turns out these are relatively easy to recognize based on their color, shape, and the specific commands used to draw them. Out of 1.8 million PACER documents, there were approximately 2000 documents with redaction rectangles. (There were also about 3500 documents that were redacted by replacing text by strings of Xes.)

Next, my software checked to see if these redaction rectangles overlapped with text. My software identified a few hundred documents that appeared to have text under redaction rectangles, and examining them by hand revealed 194 documents with failed redactions. The majority of the documents (about 130) appear to be from commercial litigation, in which parties have unsuccessfully attempted to redact trade secrets such as sales figures and confidential product information. Other improperly redacted documents contain sensitive medical information, addresses, and dates of birth. Still others contain the names of witnesses, jurors, plaintiffs, and one minor.
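
The overlap check he describes is easy to picture. What follows is a rough sketch of the idea, not Lee's actual software: it assumes some PDF parser has already handed you bounding boxes for the black rectangles and for each run of text, and the names `Box` and `find_suspect_text` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in page coordinates (hypothetical type)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def intersects(self, other: "Box") -> bool:
        # Two boxes overlap only if they overlap on both axes.
        return (self.x0 < other.x1 and other.x0 < self.x1 and
                self.y0 < other.y1 and other.y0 < self.y1)


def find_suspect_text(redactions, text_runs):
    """Return the text of every run whose box overlaps a redaction box.

    redactions: list of Box (assumed already extracted from the page)
    text_runs:  list of (Box, str) pairs (assumed already extracted)
    """
    return [text for box, text in text_runs
            if any(box.intersects(r) for r in redactions)]


# Made-up example page: one black rectangle, two runs of text.
redactions = [Box(100, 700, 300, 715)]
text_runs = [
    (Box(110, 702, 250, 712), "Q3 sales: $4.2M"),  # sits under the rectangle
    (Box(100, 650, 300, 660), "public paragraph"),  # well clear of it
]
suspects = find_suspect_text(redactions, text_runs)
```

A hit here only means the document is worth a look. As the excerpt says, the flagged documents still had to be examined by hand.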

Previously, previously, previously, previously, previously, previously.


Google drops another turd in the punchbowl

You may have heard that Google went and invented a new still-image format, because the zillion we already have apparently aren't good enough. It's a disaster and Mozilla has rejected it, but they're putting it in Chrome anyway.

Oh well, despite that, I'm sure it will be every bit as successful as VP8, Orkut, Wave and Buzz were. (And Ogg, though we can't pin that one on them.)

Jeff Muizelaar:

WebP also comes across as half-baked. Currently, it only supports a subset of the features that JPEG has. It lacks support for any color representation other than 4:2:0 YCrCb. JPEG supports 4:4:4 as well as other color representations like CMYK. WebP also seems to lack support for EXIF data and ICC color profiles, both of which have become quite important for photography. Further, it has yet to include any features missing from JPEG like alpha channel support. [...]

Every image format that becomes "part of the Web platform" exacts a cost for all time: all clients have to support that format forever, and there's also a cost for authors having to choose which format is best for them. [...]

Where does that leave us? WebP gives a subset of JPEG's functionality with more modern compression techniques and no additional IP risk to those already shipping WebM. I'm really not sure it's worth adding a new image format for that. Even if WebP was a clear winner in compression, large image hosts don't seem to care that much about image size. Flickr compresses their images at libjpeg quality of 96 and Facebook at 85: both quite a bit higher than the recommended 75 for "very good quality". Neither of them optimize the huffman tables, which gives a lossless 4--7% improvement in size. Further, switching to progressive JPEG gives an even larger improvement of 8--20%.


jwz mixtape 146E

My one hundredth mixtape is coming up soon, but before that, I thought I'd re-release a few of my favorite mixtapes from the first year. These are audio-only, and so they will expire in two weeks. Please enjoy mixtapes ØØ1, ØØ4, ØØ6 and Ø14.

Physics of My Little Pony

"How to fix this: Butterflies could have been made from dark matter."


It's probably about time that you re-read Apocamon.

Previously, previously.


Moonman-language tweets mentioning @jwz in the last two weeks:

xleahhutten  dankjewel dat jullie er voor me zijn jullie zijn de beste @jwz
Melchiortje  kan wel een eeuw op je wachten @JWZ
MaxKloppert  Liefde op 1e gezichtt . @jwz
anoukloveyouu  ik maak me druk samen met @xssanneee @jwz
martijndbgc  @EstherKievit Nog gefeliciteerd! @jwz
virgy58  vlikker een eind op beetje stoer doen maar in het echt durf je niks @jwz
OLL_98  @jwz bedoel ik. #:
xxxelvira  ilovemnliefste @jwz
bomeenhuis  Was mooie laatste dag #morgenvergaatdewereld @jwz. Slapen #sweetdreams #gn
Jayjay0485  haha mensen die gaan haten zijn jaloers omdat ze niet kunnen wat wij kunnen @jwz
Joycex24  @sophievhelmond dankje BITCH <333 #hahah @jwz #polleke
dh_bambitch  vanavond naar #denhaag :) @jwz
x_Mabell  Haal je zelf niet naar beneden als er niemand is om je op te vangen @jwz
ElisaStapel  @jwz wie een meisje pikt al je vriendjes af dat wijf
lisannethe  haaaa mupo'baas'kamerplant'schoenzool'dinges @jwz
Minailatjj  Ik vraag me af waar ze isz :p @JWZ
vickydevikking  you left my in the gutter @jwz.
suusiiej  tity moet echt aan haar werk gaan @jwz
sarcarsten  Argh. @jwz verlinkt einen Artikel von 2008, und ich erkenn' den am Link, als wär's gestern gewesen.
Richelle3690  JEROEN. Kan je het zien @jwz ;d. x
Mandyhvj  @xierrr Khoophet maar dp bounce bro @jwz
marie160xx  4646 vooooooor @Jwz. <3
jilliloveyouu  @smurftweets lol, jij noemt 3 tweets al spamme x] because a boy like you is impossible to find, you're impossible to find.. ;$ @jwz
EefkeeX  iehl @jwz
irmafienx  ben er voor je @jwz
moniquex96  @nynkeeloveuu ga je het vanaaf egt zeggen @jwz
Carlijn_98  @ @Anouk073 schatje je bent echt zoo lief wil jou ook nooit meer kwijt #iloveyouu en mss volgend @jwz
mc_jef15  Iets niet snappen is iets anders dan iets niet willen snappen @jwz
_kleineN  @_lisaaaxx kan je er niks aan doen aan dat van @jwz
naomyxbieber  4001 VOOR MYN NASTY BOY @JWZ:$
mariotje145  KAK hope da ze vnv wel kan @jwz
xIsaBelle_  Sexy kindjes #lkkrman @jwz
alishaaxxxx  @Elseexxx wij zijn raar ,, wij zvallen beide op kindjes uit groep 6 hahhaha @jwz
judithhxxx  'k hou van jou ! @jwz
chrisje94  mensen moeten eerst de oorzaak van sommige dingen weten voordat ze iemand anders ervan gaan beschuldigen @jwz
Yannickhh  Schijte... @jwz
Michaelstylezz  weer terug van school was gezellig met @jwz
_kleineN  Ben er voor je @jwz
hartjemelaniee  Het zal je toch overkomen dat 2 vriendinne dezelfde jongen leuk vindee @jwz hahahaha,
Anneberberrr  @riggll18 haha tuurlijk jonguh @jwz wannneer krijg iik ze ? Hehe
wesselsclaudia  Ik mis je zo schat!, ik zou je nooit vergeten! iloveyou, niemand boven jou!@jwz
SimoneRoggen  Lobii vo @jwz
Manonnnxxxx  @kusjexxRomy ikook:$@jwz
sharifalovesyou  @jwz
bertjanlobregt  i want you (l) @jwz
Emma_078  met je hodi. denk je nu dat je stoer ben? oke klap me maar. nee dankje. @jwz.
xmarijex12  Jezus flikker op en laat me gwn @jwz >:(



Click Trajectories: End-to-End Analysis of the Spam Value Chain

This paper is awesome:

Spam-based advertising is a business. While it has engendered both widespread antipathy and a multi-billion dollar anti-spam industry, it continues to exist because it fuels a profitable enterprise. We lack, however, a solid understanding of this enterprise’s full structure, and thus most anti-spam interventions focus on only one facet of the overall spam value chain (e.g., spam filtering, URL blacklisting, site takedown). In this paper we present a holistic analysis that quantifies the full set of resources employed to monetize spam email— including naming, hosting, payment and fulfillment—using extensive measurements of three months of diverse spam data, broad crawling of naming and hosting infrastructures, and over 100 purchases from spam-advertised sites. We relate these resources to the organizations who administer them and then use this data to characterize the relative prospects for defensive interventions at each link in the spam value chain. In particular, we provide the first strong evidence of payment bottlenecks in the spam value chain; 95% of spam-advertised pharmaceutical, replica and software products are monetized using merchant services from just a handful of banks.

Bobby Pflugelblager


Referer snooping

Dear Lazyweb, is there a service like Google Blog Search but that actually works?

We will define "works" as: gives me an RSS feed of timely references to or links to my sites; omits links that are months or years old; and is not brimming with spam. (Google Blog Search fails on all three of these.)

