


Our analysis found over 14,000 words that are recognized as words when spelled accurately, but that won't be corrected even when they are only slightly misspelled. However, the vast majority of these words are technical or very rarely used words: "nephrotoxin," "sempstress," "sheepshank," or "Aesopian," to name a few.
But among this list as well are more frequently used (and sensitive) words such as "abortion," "abort," "rape," "bullet," "ammo," "drunken," "drunkard," "abduct," "arouse," "Aryan," "murder," and "virginity." [...] In total, our analysis found dozens of words that were not identified as jargon or technical words but nonetheless did not offer corrections -- charged words like "bigot," "cuckold," "deflower," "homoerotic," "marijuana," "pornography," "prostitute," and "suicide." [...]
To find the list of excluded words, we came up with two different misspellings for roughly 250,000 words -- including all of the ones in the internal dictionary that ships with its desktop operating system -- and wrote an iOS program that would input each misspelled variant into an iOS simulator (a computer program that mimics the behavior of a factory-condition iPhone). We then made a separate program that simulated a user selecting from the menu of suggested corrections and recorded the results. After narrowing down the list to roughly 20,000 words that looked problematic, we tested 12 more different misspelling combinations. Words that did not offer an accurate correction any of the 14 times were added to our list of banned words. [...]
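The two-round procedure described above can be sketched roughly as follows. This is an illustration only: the article does not publish its code, so `misspell` and `never_corrected` are hypothetical names, single-letter substitution is an assumed misspelling strategy, and `suggest` stands in for whatever drove the iOS simulator and read back its top correction.

```python
import random
from typing import Callable, Iterable, List, Optional

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def misspell(word: str, rng: random.Random) -> str:
    """Return `word` with one letter swapped for a different letter.

    Assumed strategy; the article does not say how its misspellings were made.
    Expects a non-empty word.
    """
    i = rng.randrange(len(word))
    choices = [c for c in ALPHABET if c != word[i].lower()]
    return word[:i] + rng.choice(choices) + word[i + 1:]


def never_corrected(
    words: Iterable[str],
    suggest: Callable[[str], Optional[str]],  # hypothetical hook into the keyboard
    first_round: int = 2,
    second_round: int = 12,
    seed: int = 0,
) -> List[str]:
    """Return the words that are not restored in any of the 2 + 12 = 14 attempts."""
    rng = random.Random(seed)

    def fails(word: str, attempts: int) -> bool:
        # True only if every misspelled variant fails to come back as `word`.
        return all(suggest(misspell(word, rng)) != word for _ in range(attempts))

    # Round one: two misspellings per word narrow the list to the suspects.
    suspects = [w for w in words if fails(w, first_round)]
    # Round two: twelve more misspellings for each suspect.
    return [w for w in suspects if fails(w, second_round)]
```

Running the cheap two-attempt pass first keeps the expensive twelve-attempt pass limited to the words that already look problematic, which matches the narrowing from roughly 250,000 words to roughly 20,000 described in the article.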
An Apple spokesperson declined to comment for this article.
Apple also declined to comment on changes made to Siri regarding abortion and birth control.
Asked by The Daily Beast why Apple software won't correct "abortiom" to "abortion," Siri responded only: "I'm sorry, I don't understand."
I don't think there's anything sinister going on here. The autocorrect feature is enough of a pain in the arse without it inserting highly charged words like "rape" into my text messages.
P.S. Dear Apple, did you have to put the SEND button just above the O and P keys? Why not put it somewhere else altogether?
Ockham for the win.
People love to ignore the "all other things being equal" clause.
There are good reasons for not completing on "nephrotoxin" and "sempstress". Not completing "abortiom" is goofy, and is a policy decision, one that has demonstrably changed over the years, and the (old, but still pretty funny) story here is Apple's reflexive secrecy about all such policy decisions.
Maybe it's a reasonable decision. Maybe not. But their unwillingness to talk about what their decisions are, let alone why they were made, is typical.
Obviously people are curious, as demonstrated by the lengths they went through in this research, but Apple wants people to just stop asking them uncomfortable questions. When a corporation finds a question uncomfortable, you should ask it again and louder.
I don't think there's a difference between completing and changing to in this My speculation is that they have a list of words that they decide require extra deliberation to type: rape, ejaculate, fucking. It's not that my phone doesn't want me to type some things —contrary to the headline's hand-wringing — it's that it doesn't want to help me mistype those things. "Can some parents show up early to help with the mummy raping?" True story. No really.
And oh god I should know better than to speculate about functionality on this of all blogs, but we're speculating about motive for functionality (some of us in the headlines of the cited article, Mr. Keller…) so what the hell, why not.
Not knowing to what degree autocorrect/complete can be biased (is there an algorithm that tilts "thrre" towards there instead of three? Is there a manual tweak that can be added to override sheer similarity matrices in favor of common usage?), I can only guess that the only practical way to prevent your phone from suggesting small children be mummified and violated is to completely remove particular words from that completion function, full stop. Thus, "abortion" doesn't get auto-completed…ever.
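Both mechanisms guessed at here are easy to sketch in a toy corrector. What follows is not Apple's implementation, which is not public: `correct`, `edits1`, the `never_suggest` set, and the word counts are all made up for illustration. It only shows that a frequency table can tilt "thrre" towards "there", and that a separate exclusion set can keep a word out of the suggestions without stopping anyone from typing it.

```python
from typing import Dict, Optional, Set

LETTERS = "abcdefghijklmnopqrstuvwxyz"


def edits1(word: str) -> Set[str]:
    """All strings one delete, transposition, replacement, or insertion away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}
    swaps = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    replaces = {a + c + b[1:] for a, b in splits if b for c in LETTERS}
    inserts = {a + c + b for a, b in splits for c in LETTERS}
    return deletes | swaps | replaces | inserts


def correct(typed: str, freq: Dict[str, int], never_suggest: Set[str]) -> Optional[str]:
    """Return the suggested word, or None if the corrector offers nothing."""
    if typed in freq:
        # A correctly spelled word is accepted as typed, excluded or not.
        return typed
    candidates = [w for w in edits1(typed) if w in freq and w not in never_suggest]
    if not candidates:
        return None
    # Equally similar candidates are ranked by usage count (ties broken alphabetically).
    return max(candidates, key=lambda w: (freq[w], w))


# Invented counts, purely for illustration.
FREQ = {"there": 900, "three": 400, "abortion": 50}

print(correct("thrre", FREQ, set()))            # "there" and "three" are both one edit away
print(correct("abortiom", FREQ, set()))         # no exclusion list
print(correct("abortiom", FREQ, {"abortion"}))  # excluded from suggestions
print(correct("abortion", FREQ, {"abortion"}))  # typed deliberately, left alone
```

Under these assumptions the exclusion set affects only the suggestion step, which is the "extra deliberation" behaviour speculated about above.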
Or, to put it another way, allow people to type cunt all day long so long as they actually exercise true deliberation in their typing.
And given that words like abortion aren't getting redacted to baby murder, I'm kind of not that worried about which words did or did not make it onto the list. I mean really, I can think of some questions I'd like to ask Apple that might make them uncomfortable, and which words they think people might not want accidentally inserted into their texts just isn't one that seems sinister in its motive.
Oh for…gah. Just start reading at "My speculation…"
Stupid irony.
Uh, it's kinda like you didn't even read anything I said.
I think your main points were, "[Apple's] unwillingness to talk about what their decisions are, let alone why they were made, is typical. […] When a corporation finds a question uncomfortable, you should ask it again and louder."
Short version
I think they're not uncomfortable about this question. I just think they think there's an obvious answer.
Long version
I think there are some questions that make Apple uncomfortable and ought to be asked in a probative manner, yes. Prime among these might be questions about corporate subsidiaries and weaselly shunting of monies to avoid taxes, and ensuring that their supply chain doesn't contain inhumane worker practices. Secondary might be questions about their app store approval process and general curation policies.
But by the time we get to why certain words don't rank auto-completion, I honestly don't see them squirming because you're probing at some nefarious secretive policy. I just think it's a conversation they don't think is worth having.
You of all people know what it's like to get a million questions, some of which you just plain ignore. Yes, you have a totally different obligation to explain yourself than does Apple, but I think the same principle actually applies in this case: there is an obvious and satisfactory answer already at hand (i.e. nothing's being suppressed, it's just probably a pretty good idea to not have the phone make it easy to mistype cunt or abortion, and if you know some cunt who needs an abortion and need to type about it then you can still do so, just with more deliberation than usual) and thus, really, they're not talking about it because it doesn't cross Apple's threshold for discussion.
Makes that time my customer's phone autocorrected my name to Labia EVEN FUNNIER.
The list of unsuitable words in the English language is an unbounded set.
The English language is a bounded set. You are bad at math.
4r3 y0u 5ur3?
Yes, still bounded.
Yes, I am sure you're bad at math.
I am apparently also bad at making funnies.
I'm trying to figure out how to express mathematically the idea that at any moment in time, the English language is bounded, but it's impossible to predict the directions in which that boundary will expand in the future, and a lot of that expansion is labeled "unsuitable".
Then, too, unsuitability is in the eye (or ear) of the beholder.
I guess I'm saying that finding a tight bound on "unsuitable words" is a strong-AI problem.
If you want a tight bound, a sane person would bound it by the OED, which to my knowledge is explicit, not algorithmic. Any grey area you can call "not yet English."
But let's say you are insane.
Assume the available alphabet is all of Unicode. Assume words have length no longer than the number of electrons in the universe (Eddington number, Nᵉᵈᵈ ≈ 1.57×10⁷⁹).
The number of available words here is finite. I doubt anyone will communicate a longer English word, or an English vocabulary which cannot be encoded into this scheme.
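To spell out the count: write A for the size of the alphabet and N for the maximum word length (both symbols are introduced here for illustration). The Unicode codespace runs from U+0000 to U+10FFFF, so A is at most 1,114,112, and N is taken to be the Eddington number from above. The number of possible words W is then a finite geometric sum:

```latex
W \;=\; \sum_{k=1}^{N} A^{k} \;=\; \frac{A^{N+1} - A}{A - 1} \;<\; A^{N+1},
\qquad A \le 1{,}114{,}112, \quad N = N_{\mathrm{Edd}} \approx 1.57 \times 10^{79}.
```

The bound is absurdly large, but it is a specific finite number, and any subset of those words, unsuitable or otherwise, is bounded by it too.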
I guess I was thinking more about attempts I've seen to filter out "unsuitable" words in chat, where the relevant corpus is more Urban than Oxford English.
You're right, though, that for spelling correction, you're only going to correct to "legitimate" words, which puts a much tighter bound on the unsuitable set.
Interviewing Siri. Brilliant!