Staring At Empty Pages: Spam

Monday, February 28, 2011

IP blocklists, email, and IPv6

Engineers in the Internet Engineering Task Force, in the Messaging Anti-Abuse Working Group, and elsewhere have been debating how to handle e-mail-server blocklists in an IPv6 network. Let’s take a look at the problem here.

We basically have three ways to address spam, in our goal of reducing the amount of spam in our inboxes:

Prevent its being sent in the first place.
Refuse to accept it when it’s presented for relay or delivery.
Discard it or put it into a junk mail folder at (or after) delivery.

The last is handled by what we usually think of as spam filters, which analyze the content and other aspects of the messages. Dealing with the first involves law enforcement, as well as adoption of best practices for legal email marketers. To implement the second, we try to do various analyses during the actual transmission of the email messages, in order to respond at the protocol level with some sort of refusal. It’s rather like standing between your postal carrier and the mailbox at your house, and telling the carrier that she may put this envelope into the box, but she should take those two catalogues and the credit-card offer right back to the post office with her.

And one can actually imagine doing that, by looking at the envelopes and applying rules such as, If it’s pre-sorted, it’s probably junk, and, The more urgent it claims to be, the more likely it is to be junk. But a better way, still, would be if we could get this to happen as soon as the junk mail entered the postal system, by having a way to say, See that guy who’s dropping that pile of mail at the post office? He only sends junk, and when you see him coming just make him go away. Don’t even let him bring his pile in the door.

We have that in our email systems, in what we call IP blocklists (or blacklists). These are lists of the numeric Internet addresses of email servers that we think send so much spam that we won’t even let them come to the door. When one of these servers makes an Internet connection to one of our mail servers, we don’t even start an email protocol exchange with them — we just refuse the connection. We make them go away.

Estimates vary as to what portion of attempted spam this blocks, but at least some estimates are on the order of 90%. Despite the problems with this mechanism (legitimate mail servers do find themselves on blocklists, for various reasons, and sometimes have a hard time getting the list-managers to remove them), it’s a critical one in the fight against spam, saving a great deal of time and computing resources by cutting the spam messages off much earlier in the process.

But note that it deals with IP addresses. Today, of course, that means IPv4 addresses, those things that look like 192.168.0.1, and that there are around 4 billion of. 4 billion is a large number, but, as we’ve seen, it’s notably finite and manageable. It’s reasonable to take every IP address we ever see trying to send mail, and keep it on a list, sorting the addresses into the good ones and the bad ones. It’s feasible to block Internet connections from the ones in our list that are marked bad.
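A minimal sketch of the connection-time check described above, assuming a small in-memory list. The addresses are reserved documentation ranges standing in for real entries, and `should_refuse` is a hypothetical name, not part of any mail server's API:

```python
import ipaddress

# Hypothetical blocklist entries (documentation-range addresses, not real spam sources).
BLOCKED_ADDRESSES = {
    ipaddress.ip_address("192.0.2.10"),
    ipaddress.ip_address("198.51.100.77"),
}
# Whole networks can be listed as well as single addresses.
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def should_refuse(client_ip: str) -> bool:
    """Return True if a connection from client_ip should be refused outright."""
    addr = ipaddress.ip_address(client_ip)
    if addr in BLOCKED_ADDRESSES:
        return True
    return any(addr in net for net in BLOCKED_NETWORKS)


for ip in ("192.0.2.10", "203.0.113.5", "192.0.2.11"):
    print(ip, "refuse" if should_refuse(ip) else "accept")
```

The check runs before any email protocol exchange starts, which is where the savings in time and computing resources come from.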

Not so when we consider IPv6. Bumping the IP address from 32 bits to 128, bumping the 4 billion up to roughly 340 billion billion billion billion (the number doesn’t matter, at that point), makes it infeasible to keep a list of bad addresses. There are enough addresses there to allow the bad guys to use a new one every time, so we’d never see repeats. There are, of course, ways we can group addresses into large blocks, and know that any address we see in one of those blocks will be bad, but even that isn’t enough to make it work.
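The scale difference can be checked directly. This sketch also shows one way of grouping addresses into blocks; the /64 prefix length and the `covering_block` helper are assumptions for illustration, and the sample address is from the reserved documentation range:

```python
import ipaddress

IPV4_TOTAL = 2 ** 32    # the "around 4 billion" IPv4 addresses
IPV6_TOTAL = 2 ** 128   # about 3.4e38 IPv6 addresses


def covering_block(address: str, prefix_len: int = 64):
    """Return the block of the given prefix length that contains address."""
    return ipaddress.ip_network(f"{address}/{prefix_len}", strict=False)


block = covering_block("2001:db8:0:1::abcd")
print(f"IPv4 addresses: {IPV4_TOTAL:,}")
print(f"IPv6 addresses: {IPV6_TOTAL:.3e}")
print(f"{block} holds {block.num_addresses:,} addresses")
print(f"number of /64 blocks: {IPV6_TOTAL // block.num_addresses:,}")
```

Even when every address is collapsed into its /64, there are still 2^64 possible blocks, which is about 4 billion times the size of the entire IPv4 address space.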

We could switch to a pass list, a whitelist of known good addresses — that would still be small enough to be manageable — and refuse anything else. But that makes it very hard for an organization to deploy a new server, or for a new organization to join in.

John Levine has one approach: leave the email system on IPv4 for the foreseeable future. Even, John points out, when many other services, customer endpoints, mobile and household devices, and the like have been — have to have been — switched to IPv6, we can still run the Internet email infrastructure on IPv4 for a long time, leaving the IP blocklists with v4 addresses, and a system that we’re already managing fine with.

Of course, some day, we’ll want to completely get rid of IPv4 on the Internet, and by then we’ll need to have figured out a replacement for the IP blocklist mechanism. But John’s right that that won’t be happening for many years yet, and he makes a good case for saying that we don’t have to worry about it.

At least not until he and I have long been retired.

Monday, December 27, 2010

Taking anti-spam work personally

Via Brent comes the AP story of a man who quit his job, went to law school, and now sues spammers:

Eight years ago, Balsam was working as a marketer when he received one too many e-mail pitches to enlarge his breasts.
Enraged, he launched a Web site called Danhatesspam.com, quit a career in marketing to go to law school and is making a decent living suing companies who flood his e-mail inboxes with offers of cheap drugs, free sex and unbelievable vacations.
“I feel like I’m doing a little bit of good cleaning up the Internet,” Balsam said.

As Brent says, Go! Go! Go!

As for me, I say it’s too bad I didn’t have the confidence to do something like that when I left IBM.

Monday, October 25, 2010

Challenge/response still lives (barely)

Wow; I haven’t gotten one of these in a long time:

ATTENTION!
A message you recently sent to a 0Spam.com user with the subject "[redacted]" was not delivered because they are using the 0Spam.com anti-spam service. Please click the link below to confirm that this is not spam. When you confirm, this message and all future messages you send will automatically be accepted.

I wrote about challenge/response anti-spam systems about three years ago, but probably haven’t seen a challenge message in at least two years. I thought people had given up on them.

Alas, no. But if the last two years are anything to judge by, they’ve at least fallen further into disfavour.

Anyway, it’s worth a re-post, then, of my three-year-old item about them. All the problems, all the reasons one shouldn’t use them, are still valid now. So, here’s the link again: head over and read (or re-read) it.

Tuesday, September 28, 2010

Analyzing some spam

I got an amusing little piece of email spam this morning. Amusing, that is, from the point of view of someone who likes to figure out what the spammers are doing and what they’ve compromised in order to do it. Here’s the message, as displayed to me in gmail (I’ve inserted spaces in the URL and email addresses, so your browser won’t make them clickable):

from McDonald’s Survey Department. <survey @ mcdonalds.com>
reply-to survey @ mcdonalds.com
to
date Mon, Sep 27, 2010 at 15:01
subject McDonald’s Survey

Dear customer,

Please give us only 5 minutes of your valuable time to ask you some questions about our products . Please be aware that we will not ask you about any personal information.

In return, we will credit $90.00 to your account - just for your time.

If you want to answer our simply 8 questions , please click the link below :

http: //dyn248.ele.uri.edu/.mcdonalds.com/survey/index.html

Thank you for helping us to become better .

Sincerely, McDonald’s Survey Department.

Please do not reply to this email. This mailbox is not monitored and you will not receive a response.

Of course, the message isn’t really from anyone at mcdonalds.com, but you knew that.

The first interesting thing is the URL. As is often the case with spam URLs, they’ve tried to make it look like a legitimate URL from the company by sticking their domain name in there somewhere — in this case, it’s after the slash, and one has to know how to read URLs to understand that putting it there just makes it information that’s passed to the web server, and has nothing to do with what web server gets used.
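A small sketch of that reading of the URL, using Python's standard URL parser. The URL is the one from the message with the inserted space removed; `real_host` is a hypothetical helper name:

```python
from urllib.parse import urlsplit


def real_host(url: str) -> str:
    """Return the host that would actually be contacted for this URL."""
    return urlsplit(url).hostname


url = "http://dyn248.ele.uri.edu/.mcdonalds.com/survey/index.html"
parts = urlsplit(url)

print("host:", real_host(url))   # the server that gets the request
print("path:", parts.path)       # just data passed along to that server
```

The `mcdonalds.com` text appears only in the path, so it has no say in which server receives the request.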

And the web server it’s pointing us to is at uri.edu, which is what piqued my interest. This isn’t some throwaway domain, nor anything else registered by the spammer, but something residing at the University of Rhode Island. In particular, this looks like a temporary name assigned to some computer connected to U of RI’s network.

My guess is that a student machine was compromised — malware got installed on it — and the malware set up a hidden web server that’s meant to handle these requests.

Let’s look at where the email message really came from, by checking out the Received lines in the headers. Here are the two operative ones:

Received: from www-7419bfef271.modrsoft.com ([218.24.93.98])
    by hormel7.ieee.org (8.13.8/8.13.8/Debian-3)
    with ESMTP id o8S55UDI020590; Tue, 28 Sep 2010 01:05:32 -0400
Received: from User ([99.97.107.229]) by www-7419bfef271.modrsoft.com
    with Microsoft SMTPSVC(6.0.3790.4675); Tue, 28 Sep 2010 02:41:35 +0800

Reading bottom up, the message was submitted by an IP address in SBC Internet Services, to an IP address at Modrsoft, a legitimate service provider in China. The spammers appear to have found an open relay in Modrsoft’s network, or else Modrsoft doesn’t block port 25, and they compromised a machine there, as well.

Here’s what it looks like:

A compromised computer on SBC’s network was ordered to submit the spam message.
It submitted it to a compromised computer on Modrsoft’s network.
That computer relayed the message to its recipients (including me).
The message directs users to a clandestine web server on a compromised machine at University of Rhode Island.
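The hops in that trail can be pulled out of the two Received lines mechanically. This is a rough sketch that assumes the exact `from NAME ([IP]) by HOST` shape seen here; real Received lines vary widely and can be forged, so the pattern and the `trace` helper are illustrative only:

```python
import re

# The two Received lines from the message, unfolded, newest first as in the headers.
RECEIVED_LINES = [
    "from www-7419bfef271.modrsoft.com ([218.24.93.98]) "
    "by hormel7.ieee.org (8.13.8/8.13.8/Debian-3) "
    "with ESMTP id o8S55UDI020590; Tue, 28 Sep 2010 01:05:32 -0400",
    "from User ([99.97.107.229]) by www-7419bfef271.modrsoft.com "
    "with Microsoft SMTPSVC(6.0.3790.4675); Tue, 28 Sep 2010 02:41:35 +0800",
]

# Matches only the simple form used above: claimed name, bracketed IPv4, receiving host.
HOP = re.compile(
    r"from\s+(?P<claimed>\S+)\s+\(\[(?P<ip>[0-9.]+)\]\)\s+by\s+(?P<by>\S+)"
)


def trace(received):
    """Return (claimed_name, sender_ip, receiving_host) tuples, oldest hop first."""
    hops = []
    for line in reversed(received):  # read bottom up
        match = HOP.search(line)
        if match:
            hops.append((match["claimed"], match["ip"], match["by"]))
    return hops


for claimed, ip, by in trace(RECEIVED_LINES):
    print(f"{ip} (calling itself {claimed}) handed the message to {by}")
```

The name after `from` is whatever the sending machine chose to call itself; the bracketed IP address, recorded by the receiving server, is the part worth trusting.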

Unfortunately, the trail goes cold there: I tried to snag the web page, to see what it’s meant to do... but I can’t contact a web server at that address. The machine has been taken offline, has a new address, or has been cleaned up. In any case, it’s not serving the bad guys at the moment. That’s often true of these things: they may only work for a brief time, but they can certainly do their work in that time. They might do the dirty work directly, or redirect you to another web server that will.

Probably, visiting that web site with a susceptible browser (or user) would result in the installation of malware on the visiting computer, adding it to the zombie network. In addition, they’re offering $90 to your account for participating, so they’ll obviously be asking you to give them some sort of account information where they can deposit the money: an account they’ll actually be sucking dry as soon as they have access to it.

Too bad I didn’t get to it soon enough, to see for sure what the web page is trying to do.

Friday, September 17, 2010

Kindle and security

Wednesday, I talked about Amazon’s email-in service, which lets you send documents to your Kindle by email. The nicest part of it for me is the PDF conversion feature, but you can, in general, send any personal documents you like, with or without conversion to AZW.

The way it works is this:

When you buy your Kindle, it’s automatically registered to your Amazon account, so ebooks that you buy there are pushed to the Kindle for you. You also get an email address at kindle.com (and also free.kindle.com), and documents you send there are sent on to your Kindle — free if they’re sent by WiFi, and for a small fee if they’re sent over 3G (if you want to make sure you’re not charged, you can send things only to the free.kindle.com address).

You can control who’s allowed to send stuff to your Kindle by listing the authorized email addresses at the Manage Your Kindle page, or through the settings on the Kindle itself, and the only address that’s authorized by default is the one you use for your Amazon account. Makes sense.

But here’s the thing: there’s no password or other security, other than the sender’s email address. You may or may not know this, but it’s trivial for anyone to send email using someone else’s email address. Anyone who knows my email address can guess that I might use that same address on Amazon, and the address to send to at kindle.com defaults to the left-hand side of that address. So it would not be hard for anyone to send stuff to my Kindle, whether I want them to or not, and whether I want what they’re sending or not.

So what? If people want to send me free ebooks, why is that a problem?

It’s a problem we’re all aware of: spam. Because it’s not just ebooks that can be sent; PDFs, MS Word documents, and plain text can all be sent, as well as pictures and other images. Imagine getting a kindle-ful of advance-fee fraud scams, Viagra ads, and pornographic images. And then imagine paying for those, if you have a 3G Kindle (I don’t, so it’s all free over WiFi).

The good thing is that Amazon’s Manage Your Kindle page lets you do three things that help here:

set the maximum charge allowed for any one document sent to your Kindle,
change the email addresses that can send to your Kindle, and
change your Kindle’s email address.

Because I never want to accept any charges, I’ve set the maximum charge to zero. I’ve also removed the authorization for my regular email address, and authorized only an email address that no one knows. And, most importantly, I’ve changed the email address of my Kindle to something unguessable, essentially a strong password.

I recommend that everyone do the same (except perhaps for the maximum charge, if you want to be able to send things yourself that you’ll be charged for). At the least, everyone should change her Kindle’s email address to something that isn’t likely to be a target for spammers, and that means something long and relatively ugly.
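One way to come up with something long and relatively ugly is to let a random generator pick it. A minimal sketch; the length of 20 and the lowercase-plus-digits alphabet are assumptions, and what Amazon actually accepts in a Kindle address may differ:

```python
import math
import secrets
import string

# Assumed alphabet: lowercase letters and digits only, to stay safe in an email address.
ALPHABET = string.ascii_lowercase + string.digits


def unguessable_local_part(length: int = 20) -> str:
    """Return a random left-hand side for an email address."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


local_part = unguessable_local_part()
bits = math.log2(len(ALPHABET)) * len(local_part)
print(local_part)
print(f"roughly {bits:.0f} bits of randomness")
```

The `secrets` module is used rather than `random` because the address is acting as a password.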

I’m sure that Amazon does spam filtering on kindle.com, but we all know how much gets by the spam filters, in general. I can’t wait until Kindle spam joins email spam, Facebook spam, Twitter spam, and the rest.

Thursday, February 04, 2010

DXing the spam

An old hobby of mine, but one I haven’t engaged in for more than 30 years, is amateur radio (also called “ham radio”, for reasons lost in antiquity^[1]). I keep my license current (N3BL), but, while I was active in high school and college, I dropped it after that, initially because I had no way to install an antenna, and then mostly because I was too busy with other things.^[2]

Many “hams” deal with the antenna issue by sticking to higher frequencies — VHF and UHF, the two-meter band and above — which need only small antennas that can pretty much go anywhere. But communication at those frequencies is generally short-distance (due to radio propagation issues) and by voice (by convention), and my preference, when I was into it, was long-distance communication with Morse code.

There’s a ham-radio term for long-distance communication: DX, following a convention of using an initial with an “x” to replace the rest of the word (TX is transmitter or transmission, RX is receiver or reception, and so on). What qualifies as DX varies by context; a contact on UHF that’s 200 miles away would be DX there, but on the 20-meter band one expects to contact the world.

As a general term, “DXing” usually refers to talking with people in other countries. In fact, there’s a certificate one can get, called DXCC (DX Century Club), which acknowledges that one has demonstrated contact with 100 countries. There are stickers for increments of 50 beyond.

The other day, it occurred to me that we could do a new kind of DXing, and maybe look toward a DXCC for it. We could record the countries from which we got spam. A sort of Internet DXing for the 21st century, yes?

Of course, we have to have some sort of “rules” to determine the country of a spam message. I’ve decided on these, admittedly a bit fluffy, in order of priority:

If the sender claims to be from a particular country, accept that, whether or not it’s actually true. “Hello I am Mrs Amassa Smith, and I am from Ouagadougou.”
If there is a URL in the message, use the country in which the domain is registered. If there are multiple URLs with domains from different countries, this rule does not apply. It also doesn’t apply if the domain is AOL, Facebook, or the like.
If the body specifically lists an email address to contact, use the country that the domain is registered in. The email address must be listed in the message body — the address it claims to come from in the header doesn’t count. Common email domains, such as aol.com, gmail.com, and yahoo.com, also don’t count.
Check the first reliable Received line in the email headers, and use the country in which that domain is registered. In considering “reliable”, one has to account for possibly forged Received lines.
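The priority order of those rules can be sketched as a function. It assumes the hard parts are already done: something else has worked out the claimed country, looked up where each domain is registered, and filtered out the big shared domains (AOL, Facebook, gmail.com, and the like). The function and parameter names are hypothetical:

```python
from typing import Optional, Sequence


def spam_country(
    claimed: Optional[str],
    url_countries: Sequence[str],
    contact_country: Optional[str],
    received_country: Optional[str],
) -> Optional[str]:
    """Apply the four rules in priority order and return the first country found."""
    # Rule 1: accept whatever country the sender claims, true or not.
    if claimed:
        return claimed
    # Rule 2: URL domains count only if they all point to the same country.
    distinct = set(url_countries)
    if len(distinct) == 1:
        return next(iter(distinct))
    # Rule 3: a contact email address listed in the message body.
    if contact_country:
        return contact_country
    # Rule 4: the first reliable Received line (None if nothing is usable).
    return received_country


print(spam_country("Burkina Faso", ["Russia"], "Italy", "China"))
print(spam_country(None, ["Russia", "Chile"], "Italy", "China"))
```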

Of course, this means I have to actually start looking at my spam, more than I already do (I actually do scan it now, and read some of the ones that look like they might be amusing), but it should be fun to keep track of the list. So far, in three days of checking, here’s what I have (go here for the current list):

Nigeria
Côte d’Ivoire
Canada
Sierra Leone
Iraq
Russia
England
Benin
Italy
Taiwan (Republic of China)
France
United States
Korea
United Arab Emirates
Japan
Chile
Gabon
India
Ukraine
China (People’s Republic of China)
Kuwait
Germany
Andorra
Australia
Brazil
Spain
Zimbabwe
Romania
Thailand
Guatemala
Indonesia
Hungary
Dubai
Netherlands
Singapore
South Africa

I’m well on my way. Who wants to play too?

Update: I have created a permanent page to hold the current list.

^[1] There are many suggested origins, all of questionable veracity; let’s just stick with “origin uncertain.”

^[2] I’ll note that some of the commenters to these pages are also hams; I know that Jim, Ray, and Brent are.

Thursday, January 28, 2010

No more anonymous commenting

Speaking of spam: I am turning off the ability to comment anonymously. I’m getting tired of rejecting spam comments, which I’m getting at a rate of ten to twenty a day. Almost all of the spam comments are anonymous, and almost all of the anonymous comments are spam.

Those few of you who don’t want to identify yourselves in any real way can still create a Google or OpenID account using a pseudonym, and you can even change your pseudonym from time to time, if you like. But you'll have to log in, and the “anonymous” choice won’t be there any more.

I’m sorry to have to do this, but, well, blame the spammers.

A new spam study

According to the lede in a New Scientist article from Monday:

Spammers’ own trickery has been used to develop an “effectively perfect” method for blocking the most common kind of spam, a team of computer scientists claims.

The team turned one of their computers into a zombie, but, well, not quite: they were still in control of it, even while it was part of its botnet. And while it followed the orders of the botnet controller, the researchers recorded and analyzed what was going on.

In particular, they looked at the variations in the messages, and used that to form a profile of the spam the botnet was generating:

After analysing 1000 emails generated by this compromised machine — less than 10 minutes’ work for most bots — the researchers were able to reverse-engineer the template. Knowledge of that template then enabled filters to block further spam from that bot with 100 per cent accuracy.
High accuracy can be achieved by existing spam filters, but sometimes at the cost of blocking legitimate mail. The new system did not produce a single false positive when tested against more than a million genuine messages, says Andreas Pitsillidis, one of the team: “The biggest advantage is this false positive rate.”

How useful is this?

Not very. It’s interesting as a case study — and I’d like to see the paper that came out of this work. But it has little practical value. First, as Michael O’Reirdan points out in the article, even if we can stop a spam run one minute in, much less ten, the botnet would have sent out millions of messages already.

Second, for this to be of more than passing interest, we’d have to make sure the people using it had machines on every spam botnet out there, or at least most of them.

Third, smart botnet software can get around this mechanism by changing its template every couple of minutes, and can even learn to detect the spy machine and isolate it from the botnet. In the worst case, it might even be able to feed the spy bad information that could result in the blocking of legitimate mail — just the opposite of the zero false-positive rate the researchers are so happy with.

Finally, it’s not really a surprising result, that infiltrating a botnet allows us to figure out how it works and to temporarily interfere with its operation. But botnet software changes rapidly, and we have to keep learning as it changes.

I like the idea of using this to investigate and experiment with botnets. But let’s keep our expectations realistic. This, as everything else that anyone’s proposed, is not the Final Ultimate Solution to the Spam Problem.

Friday, October 16, 2009

New technique for comment spam

The other day, I left a comment on this post at the 360 blog — a math-related blog run by some math professors in Rochester, NY. I went back the next day to see if there were responses to my comment, and I found that a few hours after I posted it, someone had re-posted the same comment, adding the line, “Sorry... forgot to say great post - can’t wait to read your next one!” Click the screen-shot on the right to enlarge it, and you’ll see what it looked like.

Now, I didn’t post that second one, and there are a few clues to that:

I wouldn’t re-post the whole comment, just to add that line.
I wouldn’t say the added line, in any case: it’s too trite.
In a few ways, the added line isn’t up to my standards of punctuation.
[You might not know any of that, but the last two are more obvious.]
My photo isn’t there on the second comment.
You can’t do it in the screen-image, but if you had put the mouse over my name in the two comments, you’d see that the first one correctly links to these pages, while the second one had a link to somewhere else.

And that link to somewhere else was, of course, the point of the second comment. Someone — undoubtedly some automated process — plucked the most recent comment off their blog entry, appended the extra line, and re-posted it with the same name, but with a different link. It’s link spam. But it’s link spam using a technique I haven’t seen before. It’s actually quite a clever idea.

My comment was acceptable to the target blog, as seen by its presence there. So by using my name and repeating the content of my comment, the spam comment was expected to pass muster. And it did — in fact, it bypassed the blog’s moderation queue, because I’m a known commenter. They added an innocuous line that was unlikely to override the other good points and trigger suspicion, but which provided a semi-plausible excuse for the re-posting.

Of course, they weren’t actually logged in as me, so they didn’t get my profile photo (and they couldn’t easily fake that, because the system would require them to register an account in order to have a photo there).

What they’re doing is one of the sleazy aspects of the business that’s come to be called “Search Engine Optimization” (SEO). The legitimate part of the business involves giving people advice on designing their web site to maximize the likelihood that the site will show up as one of the top “hits” when someone searches for their business’s name, or for related search terms. The sleazy part involves using techniques like link spam to artificially push their site up in the search results. Because Google uses the number of links to a web site from other sites as a factor in gauging the site’s popularity, and, therefore, likely relevance to a search, it’s in the SEO folks’ interest to pump up the number of links to their clients’ web sites.

Every time they manage to get a link to a client’s site into a comment on someone’s blog, they get one more tick mark from Google for it. They’re one step closer to pushing the client’s site higher in the search results.

And I’m having none of it. Just as email spammers give a bad name to companies that use email marketing appropriately and responsibly, these link-spammers give a bad name to responsible SEO consultants who do their work by helping their customers design good web sites.

Monday, August 17, 2009

Charitable donations to send email

In 2004, Mark Wegman, Peter Capek, Scott Fahlman, and I wrote a paper about using charitable donations to “stamp” email messages as a mechanism against unwanted mass mailings (see Charity Begins at… your Mail Program (PDF)). The paper was not accepted at that year’s Conference on Email and Anti-Spam (CEAS).

In this year’s CEAS, Yahoo! presented a similar paper, profiling a micro-donation system that they’re piloting (see CentMail: Rate Limiting via Certified Micro-Donations (PDF)). Running code is always more compelling, and five years brings a change in focus.

The basic idea is that if you take a company’s or an individual’s existing donations, you can break them into small chunks of, say, one cent each, and count each of those chunks toward a “stamp” for your mail. On the theory that a spammer sending 50 million messages would not be willing to spend half a million dollars to stamp them, you can give at least some preference through your spam filtering to stamped mail.
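The arithmetic behind that theory is simple enough to check. A small sketch, assuming the one-cent stamp from the example above; `stamp_cost_dollars` is a hypothetical helper:

```python
def stamp_cost_dollars(messages: int, cents_per_stamp: int = 1) -> float:
    """Return the total donation, in dollars, needed to stamp every message."""
    return messages * cents_per_stamp / 100


spam_run = 50_000_000       # the spammer's 50 million messages
personal_mail = 30 * 365    # assumed: someone sending 30 messages a day for a year

print(f"spammer:    ${stamp_cost_dollars(spam_run):,.2f}")
print(f"individual: ${stamp_cost_dollars(personal_mail):,.2f}")
```

The same one-cent price that costs the spammer half a million dollars costs an ordinary sender, under that assumed volume, about a hundred dollars a year, and that money comes out of donations they were already making.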

MacGregor Campbell just wrote an article for New Scientist about Yahoo’s CentMail pilot, and he refers back to the IBM work and quotes Scott Fahlman and me (though he didn’t get the part that I’m not with IBM any more).

Here’s what he got from me:

Barry Leiba, also at IBM, points out that one of CentMail’s core features could also be a weakness, though.
People may not wish to receive messages plugging a cause they don’t agree with. “I might feel that by accepting his messages, I’m implicitly supporting his charity choices — choices that I might be vehemently against.”

I don’t think this is an insurmountable problem, but I don’t know how to get around it. Here’s the scenario in full, as I gave it to Mr Campbell:

I have a colleague whom I like and respect professionally, and with whom I get on well personally... except that we’re politically opposite. If we should start using Charity Seals or CentMail, I might feel that by accepting his messages, I’m implicitly supporting his charity choices — choices that I might be vehemently against.

Note that this issue exists whether or not we disclose the specific charities. The fact that I know what kind of organisations he’s likely to donate to is sufficient to trigger it. So we can’t mitigate this just by saying (as Yahoo! appears to be doing) that the message is stamped, without saying to what charity the sender gave money.

The responses I get to this concern are usually either

“That won’t really be a problem,” which amounts to summary denial, or
“We’ll only choose non-controversial charities,” which I think is somewhat naïve, and perhaps unworkable.

It might indeed be that it won’t turn out to be a problem. We won’t know that until it’s out there, and we see how it works. I worry, though, that if it does become a problem, it’ll be harder to solve at that point.

That said, I think the charitable donations thing is a good idea, worth pursuing, piloting, experimenting with. I’m eager to see how Yahoo’s program goes.

Wednesday, July 29, 2009

“Send to a friend” abuse

I’ve started getting a few “419 scam” messages using the “email this cartoon” feature of the Dilbert web site as a vector. The 419 scam, also called the Nigerian scam, is that form of advance-fee fraud we see in email so much, where someone sends you a message claiming that he’s the son of the deposed Nigerian president, or some such, and promising free money if you will only help.

The Dilbert web site needs no introduction, I’m sure. Below the day’s Dilbert cartoon is a convenient “Email” button.

Now, the thing about the button is that it pops up a nice, convenient mini-window in your browser, and the window has fields for the sender’s name, your name and email address, and a “personal message”. You fill those in, you press “Send”, it sends the email... and then you can press a button to send another. If you do that, it retains what you put into all three fields. And there’s no CAPTCHA.

You can see how easy it would be to use this to send a boatload of identical messages. Once you get started, it’s a sequence of clicking “again”, pasting another email address into the destination, and clicking “send”. The scammers have seen that, too, obviously: I got one yesterday that looked something like this:

Subject: The Good Lord Loves You is sending you some Dilbert!

Your friend The Good Lord Loves You wanted us to send you this from Dilbert.com.

Message from The Good Lord Loves You:

[419-scam message goes here, something about a church and orphans and whatnot. And money; a lot of money.]

[Image of Dilbert cartoon goes here.]

Sigh. Leave it to the fraudsters to ruin “email” links for the rest of us.

On the other hand, as I tell all my friends: if you want to send someone a pointer to a web page... copy and paste the URL, and send them that. The email message comes from you, you can put your own personalization on it, and you haven’t given your friend’s email address to the web site.

Repeating that last point: please don’t give random web sites your friends’ email addresses. It’s not hard to send the email yourself.

Monday, May 18, 2009

Can zombies live again?

Some UCSB researchers managed to infiltrate the command-and-control system of a botnet, and got lots of information out of it, which they wrote up in a paper.

Their results are interesting to read. But, really, I’m not at all surprised that lots of people continue to get their computers infected, that so many use bad passwords, or that so many use the same password on many web sites. It’s always nice to get specific data on all that, but it’s something we’re well aware of.

What I find especially troubling is this part:

Interestingly, a large number of the financial institutions that had been breached required “monumental effort” in order to notify the victims, according to the report. In fact, financial institutions weren’t the only ones—interacting with registrars, hosting facilities, and law enforcement were all “rather complicated,” indicating that there’s a long way to go in order to make notifying botnet victims easier.

Unfortunately, the reporter got the “monumental” sense completely wrong; they did not say that a large number of the institutions required monumental effort, but that “the large number of institutions that had been breached made notifying all of the interested parties a monumental effort.” That’s not the same thing.

Still, there’s a problem, so let’s say that again, taking it from the paper itself:

Another insight obtained from the experience of taking over the botnet was that interacting with registrars, hosting facilities, victim institutions, and law enforcement is a rather complicated process. In some cases, simply identifying the point of contact for one of the registrars involved required several days of frustrating attempts. We are sure that we have not been the first to experience this type of confusion and lack of coordination among the many pieces of the botnet puzzle.

They suggest that U.S. law could make this significantly better by imposing “simple rules of behavior,” not on the criminals, but on the entities that one has to involve in reining the criminals in.

I’m skeptical of that. Perhaps it’s true in theory, but experience shows that laws help little in this area, and, in fact, a poorly crafted law can actually make things worse, when parties are forced to adhere to the letter, rather than to the spirit.

What would work better, at least on the U.S. side, is the designation of an organization responsible for sorting through the pieces — I suggest the Federal Trade Commission, which is already responsible for dealing with many aspects of the spam problem. I’m not surprised that the researchers had trouble getting through all of this: the organizations involved each had to confirm, to their own satisfaction, that the story they were being given was true, and that they weren’t dealing with yet another set of “bad guys” who were trying to hack the system. And in cases where legal devices (such as search warrants) might be needed, the researchers were likely unfamiliar with the law, and not used to dealing with those requirements. A department in the FTC would be properly equipped to pursue this sort of thing much more efficiently and effectively.

Saturday, January 17, 2009

Spam prose and poetry

I always like it when spam arrives in a batch that sort of goes together in some linguistic or “literary” way. Sometimes it’s a batch of similar nonsense that makes a kind of haiku-like thing. Sometimes it’s... well, see for yourself. Here’s a set of three that arrived recently, at about the same time, apparently from the same spam run (the content was the same in all three, urging me to purchase some Viagra-like philtre). These are the subject lines — reordered by my own artistic license, but the words are all the spammer’s:

Put your doughnut in her oven
Postpone your love bomb’s explode
She’ll reward you so much

Saturday, December 27, 2008

Anti-spam vs free speech

In a recent Washington Post op-ed piece, writer James McGrath Morris looks at the fight against spam with an eye toward its effect on free speech:

Spam was once a simple annoyance. But its exponential growth — reports suggest that about 90 percent of all e-mail is spam — has led e-mail users to build daunting ramparts to block unwanted messages and companies to circulate blacklists of alleged spammers. One cannot fault people for seeking ways to avoid unwanted or aggressive solicitations, but the consequences of some anti-spam measures may not be what the people seeking protection from spam had in mind. Some efforts to block unwanted e-messages are threatening free speech on the Internet.

Mr Morris’s essay leans in a particular direction, of course: as a writer, he cares more about getting his message across than he does about keeping people’s inboxes clean and usable. So take his column with a grain or two of salt. But don’t over-season it, either, because he has a valid point.

Your right to free speech ends at my door — or, in the e-case, my electronic mailbox. You have a right to make your opinion available to me, but you don’t have a right to force me to hear it, to see it, to read it. Turned around, that means that I have the right to refuse to accept it into my inbox. I have a right to use spam filters that might disallow your message, even if it’s a false positive — that is, even if your message is collateral damage of the war.

But where Mr Morris strikes a chord is where he addresses the spam filters that are not under your control, but are run on your behalf by your service provider — often without your knowledge, input, or acceptance. Users of services such as Gmail and Yahoo mail, for example, have their mail filtered for them, and they cannot turn it off or adjust its operation. By receiving your mail through such services, you agree to this, and you accept that you might “lose” messages that you’d actually want, but that have been classified as spam because they contain phrases such as those Mr Morris mentions:

The inclusion of “young adult,” “getting nasty” and “hot” among the thousands of words in my publication was like poison. Indiscriminate spam-blocking software would spot those words, ignore the context and group my newsletter with unsolicited e-mails from purveyors of smut.

Such messages are shunted off to a “Spam” or “Junk” folder, which you might or might not check from time to time — likely not, if you don’t keep on top of it, because the task gets too daunting as the size of the folder grows beyond the scope of a brief glance. Unless you know you’ve missed something, and thus know what you’re looking for, you’re not likely to find it in the mess of misleading missives designed to say, “Open me!”

To be sure, some spam filters are better than others, and there’s lots of spam-blocking software out there that’s better than “indiscriminate”. That said, when one doesn’t configure and run the software oneself, one doesn’t have control of the discrimination. And any anti-spam software, no matter how discriminating, will have “false positives” — those messages that are incorrectly classified as spam. Those messages that Mr Morris worries about.
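To see why context-free matching produces false positives, here is a minimal sketch in Python of the kind of “indiscriminate” filter Mr Morris describes. The phrase list is taken from his quote above; the function name and the sample newsletter sentence are made up for illustration, and no real filter is claimed to work exactly this way.

```python
import re

# Phrases taken from the op-ed quoted above; a real filter's list would differ.
FLAGGED_PHRASES = ["young adult", "getting nasty", "hot"]


def indiscriminate_matches(text):
    """Return the flagged phrases found in text, ignoring all context.

    A non-empty result means this naive filter would call the text spam.
    """
    lowered = text.lower()
    return [phrase for phrase in FLAGGED_PHRASES
            if re.search(r"\b" + re.escape(phrase) + r"\b", lowered)]


# A hypothetical, perfectly innocent newsletter sentence.
newsletter = ("This month: a review of a new young adult novel, and why the "
              "campaign is getting nasty as the weather turns hot.")
print(indiscriminate_matches(newsletter))
```

Every phrase matches, so the legitimate newsletter is flagged. A more discriminating filter would weigh such phrases against the rest of the message rather than treating any single match as decisive.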

So, yes, writers start thinking about that, and alter the way they write things to accommodate it. At the same time, of course, so do the spammers. The result looks like a tug-of-war contest, with Mr Morris and his compatriots pulling on one end of the rope, spammers pulling on the other, and a deep pile of mud in the middle. In the end, some of Mr Morris’s team will wind up in the mud.

The problem is that if we don’t filter what we think is probably spam, our inboxes will look, all of them, like that spam folder does now, and Mr Morris’s newsletters, and other messages like them, will be lost, invisible, unfindable, free speech and all. It does little good to orate from one’s soap box on the corner, when the traffic noise completely drowns one out.

It’s easy to blame the spammers — it is their fault that we are where we are with this, after all. But pointing at them doesn’t help solve the problem. They deserve the blame, but it does us no good to place it there. Instead, we just have to keep working on anti-spam technology, getting our “hit rate” up and our “false positive rate” ever lower.
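For readers who want the two rates pinned down, here is a minimal sketch in Python. The function name and the sample counts are invented for illustration; they are not measurements from any real filter.

```python
def filter_rates(spam_caught, spam_missed, good_flagged, good_delivered):
    """Return (hit_rate, false_positive_rate) for a spam filter.

    hit_rate: fraction of actual spam that the filter caught.
    false_positive_rate: fraction of legitimate mail wrongly flagged as spam.
    """
    total_spam = spam_caught + spam_missed
    total_good = good_flagged + good_delivered
    hit_rate = spam_caught / total_spam if total_spam else 0.0
    false_positive_rate = good_flagged / total_good if total_good else 0.0
    return hit_rate, false_positive_rate


# Hypothetical counts, made up for this example.
hit, fp = filter_rates(spam_caught=9500, spam_missed=500,
                       good_flagged=2, good_delivered=998)
print(f"hit rate {hit:.1%}, false positive rate {fp:.1%}")
```

The two rates are computed over different populations (spam for the first, legitimate mail for the second), which is why a filter can improve one without improving the other.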

We aren’t going to win the war as long as spam is cheap — bordering on free — to send. The business model that supports spam is just too strong. We can just keep making our weapons sharper, and understand that there’ll be collateral damage.

Wednesday, December 24, 2008

Having trouble viewing this blog?

I’ve been getting a great deal of spam, lately, that has a particular characteristic: the body of the message has nothing more than an image with alternative text that says, “Having trouble viewing this email? Click here to view as a webpage.” The image (and the alt-text) is a clickable link to the web page they want you to go to. The spammers have clearly clued in to the fact that many email readers are refusing to automatically show images from unknown sources.

Spam like that — essentially image only — is more difficult to filter than text is, but we do have technology for it. Here’s an example of one of these images from recent email (click to see it full-sized):

Note a few things about it:

While I don’t have other images to compare this one to, there doesn’t appear to have been any attempt to use background variations or funny colour mixtures in order to make this image different from those in other spam messages.
The image contains easy-to-read text. Nothing’s been distorted.
The message is straightforward, but short; there isn’t a lot of text to work with.
What text there is would be considered very spam-like if it were plain text being analyzed by a typical spam filter.

The first level of technology we have is similarity checking: how similar is this image to images that appear in known spam messages? In this case, it’s likely that the image is identical, or nearly so, to known spam, so it will have been caught by that.
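One simple way to do that kind of similarity check can be sketched in Python. This uses an “average hash”, a basic perceptual hash chosen here only for illustration; it is an assumption, not a description of what any particular spam filter uses. The function names, the distance threshold, and the tiny 4x4 “images” are all hypothetical, and a real system would first decode the image and shrink it to a small grayscale grid.

```python
def average_hash(pixels):
    """Compute an average hash for a small grayscale image.

    `pixels` is a list of rows of brightness values (0-255). Each bit of the
    result is 1 where a pixel is brighter than the image's mean brightness.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equal-length hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


def looks_like_known_spam(pixels, known_hashes, max_distance=2):
    """Return True if the image is within max_distance bits of a known hash."""
    candidate = average_hash(pixels)
    return any(hamming_distance(candidate, known) <= max_distance
               for known in known_hashes)


# Tiny 4x4 stand-ins for real images, made up for this example.
known_spam = [
    [250, 250, 10, 10],
    [250, 250, 10, 10],
    [10, 10, 250, 250],
    [10, 10, 250, 250],
]
# The same picture with slight brightness noise added.
resent_spam = [
    [245, 251, 12, 9],
    [249, 240, 15, 11],
    [8, 14, 247, 252],
    [13, 10, 251, 246],
]
# An unrelated picture: bright on top, dark below.
unrelated = [
    [250, 250, 250, 250],
    [250, 250, 250, 250],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]

known_hashes = [average_hash(known_spam)]
print(looks_like_known_spam(resent_spam, known_hashes))
print(looks_like_known_spam(unrelated, known_hashes))
```

Because the hash depends only on which pixels are brighter than average, small changes in brightness leave it untouched, while a genuinely different picture lands many bits away.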

The next step is to do character recognition within the image. We can easily pick out clean, clear, non-obfuscated text from images, and then treat it as plain text (we can also do it with some degree of obfuscation, though we’re not as accurate with that). This image should certainly have been no problem there, and it will have been caught by this technique as well.

Finally, there’s image analysis that looks for certain characteristics, such as shapes, colours, element edges, and the like. The success with this sort of thing is a bit hit-and-miss. It’s the sort of thing that might be used to guess that there are naked bodies, and classify an image as pornography... though it will often get that wrong. It probably could have detected the MasterCard logo, but it wouldn’t be reasonable to declare a message as spam just because of that.

What’s more, a great deal of this spam is sent as purportedly coming from me — from my own email address. That, too, will often trip the spam filters, and likely did (all of these messages were, indeed, classified as spam). One wonders why spammers persist in doing this. While it’s certainly reasonable, and sometimes common, for someone to send email to herself, such mail is not likely to come from arbitrary places on the Internet.
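A check along those lines might look like the following Python sketch. It assumes the filter knows the IP address of the connecting server and has a list of networks it trusts for its own users’ mail; the function names, the example addresses, and the network range are all hypothetical.

```python
import ipaddress
from email import message_from_string
from email.utils import getaddresses, parseaddr


def claims_to_be_from_recipient(raw_message):
    """Return True if the From address also appears among the To/Cc addresses."""
    msg = message_from_string(raw_message)
    sender = parseaddr(msg.get("From", ""))[1].lower()
    recipient_headers = msg.get_all("To", []) + msg.get_all("Cc", [])
    recipients = {addr.lower() for _, addr in getaddresses(recipient_headers)}
    return bool(sender) and sender in recipients


def self_sent_from_outside(raw_message, connecting_ip, trusted_networks):
    """Return True for "mail to myself" that arrives from an untrusted address."""
    ip = ipaddress.ip_address(connecting_ip)
    from_trusted = any(ip in ipaddress.ip_network(net)
                       for net in trusted_networks)
    return claims_to_be_from_recipient(raw_message) and not from_trusted


# Hypothetical message and networks, using documentation-reserved addresses.
raw = ("From: Me <me@example.org>\n"
       "To: me@example.org\n"
       "Subject: Re: Order status\n"
       "\n"
       "Having trouble viewing this email?\n")
trusted = ["192.0.2.0/24"]

print(self_sent_from_outside(raw, "203.0.113.7", trusted))
print(self_sent_from_outside(raw, "192.0.2.15", trusted))
```

The first call reports a suspicious message, because the sender claims to be the recipient but connects from outside the trusted range; the second does not. In practice this would be one signal among many rather than a verdict on its own.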

Of course, they also use the usual spammer trick of various innocuous-sounding subject lines, often with the “Re:” prefix to try to make the recipient think it’s actually a reply to an earlier communication. Surely, no one is fooled by that any more. Here are some of the subjects I’ve recently seen for these types of messages:

Re: Message
Re: Order status
Re: Your inquiry
Delivery Status Notification (failure)

That last, of course, is meant to make me think this is a “bounce” message, and that some important message I sent to someone never got delivered... so I’d better look to see what it was.

What’s good is that we’re pretty much catching all of these. What’s bad is that the spammers just keep sending more and more and more of them.

Monday, December 08, 2008

Nigerian scams work

Bloggers are calling her the stupidest woman alive, which is perhaps unfair. The most gullible, maybe, or most naïve. In any case, a woman from Sweet Home, Oregon is making big news for having sent about $400,000 to Nigerian scammers, in one small increment after another:

A Sweet Home woman has fallen prey to an international network of professional scam artists, and now she’s working with a money management service to climb out of $400,000 in debt.
“As a reverend and an American, I just wasn’t prepared for that level of dishonesty,” said Janella Spears.
Spears, a nursing administrator in Lebanon and a volunteer at Sweet Home Community Chapel, lost the huge sum of money through many small payments to various e-mail scammers that claimed to be from Canada, Texas, Africa and other places.
She’s been defrauded by people claiming to represent banks, credit companies, her relatives, the police and even the FBI.

It’s hard to imagine how she could not have heard of this before, how she could continue to believe it, how she could believe that she would be getting personal messages from the head of the FBI and the President of the United States, and how she could ignore the advice of everyone from friends to police, who were telling her that it was a scam.

But if you’ve ever wondered why the scammers persist in sending out those ridiculous messages... there’s why.

Tuesday, November 11, 2008

New Storm worm subject lines

The “Storm” worm is so named because it initially lured its victims with subject lines about horrific storms in Europe and Asia. Clicking the “video” link in the message or running the attachment has you install a “video player” to see the purported news item. Of course, it’s not a video player, but malware that will turn your computer into a zombie.

The subject lines soon varied from the original storm-related ones, with headlines such as “White House In Flames” and “Plane Crashes Into Eiffel Tower”. Well, there’s a new crop; I got these today (click to see the full image, and note the first and third items):

Thursday, October 23, 2008

Spammers in court again

More catching up; today, week-old news that a U.S. district court has frozen the assets of a group of spammers and ordered them to stop their operations:

The Federal Trade Commission won a preliminary legal victory against what it called one of the largest spam gangs on the Internet, persuading a federal court in Chicago on Tuesday to freeze the group’s assets and order the spam network to shut down.
The group, which used several names but was known among spam-fighting organizations as HerbalKing, sent billions of unsolicited messages to Internet users over the last 20 months, promoting replica watches and a variety of pharmaceuticals, including weight-loss drugs and herbal pills that supposedly enhanced the male anatomy, according to the commission.
“This is pretty major. At one point these guys delivered up to one-third of all spam,” said Richard Cox, chief information officer at SpamHaus, a nonprofit antispam research group.

(Here’s the FTC’s press release on the case.)

This gang is at the forefront of spam technology, with a large zombie network, or “botnet”, and worldwide operations — the investigators cite the group’s connections to Australia, New Zealand, India, China, Russia, and Canada, in addition to the United States. The FTC worked with the FBI and with their counterparts in Australia and New Zealand (one of the principals, Lance Atkinson, is from there).

The group sells “medicine”, both real and fake, including “male enhancement” pills containing sildenafil (Viagra), hoodia remedies, and prescription drugs shipped from India. According to the Times, the FTC says that the spam operation “cleared $400,000 in Visa charges in one month alone.” Think about that: how many people out there are responding to this spam by actually buying the products? Do you wonder why there’s so much spam? Do you shake your head and say that no one pays attention to this stuff? Think again.

If things work the way the FTC would like, we’ll be seeing less spam about these things now. That seems unlikely, though, except for the briefest transition period as others take over. I agree with Trend Micro’s Paul Ferguson, quoted in USA Today: “Someone else will fill the void. While it’s great they caught these guys, the last time a major spam king was busted, the spam increased.” I don’t know that I specifically expect it to increase because of this, but there’s just too much money in spamming for one prosecution and one injunction to stop much.

Also, this isn’t a conviction, but only a temporary injunction — the order is for them to stop their operations while the court case proceeds. There’s no guarantee that it’ll end in a conviction.

Nevertheless, I will stress that it’s great they caught these guys, and it’s another case that shows that the CAN-SPAM Act of 2003 is effective, even with its flaws.

Thursday, September 04, 2008

Conference on Email and Anti-Spam (CEAS 2008)

We recently had the fifth annual Conference on Email and Anti-Spam, and I’ve been meaning to get around to writing up a “highlights” post. I finally have, here.

We were particularly pleased to have a keynote talk by Lois Greisman, head of the Division of Marketing Practices in the U.S. Federal Trade Commission’s Bureau of Consumer Protection. Ms Greisman told us about what the FTC is doing in the anti-spam (and related abuse) war, and talked about what we, the anti-spam community can do to help. She stayed for the day and talked with a number of the attendees, several of whom are interested in sharing their work with the FTC.

Rather than trying to go through all the presentations and give summaries, I’ve picked three papers to highlight here because I found them particularly timely, insightful, or interesting. That’s not to say that these were the only interesting ones; I just had to select a few. In order of their presentation at the conference (links to the papers are PDFs)....

“Exploiting Transport-Level Characteristics of Spam”

[authors: Robert Beverly, Karen Sollins]

We’re always looking for alternatives to content analysis in the fight against spam — not to replace content analysis completely, but to reduce our dependency on it and to find other mechanisms to work alongside it. The authors of this paper have analyzed network traffic at the transport layer and looked for characteristics that differentiate spam from non-spam.

Their results are preliminary, but interesting, and the idea merits more study. It’s not clear to what extent the results reflect their specific environment, whether using them as they stand would punish poorly connected legitimate mail senders, or whether the spammers could easily adapt to any filtering built on them.

Those concerns mean that lots more careful work is needed before any such analysis could really be used to combat spam. But our need to have mechanisms that do not rely on the content of the messages is such that lots more careful work on this is warranted.

“Social Honeypots: Making Friends With A Spammer Near You”

[authors: Steve Webb, James Caverlee, Calton Pu]

The conference is looking outside of email, addressing spam in other contexts. One that crops up here and there is social networking — in particular, spam identities and spam “friend” requests. The authors created 51 “honeypot” MySpace identities, one for each U.S. state and one for the District of Columbia. They created bots to keep them all logged in all the time, to make them more appealing (currently logged-in identities show up higher in lists). They waited to see who befriended them, and then the bots automatically rejected the requests.

They found that friend requests did not come in a geographically proximate way, but that their identities in the midwest were called on to be friends more often than ones in other U.S. regions. Most of the originators of the requests claimed to be in California.

They got almost 1600 requests in all, mostly over a two-month period, and when they eliminated duplicates and compared the profiles, they found only 226 profiles that were sufficiently distinct to not be considered effective duplicates. When they boiled these down to target URLs and removed duplicates and redirection there, it all came down to 6 profile URLs and 5 redirection URLs — 2355 URLs eventually reduced down to 11.

“Breaking out of the Browser to Defend Against Phishing Attacks”

[authors: Diana Smetters, Paul Stewart]

“Phishing”, the practice of trying to fool people into giving away personal information, usually involving access to financial accounts, makes up a significant share of the spam that’s now sent. It can also be a particularly tricky kind of spam to separate from the real mail, and failure to filter it exposes users and financial institutions to real losses.

Recognizing that phishing is mostly a social engineering problem, and that there are real limits to what anti-spam technology can do for it, the authors designed and tested some user-interface changes to address the issue. They created a set of secure bookmarks for protected sites — sites that would include banks and credit card providers, for example. Those bookmarks reside in a special, secure container, and they launch a specially locked-down browser that would refuse to visit other sites and would disallow such things as cross-site scripting. The bookmark container will only hold authorized secure bookmarks, and it and the bookmarks in it are protected from tampering.

The idea is that if users are taught to use only the secure bookmarks to access high-value sites, they could not be fooled into giving their login information for their bank account to a fake web site run by a fraudster.
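The core rule of such a locked-down browser, that a session started from a secure bookmark refuses to go anywhere else, can be sketched in Python as follows. This is only an illustration of the idea as summarized above, not the authors’ implementation; the class name, the HTTPS-only requirement, and the example hosts are assumptions.

```python
from urllib.parse import urlsplit


class LockedDownSession:
    """A browsing session that may visit only the host of its secure bookmark."""

    def __init__(self, bookmark_url):
        parts = urlsplit(bookmark_url)
        # Assumption for this sketch: secure bookmarks are always HTTPS.
        if parts.scheme != "https" or not parts.hostname:
            raise ValueError("secure bookmarks must be https URLs with a host")
        self.allowed_host = parts.hostname

    def may_visit(self, url):
        """Return True only for HTTPS URLs on exactly the bookmarked host."""
        parts = urlsplit(url)
        return parts.scheme == "https" and parts.hostname == self.allowed_host


# Hypothetical bank site and look-alike phishing site.
session = LockedDownSession("https://bank.example/login")
print(session.may_visit("https://bank.example/accounts"))
print(session.may_visit("https://bank.example.evil.test/login"))
```

Comparing the parsed host exactly, rather than searching the URL for the bank’s name, is what keeps a look-alike address such as the second one out.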

Sunday, June 15, 2008

Square whole can be voguish and have indulge you through

I’d generally find it tedious to post my spam here. And, certainly, bad grammar in spam is nothing new. But... well, this just has some of the most amazingly bad grammar and eccentric capitalization that I’ve ever seen, and I just couldn’t resist:

Dear Winner
We Apologies, for the delay of your payment and all the Inconveniences And Inflict that we might have indulge you through.
However, we are Having some minor problems with our payment system, this is Inexplicable, And have held us stranded and Indolent, not having the Aspiration to devote our 100% Assiduity in accrediting foreign payments.We Apologies once again from the Records of outstanding winners due for payment With {ABV CYBER PROMOTION} your name and Particular was discovered as next on the list of the outstanding winners who are Yet to received their payments.
Emails were selected anonymously through a Computer ballot system from over 35,000 companies and 70,000 individual E-mail addresses all over the world and your e-mail address emerged as the winner of the 11 selected emailaddress. This program is promoted and sponsored by Orient software corporation (Orient Networks) in collaboration with The Abv Cyber International.
I wish to inform you now that the square peg is now in Square whole and can be voguish for your payment is being processed and will be released to you as soon as you respond to this letter. Also note that from our record in our File, your outstanding winning payment is $950.215.00 (NINE HUNDRED AND FIFTY THOUSAND, TWO HUNDRED AND FIFTEEN DOLLARS).Payment will be made to you in a certified bank draft or wire transfer into a nominated bank Account of your choice, as soon as you get in touched with.

There follows the name, phone number (in England), and email address of the person with whom I’m meant to get in touched (there’s no postal address, of course). So I guess I’d better be voguish right away, lest the square peg fall out of Square whole before I do.