Staring At Empty Pages: Writing


Wednesday, March 10, 2010

Plagiarize! Let no one else’s work evade your eyes.

In case you haven’t been following the latest New York Times plagiarism scandal, you can get a good summary from ombudsman Clark Hoyt’s March 6th Public Editor column:

ZACHERY KOUWE, a Times business reporter for a little over a year, resigned last month after he was accused of plagiarizing from The Wall Street Journal. An internal review of his work turned up more articles — he said he was shown four — containing copy clearly lifted from other news sources.

Mr Hoyt calls for a full accounting by the Times, listing all the instances they turned up where plagiarism was clear, and telling readers what’s being done to address the situation in general, beyond the dismissal of Mr Kouwe.

For Mr Kouwe’s part, according to Mr Hoyt he expressed his own surprise at being shown what he’d done. It’s an honest mistake, he says, editing copied material in without remembering that it had been copied, thinking that it was his own writing.

I find this completely puzzling.

I’ve never worked at a news desk, and have never had the pressure, stress, competitiveness, and tight deadlines for my writing that Mr Kouwe faced, and that his colleagues still do. Perhaps it’s the pressure and deadlines that explain it. Perhaps when one is under that kind of stress, one does forget. And yet....

When I get source material, I keep it separate. And I never include it without attribution. Look around these pages: there’s nothing that shows up here written by someone else, unless it’s within quotation marks or in a <blockquote>. I can’t understand how a professional writer can carelessly mix up his own writing with copied material.
I know my own writing. Perhaps more to the point, I know what’s not my own writing. Once in a while, there’ll probably be something that could go either way, but in general I can just look at something and say, “That’s not mine; I didn’t write that.”

I want to believe Mr Kouwe when he says that it was an accident. I just find it very hard to. And, anyway, I doubt he’ll be working for any reputable news organization again. But what am I to think when the next journalist makes a similar claim?

In any case, dear readers, be assured that every sentence, clause, or phrase in these pages is my own, unless it’s clearly identified otherwise.

[Thanks to Tom Lehrer for this post’s title.]

Plagiarize!
Let no one else’s work evade your eyes.
Remember why the good Lord made your eyes.
So don’t shade your eyes,
But plagiarize! Plagiarize! Plagiarize!
(Only be sure always to call it, please, “research”.)
— Tom Lehrer, “Lobachevsky”

Tuesday, August 25, 2009

Blogs as journalism: what standards?

Some friends and I were having a conversation recently that seems reasonable to report on here.

A friend sent to some others of us a link to a technology column, and I, unimpressed with the column’s author, responded with some strong criticism:

Given that he’s being paid to write, it’s a pity he doesn’t write better: he misspells, he doesn’t use commas correctly, he gets subject/verb agreement and number agreement wrong, and he has awful, run-on sentences that are so convoluted even the writer can’t get the ending right. And that’s just in one article.

The sender’s response to that was that hey, it’s a blog, not the New York Times, implying that “blogs” shouldn’t be held to standards as high as those we’d hold the Times to. To which another correspondent said, “That’s one of my bigger complaints about blogs. A lot of bloggers are in dire need of an editor, not merely an author.”

The conversation finished with the sender’s noting that some blogs are “rougher hewn,” and that that’s OK, “as long as reader expectations are set and met consistently.” But there’s the thing: there are all different kinds of blogs, all different kinds of readers, and all different kinds of expectations.

There are individual blogs like this one. No pay, no pretense to journalism. Widely varying quality of writing, and the people who read them know what to expect from the ones they read. I try to maintain good writing standards, and I think I usually succeed. But it’s not something one expects when one stumbles onto a blog like this.

There are group blogs that work pretty much as individual blogs, except that there are multiple contributors. They usually vary by contributor. There are also group blogs that are more formal, and some where contributors do get paid.

And then there are “blogs” like the Huffington Post, like the “technology blogs” (one of which started this discussion), and like the blogs that are actually part of the New York Times. These are labelled as “blogs”, but they certainly aspire to “journalism”. Some are simply less-formal, less-edited columns written by actual journalists, who otherwise write formal, edited pieces for the same outlets. David Pogue, for instance, has technology columns in the Times, as well as a blog there.

Should we be applying different standards to Mr Pogue, say, depending upon whether we read his comments on www.nytimes.com or on blogs.nytimes.com?

And back to the author in question, who is associated with a major techno-journalism outlet: is it OK for him to write badly because he has an established readership, and his readers accept it?

Ultimately, everyone’s job is to make one’s boss happy. If the people who are paying the guy are pleased, then who am I to say? And, yet, it bothers me. It bothers me that people are being paid to write, and they write badly. It bothers me to know that there are good writers out there who can’t get work, and, yet, bad writers are... making their bosses happy. It bothers me that standards of writing and of journalism are deteriorating.

It bothers me that standards seem now to be driven by what readers will tolerate, rather than by what they deserve from paid professionals.

Monday, June 22, 2009

Truth and rumour and journalism

The New York Times recently ran an item about bloggers and other online writers “competing” with mainstream journalism. A main point of the article was that news-bloggers often take more risks, do less fact-checking, worry less about reliability of sources, and so on. And the idea, it seems, is that when they miss — a “fact” isn’t, or a source turns out to have been wrong — it doesn’t matter much, because they’re “only bloggers”, but when they hit, they have a real scoop.

But seeking credibility may be a less-important strategy for the blogs at this stage. Mr. Arrington, a lawyer, is quick to point out that he has no journalism training. He is at ease, even high-minded, in explaining the decisions to print unverified rumors.
Mr. Arrington and the other bloggers see this not as rumor-mongering, but as involving the readers in the reporting process. One mission of his site, he said, is to write about the things a few people are talking about, “the scuttlebutt around Silicon Valley.” His blog will often make clear that he’s passing along a thinly sourced story.

The point is that when you consider the resources needed to do all the real, you know, journalism work, you see that the little guy can’t compete with the big media outlets... but there is, they say, a place for that little guy and his tossing out of questionable material, hoping that enough of it is right — or right enough — to have value.

I’m uncertain. It seems to me that we used to call such people “gossip columnists”, and we used that as a pejorative term. When we wanted to know what was going on in the real world, we turned to the real news and we expected reliable facts and reliable sources, news items written by reporters who took the time to investigate what they were reporting on. Breaking stories demanding urgent reporting were always different, of course, but even then we expected something with real facts.

When we wanted to know who was dating whom in Hollywood, who was on the outs and who was having whose baby, well, we were happy to turn to whispered, unsourced, unchecked innuendo, often put in the form of rhetorical questions. “And who’s that sexy blonde who was seen with Herkermer Biffelwogg in Cannes last month?”

And now, it seems, the latter is encroaching on the former. Now that one no longer needs a publisher to be published, now that one can be a soi-disant journalist on a whim, trained by no one and hired by no one, now that any 10-year-old with a broadband connection[1] can publish what he has to say to the world, readers, not writers, are often the ones expected to check the facts.

The Times article describes a situation where a blogger ran with a rumour (about the health of Apple’s Steve Jobs) that turned out to be right:

Mr. Lam says it taught him a lesson. “If we don’t have rumors, what do we have as journalists?” he asks. “You have press releases. So maybe there is some honor in printing rumors.”

Is that really the dichotomy: what isn’t “rumour” is just canned material released for gullible journalists to reproduce with little editing and less thought? I don’t think so. That’s clearly not what Woodward and Bernstein did with the Watergate break-in. It’s not even what Damon Darlin did for this very New York Times article. Mr Darlin neither printed someone’s press release nor pasted random rumours into his computer. He talked to people. He made some phone calls, he probably followed chains of references, he sorted and culled what he got, and he wrote an article with some thought behind it.

There’s still lots of that sort of reporting out there, and it’s what I prefer to read.

I’m not sure to what extent I have any interest in rumours, but I know this: I expect to see them labeled as such. Facts, opinions, analysis, and rumours are all different things, and it needs to be clear which is which.[2]

[1] I used to say, “any 10-year-old with a modem,” but, well, times have changed.

[2] In case there’s any question: I cite my sources, and everything else here is opinion, always. But, then, I also don’t claim to be a journalist.

Monday, June 15, 2009

Scientific garbage: enablers

In Cross-checks on ethics, I wrote about how well-meaning journals and conferences can miss ethics violations that include rigging experiments, making up data, padding the co-author list, and other cheats of that sort. Legitimate publishers of scientific studies and data can only go so far in validating what they’re asked to publish, and sometimes bogus papers get through.

There’s another side to the bogus-publication coin, though: the enablers. These are the journals and conferences that specialize in providing a forum for questionable — or downright garbage — studies and research reports.

At the questionable end are studies whose funding causes conflict-of-interest concerns, ones lacking rigorous methodology, and ones with insufficient data to deduce anything from the results. These problems usually are caught during peer review, if the papers are otherwise honestly written, and the top journals and conferences reject them. It’s easy to see how students, research faculty, and professors, living in a publish-or-perish environment, look to less reputable outlets for their work.

We’ve recently heard that Elsevier colluded with Merck — the pharmaceuticals company that made Vioxx, and that makes Fosamax, Vytorin, and Zocor — to produce a fake journal, one that looks like a peer-reviewed publication, but isn’t:

An “average reader” (presumably a doctor) could easily mistake the publication for a “genuine” peer reviewed medical journal, he said in his testimony. “Only close inspection of the journals, along with knowledge of medical journals and publishing conventions, enabled me to determine that the Journal was not, in fact, a peer reviewed medical journal, but instead a marketing publication for [Merck’s Australian subsidiary].”

In fact, soon after that it came out that Elsevier had a whole series of such “journals”:

Scientific publishing giant Elsevier put out a total of six publications between 2000 and 2005 that were sponsored by unnamed pharmaceutical companies and looked like peer reviewed medical journals, but did not disclose sponsorship, the company has admitted.

This was particularly disturbing because of Elsevier’s reputation, and the extent of their publication world. But they aren’t the only outlet for dicey data. For years, now, there have been publications that will accept your work for a fee. That makes these pay-to-publish “journals” places where you can take that paper that’s been rejected everywhere else, and make it count on your résumé.

As I’ve participated in peer reviews, I’ve seen papers with no substance, and papers that are so far off topic as to be ridiculous (a mechanical engineering paper submitted to a computer science conference, for instance). Some people will submit anything anywhere, in the hope of getting something published.

But publications with no standards are... well, check this out:

So [Philip] Davis teamed up with Kent Anderson, a member of the publishing team at The New England Journal of Medicine, to put Bentham’s editorial standards to the test. The pair turned to SCIgen, a program that generates nonsensical computer science papers, and submitted the resulting paper to The Open Information Science Journal, published by Bentham.
The paper, entitled “Deconstructing Access Points” made no sense whatsoever, as this sample reveals:
In this section, we discuss existing research into red-black trees, vacuum tubes, and courseware [10]. On a similar note, recent work by Takahashi suggests a methodology for providing robust modalities, but does not offer an implementation [9].

And, yet, the paper was accepted, and The Open Information Science Journal would publish it for an $800 fee, “to be sent to a PO Box in the United Arab Emirates.” The director of publications claims that they knew it was bogus and were just trying to smoke the author out by pretending to accept the paper. That excuse seems unlikely, though I would believe that they’d have taken his $800, had he sent it, and then thrown the paper out.

The article goes on to mention the infamous World Multi-Conference on Systemics, Cybernetics and Informatics (WMSCI).[1] That conference, devised by Nagib Callaos (who claims to be a retired professor), and now in its 13th successful year, is basically a conference with no focus and no standards — and, hence, no standing — that exists for the purpose of making money by attracting participants. Speakers must pay the meeting fee to attend, which is what feeds the conference.

Most conferences expect speakers to pay the meeting fee, but what’s different here is the number of people they try to suck in, and the fact that they only allow each speaker to present one paper, and charge an extra fee (see here) if a speaker wants to present a second paper (for example, that of a colleague who didn’t have the money to travel to the conference). I know of no reputable conference that does that.

Other clues about WMSCI are these:

The peer review process includes a provision for a “non-blind” review in which the author selects the reviewers (see item number 2 here).
The enormous “program committee” — 284 members (see here). Normally, that’s the list of peer reviewers, but in this case it’s artificially inflated by the inclusion of, essentially, everyone who’s ever agreed to participate in the conference (and probably some who haven’t). It’s nice of Callaos, though, that he claims to have removed “those who manifested no interest.”
The absence from the program committee of institutions that are respected in the field. There are no PC members from Columbia, Cornell, Princeton, MIT, Carnegie-Mellon, Stanford, or Georgia Tech, for example. But there is one from Quinnipiac University, a small school in Connecticut that has no doctoral program in Computer Science.[2]
The unbounded scope (see here) leaves no topic behind. The conference covers everything from user interfaces to information retrieval to object-oriented programming to ethics and computer crime to security and privacy and hacking to artificial intelligence to computer graphics to wireless networks to gaming to....
The lack of sponsorships from reputable companies and organizations.

A few years ago, in an incident similar to the one engineered by Philip Davis, above, the SCIgen folks at MIT got a nonsense paper accepted to WMSCI 2005, and planned to attend and present the paper with a nonsense presentation (see the “Examples” and “Talks” sections on the SCIgen page). They outed themselves, though, and Callaos rescinded the acceptance. And David Mazières, then at NYU, submitted this paper to WMSCI 2005. According to his web page, “We never received official notification of whether the paper was accepted or rejected.” The figures are especially inspired.

Of course, silliness aside, these for-profit-only journals and conferences are a real problem, in that they serve as traps for the unwary. Someone reading a paper and not knowing that the journal isn’t reputable might base a major grant proposal on junk, wasting a lot of time and money and causing much embarrassment. And if you wound up having your paper accepted to a phony conference, would you, once you realized it, be willing to admit that you were hoodwinked? What would your boss, who paid for the trip, think?

Companies and schools involved in research know what the first-tier and second-tier conferences are, their positions decided by other researchers and based on the quality of the work presented and the selectivity of the program committees. In my field, the first tier includes, for example, MobiSys, Ubicomp, and CHI, among a host of others. Second-tier conferences and journals are fine, too — they just don’t draw the best research work as well as the first tier does. The faculty at any research university will know what’s reputable and what’s not.

[1] Please don’t confuse WMSCI with WMCSA, now called HotMobile, the International Workshop on Mobile Computing Systems and Applications, which is a reputable workshop.

[2] I don’t mean to disparage Quinnipiac University, only to say that its inclusion on a program committee for a real cybernetics and informatics conference wouldn’t be appropriate, especially considering the schools that are not there.

Monday, March 02, 2009

Leiba’s Lament

On This-Page-Intentionally-Left-Blank day, Beth mentioned “Leiba’s Lament” in a comment, and I said that I might have to explain that in a post some time soon.

First, a bit of background.

Some 25 years ago, or so, Mike Cowlishaw (and here’s his IBM web page) developed a software distribution system called TOOLS[1], which was widely used internally for many years. Under the TOOLS system, users could place data files on virtual disks, optionally grouping the files into packages. The disks could be public or private, and other users — any users, for public disks, and authorized users, for private ones — would retrieve the files when they needed them. The system kept a catalogue of the packages and files, and the catalogue could be searched.

One feature provided by the TOOLS system was forums, a kind of conferencing system. Forums were special files placed on TOOLS disks, and users could use the system to append entries, much as one comments on a blog or places a response into a Lotus Notes teamroom today. The forums have now been moved to network news servers, but in their heyday on the TOOLS system they provided a wildly popular place to have discussions, ask technical questions (and get quick answers, often), and the like. Some forums saw activity only rarely; some got hundreds of appends a day.

The TOOLS disks were arranged according to broad topics — IBMVM and VMTOOLS for discussions and software related to the VM system; IBMPC and PCTOOLS for the same about PCs, and so on. The IBMTEXT disk was filled with forums about document preparation and publishing software: Document Composition Facility (DCF) and SGML, Book Master and Book Manager, Print Services Facility (PSF), and the like.

We called those discussions of the publishing software “IBMTEXT1”, because there were other forums on the same disk, which we grouped as “IBMTEXT2”, devoted to discussing language and writing. Those forums had names like WORDS, ENGLISH, WHOSAID, and a bunch of others that included the infamous NITPICK, which was dedicated to the merciless trashing of anything written less than perfectly. The discussions in the IBMTEXT2 forums were often of uncertain business value on the surface, but they became easily justified when one realized that they provided a place to practice writing properly, to discuss proper writing, and, especially if one was not a native English speaker, to ask questions about correct English usage.

In the summer of 1992, in the middle of a discussion of something else, an IBMer from Dublin made a comment about how a language tool represented “a dipthong”, which prompted a participant from Hursley to reply thus (the quotes here are mostly verbatim, but I’ve changed the attributions a little to protect people’s anonymity):

- WORDS FORUM appended at 14:26:39 on 92/06/11 GMT (by ZZZZZ at WINVMC) -
Subject: Egad - I just saw a small child here in our suburban office
Ref:     Append at 18:28:17 on 92/06/10 GMT (by XXXXXX at DUBVM1)

BTW, what exactly is a 'dipthong'? I suspect it isn't a scanty
bathing costume. Have you spelled it correctly?

Bosco

That Bosco was being sarcastic was missed by at least three participants, who each quickly (within an hour and a half) and “helpfully” followed up with definitions of “diphthong” (note correct spelling). To those, I responded thus:

- WORDS FORUM appended at 18:19:41 on 92/06/11 GMT (by LEIBA at WATSON) -
Subject: Dipthong (was: Egad - I just saw a small child...)
Ref:     Append at 14:55:12 on 92/06/11 GMT (by XXXXXX at AUSVM1)
         Append at 15:18:55 on 92/06/11 GMT (by XXXXXX at DUBVM1)
         Append at 15:46:38 on 92/06/11 GMT (by XXXXXX at WINVMC)

Oh, sigh.  I suspect that Bosco knows (and knew) very well what a
"diphthong" is; you three have simply missed his sarcasm.  Whither goes
the world, when wit and rhetoric are lost unless accompanied by some
accursed icon or other?

From:  Barry Leiba, Watson VM Systems                 LEIBA    at WATSON

Bosco replied the next day, with this append:

- WORDS FORUM appended at 12:30:51 on 92/06/12 GMT (by ZZZZZ at WINVMC) -
Subject: Dipthong (was: Egad - I just saw a small child...)
Ref:     Append at 18:19:41 on 92/06/11 GMT (by LEIBA at WATSON)

>                                                           Whither goes
>the world, when wit and rhetoric are lost unless accompanied by some
>accursed icon or other?

I echo Leiba's Lament. Well said, Barry.

Bosco

At that point, “Leiba’s Lament” entered the IBMTEXT lexicon, defined by the excerpt that Bosco quoted. Real IBMTEXTers almost always eschewed smiley-faces, frowny-faces, and other sometimes bizarre icons (and in more recent times you would never see any of us use abominations such as “LOL” and “ROFL”). As Beth said, WDNNS icon.[2]

[1] TOOLS was developed and ran on the VM operating system, and eventually became the VM/DSNX product. The system is still around today, but for most of its uses it’s been replaced with web servers and browsers, wikis, blogs, RSS/Atom feeds, news readers, and the other modern conveniences we’ve all come to rely on.

[2] WDNNS, another item in the lexicon, stands for “We don’t need no stinkin’ ...”. It’s a reference to the movie Blazing Saddles, wherein a Mexican bandit, claiming to be a lawman and challenged for his badge, says, “Badges? We don’t need no stinking badges.”[3]

[3] The line in Blazing Saddles is actually itself a reference to the Humphrey Bogart movie The Treasure of the Sierra Madre. The original quote in that movie is, “Badges? We ain’t got no badges. We don’t need no badges. I don’t have to show you any stinking badges.”

Wednesday, January 07, 2009

Help for the politically correct

Philip Corbett has a blog in the New York Times in which he talks about linguistic issues that come up in reference to Times articles. In his latest column he talks about “Discussing Disabilities”, citing an admonition in the paper’s stylebook to avoid using phrases such as “the disabled” and “the blind”, preferring those words’ use as adjectives (“blind people”) rather than as nouns.

Mr Corbett goes on to say this:

The difference between “the disabled” and “disabled people” (or “people with disabilities”) is subtle but significant. The shorthand might occasionally be unavoidable — in tight headlines, for example. But it’s better to refer to people who, among other characteristics, have some disability, rather than to use the disability as the sole label.
Some advocates, in fact, object to any phrase that refers to the disability before the person. They would uniformly use “people who are blind” rather than “blind people,” or “a person with a disability” rather than “a disabled person.”

To me, all of this, and that second paragraph especially, represents the pinnacle of hyper-PC silliness. All of the examples in the “Discussing Disabilities” section seem just fine as written. And that makes sense, since none of them are talking about a person — they are all talking about the disabilities, so it makes sense that the emphases lie there.

Note the difference between the article that’s talking about software that reads food labels to people who can’t (where I think “the visually impaired” works fine), and an article that might be talking about how social workers need to work differently with blind people than with sighted ones. In the latter, I do agree that the emphasis is on working with people, and the phrasing should support that.

But apart from that, English uses word order for understanding, not for decoration. Indeed, it certainly can sometimes be used for emphasis... but more often, we say things as we do because that’s the way it’s said in English. Adjectives (and other modifiers) come before what they modify; when I say that I have a bottle of “red wine”, I’m not emphasizing the redness at the expense of the wine-hood. And a phrase like “wine that is red”, unless it’s taking poetic or humorous license, looks foolish.

Take this New York Times article, for instance, with the headline, “Helping a Blind Woman Build a Future”: does it in any way undervalue the woman because it puts the word “blind” first? Of course not. It would sound contrived to say it as, “Helping a Woman Who Is Blind Build a Future”; that’s just not the word order we’d normally use in English.

Insistence, in general, on convolutions such as “person who is blind” because of some illusion that “blind person” dehumanizes the subject is misguided. We should write in a way that works well for what we’re writing. Respect (or dis-) comes from the whole, not from one or two disembodied phrases.

Give me your people who have been deprived of rest, your people who are economically disadvantaged,
Your large groups of people forced to spend time in close quarters who are yearning to breathe free,
Those in unfortunate circumstances in your overcrowded communities.
Send these, people at high risk of being without housing and having to spend nights exposed to the elements, to me,
I lift my politically correct thesaurus beside the golden door!
— freely adapted, with apologies, from Emma Lazarus

Sunday, September 14, 2008

Parallel lists

A kosher meatpacking plant in Iowa, the largest such plant in the country, has been in the news lately for non-kosher labor practices. In the latest round, they’re accused of violating child-labour laws. In the version of the report on National Public Radio’s news, the reporter said that their employment of under-age workers is “illegal because the children are exposed to dangerous chemicals, power machinery, and often work longer hours than permitted.”

Lists have to have parallel structure to make them easier to understand, but that’s a rule that’s violated more often than labour laws are. One can almost think of it mathematically: you ought to be able to “factor out” the common bits, or, alternatively, to factor them back in. Each item in the list should make sense if it’s attached to the “factored out” part by itself.

Let’s put NPR’s list in bullet form, to highlight the point:

It is illegal because the children are exposed to
dangerous chemicals.
power machinery.
often work longer hours than permitted.

Item three is the odd one out: it doesn’t make sense when you stick it onto the introductory phrase.

So let’s back up:

It is illegal because
the children are exposed to dangerous chemicals.
the children are exposed to power machinery.
the children often work longer hours than permitted.

There, that makes sense. We can still factor “the children” out, of course. And pulling it back to this point also shows us that “exposed to power machinery” is a somewhat awkward or unclear way to say it. Is it that they’re actually operating the power machinery? Or are they working in the area while someone else is operating it? The Times article tells us this:

The complaint charges that the plant employed workers under the legal age of 18, including seven who were under 16, from Sept. 9, 2007, to May 12. Some workers, including some younger than 16, worked on machinery prohibited for employees under 18, including “conveyor belts, meat grinders, circular saws, power washers and power shears,” said an affidavit filed with the complaint.

Then here’s another try:

It is illegal because the children
are exposed to dangerous chemicals.
operate power machinery.
often work longer hours than permitted.

And putting it back into narrative form, their employment of under-age workers is “illegal because the children are exposed to dangerous chemicals, operate power machinery, and often work longer hours than permitted.”
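For those who like the mathematical view, the “factoring back in” test can be sketched in a few lines of Python. This is only an illustration: the function name factor_in and the variable names are my own inventions, and the code checks nothing by itself. It simply attaches the shared stem to each item so that you can read every resulting sentence and see whether it stands on its own.

```python
def factor_in(stem, items):
    """Attach the shared stem to each list item (hypothetical helper).

    The code does not judge grammar; it only produces the full
    sentences so that a human reader can check each one.
    """
    return [f"{stem} {item}." for item in items]


# The stem and items from the corrected version of the NPR sentence.
stem = "It is illegal because the children"
items = [
    "are exposed to dangerous chemicals",
    "operate power machinery",
    "often work longer hours than permitted",
]

for sentence in factor_in(stem, items):
    print(sentence)
```

Run the same thing with the original stem, “It is illegal because the children are exposed to”, and the original third item, and the broken sentence shows itself immediately.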

We need to think about this when we’re writing presentations, where we often use bullet-lists and they’re often not written to be parallel. For example, this bullet list is from a recent presentation I attended:

Methodology:
Identify business problems
Discover and leverage existing activities
Ongoing communication
New technology evaluation

Please excuse “methodology” and “leverage”; I didn’t write the slide. The point is that the first two bullets are active — they have verbs and they tell us what the speaker is proposing to do — and the last two are passive — they don’t have verbs; they’re just noun phrases.

I’d rewrite the slide this way (leaving the business-speak alone, much as I’d like to change it):

Methodology:
Identify business problems
Discover and leverage existing activities
Communicate regularly [with whom?]
Evaluate new technology [for inclusion?]

That reads better, and it also points out that the last two bullets are missing something: without some specified target, they’re just empty phrases that mean nothing. Of course, that’s often why we write them the original way. “Ongoing communication” is clearly a good thing, and we want to avoid having to be too specific about it. But it makes for a weak presentation, and anyone who’s paying attention will see that.

It’s easy to say, when someone goes on about something like parallelism in lists, that it’s just unnecessary pickiness and “You understand what I’m saying.” But this parallelism thing isn’t just a random rule; thinking about this — and stuff like it — leads to clearer, more effective communication. Isn’t a better presentation worth it?

Saturday, August 16, 2008

A dead zone for headlines

Hm. Which of these headlines gives a significantly different impression from the others?:

National Public Radio: “Dead Zones” Multiplying In World’s Oceans
Science News: Coastal dead zones expanding
or the New York Times: Rapid Growth Found in Oxygen-Starved Ocean “Dead Zones”

Maybe it’s just me, but I interpreted the Times version as meaning that things were found growing — rapidly — in the dead zones. And if I hadn’t heard the NPR item on the radio yesterday morning, I’d have held that impression and thought it was a good thing (until I read the article, of course).

There’s a simple fix: Rapid Growth Found in Number of Oxygen-Starved Ocean “Dead Zones”

Unfortunately, the headline is already too long to add those two words. I think the headline writer for the Times needs some remedial work.

[The Science News headline’s not quite there either. It implies that the existing dead zones are getting bigger. While that’s also somewhat true, the real point, which they do get in their lede, is that there are more of them. NPR’s the only one that really gets the headline right.]

Monday, March 03, 2008

I never metaphor I didn't like

The most recent Carnival of Feminists highlights this item by Jess McCabe, which criticizes this jerk, who rants on about how attempts to make English more gender-neutral are “raping” the language. The rant is also available at The Weekly Standard, here.

Jess attacks his column well enough that I have little to add on that front. I’ll only say that his complaint about “firefighter” is probably the silliest. I know several firefighters, all men, and they have all preferred that term for years. Note, too, that “fireman” has always also referred to the guy on the train whose job was stoking the fire. “Firefighter” is not only gender-neutral, but also clearer and unambiguous.

But my point in writing this is to note that I really found David Gelernter’s column to be a hard-to-read slog. I thought about why:

It’s not because I disagree with his premise. I do, but I read a lot of things I disagree with that aren’t tedious.

It’s not that nearly everything he says to support that premise is crap. It is, as Ms McCabe says in more detail, but that’s not what makes it a slog.

It’s only partly because he’s ranting and it’s overlong. Again, he is, and it is — I wondered when it would end, or whether someone had invented the bottomless web page — but that’s not the biggest problem.

The biggest problems are that he’s pompous and has a horrid writing style. He’s packed the thing from beginning to end with ridiculous similes and with mixed and strained metaphors, and the result is laughable, a caricature of what it’s meant to be. He quotes Shakespeare, as though quoting Shakespeare gives anything credibility and erudition, and then he tosses out exactly the conciseness he praises Shakespeare for:

The prime rule of writing is to keep it simple, concrete, concise. Shakespeare’s most perfect phrases are miraculously simple and terse. ("Thou art the thing itself." "A plague o’ both your houses." "Can one desire too much of a good thing?") The young Jane Austen is praised by her descendants for having written "pure simple English." Meanwhile, in everyday prose, a word with useless syllables or a sentence with useless words is a house fancied-up with fake dormers and chimneys. It is ugly and boring and cheap, and impossible to take seriously.

Impossible to take seriously, indeed. Have a look at these gems, right up there with the “dormers and chimneys” line:

arrogant ideologues began recasting English into heavy artillery to defend the borders of the New Feminist state.
where he-or-she’s keep bashing into surrounding phrases like bumper cars and related deformities blossom like blisters
The well-aimed torpedo of Feminist English has sunk the whole process of teaching students to write.
straight from a magic spring that bubbled for him alone.
the he-or-she epidemic that was sweeping the country like a bad flu (or a bad joke).
feminism had already got America in a chokehold.
Unsatisfied with having rammed their 80-ton 16-wheeler into the nimble sports-car of English style, they proceeded to shoot the legs out from under grammar
The she-sentences that result tend to slam on a reader’s brakes and send him smash-and-spinning into the roadside underbrush
Who can afford to allow a virtual feminist to elbow her way like a noisy drunk into that inner mental circle where all your faculties (such as they are) are laboring
tendency to simplify and compress its existing structure (like a settling sea-bed)

I’ve left out the characterizations like “style-smashers”, “language rapists”, and “feminist warriors”; “the Academic-Industrial Complex”, “commissar-intellectuals”, and “the running dogs of the Establishment”.

One or two of these, even a few, would brighten up the piece. Overused, as they are, they just make it silly. Yes, David Gelernter knows how to turn a phrase. Into garbage.

And he finishes off his pompous diatribe with a question: “Do we have the courage to rebuild [the English language]?” Excuse me: courage? We might say that it requires courage to rebuild Iraq. Or New Orleans.

But dealing with a shift in language usage doesn’t take courage. You only have to get off your high horse.

Tuesday, November 06, 2007

The media: dancing around a point

Paul Krugman wrote an excellent op-ed piece last week about Rudolph Giuliani’s false claims about US vs UK health care and about what the Democratic candidates’ plans for health care reform are. In it, he not only takes Mr Giuliani to task for saying things he knows, or should know, to be false; he also takes the press to task for parroting the sort of misinformation that’s being thrown around, instead of calling it out for what it is:

It would be a stunning comparison if it were true. But it isn’t. And thereby hangs a tale — one of scare tactics, of the character of a man who would be president and, I’m sorry to say, about what’s wrong with political news coverage.
Let’s start with the facts: Mr. Giuliani’s claim is wrong on multiple levels — bogus numbers wrapped in an invalid comparison embedded in a smear.
[...]
And much of the coverage seems weirdly diffident. Memo to editors: If a candidate says something completely false, it’s not “in dispute.” It’s not the case that “Democrats say” they’re not advocating British-style socialized medicine; they aren’t.
The fact is that the prostate affair is part of a pattern: Mr. Giuliani has a habit of saying things, on issues that range from health care to national security, that are demonstrably untrue. And the American people have a right to know that.

This point is related to one made by the New York Times’ Public Editor back in September, when he opined that the news media have to use even language, avoiding politically charged terms like “liar”.

Mr Krugman’s point is somewhat different, though. If we create a charged atmosphere by describing someone with the word “liar”, we also downplay a situation by saying things like, “critics say” (or, worse, “critics claim”). It’s a matter of how we soak things in as we read them.

Suppose I lead a story with this: “In his commencement speech yesterday, the president of the University of Southern North Dakota at Hoople lambasted climate-change activists for ‘trying to shut down the Internet with their scare tactics’, and told the new graduates that they should fight against attempts by liberals to ‘throw us back into an age of low-technology communication.’ ”

And then suppose that I go on in that vein, talking about the USND-at-H president’s speech, and in paragraph four or five I write this: “Climate-change activists deny that they are trying to shut down the Internet.”

What message is the average reader left with? Whether I meant it or not, my readers hear most strongly the university president’s bogus statement. In the interest of “even language” and an attempt to report “facts” only, I’ve given my readers the impression that there is, in fact, a push to shut down the Internet, and that it’s now been exposed.

That impression is made for a few reasons:

We tend to attribute more authority to what we read earlier in the news article.
We tend to pay less attention, in general, as the article goes on, so that we might not really notice the “denial”.
Many readers won’t even get to the denial, having stopped reading altogether before that.
Denials are often looked at as being sort of weasely, as attempts to hide the truth. Those making wild accusations know that, and count on it.

Sure, alert readers, savvy readers, skeptical readers will see the whole picture. But I’ve made them work too hard, and I’ve completely fooled the less savvy.

As Mr Krugman implies, the media need to put the truth up front. “Though there is no evidence to support his claim, in his commencement speech....” “Mr Giuliani, contrary to all fact, claimed that the Democrats....”

This connects to what I said a year or so ago about how the media have adopted the Bush administration’s terms for things like “detainees”, which we would call “political prisoners” if they were being held by, say, the Chinese. Another example from the news of today is “waterboarding, an interrogation technique that opponents say is torture.” The media are allowing the government to whitewash its actions by buying into these euphemisms, and they should not be sucked into that. They should instead be saying, “waterboarding, a torture technique that the administration calls ‘interrogation’.” That’s not politically charged language that unbiased news should avoid; it’s the truth.

A result of the failure to do this is that public figures can say anything they want to, filled with all manner of lies, and those who are slammed by it will have to spend time and resources denying it... and a good portion of the public will believe the lies anyway, because they “read it in the New York Times.”

Update, 7 Nov: On this morning’s NPR news, Nina Totenberg reported on the Mukasey vote, and the controversy about his refusal to make a definitive statement about waterboarding. In it, she said that waterboarding is “a form of controlled drowning” (rather than the usual and false “simulated”), and said that it’s been in use since the Spanish Inquisition and has been considered a war crime for the last century. OK, now that is what I’m talking about. Thank you, Ms Totenberg!

Saturday, October 06, 2007

Peer reviews: Wikipedia references

I like Wikipedia, despite the flak it’s taken from those who say the information there is unreliable. It’s a form of controlled chaos that works surprisingly well most of the time. There’s vandalism, bias, unresearched and incorrect information, but there is a lot of oversight, both “official” and ad hoc, and Wikipedia is, in general, a valuable tool. Using Wikipedia to learn about things and as a springboard for further research, while understanding its limitations, is certainly not a problem.

Researchers submitting papers for peer review also like Wikipedia. As a reviewer, I’m seeing more and more papers that cite Wikipedia in their references. And that is a problem, precisely because of the “chaos” aspects.

For one thing, Wikipedia entries are constantly changing. When you cite an entry today, the bit of it that supports what your paper is saying might change tomorrow. The key paragraph might say something very different when I review your paper, compared with what it said when you cited it; indeed, the information you’re citing might not be there at all any more.

For another, even if the information you’re citing is still there and remains intact, as you read it... it might simply be wrong, or insufficiently supported to hold up to peer review.

Wikipedia itself encourages external references within its entries, and discourages unsubstantiated claims. Clearly, though, any extensive text can’t cite references for every fact — the references would outweigh the content, in the end — so a great deal goes into Wikipedia entries as “fact” with no cited support. That information might or might not be correct; it takes further research to determine whether it is.

And that is the job of the researcher: to do the research. As it stands, Wikipedia is so easy to use and so rife with mostly-correct information that citing it is a convenient shortcut. But it’s a too-convenient shortcut that this reviewer, at least, will not accept. You’ll notice that I rarely refer to Wikipedia even here, in these loose, informal, non-refereed pages, and that when I do, it’s for a very limited purpose, to provide a pointer to an informative explanation, and nothing more.

That’s what I expect in research papers, too, and I will set it out here for anyone who thinks that I, or some like-minded reviewer, might one day be asked to review her work:

Do not use Wikipedia as a reference for any substantive or normative information. I will ding you for it in the review, and I will insist that such references be replaced by proper ones before recommending publication. I will accept references to Wikipedia for purely explanatory things that don’t matter to the substance of your paper — but even there, please don’t overdo it.

Here’s a contrived example of an unacceptable Wikipedia reference for a hypothetical paper that says something about inflation:

During periods of moderately high inflation, it can actually be easier to establish a startup business because investors are more willing to spend their money than to save it.[1]
[1] http://en.wikipedia.org/wiki/Inflation

The reference is to a section of the Wikipedia article, at the time that I write this, that says this:

Inflation is also viewed as a hidden risk pressure that provides an incentive for those with savings to invest them, rather than have the purchasing power of those savings erode through inflation. In investing, inflation risks often cause investors to take on more systematic risk, in order to gain returns that will stay ahead of expected inflation.

...but if the statement of that sentence in the paper is important, a stronger reference is needed.

Here’s a version of the example that I would consider acceptable:

During periods of moderately high inflation[1], it can actually be easier to establish a startup business because investors are more willing to spend their money than to save it.[2]
[1] http://en.wikipedia.org/wiki/Inflation
[2] ...a reference to a peer-reviewed paper on venture capital...

In this case, Wikipedia is used to give general information about what inflation is, in case the reader needs some background. I accept that Wikipedia is accurate enough for that. Meanwhile, the substantive reference points to a source that we consider to be more reliable.

Of course, none of this is to say that everything in peer-reviewed papers, news media, published books, and the like is reliable and correct; it clearly isn’t. Even “proper” citations can have bad information, and multiple independent sources are a good thing when that’s possible and the reference is important enough.

Monday, October 01, 2007

He’s checking it twice...

In a post on Positive Liberty, Jason Kuznicki addresses some criticism he gets from Timothy Sandefur, a former Positive Liberty co-blogger. He quotes from Mr Sandefur’s complaint:

There is a sort of person who strangely relishes such accusations to the degree that they will immediately believe them without awaiting indicia of credibility. Such people may not enjoy the substance of the story, but they so enjoy whacking what they think is a mole that they’ll bang away at anything that moves.

Detaching this from the Sandefur-Kuznicki feud, on which I have no opinion to state, I see that it brings up an interesting question: To what extent do (or should) bloggers have a responsibility to “await indicia of credibility”? Are bloggers subject to the expectations of independent verification of information that we have for “real journalists”? Should we be?

Blogs differ, of course, in how they handle this. For my part, I prefer to cite a source when I comment on something, and I try to find the source that I consider to be the most credible. If I see something on another blog or on an obviously biased “specialist media” site (say, a radical liberal news site), I’ll look at the source they cite, if there is one (often there isn’t), or try to chase down a credible source myself. There’ve been times when I’ve found something interesting enough to comment on but of questionable origin, and I’ve said in my comments that I’m uncertain of it. And sometimes the information available is incomplete or sketchy, and I’ve said that too, noting that with more complete information I might have a very different opinion of the situation.

I usually don’t cite multiple sources, unless they each provide information lacking in the other, or give interestingly different perspectives. But I often do look at multiple sources when I’m writing an entry. And I certainly do if my first source isn’t one of my favourites, or if the item seems suspicious.

On the other hand, blogs aren’t newspapers, and commentary isn’t news, so I believe there’s some latitude. At the same time, readers won’t be happy for long with a blog that spews uninformed comments. I certainly wouldn’t be happy writing such comments.

That said, there has been, and will continue to be, value in having a portion of the public that jumps on things right away. Yes, they might be leaping before they look, and they might be carrying on about something that turns out not to be as it seemed on first report. But they draw it out in the open for all to see. They make it harder for things to hide. And that might be worth the occasional embarrassment.

I think it’s valid to criticize bloggers, in general, for being lax in this regard, but I think that in the end, bloggers choose for themselves, and readers will tend to read, trust, and enjoy those that are more rigorous about it.

Thursday, September 20, 2007

Peer reviews: Too much background information

Reviewing some conference papers recently has given me some thoughts that I'd like to put down. Here's one of them.

We all had experience in grade school with “padding” papers: adding unnecessary fluff to make the paper a bit longer. The stuff was easy to write, and we thought the teacher wouldn’t notice, and that we’d get a bit more credit for it than for turning in a shorter paper.

The teachers didn’t always tell us so, but they noticed.

Now that we’re submitting papers to peer-reviewed journals and conferences, many of us still do the same thing, and I think we still think that “they” — in this case, the reviewers — won’t notice.

We do, and I, at least, will tell you so if I’m reviewing your paper.

I can’t tell you how many papers I’ve reviewed that talk about spam solutions, and that spend at least an entire page — as much as 15-20% of the paper — giving the background, telling the reader what spam is, why it’s a problem, how serious a problem it is, how long it’s been a problem, etc. Let’s be clear here: anyone who will read your paper already knows all this. It’s just tedious, useless fluff that we have to get through, and as I read that stuff I’m saying to myself, “Yes, yes, get to the point already!”

Some brief lead-in is fine, and is even necessary for the flow of the paper, but the operative word here is “brief”. As I’ve said before, in other contexts, write to your audience. The readers of a technical paper that’s meant for people who work in the field neither need nor want the well-known detailed background. If you’re explaining your work to the press, you’ll need to add some of it back in. It’s all a question of whom you’re writing for.

Of course, you do have to pay attention to the rules for the conference or journal to which you’re submitting your paper. Usually there’s a maximum size, but not a minimum — full-length papers for the Conference on Email and AntiSpam are limited to eight pages, for example, but the papers can certainly be shorter. I’m happy to review a paper that’s, say, five or six pages, or even four, if that’s what the details fit into. If there’s no minimum size (there is, for some journals), don’t pad.

Even if there is a minimum, don’t pad with fluff. Look for background information or details about related work that will actually be interesting to your readers. If you make us read junk, it will affect how we regard your paper. Wouldn’t you rather your readers find your paper fascinating, not dull and obvious?

This really can make the difference between getting your paper accepted and not... and between having your paper cited frequently as a reference and not. If you’re going to the trouble of writing it, take the time to make the most of it.