User talk:Citation bot/Archive 0

From Wikipedia, the free encyclopedia

Note: This page is deprecated as I try to make bugs simpler to report and resolve. New bugs are now reported at User talk:Citation bot.

Archives were scrambled due to poor configuration of CB3. Everything is back here. Many things were duplicated (perhaps archived, restored, tagged as 'resolved', and then re-archived). The chronological order still needs to be restored.


Perennial problems

Updating year for articles on final publication

One other change I noticed in that same edit. It's quite common to cite medical articles when they have been published online but have not been officially assigned year, volume, and pages. For example, Autism therapies formerly contained this citation:

Shimabukuro TT, Grosse SD, Rice C (2007). "Medical expenditures for children with an autism spectrum disorder in a privately insured population". J Autism Dev Disord. doi:10.1007/s10803-007-0424-y. PMID 17690969.

because the paper was published online in 2007. Eventually the paper was published in the official journal in 2008, and Citation bot updated the citation by adding volume=38 and pages=546, resulting in this partially-improved version:

Shimabukuro TT, Grosse SD, Rice C (2007). "Medical expenditures for children with an autism spectrum disorder in a privately insured population". J Autism Dev Disord. 38: 546. doi:10.1007/s10803-007-0424-y. PMID 17690969.

To finish the improvement, I had to manually change the year=2007 to year=2008, add issue=3, and add the last page number (552), resulting in the following:

Shimabukuro TT, Grosse SD, Rice C (2008). "Medical expenditures for children with an autism spectrum disorder in a privately insured population". J Autism Dev Disord. 38 (3): 546–52. doi:10.1007/s10803-007-0424-y. PMID 17690969.

I understand that Citation bot does not have the issue=3 and the last-page 552 information available, so it cannot fix that part of the citation. However, it does have the date available, so it could update year=2007 to year=2008, thus saving me a bit of work. (I have to clean up after the Citation bot a lot, so every bit would help.) Could you please fix the citation bot to add 1 to the year if necessary, when it adds a volume= and pages= info? Thanks. Eubulides (talk) 16:54, 14 October 2008 (UTC)

I can do this. The downside is that where the data in the central database is incorrect, there is no way for users to stop the bot inputting the incorrect year each time it visits a page. I'll leave it up to you to decide which will cause editors more inconvenience – it's a tricky one to resolve! Martin (Smith609 – Talk) 23:01, 14 October 2008 (UTC)
The common pattern I run into is that I cite a prepublication version of a paper dated 2008, and then the final version comes out in 2009. Typically the prepublication version lacks volume and page number (since it hasn't been decided yet). So how about this heuristic: if the citation lacks volume and page number and its year is lower than the published year and its year does not have a comment, then update the year; otherwise, leave the year alone. This heuristic would handle most of the problems I run into, and should be easy to override (with a comment) in the rare cases that it goes awry. Eubulides (talk) 16:08, 6 July 2009 (UTC)
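The heuristic proposed above can be sketched as follows. This is only an illustration in Python, not the bot's actual code; the function name and the dictionary representation of a citation are assumptions made for the example.

```python
def should_update_year(citation: dict, published_year: int) -> bool:
    """Return True when the proposed heuristic allows the year to be updated.

    `citation` is a hypothetical dict of template parameters (all strings).
    """
    year_raw = citation.get("year", "")
    # A wiki comment in the year field is the editor's override: leave it alone.
    if "<!--" in year_raw:
        return False
    # Only prepublication-style citations (no volume, no pages) qualify.
    if citation.get("volume") or citation.get("pages"):
        return False
    try:
        year = int(year_raw.strip())
    except ValueError:
        # Missing or non-numeric year: nothing sensible to compare against.
        return False
    return year < published_year
```

With this rule, a citation that already carries a volume or page number is never touched, so an editor who has corrected a year by hand is not overridden on the next visit.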

Undesirable location= and publisher= for Cite book

This edit to Autism added "|publisher= AMERICAN PSYCHIATRIC PRESS INC (DC) |location= United States" to two citations of DSM-IV-TR. In both cases, the publisher= and location= information is undesirable: a "location= United States" is useless for an American organization, and a "|publisher= AMERICAN PSYCHIATRIC PRESS INC (DC)" is simply duplicated (and poorly capitalized) information for a citation that already says "|author= American Psychiatric Association". The Citation bot used to not make changes like this; can you please fix it so that it continues to not make these changes, or let me know how to shut it off for these citations? In the meantime I cleaned up by hand. Thanks. Eubulides (talk) 04:54, 23 October 2008 (UTC)

I've been thinking about this; can you propose a solution for how the bot can work out when it's inappropriate to add a publisher and location to a citation? If not, the usual trick of adding a <!-- comment --> into any field you want the bot to ignore will work. And I'll make the capitalisation prettier when I get the chance. Martin (Smith609 – Talk) 15:38, 25 October 2008 (UTC)
Hmm, well, I can't think of a good heuristic in general. But one thing does stick out: how about not inserting "|location=" if the ISBN is present? With modern books, the location information is almost invariably useless and even misleading information. Readers don't need to know that the Oxford University Press is in Oxford, and for a major publisher like McGraw-Hill it's pretty much irrelevant whether the book says that it was published in New York or in Chicago. Eubulides (talk) 16:13, 6 July 2009 (UTC)
Omitting location information from citations generally would need to be discussed more widely, perhaps at "Wikipedia talk:Citing sources" or the village pump. It is still the practice in many contexts (journal articles and library catalogues, for instance) to provide the location of publishers. Also, some publishers produce different editions of the same book in different locations, so the location information helps to differentiate one country edition from another. — Cheers, JackLee talk 18:03, 6 July 2009 (UTC)

Is this a bug?

Is this a bug? I don't know why it added that parameter to the citation template... I'm still semi-new to things here. Killiondude (talk) 07:59, 26 October 2008 (UTC)

It's not a bug; it's to bring to editors' attention the fact that some data is included in the template but is not displayed because it lacks a parameter (e.g. "title=") before it. Martin (Smith609 – Talk) 14:46, 26 October 2008 (UTC)

Bot never finishes on "Causes of autism"

When I visit http://toolserver.org/~verisimilus/Bot/DOI_bot/ and enter "Causes of autism", check only the "Thorough mode" box (without committing edits), and hit "Submit Query", the bot seems to give up about halfway through. The last few lines of output look like this. Maybe that citation is putting it into a loop?

Mercury exposure and child development outcomes
Already has a DOI. All details present – no need to query CrossRef. No CrossRef record found.
Determining format of URL...assessing URL Done.
Checking that the DOI is operational...

Eubulides (talk) 19:29, 14 November 2008 (UTC)

I'll look into it; there are still some issues with the toolserver servers which are making debugging difficult at the moment, so it might be a short while. Martin (Smith609 – Talk) 22:27, 14 November 2008 (UTC)
Thorough mode is ugly – it might be a while before I can fix this. Meanwhile, it works in standard mode. Martin (Smith609 – Talk) 03:23, 17 February 2009 (UTC)

UPPERCASE change to Titlecase inappropriately

Some journals have all uppercase words, e.g., FEBS Journal. Your otherwise very useful bot changes these to titlecase (Febs in this case). Xasodfuih (talk) 20:05, 14 January 2009 (UTC)

For practical reasons I have to add these exclusions on an individual basis. Let me know if there are any others. Martin (Smith609 – Talk) 03:14, 17 February 2009 (UTC)
You can now list exclusions at User:Citation bot/capitalisation exclusions. Martin (Smith609 – Talk) 21:12, 13 May 2009 (UTC)
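A minimal sketch of how an exclusion list could be consulted before title-casing, written in Python for illustration. The set below is a hypothetical stand-in for the contents of the exclusions page, and Python's `str.title()` is only a placeholder for whatever capitalisation rule the bot really applies.

```python
# Hypothetical stand-in for the entries listed on the exclusions page.
EXCLUSIONS = {"FEBS Journal"}


def normalise_journal(name: str, exclusions=EXCLUSIONS) -> str:
    """Title-case a journal name unless it matches an exclusion (case-insensitively)."""
    lookup = {entry.lower(): entry for entry in exclusions}
    key = name.strip().lower()
    if key in lookup:
        # Use the spelling exactly as the editors recorded it.
        return lookup[key]
    return name.strip().title()
```

Matching case-insensitively means an entry only has to be listed once, in its preferred form.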

Suggestion: ISBN

Any reason why the bot doesn't search for ISBNs for {{cite book}}? Headbomb {ταλκκοντριβςWP Physics} 08:46, 17 December 2008 (UTC)

API thingy whatever an API is. Headbomb {ταλκκοντριβςWP Physics}
The database only permits 100 automated queries per day; once the bot has exceeded this limit it cannot search for more. These queries are prioritised so manually-initiated uses of the bot get first dibs on the queries. Martin (Smith609 – Talk) 17:30, 3 January 2009 (UTC)

Outstanding bugs and suggestions

Citation bot 3 is removing all author's first initials

When it replaces 'author' with 'last/first, last2/first2', etc., a name like Smith AB is replaced with Smith B. Also, it sometimes combines adjacent author names, i.e. 'Smith AB, Jones CD' may become 'Smith J.C.', with the periods added. I have been undoing these edits to the ref templates in my watchlist... as I'm pathologically nitpicky. Anyway, just a heads up.-- Rcej (talk) 02:11, 6 September 2009 (UTC)

Oh dear, I thought I'd fixed that. There seem to be so many subtly different ways of coding author parameters, and no easy way to automatically distinguish them. I recall running across your name in many of the problematic cases; I suspect that you had been very helpfully adding authors that the bot had missed to the list using comma separators. Unfortunately this format confused the bot, as I couldn't create an algorithm that could robustly determine whether 'SMITH, JO' is Jo Smith or J.O. Smith. However, the edits you describe as seeing are less easy to rationalise... could I enquire, are you only encountering the errors on citations you have edited by hand (is it possible that they occur elsewhere)? If so, is the magnitude of such pages of the order that I could manually go through and correct them myself? The bot is now much better trained at finding authors, so it should no longer be necessary for users to manually add second authors – if we can fix the existing citations, then the bug should not manifest itself again. Sorry to have inconvenienced you. Martin (Smith609 – Talk) 01:25, 7 September 2009 (UTC)

The ref templates that I have edited are the ones I initially 'queue jumped', and are still in my watchlist... which are too many to keep track of, but my 9/5 thru 9/6/09 contribs list shows many of them. I've never had to add an author; my edits prior to undoing Citation Bot 3's 9/5 activity consisted of –

1. reformatting author listings from the default 'Smith, Ab;' to read 'Smith AB,'

2. removing any url that doesn't provide 'Free full text', though indicated to; or removing urls that point to relevant disease info websites instead of the journal abstract/article. Also, when there is a PMC for the citation, removing the url lets the PMC hyperlink the title... in some of those instances, I've removed viable urls only if there is a PMC that gives a more easily accessible 'Free full text' version.

3. adding a missing PMID, which is a rare occurrence

Those are pretty much my routine edits to the templates I initiated through Citation Bot. I hope this info. helps... sorry if it's too sketchy sounding. It seemed like templates I've edited were the only ones CB3 edited, but I haven't confirmed that.-- Rcej (talk) 03:07, 7 September 2009 (UTC)

[cDs] edits to be undone

All recent erroneous edits with an edit summary beginning [cDs] will be undone presently. Martin (Smith609 – Talk) 07:42, 17 October 2009 (UTC)

Escaped pipe in title

This edit seems to have been an error due to the bot ignoring the <nowiki></nowiki> tags around a pipe character in the title field of the citation. The result was a broken template so I reverted. --Dbratland (talk) 17:54, 16 May 2010 (UTC)

Perhaps the bot could fix this in future by replacing the escaped pipe with &#x7c;, which renders as |. Would this work in all cases? Martin (Smith609 – Talk) 18:11, 16 May 2010 (UTC)
Makes it pretty hard for humans to read, unless everyone knows what things like &#x7c; mean. It might work but I would lean towards the page being written for the convenience of human editors rather than bots. --Dbratland (talk) 18:59, 16 May 2010 (UTC)
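For reference, the replacement suggested above could look roughly like this in Python. It is a sketch only: the function name is made up, and it handles just the simple case of a lone pipe wrapped in nowiki tags, which leaves open the question of whether the approach works in all cases.

```python
import re

# Matches a single pipe wrapped in <nowiki> tags, allowing stray whitespace.
NOWIKI_PIPE = re.compile(r"<nowiki>\s*\|\s*</nowiki>")


def escape_nowiki_pipes(wikitext: str) -> str:
    """Replace <nowiki>|</nowiki> with the HTML entity for a vertical bar."""
    return NOWIKI_PIPE.sub("&#x7c;", wikitext)
```

The readability objection still applies: the entity is safe for template parsing but less obvious to a human editor than the nowiki form.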

Links to JSTOR in presence of DOI

The bot tends to add links to JSTOR even if equivalent DOI is already given. Here is an example. Two major objections:

  • the links http://jstor.org/stable/2008781 and doi:10.2307/2008781 are equivalent; there is no point to give them both;
  • papers at JSTOR are not freely available, making links to them misleading and taking place of what may have been a freely available version of the paper.

Therefore, I would strongly object to adding direct links to JSTOR if equivalent DOI link is already given. Maxal (talk) 17:47, 19 May 2010 (UTC)

Not everyone has the same set of subscriptions. In the case you give, my library provides access to that paper via JSTOR but not via the DOI which resolves to Cambridge Journals Online. Both have utility. LeadSongDog come howl! 18:10, 19 May 2010 (UTC)
I think, resolving DOI does not depend on subscription. Are you sure that doi:10.2307/2008781 resolves to Cambridge Journals Online (the journal Mathematics of Computation has nothing to do with Cambridge)? If so, could you please show a particular link to which it resolves? Maxal (talk) 21:56, 19 May 2010 (UTC)
My apologies, I somehow confused it with doi:10.1017/S0305004100038470, which was from the other ref affected in that edit. Too many windows open at once, I'm afraid. Of course a DOI that resolves to a JSTOR page is equivalent to or better than the JSTOR link itself. LeadSongDog come howl! 02:05, 20 May 2010 (UTC)
The bot should not add links to JSTOR. It should add |jstor=2008781, or |id={{JSTOR|2008781}}. Headbomb {talk / contribs / physics / books} 20:22, 19 May 2010 (UTC)
While I agree on not adding links to JSTOR, I think that having multiple references to JSTOR is still excessive. Either |doi=10.2307/2008781 or |jstor=2008781 is enough, but not both (the former is preferred, I guess). But in cases where the doi is missing or resolves to a website other than JSTOR, having |jstor=2008781 is useful. Maxal (talk) 22:10, 19 May 2010 (UTC)
In some cases, the 10.2307/ doi does not work with JSTOR articles (in others, it does). The bot should be able to determine this with a fair but not 100% level of accuracy. Given that, once you reach a consensus on the optimal behaviour, I'll be happy to implement it. Martin (Smith609 – Talk) 21:06, 19 May 2010 (UTC)
I think, the optimal behavior for the bot is: (i) if doi is present and resolves to JSTOR, do nothing; (ii) if doi is missing or resolves to a site different from JSTOR, add |jstor=... (if it's not yet present); (iii) remove existing links |url=... to JSTOR and process the remaining citation as in (i) and (ii). Maxal (talk) 18:44, 26 May 2010 (UTC)
Maxal's suggestion makes sense to me. LeadSongDog come howl! 17:36, 1 June 2010 (UTC)
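Rules (i) to (iii) above can be sketched as below. This is an illustration in Python under stated assumptions: the citation is a plain dict of parameters, and `doi_resolves_to_jstor` is a hypothetical callback supplied by the caller, since (as noted in the thread) whether a DOI actually lands on JSTOR cannot be decided from its prefix alone.

```python
import re
from typing import Callable

# Direct links of the form http://jstor.org/stable/<number>
JSTOR_URL = re.compile(r"^https?://(?:www\.)?jstor\.org/stable/(\d+)/?$")


def apply_jstor_policy(params: dict,
                       doi_resolves_to_jstor: Callable[[str], bool]) -> dict:
    """Return a copy of `params` with the proposed JSTOR rules applied."""
    out = dict(params)
    jstor_id = out.get("jstor")

    # (iii) Remove a direct JSTOR url, remembering the identifier it carried.
    match = JSTOR_URL.match(out.get("url", ""))
    if match:
        jstor_id = jstor_id or match.group(1)
        del out["url"]

    # (i) A DOI that resolves to JSTOR is sufficient: add nothing.
    doi = out.get("doi")
    if doi and doi_resolves_to_jstor(doi):
        return out

    # (ii) DOI missing or pointing elsewhere: add |jstor= if not yet present.
    if jstor_id and "jstor" not in out:
        out["jstor"] = jstor_id
    return out
```

Passing the resolver in as a function keeps the policy separate from the network lookup, so the rules can be checked without contacting any server.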

[1] This problem still exists. And also, for some reason Citation bot adds |publisher= although |journal= etc. is given. —bender235 (talk) 20:22, 24 July 2010 (UTC)

This is from three days ago: [2]. Waltham, The Duke of 09:23, 28 September 2010 (UTC)
(i) Could somebody explain why using |jstor = ? It is not documented in {{cite journal}} and I couldn't make it work by experiment. (ii) Mind you, {{JSTOR}} is an extra template, and the number of templates per article is limited. Materialscientist (talk) 06:11, 2 January 2011 (UTC)

Addition of incorrect publisher, changing of URLs

Citation bot 1 is adding incorrect publisher data to correctly-formatted citations. In the above example, it added Nielsen Business Media, Inc. as both the publisher and the author of two Billboard articles written in the 1940s and 1950s. Although Billboard was owned by Nielsen between 1993 and 2009, it wasn't the publisher when the article was written, nor is it the publisher now. The redundant "Nielsen Business Media, Inc. Nielsen Business Media, Inc." is also not helpful. Also, the bot appears to be changing working (valid) URLs that were tested as working using other Wiki tools. Firsfron of Ronchester 18:56, 28 May 2010 (UTC)

Also, I tried to use Citation bot late last year and got different but also incorrect author and publisher information. Firsfron of Ronchester 19:04, 28 May 2010 (UTC)
Citation bot has recently misidentified two preprints[3][4] on Arxiv.org that were cited in the article Astronomical unit as being published by some organization called "The Journal of Business". This is wrong on three counts:
  • It disagrees with the publication details given at Arxiv.org,
  • It is prima facie unlikely that astronomical articles would be published in a business journal,
  • It is incorrect for a journal name to be listed as a publisher.
Could you tidy up this error in the bot; I've taken care of the article. Thanks SteveMcCluskey (talk) 21:29, 6 June 2010 (UTC); revised 22:01, 6 June 2010 (UTC)
Thanks for this report and for fixing the article; the publisher data is obtained from http://referee.freebaseapps.com/ and I have contacted the maintainer of this API to request that he examine this bug. Martin (Smith609 – Talk) 14:34, 7 June 2010 (UTC)
  • Hi, I've checked the examples and the recent history of the citation bot, and nothing has come from my API. In the first example, the URL is from Google Books, and if you click 'About this magazine' you'll see 'Nielsen Business Media, Inc.', so if it's an error, it's Google's error.

(If it's sent to my API, Google isn't a publisher, so it's blank. [5])
I imagine the Arxiv error is similar. Spencerk (talk) 18:13, 7 June 2010 (UTC)

Page number from appendix

Is this edit correct? It looks like the bot may be treating a page number from appendix "D" as a range of pages (from "D" through 5). --Stepheng3 (talk) 05:05, 9 June 2010 (UTC)

It needn't be an appendix. It might be a section identifier; I have come across books before where there were (say) four main sections, each of which began page numbering at 1: thus there were pages A-1 to A-64, B-1 to B-32, C-1 to C-196 etc. --Redrose64 (talk) 11:02, 9 June 2010 (UTC)
Nevertheless, Stephen is right that in this instance, a hyphen rather than an en-dash would be the appropriate punctuation. Could you suggest a rule whereby the bot could discern between a section-page and page numbers containing letters (e.g. e142–e159; xi–4)? Martin (Smith609 – Talk) 15:05, 9 June 2010 (UTC)
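One possible rule, offered purely as a starting point and not as anything the bot implements: treat "uppercase letters, separator, digits" as a section-page and keep the hyphen, and treat the value as a range when both ends share the same letter prefix or the left end is a lowercase roman numeral. A Python sketch with hypothetical names follows; it will misjudge some cases (for instance a range that really does run from page "D" to page 5).

```python
import re

ROMAN_LOWER = re.compile(r"^[ivxlcdm]+$")        # front-matter pages such as xi
PREFIXED = re.compile(r"^([A-Za-z]*)(\d+)$")      # 546, e142, S12


def format_pages(raw: str) -> str:
    """Choose a hyphen (section-page) or an en dash (range) for a pages value."""
    parts = re.split(r"\s*[-\u2013]\s*", raw.strip())
    if len(parts) != 2:
        return raw  # not a simple two-part value: leave untouched
    left, right = parts
    # "D-5": uppercase section letter(s) plus page number, so keep a hyphen.
    if left.isalpha() and left.isupper() and right.isdigit():
        return f"{left}-{right}"
    a, b = PREFIXED.match(left), PREFIXED.match(right)
    # "e142-e159" or "546-52": same prefix on both ends, so it is a range.
    if a and b and a.group(1) == b.group(1):
        return f"{left}\u2013{right}"
    # "xi-4": lowercase roman numeral running into the main text.
    if ROMAN_LOWER.match(left):
        return f"{left}\u2013{right}"
    return raw
```

Anything the rule does not recognise is returned unchanged, which errs on the side of leaving the editor's punctuation alone.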

Repeated authors?

Hi, do you know why {{Cite doi/10.2307.2F604080}} (JSTOR 604080) contains last3, author1 and author2? There should be (at most) two authors. Shreevatsa (talk) 02:35, 15 June 2010 (UTC)

Hm, I bet it's something to do with how Jstor stores special characters. I'll investigate this when I get the opportunity. Martin (Smith609 – Talk) 15:18, 15 June 2010 (UTC)

Is this one related? It's for a paper with a single author, with an accented name, on jstor; the bot added a second copy of the name without the accent. —David Eppstein (talk) 17:38, 5 August 2010 (UTC)

ed = editor 3rd

Here "ed = 3rd" was corrected to "editor = 3rd". I changed this to "edition = 3rd" (of course) but wanted to report the bug. --Fama Clamosa (talk) 10:29, 19 June 2010 (UTC)

Problem with unused_data when no spaces exist

Example --JimWae (talk) 10:39, 19 June 2010 (UTC)

Mangled author info

In this edit, the bot added clearly mangled author information for this book. (I had left the author info in the citation template blank because the work is authored by the publishing organization, and Google Books' consequent description of the author is somewhat lame.) Magic♪piano 10:48, 15 July 2010 (UTC)

Google just reflects the lame author data from the Library of Congress (who did the scanning for the Internet Archive). See OCLC 1850353 and its linked handles too. Bot fails at handling the complex case |author=Norwich (Conn.); General Society of Colonial Wars (U.S.). Connecticut. LeadSongDog come howl! 11:33, 15 July 2010 (UTC)
This bug was fixed in r172. Martin (Smith609 – Talk) 13:54, 15 July 2010 (UTC)

It is still happening, are these real authors?

-- SWTPC6800 (talk) 01:12, 22 July 2010 (UTC)

These were added because there was no publisher, editor etc information, so the expansion seemed warranted. I would propose that the best way to avoid these bugs would be to maintain a list of words that aren't likely to appear in names, i.e. not to add as an author anything containing "Society", "Corporation", "Magazines", etc. I can implement this when I get the chance if it sounds workable. Martin (Smith609 – Talk) 14:17, 23 July 2010 (UTC)
I don't have time to check this thoroughly right now, but it looks like there's a period at the end of each of these authors in their worldcat entries. This might be a useful indication of corporate authorship. See, e.g. this. LeadSongDog come howl! 16:46, 23 July 2010 (UTC)
Many magazines publish articles that have no attributed author, these are normally but not always written by a member of the magazine's staff. I have also used advertisements in magazines as a reference. Who is the author, the advertising agency or the product company? It most certainly is not the magazine publisher. Ad for Radio Hat There is no need to add an author to every reference. -- SWTPC6800 (talk) 02:50, 24 July 2010 (UTC)
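The word-list idea proposed above might be sketched like this in Python. The three words are the ones named in the thread; the function name is hypothetical, and the list would clearly need extending before it caught cases such as "Nielsen Business Media, Inc.".

```python
import re

# Words named in the thread as unlikely to appear in personal names.
CORPORATE_WORDS = {"society", "corporation", "magazines"}


def looks_like_person(author: str) -> bool:
    """Return False if the candidate author contains a known corporate word."""
    words = re.findall(r"[A-Za-z]+", author.lower())
    return not any(word in CORPORATE_WORDS for word in words)
```

A candidate that fails the check would simply not be added as an author, which also fits the point that not every reference needs one.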

Dead link: false positive

In this edit, Citation bot flagged this link as dead. It is not. --Stemonitis (talk) 16:48, 16 July 2010 (UTC)

False positives seem quite common with dead links and are presumably caused by short-term outages of the hosting servers. I am contemplating disabling the checking of links for activity; what do people think? Martin (Smith609 – Talk) 21:00, 16 July 2010 (UTC)
How about queuing a list of them for later revisitation? If they're back when revisited the tag can be deleted, otherwise it will sit in Category:Articles with dead external links for a couple of years until fixed. LeadSongDog come howl! 21:26, 16 July 2010 (UTC)
Disabled link-checking in r178; may re-enable when this issue and the placement of the deadlink template can be resolved.

Incomplete unification

The bot changed one cite web to citation, but left one: [6]. Wizard191 (talk) 17:28, 18 July 2010 (UTC)

Keep cite ordering in-place

In this edit[7], the bot fixed the spelling of a cite parameter (good), but also moved it to the end of the citation, away from the related name parameters (not so good). Could you perhaps please leave the ordering/grouping of related parameters as found? Many thanks, —Sladen (talk) 10:19, 21 July 2010 (UTC)

Hm, I attempted to implement this in r175; I'm not sure why this edit slipped through the algorithm. I'll investigate as soon as I can. Martin (Smith609 – Talk) 18:41, 21 July 2010 (UTC)

Incorrect casing EMJ

In this edit you have erroneously lower-cased 'An' in 'Early Modern Japan: An Interdisciplinary Journal'. See EMJ home page --candyworm (talk) 20:22, 22 July 2010 (UTC)

Just found User:Citation bot/capitalisation exclusions and added there, hopefully correctly. --candyworm (talk) 20:25, 22 July 2010 (UTC)
Oddly, the journal's own site here uses both capitalizations. Is there some reason to think the subtitle should be titlecased? LeadSongDog come howl! 21:26, 22 July 2010 (UTC)
I think that's the OSU Knowledge Bank site you're looking at. The actual EMJ Network site is here and appears to be unequivocal about 'An'. --candyworm (talk) 21:56, 22 July 2010 (UTC)
The Library of Congress Catalog shows this, which is more in keeping with conventional naming. See also OCLC 177505028. LeadSongDog come howl! 03:48, 23 July 2010 (UTC)
Not sure what you're driving at. The LOC link shows "Early modern Japan" (lowercase M) and "An interdisciplinary journal", while Worldcat link shows "Early modern Japan : an interdisciplinary journal". So now we have four variants, it seems most logical to go to the journal itself as I originally suggested. --candyworm (talk) 07:53, 23 July 2010 (UTC)

Incorrect "broken doi" tagging; seems critical

Example, not an isolated one. The tagged dois are valid and are clickable before the bot operation (the bot actually used some of those dois to expand the refs.). Materialscientist (talk) 23:39, 4 August 2010 (UTC)

Blocked the bot. While the bot is enormously useful, this bug appears damaging and might need a re-run of the bot over recent edits. Materialscientist (talk) 23:47, 4 August 2010 (UTC)
Thanks for stepping in with a block whilst I fixed this. It is resolved in r184 and no longer marks the problematic DOIs in Benzene as broken, so it should now be safe to unblock the bot. If the problem recurs, feel free to block the bot again. Martin (Smith609 – Talk) 21:15, 5 August 2010 (UTC)
Unblocked, thanks. Materialscientist (talk) 22:08, 5 August 2010 (UTC)

Resumed. Especially clear when the bot is run over an article once and then again (I do that often because the bot misses some parameters from one run, e.g. adding both doi and pmid, etc). Focus on doi:10.2113/gsecongeo.39.2.109. Here, the bot expanded it and then tagged as broken. I then went to the bot history and stumbled upon a funny example when the bot first untagged dois and then re-tagged them. Reverse example [8] [9]. Materialscientist (talk) 06:11, 13 August 2010 (UTC)

The best explanation I can think of is that the bot is temporarily unable to connect to the server, so it thinks that the DOI is broken. To get around this, I guess that I can try the bot on a DOI that is known to work, and see if it also thinks that this is broken; if so, it won't mark the DOI as inactive. Does this sound like a workable solution? Martin (Smith609 – Talk) 16:05, 17 September 2010 (UTC)
I think you're right that this falls into the category of server/connection glitches. I haven't seen this recently. Your idea to have some reference for the bot to sense server/connection problems sounds great, as it might avoid other problems I don't see, not only doi-related. Materialscientist (talk) 05:09, 28 September 2010 (UTC)
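The control-check idea discussed above could be sketched as follows (Python, for illustration only). The `resolves` callback is a hypothetical stand-in for the bot's real lookup, and the control DOI is simply one quoted earlier on this page; any DOI known to work would do.

```python
from typing import Callable

# Illustrative control: a DOI quoted earlier on this page and assumed to work.
CONTROL_DOI = "10.1007/s10803-007-0424-y"


def doi_status(doi: str,
               resolves: Callable[[str], bool],
               control_doi: str = CONTROL_DOI) -> str:
    """Classify a DOI as 'ok', 'broken', or 'unknown' (connection trouble)."""
    if resolves(doi):
        return "ok"
    # The DOI failed. If the known-good control fails too, blame the
    # connection rather than the DOI and do not tag anything.
    if not resolves(control_doi):
        return "unknown"
    return "broken"
```

Only a result of "broken" would justify tagging; "unknown" means the check should be retried later.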

Bot corrupted lead editor's name

As you can see here [10], the bot reassigned the first name of "editor5" and made it the first name of "editor1" (there are six authors for this veterinary medical text). The correct name of the lead editor is "Ross D. Clark, DVM" and the correct name of editor5 is "Jacob Mosier, DVM". Thanks! Astro$01 (talk) 11:47, 5 August 2010 (UTC)

{{cite book}} does not recognise more than four pairs of editor names; the highest permissible is |editor4-last= |editor4-first= --Redrose64 (talk) 14:51, 5 August 2010 (UTC)

Citation bot breaks [[doi:...]] interwiki links

In this edit, Citation bot changed an interwiki doi link of the form [[doi:10.1016/j.ejc.2008.09.028]] (inside the id field of a citation template) to the broken format [[doi=10.1016/j.ejc.2008.09.028]]. Note by the way that this is not the doi of the paper itself; it is the doi of an erratum to the paper, which is why it was specified that way rather than using a doi field. Please avoid breaking links like this. —David Eppstein (talk) 02:03, 8 August 2010 (UTC)

[[doi:10.1016/j.ejc.2008.09.028]] is a pretty unconventional way of doi linking on wikipedia, and I won't blame the bot for not understanding it. {{doi|10.1016/j.ejc.2008.09.028}} would be much more common. Materialscientist (talk) 04:15, 8 August 2010 (UTC)
Does that form even work inside an id field? Conventional or not, it was not broken before the bot edited and bot owners are responsible for putting right mistakes of their bot. I would expect the bot owner to check the bots edits for any further examples. SpinningSpark 08:17, 8 August 2010 (UTC)
Very unusual input wikitext syntax. I've just normalized it. I suppose the bot could check for id=wikilink and refuse to do anything, but I don't accept that it was "not broken before the bot edited". By mixing citation styles that way, the input obscured the fact it was linking to two separate publications (with two different forms of the journal name), even though that was clear to a human reader in the rendered text. Consider what happens when you ask your (automated) research assistant (agent) to "please get me a copy of all the references to this article". We want the citations to be as consistent as we can to maximize the agent's success. LeadSongDog come howl! 15:39, 27 August 2010 (UTC)
ID field not used in accordance with template documentation --> not supported by bot. Martin (Smith609 – Talk) 18:01, 8 September 2010 (UTC)

I insist that this is not resolved. The bot is perfectly welcome to change fields of the form doi= within a citation template. If it messes up anything else that is not a doi= field, but merely has the "doi" substring in it for some other reason, as it did in this edit, then it is a bug. It should be more careful to check that the "doi" that it thinks it is matching really is part of a field name in a template (e.g. that the letters "doi" follow a vertical pipe, not true in this example, and that they aren't nested inside other double-curly or double-square bracket pairs, also not true in this example). —David Eppstein (talk) 19:18, 8 September 2010 (UTC)

The real surprise here for me is that somehow or other, despite being in double square brackets, doi:10.1016/j.ejc.2008.09.028 gets linked to the resolver with the doi of the intended paper. I suppose someone's been busy on mediawiki development. An improper doi in the same form still links to the resolver, e.g. doi:10.1016/abc.123 links to http://dx.doi.org/10.1016/abc.123 (just mouseover and see the url pop up in your browser's status bar).
In this instance, a user completed the |id= parameter in a fashion that violated the template documentation. Unfortunately, I only have time to make the bot operate in line with consensus policies, and cannot accommodate custom uses of templates that are at odds with their documentation. Before I amend the bot's behaviour, I would therefore first ask that consensus is obtained for this novel usage of the id parameter, and that this consensus is reflected in the cite journal documentation. Martin (Smith609 – Talk) 21:35, 8 September 2010 (UTC)
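The check requested earlier in this thread (only treat "doi" as a field name when it follows a pipe and is not nested inside double-curly or double-square brackets) can be sketched in Python as below. The function names are hypothetical and the parser is deliberately simple; it is not how the bot reads templates.

```python
from typing import List


def split_top_level(template: str) -> List[str]:
    """Split the inside of one {{...}} on pipes not nested in {{ }} or [[ ]]."""
    if not (template.startswith("{{") and template.endswith("}}")):
        raise ValueError("expected a single {{...}} template")
    body = template[2:-2]
    parts, depth, start, i = [], 0, 0, 0
    while i < len(body):
        pair = body[i:i + 2]
        if pair in ("{{", "[["):
            depth += 1
            i += 2
        elif pair in ("}}", "]]"):
            depth = max(depth - 1, 0)
            i += 2
        else:
            # Only a pipe at nesting depth 0 separates parameters.
            if body[i] == "|" and depth == 0:
                parts.append(body[start:i])
                start = i + 1
            i += 1
    parts.append(body[start:])
    return parts


def param_names(template: str) -> List[str]:
    """Names of the named parameters at the top level of the template."""
    names = []
    for part in split_top_level(template)[1:]:  # skip the template name
        if "=" in part:
            names.append(part.split("=", 1)[0].strip())
    return names
```

For a citation whose id field contains [[doi:10.1016/j.ejc.2008.09.028]], this yields no `doi` parameter at all, so a rewrite keyed on parameter names would leave the link alone regardless of whether that use of the id field is endorsed.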

Page blanking

Not sure why, but the bot just blanked an article here when it was trying to note that a doi was incorrect. Great bot by the way - I've saved hours since I found out about {{cite doi}} and {{cite jstor}}. Smartse (talk) 12:28, 27 August 2010 (UTC)

(It worked the second time). Smartse (talk) 12:29, 27 August 2010 (UTC)
Probably a server error. I've never been able to replicate this. Sorry for the inconvenience. Martin (Smith609 – Talk) 15:55, 17 September 2010 (UTC)

{{resolved}}

Confusing issue and page numbers

Citation bot 1 keeps confusing issue numbers for page numbers for online publications. See this edit of Lemur, particularly the refs named "2009Groeneveld", "2008Braune", and "2008Orlando". – VisionHolder « talk » 18:49, 1 September 2010 (UTC)

I suspect that this is an error in the publisher's database. I'll investigate further when I get the opportunity. Martin (Smith609 – Talk) 15:57, 17 September 2010 (UTC)
Any update on this? The bot is still doing this, and is one of the most common errors I have to go back and fix. One of the latest examples: [11] – VisionHolder « talk » 13:19, 3 December 2010 (UTC)
Looking at the XML from PMID 18442367, we see

               <JournalIssue CitedMedium="Internet">
                   <Volume>8</Volume>
                   <PubDate>
                       <Year>2008</Year>
                   </PubDate>
               </JournalIssue>

and

           <Pagination>
               <MedlinePgn>121</MedlinePgn>
           </Pagination>

so clearly they've got the same error. I'd suggest that in such cases, where no true issue number nor page number applies, we would do better to follow the format given in Citing Medicine as shown as example "36. Journal article on the Internet with location/extent expressed as an article number". To wit, it shows: Pasanen K, Parkkari J, Pasanen M, Hiilloskorpi H, Mäkinen T, Järvinen M, Kannus P. Neuromuscular training and the risk of leg injuries in female floorball players: cluster randomised controlled study. BMJ [Internet]. 2008 Jul 1 [cited 2008 Nov 17];337:a295 [7 p.]. Available from: http://www.bmj.com/cgi/reprint/337/jul01_2/a295 Free full text article. DOI: 10.1136/bmj.a295 Of course that implies that we define a new |articleno=a295 which overrides both issue and page. LeadSongDog come howl! 20:33, 3 December 2010 (UTC)

I'm fine with this. If someone implements it, let me know and I can fix up some of my articles. Alternatively, I can let Citation bot actually save me work rather than create it. – VisionHolder « talk » 06:09, 19 December 2010 (UTC)
The data is actually coming from CrossRef rather than PubMed. CrossRef's first_page data is usually good, so there are some possibilities:
  • Report the error to CrossRef and hope that they fix it
  • Don't add any page numbers if they happen to be the same as an issue number
  • Don't add page numbers if the journal is equal to BMC biology
The first option would be the best if it worked. Any other suggestions are welcome! Martin (Smith609 – Talk) 21:20, 12 February 2011 (UTC)
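The second and third options above could be sketched roughly as follows. This is only an illustration in Python, not the bot's actual code: the function name `pages_to_add`, the record keys (`first_page`, `issue`, `journal`) and the `skip_journals` default are all assumptions made for the example.

```python
def pages_to_add(record, skip_journals=("BMC Biology",)):
    """Return the page value worth adding, or None to leave the citation alone.

    `record` is a hypothetical dict of metadata for one citation; the key
    names used here are assumptions, not CrossRef's real field names.
    """
    first_page = (record.get("first_page") or "").strip()
    issue = (record.get("issue") or "").strip()
    journal = (record.get("journal") or "").strip()

    if not first_page:
        return None
    # Option 3: skip journals known to report article numbers as pages.
    if journal.lower() in {name.lower() for name in skip_journals}:
        return None
    # Option 2: a "page" identical to the issue number is suspicious.
    if issue and first_page == issue:
        return None
    return first_page
```

Either rule errs on the side of adding nothing, which leaves the citation incomplete rather than wrong.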

Name mangling in doi templates

Has this sort of thing been fixed? Rich Farmbrough, 04:23, 18 September 2010 (UTC).

Multiple names

The attempted application at Copula (statistics) for cite to Onken et al. left all authors on Last= line and 5 rather than 4 "first*=". Result edited manually. Melcombe (talk) 11:14, 21 September 2010 (UTC)

Bizarre edit by bot

http://en.wikipedia.org/w/index.php?title=Template%3ACite_doi%2F10.1021.2Fop9700385&action=historysubmit&diff=398313359&oldid=374677940  Ronhjones  (Talk) 22:46, 25 November 2010 (UTC)

Something similar has happened before; however I'm completely unable to replicate this. If anyone has any ideas, please let me know. Martin (Smith609 – Talk) 22:56, 25 November 2010 (UTC)
Evidently the pasted text was from Alcohol and cancer. Was the bot working on that article when the problem arose? LeadSongDog come howl! 06:54, 26 November 2010 (UTC)

et al

I think the bot should detect if 'et al' is present in situations like this one. Ruslik_Zero 19:17, 6 December 2010 (UTC)

You're right, that's clearly a bug. On finding lastn=et al. or similar, adding values for firstn, for lastn+1, etc should only happen if a substantive value is provided for lastn in lieu of et al. Of course that doesn't mean the additional names must be displayed, but the metadata should be corrected in any case. LeadSongDog come howl! 21:54, 6 December 2010 (UTC)
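A minimal sketch of the check described above, in Python. The helper names `is_et_al` and `may_append_authors` are hypothetical, and the pattern only covers common spellings of "et al."; it is not taken from the bot's source.

```python
import re

# Matches "et al", "et al.", "et. al.", optionally wrapped in wiki italics
# or other punctuation. The exact set of spellings is an assumption.
ET_AL = re.compile(r"^\W*et\.?\s*al\.?\W*$", re.IGNORECASE)


def is_et_al(value):
    """True when a parameter value is just an 'et al.' placeholder."""
    return bool(ET_AL.match(value or ""))


def may_append_authors(params):
    """False if any lastN/authorN parameter holds an 'et al.' placeholder.

    `params` is a hypothetical dict of template parameter names to values.
    """
    for name, value in params.items():
        if re.fullmatch(r"(last|author)\d*", name) and is_et_al(value):
            return False
    return True
```

When `may_append_authors` returns False, the placeholder would first have to be replaced with a substantive name before any further firstn or lastn values are added.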

Strange bug - the bot's fascination with Magnus Barelegs

For some reason, the bot seems to have a bit of a fascination with using the information about this paper instead of the paper that the doi/pmid actually links to. I've now seen it across multiple articles, for example in Template:Cite pmid/2439888, Template:Cite doi/10.1038.2F174515a0 and there are about 50 other occurrences. Any idea what's up? If you can let me know how to fix them relatively speedily then I'll have a go. (You must sometimes wish you'd never bothered inventing the bot, mustn't you?) SmartSE (talk) 23:43, 2 January 2011 (UTC)

Impressive spot! It's rather curious; I don't seem able to replicate the error. I'll scratch my head after dinner... Martin (Smith609 – Talk) 00:06, 3 January 2011 (UTC)
I'm not sure what you did, but it looks to have been fixed. Thanks SmartSE (talk) 22:08, 3 January 2011 (UTC)
I fixed them manually; couldn't get at the underlying cause, though. I'll add a "die()" function to the bot at some point, to see if I can stop it in its tracks. Martin (Smith609 – Talk) 22:13, 3 January 2011 (UTC)


Ignoring last references

I'm trying to run the bot on Genome-wide association study. However, it does not seem to be reacting at all to references with PMIDs after the 32nd reference. There is no error output, so unfortunately I can't be more helpful than that. --LasseFolkersen (talk) 16:10, 7 December 2011 (UTC)

The problem citations were missing the closing }} (double brace) before the closing </ref> tag. This is a typo that trips up the bot, though I would think it should be possible for the bot to automagically detect and fix it, at least in simple cases. After correcting the typos, the bot ran to completion. That said, this edit incorrectly changed {{citation}} to {{cite journal}}, it should have been {{cite web}}.
Sidebar: The source is a pretty good quality genetics blog, but it's no journal. It does, however, cite a Nature Genetics article that may be more of use. I'd suggest checking it.
The url given was missing the http:// protocol prefix. While this didn't seem to trouble the bot, it should have been fixed, as it will trouble some web browsers. Again, this is something the bot could be made to remedy automagically. LeadSongDog come howl! 17:58, 7 December 2011 (UTC)
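Both checks suggested above (spotting a citation template left unclosed before </ref>, and supplying a missing protocol prefix) are simple enough to sketch. The Python below is an illustration only; `unclosed_refs` and `add_protocol` are hypothetical names, and counting braces is a crude heuristic that ignores nowiki sections and comments.

```python
import re

# Opening <ref> or <ref name="..."> tags only; self-closing <ref ... /> tags
# and <references> are deliberately not matched.
REF = re.compile(r"<ref(?:\s[^>/]*)?>(.*?)</ref>", re.DOTALL | re.IGNORECASE)


def unclosed_refs(wikitext):
    """Return the bodies of <ref> elements holding more '{{' than '}}'."""
    return [body for body in REF.findall(wikitext)
            if body.count("{{") > body.count("}}")]


def add_protocol(url):
    """Prefix http:// when a URL carries no scheme at all."""
    if re.match(r"^[a-z][a-z0-9+.-]*://", url, re.IGNORECASE) or url.startswith("//"):
        return url
    return "http://" + url
```

Reporting the offending references, rather than silently repairing them, would probably be the safer first step, since a missing brace can have more than one plausible fix.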
Thank you so much for the help --LasseFolkersen (talk) 08:07, 8 December 2011 (UTC)
You're quite welcome. Interesting article. LeadSongDog come howl! 14:36, 8 December 2011 (UTC)

MD5 hash?

When I run it on Bioorthogonal Chemistry, I get this error: Edit may have failed. Retrying: xxx Still no good. One last try: Failed. Error code: BADMD5: The supplied MD5 hash was incorrect. What is a BADMD5 anyway?

Replicated. Toolserver is very sluggish today, it seems. "BADMD5" is an indication that the MD5 hash calculation revealed a problem with the integrity of a file transferred. Probably related to toolserver issues. LeadSongDog come howl! 14:41, 9 December 2011 (UTC)
still a problem. citation bot won't complete Autoimmune lymphoproliferative syndrome.  —Chris Capoccia TC 19:41, 12 December 2011 (UTC)
bot works on shorter pages like Infoveillance.  —Chris Capoccia TC 20:00, 12 December 2011 (UTC)
Seems to have been one specific troublesome DOI 10.1002/ajh.21007 that wasn't expanding. Worked around it by using pmid instead, then it expanded. Odd indeed. LeadSongDog come howl! 22:58, 12 December 2011 (UTC)

I am getting the same error on Potassium iodide. When I run the page in manual mode and check the output, it is mangling all the non-ascii characters.  —Chris Capoccia TC 14:54, 22 May 2012 (UTC)

Incorrect linkage to Main Page instead of to article

Every time I run Citation Bot it gives me some incorrect linkage. The results page says

Processing page '(Article name)'

but when I click on (Article name) I am taken to the Main Page instead. Thanks, Shearonink (talk) 01:11, 20 October 2012 (UTC)

Error message

For the last day or so, I have received the following message in response to attempts to use citation bot: "Blank page. Perhaps it's been deleted? ** Blank page retrieved." The message is returned for all pages and they are clearly not blank. 173.62.242.128 (talk) 11:44, 29 August 2013 (UTC)

Discussed at user talk:Citation bot#Citation bot reports blank page (erroneously, on every cite doi template, every time) LeadSongDog come howl! 13:58, 29 August 2013 (UTC)

Hebrew wikipedia

Hi! I've tried to use the tool in the He wikipedia. I wrote in the first box "he:User:Neta90/Bibliography" and in the second "Neta90" (with the default parameters). Unfortunately, it didn't work out. The error was "Writing to He:User:Neta90/Bibliography ... Edit may have failed. Retrying: xxx Still no good. One last try: Failed. Error code: NOTOKEN: The token parameter must be set. history / last edit". Is it a bug? Thanks, Neta90 (talk) 15:02, 12 November 2013 (UTC)

False negative on pmid hunt?

In this edit there were ten instances which, like the example below, didn't find the PMID. Is the bot querying pubmed by doi before declaring "abandoned. nothing found"? I suspect it is not, as a manual pubmed query for "10.1056/NEJMoa060068" found the record and returned PMID 17093248.

*-> Rabbit antithymocyte globulin versus basiliximab in renal transplantation 1: Tidy citation and try ISBN SICI Brennan, DC, Daller JA, Lake KD, Cibrik D, Del Castillo D 2: Find DOI - Checking CrossRef database... Match found: 10.1056/NEJMoa060068 3: Find PMID & expand - Searching PubMed... - Errors detected in PMID search; abandoned. nothing found. 4: Expand citation - Checking CrossRef for more details Done. Just a couple of things to tweak now... Checking that DOI 10.1056/NEJMoa060068 is operational... LeadSongDog come howl! 16:29, 15 September 2010 (UTC)

Looks like the PMID servers were temporarily down; it looks to me like it's now behaving as expected. Martin (Smith609 – Talk) 18:49, 17 September 2010 (UTC)

{{resolved}}

Now tagged {{nobots}}. Keeps changing an URL to the wrong page. --Old Moonraker (talk) 20:42, 16 September 2010 (UTC)

Thanks for the fast response. --Old Moonraker (talk) 14:39, 17 September 2010 (UTC)
{{resolved}}

Removing author and editor info

I like to include author data in my citation tags, even if they won't be displayed. However, Citation bot 1 keeps deleting "extra" author information. See this edit of Lemur, particularly refs named "2009Mittermeier" and "2008MittermeierGroves". Can this "feature" be disabled? – VisionHolder « talk » 18:49, 1 September 2010 (UTC)

That input wikisource erroneously used {{cite journal}} (with a journal, authors, volume, issue, pages, and doi) in lieu of {{cite book}} (with editors, a publication-place, publisher, and an isbn). It also erroneously listed editors as if they were authors. It is less than surprising the bot got confused by this one. Wikisource fixed. LeadSongDog come howl! 21:36, 1 September 2010 (UTC)
If you want the extra author information in the source code, you could specify |author9=Jones, A.B; Smith, C.D; Milton, E.F.. This still won't be displayed (it just triggers the display of "et al.") but the bot will leave them be. Martin (Smith609 – Talk) 16:15, 17 September 2010 (UTC)

{{resolved}}

ISBN problem

On Tempo (chess) the bot replaced a 13-digit ISBN with a 10-digit ISBN from duplicate data. Bubba73 (You talkin' to me?), 05:13, 3 September 2010 (UTC)

Hopefully this will be fixed with the next version of duplicate-data handling (in progress). Martin (Smith609 – Talk) 16:08, 17 September 2010 (UTC)

It also made an unwanted change to Rook and pawn versus rook endgame yesterday. The authorlink was commented out because the author is linked previously. The bot changed it. Bubba73 (You talkin' to me?), 05:58, 3 September 2010 (UTC)

The change doesn't seem to affect the output of the citation. If you don't like the "unused_data" parameter, you could type |authorlink=<!-- John Nunn--> to the same effect. Martin (Smith609 – Talk) 16:08, 17 September 2010 (UTC)

{{resolved}}

bot linked book vs review of bk [12]

The review of the book is posted at the place the book is listed so the bot thinks the book is the reference and sticks in the isbn. Smkolins (talk) 23:32, 10 September 2010 (UTC)

Sounds like a false positive; see instructions on user page to keep ISBN out. Martin (Smith609 – Talk) 14:15, 17 September 2010 (UTC)

{{resolved}}

Problem when there are two ISBN numbers

In the article Oral Roberts, there are citations with an "id" parameter that has two ISBN numbers. This works as intended when the parameter is named "id", but when the "id" is changed to "isbn", the two ISBN numbers are interpreted as one big ISBN number, which is incorrect. I reverted the changes that the bot made. Obankston (talk) 15:29, 17 September 2010 (UTC)

My understanding is that you should only cite the ISBN of the cited source, and that unique sources only have one ISBN. Sounds to me like the template was incorrectly completed. Let me know if that's not the case. Martin (Smith609 – Talk) 15:53, 17 September 2010 (UTC)
Investigation shows that ISBN 0-87975-369-2 is the 1987 edition, and ISBN 0-87975-535-0 is the 1989 edition. Since the two references concerned both show the year as 1989, you should show only |isbn=0-87975-535-0. --Redrose64 (talk) 16:40, 17 September 2010 (UTC)
But the question remains as to what is to be done with those citations which used "id=" with more than one ISBN. Should they be changed by hand when they are discovered, or should there be an automated way of dealing with them? It is easy for me to suggest that someone else do the work, as long as I don't have to answer for it. TomS TDotO (talk) 17:11, 17 September 2010 (UTC)
I fixed the citation in the article :) thanks for the investigation. My suggestion for the bot is: if the parameter name is changed from "id" to "isbn", validate the parameter value to determine if it is a valid ISBN number, as is done by http://en.wikipedia.org/wiki/Special:BookSources/0879755350. If the parameter value is invalid, do not make the change, but append the citation to a list of items to be handled manually. Obankston (talk) 22:56, 17 September 2010 (UTC)
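The validation step proposed above can be sketched with the standard ISBN-10 and ISBN-13 check-digit rules. This is a Python illustration, not the bot's code: `isbn_is_valid`, `rename_id_to_isbn` and the `manual_review` list are hypothetical names.

```python
DIGITS = "0123456789"


def isbn_is_valid(value):
    """Check an ISBN-10 or ISBN-13 check digit; hyphens and spaces are ignored."""
    chars = value.replace("-", "").replace(" ", "").upper()
    if len(chars) == 10:
        if not all(c in DIGITS for c in chars[:9]) or chars[9] not in DIGITS + "X":
            return False
        digits = [int(c) for c in chars[:9]]
        digits.append(10 if chars[9] == "X" else int(chars[9]))
        # ISBN-10: weights 10..1, total must be divisible by 11.
        return sum(w * d for w, d in zip(range(10, 0, -1), digits)) % 11 == 0
    if len(chars) == 13:
        if not all(c in DIGITS for c in chars):
            return False
        # ISBN-13: alternating weights 1 and 3, total divisible by 10.
        return sum((1 if i % 2 == 0 else 3) * int(c)
                   for i, c in enumerate(chars)) % 10 == 0
    return False


def rename_id_to_isbn(params, manual_review):
    """Rename |id= to |isbn= only when the value is a single valid ISBN.

    Anything else is left untouched and queued in `manual_review`.
    """
    value = params.get("id", "")
    candidate = value.strip()
    if candidate.upper().startswith("ISBN"):
        candidate = candidate[4:].strip()
    if "id" in params and "isbn" not in params and isbn_is_valid(candidate):
        params["isbn"] = candidate
        del params["id"]
    elif "id" in params:
        manual_review.append(value)
    return params
```

Two ISBNs run together, as in the Oral Roberts citations, fail the length test and so would be queued for manual handling instead of being converted.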
It is actually incorrect to say that a given book only has one ISBN. Certain books published jointly have more than one – I once had one that had four ISBNs printed on it. Obankston is correct, at least check that you are converting 13 or 10 or 9 digits worth of characters, simply skip anything else since it might be valid (by all means create a list). Rich Farmbrough, 04:22, 18 September 2010 (UTC).
Another suggestion would be to change the citebook template to handle cases with multiple ISBNs. I've noticed that hardbound and paperback editions have different ISBNS even though the text is identical; and multivolume works can have one ISBN for the whole set and distinct ISBNs for each individual volume. I don't know what to do when making a citation to a work with more than one ISBN. TomS TDotO (talk) 11:02, 18 September 2010 (UTC)
{{Cite book}} and the other cite templates are already fairly heavy duty, especially since that can be used many, many times on one page. I have been meaning to look at them in detail, but there are a lot of experts in these templates (Redrose64 is one), so it may not be worth it. Rich Farmbrough, 15:19, 18 September 2010 (UTC).
(edit conflict) You're supposed to give the details of the edition which you consulted.
For both jointly-published works, and hardback/paperback versions, there may well be multiple ISBNs on the copyright page (I have some such books myself), but there will be just one ISBN on the back cover – that's the one to give. For multi-volume sets where individual volumes have their own ISBN, give the ISBN for the specific volume containing the cited text. --Redrose64 (talk) 15:21, 18 September 2010 (UTC)
"There will be just one ISBN on the back cover" – not always – I once had one that had four ISBNs printed on it, on the back cover, wish I could lay my hands on it. Oh and when we did the big ISBN clean-up a few years back, we found several books with invalid ISBNs printed on them. It is like quantum mechanics, anything that is possible to happen will happen. Rich Farmbrough, 19:14, 20 September 2010 (UTC).

Don't forget it is perfectly acceptable to indicate more than one ISBN in, say, a "Further reading" section or a section containing a list of an author's works to alert readers to the fact that a particular book exists in more than one format. — Cheers, JackLee talk 19:38, 20 September 2010 (UTC)

If someone is really interested in presenting other ISBNs, they can always mention them outside of the citebook template. Maybe using the template a second time, without repeating all of the information, just a barebones one like this: (paperback= ed.). ISBN 123456789? TomS TDotO (talk) 10:13, 21 September 2010 (UTC)
The problem with using a "barebones" {{cite book}} is that sooner or later, a well-meaning editor (or bot) will fill in all the absent fields, and possibly not do so in the intended manner. If there are two editions or formats to show in "Further reading", they could each be given a full citation: see Template:Rolt-Red – if you do this yourself, it's less likely that the aforementioned well-meaning ed/bot will get it wrong. --Redrose64 (talk) 14:21, 21 September 2010 (UTC)
Thank you. That is an excellent point, and I will be careful about that. TomS TDotO (talk) 16:17, 21 September 2010 (UTC)
One of the inherent weaknesses of the ISBN is that it encodes the publisher in the number. Thus any edition with multiple publishers legitimately gets multiple ISBNs. This commonly reflects exclusive national distribution agreements rather than differences in the book itself. The telltale is often the use of an adhesive alternate ISBN barcode label on the flyjacket, covering the original ISBN barcode in the second country. However, in the case of controversial content, national sensitivities or interests are sometimes reflected in editorial choices, even with simultaneous release dates. Care is needed to ensure we give the consulted version's ISBN if we give any. An ISBN is not mandatory and so, in such cases where it is unclear, should probably be blocked by a comment, e.g.: |isbn=<!-- deliberately omit isbn to avoid confusion -->

LeadSongDog come howl! 16:54, 21 September 2010 (UTC) {{resolved}}

Screwed up doi

Hello. I used the cite doi template for the doi 10.1136/bmj.1.3363.1111-a on Samuel Taylor Darling to create this reference (some time ago, but I've only just noticed it's wrong!). The bot then expanded this for me, completely incorrectly, such that only the doi bit that I originally input is actually correct. Ta, Chris (talk) 08:38, 26 September 2010 (UTC)

Wow, the bot conflated two different BMJ articles from volume 1 page 1111. The other is Br Med J 1979; 1 : 1111 doi: 10.1136/bmj.1.6171.1111 For some reason, the correct article, Br Med J (13 Jun 1925); 1 : 1111 doi: 10.1136/bmj.1.3363.1111-a JSTOR 25445494 is not indexed in Pubmed. LeadSongDog come howl! 02:15, 27 September 2010 (UTC)

{{resolved}}

Disrupting the Google book reference

Diff=http://en.wikipedia.org/w/index.php?title=Tribute&diff=385236599&oldid=378447098

Please do not remove &hl=en#v=onepage. "onepage" became "snippet".

  • Original url
http://books.google.com/books?id=vj8ShHzUxrYC&pg=PA482&dq=tribute+korea+china&hl=en#v=onepage&q=tribute%20korea%20china&f=false
  • Bot modified url (removed books, &hl=en#v=onepage and &f=false)
http://books.google.com/?id=vj8ShHzUxrYC&pg=PA482&dq=tribute+korea+china&q=tribute%20korea%20china
  • Actual url modified by Google. (added books, &hl=en#v=snippet and &f=false)
http://books.google.com/books?id=vj8ShHzUxrYC&pg=PA482&dq=tribute+korea+china&q=tribute+korea+china&hl=en#v=snippet&q=tribute%20korea%20china&f=false
―― Phoenix7777 (talk) 22:11, 16 September 2010 (UTC)
I've temporarily disabled Google-URL modifications until I can resolve this more adequately. I think that just keeping anything after the # would always ensure the correct behaviour; is this correct? Martin (Smith609 – Talk) 14:13, 17 September 2010 (UTC)
On this basis the bot now retains anything after the hash. ⇒ {{fixed}} in r187. Martin (Smith609 – Talk) 14:33, 17 September 2010 (UTC)
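The "retain anything after the hash" rule might look something like the Python sketch below. It is an assumption-laden illustration, not the r187 change itself: `tidy_google_books_url` is a hypothetical name, and which query parameters (if any) are safe to drop is exactly what this thread disputes, so the `drop` argument is only a placeholder.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def tidy_google_books_url(url, drop=("hl",)):
    """Drop selected query parameters but keep the path and the #fragment as-is."""
    parts = urlsplit(url)
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key not in drop]
    # parts.fragment holds everything after '#', passed through unchanged.
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), parts.fragment))
```

Applied to the original URL quoted above, this keeps /books and the whole of #v=onepage&q=tribute%20korea%20china&f=false, removing only hl=en from the query string.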
Isn't it sufficient to keep the part of the URL up to the page number indicated after "PA", i.e., http://books.google.com/books?id=vj8ShHzUxrYC&pg=PA482 in the example above? That does away with the highlighting of terms in the book, unless for some reason that is desired (I can't really imagine why). — Cheers, JackLee talk 19:43, 20 September 2010 (UTC)
There may be some cases where highlighting is deliberate; the bot could never discern these. {{resolved}} Martin (Smith609 – Talk) 03:50, 5 November 2010 (UTC)

Cite doi and page numbers

Whenever I use {{cite doi}}, I always have to make [13] this sort of change: [14], http://en.wikipedia.org/w/index.php?title=Template:Cite_doi/10.1260.2F095830503765184583&diff=prev&oldid=384324518, [15], etc. Only the first page of the article ever gets properly added to the citation template. I'm not sure why this is, but is there any way to fix it? NW (Talk) 02:43, 10 October 2010 (UTC)

Unfortunately not. The database consulted by the bot does not contain this information. Martin (Smith609 – Talk) 03:01, 10 October 2010 (UTC)

{{resolved}}

Didn't create the template

I jumped the queue with this link [16] and it went through the motions but never created the template. (I ended up just creating it myself.) I can't reproduce with other dois. ErikHaugen (talk) 22:31, 4 November 2010 (UTC)

Hmm, I'll look into it. Let me know if anything else unusual happens; I might have to roll back the latest revision. Martin (Smith609 – Talk) 03:44, 5 November 2010 (UTC)

Also is it deliberate that first names are abbreviated? I almost always end up editing the cite doi subpage to a.) include entire page range and b.) spell out first names. ErikHaugen (talk) 22:31, 4 November 2010 (UTC)

Yes, it's very deliberate: see Template:Cite doi#Formatting. Martin (Smith609 – Talk) 03:44, 5 November 2010 (UTC)
That section doesn't say it explicitly. Do people get upset when first names are spelled out? ErikHaugen (talk) 05:33, 5 November 2010 (UTC)
People get upset when things are inconsistent. Initials are sometimes the only data that the bot can get so if we're making an arbitrary decision as to whether to consistently use initials or names, initials win (ceteris paribus). Martin (Smith609 – Talk) 23:28, 1 January 2011 (UTC)

{{resolved}}

Don't convert citation to cite patent

This is an old diff, but it just came to my attention today that this old edit of your bot completely broke the citation template. The citation parameters for a patent are completely different than that of {{cite patent}}, so you should never convert between these two. Wizard191 (talk) 13:52, 5 November 2010 (UTC)

Your bot did it again: http://en.wikipedia.org/w/index.php?title=Polytetrafluoroethylene&diff=prev&oldid=398195949. Please stop converting patent templates. Wizard191 (talk) 16:13, 22 November 2010 (UTC)
Surely the solution here is to make the patent templates inter-compatible. I've made some headway here at Template:Cite patent; your feedback there would be welcome. Martin (Smith609 – Talk) 17:14, 22 November 2010 (UTC)
I have no problems with making them interchangeable, but in the meantime we can't have the bot going around breaking things. Wizard191 (talk) 18:46, 22 November 2010 (UTC)
Great. They're now interchangeable. Martin (Smith609 – Talk) 23:25, 1 January 2011 (UTC) {{resolved}}

Oxford Dictionary of National Biography

Could you please get the bot to ignore Oxford Dictionary of National Biography references. Duplicating authors or adding the editor of the dictionary as the author of the article is inappropriate. DrKiernan (talk) 08:14, 19 June 2010 (UTC)

Please supply an example of the bug so I can understand what has caused it. Martin (Smith609 – Talk) 21:45, 9 July 2010 (UTC)
[17][18] Someone said it was because the CrossRef database incorrectly lists one of the editors of the encyclopedia (Brian Harrison) as an author of the article (sorry, forgotten who or where). DrKiernan (talk) 13:13, 4 August 2010 (UTC)
A manual override seems to be the best solution in this case. Martin (Smith609 – Talk) 23:49, 1 January 2011 (UTC)

{{resolved}}

Hi. The bot updated the PMID for ref 1 in this article, from a correct PMID to an incorrect PMID. The updated PMID pointed to a paper (I guess a short paper) that had the same start page as the correct paper - I suspect this is why the mistake occurred. This does, however raise the question of whether this will happen every time more than one paper starts on the same page of a journal. Any thoughts? Regards, GILO   ACCIDENT & EMERGENCY 05:33, 2 January 2011 (UTC) {{fixed}} in r231 – never overwrite editor input. Martin (Smith609 – Talk) 14:27, 2 January 2011 (UTC){{resolved}}
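The "never overwrite editor input" rule can be illustrated with a short Python sketch. The function name and the dict-based representation of template parameters are assumptions for the example; this is not the r231 code.

```python
def merge_without_overwrite(existing, fetched):
    """Fill in blank or missing parameters only; editor-supplied values win.

    Both arguments are hypothetical dicts of parameter name to value.
    """
    merged = dict(existing)
    for key, value in fetched.items():
        # Treat a missing key and a whitespace-only value alike as "blank".
        if not (merged.get(key) or "").strip():
            merged[key] = value
    return merged
```

With this rule a PMID typed by an editor survives even when a database lookup by volume and start page points at a different paper.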

Mangles URL with id in it and {{doi}}

http://en.wikipedia.org/w/index.php?title=Collatz_conjecture&diff=373121445&oldid=371737940

http://code.google.com/p/citation-bot/issues/detail?id=60

Cheers, — sligocki (talk) 01:45, 13 July 2010 (UTC)

Until the googlebooks url parsing works properly, I'd suggest that it may be better to simply tag the entry for human attention. Certainly this behaviour of simply replacing "id" with "doi" within the url is a bug. LeadSongDog come howl! 14:22, 13 July 2010 (UTC)
{{fixed}} in r173. Martin (Smith609 – Talk) 15:36, 13 July 2010 (UTC)

{{resolved}}


This edit has problems

  1. Proper Citation (journal) changed to Cite book, removed journal title, volume, issue. Should be journal.
  2. Proper Citation (journal) changed to Cite book, removed journal title, volume, issue. Should be journal.
  3. Irregular, but not improper. Tagged improper, but added unnecessary postscript=
  4. Proper Citation within a quote template ending }}</ref>}} forced insertion of unnecessary postscript=
  5. Proper Citation (web) changed to Cite document (should be web), tagged improper, added unnecessary postscript=
  6. Proper Citation (web) changed to Cite document (should be web), tagged improper, added unnecessary postscript=

So whatever is detecting the need for postscript changes seems too sensitive, and the bot should detect "journal=" and "newspaper=". --Lexein (talk) 16:36, 30 September 2010 (UTC)

I've re-shuffled the way that the bot checks the type of the source; can you provide any way that the bot could have told that the last two citations were specifically web-based documents (not, for example, an upload of a book chapter)? Martin (Smith609 – Talk) 03:47, 5 November 2010 (UTC)

{{resolved}}

Unnecessary edits

Why is Citation Bot 1 naming all unnamed refs in articles it's working on? This is not only unnecessary, but actually unhelpful, as the point of names is not only to collect together cites from the same source, but also as a mnemonic device for editors. They are never going to remember what "Ref_a" or "Ref_n" is. Please fix this. Example: [19] Beyond My Ken (talk) 00:53, 19 December 2010 (UTC)

This feature has been requested by me – though I have mostly envisioned its use in medical citations where
  • it is easy to autogenerate a sensible refname such as "author_year"
  • some editors use shortcuts such as {{pmid 123456}} which nobody can remember what they refer to
It is obviously much more difficult to apply for web citations, not sure if it is doable. Richiez (talk) 10:32, 19 December 2010 (UTC)
The reference names are currently generated from the COINS metadata produced by citations. The "author" and "year" fields are used. If there are other fields that are appropriate to web citations (perhaps part of the URL?) then it would be easy to use these in the reference name. Martin (Smith609 – Talk) 13:56, 19 December 2010 (UTC)
I'm with Beyond My Ken on this one, these edits are more hurtful than helpful because the bot is now just applying non-intuitive names to references. In most cases names aren't even needed because they are only used once in the article, therefore just crowding up the code even more. Please stop this action. Wizard191 (talk) 20:59, 19 December 2010 (UTC)
As a first measure I've made the bot ignore references that don't use a cit... template to generate their output (in r223), and used the first 2 words of titles if no author is present (r225). Martin (Smith609 – Talk) 21:55, 20 December 2010 (UTC)
  • And disabled whilst I investigate some bugs. Martin (Smith609 – Talk) 22:36, 20 December 2010 (UTC)
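For illustration, the naming rule described above (author and year where available, otherwise the first two words of the title) could be sketched in Python as below. The function name `make_ref_name` and the underscore-joined output format are assumptions; the thread does not specify the exact format the bot produces.

```python
import re


def make_ref_name(author="", year="", title=""):
    """Build a reference name from author and year, else from the title.

    Returns None when there is nothing usable to build a name from.
    """
    def clean(text):
        # Collapse anything that is not a word character into underscores.
        return re.sub(r"[^\w]+", "_", text.strip()).strip("_")

    if author.strip():
        stem = clean(author)
    else:
        stem = clean(" ".join(title.split()[:2]))
    if not stem:
        return None
    year = clean(year)
    return f"{stem}_{year}" if year else stem
```

Returning None for citations with neither an author nor a title leaves such references unnamed, which sidesteps opaque names of the "Ref_a" kind.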

Please re-enable it! It's so immensely helpful. Users are free to rename the citations if they feel like it, but there's just no reason to keep 5 <ref> which could be condensed into one <ref name=Ref_A> and 4 <ref name=Ref_A/>. It really tidies up the reflists. Headbomb {talk / contribs / physics / books} 00:10, 2 January 2011 (UTC)

I agree that it is useful for the bot to combine duplicate refs but the issue here is that the bot is generating names for refs that are only used once. This is useless clutter, and worse, may not fit in with the naming scheme being used by the article's human authors. I am reverting this where I see it. SpinningSpark 20:04, 2 January 2011 (UTC)
Please continue this discussion at User_talk:Citation_bot#Naming_refs_.28which_aren.27t_reused.29. Thanks! Martin (Smith609 – Talk) 13:16, 3 January 2011 (UTC)

{{resolved}}

Citation bot down?

I tried running it and it said "Activated by Alpha Quadrant" but doesn't do anything else. I waited a half hour, but nothing happened. I tried a few other articles to see if it was just that particular one, but I got the same message. Is the bot down? Thanks, Alpha Quadrant talk 03:34, 2 January 2011 (UTC)

Whilst I try to get this fixed, you can use the last stable version by replacing "DOI_bot" in the address with "citation-bot". Martin (Smith609 – Talk) 14:37, 2 January 2011 (UTC)
{{fixed}} Martin (Smith609 – Talk) 15:03, 2 January 2011 (UTC)
That was fast, thank you for fixing it. Alpha Quadrant talk 17:22, 2 January 2011 (UTC)

{{resolved}}

Vcite template support?

There is a parallel set of cite templates ({{vcite journal}}, {{vcite book}}, etc.). These behave much like {{cite journal}}, etc. except that they take certain shortcuts in order to reduce page load times. The syntax used in the vcite templates is identical to the cite templates except that support for complex parameters such as "first1, last1, editor1-first, editor1-last" is dropped and instead one must use simple "author, editor", etc. parameters. I tried running Citation bot on an example article (see nuclear receptor) where I have replaced cite with vcite templates, but Citation bot doesn't seem to recognize these templates. Would it be possible to extend the Citation bot so that the vcite series of templates are recognized and updated? Thanks. Boghog (talk) 09:34, 9 January 2011 (UTC)

See the archives for details of what would be required. Martin (Smith609 – Talk) 16:39, 9 January 2011 (UTC)
I have searched for a discussion about vcite in your archives, but I cannot seem to find it. Could you provide a link? Boghog (talk) 16:58, 9 January 2011 (UTC)
OK, by googling, I found this. Is this what you were referring to? Boghog (talk) 17:14, 9 January 2011 (UTC)
Yes, thanks for dredging that up. If you're happy to take on the maintenance -- brilliant! Martin (Smith609 – Talk) 17:29, 9 January 2011 (UTC)
I am not sure I am willing to take on that responsibility ;-) In any case, a quick fix is to temporarily change the vcite into cite templates, run the bot, and then switch back to vcite templates and retain of course any changes that the bot made. That should be at least a good short term solution. Thanks for your responses. Cheers. Boghog (talk) 19:18, 9 January 2011 (UTC)

{{resolved}}

Special symbols

When expanding doi:10.1007/BF01397171, the bot took letters ü as �, but I could copy/paste those letters from the doi-target page to wikipedia. Is it possible to preserve such symbols or they are lost somewhere at crossref? Materialscientist (talk) 11:07, 15 September 2010 (UTC) {{resolved}} in r249.

Redundant data?

Seems to have the Jstor number thrice, in the second citation changed. Rich Farmbrough, 04:38, 18 September 2010 (UTC).

Four times... Rich Farmbrough, 04:41, 18 September 2010 (UTC).
{{resolved}} in r253. Martin (Smith609 – Talk) 21:07, 12 February 2011 (UTC)

Google Books URLs again

Two kinds of edits to Google Books URLs are happening. After a recent edit by the bot, I tested each of the edited URLs and found none of the edits were correct. One type of edit was previously discussed. The other is visible in this comparison of diffs from before the bot ran until after I edited to the correct URLs, thus bypassing the immediate result of the bot, thus showing the additional kind of edit by the bot. Thanks. Nick Levinson (talk) 06:03, 23 September 2010 (UTC)

In each of the changes in the first edit, the bot seems to have removed the redundant parameter &hl=en; the desired behaviour. In the second edit, the bot appears not to have changed "false" to "true", which you modified yourself. I cannot determine the difference that this makes to the page rendered by Google but I may be missing something. Is there an algorithm that the bot can use to determine, for each link, whether to modify the "f" parameter from that set by the initial editor? Or perhaps the bot should remove it entirely, as it does not seem to 'do' anything? If so, I'll be happy to implement it. Martin (Smith609 – Talk) 12:42, 23 September 2010 (UTC)
In each case, I tested by pasting the bot-generated URL into a new browser tab and not by typing it, copying it from another source, or doing a new search in a search engine. Thus, it was the bot-generated URL that was redirected by Google to another URL, which means that, from Google's perspective, the bot is making errors for both kinds of changes that Google effectively rejects by redirecting. If Google wants the directory structure and the two parameters the way they are after the redirect, then we should supply them to be sure of getting the page even if Google stops redirecting to it because of low traffic through the redirect. Nick Levinson (talk) 15:53, 23 September 2010 (UTC)
I've not been able to find any indication that Google are planning to change their URL structures. Perhaps you could point me to the details? Martin (Smith609 – Talk) 12:55, 27 September 2010 (UTC)
It's the bot or its user that's making assumptions about Google's directory/parameter structure and those assumptions don't match what Google is doing now. The URLs I entered into the articles are the ones Google was using at that moment, both before and after the bot ran. The URLs generated by the running of the bot are not what Google is using. Therefore, the speculation is the bot's or bot user's.
I don't understand why anyone or any bot should be changing any URL to any form not preferred by a destination website owner. It generally either will make no difference, and therefore is a waste of editorial time, or will lower the ability of Wikipedia users to access files the URLs represent.
There's a case where I change URLs. One forum produces a URL after a word search that arranges to highlight those words in a topic. When removing the hilite parameters, the resulting URL works just as well at accessing the page and provides a usually-clearer page because we can read the topic without highlights that may be irrelevant to a reader's purpose. But in that case I test the resulting URL to be sure it works before posting it, so there'd be no need for a bot to change it. That same principle, of using the site's preferred form, applies to any website that I know of. I don't know of an exception warranting the edits this bot is doing.
Testing URLs to be sure they work (including Google Books URLs that use what's believed to be an old URL structure), marking those that fail or that succeed only through permanent redirection, and proposing alternatives are appropriate. That's because redirection involves a visitor's browser, since the visitor is (silently) forced to go to the new address, so it's possible to discover the fact of redirection and the fact of its intended permanence.
What I'm doubting is altering a URL that was probably generated recently by the destination website itself and is therefore probably the best URL without the bot's changes.
Thanks. Nick Levinson (talk) 05:03, 28 September 2010 (UTC)
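The URL testing suggested above (checking whether a link works, fails, or succeeds only through a permanent redirect) could be sketched along these lines. This is a hypothetical illustration in Python, not anything the bot is known to do: `classify_response` and `check_url` are made-up names, the 10-second timeout is arbitrary, and it assumes the server answers HEAD requests the same way it answers GET, which is not guaranteed.

```python
from http.client import HTTPConnection, HTTPSConnection
from urllib.parse import urlsplit

# Standard HTTP status codes for redirects.
PERMANENT_REDIRECTS = {301, 308}
TEMPORARY_REDIRECTS = {302, 303, 307}


def classify_response(status):
    """Map an HTTP status code to a coarse label for link checking."""
    if status in PERMANENT_REDIRECTS:
        return "permanent-redirect"
    if status in TEMPORARY_REDIRECTS:
        return "temporary-redirect"
    if 200 <= status < 300:
        return "ok"
    return "failed"


def check_url(url, timeout=10):
    """Request url once, without following redirects.

    Returns (label, location), where location is the redirect target
    reported by the server, or None if there is none.
    """
    parts = urlsplit(url)
    conn_cls = HTTPSConnection if parts.scheme == "https" else HTTPConnection
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        # http.client does not follow redirects, so the first response is seen.
        conn.request("HEAD", path)
        response = conn.getresponse()
        return classify_response(response.status), response.getheader("Location")
    finally:
        conn.close()
```

A checker like this would only mark a link and propose the server's own target as an alternative; it would not rewrite the URL on its own assumptions.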

{{resolved}}

Completely wrong

The details are completely wrong for this DOI Template:Cite_doi/10.1007.2F978-3-540-89982-2_59. pgr94 (talk) 11:07, 25 November 2010 (UTC)

Wow, that was complex. The ISBN was for a book of conference proceedings containing the paper, which was incorrectly cited with {{cite journal}}. That paper is offered at the DOI given; one doesn't have to order the entire book. The JSTOR number seemed to bear no relation to that paper. The journal field and author1 were populated from the title and author results of the unrelated JSTOR lookup. The other author fields were correctly populated from the DOI. The series title was missed entirely. I've manually edited it for now. LeadSongDog come howl! 20:13, 8 February 2011 (UTC)
{{resolved}} - sounds like this could have been avoided by judiciously-used templates. Martin (Smith609 – Talk) 21:03, 12 February 2011 (UTC)

Diacritic characters in author names

At this edit the bot seems to have mangled the diacritics in author names. LeadSongDog come howl! 19:59, 9 December 2010 (UTC) {{resolved}}