User talk:Citation bot/Archive 22

This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.

Better PMID url cleanup

Status: {{fixed}}
Reported by: Headbomb {t · c · p · b} 16:25, 14 July 2020 (UTC)

What should happen: [1]
We can't proceed until: Feedback from maintainers

I finally looked at these. When these links work, they redirect to publisher, so they are actually a duplicate of the DOI, not pubmed ID. Curious. Will work on. AManWithNoPlan (talk) 13:50, 21 July 2020 (UTC)

https://github.com/ms609/citation-bot/pull/3316 AManWithNoPlan (talk) 15:09, 21 July 2020 (UTC)

Caps: ecancermedicalscience

Status: {{fixed}}
Reported by: Headbomb {t · c · p · b} 14:21, 28 July 2020 (UTC)

What should happen: [2]

We can't proceed until: Feedback from maintainers

https://github.com/ms609/citation-bot/pull/3351 AManWithNoPlan (talk) 20:00, 29 July 2020 (UTC)

strong tags

Status: {{fixed}}
Reported by: Headbomb {t · c · p · b} 18:40, 31 July 2020 (UTC)

What happens: Since the title didn't match, I TNT'd it. This is what I got [3]
What should happen: Remove strong tags. And the stray dots. If not automatically, it should at least be removed for purposes of title matching.
We can't proceed until: Feedback from maintainers
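For illustration only, here is a minimal sketch of the kind of title normalization requested above. It is not the bot's actual code. The helper names are hypothetical, and "stray dots" is assumed to mean periods standing alone between words, since the linked diff is not reproduced here.

```python
import re


def normalize_title(title: str) -> str:
    """Hypothetical helper: reduce a title to a form suitable for comparison."""
    # Drop <strong> opening and closing tags but keep the text inside them.
    text = re.sub(r"</?\s*strong\b[^>]*>", "", title, flags=re.IGNORECASE)
    # Assumption: a "stray dot" is a run of periods standing alone,
    # i.e. preceded by whitespace (or the start) and followed by
    # whitespace (or the end). Periods attached to a word are kept.
    text = re.sub(r"(?:^|\s)\.+(?=\s|$)", " ", text)
    # Collapse runs of whitespace and ignore letter case.
    return " ".join(text.split()).casefold()


def titles_match(a: str, b: str) -> bool:
    """Compare two titles after normalization (hypothetical helper)."""
    return normalize_title(a) == normalize_title(b)
```

With this, a title such as `<strong>Deep</strong> . Learning` would compare equal to `Deep learning` for matching purposes, while the stored title is left untouched.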

Caps: PRZ / PRZ.

Status: {{fixed}}
Reported by: Headbomb {t · c · p · b} 20:14, 31 July 2020 (UTC)

What should happen: [4]
We can't proceed until: Feedback from maintainers

This is the ISO 4 abbreviation for "Przegląd" Headbomb {t · c · p · b} 20:14, 31 July 2020 (UTC)

The Citation Bot is currently blocked because of disagreement over its usage. When will it be back up?

When will it be back up?

The Citation Bot is currently blocked because of disagreement over its usage. 101Fake101 (talk) 14:51, 4 August 2020 (UTC)

please see above. {{Duplicate Issue}} AManWithNoPlan (talk) 16:01, 4 August 2020 (UTC)

Remove duplicate citations

Status: {{wontfix}} beyond scope of this bot. I am afraid that we will become the kitchen sink of bots, and we do not have the time to keep up with yet another possible vector for bugs
Reported by: RayScript (talk) 22:28, 11 May 2020 (UTC)

What happens: I noticed there are articles with duplicate citations. It seems like it would make sense to merge these duplicate citations (as the ReFill bot does) instead of leaving them separate. There are some cases (such as different pages of books) where it makes sense to cite one source many times. However, I can't think of a case where it is useful to have the exact same citation two times. Here's an example where number 32 and 33 were identical and could be collapsed. https://en.wikipedia.org/w/index.php?title=Patrick_Combs&diff=955932647&oldid=955932567&diffmode=visual
We can't proceed until: Feedback from maintainers

I believe AWB does this already, but only on pages where refs are already "re-used". Headbomb {t · c · p · b} 22:45, 11 May 2020 (UTC)

I've also seen that AWB does this. But I don't have Windows. I can't find it now, but I've seen a few pages with many, many duplicate references. It's easy to clean those up with search and replace, but it seems like it would be worth automating via a tool like this. RayScript (talk) 22:54, 11 May 2020 (UTC)

I have seen ReFill fail spectacularly on this task (very destructive). I feel that this task would be a heavy lift of testing/writing at this time. A feature I actually have wanted for years. Not sure worth the effort since other things do it. AManWithNoPlan (talk) 00:25, 12 May 2020 (UTC)

The bot actually used to do this, adding names to references based on authors and years of duplicated citations. I believe that there was a bureaucratic kickback: perhaps someone objected that a separate BRFA was required, but had not been obtained? The advantage of Citation Bot performing it is that it can detect identical references that differ in e.g. white space only. Martin (Smith609 – Talk) 15:35, 12 May 2020 (UTC)

If the bot's purpose is to help clean up citations, removing duplicates seems like it would be a great fit: they are visibly annoying and an easy mistake for new users to make. Do you happen to remember when/where that discussion may have taken place? Before posting this I did a little searching in the archives of this talk page but didn't turn up anything directly related to removing duplicate citations. In regard to AManWithNoPlan's point, I'm not sure about the work required to make this change, but if other tools do it but are buggy/destructive (ReFill) or only available on Windows (AWB), then perhaps it could be useful to look at how the other programs are doing it and implement something similar (if the licenses permit). However, I am respectful of the developers' time, and if this isn't something they're interested in doing, that's okay. I wanted to put it out there to share that it's a feature that I would find helpful. — Preceding unsigned comment added by RayScript (talk • contribs) 16:05, 13 May 2020 (UTC)

WikiCleaner also does this semi-automatically. It detects exact duplicates and suggests they be "fixed". AWB does it automatically but needs another ref on the page already use "ref name" (where multiple refs are already re-used). Jonatan Svensson Glad (talk) 19:38, 13 May 2020 (UTC)

I think everyone is in favor of doing this. If we implement it, then we would have to get a lot of test cases. Have it only run in tool mode to start. Phase 2: combine citations that have same parameters but in different order - including blank ones. Phase 3: existing refs with different names. AManWithNoPlan (talk) 11:49, 21 May 2020 (UTC)
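As a rough sketch of the matching described in this thread (citations that differ only in white space, parameter order, or blank parameters), and not the bot's implementation: the function names are hypothetical, and the parsing assumes flat templates with named parameters only, with no nested templates or piped wikilinks inside values.

```python
from collections import defaultdict


def citation_key(template: str):
    """Hypothetical canonical form of a flat citation template.

    Assumption: the template has no nested templates or piped
    wikilinks, so splitting on "|" is safe. Unnamed (positional)
    parameters are ignored in this sketch.
    """
    inner = template.strip()
    if inner.startswith("{{") and inner.endswith("}}"):
        inner = inner[2:-2]
    name, *params = inner.split("|")
    pairs = []
    for param in params:
        key, sep, value = param.partition("=")
        value = " ".join(value.split())  # collapse white space
        if sep and value:  # skip blank and unnamed parameters
            pairs.append((key.strip(), value))
    # Sorting makes the key independent of parameter order.
    return (" ".join(name.split()).lower(), tuple(sorted(pairs)))


def find_duplicates(citations):
    """Return groups of list indexes whose citations share a key."""
    groups = defaultdict(list)
    for index, text in enumerate(citations):
        groups[citation_key(text)].append(index)
    return [indexes for indexes in groups.values() if len(indexes) > 1]
```

Given the three citations `{{cite journal |title=A |year=2020}}`, `{{cite journal | year = 2020 | title = A | doi= }}` and `{{cite journal |title=B |year=2020}}`, the first two fall into one group and the third stays separate. Merging refs that already carry different names (Phase 3 above) is not covered.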

Hi, anybody out here who knows what happened to reFill? Thank you for your time. Lotje (talk) 11:22, 11 July 2020 (UTC)

Removing links from title

Why is the bot removing links from titles of articles? There is clear consensus that editors want titles to be linked to the best available online source, especially when that is free to read. --RexxS (talk) 17:57, 7 June 2020 (UTC)

Agreed. The bot is removing other editors' work and is preventing users from freely accessing information. If nobody corrects this quickly, I suggest that editors remove doi's from citations that have direct links to titles. Corker1 (talk) 20:07, 29 July 2020 (UTC)

If you mean the recent RfC to link DOIs with doi-access=free, that's something the cite templates need to do (the sooner the better, I add), but the usage of the "url" parameter remains the same. I agree however with the sentiment that it's not particularly helpful to migrate URLs from the url parameter to an identifier when there is no method yet to mark those URLs as able to provide an open access copy (several identifiers don't have an -access=free parameter). Nemo 18:29, 7 June 2020 (UTC)

I mean Help talk:Citation Style 1 #Auto-linking titles with free DOIs and Wikipedia:Village pump (proposals)/Archive 167 #Auto-linking titles in citations of works with free-to-read DOIs where it is abundantly clear that nobody should be removing links from titles in citations without exceedingly good reason. I believe such behaviour is contrary to consensus and disruptive. I will take action to prevent that if necessary. --RexxS (talk) 18:57, 7 June 2020 (UTC)

Well, the first thing you can urgently do is to transfer the change approved by consensus from the sandbox into the actual template. That will make it easier to respect consensus. Nemo 19:09, 7 June 2020 (UTC)

In the meantime, the onus to respect consensus not to unlink titles falls on whoever is doing the unlinking. It shouldn't be necessary to have to seek sanctions against established editors when they can simply desist from disruption until the changes are made to avoid unlinking titles. --RexxS (talk) 19:32, 7 June 2020 (UTC)

The RfC only asked about the effects of the doi-access parameter. Maybe you're right and a new consensus was formed which was broader than that, but if so it wasn't immediately clear so I hope this won't escalate. It will be easier to process the effects of any broader consensus on further identifiers after we've deployed the change on DOIs, I think: discussing everything at once won't work. Nemo 20:03, 7 June 2020 (UTC)

The Semantic Scholar people actually asked for us to do this. I personally do not see the consensus that you claim to exist based upon the discussion (I suggest others chime in), and I believe that if this had been raised as a possible interpretation, then things would have exploded. Lastly, I do not have time at this point to make changes to the bot to implement a new consensus. AManWithNoPlan (talk) 20:06, 7 June 2020 (UTC)

"I personally do not see the consensus that you claim to exist based upon the discussion." Oh, really? How do you read these comments then?

"As a reader, it is natural to click on the title of a citation to access it. Clicking on identifiers is less intuitive, even when they are marked as free" – User:Pintoch
"In all of our articles, readers generally know that a blue-linked title takes them to free full text. We cannot expect our readers to understand (in medical content) what PMC, PMID, DOI or anything else stands for." – User:SandyGeorgia
"the simplest thing for a reader to learn is that clicking on a title link takes them to the freest available online source." – User:RexxS
"this can significantly improve usability for those whose first instinct will be to click on the title." – User:Forbes72
"it's overwhelmingly obvious that a linked title is closer to user expectation than a linked [insert weird number/abbreviation most have never heard of]" – User:Ocaasi
"The less-informed reader will end up at a site where he/she/they can read the full article" – User:Markworthen
"This is my default way to indicate that links are free." – User:Buidhe
"this is the best indicator that more information is available.....no run around links is best" – User:Moxy
"standard web formatting is that the linked title takes you to the actual article being referenced." – User:PresN
"I would like readers to have the simple benefit of clickable titles whenever there is a free and legal full-text version available for them to access." – User:Biosthmors
"Clicking on a linked source title is the intuitive interaction." – User:czar
"Agree that clicking on the title is far more intuitive than clicking on the identifier" – User:The wub
"readers are used to clicking on titles." – User:PamD
"a general rule that almost all Wikipedia article citations follow: if the source is available online then it's linked through its title" – User:Bilorv

There is no need to have a fresh RfC on exactly the same issue every time another identifier parameter is added to the citation template. I'm absolutely certain that the general sentiment is in favour of having the title linked. I'm asking you politely to respect that and not unlink any further titles. But I will take whatever steps are needed to prevent further disruption if it continues. --RexxS (talk) 21:57, 7 June 2020 (UTC)

It is very discouraging to have to go through this all over again, after We Just Did This. It is also discouraging to get answers non-BOT people can't always decipher.[5] I only want to provide links to full articles, when possible, for the benefit of our readers. Can anyone explain to me why Semantic Scholar is in a position to be dictating how we link on Wikipedia? SandyGeorgia (Talk) 22:07, 7 June 2020 (UTC)

I think that the consensus was to generally link in url all free access identifiers, but the hierarchy (if there are multiple free access identifiers) also needs consensus. For instance, a free doi is usually better than semantic scholar, which doesn't always do a good job keeping copyrighted material off their website. If a source has both free doi and semantic scholar, the title should link to the doi and it's entirely correct to remove the semantic scholar from url. buidhe 22:13, 7 June 2020 (UTC)
- Buidhe, that is not what is occurring in any of the examples I give above; titles that have free links are simply being unlinked. Again. SandyGeorgia (Talk) 22:17, 7 June 2020 (UTC)
  - I've now blocked AManWithNoPlan indefinitely and put the block up for review at Wikipedia:Administrators' noticeboard/Incidents #Block review of AManWithNoPlan and I'm prepared to block the bot as well if that proves necessary. We should not have to re-litigate the same concerns every time a new identifier is added to the CS1 templates, because nobody should be unlinking the title in citations. --RexxS (talk) 22:34, 7 June 2020 (UTC)
Note that the bot was converting two kinds of URLs to SemanticScholar identifiers: www.semanticscholar.org/paper/ (which may not have a full text) and pdfs.semanticscholar.org (which are generally the actual full text, except when they redirect to the previous). I assume the concern is only about the links which actually go to an open access full text, but it's not always trivial to tell one from the other. There are 9000 articles containing either kind of link at the moment, let's not get everyone blocked who stumbles upon one. Would it help to stop the conversion to s2cid for now, and rethink this after the RfC has been implemented? Nemo 22:34, 7 June 2020 (UTC)
- From reading through this thread, stopping the conversion sounds like a good idea until this is sorted out. But I don't think anyone would object if the bot behavior was changed to adding the s2cid parameter while keeping the existing url. ⟨ Forbes72 | Talk ⟩ 22:45, 7 June 2020 (UTC)
- @Nemo bis: The problem was not that the bot was being used to add SemanticScholar identifiers; the problem was that it was also removing the url parameter and hence unlinking the title. No doubt at some point in the future, it will be able to remove url when it adds doi, pmc, s2cid, etc. because all of the free identifiers will be coded to auto-link the title. But until that happens, I don't agree that degrading the experience for readers must be a necessary consequence of adding a free identifier. --RexxS (talk) 22:52, 7 June 2020 (UTC)
- Nemo, see my examples ... real URLs to free links were removed. SandyGeorgia (Talk) 23:22, 7 June 2020 (UTC)
  - I've now found that Headbomb has been running the bot to make hundreds more edits after I alerted him to the unlinking of titles. I've blocked the bot for now, and I'm requesting sanctions against Headbomb if he fails to take steps to restore the links he is responsible for removing. --RexxS (talk) 00:45, 8 June 2020 (UTC)
    - There is a certain oddity to the Citation Bot being threatened with being shut down for adding S2 links because they were considered either useless DOI duplicates or copyright violations, and now it gets shut down for converting them to ID numbers. The paradox can be resolved, but still it is an odd situation which calls for discussion. AManWithNoPlan (talk) 01:12, 8 June 2020 (UTC)
      - I have been asking about this for more than a day, and we have had considerable discussion on your talk ... and just now hearing there is a copyright issue? If there is a copyright issue, and if these links are useless DOI duplicates, why are we adding them at all? Why have we honored a company that violates copyright with its own link, which is just adding clutter to citations? Could someone please indicate where these copyright concerns were raised? It seems to me that this s2cid identifier is nothing more than clutter, and I still don't understand why we have semantic scholar driving our citation style. SandyGeorgia (Talk) 01:41, 8 June 2020 (UTC)

The copyright problem has many facets. S2 has copies of licensed papers and of papers scraped from the web. Wikipedia rules say not to link to copyright-infringing works (there are exceptions); that covers the scraped ones. This bot won't add links to CiteSeerX for that reason. Similarly, it won't add S2 links. There are editors who actively remove these links if they cannot verify a license. S2 now has an API for determining licenses (see discussion above), and so there is a debate on adding those. The argument against is that all legal S2 links are also free from the publisher. AManWithNoPlan (talk) 11:26, 8 June 2020 (UTC)

As long as we are discussing links and such, another reason ID conversion got started was because linking to PDFs instead of landing pages violates disability access rules. Is that still true? AManWithNoPlan (talk) 11:39, 8 June 2020 (UTC)

It's the opposite: direct links to PDFs work fine with screenreaders because the PDF gets downloaded immediately and the PDF reader can perform text-to-speech, while links to the landing page are not accessible because the link to full text may be hidden behind JavaScript without any semantic in HTML, so the link often cannot be followed with the keyboard. Nemo 18:37, 8 June 2020 (UTC)

Interesting that the PDF link policy has reversed, although it has been years since I read that. AManWithNoPlan (talk) 19:01, 8 June 2020 (UTC)

I've not made claims about a general policy, I'm just describing accessibility considerations for this website. With other open archives, the landing page is often more accessible because it has semantic HTML, which is probably why Citation bot traditionally prefers to link the landing page. I don't know what's the policy, although I would hope that accessibility trumps other considerations. Nemo 19:18, 8 June 2020 (UTC)

Accessibility is not the only concern. You get to see a very light HTML document with the abstract, which lets you decide if you want to download the PDF or not [which tend to be much larger]. As well as all the other benefits from the Semantic Scholar website, like share buttons, related papers, etc... Headbomb {t · c · p · b} 14:50, 10 June 2020 (UTC)

"Very light" sometimes, but definitely not in the case of Semantic Scholar, which downloads about 1 MB for JavaScript only. PDFs are often smaller than that. I just checked a sample of 100 pdfs.semanticscholar.org URLs and their median content-length was a bit less than 500 KiB, so less than half the size of the landing pages. Nemo 22:29, 13 June 2020 (UTC)

I hope that is a one-time download of JavaScript. A lot of websites these days are just something like <body><run javascript="BigFile.js" /></body>, but the script is the same for all pages. AManWithNoPlan (talk) 23:46, 13 June 2020 (UTC)

That doesn't help the occasional visitor. Most Wikipedia users are not regulars at Semantic Scholar. Nemo 10:33, 14 June 2020 (UTC)

True. More important is that PDF links expire. AManWithNoPlan (talk) 15:04, 15 June 2020 (UTC)

{{fixed}} for S2CID and we can start this discussion anew as needed. AManWithNoPlan (talk) 13:58, 17 August 2020 (UTC)

June 2020

You have been blocked indefinitely from editing for persistently making disruptive edits.

If you think there are good reasons for being unblocked, please read the guide to appealing blocks, then add the following text below the block notice on your talk page: {{unblock|reason=Your reason here ~~~~}}. RexxS (talk) 00:27, 8 June 2020 (UTC)

lol… here we go again — Chris Capoccia 💬 01:36, 8 June 2020 (UTC)

can we at least get an example of what this "disruptive edit" would be? — Chris Capoccia 💬 01:42, 8 June 2020 (UTC)

Read two sections up, or the ANI currently on, or AManWithNoPlan's talk page. The issue is being discussed in about five places already. And typically, a result of bot operators not communicating very clearly when editors raise concerns or ask questions ... SandyGeorgia (Talk) 01:45, 8 June 2020 (UTC)

exciting… maybe everything will be back in order by august. the bot has been moving urls to identifiers for ages. — Chris Capoccia 💬 02:08, 8 June 2020 (UTC)

Where is the official conversation occurring? —¿philoserf? (talk) 03:38, 8 June 2020 (UTC)

Blocking this account is causing disruption. User:RexxS can you please unblock it? It does other useful work for me too. Graeme Bartlett (talk) 09:22, 8 June 2020 (UTC)

@Graeme Bartlett: I understand that the bot does useful work, which is why I was loath to block it in the first place. Nevertheless it is currently set to unlink titles in citations, contrary to the principle that we should provide a free-content link in the title, as clearly demonstrated in the RfC quoted two sections above. I do not believe it is reasonable to start the bot running again until such time as it no longer removes those links. I hope you will understand the difficulty in fixing damaging edits made at high speed, and agree that such edits should not occur. --RexxS (talk) 15:36, 8 June 2020 (UTC)

moving redundant links to identifiers is what this bot has been doing for ages! if you put a link that duplicates a DOI, it gets rid of it and lists the DOI. same for JSTOR, PMID, and all the rest. it's not new behavior that should be needing some new approval. there are probably a million citations formatted this way. — Chris Capoccia 💬 16:26, 8 June 2020 (UTC)

That's because the bot users didn't believe that the title needed to be linked where possible to a free text source. But when last month's RfC at Wikipedia:Village pump (proposals)/Archive 167 #Auto-linking titles in citations of works with free-to-read DOIs overwhelmingly endorsed the principle of linking in the title, the bot users should have modified its behaviour to comply. It's not as though they were unaware of the sentiment. I was therefore surprised to see the bot removing more links from titles yesterday. It needs to stop and the lost links should be restored. --RexxS (talk) 16:39, 8 June 2020 (UTC)

so we're blocking the bot because the part-time programmers haven't updated the bot yet? seems kind of silly. at the rate this bot's owners are involved, it will easily be 2 months before it's ready to go again. meanwhile, lots of bare URLs used in many articles that are just going to collect unformatted. — Chris Capoccia 💬 17:52, 8 June 2020 (UTC)

The alternative is to see hundreds or possibly thousands of links to free text being removed from citation titles. The bare urls will be fixed as soon as the bot is working again, but there's been no fix supplied for restoring the links it removed. --RexxS (talk) 18:07, 8 June 2020 (UTC)

I was under the impression that the vast majority of redundant URLs that the bot is removing are behind paywalls (sciencedirect.com, elsevier.com, etc.). If there is no freely accessible source, the title should not be linked as this results in needless MOS:SEAOFBLUE and unnecessary bloating of citation templates. Most URLs that are included in citation templates are not carefully selected but rather automatically added by citation template generators that are completely oblivious to whether the source is freely available or not. The meaning of the [lock] icon next to document identifiers is immediately obvious. So why are we linking titles at all? Boghog (talk) 19:20, 8 June 2020 (UTC)

The bot is removing links to free text sources from the title in citations. You call that redundancy and I call it contrary to the community's clearly expressed view that titles should be linked. If you read the RfC linked above or even just check the quotes I provided above, you'll see that your view was thoroughly rejected. --RexxS (talk) 19:36, 8 June 2020 (UTC)

My view was not thoroughly rejected as I was one of several dissenting voices that participated in that discussion. Boghog (talk) 20:13, 8 June 2020 (UTC)

Your view was one of a handful of dissenting voices whose arguments were rejected. Support was so overwhelming that the RfC was closed early. --RexxS (talk) 22:53, 8 June 2020 (UTC)

@RexxS: The survey that you point to was flawed because it was not neutrally worded. A closely related RfC that was neutrally worded ended in no consensus. Boghog (talk) 06:39, 9 June 2020 (UTC)

@Boghog: The RfC was not flawed and nobody objected to the wording, which was a neutral description of Pintoch's proposal. You contributed and had the opportunity to object to the wording at the time, and it is disingenuous to try to do so retrospectively. That RfC was also recent and held at a central location, Village Pump, unlike the year-old RfC you cite at the talk page for CS1 citations. The RfC you quote attempted to remove autolinking for the title when the pmc parameter was used, and failed to find consensus to do so. There is no doubt that a large majority of editors want to see titles linked to free content where possible, just as there is no doubt that the bot does not have approval to remove such links. --RexxS (talk) 17:06, 9 June 2020 (UTC)

Taking into account both RfCs, the opinion is not as one sided as you make it out to be. Boghog (talk) 17:17, 9 June 2020 (UTC)

The old RfC you quote contained exactly one editor arguing that titles should not be linked: and that was you. Considering you also expressed your view in the new RfC, it doesn't alter the balance of overwhelming support for the principle of linking titles one jot. In fact, in the old RfC Colin, who didn't contribute to the recent RfC, stated "Readers expect article titles to be URLs if they can read them. Hieroglyphics at the end of the citation are impenetrable to normal folk". Taking the old RfC into consideration actually makes the support for the principle of linking titles even more one-sided. --RexxS (talk) 20:13, 9 June 2020 (UTC)

@RexxS: exactly one editor arguing that titles should not be linked: and that was you – clearly false, there were a number of other editors that shared my opinion. Why are you fixated on crushing others that don't agree with you? With that kind of attitude, no wonder why WP:MED has become so dysfunctional. Can't we agree to disagree? Boghog (talk) 12:07, 10 June 2020 (UTC)

Yes, we can. --RexxS (talk) 12:23, 10 June 2020 (UTC)

I think you're both right. The bot generally removes paywalled URLs which are redundant with the DOI, but in the last few days it was particularly busy removing (legitimate) (mostly green open access) repository links now supported by a specific parameter. Nemo 19:39, 8 June 2020 (UTC)

The issue is that the bot has been converting URLs into unique identifiers for over a decade. With the advent of the S2 URLs people have decided they do not want those converted, or at least not dropped after conversion. For a historical note, almost all the URLs getting converted were added by citation bot, before the copyright enforcers realized that a substantial majority of them are copyright infringements. A lot of the title links people have complained about being removed (not all) fall into the infringement category. There is a discussion above about having the bot add in licensed S2 IDs, but there are quite a few people who believe that S2 should almost never be linked. AManWithNoPlan (talk) 19:53, 8 June 2020 (UTC)

Boghog, your impression is incomplete ... the bot was removing my carefully selected free text URLs from Featured articles. SandyGeorgia (Talk) 15:12, 9 June 2020 (UTC)

@RexxS:, I would say the village pump proposal didn't get enough attention from users of Citation Bot. I never heard about it until now. — Chris Capoccia 💬 20:09, 8 June 2020 (UTC)

@RexxS:, Will you at least agree that (1) titles should not be linked to sites that are behind paywalls and (2) no links should be included that infringe copyrights. Most of the URLs that citation bot has removed fall into one or the other of those categories. Boghog (talk) 20:28, 8 June 2020 (UTC)

@Chris Capoccia: I'm sorry that you were unaware of the RfC at the Village Pump. It is generally regarded as the most central location to discuss proposals and to involve the broadest possible participation. I understand that it is difficult to ensure that everybody becomes aware of central discussions, but if you skim through the contributors, you'll find other users of Citation bot participated. --RexxS (talk) 22:53, 8 June 2020 (UTC)

@Boghog: (1) I'm agnostic about links to sites that are paywalled, but my concern is the free-to-read source links that the bot removes. (2) We should not be linking to sites that infringe copyrights, but my concern is the legitimate source links that the bot removes. (3) I disagree that most of the links that the bot removed fell into either of those two categories, my sampling indicates the opposite to be true. In any case, it should not be unlinking any titles that point to legitimate free-to-read sources. Do you contend that it has approval to do so? If so, a diff would be useful. --RexxS (talk) 22:53, 8 June 2020 (UTC)

i see citations where some user finds a pdf and blocks the pmc title linking by adding a URL where the article already had correctly listed pmid, doi & pmc. and goobers are objecting saying "what's a pmc? why are you deleting my url?" or someone uses the pnas or lancet link to full article for something that already has pmc. it's pointless putting these urls in when there are persistent identifiers like pmc. are you trying to have a title link for every citation? — Chris Capoccia 💬 13:13, 9 June 2020 (UTC)

That is not my situation. I provide a free full text URL when there is not a PMC. Those were being removed. AMWNP explained I can prevent that by putting a comment in the URL field. SandyGeorgia (Talk) 15:14, 9 June 2020 (UTC)

I'm one of those goobers who doesn't understand what a PMC is or how it's different from a PMID or a DOI or a JSTOR link. When I see a citation that has multiple identifier links but not a title link, I get confused and I don't know what to click on. What I want is to click on the title and be brought to a free copy of the source, if available, or if not, to the "official" or otherwise best copy of the source. I'm pretty sure most readers are with me on this. Levivich ^{[dubious – discuss]} 18:47, 10 June 2020 (UTC)

See PMID vs PMCID. The first is a link to PubMed, the second to PubMed Central. PubMed Central is a repository of freely-accessible articles. PMID is a general database that contains metadata and other things, but doesn't itself contain articles. Headbomb {t · c · p · b} 21:14, 10 June 2020 (UTC)

Headbomb, thanks for responding, but I think you've missed my point completely. I don't know which link to click on because I don't know the difference between PMID and PMC, etc. The answer to that is not to educate me about the difference. Let me rephrase: as a reader, I don't know, I don't want to know, I don't care, and I don't want to spend the time/effort to learn, the difference between different identifiers like PMID and PMC. I want to read the source article. I just want to know which link I should click on to read the source article. Presenting me with a series of links with acronyms doesn't tell me that. Don't get me wrong, the options should be there for those who know what they are and want to access the different versions. But there should be, for every citation, one link, one "main" link, that the reader clicks on, and that link should be under the title of the work. And, again, I'm pretty sure most readers are with me about wanting one link, and not wanting to learn the difference between PMC and PMID. Levivich ^{[dubious – discuss]} 21:27, 10 June 2020 (UTC)

PMC has a green lock next to it indicating it's free. It also automatically generates a link on the title. Headbomb {t · c · p · b} 21:36, 10 June 2020 (UTC)

Again, something I don't know and don't want to learn. The solution to "I don't know which of multiple identifiers is the free one" is not to expect more learning from the reader about our citation formatting. That's what "user friendly" and "intuitive UI" is all about. Everyone already knows to click the title to get to the source. It's not reasonable, or desirable, to expect every reader to learn that PMID is a repository and PMC is a database and if it's a green lock it means it's free, if it's a red lock it means it's not free, etc. The solution to "the user doesn't know how to use it" is not to educate the user. It's to make the function simpler. In this case, we're moving from something everybody already knows how to work (click the title), to something very few people know how to work (wikilinked acronyms and lock symbols). Levivich ^{[dubious – discuss]} 21:41, 10 June 2020 (UTC)

I have been following this discussion for a few days since it came up on my watchlist. I must say that I agree totally with Levivich here and I feel that the approach used by the bot and supported by Headbomb is completely wrong headed. Our focus should be on making things as easy and intuitive as possible for our readers and it is totally unreasonable to expect those readers to understand or even be remotely interested in the finer points of some obscure, arcane system of referencing sources. Where a free use version of a source is available, anything but a direct link to that source through a single click on the title is simply unacceptable. - Nick Thorne ^talk 01:08, 11 June 2020 (UTC)

I didn’t want to say anything, but since this account has been blocked for related issues: There were two issues with this edit:

It removed the URL to an open access version of a currently paywalled journal entry. It’s important that this article, which discusses a very contentious issue (the Effectiveness of Alcoholics Anonymous), uses open access publications in its footnotes as much as possible so other editors can read the footnotes and correct any errors or biases I make in my edits.
It incorrectly called The Atlantic a “Journal” (it’s a magazine)

While I have repaired the damage, these changes did reduce the quality of the article. SkylabField (talk) 00:09, 11 June 2020 (UTC)

SkylabField, The Cite web that got changed to Cite journal by the bot was an improvement. You completed the improvements. See the conversation for details on why it may be correct to remove an open access link to copyrighted paywalled content. —¿philoserf? (talk) 00:26, 11 June 2020 (UTC)

I think that I disagree. This url links to a pdf copy of the abstract and plain-language summary sections of the full report; it is not the full report, just two sections of it:

https://www.cochranelibrary.com/cdsr/doi/10.1002/14651858.CD012880.pub2/pdf/abstract

cs1|2 converts the doi to this url:

https://doi.org/10.1002%2F14651858.CD012880.pub2

which in turn gets redirected via this url to an html version of the abstract and plain-language summary sections of the full report:

https://www.cochranelibrary.com/cdsr/doi/10.1002/14651858.CD012880.pub2/full

In the html version you can see that there are seven additional sections plus appendices (right column in the grey box).

—Trappist the monk (talk) 00:41, 11 June 2020 (UTC)

I’m only seeing the abstract and plain language summary in the HTML file linked to; I am not seeing additional sections. There is nothing over here in the HTML version not available in the linked PDF version. SkylabField (talk) 01:08, 11 June 2020 (UTC)

Column on the right side of this page, grey box has this list of article sections

Abstract

Plain language summary

Authors' conclusions

Summary of findings

Background

Objectives

Methods

Results

Discussion

Appendices

Everything except the abstract and plain-language summary sections is behind the paywall. All are linked and, when clicked, offer institutional-user sign-in or purchase options for everyone else.

—Trappist the monk (talk) 01:24, 11 June 2020 (UTC)

Thanks to everyone who keeps the bot in order. Blue Rasberry (talk) 13:37, 11 June 2020 (UTC)

Arbitrary break

When can we expect the unblock of Citation Bot? Grimes2 (talk) 15:37, 14 June 2020 (UTC)
- I have been waiting for the big OAuth changeover to be done before asking, which is now done. Also, I have been busy doing a conversion from PHP 5.6 to 7.2, which is now done. The bot has been changed so that the S2 title-link will only be removed if one of these two is true:

There is an active auto-link (Currently PMC, but soon will be more I have heard via |doi-access=free and such).
The S2 page is not licensed by S2, and was just a web-scrape.
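
The two conditions above amount to a simple either/or rule. A minimal sketch in Python, not the bot's actual code (the bot itself is written in PHP): the function name and both boolean inputs are hypothetical stand-ins for whatever the bot checks internally.

```python
def may_remove_s2_title_link(has_active_auto_link: bool,
                             s2_copy_is_licensed: bool) -> bool:
    """Return True if the S2 URL may be dropped from the title.

    Hypothetical helper: both inputs are assumed to be computed elsewhere.
    """
    # Condition 1: another identifier (currently PMC) already links the title.
    if has_active_auto_link:
        return True
    # Condition 2: the S2 page is an unlicensed web-scrape.
    if not s2_copy_is_licensed:
        return True
    # Otherwise keep the URL so the title stays linked.
    return False
```

Under this reading, a licensed S2 copy with no other auto-link keeps its title link.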

It is worth noting that the bot currently does not work since the OAuth tokens have not yet been updated by the bot operator (he has been asked). Similarly, the URL expansion part is completely down, and many tools will not work because of the DNS changes. AManWithNoPlan (talk) 14:48, 15 June 2020 (UTC)

Can you please translate this for us lesser mortals who have no idea what the hey S2 and PMC actually are? Oh, and I don't like the sound of your first condition, but that may be because you choose to use cryptic descriptors. - Nick Thorne ^talk 13:20, 17 June 2020 (UTC)

S2 is Semantic Scholar. PMC is PubMed Central. Headbomb {t · c · p · b} 13:28, 17 June 2020 (UTC)

This user's unblock request has been reviewed by an administrator, who declined the request. Other administrators may also review this block, but should not override the decision without good reason (see the blocking policy).

Citation bot (block log • active blocks • global blocks • contribs • deleted contribs • filter log • creation log • change block settings • unblock • checkuser (log))

Request reason:

bot changed to only remove URL if a PMC link will take its place or if Semantic Scholar link is unlicensed. If other auto-linking flags such as doi-access=free create a link, we will recognize that in the future too. AManWithNoPlan (talk) 15:25, 19 June 2020 (UTC)

Decline reason:

Despite several requests, there has been no evidence shown of bot approval for the removal of links from citation titles. If such approval does actually exist, or it is sought and granted, please show the evidence in a new unblock request. If the bot is changed to remove these link removals, please make a new unblock request. Boing! said Zebedee (talk) 21:52, 25 June 2020 (UTC)

If you want to make any further unblock requests, please read the guide to appealing blocks first, then use the {{unblock}} template again. If you make too many unconvincing or disruptive unblock requests, you may be prevented from editing this page until your block has expired. Do not remove this unblock review while you are blocked.

@RexxS: Seeing as you blocked, your thoughts? The issue appears to have been dealt with. CaptainEek ^{Edits Ho Cap'n!}⚓ 23:25, 24 June 2020 (UTC)

@CaptainEek: The issue I blocked for was the removal of links from citation titles, a task for which the bot is not authorised, nor is there any consensus for it. Has that issue now been resolved unambiguously? and will the bot be editing only within its authorisation in future? --RexxS (talk) 23:38, 24 June 2020 (UTC)

The bot will only delink S2 from the title during conversion to |S2CID= if one of these is true: the PMC is linked in the title (there was a general liking of this idea) OR the S2 url is not a publisher approved copy (this will catch people off guard at times, but those links do violate WP policy). If other flags like doi-access=free start auto-linking titles, I assume that removing the URL during S2 conversion will make sense, just as it does with PMC auto-linking now. AManWithNoPlan (talk) 23:57, 24 June 2020 (UTC)

Thanks for the update, AManWithNoPlan. Three questions then:

1. One of the bot edits I complained about removing citation title links was this one, which removed the link from the citation title, and was not related to s2cid. Are we now certain that it won't remove any more links from citation titles, with the possible exception of links pointing to copyvios at Semantic Scholar?
2. You state that it will remove the link from a citation title when it points to a copyvio at Semantic Scholar. Where is the bot approval for that task?
3. If the answer to 1 is in the negative, where is the bot approval for that task?

I think that satisfactory answers to those questions should be essential prerequisites to any unblocking. --RexxS (talk) 00:25, 25 June 2020 (UTC)

The JSTOR link does not link to a full free copy, so it would be removed. We have had consensus for converting URLs to IDs for a long time. As for removal of copyvio links, since we have consensus to convert links that do not link to full and open copies, the copyvio S2 links fall into the "not full and open" pile of URLs. Copyvio is a big deal on Wikipedia, a much bigger deal than linking non-free copies. AManWithNoPlan (talk) 00:38, 25 June 2020 (UTC)

Removing copyvio S2 links seems like a big win to me. I don't see why the bot should be blocked for doing this. —David Eppstein (talk) 00:57, 25 June 2020 (UTC)

@AManWithNoPlan: No, that's untrue. You have no consensus to remove links from citation titles, and the bot has no authorisation to do so. If you believe you have authorisation, then please quote and link the text of it. Without consensus and bot approval, I strongly oppose any unblocking.

Copyright is indeed a big deal on Wikipedia, and is far too important to leave to a bot's judgement. --RexxS (talk) 01:06, 25 June 2020 (UTC)

It is actually S2 that provides the judgement on the copyvio status of their pages. There is no judgement, just clear facts straight from the horse's mouth. AManWithNoPlan (talk) 01:46, 25 June 2020 (UTC)

There is consensus to remove non-free links redundant with identifiers, yes, and it is authorized to do so as well as many other bots. Headbomb {t · c · p · b} 02:51, 25 June 2020 (UTC)

So there's no authorisation for Citation bot to remove any links and you cannot supply text or link to its authorisation. There's no consensus for it either, and you are unable to present a link for where any consensus was reached. This bot has been used irresponsibly by a small self-selected group to impose their view of how citations should be presented. --RexxS (talk) 16:17, 25 June 2020 (UTC)

Wikipedia:Bots/Requests_for_approval/DOI_bot_2, from 2008 [see also [6]]. And template documentation, since pretty much time immemorial: Use parameters, not URLs when specific parameters are available, because URLs should be used for freely accessible versions. Headbomb {t · c · p · b} 17:27, 25 June 2020 (UTC)

You see, this is a perfect example of the FUD produced when the bot's approval is questioned. From Wikipedia:Bots/Requests_for_approval/DOI_bot_2, we read

Function Summary: Add missing parameters to citations from CrossRef database, and tidy citations

Function Details: ... Consensus appears to be that specifying a URL parameter is also useful; the bot can specify the URL that the DOI redirects to and in some cases make an intelligent guess as to its nature (abstract, fulltext etc) which can be recorded in the "format" parameter.
There have also been requests for the bot to correct common mistakes, such as replacing "id = PMID 123" with "pmid=123", percent-encoding parameters within dois so they link correctly, and replacing erroneously capitalised parameters (example: "Journal=Science" with "journal=Science"). Since these seemed uncontroversial I implemented these as I went, but my sense is that an official approval would placate some of Wikipedia's administrators.
In cases where there is more than one instance of a parameter, the bot will remove: If one or more are empty, the empty one; Any identical duplicates ...

Adding URLs to nonfree articles? One question: the usual style in articles I edit is that url= is reserved for articles where the entire text is freely readable, and that url= is not used for articles where just the abstract is readable (for that, you can just live with the DOI or PMID or whatever). Will the bot support this convention? That is, on such articles will it refuse to add URLs to articles that aren't entirely readable? ... I envision this being a possible bone of contention. I envision the bot providing a link where only an abstract is visible, but marking the URL as "abstract" or "subscription required" (using the "format" parameter). The rationale for this is that casual readers may not understand that a DOI or PMID provides a link to the article, and that a title link is intuitive to follow. The bot can't really tell whether editors have only chosen to provide URLs to free texts, you see.
In the majority of articles I edit (which tend to be scientific rather than medical), the convention seems to be to provide a link, whatever - but then I guess that DOIs are rarely specified. I guess the crux of the matter is whether the title being linked is a genuine help to users, which was the sense I got from discussions on my talk page - I guess each of us has our own entrenched opinion that we're unlikely to change, so it would be helpful to get some views from the wider community!

That's what we were promised by Smith609, and that's what was authorised: absolutely nothing about removing links from titles; a request to remove more than one instance of a parameter (not to remove one parameter when it points to the same place as different parameter); a clear recognition that opinions differ about what the title link may point to; and a suggestion that wider community input would be helpful. All of that has gone out of the window.

The next part of that BRFA is really instructive:

Proposal from Wikipedia:AN
2. The bot must not remove or alter an existing URL.
The second limitation was discussed at length on Wikipedia:AN. The third, fourth, and fifth items are the only things the bot should be doing.

So we had an AN complaint brought by MCB at Wikipedia:Administrators' noticeboard/Archive143 #DOI bot blocked for policy reconsideration for "implementing a major policy change in the way Wikipedia makes web references, without large-scale community consensus and buy-in". Read it. There's nothing in there that indicates any consensus for your use of the bot to systematically strip links from citation titles, and plenty of evidence of just the opposite. The bot should respect the judgement of the editor who links the title and not impose your vision as a fait accompli. You can certainly make a case for removing links that point to copyvios, and there would be support for that, but it would require approval, because there is no approval whatsoever for the bot to remove links from citation titles. --RexxS (talk) 21:33, 25 June 2020 (UTC)

From that same BRFA The bot replaces "url=http://dx.doi.org/#" with "doi=#". Also from the bot description in May 2008 at the time of approval. Headbomb {t · c · p · b} 22:09, 25 June 2020 (UTC)
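
For readers unfamiliar with the replacement quoted above, it can be sketched roughly as follows. This is an illustrative Python sketch, not the bot's code: the function name, the dictionary representation of a citation template, and the example DOI are all assumptions made for the illustration.

```python
import re

# Matches the old-style resolver URL named in the BRFA quote.
DX_DOI_URL = re.compile(r"^https?://dx\.doi\.org/(?P<doi>10\..+)$")

def url_to_doi(params: dict) -> dict:
    """Return a copy of params with a dx.doi.org URL moved to the doi key.

    Hypothetical helper: params models a citation template's parameters.
    """
    result = dict(params)
    match = DX_DOI_URL.match(result.get("url", ""))
    # Only convert when the URL is a DOI resolver link and no doi is set yet.
    if match and not result.get("doi"):
        result["doi"] = match.group("doi")
        del result["url"]
    return result

# Made-up DOI, for illustration only.
print(url_to_doi({"title": "T", "url": "http://dx.doi.org/10.1000/example"}))
```

Any URL that does not point at the DOI resolver is left untouched in this sketch.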

Emphasis mine: The bot replaces "url=http://dx.doi.org/#" with "doi=#" - I think this was the one URL manipulation deemed okay. Levivich ^{[dubious – discuss]} 22:20, 25 June 2020 (UTC)

Yes, because back then, the bot was touching other non-identifier-based URLs like this. That's the context for that RFC. The DOI function has since been expanded to other identifiers. Headbomb {t · c · p · b} 22:26, 25 June 2020 (UTC)

The DOI function has since been expanded to other identifiers without approval, which is why the bot is blocked right now. Let's just move on to the next part where the code that removes |url= is commented out, the bot is unblocked, and approval for removing |url= is sought. Levivich ^{[dubious – discuss]} 22:32, 25 June 2020 (UTC)

It's been expanded in line with consensus. Bots do not need re-approval for the same tasks with minor changes in scope. There is nothing different about removing a PMID url to a PMID parameter, or a JSTOR url to a JSTOR parameter than from a DOI url to a DOI parameter. Headbomb {t · c · p · b} 22:41, 25 June 2020 (UTC)

If it's been expanded in line with consensus, then it'll be a quick and easy BRFA. Levivich ^{[dubious – discuss]} 22:49, 25 June 2020 (UTC)

There's no need for a BRFA when there already is a valid one and that the expansion is in line with consensus. Headbomb {t · c · p · b} 22:51, 25 June 2020 (UTC)

There's a need when multiple editors are challenging whether or not the expansion is in line with consensus. The extreme hesitancy to seek explicit community approval is how I know, that you know, that the community will not approve. Anyone confident that consensus already exists would have started the discussion weeks ago. Levivich ^{[dubious – discuss]} 22:53, 25 June 2020 (UTC)

There's one editor with an axe to grind. This does not undo 12+ years of smooth operation concerning this exact function, nor does it warrant holding the entire community hostage to the whims of that person. Headbomb {t · c · p · b} 23:07, 25 June 2020 (UTC)

If that's true, it'll be a quick BRFA, and you'll get to say "I told you so". (But of course it's not just one editor.) Levivich ^{[dubious – discuss]} 23:11, 25 June 2020 (UTC)

"is how I know" spoken by an arrogant mind-reading jerk AManWithNoPlan (talk) 23:18, 25 June 2020 (UTC)

Mind WP:CIVIL. There's no need for this. Headbomb {t · c · p · b} 23:38, 25 June 2020 (UTC)

I apologize, there was no need for Levivich to claim to read minds and no need for me to strike back. AManWithNoPlan (talk) 23:39, 25 June 2020 (UTC)

Eh, I thought that was fair, my comment was jerk-ish, but we are simply past the point where anyone can credibly claim to hold a good faith belief that the bot is operating with clear consensus. This is not one user with an axe to grind; consensus for removing the url parameter is, at best, murky. Levivich ^{[dubious – discuss]} 06:03, 26 June 2020 (UTC)

I think a lot of people were just surprised that the bot went from first mention of the problem on this page to being blocked in under 7 hours. People have been actively discussing this instead of just jumping straight to requesting the unblock. Plus, I personally was using the time to upgrade the bot to PHP 7.3. AManWithNoPlan (talk) 23:30, 25 June 2020 (UTC)

Is there a list somewhere of the specific circumstances under which the bot deletes |url= from a citation template, currently? I see two such circumstances in the unblock request (and a third potential future circumstance), is that list complete? Levivich ^{[dubious – discuss]} 03:06, 25 June 2020 (UTC)

@Levivich: I believe it currently replaces |url= with a specific identifier (e.g. |url=https://www.jstor.org/ with |jstor=...) when specific identifiers are available (this goes back to 2008 or so, and is in line with template documentation/standard usage). S2CID urls currently remain untouched when there are free full versions, but will be removed once the CS1/CS2 templates are updated to support autolinking when |S2CID-access=free is set. AManWithNoPlan or Martin609 would know more though. Headbomb {t · c · p · b} 17:40, 25 June 2020 (UTC)

That is correct about replacing |url= with a specific identifier. S2CID is a fairly unique case in that it often includes a full copy. So, those |url= will only be removed during the conversion IF something else will turn the title into a blue link (such as PMC and, hopefully soon, things like |S2CID-access=free). One other exception in the current code is the copyright-violating pages on S2, for which the |url= will be removed (but the |S2CID= will stay) in accordance with Wikipedia's "don't link to copyright violations" policy. AManWithNoPlan (talk) 18:05, 25 June 2020 (UTC)

Where's the link and text of the approval for removing "|url= [and replacing] with specific identifier"? Where's the consensus for doing that? The clear answer is that neither of those exist. --RexxS (talk) 21:40, 25 June 2020 (UTC)

@Boing! said Zebedee: "Despite several requests, there has been no evidence shown of bot approval for the removal of links from citation titles." That's patently untrue. See Wikipedia:Bots/Requests for approval/DOI bot 2 where conversion of |url= to |doi= is explicitly approved (search for The bot replaces "url=http://dx.doi.org/#" with "doi=#" at the bottom of the BRFA). This was explicitly trialled (e.g. https://en.wikipedia.org/w/index.php?title=Hubble_Space_Telescope&diff=prev&oldid=211876538). Also from the bot description in May 2008 at the time of approval. This is a function that's never been controversial since its BRFA in 2008, which also has been approved in multiple other bots, such as Wikipedia:Bots/Requests for approval/CitationCleanerBot, and which is fully in line with template documentation (e.g. Template:Cite_journal#Identifiers): use identifier parameters instead of parameter URLs. Headbomb {t · c · p · b} 22:06, 25 June 2020 (UTC)

What an unbelievable piece of selective quoting! This is what was actually written: "The bot replaces "url=http://dx.doi.org/#" with "doi=#" - I think this was the one URL manipulation deemed okay."

"I think this was the one url manipulation deemed okay". No other url 'manipulation' has ever been approved. We already have recent overwhelming consensus that the citation title should be linked when a free |doi= is present at Wikipedia:Village pump (proposals)/Archive 167 #Auto-linking titles in citations of works with free-to-read DOIs, so replacing url with doi is an irrelevance, a settled issue, because it won't delink the citation title. You don't have the right to unilaterally and arbitrarily extend the bot's approval from "the one url manipulation deemed okay" to unlinking the citation title when any one of a dozen or more unspecified parameters are present. Fix that first. --RexxS (talk) 23:24, 25 June 2020 (UTC)

Again, the context of that was the RFC where Citation bot was utterly mangling citations with non-identifier-based URLs like this. Headbomb {t · c · p · b} 23:34, 25 June 2020 (UTC)

@Headbomb: Sorry if you disagree, but I have read all of this carefully and it's the only conclusion I can come to. The Village Pump consensus also influenced my unblock review (and I forgot to include it in my review comments, apologies - but I'm saying it here now). I do not see authorisation for what the bot is currently doing, and I see a consensus against what it is doing. I suggest the best thing to do at this point might be to make another WP:BAG request to clarify/confirm what the bot is authorised to do and what it is not - though it might be better to clarify the consensus as to how citation titles should be treated first. Boing! said Zebedee (talk) 05:25, 26 June 2020 (UTC)

The thing that concerns me most in this whole sorry mess is that both AMWNP and Headbomb seem unable or unwilling to operate/maintain this bot within its authorisation. This is not acceptable, and how it has not ended up at AN/I is beyond me. - Nick Thorne ^talk 05:52, 26 June 2020 (UTC)

It ended up there already. Levivich ^{[dubious – discuss]} 06:03, 26 June 2020 (UTC)

1) That is perfectly within the terms of its approval, and explicitly so. 2) I'm neither maintainer, nor operator of this bot. Headbomb {t · c · p · b} 07:04, 26 June 2020 (UTC)

The bot did not go to AN/I. It was the blocking of users of the bot instead of the bot itself that went there. AManWithNoPlan (talk) 11:46, 26 June 2020 (UTC)

This is one of the most important and widely used bots on the Wiki. This needs to be back up and running soon. I propose that AMWNP and Headbomb remove the disputed functionality, restore the bot to operational status, and then we can argue about its DOI and URL functions while the old version of the bot works. CaptainEek ^{Edits Ho Cap'n!}⚓ 18:58, 27 June 2020 (UTC)

Perhaps a new WP:BRFA is needed here, though I still support unblocking an old version of the bot while the months long BRFA goes through. CaptainEek ^{Edits Ho Cap'n!}⚓ 19:09, 27 June 2020 (UTC)

Again, I neither code, nor operate Citation bot. Headbomb {t · c · p · b} 19:24, 27 June 2020 (UTC)

Oh gosh, I'm sorry Headbomb, I didn't read my post carefully enough. CaptainEek ^{Edits Ho Cap'n!}⚓ 19:48, 27 June 2020 (UTC)

Well to be fair, there was a weird typo/word salad here. Headbomb {t · c · p · b} 20:16, 27 June 2020 (UTC)

Why is this blocked? Yes, I see there's a dispute about some minor details of urls. But guys, please keep scope in mind. Url links within the title are pretty rare, and while you argue, many citations are languishing as just a doi. :( I don't care whether the controversial code is removed and the bot is unblocked or the bot is unblocked as is while a consensus is reached, I just wish you wouldn't drag every wikipedian who wants to fill a citation into this dispute. Iamnotabunny (talk) 15:32, 1 July 2020 (UTC)

it's because the regular editors using citation bot don't see title URLs as any big deal but the people who actually pushed the block button are like OMG THE SKY IS FALLING!!111 NEED TITLE URLS BECAUSE NO ONE KNOWS HOW TO CLICK!! — Chris Capoccia 💬 14:53, 6 July 2020 (UTC)

I look forward to the unblocking of this tool. I hope soon.--Dthomsen8 (talk) 14:27, 10 July 2020 (UTC)
looks like CS1 has started title linking |doi-access=free… hopefully |S2CID-access=free and all the other similar ones. are things ready to revisit this blocking and reactivate? — Chris Capoccia 💬 13:45, 12 July 2020 (UTC)
- I hope so, but Wikipedia:Bots/Noticeboard#Citation_bot was not withdrawn yet. I think incremental fixes are better but maybe the proposers still prefer a full-scale review of everything under the sun. Nemo 15:17, 12 July 2020 (UTC)
  - Sorry, I have been busy with other things. I have been busy teaching a college seminar and preparing the bot for PHP 7.4 which has found a few bugs (all the ones so far have no effect on output or crash the bot) AManWithNoPlan (talk) 17:27, 12 July 2020 (UTC)
    - OK well it looks like |S2CID-access=free is not making title links, so we're probably not ready to go anyway. Maybe by August :( — Chris Capoccia 💬 23:26, 12 July 2020 (UTC)

Restart?

really i was only joking up above when i suggested the bot might be out through august.... are we any closer to a restart? — Chris Capoccia 💬 20:40, 3 August 2020 (UTC)

Since this appears to be down for a while, is there a way to remove the "Expand citations" tool (from the left side of each page) until it is available again? DougHill (talk) 18:14, 5 August 2020 (UTC)

Please restart this useful tool, or make something that does the same thing, but better. A bunch of pointless bickering about relatively small issues has completely stalled what could have improved thousands of articles in the downtime. Shameful. --Animalparty! (talk) 01:27, 8 August 2020 (UTC)

Anyone who wants to see a return of Citation bot needs to add their comments under Wikipedia:Village_pump_(proposals)#Issues_raised_by_Citation_bot. But right now it doesn't look too promising. — Chris Capoccia 💬 15:42, 8 August 2020 (UTC)

@Chris Capoccia: I expect everybody wants to see Citation Bot restarted, myself included. But, as far as I'm aware, there has not been a single statement from the bot operator indicating any intention to address the many concerns raised over the bot's editing. I'm pessimistic about the chances of seeing the bot restarted if it's just going to cause the same concerns again. --RexxS (talk) 15:59, 8 August 2020 (UTC)

I'm not seeing it. For ages the bot has deleted URLs that were duplicated by parameters. This is part of core functionality. There are some very different ideas of what the bot is supposed to be doing and I don't see the sides getting any closer. So the bot is going to stay blocked forever and not come back. — Chris Capoccia 💬 16:08, 8 August 2020 (UTC)

Well, perhaps the community will conclude that all of the concerns that folks like myself have expressed are without value, and issues like the removal of links from citation titles are part of its remit with approval and broad consensus. Then it can be restarted as it is. However, if the community agrees that valid concerns exist, then either the operator will bring the functionality into line, or it will sadly stay blocked. I'm just disappointed that there has not been a shred of compromise on the part of the bot operator that might have met the concerns half-way and made possible a restart under mutually acceptable conditions months ago. --RexxS (talk) 16:21, 8 August 2020 (UTC)

Turned off code that removed title links that violate wikipedia linking policy, so that a title link will stay in the case of S2 links. AManWithNoPlan (talk) 00:26, 13 August 2020 (UTC)

RexxS, Does that address your issue for the time being? CaptainEek ^{Edits Ho Cap'n!}⚓ 05:58, 13 August 2020 (UTC)

@CaptainEek: I'm not sure whether it will meet all of my concerns about leaving citation titles unlinked, but I guess we can't tell until we see the results of restarting. I'm certainly overjoyed that one of the bot programmers has now made an effort to address the concerns raised, and as I'm keen to see the bot restarted, I'd have no objection to seeing the bot restarted at present. I do expect that the present RfC will have significant implications for how the bot will operate in future, and I therefore expect the bot programmers to take heed of the implications of the consensuses forming there. --RexxS (talk) 19:56, 13 August 2020 (UTC)

I would unblock, but consider myself to have gotten involved, though I would encourage any passing admin to unblock. CaptainEek ^{Edits Ho Cap'n!}⚓ 03:51, 14 August 2020 (UTC)

This user's unblock request has been reviewed by an administrator, who accepted the request.

Citation bot (block log • active blocks • global blocks • contribs • deleted contribs • filter log • creation log • change block settings • unblock • checkuser (log))

Request reason:

bot changed to only remove URL if a PMC link will take its place. If other auto-linking flags such as doi-access=free create a link, we might recognize that in the future too, but there would always stay a title link for the S2 links when converting to S2CID parameter.

Accept reason:

Accepting unblock request $Salvio$ 16:51, 16 August 2020 (UTC)

{{fixed}} AManWithNoPlan (talk) 01:16, 17 August 2020 (UTC)

Merge italics/bold

Status: {{notabug}}
Reported by: Headbomb {t · c · p · b} 18:53, 31 July 2020 (UTC)

What should happen: [7]
We can't proceed until: Feedback from maintainers

With care to ensure that something like '''Bold''' ''italics'' is handled properly and not converted to say '''Bold' italics''. Headbomb {t · c · p · b} 18:53, 31 July 2020 (UTC)

{{wontfix}}, since it looks like a rat's nest of possible non-conforming inventive editors. AManWithNoPlan (talk) 13:49, 17 August 2020 (UTC)

Process pages in Category not working

Status: {{fixed}}
Reported by: Grimes2 (talk) 17:27, 16 August 2020 (UTC)

What happens: "Process pages in Category" not working, message: Category appears to be empty
We can't proceed until: Feedback from maintainers

That is really weird. I will have to look into that. AManWithNoPlan (talk) 14:17, 17 August 2020 (UTC)

https://github.com/ms609/citation-bot/pull/3364/files Ooops. That was a stupid bug. AManWithNoPlan (talk) 14:21, 17 August 2020 (UTC)

Caps i / I

Status: {{fixed}}
Reported by: Redalert2fan (talk) 21:28, 16 August 2020 (UTC)

What happens: journal= Elektriceskaja I Teplovoznaja Tjaga
What should happen: keep as journal= Elektriceskaja i Teplovoznaja Tjaga
Relevant diffs/links: https://en.wikipedia.org/w/index.php?title=VL11&diff=prev&oldid=973372345
We can't proceed until: Feedback from maintainers

https://github.com/ms609/citation-bot/pull/3364 AManWithNoPlan (talk) 13:52, 17 August 2020 (UTC)
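
One way to picture the requested behaviour is a capitalizer that keeps a short list of words lowercase. The sketch below is a hypothetical Python illustration, not what the linked pull request does: the function name and the exception list are assumptions, and a one-entry list like this would also lowercase the English pronoun "I".

```python
# Hypothetical exception list; "i" is the transliterated conjunction
# from the journal title in the report above.
LOWERCASE_WORDS = {"i"}

def capitalize_journal(name: str) -> str:
    """Capitalize each word, leaving listed small words in lowercase."""
    out = []
    for index, word in enumerate(name.split()):
        # Never lowercase the first word of the title.
        if index > 0 and word.lower() in LOWERCASE_WORDS:
            out.append(word.lower())
        else:
            out.append(word[:1].upper() + word[1:])
    return " ".join(out)

print(capitalize_journal("Elektriceskaja i Teplovoznaja Tjaga"))
```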

Useless capitalization?

Status: {{fixed}}
Reported by: Redalert2fan (talk) 21:36, 16 August 2020 (UTC)

What happens: Useless capitalization of www to WWW
Relevant diffs/links: https://en.wikipedia.org/w/index.php?title=EP10&diff=prev&oldid=973373301
We can't proceed until: Feedback from maintainers

Doesn't seem helpful to me. Redalert2fan (talk) 21:36, 16 August 2020 (UTC)

Probably best to avoid any string with www or http in them https://github.com/ms609/citation-bot/pull/3363 AManWithNoPlan (talk) 13:45, 17 August 2020 (UTC)
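
The guard suggested here could look something like the following Python sketch. It is only an illustration under assumptions: both function names are hypothetical, and `str.title()` stands in for whatever capitalization the bot really applies.

```python
def looks_like_web_address(text: str) -> bool:
    """Heuristic from the comment above: contains 'www' or 'http'."""
    lowered = text.lower()
    return "www" in lowered or "http" in lowered

def safe_capitalize(text: str) -> str:
    """Capitalize words, but leave anything resembling a web address alone."""
    if looks_like_web_address(text):
        return text
    return text.title()

print(safe_capitalize("www.example.org"))
print(safe_capitalize("railway gazette"))
```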

Lots of JSON errors

I'm getting a lot of errors of the following kind: ! Could not parse JSON for URL <urls here> Requests must have a user agent. - Redalert2fan (talk) 21:39, 16 August 2020 (UTC)

https://github.com/ms609/citation-bot/pull/3362 This once deployed will fix that, but I think URL expansion is still down, but the error should be better at least. AManWithNoPlan (talk) 13:49, 17 August 2020 (UTC)

This is {{fixed}}, but URL expansion is still down. AManWithNoPlan (talk) 14:14, 17 August 2020 (UTC)
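
For context, the "Requests must have a user agent" message means the remote service rejects HTTP requests that lack a User-Agent header. A minimal Python sketch of attaching one with the standard library follows; it is not the bot's PHP code, and the helper name, URL, and agent string are made up for the illustration.

```python
import urllib.request

def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Create a request that identifies the client via User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})

# Hypothetical endpoint and agent string; nothing is sent over the network here.
request = build_request("https://example.org/api", "ExampleBot/1.0 (contact@example.org)")
print(request.get_header("User-agent"))
```

Note that `urllib` stores header names in capitalized form, so the lookup key is `User-agent`.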

Google Books API error

Multiple errors of:

 ! Google Books API reported error: Array
 (
     [0] => stdClass Object
         (
             [message] => The provided API key has an IP address restriction. The originating IP address of the call (IP adres here) violates this restriction.
             [domain] => global
             [reason] => forbidden
         )

 )

are showing up. The "IP adres here" placeholder stands in for an actual IP address, which I have removed for this post. -Redalert2fan (talk) 21:43, 16 August 2020 (UTC)

Note to self {{fixed}} at https://console.developers.google.com/apis/credentials?project=wikipediacitationbot AManWithNoPlan (talk) 01:17, 17 August 2020 (UTC)

"|chapter= ignored" error caused in cite web

Status: {{fixed}}
Reported by: Grimes2 (talk) 19:15, 16 August 2020 (UTC)

What happens: |chapter= ignored error caused in {{cite web}} and most others that are not {{cite book}}
Relevant diffs/links: https://en.wikipedia.org/w/index.php?title=Julian_Bream&diff=973350505&oldid=973331691
We can't proceed until: Feedback from maintainers

https://github.com/ms609/citation-bot/pull/3365 AManWithNoPlan (talk) 18:15, 17 August 2020 (UTC)

bot adds author names that are not author names

Status: {{fixed}}
Reported by: Trappist the monk (talk) 12:43, 19 August 2020 (UTC)

What happens: |last1=7 |first1=Völlig neu Bearbeitete und Erweiterte Auflage
Relevant diffs/links: diff
We can't proceed until: Feedback from maintainers

when bot changes |work= alias to |encyclopedia=

Status: {{fixed}}
Reported by: Trappist the monk (talk) 22:56, 19 August 2020 (UTC)

What happens: |encyclopedia= (and aliases) are constrained to {{cite encyclopedia}}, {{cite dictionary}}, and {{citation}} (discussion at wt:cs1)
Relevant diffs/links: diff
We can't proceed until: Feedback from maintainers

https://github.com/ms609/citation-bot/pull/3377

Chapter ignored error

Status: {{fixed}}
Reported by: Lithopsian (talk) 20:41, 19 August 2020 (UTC)

What happens: CS1 errors: chapter ignored category applied incorrectly
What should happen: Category should not be applied to citation template when work field (or aliases) is not included
Relevant diffs/links: Rho Persei
We can't proceed until: Feedback from maintainers

FYI: this appears to relate to this diff from last March in which Citation bot saw a {{cite book}} template with correctly-formatted chapter=, title=, and edition= parameters but also with incorrect content in the chapter= parameter (not actually a chapter) and with an incorrect journal= parameter and decided to change the template to {{cite journal}}, without changing the parameters, leaving a broken citation. I hope this is long fixed by now. —David Eppstein (talk) 23:24, 19 August 2020 (UTC)

Down?

The Citations button seems to hang. Expand citations too. I tried a few different things, nothing... Abductive (reasoning) 23:35, 19 August 2020 (UTC)

under very heavy load AManWithNoPlan (talk) 23:48, 19 August 2020 (UTC)

{{fixed}} AManWithNoPlan (talk) 12:24, 20 August 2020 (UTC)

OAuth tasks done

All URLs must be updated in GitHub (done)
Gadget and sidebar button code updated on Wikipedia (done)
Dev code, if anyone has it (done - that's their problem, and that bot is down anyway)
other people with their own scripts (done - that's their problem, and the ones I know about told)
DNS moved (done)
Update Bot wiki pages (done)
Update https://en.wikipedia.org/wiki/Template:Automated_tools (done)

AManWithNoPlan (talk) 16:39, 10 June 2020 (UTC)

Do we already have a permissive CORS rule as suggested in https://wikitech.wikimedia.org/wiki/News/Toolforge.org#Cross-Origin_Resource_Sharing_(CORS)_requests_broken ? I'm currently getting errors on that front. Nemo 11:28, 24 June 2020 (UTC)

I now know that is not relevant here. But, thanks for the link, that was a good idea to check out. AManWithNoPlan (talk) 18:06, 25 June 2020 (UTC)

{{fixed}} flag to archive. AManWithNoPlan (talk) 02:03, 21 August 2020 (UTC)

we skipped this step : Once action taken or determined as not required, mark off as 'done' at Here

s2cid towards end with rest of identifiers

It's great that Citation bot is adding all these S2CID entries, but is there some reason why they are being added between authors and title instead of towards the end with the rest of the identifiers? — Chris Capoccia 💬 20:13, 21 August 2020 (UTC)

I see. The "2" in the name confuses it. I will fix that. AManWithNoPlan (talk) 21:23, 21 August 2020 (UTC)

Convert wrong citeseerx/doi

Status: {{wontfix}} way too rare to do.
Reported by: Headbomb {t · c · p · b} 17:25, 21 August 2020 (UTC)

What should happen: [8]
We can't proceed until: Feedback from maintainers

Basically if you have 10.1.1... in a |doi= it should be converted to a |citeseerx=, and if you have a valid DOI in a |citeseerx=, then that too should be converted to a |doi=. Headbomb {t · c · p · b} 17:25, 21 August 2020 (UTC)
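
For illustration only (the request was declined as too rare), the swap described above could be sketched as follows. The function name swap_misplaced_ids and both regular expressions are assumptions: they approximate what a CiteSeerX identifier and a DOI look like and are not taken from the bot's code.

```python
import re

# Assumed shapes: CiteSeerX ids look like 10.1.1.<n>.<n>; DOIs like 10.<registrant>/<suffix>.
CITESEERX_RE = re.compile(r"^10\.1\.1\.\d+\.\d+$")
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")


def swap_misplaced_ids(params: dict) -> dict:
    """Hypothetical helper: move an identifier to the parameter it belongs in."""
    fixed = dict(params)
    doi = params.get("doi", "").strip()
    csx = params.get("citeseerx", "").strip()
    if CITESEERX_RE.match(doi) and not csx:
        # A CiteSeerX id was entered as a DOI: move it across.
        fixed["citeseerx"] = doi
        del fixed["doi"]
    elif DOI_RE.match(csx) and not doi:
        # A real DOI was entered as a CiteSeerX id: move it across.
        fixed["doi"] = csx
        del fixed["citeseerx"]
    return fixed


print(swap_misplaced_ids({"doi": "10.1.1.123.4567"}))
print(swap_misplaced_ids({"citeseerx": "10.1000/xyz123"}))
```

The sketch only moves a value when the destination parameter is empty, so it never overwrites an identifier that is already present.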

Processing of JSTOR citations

Status: {{notabug}}
Reported by: 凰兰时罗 (talk) 18:46, 22 August 2020 (UTC)

What happens: (1) Removal of the level of access from JSTOR links is wrong: different JSTOR materials have different levels of access. (2) Adding "issue=110" is just wrong.
Relevant diffs/links: https://en.wikipedia.org/w/index.php?title=John_Leary_(politician)&curid=56263000&diff=974379039&oldid=963559061
We can't proceed until: Feedback from maintainers

The issue=110 is actually correct; it is the original volume=110 that was wrong. Also, "different JSTOR materials have different levels of access": this is very rare, which is why |jstor-access=free is the only allowed option for |jstor-access=. By definition, |jstor-access=closed-off is assumed until proven otherwise. AManWithNoPlan (talk) 19:04, 22 August 2020 (UTC)

Yes, my bad – you're actually correct on both points :). Thanks! 凰兰时罗 (talk) 23:45, 22 August 2020 (UTC)

The Nation

Status: {{fixed}}
Reported by: Kaltenmeyer (talk) 17:59, 23 August 2020 (UTC)

What happens: what is added |journal=The Nation : A Weekly Journal Devoted to Politics, Literature, Science, Drama, Music, Art, and Finance
What should happen: I believe the usual name of the journal should be added; add |journal=The Nation
Relevant diffs/links: https://en.wikipedia.org/w/index.php?title=News_media_endorsements_in_the_2020_United_States_presidential_primaries&type=revision&diff=974337281&oldid=953468038
We can't proceed until: Feedback from maintainers

I have added some code for that ISSN specifically. Should be deployed after all tests are passed. https://github.com/ms609/citation-bot/pull/3404 AManWithNoPlan (talk) 18:40, 23 August 2020 (UTC)

Down again

The Citations button is hanging again. Expand citations too. Abductive (reasoning) 07:53, 21 August 2020 (UTC)

When it gets slow, it can start to result in people trying again and again, which is like putting a fire out with gasoline. AManWithNoPlan (talk) 11:25, 21 August 2020 (UTC)

It seems to have been inactive for over an hour now https://en.wikipedia.org/wiki/Special:Contributions/Citation_bot AManWithNoPlan (talk) 11:25, 21 August 2020 (UTC)

Got an operator to reboot it AManWithNoPlan (talk) 12:05, 21 August 2020 (UTC)

{{fixed}} for now. AManWithNoPlan (talk) 13:59, 24 August 2020 (UTC)

Question about interwiki links

At WP:ANI#Citation bot someone said "The bot probably doesn't recognize the interwiki prefix ..." – is that so? Same question for interlanguage links. See Wikipedia:Namespace#Interwiki and interlanguage links. If the bot doesn't understand, it shouldn't mess with it, right? At least a lame excuse: it's the bot's task to understand, and not to remove legit links because it doesn't understand, thus filing a bug report:

Status: {{fixed}}
Reported by: Francis Schonken (talk) 06:01, 24 August 2020 (UTC)

What happens: bot de-links an interwiki link, i.e. it changed |title=Pianoforte zu vier Händen to |title=Pianoforte zu vier Händen (for clarity: without providing a replacement link)
What should happen: leave legit link alone: removing the link is in no universe helpful to the reader, and certainly not when such reader would want to verify Wikipedia's content which is referenced to this.
Relevant diffs/links: [9]
We can't proceed until: Feedback from maintainers

https://github.com/ms609/citation-bot/pull/3411 AManWithNoPlan (talk) 13:55, 24 August 2020 (UTC)

when creating |author-link= from author name parameters ...

Status: {{fixed}}
Reported by: Trappist the monk (talk) 13:25, 22 August 2020 (UTC)

What happens: bot apparently skips |first=
Relevant diffs/links: diff
We can't proceed until: Feedback from maintainers

bot changes this:

{{cite web|last1=[[Matt Welch|Welch]]|first1=[[Matt Welch|Matt]]|date=March 4, 2020|url=https://reason.com/2020/03/04/libertarian-super-tuesday-big-night-for-jacob-hornberger-nota-john-mcafee-drops-out-and-backs-vermin-supreme/|title=Libertarian Super Tuesday: Big Night for Jacob Hornberger, NOTA; John McAfee Drops Out and Backs Vermin Supreme|website=Reason|accessdate=March 4, 2020}}

Welch, Matt (March 4, 2020). "Libertarian Super Tuesday: Big Night for Jacob Hornberger, NOTA; John McAfee Drops Out and Backs Vermin Supreme". Reason. Retrieved March 4, 2020. {{cite web}}: Check |first1= value (help)

to this:

{{cite web|last1=Welch|first1=[[Matt Welch|Matt]]|date=March 4, 2020|url=https://reason.com/2020/03/04/libertarian-super-tuesday-big-night-for-jacob-hornberger-nota-john-mcafee-drops-out-and-backs-vermin-supreme/|title=Libertarian Super Tuesday: Big Night for Jacob Hornberger, NOTA; John McAfee Drops Out and Backs Vermin Supreme|website=Reason|accessdate=March 4, 2020|author1-link=Matt Welch}}

Welch, Matt (March 4, 2020). "Libertarian Super Tuesday: Big Night for Jacob Hornberger, NOTA; John McAfee Drops Out and Backs Vermin Supreme". Reason. Retrieved March 4, 2020. {{cite web}}: Check |first1= value (help)

should also strip wikilink from |firstn=.

Same should apply to other name parameters (and their aliases): |contributor-firstn=, |editor-firstn=, |interviewer-firstn=, |translator-firstn=

—Trappist the monk (talk) 13:27, 22 August 2020 (UTC)

this should be an improvement. https://github.com/ms609/citation-bot/pull/3412 AManWithNoPlan (talk) 14:38, 24 August 2020 (UTC)
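
A rough Python sketch of the behaviour requested above, assuming parameters are held in a plain dict. The helper name unlink_author and the wikilink pattern are illustrative assumptions, not the code from the pull request.

```python
import re

# Matches a value that is exactly one wikilink: [[Target]] or [[Target|Label]].
WIKILINK = re.compile(r"^\[\[([^\[\]|]+)(?:\|([^\[\]|]+))?\]\]$")


def unlink_author(params: dict, n: int = 1) -> dict:
    """Hypothetical helper: strip wikilinks from |lastn= and |firstn=, keep the target."""
    fixed = dict(params)
    target = None
    for key in (f"last{n}", f"first{n}"):
        match = WIKILINK.match(fixed.get(key, "").strip())
        if match:
            # Remember the first link target seen; keep only the displayed text.
            target = target or match.group(1).strip()
            fixed[key] = (match.group(2) or match.group(1)).strip()
    if target and f"author{n}-link" not in fixed:
        fixed[f"author{n}-link"] = target
    return fixed


print(unlink_author({"last1": "[[Matt Welch|Welch]]", "first1": "[[Matt Welch|Matt]]"}))
```

The same loop could be run over the |editor-firstn=, |translator-firstn= and similar pairs by changing the key names; that extension is left out to keep the sketch short.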

Unauthorised date format change

Status: {{fixed}}
Reported by: Francis Schonken (talk) 06:50, 24 August 2020 (UTC)

What happens: bot changes |...date=28 March 2020 to |...date=2020-08-24, despite the fact that the article has a {{Use dmy dates}} tag in its header section
What should happen: conversion to dmy dates is OK, not the other way around
Relevant diffs/links: [10]
We can't proceed until: Feedback from maintainers

Interestingly enough, the {{Use dmy dates}} template controls the display of the date to humans. That is really cool. This will enforce the data style in the meta-data https://github.com/ms609/citation-bot/pull/3410 AManWithNoPlan (talk) 13:35, 24 August 2020 (UTC)

Inventing first name

Status: {{fixed}}
Reported by: Francis Schonken (talk) 08:08, 24 August 2020 (UTC)

What happens: The bot "invents" a first name, in this case "D. K." – as it happens the "D." in that name refers to a last name (Dorling) as does the "K." (Kindersley)
What should happen: The bot should not try to invent authors: the series is called DK Eyewitness: even if, agreed, google books, erroneously, marks "DK Eyewitness" as the book's author, converting that to |author=DK Eyewitness would be bad enough, not trying to extract a "first name" from that. Applicable policy: WP:OR – original research by human editors is bad enough, it being programmed into a bot should absolutely be avoided.
Relevant diffs/links: [11] (under "Line 160:")
We can't proceed until: Feedback from maintainers

I think this will help https://github.com/ms609/citation-bot/pull/3409 AManWithNoPlan (talk) 12:55, 24 August 2020 (UTC)

3RR

Your recent editing history at Red Fort shows that you are currently engaged in an edit war; that means that you are repeatedly changing content back to how you think it should be, when you have seen that other editors disagree. To resolve the content dispute, please do not revert or change the edits of others when you are reverted. Instead of reverting, please use the talk page to work toward making a version that represents consensus among editors. The best practice at this stage is to discuss, not edit-war. See the bold, revert, discuss cycle for how this is done. If discussions reach an impasse, you can then post a request for help at a relevant noticeboard or seek dispute resolution. In some cases, you may wish to request temporary page protection.

Being involved in an edit war can result in you being blocked from editing—especially if you violate the three-revert rule, which states that an editor must not perform more than three reverts on a single page within a 24-hour period. Undoing another editor's work—whether in whole or in part, whether involving the same or different material each time—counts as a revert. Also keep in mind that while violating the three-revert rule often leads to a block, you can still be blocked for edit warring—even if you do not violate the three-revert rule—should your behavior indicate that you intend to continue reverting repeatedly.

@AManWithNoPlan: after four reports above (#Unauthorised date format change, #Non-repair repair, #Equal treatment and #Inventing first name) about the same Citation bot edit to the Red Fort article, an edit that was, according to its edit summary, "Suggested by AManWithNoPlan", after which I reverted that edit, it seems hardly a good idea for you to suggest to the bot to edit-war over it, before even engaging in the bug reports above. --Francis Schonken (talk) 11:31, 24 August 2020 (UTC)

My internet was flaky and my web browser must have tried to reconnect to the category page multiple times, each time launching the DOI fixing run. That's super annoying for several reasons: one, it creates the appearance of an edit war. Two, it means the bot wasted resources processing pages multiple times. AManWithNoPlan (talk) 12:18, 24 August 2020 (UTC)

A self-revert on this revert would be welcome then. @AManWithNoPlan: could you revert, if instructing the bot to do a self-revert would not be possible? --Francis Schonken (talk) 12:31, 24 August 2020 (UTC)

{{fixed}} with a revert and made another fix while I was there. AManWithNoPlan (talk) 13:39, 24 August 2020 (UTC)

Thanks. --Francis Schonken (talk) 13:41, 24 August 2020 (UTC)

Non-repair repair

Status: {{not a bug}}
Reported by: Francis Schonken (talk) 07:41, 24 August 2020 (UTC)

What happens: bot changed |website=britannica.co, to |website=britannica.co – neither url is valid, nor the original one, nor the "repaired" one. Both |website=britannica.com and |website=britannica.co.uk work (obviously the first was intended, while that's the one appearing in the |url=https://www.britannica.com/... parameter). |website=britannica.co, on the other hand, doesn't work.
What should happen: avoid non-repairs that give the impression that something was repaired.
Relevant diffs/links: [12]
We can't proceed until: Feedback from maintainers

Removing the trailing comma is a fix. A very small one, but a fix. |website= is not supposed to be a URL, but human text: it is simply the name of the website, not a full URL. AManWithNoPlan (talk) 13:38, 24 August 2020 (UTC)

In the case you didn't understand "Non-repair repair": the edit was a "non-fixing fix". For human readers "britannica.co" makes no sense either (without it being really clear that something is wrong), while the "britannica.co," form is at least clearer that something is wrong and needs fixing. The discussion whether it was a "repair" or a "fix" is irrelevant: that part of the edit was unhelpful on all levels. --Francis Schonken (talk) 13:52, 24 August 2020 (UTC)

We do convert website to URL if it has the http in it. Otherwise, we usually assume that the website entered by a human is correct. The problem is that a website might be called IamCool.co, but redirect to IamCool.cheaphosting.com. So, the website might be correctly "IamCool.co" and the URL be IamCool.cheaphosting.com at the same time. It is hard to fix such things automatically. We use blacklists and whitelists and otherwise leave those to the humans to fix. AManWithNoPlan (talk) 14:54, 24 August 2020 (UTC)

Re. "It is hard to fix such things automatically" – that is the crux of the matter: if the bot can't fix it in a reasonable manner, then the bot shouldn't touch it, and leave it to human editors, and not implement a pseudo-fix, which is many times less helpful than not touching it.

Other than that, you gave a perfect explanation of what I said above:

For human readers "britannica.co" makes no sense either (without it being really clear that something is wrong), while the "britannica.co," form is at least clearer that something is wrong and needs fixing.

--Francis Schonken (talk) 15:45, 24 August 2020 (UTC)

And the bot fixed what it could: the stray comma. That additional fixes needed to be done is inconsequential. Headbomb {t · c · p · b} 15:55, 24 August 2020 (UTC)

Nah, the bot shouldn't have touched it. In no universe should removing the stray comma, without anything else, be considered a helpful fix. --Francis Schonken (talk) 15:56, 24 August 2020 (UTC)

The bot fixes the error "CS1 maint: extra punctuation" and this is one of two errors. The other error is not fixable by a bot. Grimes2 (talk) 16:04, 24 August 2020 (UTC)

This one wasn't fixable either (as evidenced above), so the bot should not engage in it: it is a bug while the bot tries to fix something it can't fix, and makes the situation worse. --Francis Schonken (talk) 16:22, 24 August 2020 (UTC)

The stray punctuation was fixed. That the bot doesn't fix everything is known, and 'fixing everything' is impossible to do by bot. This is also known. The bot fixes what it can. Headbomb {t · c · p · b} 16:26, 24 August 2020 (UTC)

Again, it was an unhelpful fix: seems best to remove the feature from the bot's code. --Francis Schonken (talk) 16:28, 24 August 2020 (UTC)

In short, these sort of fixes are not suitable for a bot running in automatic mode: assisted, or with a human checking before a proposed fix is saved would be far more effective for such fixes that need some interpretation that can't be delivered by bot. --Francis Schonken (talk) 16:33, 24 August 2020 (UTC)

The bot's failure to do everything is not a bug. It did not do anything wrong, it just failed to do something right. A typo that existed in three pages on all of Wikipedia is hardly a shortcoming of the bot. AManWithNoPlan (talk) 20:47, 24 August 2020 (UTC)

This discussion is still open. --Francis Schonken (talk) 02:52, 25 August 2020 (UTC)

On the ground of the matter, the fix applied by the bot goes against WP:BOTPOL, see WP:CONTEXTBOT: "Examples of context-sensitive changes include ... punctuation mistakes" – which, per the policy, should not be performed by unsupervised bots. That's why this is a bug that needs to be fixed. --Francis Schonken (talk) 03:19, 25 August 2020 (UTC)

I can't tell whether you're serious. That sentence is about natural language, where punctuation is rarely black and white. Removing one stray character from an URL is not a "punctuation fix" in that sense. Nemo 06:15, 25 August 2020 (UTC)

See above, AManWithNoPlan's first reply after the bug report box: "|website= is not supposed to be a URL, but human text: it is simply the name of the website, not a full URL." (my emphasis), so, indeed this falls under the WP:CONTEXTBOT policy.

Your "I can't tell whether you're serious" comment is quite unhelpful at this stage. Care to retract it? --Francis Schonken (talk) 07:00, 25 August 2020 (UTC)

I find that the "I can't tell whether you're serious" makes it clear that you come across as a troll to some editors. I think it was a very kind way to express that sentiment. AManWithNoPlan (talk) 13:02, 25 August 2020 (UTC)

This is getting into WP:TE territory. Removing punctuation in general is a context sensitive change, yes. Here the context is clear. This is the removal of stray punctuation in a template parameter which should not have stray punctuation. There is no WP:CONTEXTBOT violation here. The bot is not changing Firstly, we should attempt... to Firstly we should attempt. This is no different than AWB enforcing WP:REFPUNCT. Headbomb {t · c · p · b} 20:40, 25 August 2020 (UTC)
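
To make the scope of the disputed change concrete, here is a small Python sketch of trailing-punctuation cleanup on a single parameter value. The function name and the set of characters stripped are assumptions; the sketch only illustrates the distinction drawn above between a stray trailing character in a parameter and punctuation inside running text.

```python
def strip_stray_punctuation(value: str) -> str:
    """Hypothetical helper: drop trailing commas, semicolons and colons from a parameter value."""
    # Only the end of the value is touched; punctuation inside the text is kept.
    return value.strip().rstrip(",;:").rstrip()


print(strip_stray_punctuation("britannica.co,"))
print(strip_stray_punctuation("Firstly, we should attempt..."))
```

The sketch does not try to decide whether the remaining value (britannica.co) is itself correct; that judgement is the part left to human editors in the discussion above.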

removing wikilink from title

Status: {{fixed}}
Reported by: Francis Schonken (talk) 02:56, 25 August 2020 (UTC)

What happens: bot changes |title=Art through the Ages to |title=Art through the Ages
What should happen: title should not be de-linked
Relevant diffs/links: [13]
We can't proceed until: Feedback from maintainers

Equal treatment

Status: {{notabug}}
Reported by: Francis Schonken (talk) 07:49, 24 August 2020 (UTC)

What happens: On the same page, in the same edit, the bot removes |website=books.google.ca from one {{cite book}} template, while it is left alone in another.
What should happen: should be handled similarly in both instances
Relevant diffs/links: [14] (the "removed" one is under the "Line 41:" part of the diff, the "unmodified" one under "Line 160:")
We can't proceed until: Feedback from maintainers

google.books.ca is a spam site, not books.google.ca. I just removed all references to google.books.ca from wikipedia. Interesting typo. AManWithNoPlan (talk) 14:02, 24 August 2020 (UTC)

Removed the "not a bug" assessment: if it is a rogue website it shouldn't be left alone in some cases, while it is removed in other cases. Either we can depend on the bot to remove it when it has gone through an article, or it leaves it to human assessment: randomly removing it and not removing it in a same update by the bot is untrustworthy behaviour of the bot, and should be addressed. --Francis Schonken (talk) 14:36, 24 August 2020 (UTC)

We have a small list of websites that are removed. Google books is one of them, since that is simply incorrect. The source of the information is not google, but a book. Google is just a library and they have no say in the material. We assume that the information in |website= is good, unless it is on the blacklist. AManWithNoPlan (talk) 14:46, 24 August 2020 (UTC)

Again, the problem is that the bot went through the article, removing the |website=books.google.ca in one instance, and leaving it untouched in another {{cite book}} template (in which it did other changes, but not the removal of that website parameter): that is undependable random behaviour which should be repaired. --Francis Schonken (talk) 15:52, 24 August 2020 (UTC)

Failure to do everything useful is not a bug. If that were the case, then every edit on Wikipedia would be wrong. AManWithNoPlan (talk) 16:10, 24 August 2020 (UTC)

Then it seems better to remove the undependable feature from the bot, and admit that the bot can't fix everything. --Francis Schonken (talk) 16:26, 24 August 2020 (UTC)

See also my suggestion about the "automatic" mode being the real problem for such fixes that need some human interpretation, in the #Non-repair repair section above. --Francis Schonken (talk) 16:36, 24 August 2020 (UTC)

Again, comma removal and clutter removal is dependable. That it doesn't fix everything you want it to fix is a case of WP:SOFIXIT. If you have specific suggestions that can be dependable, do make them though. The above is not one, for the reasons mentioned by AManWithNoPlan. Headbomb {t · c · p · b} 17:40, 24 August 2020 (UTC)

This discussion is still open. --Francis Schonken (talk) 02:52, 25 August 2020 (UTC)

The undependable feature should probably best be removed from the bot's code. --Francis Schonken (talk) 03:24, 25 August 2020 (UTC)

Equal treatment RfC

Is it acceptable behaviour in an unsupervised process (automatic bot) to randomly remove a parameter from one cite template, and keep the same parameter, with the same content, in an identical cite template on the same page, without the bot's maintainers being able to explain why the bot behaves thus? 03:42, 25 August 2020 (UTC)

No, unacceptable behaviour for the bot: the random feature should be removed from the bot's code. The bot's maintainers should at least be able to explain why the bot behaves thus. --Francis Schonken (talk) 03:42, 25 August 2020 (UTC)
This supposed "RfC" does not comply with RfC guidelines, due to a ridiculously partisan introductory text, and should be ignored. Nemo 06:17, 25 August 2020 (UTC)
If I understand the complaint, Editor Francis Schonken is arguing that |website=books.google.ca (line 41) is exactly the same as |website=google.books.ca (line 160) (diff). Superficially, to a human, perhaps they are the same; to a computer they are not – in the former books is a second-level subdomain of google.ca; in the latter, google is a second level subdomain of books.ca. No doubt |website=google.books.ca should be added to the bot's code so that the bot can remove it. That does not require an rfc.—Trappist the monk (talk) 10:07, 25 August 2020 (UTC)
- Re. "That does not require an rfc" – apparently it did: the first assistant maintainer of the bot saw no way to address the issue. Hopefully now they can. --Francis Schonken (talk) 10:25, 25 August 2020 (UTC)
  - I cannot address non-existent stupid issues. AManWithNoPlan (talk) 13:06, 25 August 2020 (UTC)
This rfc introduction seems a little biased, but AManWithNoPlan's explanation here and in the non-repair seemed fine to me. books.google.ca and google.books.ca are clearly not the same text, even though they might be similar. It seems a bit harsh to immediately request removal of features and it does not look like a bug. Not doing something is not a bug if the intention wasn't to fix it. Now a bot does not have intentions, but it didn't "fix" it because the site was not included. This can easily be rectified if wanted, but would be an addition, not a bug fix. Also if it did in fact make at least a good edit, but missed something, why remove the feature? On the one side it is suggested to remove a feature while on the other the bot should be able to fix everything? It seems like the bot maintainer(s) did in fact explain how the bot works, whether someone thinks it was good enough or not does not seem like a reason for an RFC. Redalert2fan (talk) 20:22, 25 August 2020 (UTC)
I've removed the RFC templates as a horribly, hopelessly biased leading question without any sort of example, based on a flawed premise (the behaviour is neither random, nor are the parameters the same). The behavior has been explained multiple times now. |website=books.google.ca is not the same as |website=google.books.ca. You might argue that they should be treated as equivalent (they are not), but you don't need an RFC for this. Headbomb {t · c · p · b} 20:32, 25 August 2020 (UTC)
Agree, not a bug. Presumably google.books.ca was a typo and books.google.ca was intended, but we can't expect bots to automatically realize that incorrectly entered "website" parameters should match its patterns for bad and removable "website" parameters. —David Eppstein (talk) 21:15, 25 August 2020 (UTC)
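
The difference between the two values discussed in this thread can be shown with a short Python sketch. The set REMOVABLE_WEBSITES and its entries are hypothetical stand-ins for the bot's list of removable |website= values; the actual list is not reproduced here.

```python
# Hypothetical removal list; the bot's real list is not shown in this discussion.
REMOVABLE_WEBSITES = {"books.google.ca", "books.google.com"}


def should_remove_website(value: str) -> bool:
    """Exact, case-insensitive match against the removal list."""
    return value.strip().lower() in REMOVABLE_WEBSITES


def registrable_domain(host: str) -> str:
    """Last two labels of a host name (enough for these two-label examples)."""
    return ".".join(host.lower().split(".")[-2:])


for host in ("books.google.ca", "google.books.ca"):
    print(host, registrable_domain(host), should_remove_website(host))
```

Under an exact-match rule the two strings are simply different entries: one sits under google.ca and the other under books.ca, so only a value that is on the list gets removed. Treating the typo as equivalent would mean adding it to the list as a separate entry.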

Unauthorised date format change (2)

Status: {{fixed}} - legacy redirects added
Reported by: Francis Schonken (talk) 06:50, 24 August 2020 (UTC), modified/corrected by Matthiaspaul

What happens: bot changes |...date=28 March 2020 to |...date=2020-08-24, despite the fact that the article has a {{Use dmy dates}} tag in its header section
What should happen: conversion to the date format specified by the Use dmy/mdy dates template's |cs1-dates= parameter (if present), or (only if the |cs1-dates= parameter is not present) to the format according to the Use dmy/mdy dates template's name is OK, otherwise the format must not be changed
Relevant diffs/links: [15]
We can't proceed until: Feedback from maintainers

Interestingly enough, the {{Use dmy dates}} template controls the display of the date to humans. That is really cool. This will enforce the data style in the meta-data https://github.com/ms609/citation-bot/pull/3410 AManWithNoPlan (talk) 13:35, 24 August 2020 (UTC)

I couldn't find this in the code (but only had a cursory look), therefore:

Does it adhere to the setting of the optional |cs1-dates= parameter of the {{Use dmy/mdy dates}} template(s) as well (see Template:Use_dmy_dates#Auto-formatting_citation_template_dates)? This setting, if present, takes precedence over the setting derived from the template's name. If the code does not deal with this, the date format should not be changed at all.

This is particularly important in conjunction with the |cs1-dates=y setting because something like {{Use dmy/mdy dates|date=August 2020|cs1-dates=y}} means that the dates in the citation should be in ymd format, not dmy/mdy format.

Also, does it check for the various aliases of the {{Use dmy/mdy dates}} templates as well? If it doesn't, it would miss the presence of the template if it's redirected.

FYI, these are the patterns searched for by CS1/CS2 citation templates:

'{{ *[Uu]se dmy dates *[|}]'
'{{ *[Uu]se mdy dates *[|}]'
'{{ *[Uu]se DMY dates *[|}]'
'{{ *[Uu]se MDY dates *[|}]'
'{{ *[Uu]se *dmy *[|}]'
'{{ *[Uu]se *mdy *[|}]'
'{{ *[Uu]se MDY *[|}]'
'{{ *[Uu]se DMY *[|}]'
'{{ *[Dd]my *[|}]'
'{{ *[Mm]dy *[|}]'
'{{ *[Dd]MY *[|}]'
'{{ *[Mm]DY *[|}]'

--Matthiaspaul (talk) 04:15, 25 August 2020 (UTC)

Our checking is case-insensitive. I will add the shorter MDY type ones. AManWithNoPlan (talk) 13:12, 25 August 2020 (UTC)
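
A hedged Python sketch of the check asked about above. The regular expression is an approximation of the listed patterns folded into one case-insensitive expression, and citation_date_style is a hypothetical name. Only the |cs1-dates=y case mentioned in this thread is interpreted; any other |cs1-dates= value makes the sketch return None, meaning "leave the date format alone".

```python
import re
from typing import Optional

# Approximates the listed template-name patterns, case-insensitively (assumed regex).
DATE_TEMPLATE = re.compile(
    r"\{\{\s*(?:use\s*)?(dmy|mdy)(?:\s+dates)?\s*(\|[^{}]*)?\}\}",
    re.IGNORECASE,
)
CS1_DATES = re.compile(r"\|\s*cs1-dates\s*=\s*([^|]*)", re.IGNORECASE)


def citation_date_style(wikitext: str) -> Optional[str]:
    """Return 'dmy', 'mdy' or 'ymd', or None when the format should not be touched."""
    match = DATE_TEMPLATE.search(wikitext)
    if not match:
        return None
    override = CS1_DATES.search(match.group(2) or "")
    if override:
        # |cs1-dates= takes precedence over the template name.
        value = override.group(1).strip().lower()
        return "ymd" if value == "y" else None
    return match.group(1).lower()


print(citation_date_style("{{Use dmy dates|date=August 2020|cs1-dates=y}}"))
print(citation_date_style("{{use DMY dates}}"))
```

Returning None for unrecognised |cs1-dates= values follows the conservative rule stated above: if the setting cannot be interpreted, the date format is not changed at all.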

Regular expression failure

Status: {{fixed}}
Reported by: Whywhenwhohow (talk) 03:41, 28 August 2020 (UTC)

What happens: Regular expression failure
Relevant diffs/links: https://citations.toolforge.org/process_page.php?slow=on&edit=webform&page=Ranitidine&cat=
We can't proceed until: Feedback from maintainers

Fixed on the page: https://en.wikipedia.org/w/index.php?title=Ranitidine&type=revision&diff=975428346&oldid=975367091 Also, added some debug output to the bot so that you can find these yourself. AManWithNoPlan (talk) 13:13, 28 August 2020 (UTC)

ResearchGate is not a publisher

Status: {{fixed}}
Reported by: Nemo 13:58, 28 August 2020 (UTC)

What happens: Nothing
What should happen: special:diff/975434945
We can't proceed until: Feedback from maintainers

Researchgate.net and ResearchGate, wikilinked or not, and of course case-insensitive. AManWithNoPlan (talk) 15:20, 28 August 2020 (UTC)

https://github.com/ms609/citation-bot/pull/3433 soon. AManWithNoPlan (talk) 15:25, 28 August 2020 (UTC)

Adds journal=Report to cite book template

Status: {{fixed}}
Reported by: Whywhenwhohow (talk) 18:22, 29 August 2020 (UTC)

What happens: The bot adds journal=Report to a cite book
Relevant diffs/links: https://en.wikipedia.org/w/index.php?title=Ledipasvir/sofosbuvir&diff=975654426&oldid=975654215
We can't proceed until: Feedback from maintainers

> Checking AdsAbs database no record retrieved.
  + Adding journal: Report

> Remedial work to clean up templates

  ! Citation should probably not have journal = Report as well as chapter / ISBN  9789241209946

https://github.com/ms609/citation-bot/pull/3439 should be live soon. AManWithNoPlan (talk) 19:17, 29 August 2020 (UTC)

Fix biorxiv parameter

Status: {{fixed}}
Reported by: Headbomb {t · c · p · b} 13:58, 31 August 2020 (UTC)

What should happen: [16]
We can't proceed until: Feedback from maintainers

https://github.com/ms609/citation-bot/pull/3466 AManWithNoPlan (talk) 23:29, 31 August 2020 (UTC)

What is the point of pmid, bibcode?

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

What is the point of adding entries pmid and bibcode to references? The DOI already provides all information needed to locate the entry. --Jorge Stolfi (talk) 15:01, 1 September 2020 (UTC)

Not all papers have DOIs. Also bibcode links often contain free full versions and additional bibliographic information (like how often something is cited, related publications), and PMIDs are most useful to health professionals and will have supplementary bibliographic information (much like bibcodes do). DOIs also sometimes go bad, and it's good to have backup links to confirm what is being cited. Headbomb {t · c · p · b} 15:26, 1 September 2020 (UTC)

Also since this does not concern Citation bot, this is {{not a bug}}. Further discussion about the purpose of identifiers can continue at Help talk:CS1 if you want.

The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Wiki-linked book titles are needlessly piped

Status: {{fixed}}
Reported by: XOR'easter (talk) 18:36, 1 September 2020 (UTC)

What happens: wiki-linked book titles are needlessly piped
What should happen: nothing
Relevant diffs/links: see my reverts here, here, and here
We can't proceed until: Feedback from maintainers

Thanks. Will be fixed soon. AManWithNoPlan (talk) 21:47, 1 September 2020 (UTC)

also added code that will fix these when it finds them - there is at least one other bot that does this already. AManWithNoPlan (talk) 02:07, 2 September 2020 (UTC)

Bibcode lookup issues

Status: {{notabug}}
Reported by: Lithopsian (talk) 18:41, 1 September 2020 (UTC)

What happens: The Bot fails to add bibcode or arxiv fields to citations
What should happen: The bot normally (used to) add the bibcode and arxiv fields to citations that have them, for example when looking up a citation by doi. This isn't happening now. Probably related, citations with a bibcode but no doi are not expanded at all although there is no warning from the bot.
Relevant diffs/links: Try at User:Lithopsian/sandbox or GCIRS 16SW (reference 5)
We can't proceed until: Feedback from maintainers

The bot has used up its allocation of bibcodes for the time period. Should start working again soon. AManWithNoPlan (talk) 21:45, 1 September 2020 (UTC)

|first= and |author-link=

Status: {{fixed}}
Reported by: Trappist the monk (talk) 00:25, 2 September 2020 (UTC)

What happens: leaves behind wikilinks in |first= when adding |author-link=
Relevant diffs/links: diff
We can't proceed until: Feedback from maintainers

https://github.com/ms609/citation-bot/pull/3472 AManWithNoPlan (talk) 01:37, 2 September 2020 (UTC)
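
The cleanup described in this report can be sketched roughly as follows. This is only an illustration in Python, not the code from the pull request above; the function names `strip_wikilinks` and `add_author_link` are hypothetical, and the sketch assumes the template parameters are available as a plain dictionary.

```python
import re

# Matches [[Target]] and [[Target|Label]]; group 1 is the displayed text.
WIKILINK = re.compile(r"\[\[(?:[^|\]]*\|)?([^\]]*)\]\]")


def strip_wikilinks(value: str) -> str:
    """Replace each wikilink in value with its displayed text."""
    return WIKILINK.sub(r"\1", value)


def add_author_link(params: dict, link: str) -> dict:
    """Return a copy of params with |author-link= set and |first= unlinked.

    Hypothetical helper: the point is only that adding |author-link=
    and removing the wikilink from |first= happen in the same step.
    """
    updated = dict(params)
    updated["author-link"] = link
    if "first" in updated:
        updated["first"] = strip_wikilinks(updated["first"])
    return updated
```

With this sketch, `add_author_link({"first": "[[Jane Doe|Jane]]", "last": "Doe"}, "Jane Doe")` leaves `|first=Jane` with no stray brackets.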

Adds journal=Report to cite book template

Status: {{fixed}}
Reported by: Whywhenwhohow (talk) 02:26, 2 September 2020 (UTC)

Relevant diffs/links: https://en.wikipedia.org/w/index.php?title=Ledipasvir/sofosbuvir&diff=976269033&oldid=976093612
We can't proceed until: Feedback from maintainers

It would be useful for the bot to display its version number and/or build date in its output when run via the web UI.

This bug was discussed and closed at https://en.wikipedia.org/wiki/User_talk:Citation_bot/Archive_22#Adds_journal=Report_to_cite_book_template but still appears to be a bug.

>Checking CrossRef database for doi. 
>Searching PubMed...  nothing found.
>Checking AdsAbs database no record retrieved.
  +Adding journal: Report

>Remedial work to clean up templates

  !Citation should probably not have journal = Report as well as chapter / ISBN  9789241209946 
Written to Ledipasvir/sofosbuvir

Fixing at a much deeper level this time to catch them all. There is no simple way to display a version with git in any meaningful manner. https://github.com/ms609/citation-bot/pull/3473 AManWithNoPlan (talk) 12:22, 2 September 2020 (UTC)

The Canadian Catholic Historical Association published a journal with the title Report 🤪 I think we can live with rejecting that. AManWithNoPlan (talk) 14:33, 2 September 2020 (UTC)

Changed hyphenated page number to dashed page range.

Status: {{fixed}} for |page=
Reported by: User-duck (talk) 00:16, 2 September 2020 (UTC)

What happens: |page=3-159 was changed to |page=3–159.
What should happen: A single hyphenated page should not be changed.
Relevant diffs/links: Iroquois
We can't proceed until: Feedback from maintainers

I realize one of the common mistakes is to use a hyphen in a range. The bot should change |pages=3-159 to |pages=3–159. User-duck (talk) 00:16, 2 September 2020 (UTC)

99.9% of the time the change is correct. Also, it is the kind of thing that human editors do all the time. Lastly, this change actually changes the wiki text to match what is displayed. People get this wrong often enough that the bots page and the citation template docs point this out too. AManWithNoPlan (talk) 01:11, 2 September 2020 (UTC)

99.9 percent is very far from good enough for a bot (a machine that can be programmed to be deterministic instead of doing guesswork). Such errors introduced by bots are difficult to find and therefore harmful. Don't let the bot carry out such changes unless you are sure the bot has actually spotted an error and the edit is correct ("sure" = 100%, because you have actually checked the citation). An alternative could be to collect such occurrences and publish a list of them for humans to look over. Or add an HTML comment so that later human editors editing the article find the note and can have an extra eye on this. But don't let the bot introduce such changes itself.

Regarding humans doing similar mistakes, yes, this happens, but they don't hammer their mistakes down into thousands of articles in no time, like bots do. This is why a mistake done by a human may be annoying, but the same mistake done by a bot is harmful and not acceptable.

--Matthiaspaul (talk) 11:34, 2 September 2020 (UTC)

99.9% is a phenomenal success rate. All AWB bots make those changes with their genfixes. No reason why Citation Bot should be the special exception here. Plus, the CS1 documentation Help:CS1#Pages is clear that hyphenated pages are the exception, rather than the norm, and that people should use {{hyphen}} to indicate that this is intentional, and not to be converted to an ndash. Headbomb {t · c · p · b} 14:11, 2 September 2020 (UTC)

Yes, 99.9% is a phenomenal success rate, but I doubt Citation Bot is that successful because I doubt it detects 1000 different errors. In any case, the success rate can be increased.

In this case, when the bot decides to make the change from 'hyphen' to 'en dash' it assumes |page= is a mistake. In this case, it should also change |page= to |pages=.

Maybe the bots should use {{en dash}} or – when they make the change. It makes more sense for a bot to add extra characters than a human.

Finally, I did not imply that "Citation Bot should be the special exception". This is where I discovered the error. Of course, all the bots should be fixed.

— User-duck (talk) 15:38, 3 September 2020 (UTC)

The statement "Lastly, this change actually changes the wiki text to match what is displayed." confused me until I found these tidbits: (It took me a while to find the source code.)

page: The number of a single page in the source that supports the content. Use either |page= or |pages=, but not both. Displays preceded by p. unless |nopp=y. If hyphenated, use {{hyphen}} to indicate this is intentional (e.g. |page=3{{hyphen}}12), otherwise several editors and semi-automated tools will assume this was a misuse of the parameter to indicate a page range and will convert |page=3-12 to |pages=3{{ndash}}12.
OR: pages: A range of pages in the source that supports the content. Use either |page= or |pages=, but not both. Separate using an en dash (–); separate non-sequential pages with a comma (,); do not use to indicate the total number of pages in the source. Displays preceded by pp. unless |nopp=y.
Hyphens are automatically converted to en dashes; if hyphens are appropriate because individual page numbers contain hyphens, for example: pp. 3-1–3-15, use double parentheses to tell the template to display the value of |pages= without processing it, and use {{hyphen}} to indicate to editors that a hyphen is really intended: |pages=((3{{hyphen}}1{{ndash}}3{{hyphen}}15)). Alternatively, use |at=, like this: |at=pp. 3-1–3-15.

I do not remember reading about this use of {{hyphen}} and {{ndash}}. But it has been a while since I have read the cite templates' documentation completely.

Am I correct that the cite templates will convert |pages=3–1–3–15 to pp. 3–1–3–15 (pp. 3–1–3–15). Which is obviously wrong.

It is frustrating that I need to incorporate two different work-arounds for the templates and editors. |page=3-12 and |pages=3–1–3–15 are not ambiguous. (Unless |page= and |pages= are aliases for the same parameter.)

It is gratifying to know that some tool did translate |page=3-12 to |pages=3{{ndash}}12. Or at least a documentation editor thought they should. (And the Citation Bot does not.)

Finally, what is done with |pages=325? This is obviously wrong and from my experience a common misuse of |pages= for the total number of pages.

— User-duck (talk) 17:08, 3 September 2020 (UTC)

(edit-conflict) If AWB makes the same mistakes that's not an excuse but a reason to either improve the tools or ditch them. I know, we all are volunteers here, but that should not keep us from applying professional standards to the work we do. Citation Bot's many questionable edits are an ongoing community-wide annoyance and distraction from actual article work. It is such a waste of time to be forced to clean up the mess this bot creates all over the place. I can tolerate an occasional glitch, if it will be fixed soon (I'm even willing to help), but a bot failing in one go and its operators/maintainers/users trying to defend its weaknesses by denying the errors it creates cannot be accepted. That's wasting even more precious time and energy. As sad as it is, at the present low success rate, the project would be better off without this bot. Therefore, if this bot should have any future in Wikipedia, it must be significantly improved in two areas: First, it must not carry out any edits for which there is no broad community-consensus. Second, it must not carry out any edits based on guesswork or likelihoods or assumptions instead of verified facts. Also, the attitude must change to a conservative approach about what kind of edits a bot is realistically able to carry out reliably and what is carrying a risk of messing up something, and to always stay on the safe side. If that can't be guaranteed, don't make the edit.

It is perfectly fine for a bot or other tool to assist humans in detecting, collecting or marking spots which may require careful further investigation by a human. Even heuristics can be used for this with good success. Machines can be very good in screening huge amounts of data in no time. It is also perfectly fine for a bot to carry out "deterministic" actions for as long as they are backed up by consensus, not only by what a few militant citation warriors want to enforce as their citation standard. So, in this example, it would be okay if the bot, running into what could be a valid page or incorrectly a page range in a single |page= parameter, would actually retrieve the cited document and check the type of page numbering used in there and see if the page number exists or not, or, simpler, to ask a human to check the page numbering. If the bot is not capable of determining this with 100% reliability, it could still leave an HTML comment in the citation so that later human editors are alerted to the possible situation and can check the facts. So, the bot can still be useful, even if it does not carry out such edits by itself. It is also fine for a bot to add, e.g., an identifier or other missing information that can be retrieved with 100% accuracy (but it should not use unreliable channels to retrieve such information). What is not acceptable is to base edits on guesswork and likelihoods instead of actually verifying it in the source. That's harmful and must stop. Bots exist to assist humans, not the other way around. --Matthiaspaul (talk) 17:54, 3 September 2020 (UTC)

|page= vs. |pages=. Grrrr. These template citations are like contradiction wrapped in an enigma. Coming soon: https://github.com/ms609/citation-bot/pull/3474 AManWithNoPlan (talk) 19:08, 3 September 2020 (UTC)
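
The |page= / |pages= rule quoted from the CS1 documentation in this thread can be sketched as below. This is a minimal Python illustration of the documented behaviour only, not Citation bot's actual code (see the pull request for that); the function name `normalize_pages` is hypothetical, and the sketch emits a literal en dash rather than {{ndash}}.

```python
import re
from typing import Tuple

EN_DASH = "\u2013"


def normalize_pages(name: str, value: str) -> Tuple[str, str]:
    """Apply the documented hyphen rule to a |page= or |pages= value.

    Returns the (possibly renamed) parameter and its (possibly changed) value.
    """
    # Values flagged as intentional are never touched:
    # {{hyphen}} marks a real hyphen, ((...)) disables processing.
    if "{{hyphen}}" in value:
        return name, value
    if value.startswith("((") and value.endswith("))"):
        return name, value

    # Only a bare "digits-digits" value is assumed to be a page range.
    match = re.fullmatch(r"(\d+)\s*-\s*(\d+)", value)
    if match is None:
        return name, value

    first, last = match.groups()
    # A range belongs in |pages=, so |page= is renamed along with the dash fix.
    return "pages", first + EN_DASH + last
```

Under these assumptions |page=3-12 becomes |pages=3–12, while |page=3{{hyphen}}12 and |pages=((3{{hyphen}}1{{ndash}}3{{hyphen}}15)) are returned unchanged. The sketch makes no attempt to decide whether a hyphenated value such as 3-159 is really a single page, which is the point disputed above.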

Please do not convert plain references to Cite templates

Please DO NOT convert plain references to {{cite}} templates, as was done here. They are worse in every respect -- length, readability of the source and produced text, ease of entering and editing (especially by new editors), bug resistance, ... and have NO redeeming features. In fact, Wikipedia would be immensely better if their uses were all SUBSTed and the templates deleted. And I believe that there was a WP policy that the conversion (either way) should NOT be done without a good reason.--Jorge Stolfi (talk) 14:54, 1 September 2020 (UTC)

You are heavily mistaken. Citation templates greatly facilitate the long-term maintenance of the Encyclopedia, and present the information in a consistent, uniform, way. That you can't think of a "redeeming feature" doesn't mean they aren't there. I count 5 style errors in your first citation alone

M. A. Casado-Rodriguez, M. Sanchez-Molina, A. Lucena-Serrano, C. Lucena-Serrano, B. Rodriguez-Gonalez, Manuel Algarra, Amelia Diaz, M. Valpuesta, J. M. Lopez-Romero, J. PerezJuste, and R. Contreras-Caceres (2016): "Synthesis of vinyl-terminated Au nanoprisms and nanooctahedra mediated by 3-butenoic acid: Direct Au@pNIPAM fabrication with improved SERS capabilities". Nanoscale, volume 2016, issue 8, pages 4557-4564. doi:10.1039/C5NR08054A

and 2 more in

David B. Bigley and Michael J. Clarke (1982): "Studies in decarboxylation. Part 14. The gas-phase decarboxylation of but-3-enoic acid and the intermediacy of isocrotonic (cis-but-2-enoic) acid in its isomerisation to crotonic (trans-but-2-enoic) acid". Journal of the Chemical Society, Perkin Transactions 2, volume 2, issue 1, pages 1-6. doi:10.1039/P29820000001

There are additional features beyond just internal consistency, but those usually kick in on bigger articles. Suffice to say that it is much, much easier to maintain citation templates than it is to maintain manual citations. Having CS1/2 templates will also emit COinS metadata, and several tools will only work if citation templates are used. Headbomb {t · c · p · b} 15:38, 1 September 2020 (UTC)

(1) There were indeed typos in those references (thanks for pointing that out), but they do not justify converting them to cite templates (which by itself would not fix the typos).
(2) Inconsistency in abbreviation of first names is not a "style error"! Giving the name in full when known (especially if that is how it appears on the article itself) causes no harm to anyone, and may be useful to identify the author and distinguish homonyms. The practice of abbreviating first names (and journal names, and writing "9(11)23-5" instead of "volume 9, issue 11, pages 23-25") was developed by journal publishers only to save paper; not because the God of Bibliography mandated it. Academics are used to these shorthands, but for the general reader of Wikipedia they are inscrutable hieroglyphs. That is one of the many reasons why the Cite templates are BAD.
(3) Insisting on an n-dash instead of a hyphen to separate page numbers is not only ridiculous finnickery (which is not espoused by many authors and publishers), but in fact goes against the spirit of Wikipedia: that editors should spend their time on contents rather than appearance. That is the reason, by the way, for its early option for straight quotes and apostrophes, instead of paired open-close quotes. Demanding or implying adherence to elaborate typographical standards discourages new editors and wastes the time of old ones, without bringing any measurable benefit to readers.
All the best, --Jorge Stolfi (talk) 18:09, 1 September 2020 (UTC)
(4) And it is not at all true that "cite templates are much, much easier to maintain". Quite the opposite! Just finding the year or title in a Cite template entry takes careful scanning of the whole entry.
From your comment, I infer that there are "projects" that intend to use the Wikipedia references as some sort of database, with query tools etc. Such a project would only have merit if it was explicitly defined, justified, and included in the official Wikipedia goals. You cannot demand that editors help such a project if they do not know about it (and if it has no visible benefit for them or for Wikipedia readers).
Actually I myself got tired of requesting that reference bodies be removed from the articles and placed in a separate unified database, so that entries do not have to be typed again and again -- like images are unified in Wikipedia Commons. I would support such a project (but not using the Cite templates as they are). Until then, please stop converting perfectly good references to Cite templates.
All the best, --Jorge Stolfi (talk) 18:26, 1 September 2020 (UTC)

This discussion IS about ongoing disruption from the citation bot during an RFC, and those who activate it, so please refrain from closing threads. I will now start over on the post I lost to edit conflict, and @RexxS and Salvio giuliano: SandyGeorgia (Talk) 15:57, 1 September 2020 (UTC)

Starting over on post lost to edit conflict when thread was prematurely closed.

This post IS about citation bot and the broad disruption caused by it and those who activate it.
Jorge Stolfi is correct about WP:CITEVAR, and existing style issues in a citation are not a reason to convert the entire article.
Because of the ongoing vagaries and problems associated with this bot, I regret having recently converted FA Tourette syndrome from manual citations to citation templates, as now I must deal with constant disruption.
As soon as Salvio giuliano unblocked the bot, and in spite of an ongoing RFC, the bot resumed removing free full link URLs, and installing non-free full link URLs, even after I followed the bot operators' advice to add inline comments. At this point, this is simply disruptive and vandalistic, and I should not have to continue correcting such issues, just because someone or some group are anxious to add yet another identifier to every citation, resulting in even more clutter for editors and readers. See disruptive edit at a Featured article here, and take note of inline comments and addition of a non-free URL. Nor should the bot operators be disrespecting CITEVAR as raised by Jorge Stolfi.
Salvio giuliano, is it time to reblock? SandyGeorgia (Talk) 16:06, 1 September 2020 (UTC)

- This has nothing to do with Citation Bot. This is all about me. I manually changed the page after fixing the bad doi. That was my bad. AManWithNoPlan (talk) 16:17, 1 September 2020 (UTC)
  - OK, thanks for stating that, but once again ... regular editors who just want to get some work done are having to deal with bot problems. What happened at the link I gave above for dementia with Lewy bodies? Do you agree I am justified now in simply reverting the lot, as it appears some operators can't help themselves, and I should not have to do continuous corrections? And by the way, who activated the bot at DLB, and how am I supposed to track that down, other than reporting here as earlier advised? SandyGeorgia (Talk) 16:20, 1 September 2020 (UTC)
    - The word "table" in the PMC url is now magic and the bot will see that. AManWithNoPlan (talk) 16:52, 1 September 2020 (UTC)
      - Thanks for whatever magic you did there, but three things. 1. GreenC below is saying we need to somehow flag these issues, which I did as instructed with the inline comment, and yet someone activated the bot and ignored the inline. 2. Besides deleting the URL to the Table, a non-free URL was added ... weird. 3. How can I tell who activated the bot there? Regards, SandyGeorgia (Talk) 18:14, 1 September 2020 (UTC)
        What non-free URL? I don't see any in your example, please be specific. Nemo 06:56, 2 September 2020 (UTC)
        SandyGeorgia, You can see who activated the bot by "Suggested by (username)" . Or in older versions it was "Activated by (username)". Redalert2fan (talk) 09:36, 2 September 2020 (UTC)

Since the non-template articles are few and far between (in the corpus of 6+ million) the onus is on those articles to flag somehow because citation bot is not alone. IABot, WaybackMedic and RefFill are only a few that come to mind that also convert to CS1|2, under some conditions. Like it or not CS1|2 has become a standard. A template flag such as {{nocs1}} could work or some other method. Automated tools need to be told what you want done, they can't magically determine a non-templated article vs. only a single cite that happened to be non-templated for no reason. -- GreenC 16:42, 1 September 2020 (UTC)

Indeed. Nobody is forced to write their references with templates, but at the same time nobody is forced to do without templates when a reference needs to be fixed. References go rotten very quickly, so they need frequent maintenance and no human editor can be expected to keep up with it: automation is the only way to respect the second pillar. Nemo 06:56, 2 September 2020 (UTC)

{{notabug}} but maybe a {{personalproblem}} of mine. AManWithNoPlan (talk) 12:18, 6 September 2020 (UTC)
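
GreenC's suggestion of an article-level opt-out flag could look something like the following sketch. Note that {{nocs1}} is only the hypothetical template name floated in this thread, not an existing template, and `may_convert_to_templates` is an illustrative name rather than a function in any of the tools mentioned.

```python
import re

# Hypothetical opt-out marker, as proposed in the discussion: {{nocs1}}
OPT_OUT = re.compile(r"\{\{\s*nocs1\s*\}\}", re.IGNORECASE)


def may_convert_to_templates(wikitext: str) -> bool:
    """Return False when the article carries the opt-out flag.

    An automated tool would call this before turning plain-text
    references into CS1|2 templates.
    """
    return OPT_OUT.search(wikitext) is None
```

A tool following this approach would leave manual citations alone on any page where the flag is present, and behave as before everywhere else.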

504 Gateway Time-out for citations.toolforge.org

Looks like something is wrong and I can't run the bot on any pages. Just times out. — Chris Capoccia 💬 15:43, 4 September 2020 (UTC)

I'm getting a 503 Service Not Available. Abductive (reasoning) 18:39, 4 September 2020 (UTC)
yep. that's all i'm getting now too. — Chris Capoccia 💬 22:19, 4 September 2020 (UTC)

Fixed now and working again. — Chris Capoccia 💬 02:20, 5 September 2020 (UTC)

- {{fixed}}, but I need to figure out why this occurs. AManWithNoPlan (talk) 12:20, 6 September 2020 (UTC)

Bot makes additional edits on second pass

How is it possible, as can be seen at Animal latrine, that the bot can find more things to fix only a few days later, with no intervening edits? Abductive (reasoning) 12:10, 6 September 2020 (UTC)

There are a few rare cases where this can happen, but 99%+ of the time, it is a database being down that is the cause. In this case, the bibcode database was down. AManWithNoPlan (talk) 12:22, 6 September 2020 (UTC)

Maybe the bot should let folks know when databases are down? And is it checking downed databases for each and every article, even if it just encountered the downed database? Might this be partly responsible for the slowness? Abductive (reasoning) 12:49, 6 September 2020 (UTC)

The edit summaries are often too long as they are. I think people would generally be annoyed and find it not useful to have edit summaries that said "...tune in next week to see how this edit ends up." AManWithNoPlan (talk) 13:34, 6 September 2020 (UTC)

Also, if you don't really need the bibcodes, you can disable slow mode: that will make the edits faster and more reliable. You'd also leave more of the query quota for users who actually care about the bibcodes (mostly astrophysics folks, I think). Nemo 17:26, 6 September 2020 (UTC)

Interesting. Abductive (reasoning) 21:57, 6 September 2020 (UTC)

{{notabug}}, but annoying. AManWithNoPlan (talk) 00:50, 9 September 2020 (UTC)
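
On the question of whether a downed database is re-checked for every article: one common way to avoid that is a short cool-down after a failure. The Python sketch below illustrates the idea only; it is not a description of how Citation bot behaves, and `SourceCooldown`, `lookup`, the `fetch` callback and the five-minute default are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, Optional


class SourceCooldown:
    """Remember which sources failed recently and skip them for a while."""

    def __init__(self, cooldown_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock
        self._down_since: Dict[str, float] = {}

    def is_available(self, source: str) -> bool:
        failed_at = self._down_since.get(source)
        if failed_at is None:
            return True
        if self.clock() - failed_at >= self.cooldown_seconds:
            # Cool-down elapsed: forget the failure and allow a retry.
            del self._down_since[source]
            return True
        return False

    def mark_down(self, source: str) -> None:
        self._down_since[source] = self.clock()


def lookup(source: str, query: str,
           fetch: Callable[[str], dict],
           cooldown: SourceCooldown) -> Optional[dict]:
    """Query a source unless it failed recently; return None when skipped."""
    if not cooldown.is_available(source):
        return None
    try:
        return fetch(query)
    except ConnectionError:
        cooldown.mark_down(source)
        return None
```

With a scheme like this, the first failed request to the bibcode database would cost time, and later articles in the same run would skip that source until the cool-down expires.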

url vs chapter-url again

Status: {{fixed}}
Reported by: Kanguole 11:20, 7 September 2020 (UTC)

What happens: The link shows two errors related to chapters in books:

In the first change, the URL added to |url= actually points at the chapter (and is equivalent to the one already given in |chapter-url=).
In the second change, the URL points to the whole book, so changing it from |url= to |chapter-url= is erroneous.

Relevant diffs/links: https://en.wikipedia.org/w/index.php?title=Japonic_languages&curid=501569&diff=977180142&oldid=973737202
We can't proceed until: Feedback from maintainers

It seems the first bug is still present (diff). Kanguole 17:09, 8 September 2020 (UTC)

Sorry, I missed that one. AManWithNoPlan (talk) 17:26, 8 September 2020 (UTC)

https://github.com/ms609/citation-bot/pull/3481

added volume and issue; left malformed alias number

Status: {{fixed}}
Reported by: Trappist the monk (talk) 23:45, 7 September 2020 (UTC)

What happens: |number= is an alias of |issue=
Relevant diffs/links: [17]
We can't proceed until: Feedback from maintainers