Wikipedia:Village pump (idea lab)

From Wikipedia, the free encyclopedia
The idea lab section of the village pump is a place where new ideas or suggestions on general Wikipedia issues can be incubated, for later submission for consensus discussion at Village pump (proposals). Try to be creative and positive when commenting on ideas.
Before commenting, note:

  • This page is not for consensus polling. Stalwart "Oppose" and "Support" comments generally have no place here. Instead, discuss ideas and suggest variations on them.
  • Wondering whether someone already had this idea? Search the archives below, and look through Wikipedia:Perennial proposals.

Discussions are automatically archived after remaining inactive for two weeks.


Brainstorming a COPYVIO-hunter bot

I'd like to propose the idea of a COPYVIO-hunter bot, but I'm not ready to make a specific Bot request yet, so I'd like to expose this idea here first to brainstorm it. Sometimes, copyright violations are discovered that have been present on Wikipedia for years. (The copyright-violating content at Barnabas#Alleged writings was added on 4 August 2014 and discovered 18 December 2023.) But for an alert Teahouse questioner two days ago, who knows when, if ever, this would have been discovered. That's worrisome.

We have some good tools out there, such as Earwig's detector, and my basic idea is to leverage that by building a bot around it, which would apply it to articles and either generate a report or apply the {{Copyvio}} template directly. A couple of additional bot tasks could streamline the human part of the investigation by finding the insertion point (Blame) and determining copy direction (IA search). There are questions of input, performance, scaling, and human factors, and likely others I haven't thought of. As far as input, ideally I'd like to see a hybrid or dual-channel input: a hopper with manual feed by editors (possibly semi-automated feed by other tools), and an automated input where the bot picks URLs based on some heuristic.

For performance, I launched Earwig with all three boxes checked, and it took 62 seconds to return results for Charles de Gaulle (174,627b) and 16 seconds for (randomly chosen) Junes Barny (5,563b). I'm pretty sure there are a lot more articles closer in size to the latter than the former, so let's say Earwig takes 30 seconds per search on average; multiplying that by {{NUMBEROFARTICLES}} gives us 6.43 years to search all of Wikipedia with a dumb, single-threaded bot with no ability to prune its input stack. (Of course, Wikipedia would be bigger six years later, but that gives us an idea.) Given that the Barnabas violation went undiscovered for nine years, six years is not so bad, as I see it. But not all articles are equal, and probably some pruning method could decrease the size of the input stack, or at least prioritize it towards articles more likely to have undiscovered violations.
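Spelled out, the back-of-envelope estimate above looks like this; the article count is an assumption implied by the 6.43-year figure, not an exact number:

```python
# Back-of-envelope check of the serial-scan estimate above.
# Both inputs are assumptions taken from the discussion: ~30 s per
# Earwig check on average, and a rough en.wiki article count.
SECONDS_PER_CHECK = 30
NUM_ARTICLES = 6_760_000           # approximate article count (assumption)
SECONDS_PER_YEAR = 365 * 24 * 3600

total_seconds = NUM_ARTICLES * SECONDS_PER_CHECK
years_single_threaded = total_seconds / SECONDS_PER_YEAR
print(round(years_single_threaded, 2))  # one dumb, single-threaded pass
```

With those inputs the single-threaded pass comes out to about 6.43 years, matching the figure in the comment.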

As far as scaling, I have no idea of server availability at WMF, but presumably there are some bot instruction pages somewhere for bot writers which address how many threads are optimal, and other factors that could scale up the processing for better throughput; maybe someone knows something about that. If we had six threads going against one input stack, that would reduce it to one year; it would be great to run it annually against the entire encyclopedia.
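A minimal sketch of the multi-threaded version described above, with a stub standing in for a real Earwig call (no such callable API is assumed to exist; the score is faked):

```python
# Fan one input stack out over several worker threads.
from concurrent.futures import ThreadPoolExecutor

def earwig_check(title: str) -> tuple[str, float]:
    # Placeholder for one ~30 s detector run against `title`;
    # returns a fake "percent matched" score.
    return title, 12.5

def scan(titles: list[str], workers: int = 6) -> dict[str, float]:
    # Six workers against one stack cuts wall-clock time roughly 6x,
    # turning the 6.43-year serial estimate into about a year.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(earwig_check, titles))

print(scan(["Charles de Gaulle", "Junes Barny"]))
```

In practice the bottleneck would be the detector's own rate limits rather than local threads, but the shape of the loop is the same.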

For human factors, I'm thinking about the increased number of articles tagged with copy violations, and the additional load on admins that would inevitably result. There are currently 17 articles tagged with the {{Copyvio}} template. I wanted to provide some estimate of activity at Wikipedia:Copyright problems to gauge current throughput, but I'm not so familiar with the page and was unable to do so. Inevitably, a bot would increase the load on admins (for WP:REVDEL) and other volunteers, and it would be helpful to gather some data about what would happen. Not sure if it's possible to project that, but maybe a stripped-down version of the bot, just wrapping Earwig and spitting out numbers on a test run of a week or two, might give us some idea. I'm guessing that in operation it would generate a big backlog balloon initially, based on the first two decades of Wikipedia, but then its output would slow to some steady state; in any case, backlogs in other areas have been generated and attacked before with success.

Maybe a bot could somewhat reduce load per investigation by means of a handy output report that includes the Earwig percentage, maybe a brief excerpt of copied content, and so on. A couple of additional tasks could be defined to work off the output report: one task running Blame on the suspect articles to add the date of insertion to the report, and another reading IA snapshots to determine the direction of copy (i.e., is it a mirror, or a copyvio?). The result would be a report that ought to make the human part of the investigation considerably faster and more efficient per occurrence, which should at least somewhat offset the increased overall number of investigations.

Would love to hear any feedback on the technical aspects of this, as well as the human factors, and whether something like this should even be attempted. Thanks, Mathglot (talk) 02:00, 21 December 2023 (UTC)

Maybe a fourth task could be a disposition-triage task, and would act on the report output of previous tasks based on configurable values; something like: "if copy-direction = copyvio then if Earwig-pct > 85 then remove content from article and mark/categorize as revdel-needed; else if Earwig-pct < 20 then remove Copyvio template and mark report as handled; else leave for human assessment; else mark as mirror and handled." Mathglot (talk) 02:29, 21 December 2023 (UTC)
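That disposition-triage pseudocode could be written out as follows; the 85/20 thresholds are the configurable values from the comment, not validated numbers:

```python
# Triage a report row into a disposition, per the if/then sketch above.
def triage(copy_direction: str, earwig_pct: float) -> str:
    if copy_direction != "copyvio":
        return "mark-mirror-and-handled"
    if earwig_pct > 85:
        return "remove-content-and-mark-revdel-needed"
    if earwig_pct < 20:
        return "remove-copyvio-template-and-mark-handled"
    return "leave-for-human-assessment"
```

Only the middle band (20 to 85 percent) would reach a human, which is the point of the triage.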
EranBot currently sends every new edit through CopyPatrol if I understand it correctly, which essentially runs the edits through Turnitin/iThenticate. One could reduce the bot load by making it only look at articles that were created prior to August 2016.
@MusikAnimal (WMF) and Mathglot: I understand that the WMF is currently working on a replacement/re-vamp of CopyPatrol (i.e. Plagiabot). Is there a way to integrate a sort of "historical article detection" into a similar interface while re-using some of the code from the new Plagiabot, or is this something that you think would be better kept separate? — Red-tailed hawk (nest) 02:42, 21 December 2023 (UTC)
That's terrific news, which means, if I understand correctly, that whatever the scope of the problem is, at least it's not getting worse (assuming perfect precision from Plagiabot). So we only have to deal with the pre-whatever-year issue, and slowly chip away at it. (I am subscribed; no ping needed.) Mathglot (talk) 02:56, 21 December 2023 (UTC)
@MusikAnimal (WMF) I remember putting this up on phabricator somewhere (I think?), but would it be possible to provide a stable API to integrate CopyPatrol with various other editing/CVUA tools (specifically it would be great to be able to answer the question "What is the iThenticate score/URLs for a specific edit") Sohom (talk) 06:29, 21 December 2023 (UTC)
I've left MusikAnimal a comment on their WMF account talk page. It would be nice to hear from them on this. — Red-tailed hawk (nest) 17:45, 25 December 2023 (UTC)
I acknowledge it's Christmas, and many WMF staff are taking vacation/holiday, so it's fairly possible that we might not hear back for a week or so. — Red-tailed hawk (nest) 17:53, 25 December 2023 (UTC)
Thanks. I've added DNAU for 1 month, imagining that he may be on a nice, long winter vacation. Mathglot (talk) 21:24, 25 December 2023 (UTC)
An API for reviewing/unreviewing does exist, but it's undocumented right now. It also doesn't provide Access Control headers. I was working on an external-use API for CopyPatrol, but decided to hold off until the new version that uses Symfony was finished and deployed, since it won't be usable anyway until deployment has finished. Chlod (say hi!) 02:22, 26 December 2023 (UTC)
Thanks for your patience! I was "around" on my volunteer account, but haven't been checking this one until today (my first day back at work after the break).
It sounds like you all are asking for phab:T165951, which was declined last November. It can be re-opened if there's interest in it. However, it's worth noting CopyPatrol doesn't go through every edit, only those that meet certain criteria. I'll let @JJMC89 speak to that before I say something wrong ;)
As for an API, we can certainly add an endpoint to get the score for a given revision, if it exists in our database. That's simple to implement and won't require authentication. If you could file a bug, I can have that ready for when the new CopyPatrol goes live.
API endpoints that make changes to our db, such as reviewing/unreviewing, is another matter. Right now we authenticate with OAuth, so we'd need to somehow have clients go through that before they could use the endpoint. If @Chlod is interested in building this, I'll happily review it! :) Off the top of my head, I'm not sure how to go about implementing it. Alternatively, maybe we could provide all logged in users an API key? That would avoid clients having to login to CopyPatrol.
I don't think we want to permit requesting new scores for any arbitrary revision, at least not until our partnership with Turnitin is finalized. That should happen very soon, and then we'll know for sure if we can send out that many API requests. Some changes to JJMC89's bot would likely also need to be made. All in all, I'd say this feature request is not much more than a "maybe".
Also, in case no one's mentioned it yet, attempting to identify old copyvios is tricky because of the all-too-common WP:BACKWARDSCOPY issue. In some cases it may not be possible to ascertain which came first (Wikipedia or the source), so I'd be wary of attempting to automate this. MusikAnimal (WMF) (talk) 00:57, 3 January 2024 (UTC)
The new bot looks at edits made in the article and draft namespaces (0 and 118) to submit to turnitin and skips the following types of edits:
  • made by bots or users on the allow list
  • (revision) deleted before processing (rare unless catching up from a service outage)
  • rollbacks (MediaWiki native or Twinkle)
  • additions of < 500 characters after cleaning the wikitext.
Those that come back with more than a 50% match to a (non-allow listed) source are shown in CopyPatrol for human assessment.
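Those skip rules could be sketched as a filter like the one below; the edit record and its field names are illustrative, not the bot's real data model:

```python
# Decide whether an edit should be submitted to Turnitin, per the
# criteria listed above. Field names are hypothetical.
ALLOW_LIST = {"ExampleTrustedUser"}  # hypothetical allow list

def should_submit(edit: dict) -> bool:
    if edit["namespace"] not in (0, 118):   # article and draft namespaces only
        return False
    if edit["is_bot"] or edit["user"] in ALLOW_LIST:
        return False
    if edit["deleted"]:                     # (revision) deleted before processing
        return False
    if edit["is_rollback"]:                 # MediaWiki native or Twinkle
        return False
    if edit["chars_added"] < 500:           # after cleaning the wikitext
        return False
    return True
```

Edits that pass the filter and come back over the 50% match threshold are what end up in CopyPatrol.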
As a quick test, I added an endpoint to dump the data from the database for a specified revision.[1]
{
  "diff_id": 7275308,
  "lang": "en",
  "page_namespace": 0,
  "page_title": "Mahāyāna_Mahāparinirvāṇa_Sūtra",
  "project": "wikipedia",
  "rev_id": 1178398456,
  "rev_parent_id": 1178304407,
  "rev_timestamp": "Tue, 03 Oct 2023 12:16:34 GMT",
  "rev_user_text": "Javierfv1212",
  "sources": [
    {
      "description": "C. V. Jones. \"The Buddhist Self\", Walter de Gruyter GmbH, 2021",
      "percent": 50.3817,
      "source_id": 820817,
      "submission_id": "3084bde6-3b8b-488c-bf33-c8c27a73ae06",
      "url": "https://doi.org/10.1515/9780824886493"
    }
  ],
  "status": 0,
  "status_timestamp": "Tue, 03 Oct 2023 12:38:16 GMT",
  "status_user_text": null,
  "submission_id": "3084bde6-3b8b-488c-bf33-c8c27a73ae06"
}
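For illustration, a client tool could pull the best-matching source out of a payload like that one; the field names are copied from the dump above, and the payload here is abbreviated to the fields the function reads (the HTTP fetch itself is omitted):

```python
import json

# Abbreviated version of the dump shown above.
payload = json.loads("""
{
  "rev_id": 1178398456,
  "sources": [
    {"percent": 50.3817,
     "url": "https://doi.org/10.1515/9780824886493"}
  ]
}
""")

def top_source(data: dict):
    """Return (url, percent) of the best-matching source, or None."""
    sources = data.get("sources") or []
    if not sources:
        return None
    best = max(sources, key=lambda s: s["percent"])
    return best["url"], best["percent"]

print(top_source(payload))  # ('https://doi.org/10.1515/9780824886493', 50.3817)
```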
Please file a task so we can workshop the best way to design the API.
— JJMC89(T·C) 00:40, 4 January 2024 (UTC)
Filed as phab:T354324. This could be done on either the frontend or the backend; but it doesn't look like the backend source is publicly available (and API endpoints are a frontend task anyway, so it should probably live on the frontend). Chlod (say hi!) 10:03, 4 January 2024 (UTC)
I'd encourage making the repos public unless there is a reason for keeping them private. It will make things easier if someone goes inactive or if someone wants to submit a patch. –Novem Linguae (talk) 11:36, 4 January 2024 (UTC)
Hi, Mathglot! Great to hear more initiative on copyright cleanup tasks; they're always a big help. Someone brought up a related idea at WT:CCI a while back, and I responded with a few points that probably apply here too. I've got a cannula lodged in my hand right now, so I'll copy over what I said in that thread to avoid straining it. There wasn't a lot of back-and-forth on that thread anyway so it's probably easier if I just repost it here.

There was an idea previously floated around about having Turnitin or Earwig run on all revisions of past cases; I'd say this is probably the general idea when talking about automation for CCI cases. When it actually comes down to making it happen, though, it's a spider web of caveats and limitations that make it hard to get off the ground. Here's a more-organized explanation of my thoughts that I randomly collected in the past few months:

  • First is the issue of cost. There are around 508 thousand revisions left to check (as of May this year), but we only ever have a finite number of Earwig search-engine searches or Turnitin credits. Processing all of these automatically means we have to work with the WMF to get more credits for a one-time run-through, and we're not sure if we'll get decent results for a majority of those checks.
    • We could work around this by completely disabling search engine checks, as the thread you linked discussed, but this can either work for or against us based on the case. We could also work around this by only selecting a few cases which rely mostly on web sources or (for Turnitin) sources that we know would probably be indexed. This significantly cuts down on the amount of revisions to check. But then there's the next issue:
  • A lot of the older cases, especially the ones over three years old, start getting a lot of false positives. As article text remains on the wiki for long periods of time, SEO spam sites, academic documents, slideshows, and others start copying from Wikipedia. We filter out a lot of these already (like those in this list and a bunch of others), but we still hit them every once in a while and enough that it clogs up what reports we would otherwise get from Earwig/Turnitin.
    • A possible solution to this would be human intervention (which is more or less a given with something like this), where editors will double-check to see if a flagged revision actually is copied from somewhere, or if it's just a false positive. Human intervention will weed out false positives, but then it won't weed out the false negatives.
  • At the end of the day, copyvio checking is a really hard computer science problem that humanity is still in the middle of solving. False negatives (like when a revision flies under the radar because a source it copied from has died, or when the text has been paraphrased enough to make checkers think it's completely original text) will always be one of the biggest brick walls we face. False positives waste editor time, yes, but false negatives arguably take up more time, because we then need to re-check the case. It also wouldn't be a good look for us or the WMF if it turns out that we get a lot of false positives and negatives, since that could be perceived by the community as a waste of funds. Perhaps this is still something that could benefit from research and testing.
    — User:Chlod 13:02, 24 November 2023 (UTC)
This was for checking revisions on CCI pages, but the same applies to scanning every latest revision of all articles. It seems we've also been stretching Earwig to its limits recently; Earwig has been going down almost every day for the past two weeks (CommTech's UptimeRobot). Unfortunately, the Earwig logs are project-members-only, so I can't snoop in to figure out the cause by myself. But usually, we chalk this up to Earwig running out of Google API tokens. Would appreciate comments or ideas for the problems above; anything to ensure copyvios don't fly under the radar. Chlod (say hi!) 02:15, 26 December 2023 (UTC)
Chlod thanks much for this. A few questions or comments:
  • What's the 508,000 revisions? Is that just from CCI investigations?
  • In that same bullet, what cost are you talking about, processing time? And what did you mean by decent results, are you alluding to false +/- that you raised lower down?
    • As far as the workarounds, this sounds like roughly what I referred to as various pruning methods to shorten or reorder the input list.
  • Re false positives due to websites copying from Wikipedia, I don't see this as a major problem, and I addressed it in the 'direction of copy' comment involving IA checks. Maybe we'd have to negotiate with IA for a certain amount of search traffic per unit time, but as a fellow non-profit, and given the reasons for it, I can't imagine there wouldn't be some positive arrangement to come out of that. That would eliminate the need for human intervention in a proportion of cases; see the "if-then" pseudo-code at the end of my comment. The triage attempts to automate a lot of it and steer only the grey-area cases toward human intervention. It should also weed out most false negatives for the same reason, and I don't see the failure to reach 0% false negatives as a problem. There is always a problem identifying edge cases, even when humans are involved; if an automated solution improves our accuracy and throughput over what they were before, then it's worthwhile. One hundred percent accuracy and coverage are a goal, but they will never be attained, and that shouldn't stop us from incremental progress; even if automated processes fail to identify some sites for human intervention, we'll catch 'em, hopefully, on the next iteration of the processing.
  • "Really hard computer science problem": again, imho, we don't need to "solve" it, we just need to do a bit better than we were doing heretofore. Paraphrase will fall, imho, to better shingling turbocharged with some AI to recognize synonyms and linguistic transformations at some point in the not-nearly so distant future as I would've guessed a year ago. We needn't let the perfect be the enemy of the good, and I think we can do a lot of good now.
  • Earwig woes: is anyone maintaining it?
Thanks, Mathglot (talk) 00:02, 27 December 2023 (UTC)
  • Yep, the 508k revisions are those we have to check at CCI. That's from a dashboard by Firefly to see how much is left. It has its inaccuracies, but it's correct for most cases.
  • For the cost, it's actual monetary cost. From what I've heard (and what I assume from what I've heard), the WMF pays for the Google API and Turnitin credits, and that cost is pinned to how much we use Earwig and how many edits are checked by CopyPatrol, respectively. Attempting to request more credits for either needs discussion with the WMF, who then needs to discuss with Google/Turnitin. And yeah, the decent results is whether or not Earwig comes up with a false positive/negative.
    • Definitely; there's a lot of one-or-two-sentence stubs that don't really need checking. This could, of course, be filtered out, possibly with a lot more criteria for skipping than just that.
  • I'm wary about using the Internet Archive as a "source of truth" for dates. Though we do exactly that at CCI, it's probably not reliable enough to make broad judgements on whether a page is a copy or was copied from. If the pipeline goes Earwig → URL of likely match → Internet Archive, the data it provides in a report could be a false positive, either if the page changed URLs at some point (as I've seen happen with Sparknotes), a switch the Internet Archive may not recognize, or if it was never archived before (though this practically never happens for recently-added citations). Of course, it's best if this is tested empirically first.
    • This is a step in the right direction though. The downside of not using a system like this at all is that the direction checking will be manual, which then just pushes the investigation work back to the addressing user/administrator, and that could result in anywhere from zero (by luck) to a lot of false positives. But what has to be checked first is whether this will end up increasing processing time/workload for checking users.
  • Earwig's Copyvio Tool is actively maintained by The Earwig. The recent downtimes were shortly discussed in User talk:The Earwig § Copyvio tool is down; I only saw this now. Seems to have been from increased usage.
I agree; something is better than nothing. I'm mostly just worried about stretching the few editors working on copyvio even thinner by adding more work to do. We could balance this by encouraging more editors to help out at WP:CCP, but copyright cleanup really just has historically low participation rates. Chlod (say hi!) 05:14, 27 December 2023 (UTC)
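The copy-direction check discussed above could be reduced to a small heuristic. In this sketch the earliest-snapshot date is assumed to come from something like the Wayback Machine's CDX API, and the 30-day buffer is an arbitrary assumption meant to push close calls toward human review:

```python
from datetime import date

def copy_direction(wiki_insertion: date, earliest_snapshot) -> str:
    """Classify copy direction from when text landed on-wiki vs. the
    earliest Internet Archive snapshot of the suspect source."""
    if earliest_snapshot is None:       # never archived: can't tell
        return "needs-human"
    days = (wiki_insertion - earliest_snapshot).days
    if days > 30:
        return "likely-copyvio"         # source text predates the wiki text
    if days < -30:
        return "likely-mirror"          # source appeared after the wiki text
    return "needs-human"                # too close to call automatically
```

As noted above, URL changes and archiving gaps mean this can only ever triage, not decide, but it would keep the unambiguous cases away from human reviewers.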
Hey Chlod, thanks for pinging me here.
  • With Google's API, there's a hard daily limit of 10,000 queries per day, which costs US$50. The copyvio detector will make up to 8 queries per page (each query corresponds to a sentence or so of text, so that is chosen to strike a balance between performance and detection accuracy – longer articles would really benefit from more than 8 queries in many cases). So that works out to somewhere between 1,250 and 10,000 articles per day; let's say 2,000 on average. To be very clear, that's a limit built into Google's API terms. We can't get around it without a special agreement with Google, and everything I've heard from the WMF indicates we have no special agreement: we're paying the regular rate. Over ten years of running the copyvio detector, and despite multiple people asking, I've never managed to make the right connections with the right people at Google to get a special agreement (or the WMF hasn't, and IMO it's really them who should be doing that instead of me).
  • Just bashing the numbers out, checking 500,000 pages without a special agreement with Google would cost $12,500 and take at least 8 months (again assuming 5 queries/page).
  • The search engine is really the limiting factor here, hence my emphasizing it. Compute cost is much cheaper and we could use WMCloud to parallelize this more effectively if the daily limits weren't so severe.
  • Recent issues aren't related to using up all of our Google API credits but mostly due to my own poor software engineering decisions ten years ago. Sometimes it's due to unauthorized bot traffic that needs to be identified and blocked, but in this case I haven't noticed any. There's an ongoing project to improve performance, but no timeline for when it will be ready, unfortunately.
— The Earwig (talk) 14:53, 27 December 2023 (UTC)
Thanks for these detailed explanations. Just noting that I've started User:Novem Linguae/Essays/Copyvio detectors to try to document all these copyright tools and their nuances. Seems like every couple months this comes up and I've forgotten all the details since the last discussion, so maybe an essay will help me remember it :) –Novem Linguae (talk) 12:13, 31 December 2023 (UTC)
@The Earwig: Anywhere I could possibly help with the copyvio detector's uptime? It's also affecting the NPP workflow at times, as the copyvio detector is part of checks to be done when patrolling. Chlod (say hi!) 13:56, 4 January 2024 (UTC)
@Chlod: Thanks for offering to help! I've given you maintainer access to the tool, and you have permission to restart it when needed. This is the case if the request backlog gets full (a log message "uWSGI listen queue of socket" is printed to uwsgi.log over several minutes) but occasional slowness doesn't necessarily mean the queue is full and needs to be cleared. It's good for us to have maintainers across different timezones. But beyond the occasional restarts, addressing the underlying issue is complicated and not something I expect help with. As hinted above, a backend rewrite is in progress to improve performance. — The Earwig (talk) 16:41, 4 January 2024 (UTC)
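The quota arithmetic in those bullets, reproduced (US$50 per 10,000 queries, a 10,000-query daily cap, 500,000 pages, 5 queries per page on average):

```python
# Reproduce the Google API quota math from the bullets above.
COST_PER_10K = 50        # US$ per 10,000 queries, the regular rate cited
DAILY_LIMIT = 10_000     # hard per-day query cap
PAGES = 500_000
QUERIES_PER_PAGE = 5     # average implied by "2,000 articles per day"

total_queries = PAGES * QUERIES_PER_PAGE            # 2,500,000 queries
total_cost = total_queries / 10_000 * COST_PER_10K  # US$12,500
days = total_queries / DAILY_LIMIT                  # 250 days, ~8.2 months
print(total_cost, days)
```

That 250-day floor exists regardless of compute, which is why the search quota, not CPU, is the limiting factor.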
As I understand it, the issues with applying Earwig's copyvio tool to more pages (and the reason it always takes a million years to run) have nothing to do with computational power or programming skill on our part, but rather because Google search, which is a quite critical part of this software working, has deliberately decided to fuck us sideways on search queries.
Well, it's not clear: it could be that, or it could be that nobody from Wikipedia or from the WMF has succeeded in figuring out how to ask them for a special dispensation.
At any rate, we have a rather low quota, and it would cost tens of thousands of dollars to make it higher, and we do not get any special dispensation, although I guess they are perfectly fine making millions of dollars from reusing our content in their own knowledge panels lol. jp×g🗯️ 11:25, 28 December 2023 (UTC)
Maybe @NPerry (WMF): might give more insight as to why the Wikimedia Foundation has not been able to get resources for copyright detection with Google search? AFAIR, last year, they were involved with managing Wikimedia's partnership with Google. Sohom (talk) 11:54, 28 December 2023 (UTC)
  • I'm not active in copyvio detection work, so take what I say as an outsider's perspective. Overall, copyvio detection on Wikipedia seems like an area that's struggling despite the heroic efforts of those working on it — multi-year backlogs at places like CCI are indicative of a system that's just not working. Bot assistance is our best hope of changing that dynamic on a systemic level, so I think it's a fruitful avenue to pursue. It'd be complex on a level greater even than ClueBotNG, but if successful it'd be similarly impactful.
    One thing to perhaps think about is the difference between old copyvios and newly added ones. My vague understanding is that a lot of the difficulty/pain comes from years-old insertions, which have since been built upon, necessitating removal of large chunks of an article. If it'd be simpler to build a bot that only checks/fixes new contributions, then perhaps that'd be a good place to start. If it could sufficiently stem the tide, perhaps it'd lead to a situation similar to what we have with non-notable articles/deficient FAs today, where there's a bunch of stuff in the past to clean up, but ultimately it's a finite backlog with few new entries being added, creating hope we'll someday get through it (cf. WP:SWEEP).
    Hope that's helpful, and good luck with this work! {{u|Sdkb}}talk 00:03, 3 January 2024 (UTC)
  • (Possible overlap with part of the above) - we have a copyright flagging system already (see log), and allowing more bots to flag is fairly easy to do. Like many have said, building a reliable algorithm for doing the actual checking is a "hard" problem. One problem that came up with prior third-party solutions like TURNITIN is that these companies wanted to reuse Wikipedia content without honoring the licensing requirements (e.g. we send them some text, they store it, then they re-serve it to other people without attribution). — xaosflux Talk 17:00, 4 January 2024 (UTC)
  • My 2c is it makes more sense for the WMF to spend some cents on a copyright monitoring service, like every other content publisher on the internet, rather than have volunteers make one from scratch. Levivich (talk) 05:54, 5 February 2024 (UTC)
    We presently do that to some extent with iThenticate, from what has been mentioned above. MusikAnimal mentioned something about a Turnitin partnership being negotiated above. I do wonder what the final details would be. — Red-tailed hawk (nest) 06:06, 5 February 2024 (UTC)
    That is still in the final stages. I'll have more information to share soon. To set expectations, I highly doubt we can do any sort of large-scale checks for copyvios on arbitrary revisions/articles. However, I am confident we'll have a reliable service that automates copyvio checks on any new content being added, via toolforge:copypatrol. In addition, doing something like phab:T165951 is something I think we could look into. MusikAnimal (WMF) (talk) 16:16, 14 February 2024 (UTC)

Would be nice. But how would it differentiate stuff that is mirroring Wikipedia? North8000 (talk) 20:46, 13 February 2024 (UTC)

Using the exclusion list at User:EarwigBot/Copyvios/Exclusions works pretty well for the Earwig copyvio detector. –Novem Linguae (talk) 00:16, 14 February 2024 (UTC)
Either the HTML creation date (often in the HTML <head>) or the archival copy date. Nothing is a perfect solution here; this is about improving on what we have now, not trying to claim 100% effectiveness in every possible case. Also, a tool doesn't have to make the call in every case; it's enough to triage into very likely okay, very likely not okay, and a middle tier which requires human intervention. If the tool does nothing but handle low-hanging fruit, that's that much less for human editors to have to do. Mathglot (talk) 00:23, 14 February 2024 (UTC)

Workshop: draftifying

Firstly, Jimbo Wales agrees with me. Well, not with me directly. But with the gist of this argument, and the argument behind unreferenced-PROD. He wrote in 2006:

I really want to encourage a much stronger culture which says: it is better to have no information, than to have information like this, with no sources. Any editor who removes such things, and refuses to allow it back without an actual and appropriate source, should be the recipient of a barnstar.

[1]

Anyways...

As a New Page Patroller, I frequently draftify unsourced articles.

Not infrequently, the creator of the article moves the draft back to mainspace, or re-creates it with the same content. The topic is frequently fringe and difficult to verify, but not necessarily PRODable or AfD'able.

What to do? There's an unsourced "article" in mainspace. It should be in draftspace, where it should be improved by the creator. It is unfit for mainspace. As one of my favourite essays points out, unsourced content is essentially digital graffiti and should be removed. The WP:BURDEN is on the creator to add references to their claims.

It isn't 2005 anymore. We shouldn't have new unsourced articles being created. They do get created, but are usually PRODed or draftified by NPPers.

Per WP:DRAFTIFY, we aren't allowed to re-draftify an article. Because of this clause, draftifying is essentially useless. All the creator has to do is move it back.

An analogy (or possibly a parable):

Someone dumps a pile of garbage on the sidewalk. There might be some re-usable or recyclable items in there, but it's hard to tell. Per municipal policy, a street cleaner takes it to the waste-dumper's house. It's their garbage.
Instead of throwing it out normally, or sorting out re-usable or recyclable stuff, the waste-dumper takes their garbage out of the facility and puts it right back onto the street. The street cleaner finds it again. Municipal policy states that the cleaner should either sort through it themself or ignore it. Once they have finished sorting, they should keep the recyclable items and take the rest to a waste-management facility, where they will have to write a report detailing why they think the garbage should be destroyed. The waste-management facility is called AFD.
This is clearly nonsense. Why should the street cleaner have to sort through someone else's garbage?

I would like to propose disallowing draftified articles being moved back to mainspace if the problem for which the "article" was draftified has not been fixed. Let the street cleaner take the garbage back to the waste-dumper's house. 🌺 Cremastra (talk) 15:34, 13 January 2024 (UTC)

Notified: WT:NPP, WT:Draft, WT:AFD.
🌺 Cremastra (talk) 15:42, 13 January 2024 (UTC)Reply[reply]

References

  1. ^ Wales, Jimmy (2006-07-19). "insist on sources". WikiEN-l. Retrieved 2007-01-31.
  • A reminder for those who don't notice the brightly-coloured editnotice: this is the idea lab, so no !votes. 🌺 Cremastra (talk) 15:35, 13 January 2024 (UTC)Reply[reply]
  • I think we discussed this before on Discord, and the main highlighted point was: 'Users have the right to object to draftification and can move it back to the article space.' But it's good to see you here, searching for some additional ideas. – DreamRimmer (talk) 16:17, 13 January 2024 (UTC)Reply[reply]
  • When I check my Draftify log, most articles have been improved & returned to mainspace. The major challenge is the PROD/AfD process. In May 2023 I set up my PROD/AfD subpage to track articles. I am totally OK with reverts along with a reasonable explanation. It's a problem for un-explained reverts, and "junk/incomplete" articles remaining in mainspace. And I understand the goal is article improvements. Thanks for this discussion. Regards, JoeNMLC (talk) 16:44, 13 January 2024 (UTC)Reply[reply]
    Looking at one's draftify log is a good idea. Looking at my own, from August 2023, when I only draftified 8 articles:
    • One was redirected, after some confusing duplicated drafts/AfC-dodging and this related AfD.
    • Five remain in draftspace. Two of those drafts have been deleted because they were created by a sockpuppet.
    • One has been moved back to mainspace after being improved in draftspace, and looks pretty good.
    • One was re-created, unreferenced, in mainspace. It was unsuccessfully PRODed by a different user in October 2023; it has no references but is a list of sorts.
    🌺 Cremastra (talk) 16:54, 13 January 2024 (UTC)Reply[reply]
  • I hate that linked essay, and have found that in most cases, unreferenced prose is relatively easily verifiable in published sources that the original editor neglected to include.
    Having said that, I do think the current wording of WP:DRAFTOBJECT is overly strict. I don't think the same reviewer / patroller should draftify the same page more than once, even in the absence of improvements, but if multiple reviewers / patrollers think an article should be returned to draftspace for improvement, that no longer strikes me as "unilateral", and it is in fact the draft creator's moves to mainspace that are "unilateral", and the required next process should be AfC rather than AfD.
    The AfD problem is real, but the garbage analogy is inapt. Unreferenced articles are less "this is garbage" and more "someone didn't fill out the paperwork". (Also, unless you're very nosy, it's usually pretty difficult to determine whose garbage you've happened across littered in the public space, and no municipality I'm aware of requires street cleaners to sort waste streams on pickup, even if it is best practice. Typically, this duty falls on the people who work the recycle and hazmat streams at the transfer station or other facilities, with the acknowledgement that the landfill stream will often contain material that properly ought to be processed differently.) Folly Mox (talk) 18:01, 13 January 2024 (UTC)Reply[reply]
  • To 99% of people, having their article moved to draftspace is going to discourage them from ever improving it.★Trekker (talk) 22:16, 13 January 2024 (UTC)Reply[reply]
    Why so? Having it moved to draftspace is a chance for them to fix it without other editors swarming over it with cleanup tags and PROD templates and brightly-coloured banners. 🌺 Cremastra (talk) 22:42, 13 January 2024 (UTC)Reply[reply]
    I believe @WhatamIdoing has the specific numbers, but draftified articles have a dismal return-to-mainspace rate. Mach61 (talk) 01:26, 14 January 2024 (UTC)Reply[reply]
    Draftspace is where articles go to die, and we've known that for years. Steven Walling knows the original research on this best, and if you wanted to get more recent numbers, @Cryptic or someone else at Wikipedia:Request a query could probably tell you what percentage of pages in the Draft: namespace got deleted last year (e.g., created in January 2023 and deleted from the Draft: namespace since then).
    You can also estimate it from the logs. You can find the number of page moves into vs out of the Draft: space in Special:Log/move and the number of articles created and deleted in Special:RecentChanges. The numbers for the last couple of days look like roughly 120 articles created each day, 150 articles moved into the draftspace each day, 150 articles moved out of the draftspace each day, and 150 articles deleted each day. We put 270 articles in, and we deleted 150 of them. That's a 55% deletion rate. Ideally, you'd look at these numbers over the space of at least 7 full days, as there are definitely weekly trends in activity, and things like a holiday weekend, an important football game, a change in the activity level for even one key editor, etc., can throw the numbers off quite a bit. WhatamIdoing (talk) 22:48, 14 January 2024 (UTC)Reply[reply]
    But is all this necessarily bad? I believe in quality over quantity. 🌺 Cremastra (talk) 22:51, 14 January 2024 (UTC)Reply[reply]
    I believe in notable topics being allowed to have articles that have a chance to actually become better, which they do not in draftspace.★Trekker (talk) 21:58, 24 January 2024 (UTC)Reply[reply]
  • Issues I believe this proposal would have to first resolve in order to have any chance of gaining consensus: (1) There will probably be a dispute about whether the alleged problem with the article/draft actually existed in the first place. (2) There will probably be a dispute about whether the alleged problem with the article/draft was sufficiently serious to justify draftification. (3) There will probably be a dispute about whether the alleged problem with the article/draft has actually been fixed. In all three cases, the draftifier is not particularly unlikely to be completely on the wrong side of consensus. The fact that the draftifier believes or claims that a page is "garbage" does not mean that the page actually is garbage. To take the example given by the proposer, I have, over the course of many years, seen many articles tagged as "unreferenced", despite the fact that those articles very obviously did have references (presumably because the tagger just did not like the references in the article). I cannot imagine the community supporting the unilateral draftification, with no right of appeal, of articles, where there is a real dispute about the 'appropriateness' of the draftification. James500 (talk) 02:18, 14 January 2024 (UTC)Reply[reply]
  • I don't think this can pass. Judging from previous village pump discussions, about half of Wikipedians don't like draftspace, seeing it as a backdoor to deletion. The de facto options in situations where a very poor article (such as one with no sources) is WP:DRAFTOBJECTed are WP:AFD, or WP:TNT. Hope this helps. –Novem Linguae (talk) 03:58, 14 January 2024 (UTC)Reply[reply]
  • Unsourced articles are an easy target but fundamentally WP:DRAFTOBJECT is not about article content, it's about WP:CONSENSUS. If you think an article doesn't meet WP:V and therefore shouldn't be in main space, but another editor disagrees in good faith (i.e. by reverting your bold move to draft space), then you have to stop and talk about it. There's really no way around that. You can't just insist that you're right and the other editor has to satisfy you, because you're the New Page Reviewer. That's not "the encyclopedia that anyone can edit". Besides, I've seen NPPers wrongly identify articles as unsourced plenty of times, whether because they missed references that looked like something else, because a new editor was struggling to format their sources, or because they just didn't read it properly. Folly Mox makes a good point about multiple reviewers being involved above, but still, if multiple editors are involved in a dispute about where a page should be, we'd expect them to discuss it (e.g. at AfD), not get into a move war. – Joe (talk) 07:14, 14 January 2024 (UTC)Reply[reply]
  • Wasn't the "information like this" from the Jimbo quote something about two tech founders throwing pies at each other to settle a dispute? That probably doesn't apply to most of the articles we're talking about, which don't tend to involve unlikely stories about BLPs (the quote is from years before the creation of WP:BLP). A few thoughts:
    • The underlying assumption is that the article creator WP:OWNs the article. This is supposed to be a collaborative project, so why should we treat an unsourced article as "your garbage"? I disagree that unsourced content is always "garbage" or "graffiti", but why don't we think of it as "our" problem? New content is a gift to us and to the world. Some gifts are bigger or smaller, and some are better or worse, but the absence of a little blue clicky number doesn't make it garbage. (My own idea of garbage is misinformation and disinformation.)
    • The belief that an unsourced article is "unfit for mainspace" is not supported by any policy or guideline. It is the personal preference of a fraction of editors, but it's not one of our rules. If we want to build a system based on this preference, then that preference needs to be turned into an actual rule first.
    • I wonder how big this problem actually is. I checked the pages created during the last three days in the mainspace and draftspace, using the visual editor (because there's a tag that makes it easy to check for the addition of a new ref, but it's not available for the 2010 wikitext editor [yet?]). 40% of them were redirects, at least 45% had at least one ref tag added in the first version of the page, and the remaining small fraction either had a ref added later (example, example, example), or not at all (example article, example dab page), or it actually had refs but they weren't autodetected (example, example, example, and pinging User:ESanders (WMF) to see whether that's a bug in mw:EditCheck). This is overall not feeling like a serious problem. Most pages that are supposed to have refs (e.g., they're not dab pages) are already getting refs. In fact, having looked at this, I don't think I would draftify a new article if this were the only serious problem.
  • WhatamIdoing (talk) 23:41, 14 January 2024 (UTC)Reply[reply]
    Unsourced articles are definitely unfit for mainspace in this day and age, especially if they don't fall into the evergreen WP:NSPECIES, WP:NPOL and WP:NGEO spectrum. I personally prefer the AFD route to the draftification route; however, it still stands that unless improved, a completely unsourced article is no better than misinformation and disinformation. Sohom (talk) 15:42, 15 January 2024 (UTC)Reply[reply]
    @Sohom Datta, there is no policy or guideline that says all articles must cite at least one reliable source. Wikipedia:Notability explicitly says the opposite: what makes a subject notable is whether sources exist in the real world, not whether sources have been typed into the Wikipedia article. It is true that some individuals personally believe that an article without a source is unfit for mainspace, but that's a personal belief and is not supported by policy.
    BTW, the research on the draftspace indicates that if you want unsourced articles to get sources, you need to leave them in the mainspace. If your goal is to get them deleted with a minimum of fuss and bother, then you should put them in the draftspace. WhatamIdoing (talk) 17:27, 15 January 2024 (UTC)Reply[reply]
    I do agree that articles need to have at least one reliable source cited though. I think what you meant here is that this should not be acted retroactively. CactiStaccingCrane (talk) 17:29, 15 January 2024 (UTC)Reply[reply]
    My point is even smaller than that: Editors should not misrepresent the state of the actual rules by claiming that unsourced articles can't be, or shouldn't be, in the mainspace solely because they are unsourced. The correct (i.e., accurate and honest) process is:
    1. Rules explicitly do not require a source to be cited in a mainspace article.
    2. Get rules changed to require at least one source to be cited.
    3. Tell editors (especially newbies) that their new article is unacceptable because it does not comply with the new rule.
    The process that some editors are currently using is:
    1. Rules explicitly do not require a source to be cited in a mainspace article.
    2. Tell editors (especially newbies) that their new article is unacceptable because it does not meet my personal criteria, while pretending that my personal criteria are the actual rules.
    Whether the new rule is retroactive or not is not really a concern of mine. I am confident that it would eventually become retroactive even if it doesn't start that way. (That's exactly what happened with the rules for WP:BLPPROD: it started off as solely forward-looking, and became retroactive later.) What concerns me is editors claiming that the rules are X when the rules are actually not-X. Either change your claims or change the rules, but don't misrepresent the rules. WhatamIdoing (talk) 17:41, 15 January 2024 (UTC)Reply[reply]
    @WhatamIdoing I think you missed my point about an AFD. The fastest way to get sources added to an article in my experience (counterintuitively and unfortunately) is an articles for deletion/discussion, not letting it languish in mainspace (and definitely not draftspace). An AFD puts it on the radar of multiple wikiprojects, which are much more likely to provide reliable sourcing than I will ever be able to provide.
    If, even after two or three weeks of advertising, nobody (including the article creator) thinks the article is worth saving, that could/should indicate that the article is probably not notable at the current moment.
    Also, I agree that there currently exists no policy that prevents an editor from not including any sources in an article (theoretically). But at a much more practical level, it is not really fair to expect an editor with limited prior understanding of the subject matter to accurately evaluate an article's notability if they have absolutely zero starting points for their search for sources. Sohom (talk) 17:57, 15 January 2024 (UTC)Reply[reply]
    And yet we say that Wikipedia:Deletion is not cleanup, because it's uncollegial and anti-collaborative for an editor to demand that others drop everything they're doing because an article must be sourced this week, or it will be deleted. As you say, editors with limited prior understanding of the subject matter have difficulty accurately evaluating notability for those subjects – so they shouldn't be sending them to AFD in the first place. AFD is for articles that you genuinely believe to be non-notable, not articles you'd like someone else to improve right away.
    Permitting editors to use AFD to demand clean up of subjects they're unfamiliar with is also a source of systemic bias. We've had altogether too many cases of editors sending Asia- and Africa-related subjects off to AFD out of ignorance, thinking that WP:NEVERHEARDOFIT is a good enough excuse and that if other editors want to keep it, then they will cheerfully drop everything they're doing to provide sources. If nobody intervenes, we lose the articles. This is not okay. WhatamIdoing (talk) 18:07, 15 January 2024 (UTC)Reply[reply]
    I personally don't think it is uncollaborative to go "Hey, this article has no sources, and I could not find any based on a few Google searches, what should we do about this?" (which tends to be most AFDs these days). For all you know, it could be a radioactive piece of hallucinating ChatGPT junk (or other promotional garbage) which needs to be nuked out of orbit ASAP, or it could be documenting an important niche topic that few people have heard about which needs to be preserved. AFD is lot more collaborative than, "well, that's somebody else's problem I guess" and walking away. Sohom (talk) 18:28, 15 January 2024 (UTC)Reply[reply]
    Hey, this article has no sources, and I could not find any based on a few Google searches, what should we do about this ? That's something that should be said on the article's talk page, not in an AfD. In AfD, what should be said is: "Here are very good reasons why this article should be deleted". The two statements are not interchangeable. Sometimes they will address the same situation, but sometimes they won't. —Alalch E. 15:04, 17 January 2024 (UTC)Reply[reply]
    Do you mean that the creating-edit isn't tagged with "adds reference"? That does look suspicious. Am I still allowed to ask you to file a Phab task 🙂 ESanders (WMF) (talk) 18:31, 15 January 2024 (UTC)Reply[reply]
    If y'all end up creating a phab, could you subscribe @soda (me) as well :) Sohom (talk) 18:34, 15 January 2024 (UTC)Reply[reply]
    Ed, you can always ask... ;-) WhatamIdoing (talk) 14:32, 19 January 2024 (UTC)Reply[reply]
  • Oppose - The remedy here is AFD, not permanent banishment to AFC (followed by eventual deletion without discussion through G13). If you can't see a consensus to delete, please don't harass article creators. ~Kvng (talk) 13:15, 16 January 2024 (UTC)Reply[reply]
    Sigh. You didn't read the banner, or the edit notice, or my reminder, did you. 🌺 Cremastra (talk) 13:24, 16 January 2024 (UTC)Reply[reply]
    @Cremastra, No I clearly did not. Sorry. I'm not sure how to be constructive about this proposal. Does that mean I should say nothing? ~Kvng (talk) 02:14, 17 January 2024 (UTC)Reply[reply]
  • "The topic is frequently fringe, difficult to verify, but not necessarily PRODable or AfD'able." If it would be appropriate to boldly draftify an article, then it would appropriate (when contested without fixing the issue) to nominate it at AfD for draftification. As with a contested blank-to-redirect, for which the appropriate discussion venue is AfD per this request for comment, you don't need to request deletion in order to nominate an article at AfD. And if it is not nominated for the purpose of deletion, then a full WP:BEFORE inquiry about whether the subject is notable and so forth isn't applicable.
    I'd like to see the standard draftification messages more explicitly say that if an editor disagrees with the reasons for draftification, they can respond to the reasons for the move and ask (insist) that the article be restored to mainspace until there is a discussion to either delete it or make it a draft. SilverLocust 💬 19:56, 16 January 2024 (UTC)Reply[reply]
  • The topic is frequently fringe, difficult to verify, but not necessarily PRODable or AfD'able—why would it not be AfDable?—Alalch E. 15:01, 17 January 2024 (UTC)Reply[reply]
    I suspect that what's meant by "not AFD'able" is "it would probably not get deleted at AFD". WhatamIdoing (talk) 14:44, 19 January 2024 (UTC)Reply[reply]
    WhatamIdoing, when you write The belief that an unsourced article is "unfit for mainspace" is not supported by any policy or guideline, that seems to contradict our core content policy of Verifiability which says All quotations, and any material whose verifiability has been challenged or is likely to be challenged, must include an inline citation to a reliable source that directly supports the material. Emphasis added. If any editor acting in good faith says "I challenge the material in this particular unreferenced article, because it is unreferenced", does that not impose an immediate policy burden to provide references (citations) that verify the challenged material? Cullen328 (talk) 00:25, 20 January 2024 (UTC)Reply[reply]
    And since that hasn't happened? "I challenge every unsourced article just because they're unsourced" is not acceptable. Even then, issuing a WP:CHALLENGE doesn't make the material unfit for the mainspace. If it did, then {{citation needed}} would hide the text instead of leaving it there. WhatamIdoing (talk) 18:12, 26 January 2024 (UTC)Reply[reply]
    @WhatamIdoing and Cullen328: It would be great if {{citation needed}} hid the text! When someone found a reference, the template could be removed and the text could reappear! Like a more powerful version of {{citation needed span}}. Instead of this:
    It is a city on the planet Earth.[1] The city has a population of 300,320 as of 2019.[citation needed] It is 10km from City Y.[1]
    We could have:
    It is a city on the planet Earth.[1] (unsourced content—please add a reference) It is 10km from City Y.[1]
    Of course, that's an ugly version, but with templatestyles we could have much better CSS. Not having the tooltip rely on title= would be a start. (If you have external CSS, you can do it with something like <span class="tooltip" data-mouseover="mouseover text here"> in the HTML and CSS like this:
    .tooltip:hover::after {
        cursor: help;
        content: attr(data-mouseover);
        background-color: peru;
        z-index: 5;
        position: fixed;
        font-size: 15px;
        color: white;
        padding: 2px;
    }
    
    Or, at least, that works for me. Fundamentally a good idea.
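    As a rough illustration, that CSS rule can be tried out in a stand-alone page. This is a hypothetical sketch: only the .tooltip class, the data-mouseover attribute, and the example sentence come from the comments above; the surrounding page markup is invented for the demo.

    ```html
    <!-- Hypothetical demo page; only .tooltip and data-mouseover
         are taken from the CSS sketch in the comment above. -->
    <!DOCTYPE html>
    <html lang="en">
    <head>
      <meta charset="utf-8">
      <style>
        /* On hover, paint the data-mouseover text in a generated box */
        .tooltip:hover::after {
          cursor: help;
          content: attr(data-mouseover);
          background-color: peru;
          z-index: 5;
          position: fixed;
          font-size: 15px;
          color: white;
          padding: 2px;
        }
      </style>
    </head>
    <body>
      <p>
        It is a city on the planet Earth.
        <span class="tooltip"
              data-mouseover="unsourced content—please add a reference">
          The city has a population of 300,320 as of 2019.
        </span>
        It is 10km from City Y.
      </p>
    </body>
    </html>
    ```

    Note that this only shows the tooltip on hover; actually hiding the unsourced sentence itself, as proposed, would need an additional rule on .tooltip (for example display: none or color: transparent), which the snippet above does not include.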
    And "I challenge every unsourced article just because they're unsourced" is acceptable. The burden remains on the writer. 🌺 Cremastra (talk) 18:42, 26 January 2024 (UTC)Reply[reply]
    I think you're reaching Wikipedia:Don't disrupt Wikipedia to make a point territory with that last comment. Anomie 19:00, 26 January 2024 (UTC)Reply[reply]
    What? CactiStaccingCrane (talk) 01:17, 27 January 2024 (UTC)Reply[reply]
    "I challenge every uncited statement from here to infinity" is one of the things we routinely give as an example of unacceptable behavior at WT:V and of Wikipedia:Disruptive editing#Point-illustrating. Wikipedia is a collaborative project, and that means that each of us have to put limits on our own behavior. WhatamIdoing (talk) 08:09, 28 January 2024 (UTC)Reply[reply]
    I wouldn't do that, because I think it would be a little extreme. But it is acceptable under WP:V. 🌺 Cremastra (talk) 12:28, 28 January 2024 (UTC)Reply[reply]
    @Cremastra, you really should look at WT:V. This is a controversial position and a lot of time and goodwill is spent arguing it on a seemingly ongoing basis in different venues around the project. ~Kvng (talk) 13:35, 28 January 2024 (UTC)Reply[reply]
    There are two different points here:
    • There is occasional discussion around whether it's fair for you to blank content that you personally know (or strongly suspect) is verifiable solely because someone else didn't add a citation.
      • I sometimes call this the Mother may I? view, since this type of removal is based on someone doing the right thing by the content (adding accurate and Wikipedia:Glossary#verifiable information) but not following the proper form (saying Mother, May I?/adding a citation at the same time).
      • This seems to appeal to editors with a control streak (though not exclusively to them): they might have the source in hand, but what really matters to them is forcing others to follow their orders.
    • There is no argument about whether an editor can issue a blanket WP:CHALLENGE for every uncited piece of material in all 6,775,901 articles. This is always considered inappropriate.
    WhatamIdoing (talk) 16:33, 28 January 2024 (UTC)Reply[reply]
    I more frequently see the issue with longstanding material. A pattern I've been involved in several times is an editor who is typically not an expert in the subject matter will boldly remove longstanding unsourced content and edit war with anyone attempting to restore it, citing WP:BURDEN. While I consider this WP:DISRUPTIVE, WP:DEMOLISH and WP:NODEADLINES, they believe by removing well-vetted but uncited material they're following policy and Jimmy's wishes and are improving the quality of the article or, probably more accurately, forcing others to improve the quality of the article. ~Kvng (talk) 17:31, 28 January 2024 (UTC)Reply[reply]
    When faced with someone who would rather destroy content than search for sources, it's usually faster to Wikipedia:Let the Wookiee win and add those sources yourself than to try to get them to WP:Use common sense. WhatamIdoing (talk) 01:04, 9 February 2024 (UTC)Reply[reply]
    There are editors that know stuff and there are editors that are good at research. Both types make valuable contributions. Not many editors who know stuff are interested in finding sources for stuff they already know. Many WP:VOLUNTEER editors don't like to be told what to do. We also don't enjoy arguing with a wikilawyer. And so editors with this "quality" mindset can get significant traction. ~Kvng (talk) 15:04, 9 February 2024 (UTC)Reply[reply]
    And, unfortunately, when (I think) I know something, and try to find a reliable source to support it, I all too often find that either I cannot find reliable sources to support what I know, that recent research has overturned what I learned long ago, or, even worse, that reliable sources disagree with me. Donald Albury 15:16, 9 February 2024 (UTC)Reply[reply]
    I find I need to make a trip to the library since I have not kept most of the books I've learned from.
    Do you support removing all unsourced material in the encyclopedia? ~Kvng (talk) 15:31, 9 February 2024 (UTC)Reply[reply]
    I support removing content that I cannot find reliable sources for. I routinely revert new unsourced additions that are inserted in front of an existing citation but apparently are not supported by that citation, or which are extraordinary claims (e.g. this revert), or which I believe are unlikely or contradictory to sourced content in the article. I will remove unsourced content that has been marked with CN for a long time and for which I cannot find a source in a routine search, or which I feel is of little relevance to the article, or is quite likely to be wrong. I will remove unsourced content when I am expanding or revising an article and the said content does not feel congruent with the sourced content I am adding (e.g. this edit). Donald Albury 16:45, 9 February 2024 (UTC)Reply[reply]
    I do most of that too. Additionally I add tags to stuff I'm unsure about.
    What I was talking about above is editors that don't appear to have the background to evaluate the material removing tagged unsourced material citing WP:BURDEN. In the most disruptive cases they tag unsourced statements with minimal evaluation and no research and then come back a week later and remove the material when no one has stepped up and added sources.
    These do go to ANI and they don't get good support for the behavior but, since they are generally good wikilawyers and ANI is often dysfunctional, there is no official consequence. You just have to wait for them to tire of the game. ~Kvng (talk) 19:19, 9 February 2024 (UTC)Reply[reply]

Mickey Mouse (film series) and adding shorts[edit]

What do people think about adding the public domain Mickey Mouse shorts directly to the Mickey Mouse (film series) page? I think it would make them feel more accessible since you don't have to go to the direct page for each short. And it makes the public domain status feel more concrete to use them in a wider context. Generally this is an idea I have for all short film series, but I wanted to start smaller. SDudley (talk) 01:11, 21 January 2024 (UTC)Reply[reply]

I would agree with you, but any video short that is added needs to be properly credited, and this should be done with common sense and consensus. Gold Like Shore8 (talk) 00:43, 3 February 2024 (UTC)Reply[reply]
Credited how? SDudley (talk) 18:10, 6 February 2024 (UTC)Reply[reply]

Make the talk page "Add a topic" form clearer[edit]

Currently if a user clicks "Add a topic" on an article talk page, they are prompted for a Subject and a Description and there's no further explanation of what's happening or what they're expected to type there.

Talk pages like Talk:ChatGPT and Talk:Speech synthesis have ended up having to be protected because so many IP visitors think that they're interacting with that software when they type there, and don't realise that they're posting a message on a Wikipedia talk page. Talk:DALL-E gets a lot but hasn't been protected yet. There are also weirder cases like Talk:Doppelgänger (perhaps it's also the name of an app?) where IPs constantly post short sentences about wanting to see their doppelgänger, sometimes entering their email address.

Can we give these cryptic Subject/Description boxes better names, and/or add a short "you are about to post a comment to Wikipedia" message somewhere? Description in particular seems a very strange word to use at all, for something that's a comment or a question. Belbury (talk) 15:47, 22 January 2024 (UTC)Reply[reply]

I have never had a problem, but I am probably not a typical reader. Phil Bridger (talk) 18:39, 22 January 2024 (UTC)Reply[reply]
Yes, I'm thinking more about new users here. As well as the IP problems above there will also be cases where the first talk page a new user visits happens to be blank, and they're left to guess what the Subject/Description interface is actually asking of them.
Replying to you here, the message box says Reply to Phil Bridger in grey text before I start typing. I'm wondering if we just forgot to set a meaningful box message for new comments (the new interface only went live in 2022). Belbury (talk) 19:52, 22 January 2024 (UTC)Reply[reply]
"Description" could be replaced with "Message" or "Type your message here". QuietCicada - Talk 18:07, 23 January 2024 (UTC)Reply[reply]
"Your message" (or similar variations as suggested by @QuietCicada) would be much clearer than "Description". (Generally, I think of "description" as metadata.) Along the same lines, "Title" is clearer than "Subject". Schazjmd (talk) 18:27, 23 January 2024 (UTC)Reply[reply]
I was thinking "Suggest an improvement to the article" (for talkspace) or "Suggest an improvement to Wikipedia" (everywhere else), but these are more universal (neither would make sense, in, say, a user talk page.) 🌺 Cremastra (talk) 21:04, 23 January 2024 (UTC)Reply[reply]
Are you sure that this has anything to do with why people post messages on those pages? You see a problem that annoys you and you want a solution. You THINK you have found a cause that is associated, but there is no proof of that association; you are only guessing. But some people are just really young/lost/inexperienced/dumb etc. when it comes to interacting with the Internet. I know someone working at a helpdesk and they literally keep a list of phone numbers of OTHER help desks, because people will happily call their bank if their internet is down. No amount of endlessly high stacked messaging or guardrails of any sort is going to protect some people from making mistakes like these. I just see a wall of meaningless text to ignore and archive. That page doesn't have to be clean. Nor are we required to answer each of those people dumb enough to ask ChatGPT a question there. —TheDJ (talkcontribs) 09:56, 24 January 2024 (UTC)Reply[reply]
Yes, I am guessing at a connection, I saw a recurring behaviour and was considering what upstream factors might feed into it. Even if there turns out to be no connection, replacing the Subject / Description prompt with something clearer seems like it would still be a useful change to Wikipedia's interface. Belbury (talk) 14:07, 25 January 2024 (UTC)Reply[reply]
@Trizek (WMF) will probably want to talk to the Editing team about this.
I've also wondered whether we're getting more reverted comments. For example, four misplaced comments were posted to WT:V last week. WhatamIdoing (talk) 18:21, 26 January 2024 (UTC)Reply[reply]
I will. It is not the first discussion I have seen about placeholders on text areas, though. Finding the right term for the right place, and therefore for the right context, is complicated. As TheDJ said, even if we make a change, you will always see a few users doing things the wrong way. Trizek_(WMF) (talk) 14:09, 29 January 2024 (UTC)Reply[reply]
So what changes are possible? Are the current Subject and Description messages only used inside new topic boxes on talk pages, or do the same strings need to work in other contexts as well? Would we be able to specify different messages for user/article/project talk pages? Belbury (talk) 19:08, 29 January 2024 (UTC)Reply[reply]
As many people know, when you create a new section, the subject window is above the main edit window; and when you edit an existing section, the edit summary window is below the main edit window. But not everybody is aware that the subject window in the first case and the edit summary window in the second case have the same ID (it's wpSummary). So there is a possibility that changes affecting the subject window may also affect the edit summary window. --Redrose64 🌹 (talk) 21:40, 29 January 2024 (UTC)Reply[reply]
The most basic fix would be altering the strings in MediaWiki:Discussiontools-newtopic-placeholder-title (currently Subject) and MediaWiki:Discussiontools-replywidget-placeholder-newtopic (currently Description), without touching any of the underlying code or HTML. It looks like the same strings are used on both article and user talk pages, though, so there's limited scope for how specific the new messages can be. Belbury (talk) 10:43, 30 January 2024 (UTC)Reply[reply]
You are correct: altering these strings will affect all pages. This change has to be very carefully considered, until a proper solution for in-context labels is provided. Trizek_(WMF) (talk) 14:38, 30 January 2024 (UTC)Reply[reply]
Okay. Where's a good place to consider that change? Should I start a thread at Wikipedia:Village pump (proposals), would it be better to workshop some ideas here first, or is there a better place to have this discussion? Belbury (talk) 17:41, 8 February 2024 (UTC)Reply[reply]
You can add an edit notice to the talk page, if you think that additional messaging will help, and it will be shown above the "Add topic" form. (But, I beg you, no longer than ten words, or definitely nobody ever will read it.) Matma Rex talk 16:10, 29 January 2024 (UTC)Reply[reply]

RFC Surveys[edit]

Around 18 months ago I started being more active in RFC discussions. It's a great way to learn more about Wikipedia policies and to help build consensus. However, I have noticed in some of the contentious topics the surveys get bogged down in walls of text, badgering, and bludgeoning. Yes, problem users can be taken to Arb or ANI, but that's a lot of work and in some cases more trouble than it's worth.

The best RFCs I have jumped into keep discussion and comments separate from the survey. The survey should be reserved for the editors who have volunteered their time to comment. They could be wrong, they could be right, but it shouldn't be an invitation to debate. If there's a question about a comment just ping the user in the discussion section.

What is the best way to keep surveys clean? Use templates? A policy on how RFCs should be conducted? I believe a great deal of time would be saved and editors would be more willing to comment if the RFC process were cleaner. Let me know what you think. Thanks! - Nemov (talk) 18:24, 30 January 2024 (UTC)Reply[reply]

I’m not sure what you mean. RfCs are meant to have differing formats to fit different needs and scales. Templates for !votes were already banned. Aaron Liu (talk) 18:50, 30 January 2024 (UTC)Reply[reply]
You've never seen an RFC with wall-to-wall text, bludgeoning, and badgering where it's unclear where to even leave your comment? Nemov (talk) 18:56, 30 January 2024 (UTC)Reply[reply]
The first three, yes, but as someone who was heavily involved in Wikipedia:V22RFC2, I have not seen any RfC where it's unclear where to leave your comment.
I also still don’t understand what proposals you intend to workshop. Aaron Liu (talk) 19:20, 30 January 2024 (UTC)Reply[reply]
This is the idea lab. I have brought what I believe is an issue in high traffic/contentious RFCs and I'm looking for ideas on how to make it better. If you don't think there's an issue that's perfectly fine, but that hasn't been my experience on occasion. Nemov (talk) 19:29, 30 January 2024 (UTC)Reply[reply]
Well, my opinion is against using templates or a format that all RfCs should follow, and as I don’t get the problem I can’t suggest anything either :p Aaron Liu (talk) 19:32, 30 January 2024 (UTC)Reply[reply]
Why has this not been raised at Wikipedia talk:Requests for comment? --Redrose64 🌹 (talk) 23:52, 30 January 2024 (UTC)Reply[reply]
Shouldn't I have something more concrete before it's brought up there? Nemov (talk) 00:08, 31 January 2024 (UTC)Reply[reply]
@Nemov, no, there's no need to wait. We're usually pretty friendly over there. Come talk to us whenever you want, about anything RFC-related, or even just to tell us about an interesting idea or discussion happening elsewhere. For example, a little while ago, I suggested that if an RFC has separate sub-sections, maybe the discussion should go before the vote. WhatamIdoing (talk) 01:34, 9 February 2024 (UTC)Reply[reply]

Biography Chronology/Timeline Addition[edit]

In perusing biographical entries on Wikipedia, I've often felt the absence of a succinct chronological summary detailing the significant milestones in a subject's life. While one can generally piece together this timeline from the narrative, the absence of specific dates can occasionally be a hurdle. A brief, date-specific chronology, inserted at the outset or conclusion of all biographies, would immensely aid in grasping the flow and context of a person's life experiences. It would also reduce errors when writing about a person's life. This addition, presumably a minor task for someone well-acquainted with the subject's life, could exponentially streamline research efforts for many others. Jamesgmccarthy (talk) 18:01, 5 February 2024 (UTC)Reply[reply]

Jamesgmccarthy, I think it is a great idea. Something like the "Key dates" portion of Channel Tunnel? Wikipedia:Manual of Style/Biography doesn't seem to mention a chronological summary, yet.--Commander Keane (talk) 01:18, 7 February 2024 (UTC)Reply[reply]
Thank you for your reply. Yes, the Channel Tunnel sidebar of "Key Dates" would work well. On some events it would be good to state the start and stop date as some key dates have long gaps between them. For example, when studying the life of Charles Darwin, it would be helpful to know when he began and ended his studies at the University of Cambridge, not just when he entered the university. Jamesgmccarthy (talk) 09:23, 7 February 2024 (UTC)Reply[reply]
The "Timeline" section of Charles Darwin's Reasonator page has some key moments of his life in a horizontal timeline, though because he had so many children and awards, the timeline is a bit crowded. You may have to scroll down far or collapse sections to see the timeline towards the bottom of the page.
Reasonator pulls data from Darwin's item page on Wikidata, a free collaborative database that supports Wikipedia. Though people in past requests for comment about incorporating Wikidata information in English Wikipedia articles haven't been enthusiastic.
Lovelano (talk) 11:24, 9 February 2024 (UTC)Reply[reply]
I would imagine this would spawn a whole host of arguments of what ought and ought not to be included in various articles' timelines (Is event X really a key date for this person? Why isn't event Y in there?...), plus arguing over if certain articles should even contain a timeline at all. Effectively, the infobox wars but on a brand-new front. 2603:8001:4542:28FB:2F1F:EA76:7140:85DA (talk) 09:14, 8 February 2024 (UTC) (Send talk messages here instead)Reply[reply]
I don't think this will prove to be an insurmountable problem. WhatamIdoing (talk) 01:36, 9 February 2024 (UTC)Reply[reply]
+1 - This is a good idea and I agree should be done more widely in articles. There are some articles that have "timeline" spin offs (usually the titles start with "Timeline of...") for major events (e.g. Timeline of the French Revolution) but I agree it would be helpful for all history topics -- history of people (biographies), history of buildings, history of cities, events, etc. etc. Levivich (talk) 18:23, 8 February 2024 (UTC)Reply[reply]
Timelines would definitely make that kind of research easier, but I worry they would become a magnet for fancruft in some articles. I would not add anything resembling a trivia section to a BLP for example. HansVonStuttgart (talk) 08:46, 9 February 2024 (UTC)Reply[reply]
This could be good. I’m not super keen on having a second vertical sidebar where there is already an infobox, so I wonder if it could be integrated into the infobox. This is concordant with the purpose of the infobox anyway:- to summarise key points. We already have birth and death dates there. Perhaps we could insert key dates between these. As a reader, I would be much more interested in seeing, at a glance, when the notable person did their notable things, not just when they were born or died. Barnards.tar.gz (talk) 09:18, 9 February 2024 (UTC)Reply[reply]
I'd suggest remaining flexible. For some articles, in the infobox would work. For others, a separate vertical sidebar (e.g., if the top of the article isn't the best placement for the timeline). For others, it might be a separate section in the prose, and not a sidebar at all. I could see it all depending on the length of the article, the length of the timeline, and other factors (content of the article, BLP considerations, whether items on the timeline are particularly controversial or not, etc. etc.). For example, the timeline for a YouTube star might be very different from the timeline for a president. Levivich (talk) 16:56, 9 February 2024 (UTC)Reply[reply]

Making it easier to add topics (Big buttons, it being more noticeable, etc.)[edit]

I spent a good solid minute trying to find the button to post an idea here, and it was very hard. Maybe we should make the add topic button more noticeable, like the Teahouse? 3.14 (talk) 00:37, 9 February 2024 (UTC)Reply[reply]

I don't think so. Being here assumes you're more experienced, meaning you know where the buttons are. The buttons aren't harder to find than the edit button.
That said, Wikipedia:Convenient Discussions adds a button to add a topic at the bottom of each page. Aaron Liu (talk) 00:43, 9 February 2024 (UTC)Reply[reply]
I only recently found these forums. They're sort of new to me. 3.14 (talk) 00:57, 9 February 2024 (UTC)Reply[reply]
Also chat pages, that was the original idea. 3.14 (talk) 00:58, 9 February 2024 (UTC)Reply[reply]
Welcome to Wikipedia!
@Trizek (WMF) will want to know about this, because the mw:Editing team had been talking about something similar last year. In the meantime, I suggest going to Special:Preferences#mw-prefsection-betafeatures and turning on "Discussion tools", so you can see some cool information (like how many people commented in a given discussion and when the most recent comment was). WhatamIdoing (talk) 01:38, 9 February 2024 (UTC)Reply[reply]
Indeed, the improvements you mention include a proper button to add a new topic, on all pages. You can see how it looks at the Czech Wikipedia, for instance. If you haven't yet tested these improvements, please note that they will soon become the default environment (with a way to opt out in personal preferences). Trizek_(WMF) (talk) 13:10, 9 February 2024 (UTC)Reply[reply]
@3.14159265459AAAs We don't have "chat pages". What exactly do you mean by that? Doug Weller talk 12:12, 9 February 2024 (UTC)Reply[reply]
Like texting but for Wikipedians. 3.14 (talk) 21:48, 9 February 2024 (UTC)Reply[reply]
User talk pages? Aaron Liu (talk) 22:37, 9 February 2024 (UTC)Reply[reply]
Those exist? 3.14 (talk) 00:00, 10 February 2024 (UTC)Reply[reply]
Like, User talk:3.14159265459AAAs? Aaron Liu (talk) 00:03, 10 February 2024 (UTC)Reply[reply]
Like that, but more social. Not just directed to one person, but the whole Wikipedia community. (Please give me the correct word for Wikipedia's Community, Wikipedians I know.) 3.14 (talk) 01:07, 10 February 2024 (UTC)Reply[reply]
I wonder if you're looking for something like Wikipedia:Discord or Wikipedia:IRC. WhatamIdoing (talk) 01:35, 10 February 2024 (UTC)Reply[reply]
Also, if you scroll down and get the sticky header, there's a very obvious "Add topic" button. Aaron Liu (talk) 12:30, 9 February 2024 (UTC)Reply[reply]
Thanks, I didn't notice. 3.14 (talk) 00:01, 10 February 2024 (UTC)Reply[reply]

Serious reform to this top-down nonsense of projects and quality assessments[edit]

I'm an active editor but I've never engaged with discussions about Wikipedia procedures before, so apologies if I'm going about this wrong way, but here are my thoughts on improving a frustrating aspect of the editing experience - an aspect which has recently become a lot more high profile.

For the past few weeks all the articles I am working on have been subject to bots delivering a project-independent quality assessment. It's hard enough to find out what this even means in plain language. The quality assessments are mostly nonsense, being derived from "Projects" that either are totally inactive or have no possibility of achieving their aims because they consist of 7 people (half of whom are probably dead) aiming to assess all the articles in immensely broad categories.

This is deeply frustrating for people actually trying to improve articles because:

1) It's so top-down - a bot swoops down and allocates some random rating to the article, based on a - probably ill-informed - rating done by someone affiliated to a project years ago.

2) There is no information provided to encourage people actually actively involved in improving an article to engage with the quality assessment process. It's hard enough to even find out and understand what this whole PIQA process IS, despite the flurry of bot activity it has unleashed on active editors' watchlists.

3) The quality assessment is drawing attention to "importance" ratings from projects that are utterly arbitrary.

My suggestion to improve this is:

1) Information provided as part of the banner shell at the top of talk pages encouraging active editors of articles to provide the quality rating for that article on a simplified rating - they are the people who actually know.

2) A quality assessment based on 3 ratings: stub, improving, completed article (this last meaning ready for 3rd party assessment as a good article).

3) Guidance provided to projects to refocus their activity - not around unachievable quality assessments and meaningless importance ratings across thousands of articles - but instead around assessing good articles in their area, within the existing Good Article nomination and review process.

4) Take automatic project "importance ratings" off talk pages. If people are interested in what a small group of people think are the most important Wikipedia articles on the topic of e.g. Christianity (a topic so broad it covers nearly all intellectual activity in Europe over most of 2000 years), they can find their way independently to the project page concerned.

This will have the benefit of:

1) Vastly improving the accuracy of quality assessments by encouraging active editors of an article - rather than randoms - to provide it.

2) Ending the demoralising effect of working on an article that some people years ago have classified as "low importance" and which a bot has now declared officially overall "stub" or "start class" (yes, I know you can change it - but let's make that clear to all active editors).

3) Supporting the Good Article process and systematising third party review of articles, which is clearly important and valuable.


Atrapalhado (talk) 23:32, 12 February 2024 (UTC)Reply[reply]

  • Personally, I just ignore the whole “assessment” thing. If I think that I am able to improve an article, I do so… regardless of its “rating”. I hope everyone else would do the same. Blueboar (talk) 00:23, 13 February 2024 (UTC)Reply[reply]
Yes - but that's kind of my point - it's just nonsense. Atrapalhado (talk) 00:25, 13 February 2024 (UTC)Reply[reply]
I think the content assessment link provided by the template is enough to know about what the heck these are. Maybe the page needs a nutshell though.

3) The quality assessment is drawing attention to "importance" ratings from projects that are utterly arbitrary.
Take automatic project "importance ratings" off talk pages. If people are interested in what a small group of people think are the most important Wikipedia articles on the topic of e.g Christianity (a topic so broad it covers nearly all intellectual activity in Europe over most of 2000 years), they can find their way independently to the project page concerned.

IMO this is a bad idea. The importance ratings are to focus (active) projects on actively maintaining their most vital articles, sort of like Wikipedia:Vital articles, but for a specific project. Maintaining giant lists situated somewhere in the remote outskirts of projectspace is way harder than just letting the banner shell automatically add categories based on the importance rating.
All types of ratings here are completely arbitrary. B-class and above maybe less so, but still.

2) A quality assessment based on 3 ratings: stub, improving, completed article (this last meaning ready for 3rd party assessment as a good article).

I see a couple of problems with this:
  1. "Improving" wouldn't necessarily mean improving. Most projects don't have many people working anymore. Guidance provided to a sparse hall would do almost nothing; it's not a matter of not knowing how, it's a matter of not enough workforce. (This also applies to recommendation #3)
  2. Articles are never completed. We are a wiki, and all articles are constantly evolving. This would imply that no more changes should be made to the article. Getting an article to what was B class doesn't necessarily mean it's ready for GA review.
  3. Having more layers between stub and just under GA is of great benefit. Under the current system, most B-class articles just need some polish based on consistency, style, and sometimes filling in obscurer content gaps. I'm pretty sure there are editors out there who just turn B-class articles into GAs. Start-class articles are otherwise good but have a severe lack of content. These new ratings would just turn a lot of articles of different quality levels into "improving" for no good benefit that I can see. The old system is also pretty understandable.
Aaron Liu (talk) 01:47, 13 February 2024 (UTC)Reply[reply]
@Atrapalhado, I think you'll want to read the Wikipedia:Content assessment page. The (optional) |importance= or |priority= ratings exist to tell the Wikipedia:Version 1.0 Editorial Team that an article might not be popular (in terms of page views) or central (in terms of incoming links) but is still important (or not) to a particular subject area (e.g., a small country), in the opinion of a group of editors who are sufficiently interested in that subject to form a group to improve those articles. High ratings somewhat increase the likelihood of the tagged article being included in an offline collection of articles.
Some groups additionally use those to prioritize their work. For example, years ago, editors at Wikipedia:WikiProject Medicine systematically improved their 100 top-priority articles to at least Start-class.
Until a couple of months ago, each WikiProject separately assessed the |quality= of each article. We decided, through a series of discussions, that this was inefficient: a stub is a stub, and if four groups are interested in this article, we don't need each of the four groups to separately say that it's a stub. A couple of bots are currently running around and turning those duplicate project-specific quality assessments ("stub, stub, stub, stub") into a single project-independent rating ("all stub"). This is generating a lot of activity in watchlists right now but isn't AFAIK supposed to be creating new quality assessments. Hopefully it'll be done in a couple of weeks. (If you don't want to see the bots, then you can hide all bot edits in your watchlist.) WhatamIdoing (talk) 03:08, 13 February 2024 (UTC)Reply[reply]
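To illustrate the consolidation described above (a rough sketch only; the exact parameter layout follows the quoted WP:PIQA text and Template:WikiProject banner shell, and the project banners shown are just examples), the bot edits move the duplicate per-project ratings into a single |class= on the shell:

```wikitext
<!-- Before: each WikiProject banner carries its own duplicate quality rating -->
{{WikiProject banner shell|
{{WikiProject Biography|class=stub|importance=low}}
{{WikiProject Christianity|class=stub|importance=mid}}
}}

<!-- After: one project-independent rating on the shell; importance stays per-project -->
{{WikiProject banner shell|class=stub|
{{WikiProject Biography|importance=low}}
{{WikiProject Christianity|importance=mid}}
}}
```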
Thank you @WhatamIdoing. QUALITY SCORES - I understand the current approach. As per my comment to @Aaron Liu below, the problem is that the quality ratings come from these projects that have impossibly broad scope and have tiny numbers of people involved and/or are largely inactive. Across the fifty or so articles I am more or less active on, only 1 has ever had a quality score updated: the PIQA scores are nearly all wrong. The PIQA project is building castles on sand. IMPORTANCE RATINGS - I wouldn't object to any group of people noting on the talk page that they think an article is important. Nobody has come up with any argument as to the value of med/low importance scores. Atrapalhado (talk) 10:56, 14 February 2024 (UTC)Reply[reply]
Thanks for the reply @Aaron Liu. 1) Importance Ratings - "IMO this is a bad idea. The importance ratings are to focus (active) projects on actively maintaining their most vital articles." This does not require spreading arbitrary "medium" and "low importance" ratings at the top of talk pages on articles that people work hard on. As for projects needing to store the data of their (usually limited, out of date) project importance scores on talk pages, because of the way Wikipedia works, well maybe - but they certainly don't need to be presented at the top in the banner shell. As for everything being arbitrary, well maybe, I guess all of life is - and actually maybe arbitrary is the wrong word, as I suspect these importance scores show A LOT of systemic bias of the people involved in projects.
2) Quality Ratings - Yes, the present system is logical. It's just that nobody actually uses it apart from the tiny number of people involved in these impossibly vast Projects. The whole thing - especially with the PIQA update - is castles built on sand. The only answer is a simplified system that people who actually edit the articles concerned are encouraged to use. Atrapalhado (talk) 10:48, 14 February 2024 (UTC)Reply[reply]
In my experience the importance ratings are much less reliable than the quality ratings. The quality ratings are at least based on semi-hard numbers. But in working on referencing unreferenced articles, I've often come across clearly noteworthy and important topics that are ranked Low either because the article was a low-context stub, or because the WikiProject it is associated with is somewhat tangential to the main topic. (Sha-Mail comes to mind out of some recent expansions.) Gnomingstuff (talk) 08:19, 13 February 2024 (UTC)Reply[reply]
@Gnomingstuff: Re I've often come across clearly noteworthy and important topics that are ranked Low ... because the WikiProject it is associated with is somewhat tangential to the main topic.: that's the idea - it's the importance for that particular WikiProject, not the importance in the whole scheme of things. It is perfectly valid for one WikiProject to assign Low-importance to an article which another considers to be High-importance. For example, Talk:Charles III shows Top-importance for WikiProject United Kingdom (obviously), but Low-importance for WikiProject Children's literature - and it's hard to see how he might be rated above that. --Redrose64 🌹 (talk) 09:32, 13 February 2024 (UTC)Reply[reply]
If my understanding of this is correct, these ratings are an artifact of a time when Wikipedia was aiming at producing a CD or something like that. I've seen some WikiProjects using them to prioritize work, but only a few, a minority. And I've seen complaints about importance ratings being dismissive. I would support a motion to turn them off by default and ask WikiProjects that want them to opt in. Jo-Jo Eumerus (talk) 09:12, 13 February 2024 (UTC)Reply[reply]
I think that Start-class gets overused by timid assessors. They know it's not a stub, but they aren't confident enough to rate it any higher. I think that making the assessment "less wrong" is helpful. For the most part, I don't worry about class assessments unless they're significantly wrong. IMO a C-class article should not be rated as a Stub and vice versa, but the difference between a "high Start" and a "low C" might be a matter of individual judgment, so I don't worry about it myself. WhatamIdoing (talk) 16:43, 13 February 2024 (UTC)Reply[reply]
The impression that I get is that few experienced Wikipedia editors care about quality ratings below WP:GA. Remember that a GA itself is usually rated by only one person, although against a consensus-agreed list of criteria. Only featured articles have undergone quality review by more than one editor. If you wish to then please review everything you've written to see if it can be bumped up one or two quality ratings, but I don't think that will achieve much other than give you a warm glow. Phil Bridger (talk) 18:12, 13 February 2024 (UTC)Reply[reply]
Quality assessments are also used by other editors, e.g., by student editors (do expand the stub; don't touch the FA) and for the occasional de-stubbification drive. WhatamIdoing (talk) 18:31, 13 February 2024 (UTC)Reply[reply]
I agree with the last two sets of comments. The last time I took part in a de-stubbification drive I found at least 1/3 of the articles I looked at weren't remotely stubs. Partly people who improve articles are too reluctant to self-rate with a better class. I find I often upgrade quality ratings, but importance ratings sometimes need downgrading. I do wonder why I bother though. Johnbod (talk) 18:56, 13 February 2024 (UTC)Reply[reply]
IMO, assessments are very useful to find articles to improve in a field of interest, and for me are more useful than specific issue tags. If you don't like them they can very easily be ignored, and if you disagree you can just change it. 90% of the problems with assessments onwiki are people being too hesitant to update or rate articles. PARAKANYAA (talk) 22:39, 16 February 2024 (UTC)Reply[reply]

I pretty much agree with the OP with regards to article/project ratings. For the reasons described, I don't think that they are very meaningful and also ignore them. And sometimes they are harmful. Maybe we should drop them. Regarding projects, there are some projects which are active or semi-active and which do valuable work, so I would not agree with broad negative statements about projects. Even though there are some dead or inactive ones. North8000 (talk) 18:45, 13 February 2024 (UTC)Reply[reply]

It is irritating to create or edit an article and believe that it's good, complete, etc. -- and then rater X comes along and calls it a "start." That's worse than no rating at all. I don't care whether an article I've created is rated "C" or "B" or not rated at all, but "start" is an insult. Secondly, length is not a synonym for quality. A 300-word article is adequate for some subjects. I get irritated when rater X comes along and calls the 300-word article I have created a "stub." Perhaps a two-tier rating system would be workable: "good" articles are those that have been peer-reviewed; everything else is unrated. Or maybe you have a third rating of "stub, needs improvement or expansion" and you put that as a header on the article page to encourage improvement of the article. Smallchief (talk) 10:40, 13 February 2024 (UTC)Reply[reply]

You can rate your own articles. You don't have to wait for a random to come along. ~WikiOriginal-9~ (talk) 13:35, 13 February 2024 (UTC)Reply[reply]
Thing is, in most places of the world including Wikipedia judging one's own work is considered bad. So that's pretty unintuitive. Jo-Jo Eumerus (talk) 14:11, 13 February 2024 (UTC)Reply[reply]
Removing the {{stub}} tags or updating a |class=stub rating is exactly as unintuitive as removing maintenance tags like {{unref}} or {{confusing}}, I'd say. We want editors, including new editors, to do all of these things when they believe they have solved the problem. WhatamIdoing (talk) 16:08, 13 February 2024 (UTC)Reply[reply]
I am quite comfortable in rating articles I create or work on as "stub", "start", or "C class", based on my (subjective) assessment of their quality. However, I will not rate any article I have worked on as "B class" or higher. I have seen articles I started that had a "stub" rating for years, even though they had several paragraphs of text and half-a-dozen cited sources. Given how many articles sit around with inappropriately low quality ratings, I think it is inefficient to stop editors from rating articles (below B class) they have worked on. Donald Albury 16:08, 13 February 2024 (UTC)Reply[reply]
I don't think that's what Eumerus is saying; seems to me like they are saying that it's a cultural thing to not rate your own. Aaron Liu (talk) 16:41, 13 February 2024 (UTC)Reply[reply]
(I think that's a reply to WikiOriginal's comment that "You can rate your own articles". Not everyone uses the Reply button/tries to make the indentation align with the exact comment they're responding to, especially if doing so might create a line of comments where you can't easily see when the first stops and the next starts.) WhatamIdoing (talk) 17:28, 13 February 2024 (UTC)Reply[reply]
Aye, what WikiOriginal is saying may be theoretically in line with the rules, but out of line with how assessments work elsewhere in the world including on Wikipedia (GA, DYK, etc.) Jo-Jo Eumerus (talk) 10:55, 14 February 2024 (UTC)Reply[reply]

Thanks everyone so much for your comments. Having read the comments my view now is the following:

It's clear a lot of active editors just regard this whole system as irrelevant/ignorable (Good Article process excepted).
1) IMPORTANCE RATINGS - a few people have come up with arguments for projects needing to identify articles that are important. Nobody seems to value med and low importance scores; they can be demoralising and are largely ignored. They're also rarely updated and almost certainly harbour a lot of systemic bias. Accordingly I am going to put forward a formal proposal that importance ratings are removed from banner shells. Projects can still indicate on talk pages that articles are important or not to them, if they want. (A slightly qualified version of this would be to leave the info in banner shells where projects think an article is important, but not med/low importance.) Under either approach, all the data on ratings provided by projects can still sit in the underlying article data, so it is not lost to the projects, just not displayed in banner shells.
2) QUALITY RATINGS - more complicated/mixed views. PIQA is probably an improvement, but the ratings still come from these impossibly vast projects that are often barely active or totally inactive, and the ratings are nearly all out of date, at least on the articles I work on. I think a big part of the problem is that, as @Jo-Jo Eumerus points out, "in most places of the world including Wikipedia judging one's own work is considered bad." The banner shell needs to encourage active editors of articles to update the PIQA - they are the people that know - and provide an easy mechanism for editors to do this. Atrapalhado (talk) 11:22, 14 February 2024 (UTC)Reply[reply]
I think proposing that only top and high be included has a much higher chance of succeeding. Just including it as text on a talk page is disorganized and I don't see the harm in including these in the banner shell, which seems much more logical.
I'd agree with a proposal to encourage, though we need to work out the specific wording first. Aaron Liu (talk) 12:35, 14 February 2024 (UTC)Reply[reply]
Well, as one of the editors who actually uses |importance=low (though I wish it were described as "priority"), I obviously have some concerns about removing it. The Wikipedia:Version 1.0 Editorial Team, which still exists and is still active (though no longer using CDs for distributing Wikipedia articles to schools and other places with limited access to the internet), would also be sorry to lose that information.
I do see a difference, however, between "actually removing" and "not displaying prominently". I don't want 97.5% of WPMED's articles dumped back into Category:Unknown-importance medicine articles (which is what will happen if mid- and low- ratings are actually removed). I don't care whether the low rating (~75%, mostly organizations and people) is shown to people looking at the talk page banners, but I don't want to have the ~800 that we need to review for the first time lost in a sea of 50,000 that we've already reviewed. WhatamIdoing (talk) 18:40, 14 February 2024 (UTC)Reply[reply]
@Atrapalhado: You appear to have the wrong idea about importance ratings. They are not part of the banner shell, and are not intended to be. They have always been set on each individual WikiProject banner.
Aside from that, are you aware of the extensive discussions that have taken place since January 2023 at this page, VPR, Template talk:WikiProject banner shell (particularly the archives since January 2023)? I am concerned that this is becoming a parallel discussion that is seeking to throw all of that away. --Redrose64 🌹 (talk) 19:55, 14 February 2024 (UTC)Reply[reply]
Hi @Redrose64 Thanks for the feedback. Worth splitting this out into importance ratings and quality assessments/ratings.
IMPORTANCE RATINGS. Apologies if I've got my technical language wrong in the reference to banner shell, but I think you're missing my wider point. The project banners always appear at the top of the talk page and they display the allocated project importance ratings. Those importance ratings - especially when they present a low or med rating - are irrelevant, demotivating, out of date and need to go.
QUALITY RATINGS On the links to the earlier discussions - thanks for these. I haven't read through all of these but will endeavour to. They seem to relate to the now-implemented PIQA process. PIQA is an improvement on having lots of quality ratings from different projects. However, the ratings are still substantially out of date and (like importance ratings) usually regarded as an irrelevance by active editors. My revised proposal on quality ratings would be that (A) Active editors are encouraged to update the quality assessment for the article, probably by adding a statement to this effect in the banner shell (B) There is a clear explanation (@Aaron Liu suggested an "in a nutshell" doc) of the PIQA process, which is currently totally opaque. Currently the PIQA bot activity on my watchlist looks like this: "(Implementing WP:PIQA (Task 26))." If you click on that WP:PIQA link it tells you: "In February 2023, the Wikipedia community expressed strong support for article quality assessments that are independent of WikiProjects. Consensus was found for adding a |class= parameter to Template:WikiProject banner shell, which all projects would inherit. To avoid redundancy, this rating is then shown only on the banner shell. Projects can choose to opt-out of this system by adding the parameter |QUALITY_CRITERIA=custom to their project banner template. It is also possible to add a standalone banner shell template to an article without any WikiProjects, for example {{WikiProject banner shell | class=start}} The robot will regularly maintain {{WikiProject banner shell}}." WTF? Atrapalhado (talk) 15:04, 15 February 2024 (UTC)Reply[reply]
I don't see what your problem with the PIQA link is. It seems pretty clear to me.
I added the nutshell summary to Wikipedia:Content assessment. Aaron Liu (talk) 15:58, 15 February 2024 (UTC)Reply[reply]
  • IMO the |importance= ratings are irrelevant to most, demotivating to a few, but almost never out of date. In fact, I'd say they were wrong (e.g., raised by someone who mistakenly believes that will cause more improvements to an article) more often than outdated. The importance of, e.g., Cancer to Wikipedia:WikiProject Medicine isn't something that changes over time. It will never be "out of date".
  • The |quality= ratings are out of date on a significant minority of articles. However, the PIQA process has nothing to do with that; PIQA is merely rearranging where the existing information is stored. PIQA is a one-time, one-off update to the wikitext syntax. If you want some official encouragement, then you need to be looking at the long-term, permanent pages, such as WP:Content assessment, which say things like "Generally speaking, all editors, including editors who have written or improved an article, are encouraged to boldly set any quality rating that they believe is appropriate, except for the GA, FA, and A-class ratings." – and have said this for years and years.
WhatamIdoing (talk) 16:59, 15 February 2024 (UTC)Reply[reply]
Thanks @WhatamIdoing On Importance Ratings: As I've said, I have less of a problem where (as in your cancer article example) projects (or indeed anyone) want to say an article is important to them. But nobody has come up with a use case for low or mid project ratings at the top of the talk page. Can I trade you an example: Talk:Spalding Priory - it's an important enough little article in its way (I won't bore you with why I think it's interesting!) and slowly developing - but what on earth use does it serve to have the top of the talk page telling us three different projects regard it of "low importance"? Atrapalhado (talk) 22:15, 15 February 2024 (UTC)Reply[reply]
"Medium" is usually taken to mean normal or average, which does not seem to be problematic. (I don't ever remember seeing a complaint in which someone alleges that it's insulting.)
I don't think the value is in "the top of the talk page telling us". I think the value is someone (e.g., me) being able to get a list of articles I care about, excluding the ones that I don't care about. WhatamIdoing (talk) 02:27, 16 February 2024 (UTC)Reply[reply]
Thank you Aaron for doing this. Atrapalhado (talk) 22:22, 15 February 2024 (UTC)Reply[reply]
The banner shell doesn't display importance ratings. Any importance rating that you see is displayed by a WikiProject banner, one or more of which are enclosed in the banner shell. See the Charles III example that I provided earlier. Some WikiProjects (e.g. Biography) do not provide importance ratings; but for those that do, PIQA has nothing to do with how they are decided, where the chosen values are set, or how they are displayed. --Redrose64 🌹 (talk) 22:07, 15 February 2024 (UTC)Reply[reply]
Yes - I understand that - in my last response to you I tried to be very clear in distinguishing between comments on Quality Ratings/PIQA and importance ratings. And I apologised for not understanding banner shells. As I said, the point is still that the importance ratings are put there in banners at the top of the talk page and they're out of date, demotivating and irrelevant. Atrapalhado (talk) 22:20, 15 February 2024 (UTC)Reply[reply]
Technically, the banner "shell" is not displaying any importance ratings at all. It's the "banners" (inside the outer shell) that display the importance rating. WhatamIdoing (talk) 02:30, 16 February 2024 (UTC)Reply[reply]
I don't think he mentioned banner shells in that reply. Aaron Liu (talk) 03:17, 16 February 2024 (UTC)Reply[reply]
Second sentence: "And I apologised for not understanding banner shells." In his defense, nested templates like that are confusing to just about everyone. WhatamIdoing (talk) 03:21, 16 February 2024 (UTC)Reply[reply]
I think that means that he apologizes for not understanding the distinction between banner shells and banners. Aaron Liu (talk) 12:34, 16 February 2024 (UTC)Reply[reply]
I would like to mention that I find Importance ratings to be quite useful. QuicoleJR (talk) 00:10, 16 February 2024 (UTC)Reply[reply]
Thanks. Could you expand a bit? Why? How? Atrapalhado (talk) 01:10, 16 February 2024 (UTC)Reply[reply]
It makes it easier to find potential new additions to the vital article list, for one. Also, if I was better at improving articles, the intersection of quality and importance would help me find which articles are most important to improve. Also, I do not think only having Top-Importance and High-Importance is a good idea, because allowing a high rating implies that there could be a low rating. Also, if it ain't broke, don't fix it. QuicoleJR (talk) 01:38, 16 February 2024 (UTC)Reply[reply]
I find importance ratings quite useful as well. Nice to see what important articles for my fields of interest are in a sorry state so I can improve them. The cons for keeping them are very very low, so I think they're worth the benefit, even if it's just for some people. PARAKANYAA (talk) 22:35, 16 February 2024 (UTC)Reply[reply]
My main project, Wikipedia:WikiProject_Visual_arts decided years ago not to use importance ratings. But we signed up to Wikipedia:WikiProject Visual arts/Popular pages, which compares views with quality ratings, & is in many ways a better way of assessing "importance" for a project's top 1,000 articles. Johnbod (talk) 15:31, 16 February 2024 (UTC)Reply[reply]
Popularity isn't necessarily related to relatedness. Aaron Liu (talk) 15:33, 16 February 2024 (UTC)Reply[reply]
"relatedness" to what? It's arguably a better guide than the current system, though common sense is needed, with allowance for Google Doodles, doodling sports stars etc. January's top 10 are:
Wikipedia:WikiProject Visual arts/Popular pages
Period: 2024-01-01 to 2024-01-31
Total views: 53,798,474
Rank Page title Views Daily average Assessment Importance
1 Ansel Adams 1,074,270 34,653 GA Unknown
2 Neri Oxman 513,881 16,576 GA Unknown
3 Mona Lisa 374,933 12,094 B Unknown
4 Leonardo da Vinci 305,479 9,854 GA Unknown
5 Vincent van Gogh 265,308 8,558 FA Unknown
6 Pablo Picasso 242,319 7,816 B Unknown
7 Arun Yogiraj 228,227 7,362 Start Unknown
8 Bob Ross 219,491 7,080 B Unknown
9 Eiffel Tower 214,041 6,904 C Unknown
10 Terry Crews 201,157 6,488 C Unknown

Johnbod (talk) 15:50, 16 February 2024 (UTC)Reply[reply]

Something popular might only be tangentially related to a subject. Aaron Liu (talk) 16:04, 16 February 2024 (UTC)Reply[reply]
Indeed - the current system has that flaw too. In theory different projects can use different "importance" ratings, but in practice they are generally the same for all projects. Johnbod (talk) 16:09, 16 February 2024 (UTC)Reply[reply]

Here's the top 10 at Wikipedia:WikiProject Medicine/Popular pages:

1 Sexual intercourse 641,551 20,695 B Mid
2 Paolo Macchiarini 494,071 15,937 C Low
3 Christopher Duntsch 343,744 11,088 B Low
4 COVID-19 pandemic 316,530 10,210 GA Top
5 Bhopal disaster 307,206 9,909 B Low
6 Suicide methods 287,328 9,268 B Low
7 Norovirus 274,185 8,844 B Mid
8 Jean Tatlock 265,429 8,562 GA Low
9 Factitious disorder imposed on another 256,335 8,268 C Mid
10 COVID-19 245,215 7,910 B Top

As you can see, 50% of them are low priority to the group. Three are mid-priority, and the two COVID pages are top-priority. You could argue that we do care more about Suicide methods and we should care more about Bhopal disaster, but we could also argue that the only reason we care about two of the mid-rated articles is that they're popular (one not primarily being a medical subject and the other being a rare disease, which is normally rated low). Popularity has not been a reliable indicator for us. However, for groups that primarily work on articles about people, culture, and business, I would expect the popularity to line up more closely with their own priorities. WhatamIdoing (talk) 18:16, 16 February 2024 (UTC)Reply[reply]

As I said, on this test, you have to allow for temporary effects, eg Jean Tatlock was Oppenheimer's girlfriend; she'll drop back to her natural level soon (below 100 per day, having peaked at over 200,000 pd on the film's release). That's pretty much the same for arts. The top 10-20 are often odd anyway, with the real usefulness coming mid-table. Actually, if you want to know what readers are actually looking for, I'd expect this approach is at least as useful for medicine as for arts subjects, if not more. Johnbod (talk) 18:36, 16 February 2024 (UTC)Reply[reply]
I'd say that if you want to know what readers are looking for with articletopic:medicine-and-health then it'd be very useful, but if you want to know what kinds of articles the editors at WP:MED want to work on, it's not so useful. WhatamIdoing (talk) 22:24, 16 February 2024 (UTC)Reply[reply]
Without talking about medicine in particular, one of our big problems on WP is that editors spend vast amounts of time on articles that readers don't want to read (see FAC at any time), and more attention to ones that people do want to read would be a good thing. Johnbod (talk) 04:26, 17 February 2024 (UTC)Reply[reply]
@Iridescent has argued that this isn't necessarily a bad thing. I can get a good bit of information on popular subjects (whether that's the new box-office hit or a common disease) anywhere. There aren't many good and freely available webpages talking about obscure subjects (whether that's a rare disease or an obscure work of art). WhatamIdoing (talk) 05:26, 17 February 2024 (UTC)Reply[reply]
I never found Iri's oft-repeated argument convincing, or not for a vast range of "mid-popularity" subjects (up to say 1,000 views per day). When writing a new article or expansion, I always do a google search, and in most cases imagining what a teenage reader would make of it is rather depressing. Actually the rest of the internet does rare diseases rather well, I would have thought. Just one or two "good and freely available webpages" will be enough for most readers. Johnbod (talk) 15:39, 17 February 2024 (UTC)Reply[reply]
At least two, I think, if the subject is a 'serious' one. I've heard that a typical pattern is to read several sites, and look for what matches. If all the sites say ______, then you tend to trust that they're correct. If they disagree with each other, then you know that you don't have the full story (e.g., maybe some of them are out of date). If you only read one, you're not sure whether the one you read is correct or if you landed on an outlier. Figuring out whether the contents of the webpage you're reading matches your own experiences/beliefs, and whether it matches other seemingly reputable sites, is one of the main ways that people decide what content to trust on the internet.
We even do that here; "figure out whether all the apparently reliable sources agree" is the basis for most of NPOV. WhatamIdoing (talk) 18:36, 17 February 2024 (UTC)Reply[reply]
On the other hand, I think Wikipedia has too much coverage of the ephemeral and popular and not enough of things of long-term significance, i.e., the problem of recentism. But then, we are all (well, almost all) volunteers who edit what they want to edit. Wikipedia is what it is, and I remain happy to primarily work on articles that often get less than 10 hits a day. And, then, something breaks in the news, and an article that has been getting less than a hundred hits a day suddenly gets 95 thousand in one day, and Wikipedia has proved its usefulness by having information about some obscure topic that everyone suddenly wants to know about, such as happened here. Donald Albury 13:10, 17 February 2024 (UTC)Reply[reply]
Well, Jean Tatlock above is a good example of that. Perhaps one day the FA 1986–87 Gillingham F.C. season (av views 2 per day) will get its day in the sun. Johnbod (talk) 15:39, 17 February 2024 (UTC)Reply[reply]

I think medicine is a relatively poor example because as a topic area it contains a lot of living people and other "human interest stories" which may become popular for a brief period of time. Sure, people might want to know about Paolo Macchiarini this month, but it's clear it's not as important to the Medicine WikiProject as, say, Pain. I found WikiProject Mathematics' popular pages chart to be one that seems significantly less volatile:

1 Stephen Hawking 1,330,671 42,924 B Mid
2 Albert Einstein 643,550 20,759 GA High
3 1 296,115 9,552 C Top
4 Ted Kaczynski 289,597 9,341 FA Low
5 Isaac Newton 277,526 8,952 GA Top
6 0 271,987 8,773 B Top
7 Alan Turing 244,721 7,894 GA High
8 Fibonacci sequence 230,374 7,431 B High
9 Srinivasa Ramanujan 198,130 6,391 GA Top
10 Pi 194,909 6,287 FA Top

The popularity seems much more associated with importance to the field here. The only outlier is someone who is better known for things outside of mathematics, to put it mildly. I would guess that other academic and scientific subjects without much "drama" would also line up well with their importance. I checked WikiProject Geology and it also seems relatively stable and well-correlated with importance. Pinguinn 🐧 19:30, 17 February 2024 (UTC)Reply[reply]

I see that WP:MATH has the same challenge that WP:MED does in one respect, though: If you want to write about math itself, instead of mathematicians, then you generally aren't going to be interested in the most popular articles. WhatamIdoing (talk) 21:36, 17 February 2024 (UTC)Reply[reply]
I'd say scientists are way more related to their specific sciences than math. Aaron Liu (talk) 22:14, 17 February 2024 (UTC)Reply[reply]

Redefinition of ECP[edit]

Since the topic keeps coming up, and concerns about gaming this don't seem to be going away, I want to open another WP:RFCBEFORE discussion about the possibility of redefining ECP to address gaming concerns.

My previous proposal turned out to be unworkable, so I'm going to keep this discussion a little broader:

  1. Should we create a minimum byte size for some or all of the edits? For example:
    1. At least 500 significant edits, with "significant edits" defined as:
      • 200 bytes, or
      • 100 bytes, or
      • 20 bytes
    2. At least 250 major edits and 250 minor edits, with "major" and "minor" edits defined respectively as:
      • 200 bytes and 20 bytes, or
      • 100 bytes and 10 bytes
  2. Should we require a minimum level of mainspace participation? For example:
    1. At least 500 edits total, including at least 250 edits to mainspace
  3. Should we exclude reverts and/or reverted edits?
  4. Should we exclude edits made to topic areas covered by ECP?

#1, #2, and #3 will be possible to automate; while it should be possible to modify MediaWiki to work this way, it probably won't be practical. Instead, I would suggest we create an admin bot that checks whether these criteria are met and grants ECP when they are.

The one that we won't be able to automate is #4, but it will make it clear to admins when it is appropriate to manually revoke ECP. BilledMammal (talk) 02:46, 13 February 2024 (UTC)Reply[reply]
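For concreteness, the automatable checks (#1, #2, and #3) could be sketched roughly as below. This is purely an illustration of the proposed criteria, not an existing bot: the edit-record fields (is_revert, was_reverted, byte_delta, namespace) and the default thresholds are all assumptions picked from the examples in the list above.

```python
# Hypothetical sketch of the automatable ECP checks (#1-#3).
# Edit records are plain dicts; field names are illustrative only.

def meets_ecp_criteria(edits,
                       min_total=500,      # assumed: 500 qualifying edits
                       min_mainspace=250,  # assumed #2: 250 in mainspace
                       min_bytes=20):      # assumed #1: smallest size floor
    """Return True if an edit history passes sketch criteria #1-#3."""
    # Criterion #3: exclude reverts and edits that were themselves reverted.
    counted = [e for e in edits
               if not e["is_revert"] and not e["was_reverted"]]
    # Criterion #1: only edits adding at least min_bytes are "significant".
    significant = [e for e in counted if e["byte_delta"] >= min_bytes]
    # Criterion #2: require a minimum number of those edits in mainspace
    # (namespace 0).
    mainspace = [e for e in significant if e["namespace"] == 0]
    return len(significant) >= min_total and len(mainspace) >= min_mainspace
```

An adminbot would run a check like this against the user's contribution history and grant or withhold ECP accordingly; the point of the sketch is just that #1-#3 reduce to simple filters over edit metadata.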

How many times a month does the community conclude that an editor is actually "gaming" ECP? For example, when was the last ANI discussion that concluded someone had gamed ECP? Can you name any editors blocked or otherwise sanctioned for gaming ECP this calendar year?
This feels a bit like a perceptual problem instead of a practical problem – like when violent crime is actually down, but unfounded fear of crime is up, so the politicians give speeches about being tough on crime. WhatamIdoing (talk) 03:16, 13 February 2024 (UTC)Reply[reply]
Part of the trouble is that there is no clear definition of what gaming is; this is intended to help create such a definition. Further, action in many cases has been rejected because editors make the reasonable point that it was unfair to expect editors to abide by standards we don't tell them about; this is intended to allow us to tell editors these standards.
There are some examples, such as Onesgje9g334 who had their ECP revoked, but these are mostly the more obvious cases. BilledMammal (talk) 04:11, 13 February 2024 (UTC)Reply[reply]
I don't see why one would exclude topic area edits that haven't been reverted already; that sounds like bureaucracy. 3 should be split into two when this gets proposed (personally I like the second part much better). Unsure about the first one per WAID Aaron Liu (talk) 03:19, 13 February 2024 (UTC)Reply[reply]
Largely because there's a question of whether it's appropriate for an editor who evaded the restriction for long enough to gain ECP through those evasions to have ECP - there are two parts to this, those editors who continued evading after being made aware, and those editors who only evaded while unaware. BilledMammal (talk) 04:11, 13 February 2024 (UTC)Reply[reply]
How exactly does one "evade the restriction"? What does that mean, exactly? Do you mean that if a page is EC-protected, then someone without EC rights has managed to edit it anyway? That sounds like the basis for a bug report, not for a policy change.
Or does this mean something like "Well, Hamas is under ECP, but there are articles that mention Hamas that aren't ECP, so if you edit one of those, or if you create an article about a notable subject related to Hamas that is put under ECP later, you were 'evading the restriction', because we only mean for 0.75% of successful editors to be able to contribute their bit in this area?" Your first 500 edits includes contributions to the talk pages of articles about Israel [2] and Ukraine [3]. Were you evading any restrictions when you participated in those discussions?
As for the rest – if it's not obvious, like the editor who adds or subtracts a couple of letters through a couple hundred edits in a single day, then I don't see the point in calling it "gaming". I think you need to explore what editors actually mean when they claim that someone (or the unspecified, vague 'others') are gaming. Once you know what they actually want stopped, you'll have an easier time formulating a relevant proposal. WhatamIdoing (talk) 04:39, 13 February 2024 (UTC)Reply[reply]
On how to explore that subject: I think a couple of in-depth user interviews, with people who have recently made such an accusation, would be the way to find out what the complainants actually meant. Whether you agree with them or think their reasons are founded is not especially relevant. It's like older people saying "Too much crime around here" when what they really mean is "There's been a massive demographic shift in my community. When I was young, all the kids I saw were from my social/racial/ethnic group, but these days, all the kids I see don't look like me, and it makes me feel isolated and insecure". You have to figure out what "They're gaming the system" actually looks like, regardless of whether you sympathize with the situation that they're complaining about. WhatamIdoing (talk) 04:44, 13 February 2024 (UTC)Reply[reply]
Were you evading any restrictions when you participated in those discussions? Possibly; it depends on how the community assesses it. I would consider edits made in contravention of the restriction, whether intentionally or unintentionally, to be "evading", while edits made in line with the restriction would be permissible (I think those edits were in accordance with the restrictions as they were at the time - in particular, I think that ECP for Ukraine-Russia was applied after I made those edits), but the broader community may apply a stricter standard. I also wouldn't consider a couple of edits made inside the topic area to be a cause for concern; if an editor has made 450 outside but 50 inside, it simply isn't worth manually revoking ECP only to manually reinstate it a week later. Again, though, the broader community may apply a stricter standard.
For me, this comes down to what the purpose of ECP is. In my opinion, the purpose is to keep the topic areas functional; to make it harder for bad actors to participate in the topic area, and ensure that good actors have sufficient experience to participate in the topic area.
I'm not convinced that the current requirement is sufficient, and I think even a minor strengthening of it (for example, only unreverted edits above 20 bytes made outside the topic area contribute) would be a very positive step. BilledMammal (talk) 04:57, 13 February 2024 (UTC)Reply[reply]
The best public policy design is based on data. You say I'm not convinced that the current requirement is sufficient where sufficient is to keep the topic areas functional; to make it harder for bad actors to participate in the topic area, and ensure that good actors have sufficient experience to participate in the topic area. Why not show that there is a strong association between (i) edit types that happen in editors' 0-500 edit history and (ii) how they turn out to be "good" or "bad" actors?
You'd have to find an agreed-upon labeling schema that marks editors as "good" or "bad". Maybe whether they were banned?
I think pinning these things down will help productively move this idea forward and make it more grounded. eyal (talk) 16:14, 13 February 2024 (UTC)Reply[reply]
+1 to what @Eyal3400 said.
For example, all of these proposals are about the types of edits. Maybe it would make more sense is to adjust the minimum age of the account, which is currently 30 days. The account accused of gaming is currently 8 months old, reaching 500 edits after about five weeks and having EC manually removed around 5.5 months. Maybe setting the EC minimum to three or six months would help. If this is motivated by WP:ARBAP2, then we could set it, at least temporarily, to 9 months (=unless you have already registered an account, you will not be able to edit articles about the US presidential election until after election day, which is 8 months, 23 days from now).
But I'm still thinking: Where's the evidence that we have a significant problem that we aren't already capable of handling under the current rules? WhatamIdoing (talk) 16:33, 13 February 2024 (UTC)Reply[reply]
The problem with gaming is that any transparent procedure is open to it. For example, if the limits were set as in the immediately preceding edit (only unreverted edits above 20 bytes made outside the topic area contribute) how would we look upon an editor who made 500 unreverted edits each of exactly 21 bytes? The only ways to deal with gaming conclusively are by making our procedures secret (which goes against our culture so much that I won't even consider it) or not clearly defined. Phil Bridger (talk) 11:42, 13 February 2024 (UTC)Reply[reply]
We tell editors that to edit a topic area you need to meet certain criteria; I actually think it is reasonable and even natural for an editor interested in a topic area to work towards those criteria.
As such, if we decide the limit is 20 bytes and an editor makes 500 good faith edits each of 21 bytes, then I would suggest we do nothing; they've abided by our rules, and that's good enough - as you point out, we can't prevent editors working towards the criteria, but we can set the criteria at a level where we don't mind if editors do so. BilledMammal (talk) 15:10, 13 February 2024 (UTC)Reply[reply]
  • We currently tell editors that 500 edits count; an editor diligently worked towards that criteria with at least 250 edits of 1–3 bytes (plus perhaps as many as 250 apparently normal edits, looking only at the undeleted ones), and you level accusations of gaming.
  • You propose telling editors that 500 edits of 20 bytes count; we ask about an editor diligently working towards that criteria with edits of 21 bytes, and you say that's not gaming.
Does that make you think that One of These Things (Is Not Like the Others)? It makes me think that. WhatamIdoing (talk) 16:12, 13 February 2024 (UTC)Reply[reply]
I don't understand what you're saying with Does that make you think that One of These Things (Is Not Like the Others)? It makes me think that.
I think they're both gaming; I just think the way to address it is to set the criteria to a level where we don't mind if editors do so. BilledMammal (talk) 20:52, 13 February 2024 (UTC)Reply[reply]
So if someone adds a simple link to 500 articles (+4), that would be gaming and would bother you/the community? But if they add {{refimprove|date=February 2024}} to 500 articles (+33), then that would be gaming but wouldn't bother you/the community? WhatamIdoing (talk) 21:09, 13 February 2024 (UTC)Reply[reply]
Personally, I would set the criteria as higher than 20 bytes, but if the community decides that 20 bytes is sufficient I'm not going to oppose tightening restrictions because I don't think they're tightened enough. BilledMammal (talk) 21:11, 13 February 2024 (UTC)Reply[reply]
  • I'm fairly strongly opposed to managing page protection with a process that requires an administrator to individually screen editors. Complex automated screening should be done on the back end, which would require significant developer support (mw:Extension:FlaggedRevs can do many parts of this, but is lacking such support and would not be deployed without it). — xaosflux Talk 15:06, 13 February 2024 (UTC)Reply[reply]
    Unless we really want to go all-in on it and make everyone apply for actual manual review. Perhaps change the auto ECP to 5000/300, and send everyone to WP:PERM/EC. I don't think this is a great idea, but it eliminates making everything the responsibility of one admin. — xaosflux Talk 15:21, 13 February 2024 (UTC)Reply[reply]
    The current rule (500+ edits) excludes 99.25% of editors who have ever made an edit (ignoring the fact that the majority of qualified accounts are inactive).
    Setting it to 5,000 would exclude something like 99.95% of all editors. Do we really want only 0.05% of editors to be able to edit articles on some very large subject areas? WhatamIdoing (talk) 16:15, 13 February 2024 (UTC)Reply[reply]
    @WhatamIdoing that was just an arbitrary large number, this would be if the "normal" route to ECP becomes "manual review" - could still leave something big for automatic. — xaosflux Talk 21:14, 13 February 2024 (UTC)Reply[reply]
    foundation:Wikimedia Access to Temporary Account IP Addresses Policy uses a default threshold of 300 edits and six months. Choosing 300 edits vs 500 edits did not significantly change the barrier (from "very high" to "slightly higher than that"). If you are a high-volume editor, then you are a high-volume editor, and if your editing leads you to make the first 300 edits, then it will lead you right along to the next 200 edits.
    I don't remember hearing about any specific research for the six-month level; we may have taken it from Wikipedia:The Wikipedia Library. I had recommended against EC's 30-day minimum, and apparently they agreed with me. WhatamIdoing (talk) 00:48, 14 February 2024 (UTC)Reply[reply]
    Adjusting the automatic from 500/30 to 500/180 is something we could do as well. I don't think 5000/300 is a "good value" - I was just suggesting some fallback if we went to a normally manual process (perhaps 1000/360 would work there as well). — xaosflux Talk 11:29, 14 February 2024 (UTC)Reply[reply]
    Is there even enough support for a system where WP:PERM/EC is the typical route to justify listing it in an RfC? My suspicion is that the overwhelming majority of users would oppose it due to sysop workload, the fact that it would make the ability to edit some massively important areas contingent on an individual's opinion, and so on. Sincerely, Novo TapeMy Talk Page 16:11, 14 February 2024 (UTC)Reply[reply]
    I also have my doubts about this. How many potential candidates are we talking about? Hundreds or thousands? About 800K editors made one or more edits last year; if 1% of them achieved 500 edits, that would be 8,000 potential candidates to review. That's probably the highest possible estimate. If it's 0.1%, then that's 800 potential candidates, or several per week.
    More likely, though, is that few of them would know that they could ask for it, so we'd be silently missing out on contributions without having so many manual requests. WhatamIdoing (talk) 19:03, 14 February 2024 (UTC)Reply[reply]
    I agree with WAID. 5000 is way too much and would lead to more gaming Aaron Liu (talk) 16:43, 13 February 2024 (UTC)Reply[reply]

Urging people to register an email account[edit]

I just read yet another sad case of a productive editor who lost access to their account because they lost the password and didn't have an email address registered, so they couldn't recover it. What would people think about a bot which looked for accounts that don't have email registered and dropped them a message on their talk page explaining the risk of not being able to recover a lost password and how to fix that? We'd probably want some filter criteria, like only doing it for accounts which have been active in the past N days and only sending a reminder once per year. RoySmith (talk) 17:02, 13 February 2024 (UTC)

Maybe we should start with a watchlist notice or a sitenotice that displays only to extended-confirmed editors. I don't know whether the sitenotice can be controlled according to whether e-mail is registered, but perhaps that's not terribly important. Some of us may have invalid/outdated e-mail addresses.
As for making a list for personal messages, we could consider 500+ or 1,000+ edits and perhaps 1+ years old, or anyone with "advanced" user rights. Maybe it would be worth excluding the handful of people who have an e-mail address registered at another SUL-connected wiki (but disabled here).
Have you thought about Echo/Notifications messages? It's also possible to send to any list of individuals. That would be more private for the notified people and less potentially annoying to the people watching their pages. Once we've dealt with the backlog, it's possible to have an automatic trigger for a notification when a milestone is reached. Wikipedia:The Wikipedia Library congratulates people on reaching 500 edits. Maybe when you reach a certain level, it could suggest making sure that an e-mail address is set in your prefs. WhatamIdoing (talk) 17:44, 13 February 2024 (UTC)
Have you considered editing the milestone notifications locally? Trizek_(WMF) (talk) 17:51, 13 February 2024 (UTC)
I like the idea of this being a notification, for all the reasons WhatamIdoing pointed out. RoySmith (talk) 18:09, 13 February 2024 (UTC)
"Has email" isn't something I think we can determine publicly, but for most people "Is emailable" (which can be determined) is good enough - we could MMS ("is NOT emailable" AND "in some group") I suppose. — xaosflux Talk 21:19, 13 February 2024 (UTC)
I'm assuming whatever causes the "Email this user" link to show up in the sidebar is the same thing that allows you to do an account recovery, no? RoySmith (talk) 21:48, 13 February 2024 (UTC)
That's what he means by "is emailable". The user has to not deselect "allow emails from other users" for that link to show up. Aaron Liu (talk) 22:13, 13 February 2024 (UTC)
What about people who know they could do that but don't want to? 🌺 Cremastra (talk) 22:30, 13 February 2024 (UTC)
I envision that they'll get one reminder and there will be a way to opt out of additional reminders. RoySmith (talk) 22:35, 13 February 2024 (UTC)
Or, we only add it to the 500 edits milestone and maybe the 5000 (or the 10000) one, and forsake any further warnings. Aaron Liu (talk) 23:55, 13 February 2024 (UTC)
(Wikipedia:The Wikipedia Library is a 500-edit milestone, so I think we should pick a different round number.) WhatamIdoing (talk) 03:34, 16 February 2024 (UTC)
Why not both? Aaron Liu (talk) 12:33, 16 February 2024 (UTC)
Because if you get one message in January, and a different message in February, you're more likely to take action on both of them than if you get both at the same time.
Also, do we really want to wait for 500 edits? Why not show this suggestion sooner? WhatamIdoing (talk) 22:29, 16 February 2024 (UTC)
I don't see why that'd be the case, especially since both messages are short.
Maybe 100? Aaron Liu (talk) 22:57, 16 February 2024 (UTC)
I lost my initial account, User:Union Tpke 613, since I was not yet old enough (13) to get a Gmail account, and thus didn't link it to an email. Kew Gardens 613 (talk) 06:46, 16 February 2024 (UTC)

How Many Opinions Are Required to Reach Consensus?[edit]

I'm not sure what forum is best for this post.


I think there should be far more responses on talk pages before anyone declares that a consensus has been reached. The talk pages are great and useful for discussing and editing article content. Yet they usually don't approach consensus with any scientific or academic validity; there are usually not enough responses. I'd recommend that the policy governing consensus be updated with statistical polling requirements.


This is information I found easily that should have some bearing on consensus:
A quick evaluation suggests that, at the very minimum, to reach a consensus with a 10% margin of error, Wikipedia should get responses from 86 administrators and/or 100 Wikipedians.

ProofCreature (talk) 22:01, 14 February 2024 (UTC)
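For reference, the 86/100 figures quoted above appear consistent with the standard sample-size formula for estimating a proportion, assuming a 95% confidence level (z ≈ 1.96), a 10% margin of error, and a finite-population correction applied to a pool of roughly 850 active administrators. Both the confidence level and the administrator count are assumptions here, not figures stated in the original post. A minimal sketch:

```python
import math

def sample_size(population: int, margin: float = 0.10,
                z: float = 1.96, p: float = 0.5) -> int:
    """Sample size needed to estimate a proportion within `margin`,
    at the confidence level implied by `z`, with finite population
    correction. p=0.5 is the worst-case (maximum-variance) proportion."""
    n0 = (z ** 2) * p * (1 - p) / margin ** 2   # infinite-population size (~96)
    return math.ceil(n0 * population / (n0 + population - 1))

# 850 active administrators is an assumed round figure, not an official count
print(sample_size(850))       # finite pool of administrators -> 87
print(sample_size(10 ** 9))   # effectively unlimited editor pool -> 97
```

With these assumptions the formula yields roughly 87 administrators and just under 100 editors overall, close to the 86/100 quoted.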

Simply no.
That just takes too much time. If not enough people respond and there aren't opposes, just implement the edit. It can always be reverted later.
WP:Wikipedia is not a democracy. We don't care about polling to get sample sizes unless the change is going to be so huge it warrants Wikipedia:Centralized discussion. Aaron Liu (talk) 22:26, 14 February 2024 (UTC)
A lot depends on what you are assessing consensus about. Sometimes two mildly disagreeing editors who are willing to compromise is all it takes to reach consensus. At other times you need more (even a wide sample from the broader community) to assess consensus. That said… one thing you can say: the fewer editors involved, the weaker the consensus is… and the more editors involved, the stronger the consensus is. Blueboar (talk) 22:48, 14 February 2024 (UTC)
Fair. ProofCreature (talk) 17:38, 15 February 2024 (UTC)
I get that. It is certainly easier to include one's own opinion and ignore any research or statistics - certainly easier and not uncommon.
Wikipedia lends itself to this kinda editing by allowing for reverting and multiple editors. That's the point of Wikipedia and a large part of the reason it's useful. I'm not sure it's a consensus, though. ProofCreature (talk) 17:43, 15 February 2024 (UTC)
In the vast majority of discussions I've been involved with here, getting that many responses would have been a miracle. Sometimes getting responses from five Wikipedians is challenging. Sometimes getting any responses when I try to initiate a discussion takes more effort than I would have anticipated. DonIago (talk) 13:52, 15 February 2024 (UTC)
One of the lesser-known strengths of the community is its genuinely high level of courtesy. If a capable, responsible editor would like a change, people will often allow the edits to proceed without debate. Sm8900 (talk) 15:01, 15 February 2024 (UTC)
True. ProofCreature (talk) 17:37, 15 February 2024 (UTC)
Please no. If you look at the first listing on CENT, you'll notice it has a grand total of 6 commenters. Even the survey about Vector (2022) only has 63 editors. Getting 100 !voters, especially on pages about topics the vast majority of people have never even heard of, let alone care about, would be literally impossible. Sincerely, Novo TapeMy Talk Page 17:23, 15 February 2024 (UTC)
True. ProofCreature (talk) 17:37, 15 February 2024 (UTC)

Honestly, I don't expect much from this proposal. I am entirely aware of how difficult it is to get more than a few Wikipedians to comment on any given page or article. There are many reasons for this. Disinterest in anything but "pet topics" is one big reason. There's also malaise and a lack of motivating factors (ego, boredom, and a love of sharing learning are about the only motivations I can see to edit anything).

What I'm hoping will come of this suggestion is for the statistics that describe consensus to be somehow included in the protocol for reaching a consensus, without becoming a mandate; more of a suggested action or just an FYI. There should be an effort (due diligence) made to draw in comments from Wikipedians who are less interested in the topic. It should also be noted, somehow, that limited comments (a number below the threshold for statistical consensus) can create an echo chamber effect. ProofCreature (talk) 17:55, 15 February 2024 (UTC)

  • I think that it might be useful as an essay; but even then, it's a matter of how strong of a consensus is required. Most normal edits to individual articles do not require a very strong consensus at all, so this wouldn't be applicable to them; even most policy changes don't require the thresholds you describe here. But an essay on the subject would be useful as a reference for people who are considering the thresholds necessary to make truly drastic changes to policy with wide-ranging implications (I believe the threshold you mention is somewhat close to the currently required threshold necessary to amend ArbCom's charter, say.) It's also important to note that this assumes a random sample and that, while we do make some effort to prevent canvassing and meatpuppetry, Wikipedia participation is still not random, whether we're talking about participation in any one RFC or across Wikipedia as a whole. --Aquillion (talk) 21:38, 15 February 2024 (UTC)
    I'm not sure it would be useful even as a WP:User essay. The point behind discussion is to find out what the answer is. We know we have "the answer" when the results endure. For reasons of efficiency, the answer should be found out with the least amount of effort by anyone. If that can be done with no discussion, then great! If it can be done with a discussion involving just a few people, then good! If we really need WP:100 – well, perhaps twice a year we feel that we need that many editors to express an opinion. And unfortunately, that doesn't always result in a durable answer.
    @ProofCreature, I suggest that you look up the research Google did years ago on the correct number of interviewers they needed to make a decision about whether to hire someone. The answer was four (for most jobs). Any more than that was just a waste of resources that didn't change the outcome. WhatamIdoing (talk) 21:51, 15 February 2024 (UTC)

WikiProject Namespace[edit]

Hello there, Village Pump! I am currently thinking about a WikiProject namespace, like changing Wikipedia:WikiProject Chess to WikiProject:Chess. I had this idea because it would look nicer in my opinion. Maybe the shortcut could still be WP: because WikiProject and WikiPedia share the letters W and P? - Master of Hedgehogs (always up for a conversation!) 19:23, 16 February 2024 (UTC)

This has been suggested before, though (if memory serves) not for this reason.
WikiProject:Chess and similar pages could be set up as Wikipedia:Cross-namespace redirects. I believe that the abbreviation (WP:) could usually be made to work through the addition of specific redirects.
However, I'm not sure that it would make much difference. Most experienced editors are going to use WP:CHESS, so we're not going to see the name in discussions. WhatamIdoing (talk) 22:40, 16 February 2024 (UTC)
Note Wikipedia:Chess is already a redirect to the corresponding WikiProject, and Project is an alias for the Wikipedia namespace, so Project:Chess will redirect as well. isaacl (talk) 23:00, 16 February 2024 (UTC)

New sister project: WikiForum? WikiEssay?[edit]

So, I don't exactly know what the name would be, but I think a sister project where people would be allowed to write about things without worrying about notability, citations, etc. would be good. Kind of like Wikipedia essays, but on any topic. It would still have restrictions, of course, for inappropriate content and attacks on people or organizations. What do you think? Youprayteas (t c) 16:22, 17 February 2024 (UTC)

I think you're looking for forums like LessWrong, Reddit, kbin, etc. I don't think hosting such arguments would align with the WMF's goal of free knowledge. Banning all attacks on people or organizations would also severely restrict the quality essays that might result. Aaron Liu (talk) 16:30, 17 February 2024 (UTC)
Proposals should go to meta:Proposals for new projects. 115.188.119.62 (talk) 21:17, 17 February 2024 (UTC)