Proposal for an extension to PRISMA for systematic reviews that are based on clinical trial registry entries

The advent of clinical trial registries has enabled a new means of evaluating and synthesizing human research; however, there is little specific guidance from PRISMA for researchers who wish to include clinical trial registry entries in their systematic reviews. I suggest an extension to PRISMA to directly address these gaps.

My main suggestions would be to explicitly require researchers to:

  • Justify which clinical trial registries were included
  • Specify retrieval methods (“downloaded from” is not enough)
  • Distinguish between human-curated vs machine-interpreted data
  • Specify details of procedure for human-curated data, or code and quality control efforts for machine-interpreted data
  • Provide the decision procedure for matching registry entries to publications

I have provided explanations, examples and code below where I felt it was appropriate.

Choice of sources

There are currently 17 primary clinical trial registries other than ClinicalTrials.gov listed by the WHO that meet the 2009 WHO registry criteria. Most reviews of clinical trial registry entries only include entries from ClinicalTrials.gov, and few provide any rationale for their choice of trial registry. This is a small enough number of registries that it is reasonable to ask authors to specify which ones were searched, or to justify why any were excluded.

Specification of retrieval methods

There are at least four distinct ways to download data from ClinicalTrials.gov alone:

  1. The entire database can be downloaded as a zipped folder of XML files.
  2. A CSV or TSV file containing a table of search results can be downloaded from the web front-end.
  3. A zipped folder of XML files can be downloaded from the web front-end.
  4. The API can be queried for an XML response.

These methods do not provide the same results for what may appear to be the same query.

For example, a search performed on the web front-end of ClinicalTrials.gov for the condition “renal cell carcinoma” returns 1745 results. (See Code example 1.)

A query to the API for the condition “renal cell carcinoma,” however, returns 1562 results. (See Code example 2.)

These are both searches of ClinicalTrials.gov for the condition “renal cell carcinoma,” but each produces a very different set of records. The difference is that the web front-end also includes search results for synonyms of “renal cell carcinoma,” in order to ensure the highest sensitivity for patients who are searching for clinical trials to participate in.

Similarly, when searching for a particular drug name, the web front-end will often include results for related drugs: a search for temsirolimus, for example, also returns results for rapamycin.

PRISMA currently tells researchers to “Present full electronic search strategy for at least one database, including any limits used, such that it could be repeated.” More specific guidance seems to be required, as (in my experience) the bulk of systematic reviews of clinical trial registry entries do not distinguish between downloading results via the API vs the web front-end.

Human-curated data vs machine-interpreted data

Post-download screening steps

Screening clinical trial registry entries for inclusion or exclusion can often be done at the point of searching the registry. In many cases, however, the search tools provided by a clinical trial registry do not have exactly the right search fields or options, and so post-download screening based on the data or on human judgement is common. It is often unclear which screening steps were performed by the registry search, which were post-download filters applied to the data set, and which were based on the judgement of human screeners. To ensure transparency and reproducibility, reviewers should be specifically required to distinguish these steps, and to disclose the code for any automated filtering that was used.
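As a sketch of what “disclosing the code” for a post-download filter might look like, here is a minimal Python example that logs what each screening step removed. The field names (“overall_status”, “enrollment”) and the filtering criteria are hypothetical, not the actual schema of any registry download:

```python
import csv
import io

# Illustrative sketch only: the field names ("overall_status",
# "enrollment") and the criteria are hypothetical assumptions.
def post_download_filter(rows, min_enrollment=10):
    """Keep completed trials with at least min_enrollment participants,
    and log how many records each screening step removed."""
    completed = [r for r in rows if r["overall_status"] == "Completed"]
    large_enough = [r for r in completed
                    if int(r["enrollment"]) >= min_enrollment]
    log = {
        "downloaded": len(rows),
        "after_status_filter": len(completed),
        "after_enrollment_filter": len(large_enough),
    }
    return large_enough, log

tsv = ("nct_id\toverall_status\tenrollment\n"
       "NCT00000001\tCompleted\t50\n"
       "NCT00000002\tRecruiting\t200\n"
       "NCT00000003\tCompleted\t5\n")
rows = list(csv.DictReader(io.StringIO(tsv), delimiter="\t"))
kept, log = post_download_filter(rows)
print(log)  # every step is reported, not just the final count
```

Publishing even a small script like this makes it unambiguous which exclusions were applied by the search and which were applied afterwards.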

Extraction of clinical trial data

In a traditional systematic review of clinical trials, trial data is extracted by human readers who apply their judgement to extracting data points to be analyzed.

Reviews of clinical trials that are based on clinical trial registries often include analyses of data points that are based on machine-readable data. For example, answering the question “What is the distribution of phases among trials of sunitinib in renal cell carcinoma?” can be done in 5 lines of R code without any human judgement or curation at all. (See Code example 3.) However, there are other questions that would be difficult to answer without human interpretation, e.g. “Does the rationale for stopping this trial indicate that it was closed for futility?”

To make it more complicated, there are questions that could in principle be answered using only machine-readable information, but where that interpretation is very complicated, and in some cases it might be easier simply to have humans read the trial registry entries. E.g. “How many clinical trials recruit at least 85% of their original anticipated enrolment?” This question requires no human judgement per se; however, there is no direct way to mass-download historical versions of clinical trial registry entries without writing a web-scraper. A review that reports a result for this question may therefore be indicating that human readers opened the history of changes and made notes, or it may be reporting the results of a fairly sophisticated piece of programming whose code should be published for scrutiny.

These distinctions are often not reported, or if they are, there is not enough detail to properly assess them. Code is rarely published for scrutiny. Whether human-extracted data were single- or double-coded is also often left unclear. A result that sounds like it was calculated by taking a simple ratio of the values of two fields in a database may actually have been produced by a months-long double-coding effort or the output of a piece of programming that should be made available to scrutiny.

Data that was never meant to be machine readable, but is now

There are some data points that are presented as machine readable in clinical trial registries that were never meant to be interpreted by machines alone. PRISMA assumes that all data points included in a systematic review were extracted by human curators, and so there is a particular class of problem that can arise.

For example, in clinical trial NCT00342927, some early versions of the trial record (e.g. 2009-09-29) give anticipated enrolment figures of “99999999”. The actual trial enrolment was 9084. The “99999999” was not a data entry error or a very bad estimate—it was a signal from the person entering data that this data point was not available. The assumption was that no one would be feeding these data points into a computer program without having them read by a human who would know not to read that number as an actual estimate of the trial’s enrolment.

This can, of course, be caught by visualizing the data, checking for outliers, doing spot-checks of data, etc., but there is currently no requirement on the PRISMA checklist to report data integrity checks.
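For illustration, a minimal data-integrity check along these lines might look like the following Python sketch. The sentinel values and the 50x-median outlier rule are arbitrary assumptions for the example, not a standard:

```python
# Illustrative sketch only: the sentinel values and the 50x-median
# outlier rule are arbitrary assumptions for this example.
SENTINELS = {99999999, 9999999, 999999}

def flag_suspicious_enrolments(enrolments):
    """Return the indices of enrolment figures that look like
    placeholders (e.g. 99999999) or implausible outliers."""
    flagged = {i for i, n in enumerate(enrolments)
               if n in SENTINELS or n <= 0}
    clean = sorted(n for i, n in enumerate(enrolments) if i not in flagged)
    if clean:
        median = clean[len(clean) // 2]
        flagged |= {i for i, n in enumerate(enrolments)
                    if i not in flagged and n > 50 * median}
    return sorted(flagged)

# The placeholder is flagged; the real enrolment figure of 9084 is not.
print(flag_suspicious_enrolments([120, 9084, 99999999, 300]))  # [2]
```

Reporting that a check like this was run (and what it flagged) would cost a sentence in a methods section.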

Matching registry entries to publications or other registry entries

Not all systematic reviews that include clinical trial registry entries are based on registry data alone. Many are hybrids that try to combine registry data with data extracted from publications. Clinical trials are also often registered in multiple registries. To ensure that clinical trials are not double-counted, it is sometimes necessary to match trial registry entries with publications or with entries in other registries. For this reason, any review that includes more than one trial registry should be required to report its de-duplication strategy.

Trial matching or de-duplication is a non-trivial step whose methods should be reported. Even in cases where the trial registry number is published in the abstract, this does not guarantee a one-to-one correspondence between publications and registry entries, as there are often secondary publications. There is also a significant body of literature that does not comply with the requirement to publish the trial’s registry number, and the decision procedure for matching these instances should be published as well.

PRISMA does not require that the decision procedure for matching trial registry entries to other records or publications be disclosed.
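As a sketch of what a disclosed decision procedure might look like, the following Python fragment links a publication to a registry entry only when exactly one known NCT number appears in its abstract, and sends everything else to human review. The data shapes and the function name are hypothetical:

```python
import re

# Sketch of a disclosed matching procedure; the data shapes and the
# function name are hypothetical. NCT numbers are "NCT" + 8 digits.
NCT_RE = re.compile(r"NCT\d{8}")

def match_publications_to_registry(publications, registry_ids):
    """Link a publication to a registry entry when exactly one known
    NCT number appears in its abstract; send the rest to human review."""
    known = set(registry_ids)
    matched, needs_review = {}, []
    for pmid, abstract in publications.items():
        hits = set(NCT_RE.findall(abstract)) & known
        if len(hits) == 1:
            matched[pmid] = hits.pop()
        else:
            needs_review.append(pmid)
    return matched, needs_review

pubs = {"pmid1": "Final results of NCT01234567, a phase 3 trial ...",
        "pmid2": "No registry number is reported in this abstract."}
matched, review = match_publications_to_registry(pubs, ["NCT01234567"])
print(matched, review)
```

The point is not this particular rule, but that whatever rule was used (and how ambiguous cases were resolved) can and should be stated this explicitly.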

R Code examples

1. Search for all trials studying renal cell carcinoma using the web front-end

library(tidyverse)

temp <- tempfile()
download.file("", temp)
unzip(temp, list=TRUE)[1] %>%
  count()
## n
## 1744

2. Search for all trials studying renal cell carcinoma using the API

library(xml2)

read_xml("[Condition]renal+cell+carcinoma") %>%
  xml_find_first("/FullStudiesResponse/NStudiesFound") %>%
  xml_text()
## [1] "1562"

3. Distribution of phases of clinical trials testing sunitinib in renal cell carcinoma

read_xml("[Condition]renal+cell+carcinoma+AND+AREA[InterventionName]sunitinib") %>%
  xml_find_all("//Field[@Name='Phase']") %>%
  xml_text() %>%
  as.factor() %>%
  summary()
## Not Applicable Phase 1 Phase 2 Phase 3 Phase 4
## 5              13      53      13      3

How to rename a folder of images or movies by the date and time they were taken

If you’re renaming one file, this is overkill, but if you’re renaming several hundred files, this will make your life so much better. This might be useful if your smartphone happens to name your pictures and videos using the least useful convention possible: integers that increment from 1, starting when you got your phone. (Please do not leave a comment telling me to switch to Android, thanks.)

The following instructions should work on Ubuntu 20.04, and they assume you have a basic knowledge of the command line.

JPEG images

First, if you don’t already have it, install jhead like so:

$ sudo apt install jhead

Then cd to your folder of images and run the following command:

$ jhead -autorot -nf%Y-%m-%d\ %H-%M-%S *.jpg

This will rename all the .jpg files in that folder by the date/time they were taken.

You might need to repeat it for .JPG, .JPEG, etc.

Warning: If it can’t find metadata for the date/time inside that file, it will rename the file using the file’s creation date, which may or may not be what you want.


Movies

This one’s more complicated. You have to write a short shell script.

Step one: Learn Emacs.

Lol just kidding, use whatever text editor you want.

Make a new file and put the following in it:

#!/bin/bash
ext=$1
folder=$2

for file in $folder/*.$ext; do
    datetime=$(mediainfo "$file" | grep "Tagged date" | head -n 1 | grep -o '[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\} [0-9]\{2\}:[0-9]\{2\}:[0-9]\{2\}' | sed 's/:/-/g')
    if [ "$datetime" != "" ]; then
        newname="$folder/$datetime.$ext"
        mv "$file" "$newname"
    else
        echo "No metadata for $file"
    fi
done
Then make the file executable:

$ chmod +x

Now you can run your script!

$ ./ MOV '/home/yourname/Videos'

You might have to run this several times, once for each file extension: .mov, .MOV, .mp4, etc.

You can even open a terminal window and drag the shell script onto it, then type MOV, then drop the folder onto it, and it should work!

Warning: Make sure that there’s no trailing slash at the end of the folder. Also, the script doesn’t handle file names with spaces in them nicely, so get rid of them first. (In Nautilus, select all the files, then press F2 and do a find/replace for spaces to underscores, maybe?)

How to calculate Fleiss’ kappa from a Numbat extractions export

If you’ve done a systematic review using Numbat, you may want to estimate inter-rater reliability for one or more of the data points extracted.

First, make sure that all the extractors have completed all the extractions for all the references. If there is one missing, you will get an error.

When the extractions are complete, log in to your Numbat installation, and choose Export data from the main menu. Export the extractions, not the final version.

This will give you a tab-delimited file that contains a row for every extraction done by every user, which is unfortunately not the format that the Fleiss’ kappa function implemented by the irr package in R requires. (Hence the R script below.)
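To make the required reshaping concrete, here is a minimal Python sketch of the same long-to-wide pivot (illustrative only; the column names follow the export described here, and the real conversion is done by the R script in this post):

```python
# Illustrative sketch of the long-to-wide reshaping: pivot
# (referenceid, userid, value) rows into one row per reference
# (subject) and one column per user (rater), which is the shape
# that a Fleiss' kappa calculation expects.
def long_to_wide(rows, field="tx_prev"):
    refs = sorted({r["referenceid"] for r in rows})
    users = sorted({r["userid"] for r in rows})
    lookup = {(r["referenceid"], r["userid"]): r[field] for r in rows}
    return [[lookup.get((ref, user), "") for user in users] for ref in refs]

rows = [
    {"referenceid": "1", "userid": "a", "tx_prev": "treatment"},
    {"referenceid": "1", "userid": "b", "tx_prev": "treatment"},
    {"referenceid": "2", "userid": "a", "tx_prev": "prevention"},
    {"referenceid": "2", "userid": "b", "tx_prev": "treatment"},
]
print(long_to_wide(rows))
# [['treatment', 'treatment'], ['prevention', 'treatment']]
```

Each row of the result is one reference, each column one extractor.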

Next, choose which of the data points you wish to assess for inter-rater reliability. Let’s imagine that you were extracting whether a clinical trial is aimed at treatment or prevention, and this column is called tx_prev in the exported extractions file.

You could delete all the columns from the extractions file except the referenceid and userid columns and the data point of interest, in this case tx_prev. (A typical Numbat export will contain many more columns than these three.)

If you saved this TSV to your Downloads folder as numbat-export.tsv, you could use the following function to convert it into a data frame that is compatible with kappam.fleiss() from irr.


library(tidyverse)
library(irr)

extractions <- read_tsv("numbat-export.tsv")

get_numbat_extraction_data <- function (data, refid, uid, row) {
  return( as.character( unlist( data[data$referenceid == refid & data$userid == uid, row] ) ) )
}

numbat_transform_row <- function (data, row) {

  new_data <- data %>%
    select(referenceid) %>%
    unique()

  users <- data %>%
    select(userid) %>%
    unique() %>%
    unlist() %>%
    as.character()

  rids <- new_data %>%
    unlist() %>%
    as.character()

  for (user in users) {
    colname <- paste0("user", user)
    new_data[[colname]] <- ""
  }

  for (user in users) {
    for (rid in rids) {
      colname <- paste0("user", user)
      new_data[new_data$referenceid == rid, colname] <- get_numbat_extraction_data(data, rid, user, row)
    }
  }

  new_data$referenceid <- NULL

  return(new_data)
}

numbat_transform_row(extractions, "tx_prev") %>%
  kappam.fleiss()

This should give you a console printout that looks like this:

Fleiss' Kappa for m Raters
Subjects = 10
Raters = 3
Kappa = 0.583
z = 3.2
p-value = 0.0014

Congrats, you just calculated Fleiss’ kappa from your Numbat extractions!

Clinical agnosticism and when trials say “maybe”—a presentation for #SummerSchool

On 2020 August 4, I gave a presentation on Clinical Agnosticism as a part of #SummerSchool, a free, online, interdisciplinary academic conference.

You can download the slides from my presentation here. I transcribed my presentation in the Notes for each slide (Click View > Notes), if you want to know what I said, too!

If you want more information on this subject, this research was based on my doctoral thesis, The Moral Efficiency of Clinical Trials in Anti-cancer Drug Development. Chapter 5 will be of particular relevance.

Learning from pandemics to improve science

by Peter Grabitz MD and Benjamin Gregory Carlisle PhD

The year 2020 has already been full of surprises—many of which would have been difficult to imagine just a few months ago. Everything seems to be changing quickly, even science, which is not exactly known for being speedy! So it is worth taking a closer look: How has the pandemic changed the research landscape? And even more important: What can we learn from it?

Science caught Coronavirus

It is the end of May and we are sitting in a cafe. Under normal circumstances, this would not be worth mentioning. However, these are not normal circumstances. Berlin, Germany, and more-or-less the whole world is under lock-down. An international pandemic has spread and curfews have been imposed in almost every city, country and continent. What seemed like an initial competition for the strictest measures (South Africa was in the lead) has now been replaced by a competition for the fastest relaxation of those measures (as of 31 May, Thuringia seems to be winning(1)).

And as the world has changed, so has science itself. It has virtually freaked out, one might think: virologists are portrayed in all the daily and weekly newspapers. Other disciplines are now researching on and around the coronavirus as well. If you meet researchers in the digital Zoom corridors of their offices and laboratories, everyone can tell you about a new COVID-19 project. One could almost think that a competition has started in science for the most COVID-19 funding applications in the most distant disciplines (according to the authors, the idea of using homeopathy for COVID-19 prophylaxis wins(2)).

On the research funders’ side, a similar competition appears to be taking place: the German Federal Ministry of Education and Research, for example, has launched a 150 million Euro funding package for COVID-19 research at German medical centers alone.(3) A little later, the EU raised €7.4 billion for COVID-19 research and development.(4)

The incentives work, because in May alone, about 4000 articles per week were published about COVID-19.(5)

Figure 1: Number of publications listed in PubMed for keywords of different 21st century pandemics. Overall, pandemics generate academic interest. Coronavirus/Covid-19 plays in a league of its own. (Source: the authors searched PubMed on 31.05.2020)

Is that good? Well, there’s good and bad.

On the one hand, we are all in the middle of a pandemic of global proportions. It affects every facet of our daily lives. Our basic rights have been restricted, our goals and projects have been interrupted and even our safety is at risk. A vaccine should be found as soon as possible. (For those who are critical of vaccinations: they’re good, actually!) Further, research is needed right now on measures to contain the virus (Do school closures help? Masks? Are children less affected? Is aerosol or surface transmission more relevant?) Due to the dangers posed by the current pandemic, the volume of research published in response to COVID-19 is much higher than that produced in response to other pandemics of the 21st century, as Figure 1 shows.

On the other hand, the pressure that researchers are under to be the first with a hot-take on the Coronavirus pandemic can also backfire. For example, well known epidemiologist John Ioannidis’ controversial antibody study(6) has come under intense scrutiny, with allegations of research misconduct leading to an internal investigation by Stanford.(7) A major, influential study of hydroxychloroquine safety and efficacy in Covid-19 patients (8) has also been retracted, after concerns were raised regarding the veracity of the data.

As meta-researchers, we feel obliged to also think about other consequences of this gigantic re-prioritization of the global scientific apparatus.

What happens if everyone is mostly researching one thing? Is research involving bacteria now worth less than research on viruses? Or even worse: science that deals with quantum physics or space travel? Corona is reason enough to plunge the current generation of young researchers into a deep crisis of meaning. What should a PhD in English literature be worth now?1

If you will, all research has been infected with Sars-CoV-2. This can be impressively reconstructed by looking at the American study registry ClinicalTrials.gov. Here, clinical studies have to be registered before they start enrolling patients, and Principal Investigators are required to regularly update their registry entries with current information. Since March 2020, a large number of clinical trials have been suspended, withdrawn or completely terminated (see Figure 2). The green bars show how many studies were suspended, terminated or withdrawn with reasons that explicitly mention COVID-19. In total, as of May 31st, more than 50,000 patients had already been enrolled in studies that stopped because of the current pandemic. Of course, this is not only due to new priority-setting in clinical research. In many cases, study participants are no longer able to travel to hospitals, and lock-down measures prevent studies from continuing, too. However, the extent to which studies have been halted gives one food for thought.

At this moment it is not yet clear how many trials will “get back on their feet”. In a comparison cohort of trials that were interrupted in the same months in 2018, only about 10% were resumed within a year. So it remains to be seen how, and if, clinical research will recover from COVID-19. The “death rate” (i.e. studies that were stopped permanently because of COVID-19) is currently 2.5% (33/1336).

Figure 2: Clinical trials reported as interrupted, withdrawn or completely terminated in the American study registry ClinicalTrials.gov. The green bars show how many of these studies explicitly mention COVID-19 as the reason.(9)

Can we learn from pandemics to improve research?

In our everyday jobs we often ask ourselves how we could also implement positive changes in research. How can we make science more robust, transparent and useful? In the following we present two ideas:

Idea 1: More pandemics!

Pandemics, it turns out, are highly effective at moving the static research world and incentivizing new ways of thinking.

Instigating a global pandemic to combat every problem in science, however, may not be a realistic strategy. After careful consideration we regret to inform you that this strategy has an unfavourable risk-benefit balance and did not pass ethical review.

Idea 2: Learn from “bandwagon” effects and try to reproduce the right incentives (and avoid potential problems)

So here is a second suggestion:

Imagine, if you can, a world in which researchers stopped competing over being the first to rush out a pandemic-related publication and instead fought to get ahead of the curve in implementing good scientific practices. For example: who does the most reproducible and transparent research? The idea is to initiate a bandwagon effect similar to the one we are currently observing in COVID-19 research, just not for a scientific topic but for good research practices. We need to initiate a “fear of missing out” on the Good Science bandwagon among people who are able to make decisions regarding scientific practice!

If successful, we wouldn’t overhear conversations like “My lab published so-many papers about Covid-19” (because that’s what everyone is doing now) but rather “All of my lab’s papers are open access so that everyone can read them” or “How many of your papers include open, interoperable, searchable and reproducible data?”

Short background: the bandwagon effect is very well researched. In the popular book “Nudge” by Nobel Prize winner Richard Thaler, an experiment by the Minnesota Department of Revenue (10) is described, which investigated how tax compliance can be improved. Taxpayers were either

  • offered help with the paperwork needed to pay taxes correctly,
  • shown the benefits of tax compliance for society,
  • told that audits would be more frequent,
  • or informed that 93% of people in Minnesota already pay their taxes correctly.

The surprising result: Informing tax payers about the correct behavior of the vast majority of other people led to better tax compliance!

Marketing industries nudge us regularly using similar methods. Just think of the “3549 people” that “are looking at this accommodation right now” or the “only 3 backpacks left” in the online shop of your choice. Nobody wants to be the last to act. It doesn’t matter if it’s about the new iPhone, tax compliance or writing corona publications.

Note of caution to this idea: Of course, we need to be careful that attempts to create a “bandwagon” for Good Science are directed at those who have the ability to make such changes. For example, early career researchers who do not get to choose which journals they submit to should not be shamed for the decisions of their supervisors!

Example: Summary results in clinical research

Let’s take a closer look at an example. Clinical trials are an essential key to medical progress. Clinical trials are used to test whether a new therapy, a new active substance, a new medical product, a new diagnostic procedure or a new preventive measure has the desired benefit for improving care, and whether there is an appropriate balance between risks and benefits.(11)

This is how the German Science Council describes the importance of clinical studies. However, in order to fulfil their role as “essential keys to medical progress”, clinical studies must also publish their results. This may sound obvious at first, but it is not. A study published last year showed that for about a quarter of all completed clinical studies carried out by German university hospitals, there are no published results even after six or more years: neither as peer-reviewed publications, nor as summary results in registries. This corresponds to missing results from at least 171 clinical studies with at least 18,000 patients, completed between 2010 and 2014.(12)

Why is that? One can make a guess: there may be no one to hold study sponsors accountable. This is supported by the fact that industry now publishes 90-100% of its study results directly on registries. There is more attention on the transparent research practices of industry than on those of public institutions.

Another argument could be that the respective principal investigators have left the university and studies were never completed or did not produce results that are easy to publish.

However, none of this matters to the patients who put themselves at risk, or to the patients who depend on trustworthy evidence to guide medical practice. The risks and burdens assumed by patients who participate in clinical trials of experimental therapies are justified by the scientifically and socially valuable information that these trials create. It is difficult to say that there is any value at all in performing a clinical trial if no one knows what its results were. Non-publication also undermines the trust that patients place in the institution of human research.

So how can the situation be changed now?

The report we mentioned and many of the previous publications present their results in the form of a ranking (see Figure 3): this allows universities to compare themselves with each other. This comparison is exactly the language that universities—and the decision-makers at those universities—understand and listen to. No university wants to be at the bottom of the table.

Figure 4: “How many study results have you published in the last 6 months?” This graph could motivate a Principal Investigator to be like Heidelberg, in order not to be last in the next ranking. Source: TranspariMED blog (14).

Increased media attention on this issue has also contributed to universities now taking the topic more seriously, and in some cases achieving considerable and rapid success.

Figure 4 is taken from the TranspariMED blog (14) and shows the successes of different universities in recent months. One could even claim that half of German university medical centers are actively working on fulfilling their duty to publish all study results, and jumping on the Good Science bandwagon!

This improvement did not even require a pandemic, by the way. ;)


Pandemics like COVID-19 change science at lightning speed; all researchers and funding agencies want to make a positive contribution. One possible psychological basis is the bandwagon effect. We should learn from this and use more positive reinforcement methods to make science more transparent and robust. The case of university performance in clinical trial summary results posting suggests that creating bandwagon effects for good science can work. No pandemics necessary.


Where the authors work, there are other, more or less radical ideas to improve scientific practice. Those interested can find a comprehensive description here:

Strech D, Weissgerber T, Dirnagl U, on behalf of QUEST Group (2020) Improving the trustworthiness, usefulness, and ethics of biomedical research through an innovative and comprehensive institutional initiative. PLoS Biol 18(2): e3000576.

References

1. Connolly K. German state causes alarm with plans to ease lockdown measures. The Guardian [Internet]. 2020 May 25 [cited 2020 Jun 5]; Available from:

2. Wie Pseudomedizin gegen das neue Corona-Virus beworben wird [Internet]. MedWatch – der Recherche verschrieben. 2020 [cited 2020 May 31]. Available from:

3. BMBF-Internetredaktion. Karliczek: Wir fördern Nationales Netzwerk der Universitätsmedizin im Kampf gegen Covid-19 – BMBF [Internet]. Bundesministerium für Bildung und Forschung – BMBF. [cited 2020 Jun 5]. Available from:

4. Coronavirus Global Response: €7.4 billion raised [Internet]. European Commission – European Commission. [cited 2020 Jun 5]. Available from:

5. The most influential coronavirus research articles [Internet]. [cited 2020 Jun 5]. Available from:

6. Bendavid E, Mulaney B, Sood N, Shah S, Ling E, Bromley-Dulfano R, et al. COVID-19 Antibody Seroprevalence in Santa Clara County, California. medRxiv. 2020 Apr 17;2020.04.14.20062463.

7. JetBlue Founder David Neeleman Helped Fund The Stanford Coronavirus Antibody Study [Internet]. [cited 2020 Jun 7]. Available from:

8. Mehra MR, Desai SS, Ruschitzka F, Patel AN. RETRACTED: Hydroxychloroquine or chloroquine with or without a macrolide for treatment of COVID-19: a multinational registry analysis. The Lancet [Internet]. 2020 May 22 [cited 2020 Jun 7];0(0). Available from:

9. Carlisle, Benjamin. Clinical trials that were terminated, suspended or withdrawn due to Covid-19. 2020 [cited 2020 May 31]; Available from:

10. Coleman S. The Minnesota Income Tax Compliance Experiment State Tax Results [Internet]. Rochester, NY: Social Science Research Network; 1996 Apr [cited 2020 Jun 5]. Report No.: ID 1585242. Available from:

11. Deutscher Wissenschaftsrat. Empfehlungen zu Klinischen Studien (Drs. 7301-18) [Internet]. Hannover; 2018 Oct [cited 2019 Nov 26]. Report No.: Drs. 7301-18. Available from:

12. Wieschowski S, Riedel N, Wollmann K, Kahrass H, Müller-Ohlraun S, Schürmann C, et al. Result dissemination from clinical trials conducted at German university medical centers was delayed and incomplete. Journal of Clinical Epidemiology. 2019 Nov;115:37–45.

13. German universities: 445 clinical trials missing results [Internet]. transparimed. [cited 2020 Jun 1]. Available from:

14. German universities report record number of clinical trial results [Internet]. transparimed. [cited 2020 Jun 1]. Available from:


1 Rhetorical question. The answer is: Of course, the same as before COVID-19!

Reminders for myself for next time I re-install Regolith Ubuntu

  • Don’t try to install wifi drivers by downloading them on a USB and copying them over. You will be in dependency hell. Plug your phone in to tether the internet connexion, get the wifi drivers, then proceed normally
  • To get Bluetooth working:
  • Install Gnome-Tweaks, then change Appearance > Themes > Applications to Adwaita; there is no other way to have a non-dark theme
  • To install R:

Risk that a queer character was written for straight comfort assessment tool

Download as an .odt file

The following is a tool for assessing the risk that a fictional character was written mainly for the comfort of straight people. The version from 2020-04-15 is an early draft and will almost certainly be modified later.

Name of queer character and media in which they appear
Section A total
Section B total
Summary score: 30 + (Section A total) – (Section B total)

Section A

“Gay people just look like … people”

– Fucking JK Rowling

Score 0 for “doesn’t apply”; 1, “possibly or probably”; 2, “yes, for certain.”

Statement | Score (0-2)
“They’re not a gay character; they’re a character who happens to be gay”
Frames queer rights mainly or exclusively in terms of “love wins,” “love is love,” etc.
Police officer, military or clergy
Regards gay marriage as the end-goal of the gay rights movement
Overtly patriotic
Has adopted children or children by surrogate
White gay man with no mental health issues
Sweater-vest or other non-threatening clothing
It would be in-character for them to say, “I’m not like other queer people”
Votes Republican
If trans, this character or their story uncritically places a high value on “passing”
“They break gay stereotypes”
Married or monogamous
Upper middle class
Encounters and overcomes the kind of discrimination that lets straights say “I would never do that”

Section B

“That thing that only gay people do? I hate it for non-homophobic reasons.”

– Old straight-people proverb

Score 0 for “doesn’t apply”; 1, “possibly or probably”; 2, “yes, for certain.” For Section B, score 0 if the statement applies, but only as a cautionary tale, a joke or a character flaw.

Statement | Score (0-2)
Is trans
Has casual sex
Character highlights an intersection of queerness (e.g. being queer and Black)
Kink or fetish
Is single or has more than 1 partner
Lower level of formal education
Has difficulty with, or is critical of the police or other existing power structures
Engages in some stereotypically gay activity
Financial difficulty
In the closet, at least in some contexts
Flamboyantly gay or otherwise clearly queer-coded
Politically active in progressive causes
Sex worker
Depiction as a “good” character doesn’t depend on how chaste they are

Summary result

The range of possible scores for Sections A and B is 0 to 30 each. These statements have been equally weighted; however, there may be cases where some of them are important, or even defining, to the queer character in question and should be weighted more heavily.

Subtract the Section B total from the Section A total and add 30 to obtain a single summary measure ranging from 0 to 60. The higher the score, the more likely it is that the character was written for straight comfort.
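The scoring rule can be stated directly in code; this small Python function is just the arithmetic above:

```python
# Direct implementation of the scoring rule above: each section is
# fifteen statements scored 0-2, so section totals range from 0 to 30.
def summary_score(section_a_total, section_b_total):
    for total in (section_a_total, section_b_total):
        if not 0 <= total <= 30:
            raise ValueError("section totals must be between 0 and 30")
    return 30 + section_a_total - section_b_total

print(summary_score(12, 20))  # -> 22, on the 0-60 scale
```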

Introducing PubMed NCT Extractor

Inspired by talks at the 2019 METAxDATA un-conference, I wrote a little meta-research tool to batch extract clinical trial registry (“NCT”) numbers from PubMed XML search results and test whether they correspond to legitimate registry entries.

It’s written for Python 3 on elementary OS 5.1, so I can’t guarantee it will work on anything else. I also wrote a paper based on this tool that I’m currently sending to journals for review. If you’re interested in reading a draft, let me know and I’ll be happy to share it with you.
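This is not the tool itself, but the core idea can be sketched in a few lines: NCT numbers are “NCT” followed by eight digits, so a first pass over a PubMed XML export can be a simple pattern match (the XML snippet below is illustrative, not a real PubMed record):

```python
import re

# Minimal sketch of the core idea, not the tool itself: NCT numbers
# are "NCT" followed by eight digits, so a first pass can be a
# pattern match over the raw XML. Verifying that each hit is a
# legitimate registry entry would require a lookup against the
# registry itself.
NCT_RE = re.compile(r"NCT\d{8}")

def extract_nct_numbers(pubmed_xml):
    """Return the unique NCT numbers mentioned anywhere in the XML."""
    return sorted(set(NCT_RE.findall(pubmed_xml)))

xml_snippet = """
<PubmedArticle>
  <AccessionNumber>NCT00342927</AccessionNumber>
  <AbstractText>Registered as NCT00342927; see also NCT01234567.</AbstractText>
</PubmedArticle>
"""
print(extract_nct_numbers(xml_snippet))  # ['NCT00342927', 'NCT01234567']
```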

You can get the code from Codeberg! If you try it out or use it for something, let me know!