The Wikimedia Foundation's Editing team is working to improve how contributors communicate on Wikipedia using talk pages through a series of incremental improvements that will be released over time.
As part of this effort, the Editing team introduced a new workflow for starting new topic threads on talk pages, across Wikipedia's 16 talk namespaces. This new workflow is intended to make it more intuitive for Junior Contributors to initiate conversations in ways other contributors can easily reply to and to help Senior Contributors do the same, with less effort.
The new topic tool provides an inline form for adding new topics. In addition, the language throughout the workflow was adjusted to be more topic-specific.
The team ran an AB test of the New Topic Tool from 27 January 2022 through 25 March 2022 [^timeline] to assess the efficacy of this new feature. The test included all logged-in and logged-out users that edited a talk page at one of the 20 participating Wikipedias during the AB test (see full list of participating Wikipedias in task description and conditions outlined in the methodology section below). During this test, 50% of users included in the test had the New Topic tool automatically enabled, and 50% did not. This report focuses on the results for logged-out users.
[^timeline]: Note that we excluded logged-out data collected from 27 January 2022 to 17 February 2022 in this analysis due to errors in the bucketing implementation, which did not accurately log all the events in the control group.
You can find more information about features of this tool and project updates on the project page.
Similar to the analysis on logged-in users, the AB test on logged-out users was run on a per-Wikipedia basis, and contributors included in the test were randomly assigned to either the control (new topic tool disabled by default) or treatment (new topic tool enabled by default) group. There are a few notable differences from the methodology used for the logged-in user analysis, which are summarized below:
anonymous_user_token is a temporarily-assigned unique ID (cookie that expires within 90 days) that is logged in the EditAttemptStep schema.
We used the event.action = 'ready' event in EditAttemptStep to determine the user's assigned bucket.
Upon conclusion of the test on 25 March 2022, we recorded a total of 9,527 new topic attempts initiated across both test groups by 7,823 distinct logged-out contributors across all experience levels. Data was collected in EditAttemptStep and talk_page_edit.
In this test, a user can add a new topic using the New Topic Tool or using new topic workflows available with wikitext full-page and section editing. For the purpose of this analysis, these two types of editing experiences are defined as follows:
New Topic Tool: Any edit to add a new topic (section) to a talk page namespace made with the new topic tool. The new topic tool allows edits using both visual and source modes. New Topic tool events were sampled at 100%.
Recorded in EditAttemptStep as: event.action = 'init', event.integration = 'discussiontools', event.init_type = 'section'
Existing Add New Section Link: Any edit to add a new topic (section) using the existing section=new link and workflow. These events were sampled at a rate of 1/16, or 6.25%.
Recorded in EditAttemptStep as: event.action = 'init', event.integration = 'page', event.init_type = 'section', event.init_mechanism IN ('url-new', 'new')
We excluded the following types of edits from this analysis: (1) edits made with the reply tool, (2) full-page edits to create a new page or to edit an existing page, and (3) corrective edits to an existing section. Note: It's possible to create a new section using full-page editing, but these edits were excluded for the following reasons: (1) we do not have instrumentation to distinguish between an attempt to edit existing text on the page and an attempt to create a new section using full-page editing; and (2) this method exists in both the test and control groups, and we do not believe it would be impacted by the appearance of the new topic tool.
See the following Phabricator tickets for further details regarding instrumentation and implementation of the AB test:
library(IRdisplay)
display_html(
'<script>
code_show=true;
function code_toggle() {
if (code_show){
$(\'div.input\').hide();
} else {
$(\'div.input\').show();
}
code_show = !code_show
}
$( document ).ready(code_toggle);
</script>
<form action="javascript:code_toggle()">
<input type="submit" value="Click here to toggle on/off the raw code.">
</form>'
)
shhh <- function(expr) suppressPackageStartupMessages(suppressWarnings(suppressMessages(expr)))
shhh({
library(tidyverse)
# Modeling
library(brms)
library(lme4)
library(tidybayes)
set.seed(5)
# Tables:
library(gt)
library(gtsummary)
})
options(repr.plot.width = 15, repr.plot.height = 10)
#collect all new topic tool attempts and saves by logged-out users
query <-
"
--find all edit attempts
WITH edit_attempts AS (
SELECT
wiki AS wiki,
event.editing_session_id as edit_attempt_id,
event.is_oversample AS is_oversample,
event.integration AS editing_method,
If(event.integration == 'discussiontools', 1, 0) AS new_topic_tool_used,
CASE
WHEN event.init_type = 'section' AND event.integration == 'discussiontools' THEN 'new_topic_tool'
WHEN event.init_type = 'section' AND event.integration == 'page' AND event.init_mechanism IN ('url-new', 'new') THEN 'new_section_link'
ELSE 'NA' -- check to make sure all edit types accounted for in above list
END AS section_edit_type
FROM event.editattemptstep
WHERE
-- only in participating wikis
wiki IN ('amwiki', 'bnwiki', 'zhwiki', 'nlwiki', 'arzwiki', 'frwiki', 'hewiki', 'hiwiki',
'idwiki', 'itwiki', 'jawiki', 'kowiki', 'omwiki', 'fawiki', 'plwiki', 'ptwiki', 'eswiki', 'thwiki',
'ukwiki', 'viwiki')
-- since deployment
AND year = 2022
AND ((month = 02 and day >= 18) OR
(month = 03 and day <= 25))
-- remove bots
AND useragent.is_bot = false
-- anon user
AND event.user_id = 0
-- look at only desktop events
AND event.platform = 'desktop'
-- review all talk namespaces
AND event.page_ns % 2 = 1
AND event.action = 'init'
-- discard VE/Wikieditor edits to create new page or reply tool edits
AND NOT (
-- not a reply tool edit
(event.init_type = 'page' AND event.integration = 'discussiontools') OR
-- not an wikitext edit to create a new page
(event.init_type = 'page' AND event.init_mechanism IN ('url-new', 'new') AND event.integration = 'page') OR
-- not a corrective edit to an existing section
(event.init_type = 'section' AND event.init_mechanism IN ('click', 'url') AND event.integration == 'page') OR
-- not a full page edit
(event.init_type = 'page' AND event.init_mechanism IN ('click', 'url') AND event.integration = 'page')
)),
--- bucketing applied at ready events
ready_events AS (
SELECT
wiki AS wiki,
event.anonymous_user_token as user_id, --anon token assigned at ready event
event.bucket AS experiment_group,
event.editing_session_id as edit_ready_id
FROM event.editattemptstep
WHERE
-- only in participating wikis
wiki IN ('amwiki', 'bnwiki', 'zhwiki', 'nlwiki', 'arzwiki', 'frwiki', 'hewiki', 'hiwiki',
'idwiki', 'itwiki', 'jawiki', 'kowiki', 'omwiki', 'fawiki', 'plwiki', 'ptwiki', 'eswiki', 'thwiki',
'ukwiki', 'viwiki')
AND year = 2022
AND ((month = 02 and day >= 18) OR
(month = 03 and day <= 25))
AND event.platform = 'desktop'
-- only users in AB test
AND event.bucket IN ('test', 'control')
-- only talk page events
AND event.page_ns % 2 = 1
-- only anon users
AND event.user_id = 0
),
-- find all published comments
published_dt_new_topics AS (
SELECT
session_id AS edit_save_id,
`database` AS wiki
FROM event.mediawiki_talk_page_edit
WHERE
year = 2022
AND ((month = 02 and day >= 18) OR
(month = 03 and day <= 25))
-- only in participating wikis
AND `database` IN ('amwiki', 'bnwiki', 'zhwiki', 'nlwiki', 'arzwiki', 'frwiki', 'hewiki', 'hiwiki',
'idwiki', 'itwiki', 'jawiki', 'kowiki', 'omwiki', 'fawiki', 'plwiki', 'ptwiki', 'eswiki', 'thwiki',
'ukwiki', 'viwiki')
AND performer.user_id = 0
),
published_section_link_new_topics AS (
SELECT
event.editing_session_id AS edit_save_id,
wiki AS wiki
FROM event.editattemptstep
WHERE
-- only in participating wikis
wiki IN ('amwiki', 'bnwiki', 'zhwiki', 'nlwiki', 'arzwiki', 'frwiki', 'hewiki', 'hiwiki',
'idwiki', 'itwiki', 'jawiki', 'kowiki', 'omwiki', 'fawiki', 'plwiki', 'ptwiki', 'eswiki', 'thwiki',
'ukwiki', 'viwiki')
AND year = 2022
AND ((month = 02 and day >= 18) OR
(month = 03 and day <= 25))
AND event.user_id = 0
AND event.action = 'saveSuccess'
)
-- main query
SELECT
eas.wiki,
res.user_id,
edit_attempt_id,
res.experiment_group,
is_oversample,
editing_method,
new_topic_tool_used,
section_edit_type,
-- was saved in either talk page edit or editattemptstep
IF ((section_edit_type = 'new_topic_tool' AND (tpe_save.edit_save_id IS NOT NULL OR eas_save.edit_save_id IS NOT NULL))
OR (section_edit_type = 'new_section_link' AND (tpe_save.edit_save_id IS NOT NULL OR eas_save.edit_save_id IS NOT NULL)), 1, 0) AS edit_success
FROM edit_attempts eas
INNER JOIN ready_events res ON
eas.edit_attempt_id = res.edit_ready_id AND
eas.wiki = res.wiki
LEFT JOIN published_dt_new_topics tpe_save ON
eas.edit_attempt_id = tpe_save.edit_save_id AND
eas.wiki = tpe_save.wiki
LEFT JOIN published_section_link_new_topics eas_save ON
eas.edit_attempt_id = eas_save.edit_save_id AND
eas.wiki = eas_save.wiki
"
new_topic_attempts <- wmfdata::query_hive(query)
# data reformatting and cleanup
#set factor levels with correct baselines
new_topic_attempts$section_edit_type <-
factor(
new_topic_attempts$section_edit_type,
levels = c("NA", "new_section_link", "new_topic_tool"),
labels = c("NA", "Existing add new section link", "New topic tool")
)
new_topic_attempts$edit_success <-
factor(
new_topic_attempts$edit_success,
levels = c(0, 1),
labels = c("Not Complete", "Complete")
)
# reformat user-id and adjust to include wiki to account for duplicate user id instances.
# Users do not have the same user_id on different wikis
new_topic_attempts$user_id <-
as.character(paste(new_topic_attempts$user_id, new_topic_attempts$wiki, sep ="-"))
#clarify wiki names
new_topic_attempts <- new_topic_attempts %>%
mutate(
wiki = case_when(
#clarify participating project names
wiki == 'amwiki' ~ "Amharic Wikipedia",
wiki == 'bnwiki' ~ "Bengali Wikipedia",
wiki == 'zhwiki' ~ "Chinese Wikipedia",
wiki == 'nlwiki' ~ 'Dutch Wikipedia',
wiki == 'arzwiki' ~ 'Egyptian Wikipedia',
wiki == 'frwiki' ~ 'French Wikipedia',
wiki == 'hewiki' ~ 'Hebrew Wikipedia',
wiki == 'hiwiki' ~ 'Hindi Wikipedia',
wiki == 'idwiki' ~ 'Indonesian Wikipedia',
wiki == 'itwiki' ~ 'Italian Wikipedia',
wiki == 'jawiki' ~ 'Japanese Wikipedia',
wiki == 'kowiki' ~ 'Korean Wikipedia',
wiki == 'omwiki' ~ 'Oromo Wikipedia',
wiki == 'fawiki' ~ 'Persian Wikipedia',
wiki == 'plwiki' ~ 'Polish Wikipedia',
wiki == 'ptwiki' ~ 'Portuguese Wikipedia',
wiki == 'eswiki' ~ 'Spanish Wikipedia',
wiki == 'thwiki' ~ 'Thai Wikipedia',
wiki == 'ukwiki' ~ 'Ukrainian Wikipedia',
wiki == 'viwiki' ~ 'Vietnamese Wikipedia',
)
)
new_topic_attempts_bygroup <- new_topic_attempts %>%
filter(is_oversample == 'false') %>% #All Discussion Tool events are oversampled - removing to check balance.
group_by(experiment_group) %>%
summarise(users = n_distinct(user_id),
attempts = n_distinct(edit_attempt_id), .groups = 'drop')
new_topic_attempts_bygroup
experiment_group | users | attempts |
---|---|---|
<chr> | <int> | <int> |
control | 451 | 461 |
test | 749 | 797 |
Our key performance indicator (KPI) for this analysis is new topic completion rate. For the purpose of this analysis, we define completion rate as the percent of logged-out contributors that successfully published (event.action = 'saveSuccess' in EditAttemptStep)[^instrumentation] at least one new topic after clicking the Add topic / section=new link interface (event.action = 'init') during the AB test.
Note that this does not take into account the number of attempts it took for the user to publish or the duration of their editing sessions. For comparison purposes, we also reviewed completion rate defined as the percent of all new topic edit attempts by Junior Contributors that were successfully published. This was also the dataset we used to model the impact of the new topic tool as the model accounts for both user and wiki experience on the success of each edit attempt.
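The two definitions of completion rate can give different numbers on the same data. The sketch below illustrates this with a small made-up data frame; the column name `published` and all values are hypothetical and are not taken from the AB test data.

```r
# Hypothetical toy data: one row per new topic attempt (not data from the AB test).
attempts <- data.frame(
  user_id = c("u1", "u1", "u1", "u2", "u3", "u3"),
  edit_attempt_id = c("a1", "a2", "a3", "a4", "a5", "a6"),
  published = c(FALSE, FALSE, TRUE, FALSE, TRUE, TRUE)
)

# Per-contributor definition (the KPI): share of contributors with at least 1 published attempt.
user_published <- tapply(attempts$published, attempts$user_id, any)
rate_by_user <- mean(user_published)

# Per-attempt definition (the comparison): share of all attempts that were published.
rate_by_attempt <- mean(attempts$published)

cat(sprintf("By contributor: %.1f%%\n", 100 * rate_by_user))
cat(sprintf("By attempt: %.1f%%\n", 100 * rate_by_attempt))
```

In this toy example, a contributor who needs three attempts to publish counts once as a success under the per-contributor definition but contributes two failures under the per-attempt definition.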
[^instrumentation]: During this analysis, we identified a bug where some edit sessions completed using the new topic tool were not correctly recorded as successfully saved in EditAttemptStep. As a result, we also used the talk_page_edit schema to more accurately account for all edits posted by the new topic tool by joining this data with init events identified in EditAttemptStep.
# Completion Rate By Session
new_topic_completes_bysession <- new_topic_attempts %>%
group_by (wiki, section_edit_type) %>%
summarise(n_attempts = n_distinct(edit_attempt_id),
n_completions = n_distinct(edit_attempt_id[edit_success == 'Complete']),
new_topic_tool_used = as.integer(ifelse(sum(section_edit_type== 'New topic tool'), 1, 0)),
.groups = 'drop')
# By session
new_topic_completes_bysession_all <- new_topic_completes_bysession %>%
group_by(section_edit_type) %>%
summarise(n_attempts = sum(n_attempts),
n_attempts_completed = sum(n_completions),
completion_rate = paste0(round(n_attempts_completed / n_attempts *100, 1), "%"),
.groups = 'drop'
)
new_topic_completes_bysession_all
section_edit_type | n_attempts | n_attempts_completed | completion_rate |
---|---|---|---|
<fct> | <int> | <int> | <chr> |
Existing add new section link | 700 | 59 | 8.4% |
New topic tool | 8827 | 1044 | 11.8% |
# Contributors that completed at least 1 edit
new_topic_completes <- new_topic_attempts %>%
group_by (wiki, section_edit_type, user_id) %>%
summarise(n_attempts = n_distinct(edit_attempt_id),
n_completions = n_distinct(edit_attempt_id[edit_success == 'Complete']),
edit_success = ifelse(sum(n_completions >= 1), 'Complete', 'Not Complete'), #redefine edit success as user completed at least 1 edit attempt
new_topic_tool_used = as.integer(ifelse(sum(section_edit_type== 'New topic tool'), 1, 0)),
.groups = 'drop')
# By Contributor
# Review edit completion rate by editing method
new_topic_completes_all <- new_topic_completes %>%
group_by(section_edit_type) %>%
summarise(n_users = n_distinct(user_id),
n_users_completed = sum(n_completions >= 1), #user completed at least 1 edit
completion_rate = paste0(round(n_users_completed / n_users *100, 1), "%"),
.groups = 'drop'
) %>% #determine credible intervals
cbind(as.data.frame(binom:::binom.bayes(x = .$n_users_completed, n = .$n_users, conf.level = 0.95, tol = 1e-10))) %>%
mutate(lower = round(lower,2),
upper = round(upper, 2))
# Review edit completion rate by editor interface
new_topic_completes_all_anon_table <- new_topic_completes_all %>%
select(c(1,2,3,4,11,12)) %>% #remove unneeded columns
gt() %>%
tab_header(
title = "Logged-out Contributors new topic completion rate",
subtitle = "across all participating Wikipedias"
) %>%
cols_label(
section_edit_type = "Editing method",
n_users = "Number of users attempted",
n_users_completed = "Number of users completed",
completion_rate = "New topic completion rate",
lower = "CI (Lower Bound)",
upper = "CI (Upper Bound)"
) %>%
tab_footnote(
footnote = "Defined as percent of contributors that attempted and published at least 1 new topic",
locations = cells_column_labels(
columns = 'completion_rate'
)
) %>%
tab_footnote(
footnote = "Sampling rate for Non-New Topic Tool events is 6.25%",
locations = cells_body(
columns = 'section_edit_type', rows = 1)
) %>%
tab_footnote(
footnote = "Sampling rate for New Topic Tool events is 100%",
locations = cells_body(
columns = 'section_edit_type', rows = 2)) %>%
tab_footnote(
footnote = "95% credible intervals. There is a 95% probability that the parameter lies in this interval",
locations = cells_column_labels(
columns = c('lower', 'upper')
)) %>%
gtsave(
"new_topic_completes_all_anon_table.html", inline_css = TRUE)
IRdisplay::display_html(data = new_topic_completes_all_anon_table, file = "new_topic_completes_all_anon_table.html")
Table: Logged-out Contributors new topic completion rate across all participating Wikipedias
Footnotes:
1 Defined as percent of contributors that attempted and published at least 1 new topic
2 95% credible intervals. There is a 95% probability that the parameter lies in this interval
3 Sampling rate for Non-New Topic Tool events is 6.25%
4 Sampling rate for New Topic Tool events is 100%
dodge <- position_dodge(width=0.9)
p <- new_topic_completes_all %>%
ggplot(aes(x= section_edit_type, y = n_users_completed / n_users, fill = section_edit_type)) +
geom_col(position = 'dodge') +
geom_text(aes(label = paste(completion_rate), fontface=2), vjust=1.2, size = 8, color = "white") +
geom_errorbar(aes(ymin = lower, ymax = upper), color = 'red', size = 1, alpha = 0.5, position = dodge, width = 0.25) +
scale_y_continuous(labels = scales::percent) +
scale_x_discrete(labels = c("Existing add new section link", "New topic tool")) +
labs (y = "Percent of contributors ",
x = "Editing method",
title = "Logged-out contributors new topic completion rate \n across all participating Wikipedias",
caption = "Defined as percent of contributors that make a new topic attempt and publish at least 1 new topic \n
Red error bars: 95% credible intervals") +
scale_fill_manual(values= c("#999999", "steelblue2")) +
theme(
panel.grid.minor = element_blank(),
panel.background = element_blank(),
plot.title = element_text(hjust = 0.5),
text = element_text(size=16),
legend.position= "none",
axis.line = element_line(colour = "black"))
p
ggsave("Figures/new_topic_completes_all_anon.png", p, width = 16, height = 8, units = "in", dpi = 300)
Overall, 9.6% of all Junior Contributors that made a new topic attempt were able to successfully publish at least 1 new topic with the new topic tool, while 8.1% of all Junior Contributors successfully published a new topic using the existing add new section link workflow. This represents an observed increase of 1.5 percentage points (18.5%) in new topic completion rate.
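The difference between the percentage point change and the relative change can be reproduced from the two reported rates. This is a minimal arithmetic sketch; the variable names are illustrative only.

```r
# Completion rates reported above, as proportions.
control_rate <- 0.081    # existing add new section link
treatment_rate <- 0.096  # new topic tool

# Absolute change, in percentage points.
abs_change_pp <- (treatment_rate - control_rate) * 100

# Relative change, as a percent of the control rate.
rel_change_pct <- (treatment_rate - control_rate) / control_rate * 100

round(abs_change_pp, 1)   # 1.5
round(rel_change_pct, 1)  # 18.5
```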
A couple notes:
# Review edit completions by editing method and wiki
new_topic_completes_bywiki <- new_topic_completes %>%
group_by(wiki, section_edit_type) %>%
summarise(n_users = n_distinct(user_id),
n_users_completed = sum(n_completions >=1), #user completed at least 1 edit
completion_rate = paste0(round(n_users_completed / n_users *100, 1), "%"),
.groups = 'drop'
) %>% #determine credible intervals
cbind(as.data.frame(binom:::binom.bayes(x = .$n_users_completed, n = .$n_users, conf.level = 0.95, tol = 1e-10))) %>%
mutate(lower = round(lower,2),
upper = round(upper, 2))
new_topic_completes_bywiki_anon_table <- new_topic_completes_bywiki %>%
select(c(1,2,3,4,5,12,13)) %>% #remove unneeded columns
gt() %>%
tab_header(
title = "Logged-out contributors new topic completion rate by participating Wikipedia"
) %>%
cols_label(
wiki = "Wikipedia",
section_edit_type= "Editing method",
n_users = "Number of users attempted",
n_users_completed = "Number of users completed",
completion_rate = "Completion rate",
lower = "CI (Lower Bound)",
upper = "CI (Upper Bound)"
) %>%
tab_footnote(
footnote = "Defined as percent of contributors that make a new topic attempt and publish at least 1 new topic",
locations = cells_column_labels(
columns = 'completion_rate'
)
) %>%
tab_footnote(
footnote = "Sampling rate for Non-New Topic Tool events is 6.25%",
locations = cells_column_labels(
columns = 'section_edit_type'
)
) %>%
tab_footnote(
footnote = "Sampling rate for New Topic Tool events is 100%",
locations = cells_column_labels(
columns = 'section_edit_type'
)) %>%
tab_footnote(
footnote = "95% credible intervals. There is a 95% probability that the parameter lies in this interval",
locations = cells_column_labels(
columns = c('lower', 'upper')
)) %>%
gtsave(
"new_topic_completes_bywiki_anon_table.html", inline_css = TRUE)
IRdisplay::display_html(data = new_topic_completes_bywiki_anon_table, file = "new_topic_completes_bywiki_anon_table.html")
new_topic_completes_bywiki_anon_table
Table: Logged-out contributors new topic completion rate by participating Wikipedia
Footnotes:
1 Sampling rate for Non-New Topic Tool events is 6.25%
2 Sampling rate for New Topic Tool events is 100%
3 Defined as percent of contributors that make a new topic attempt and publish at least 1 new topic
4 95% credible intervals. There is a 95% probability that the parameter lies in this interval
# Plot edit completion rates for each user on each wiki
p <- new_topic_completes_bywiki %>%
filter(!(wiki %in% c('Amharic Wikipedia', 'Bengali Wikipedia', 'Hindi Wikipedia', 'Oromo Wikipedia',
'Korean Wikipedia', 'Ukrainian Wikipedia'))) %>% # remove wikis where there are under 10 events total for editing method
ggplot(aes(x= section_edit_type, y = n_users_completed / n_users, fill = section_edit_type)) +
geom_col(position = 'dodge') +
geom_text(aes(label = paste(completion_rate), fontface=2), vjust=1.2, size = 5, color = "white") +
geom_errorbar(aes(ymin = lower, ymax = upper), color = 'red', size = 1, alpha = 0.5, position = dodge, width = 0.25) +
facet_wrap(~ wiki) +
scale_y_continuous(labels = scales::percent) +
labs (y = "Percent of contributors ",
title = "Logged-out contributors new topic completion rate by participating Wikipedia",
caption = "Amharic, Bengali, Hindi, Oromo, Korean, and Ukrainian removed from analysis due to insufficient events \n
Red error bars: 95% credible intervals") +
scale_fill_manual(values= c("#999999", "steelblue2"), name = "Editing Method", labels = c("Existing add new section link", "New topic tool")) +
theme(
panel.grid.minor = element_blank(),
panel.background = element_blank(),
plot.title = element_text(hjust = 0.5),
text = element_text(size=16),
legend.position="bottom",
axis.text.x = element_blank(),
axis.title.x=element_blank(),
axis.line = element_line(colour = "black"))
p
ggsave("Figures/new_topic_completes_bywiki_anon.png", p, width = 16, height = 8, units = "in", dpi = 300)
Trends vary on a per-Wikipedia basis.
There were no significant decreases in new topic completion rate for any particular wiki, with the exception of Hebrew Wikipedia, which saw a 23.5 percentage point decrease (44.4% → 20.9%; 53% decrease) in new topic completion rate for logged-out contributors using the new topic tool.
A couple notes:
We next explored different models to infer the impact of the new topic tool on whether a new topic was completed, while accounting for the random effects of the user and wiki. This allows us to confirm whether the observed increase above is statistically significant (did not occur due to random chance).
New topic attempts completed on the same Wikipedia, and by the same users on that Wikipedia, are related to each other. Therefore, we can more accurately infer the impact of the new topic tool by accounting for the effect of the user and wiki on the probability of a Junior Contributor completing an edit.
We used a Bayesian hierarchical regression model to model this structure. For this model, we reviewed whether each edit attempt was successfully completed or not. We identified the user and Wikipedia as random effects and whether the new topic tool was used as the fixed effect or predictor variable.
#redefine edit success as factor for use in the model
new_topic_completes$edit_success <-
factor(
new_topic_completes$edit_success,
levels = c("Not Complete", "Complete")
)
priors <- c(
set_prior(prior = "std_normal()", class = "b"),
set_prior("cauchy(0, 5)", class = "sd")
)
fit_anon <- brm(
edit_success ~ section_edit_type + (1 | wiki/user_id),
family = bernoulli(link = "logit"),
data = new_topic_completes,
prior = priors,
chains = 6, cores = 4
)
fit_anon_tbl <- fit_anon %>%
spread_draws(b_section_edit_typeNewtopictool, b_Intercept) %>%
mutate(
exp_b = exp(b_section_edit_typeNewtopictool),
b4 = b_section_edit_typeNewtopictool/ 4,
avg_lift = plogis(b_Intercept + b_section_edit_typeNewtopictool) - plogis(b_Intercept)
) %>%
pivot_longer(
b_section_edit_typeNewtopictool:avg_lift,
names_to = "param",
values_to = "val"
) %>%
group_by(param) %>%
summarize(
ps = c(0.025, 0.5, 0.975),
qs = quantile(val, probs = ps),
.groups = "drop"
) %>%
mutate(
quantity = ifelse(
param %in% c("b_Intercept", "b_section_edit_typeNewtopictool"),
"Parameter", "Function of parameter(s)"
),
param = factor(
param,
c("b_Intercept", "b_section_edit_typeNewtopictool", "exp_b", "b4", "avg_lift"),
c("(Intercept)", "Using new topic tool", "Multiplicative effect on odds", "Maximum Lift", "Average lift")
),
ps = factor(ps, c(0.025, 0.5, 0.975), c("lower", "median", "upper")),
) %>%
pivot_wider(names_from = "ps", values_from = "qs") %>%
arrange(quantity, param)
fit_anon_tbl%>%
gt(rowname_col = "param", groupname_col = "quantity") %>%
row_group_order(c("Parameter", "Function of parameter(s)")) %>%
fmt_number(vars(lower, median, upper), decimals = 3) %>%
fmt_percent(columns = vars(median, lower, upper), rows = 2:3, decimals = 1) %>%
cols_align("center", vars(median, lower, upper)) %>%
cols_merge(vars(lower, upper), pattern = "({1}, {2})") %>%
cols_move_to_end(vars(lower)) %>%
cols_label(median = "Point Estimate", lower = "95% CI") %>%
tab_style(cell_text(weight = "bold"), cells_row_groups()) %>%
tab_footnote("CI: Credible Interval", cells_column_labels(vars(lower))) %>%
tab_footnote(
html("Average lift = Pr(Success|New Topic Tool) - Pr(Success|Existing New Section Link Editing) = logit<sup>-1</sup>(β<sub>0</sub> + β<sub>1</sub>) - logit<sup>-1</sup>(β<sub>0</sub>)"),
cells_body(vars(median), 3)
) %>%
tab_footnote(
html("Maximum lift calculated using the divide-by-4-rule"),
cells_body(vars(median), 2)
) %>%
tab_header("Logged-Out Contributor Completion Rate: Posterior summary of model parameters") %>%
gtsave(
"fit_anon_tbl.html", inline_css = TRUE)
IRdisplay::display_html(file = "fit_anon_tbl.html")
Table: Logged-Out Contributor Completion Rate: Posterior summary of model parameters
Row groups: Parameter; Function of parameter(s)
Footnotes:
1 CI: Credible Interval
2 Maximum lift calculated using the divide-by-4-rule
3 Average lift = Pr(Success|New Topic Tool) - Pr(Success|Existing New Section Link Editing) = logit⁻¹(β₀ + β₁) - logit⁻¹(β₀)
Since the model parameters are on the log-odds scale, we needed to apply the following transformations to make sense of them.
[^Gelman]: Gelman, Andrew, Jennifer Hill, and Aki Vehtari. 2021. Regression and other stories. https://doi.org/10.1017/9781139161879.
While we observed an increase in the new topic completion rate for logged-out contributors, there is not sufficient evidence to say definitively that the new topic tool led to this increase. This is indicated by the 95% credible interval for the multiplicative effect on odds, which crosses 1; a multiplicative effect of 1 means no change either way.
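The transformations of the log-odds parameters, and the check of whether the interval for the multiplicative effect crosses 1, can be sketched with simulated draws. The draws below are hypothetical (generated with `rnorm` under assumed means and standard deviations) and are not the posterior from the fitted model; `ci95` is an illustrative helper name.

```r
set.seed(1)

# Hypothetical posterior draws on the log-odds scale; NOT the fitted values from this analysis.
b0 <- rnorm(4000, mean = -2.4, sd = 0.15)  # intercept
b1 <- rnorm(4000, mean = 0.15, sd = 0.20)  # coefficient for using the new topic tool

odds_ratio <- exp(b1)                      # multiplicative effect on odds
max_lift <- b1 / 4                         # divide-by-4 rule: bound on the change in probability
avg_lift <- plogis(b0 + b1) - plogis(b0)   # difference in success probability

# Median and equal-tailed 95% interval of a vector of draws.
ci95 <- function(x) quantile(x, probs = c(0.025, 0.5, 0.975), names = FALSE)

or_ci <- ci95(odds_ratio)

# An interval for the odds ratio that contains 1 is consistent with no effect.
includes_no_effect <- or_ci[1] < 1 && or_ci[3] > 1
includes_no_effect
```

Because the slope of the inverse logit function never exceeds 1/4, each draw of the lift is bounded in absolute value by the corresponding draw of `b1 / 4`, which is why the divide-by-4 value is read as a maximum.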
We also wanted to ensure that enabling the new topic tool did not result in an increase in the number of disruptive edits being made to talk pages.
To evaluate any disruption caused by the new topic tool, we determined the percent of new topics published to talk pages that were reverted within 48 hours.
For this analysis, we reviewed data recorded in mediawiki_history to identify the percent of comments posted with the new topic tool (identified by the revision tag discussiontools-newtopic) on talk pages that were reverted within 48 hours [^revert]. We joined this data with AB test data logged in EditAttemptStep and talk_page_edit to limit the data to the attempts included in the AB test and to exclude, as far as possible, any edits not made to create a new topic.
[^revert]: 48 hours is a common cutoff, as research suggests that, at least for the English Wikipedia, nearly all reverts take place within 48 hours. Source: Research: Revert. Mediawiki. https://meta.wikimedia.org/wiki/Research:Revert.
We compared the revert rate for new topics published using the new topic tool to the revert rate for new topics made using the existing section link editing during the same timeframe.
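The 48-hour revert flag can be derived from publication and revert timestamps. The sketch below uses made-up records; the column names `published_at` and `reverted_at` are assumptions for illustration and are not the mediawiki_history field names.

```r
# Hypothetical revision records; column names are assumptions, not the mediawiki_history schema.
revisions <- data.frame(
  revision_id = 1:4,
  published_at = as.POSIXct(c("2022-03-01 10:00:00", "2022-03-01 12:00:00",
                              "2022-03-02 08:00:00", "2022-03-03 09:00:00"), tz = "UTC"),
  reverted_at = as.POSIXct(c("2022-03-01 18:00:00", NA,
                             "2022-03-05 08:00:00", "2022-03-04 09:00:00"), tz = "UTC")
)

# Hours between publication and revert; NA when the revision was never reverted.
hours_to_revert <- as.numeric(
  difftime(revisions$reverted_at, revisions$published_at, units = "hours")
)

# A revision counts as reverted only if the revert happened within 48 hours.
revisions$is_reverted <- !is.na(hours_to_revert) & hours_to_revert <= 48

revert_rate <- mean(revisions$is_reverted)
revert_rate
```

Under this definition, a revision reverted after 72 hours is treated the same as one that was never reverted.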
new_topic_reverts_anon <-
read.csv(
file = 'Data/new_topic_reverts_anon.csv',
header = TRUE,
sep = ",",
stringsAsFactors = FALSE
) # loads all revert data
#clarify levels and labels for factor variables
new_topic_reverts_anon$section_edit_type <-
factor(
new_topic_reverts_anon$section_edit_type,
levels = c("non-new-topic-tool", "new-topic-tool"),
labels = c("Existing add new section link", "New topic tool")
)
new_topic_reverts_anon$is_reverted <-
factor(new_topic_reverts_anon$is_reverted,
levels = c("reverted", "not-reverted"),
labels = c("Reverted", "Not reverted"))
#clarify wiki names
new_topic_reverts_anon <- new_topic_reverts_anon %>%
mutate(
wiki = case_when(
#clarify participating project names
wiki == 'amwiki' ~ "Amharic Wikipedia",
wiki == 'bnwiki' ~ "Bengali Wikipedia",
wiki == 'zhwiki' ~ "Chinese Wikipedia",
wiki == 'nlwiki' ~ 'Dutch Wikipedia',
wiki == 'arzwiki' ~ 'Egyptian Wikipedia',
wiki == 'frwiki' ~ 'French Wikipedia',
wiki == 'hewiki' ~ 'Hebrew Wikipedia',
wiki == 'hiwiki' ~ 'Hindi Wikipedia',
wiki == 'idwiki' ~ 'Indonesian Wikipedia',
wiki == 'itwiki' ~ 'Italian Wikipedia',
wiki == 'jawiki' ~ 'Japanese Wikipedia',
wiki == 'kowiki' ~ 'Korean Wikipedia',
wiki == 'omwiki' ~ 'Oromo Wikipedia',
wiki == 'fawiki' ~ 'Persian Wikipedia',
wiki == 'plwiki' ~ 'Polish Wikipedia',
wiki == 'ptwiki' ~ 'Portuguese Wikipedia',
wiki == 'eswiki' ~ 'Spanish Wikipedia',
wiki == 'thwiki' ~ 'Thai Wikipedia',
wiki == 'ukwiki' ~ 'Ukrainian Wikipedia',
wiki == 'viwiki' ~ 'Vietnamese Wikipedia',
)
)
# aggregate based on editing experience type
new_topic_reverts_all <- new_topic_reverts_anon %>%
group_by(section_edit_type) %>%
summarise(total_reverts = n_distinct(revision_id[is_reverted == "Reverted"]),
total_comments = n_distinct(revision_id),
revert_rate =paste(round(total_reverts/total_comments * 100, 2), '%'), .groups = 'drop') %>%
# add credible intervals
cbind(as.data.frame(binom:::binom.bayes(x = .$total_reverts, n = .$total_comments, conf.level = 0.95, tol = 1e-10))) %>%
mutate(lower = round(lower,2),
upper = round(upper, 2))
new_topic_reverts_anon_all_table <- new_topic_reverts_all %>%
select(c(1,2,3,4,11, 12)) %>% #remove unneeded columns
gt() %>%
tab_header(
title = "Logged-out contributors new topic revert rate across all participating Wikipedias",
subtitle = "Across all participating Wikipedias"
) %>%
cols_label(
section_edit_type = "Editing method",
total_reverts = "Number of new topics reverted",
total_comments = "Number of new topics published",
revert_rate = "Revert rate",
lower = "CI (Lower Bound)",
upper = "CI (Upper Bound)"
) %>%
tab_footnote(
footnote = "Defined as percent of new topics reverted within 48 hours.",
locations = cells_column_labels(
columns = 'revert_rate'
)
) %>%
tab_footnote(
footnote = "Sampling rate is 100% for new topic tool events and 6.25% for non-new topic tool events",
locations = cells_column_labels(
columns = 'section_edit_type'
)
) %>%
tab_footnote(
footnote = "95% credible intervals. There is a 95% probability that the parameter lies in this interval",
locations = cells_column_labels(
columns = c('lower', 'upper')
)
) %>%
gtsave(
"new_topic_reverts_anon_all_table.html", inline_css = TRUE)
IRdisplay::display_html(file = "new_topic_reverts_anon_all_table.html")
Table: Logged-out contributors new topic revert rate across all participating Wikipedias
Footnotes:
1 Sampling rate is 100% for new topic tool events and 6.25% for non-new topic tool events
2 Defined as percent of new topics reverted within 48 hours.
3 95% credible intervals. There is a 95% probability that the parameter lies in this interval
# Plot revert rates
p <- new_topic_reverts_all %>%
ggplot(aes(x= section_edit_type, y = total_reverts/ total_comments, fill = section_edit_type)) +
geom_col(position = 'dodge') +
geom_errorbar(aes(ymin = lower, ymax = upper), color = 'red', size = 1, alpha = 0.5, position = dodge, width = 0.25) +
geom_text(aes(label = paste(revert_rate), fontface=2), vjust=1.2, size = 8, color = "white") +
scale_y_continuous(labels = scales::percent) +
scale_x_discrete(labels = c("Previous add new section link", "New topic tool")) +
labs (y = "Percent of new topics reverted",
title = "Logged out contributors new topic revert rate across \n all participating Wikipedias",
caption = "Revert rate defined as percent of published new topics reverted within 48 hours \nRed error bars: 95% credible intervals") +
scale_fill_manual(values= c("#999999", "steelblue2"), name = "Editing Method", labels = c("Existing add new section link", "New topic tool")) +
theme(
panel.grid.minor = element_blank(),
panel.background = element_blank(),
plot.title = element_text(hjust = 0.5),
text = element_text(size=16),
legend.position="bottom",
axis.text.x = element_blank(),
axis.title.x=element_blank(),
axis.line = element_line(colour = "black"))
p
ggsave("Figures/new_topic_reverts_anon_all.png", p, width = 16, height = 8, units = "in", dpi = 300)
Overall, across all participating Wikipedias, we observed a slight 3 percentage point increase (20.7% → 23.7%; a 14.5% relative increase) in the revert rate for new topic tool edits made by Junior Contributors compared to edits made using the existing add new section link.
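The two figures quoted above measure different things: the percentage point change is the absolute difference between the two rates, while the 14.5% figure is that difference relative to the baseline rate. A quick check of the arithmetic, using only the rounded rates from the text (the variable names are illustrative and not part of the analysis code):

```r
# Rounded revert rates quoted in the text (percent)
existing_link_rate <- 20.7   # existing add new section link
new_topic_rate     <- 23.7   # new topic tool

# Absolute change, in percentage points
abs_change_pp <- new_topic_rate - existing_link_rate

# Relative change, as a percent of the baseline rate
rel_change_pct <- abs_change_pp / existing_link_rate * 100

cat(sprintf("Absolute change: %.1f percentage points\n", abs_change_pp))
cat(sprintf("Relative change: %.1f%%\n", rel_change_pct))
```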
It is important to note that there is a high level of uncertainty in the revert rate identified for the existing add new section link, as indicated by the red error bar in the chart above. This is due to the smaller number of events available to review for this editing method. As a result, there is not sufficient evidence to identify any significant change in revert rate caused by the new topic tool.
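To illustrate why fewer events produce a wider interval, the sketch below computes Bayesian credible intervals for the two Dutch Wikipedia rows that appear in the per-wiki output of this report (1 of 2 and 15 of 34 new topics reverted). It assumes a Jeffreys Beta(0.5, 0.5) prior, which is consistent with the shape1 and shape2 values in that output, and it uses an equal-tailed interval from base R's qbeta. The report's own intervals come from binom.bayes, so the bounds may differ slightly; the helper name jeffreys_interval is hypothetical.

```r
# Equal-tailed credible interval for a binomial proportion under a
# Jeffreys Beta(0.5, 0.5) prior (hypothetical helper, base R only).
jeffreys_interval <- function(reverted, published, conf_level = 0.95) {
  shape1 <- reverted + 0.5               # posterior Beta shape parameters
  shape2 <- published - reverted + 0.5
  tail <- (1 - conf_level) / 2
  data.frame(
    reverted = reverted,
    published = published,
    posterior_mean = shape1 / (shape1 + shape2),
    lower = qbeta(tail, shape1, shape2),
    upper = qbeta(1 - tail, shape1, shape2)
  )
}

# Dutch Wikipedia rows from the per-wiki output
intervals <- rbind(
  jeffreys_interval(reverted = 1, published = 2),    # existing add new section link
  jeffreys_interval(reverted = 15, published = 34)   # new topic tool
)
intervals$width <- intervals$upper - intervals$lower
print(intervals)
```

With only 2 published topics the interval covers most of the 0 to 1 range, while 34 topics give a much narrower one, which is the pattern the red error bars show.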
# aggregate data by wiki and editing interface
new_topic_reverts_bywiki <- new_topic_reverts_anon %>%
group_by(wiki, section_edit_type) %>%
summarise(total_reverts = n_distinct(revision_id[is_reverted == "Reverted"]),
total_comments = n_distinct(revision_id),
revert_rate =paste(round(total_reverts/total_comments * 100, 2), '%'), .groups = 'drop') %>%
cbind(as.data.frame(binom::binom.bayes(x = .$total_reverts, n = .$total_comments, conf.level = 0.95, tol = 1e-10))) %>%
mutate(lower = round(lower,2),
upper = round(upper, 2))
head(new_topic_reverts_bywiki)
| | wiki | section_edit_type | total_reverts | total_comments | revert_rate | method | x | n | shape1 | shape2 | mean | lower | upper | sig |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | <chr> | <fct> | <int> | <int> | <chr> | <fct> | <int> | <int> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> |
| 1 | Bengali Wikipedia | Existing add new section link | 0 | 2 | 0 % | bayes | 0 | 2 | 0.5 | 2.5 | 0.1666667 | 0.00 | 0.57 | 0.05 |
| 2 | Bengali Wikipedia | New topic tool | 5 | 9 | 55.56 % | bayes | 5 | 9 | 5.5 | 4.5 | 0.5500000 | 0.26 | 0.83 | 0.05 |
| 3 | Chinese Wikipedia | Existing add new section link | 1 | 7 | 14.29 % | bayes | 1 | 7 | 1.5 | 6.5 | 0.1875000 | 0.00 | 0.44 | 0.05 |
| 4 | Chinese Wikipedia | New topic tool | 4 | 21 | 19.05 % | bayes | 4 | 21 | 4.5 | 17.5 | 0.2045455 | 0.05 | 0.37 | 0.05 |
| 5 | Dutch Wikipedia | Existing add new section link | 1 | 2 | 50 % | bayes | 1 | 2 | 1.5 | 1.5 | 0.5000000 | 0.06 | 0.94 | 0.05 |
| 6 | Dutch Wikipedia | New topic tool | 15 | 34 | 44.12 % | bayes | 15 | 34 | 15.5 | 19.5 | 0.4428571 | 0.28 | 0.60 | 0.05 |
new_topic_reverts_anon_bywiki_table <- new_topic_reverts_bywiki %>%
select(wiki, section_edit_type, total_reverts, total_comments, revert_rate, lower, upper) %>% # keep only the columns shown in the table
gt() %>%
tab_header(
title = "Logged-out Contributors new topic revert rate by participating Wikipedia"
) %>%
cols_label(
wiki = "Wikipedia",
section_edit_type = "Editing method",
total_reverts = "Number of new topics reverted",
total_comments = "Number of new topics published",
revert_rate = "Revert rate",
lower = "CI (Lower Bound)",
upper = "CI (Upper Bound)"
) %>%
tab_footnote(
footnote = "Defined as percent of new topic reverted within 48 hours.",
locations = cells_column_labels(
columns = 'revert_rate'
)
) %>%
tab_footnote(
footnote = "Sampling rate is 100% for new topic tool events and 6.25% for non-new topic tool events",
locations = cells_column_labels(
columns = 'section_edit_type'
)
) %>%
tab_footnote(
footnote = "95% credible intervals. There is a 95% probability that the parameter lies in this interval",
locations = cells_column_labels(
columns = c('lower', 'upper')
)
) %>%
gtsave(
"new_topic_reverts_anon_bywiki_table.html", inline_css = TRUE)
IRdisplay::display_html(file = "new_topic_reverts_anon_bywiki_table.html")
Logged-out Contributors new topic revert rate by participating Wikipedia
1 Sampling rate is 100% for new topic tool events and 6.25% for non-new topic tool events
2 Defined as percent of new topic reverted within 48 hours.
3 95% credible intervals. There is a 95% probability that the parameter lies in this interval
Some per-Wikipedia trend highlights:
We also explored whether the new topic tool resulted in a greater number of Junior Contributors starting to participate productively on talk pages and whether it caused a greater percentage of Junior Contributors to continue participating productively on talk pages.
Note: We did not look at retention for logged-out new topic users because, unlike logged-in users, who have a persistent bucket applied to their account, logged-out bucketing applies entirely to the browser currently being used by the logged-out user. As a result, this metric is susceptible to variation in the user's browser selection and is not reliable.
This metric was defined as the number of distinct logged-out Contributors (based on the anonymous_user_token field in EditAttemptStep) who made at least one new topic edit to a page in a talk namespace that was not reverted within 48 hours. Since different sampling rates were applied to each editor type, we removed any events that were oversampled (sampling rate increased to 100%) to allow us to directly compare the numbers between the two groups.
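A minimal sketch of this metric on a toy data frame, using base R only. The column names anonymous_user_token, is_reverted and is_oversample mirror fields referenced in this report, but the rows and the "Not reverted" label are invented for illustration, and the real query may differ.

```r
# Toy event data: one row per published new topic edit (invented values)
events <- data.frame(
  anonymous_user_token = c("a1", "a1", "b2", "c3", "d4"),
  is_reverted   = c("Not reverted", "Reverted", "Reverted", "Not reverted", "Not reverted"),
  is_oversample = c("false", "false", "false", "true", "false"),
  stringsAsFactors = FALSE
)

# Drop oversampled events so both editing methods share the same sampling rate,
# then keep only edits that were not reverted within 48 hours
kept <- events[events$is_oversample == "false" & events$is_reverted != "Reverted", ]

# Count distinct contributors with at least one surviving new topic
productive_contributors <- length(unique(kept$anonymous_user_token))
print(productive_contributors)
```

Counting distinct tokens instead of rows means a contributor who publishes several new topics is counted once, and a contributor whose only new topics were reverted is not counted at all.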
Note: The logged out bucketing applies entirely to the current browser that is being used by the logged-out contributor. If the logged-out user switched browsers, they may be rebucketed or counted twice. FIXME: Need to confirm.
num_anon_editors <- new_topic_attempts %>%
filter(is_oversample == 'false' ) %>% #remove oversampled events
group_by(experiment_group, section_edit_type) %>%
summarise(total_users_attempt = n_distinct(user_id),
total_users_complete = n_distinct(user_id[edit_success == 'Complete']), .groups = 'drop')
num_anon_editors_table <- num_anon_editors %>%
gt() %>%
tab_header(
title = "Number of Logged Out Contributors that made a new topic attempt during the AB test by test group and section edit type"
) %>%
cols_label(
experiment_group = "Test group",
section_edit_type = "Editing method",
total_users_attempt = "Number of users that attempted a new topic",
total_users_complete = "Number of users that published a new topic"
) %>%
tab_row_group(
rows = experiment_group == 'control'
) %>%
tab_row_group(
rows = experiment_group == 'test'
) %>%
tab_footnote(
footnote = "Based on a sampling rate of 6.25% for all events. Any oversampled events were removed so data for the two editor types could be directly compared",
locations = cells_title(groups = "title") # target the title only; the default targets title and subtitle
) %>%
gtsave(
"num_anon_editors_table.html", inline_css = TRUE)
IRdisplay::display_html(file = "num_anon_editors_table.html")
Number of Logged Out Contributors that made a new topic attempt during the AB test by test group and section edit type (see note 1)
1 Based on a sampling rate of 6.25% for all events. Any oversampled events were removed so data for the two editor types could be directly compared
A few explanations regarding the numbers above:
p <-num_anon_editors %>%
group_by(section_edit_type) %>%
summarise(total_user_complete = sum(total_users_complete), .groups = 'drop') %>%
ggplot(aes(x= section_edit_type, y = total_user_complete, fill = section_edit_type)) +
geom_col(position = 'dodge') +
geom_text(aes(label = paste(total_user_complete),fontface=2), vjust=1.2, size = 8, color = "white") +
labs (y = "Number of Logged-Out Contributors",
title = "Number of Logged-Out Contributors that completed a new topic \n across all participating Wikipedias") +
scale_fill_manual(values= c("#999999", "steelblue2"), name = "Editing Method", labels = c("Existing add new section link", "New topic tool")) +
theme(
panel.grid.minor = element_blank(),
panel.background = element_blank(),
plot.title = element_text(hjust = 0.5),
text = element_text(size=16),
legend.position="bottom",
axis.text.x = element_blank(),
axis.title.x=element_blank(),
axis.line = element_line(colour = "black"))
p
ggsave("Figures/num_anon_editors_bygroup.png", p, width = 16, height = 8, units = "in", dpi = 300)
There were 12 more distinct anonymous users that successfully completed an edit with the new topic tool compared to the previous add new section link.