Commit eefbe5f4 authored by Thomas OBADIA

Add a script as step 03 in the observational project to run QC rules on the curated dataset. This moves the merging operation with inventory data, which is also a form of QC, into a subsequent, nested QC script.
parent 802dc5c6
-## INVENTORY_05_select_list_function.R
+## INVENTORY_05_generate_list_of_participants_for_observational_study.R
## Date : 2024/02/02
## Author : Eliharintsoa Rajaoranimirana, Thomas Obadia
##
......
## OBSERVATIONAL_03_QC_curated_data.R
## Date : 2024/12/04
## Author : Thomas Obadia
##
## This script processes the curated dataset from
## OBSERVATIONAL_02_curate_REDCap_raw_data.R and applies a series of
## QC rules.
## It returns a distinct dataset with one column per QC rule, holding
## the outcome of that rule. Any 'TRUE' in these columns warrants
## further investigation and clarification by the study team.
######################################################################
######################################################################
### SOURCE THE DATA
######################################################################
source("./02_OBSERVATIONAL/OBSERVATIONAL_02_curate_REDCap_raw_data.R")
\ No newline at end of file
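The header above describes an output with one column per QC rule, where `TRUE` flags a record for review. A minimal sketch of that layout follows. The toy data, the `qc_` column names and the two rules (including the 0 to 120 age range) are hypothetical illustrations, not rules taken from this commit.

```r
library(dplyr)

# Toy stand-in for the curated dataset; columns and values are hypothetical
dat_toy <- tibble(
  record_id = c("1", "2", "3"),
  agey      = c(34, -1, NA),
  consent   = c("Yes", "Yes", "No")
)

# One logical column per QC rule: TRUE means the record needs review
dat_toy_qc <- dat_toy %>%
  transmute(
    record_id,
    qc_agey_missing      = is.na(agey),
    qc_agey_out_of_range = !is.na(agey) & (agey < 0 | agey > 120)
  )

# Keep only the records flagged by at least one rule
dat_toy_flagged <- dat_toy_qc %>%
  filter(if_any(starts_with("qc_"), ~ .x))
```

Keeping the rule outcomes in a dataset separate from the curated data, as the header describes, leaves the curated values untouched while making flagged records easy to list.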
-## OBSERVATIONAL_02_merge_inventory_metadata.R
+## OBSERVATIONAL_04_merge_inventory_metadata.R
## Date : 2024/10/17
## Author : Thomas Obadia
##
@@ -32,5 +32,51 @@ if (!exists("DATA_EXTRACT_IS_RECENT_OBS") || !DATA_EXTRACT_IS_RECENT_OBS) {
######################################################################
###
### MERGE INVENTORY AND OBSERVATIONAL DATA
######################################################################
### The list of individuals from the inventory phase is stored in the
### inventory_list_p table. It merely contains the CensusID (which
### encodes the country, cluster, house, household and subject),
### as well as age and gender.
### As part of the observational study, the same data was collected and
### *should* report the CensusID when it was available.
### This section compares demographics from both studies, and
### explores whether reconciling these two cross-sectional datasets is
### feasible.
## In the observational data, record_id differs across countries:
## - Ethiopia used consecutive autonumbering
## - Madagascar used censusid
## Check that censusid is actually redundant with record_id in Madagascar
dat_observational_curated %>%
  mutate(record_id_is_censusid = (record_id == censusid)) %>%
  count(country, record_id_is_censusid,
        .drop = FALSE)
tmp <- dat_observational_curated %>%
  select(censusid, consent, sex, agey) %>%
  # REDCap labels were translated in Madagascar.
  # Handle it here for now; this may be better handled upstream, at the curation stage.
  mutate(consent = plyr::mapvalues(x = consent,
                                   from = c("Oui", "Non"),
                                   to = c("Yes", "No")),
         sex = plyr::mapvalues(x = sex,
                               from = c("Féminin", "Masculin"),
                               to = c("Female", "Male"))) %>%
  full_join(inventory_list_p %>%
              select(censusid, sex, agey),
            by = join_by(censusid == censusid),
            suffix = c(".obs", ".inv")) %>%
  filter(consent == "Yes") %>%
  separate_wider_regex(cols = censusid,
                       patterns = c(country = "^(?:E|M)",
                                    "-",
                                    clusterid = "\\d{2}",
                                    "-",
                                    "H",
                                    houseid = "\\d{3}",
                                    "-",
                                    nested_hhid = "\\d{2}",
                                    "-",
                                    nested_subjid = "\\d{2}"),
                       too_few = "debug")
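The comments in this hunk say the merged table is meant to compare demographics between the two studies. One possible way to do that once `censusid` has been split is sketched below on made-up rows. The identifiers, the `*_mismatch` column names and the one-year age tolerance are assumptions for illustration and are not part of this commit.

```r
library(dplyr)
library(tidyr)

# Toy joined table; identifiers and values are invented for illustration
toy <- tibble(
  censusid = c("M-01-H001-01-01", "E-12-H034-02-03"),
  sex.obs  = c("Female", "Male"),
  sex.inv  = c("Female", "Female"),
  agey.obs = c(30, 41),
  agey.inv = c(29, 41)
)

toy_checked <- toy %>%
  # Split the identifier into its components, keeping the original column
  separate_wider_regex(
    cols = censusid,
    patterns = c(country = "[EM]", "-",
                 clusterid = "\\d{2}", "-H",
                 houseid = "\\d{3}", "-",
                 nested_hhid = "\\d{2}", "-",
                 nested_subjid = "\\d{2}"),
    cols_remove = FALSE
  ) %>%
  # Flag disagreements between observational (.obs) and inventory (.inv) values
  mutate(
    sex_mismatch  = sex.obs != sex.inv,
    agey_mismatch = abs(agey.obs - agey.inv) > 1  # hypothetical tolerance, in years
  )
```

A small age tolerance can absorb birthdays falling between two cross-sectional visits. The appropriate threshold depends on the time elapsed between the studies, which this diff does not state.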