%%capture
%%R
<- redcap_read_oneshot(
data_by_record = c(1,2),
records = url,
redcap_uri = token
token )$data
20 Appendix
20.1 Filter Data During Export
REDCapR and PyCap functions have options for filtering data upon export. If a REDCap project has a large amount of data that is slow to export, then we recommend using REDCapR/PyCap functions to filter the data during export when applicable.
Creating custom reports within REDCap and then exporting those filtered reports is another option for filtering the data before export for projects with a large amount of data. See Chapter 9.
20.1.1 Filter By Record ID
Export data for Record IDs 1 and 2.
=['1','2'],
project.export_records(records='label',
raw_or_label='df').reset_index() format_type
record_id | redcap_event_name | redcap_repeat_instrument | redcap_repeat_instance | first_name | last_name | phone_num | zip_code | dob | age | ... | cc_phone | cc_email | close_contacts_complete | supervisor_name | supervisor_email | work_inperson_yesno | work_date | work_contagious | work_contagious_calc | work_information_complete | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | Personal Info | NaN | NaN | John | Doe | (999) 999-9999 | 98105.0 | 2006-04-11 | 18.0 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
1 | 1 | Notifications | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | Boss | NaN | No | NaN | No | NaN | Complete |
2 | 1 | Case Intake | NaN | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
3 | 1 | Notifications | Close Contacts | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | ... | (999) 999-9999 | fake_email@gmail.com | Complete | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
4 | 1 | Notifications | Close Contacts | 2.0 | NaN | NaN | NaN | NaN | NaN | NaN | ... | (999) 999-9999 | fake_email@gmail.com | Complete | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
5 | 2 | Personal Info | NaN | NaN | Jane | Doe | (999) 999-9999 | 98105.0 | 1994-06-29 | 29.0 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
6 | 2 | Notifications | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | Boss | fake_email@gmail.com | Yes | 2023-10-10 | Yes | NaN | Complete |
7 | 2 | Case Intake | NaN | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
8 | 2 | Case Intake | NaN | 2.0 | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
9 | 2 | Notifications | Close Contacts | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | ... | (999) 999-9999 | fake_email@gmail.com | Complete | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
10 | 2 | Notifications | Close Contacts | 2.0 | NaN | NaN | NaN | NaN | NaN | NaN | ... | (999) 999-9999 | NaN | Incomplete | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
11 rows × 48 columns
20.1.2 Filter By Date
Data exports can be filtered by the date the record was added or modified. In the following example, we will view all records that were modified or added after April 9, 2024. For more information on how these records were modified, refer to the logging section of this guide. See Chapter 17.
%%capture
%%R
<- redcap_read_oneshot(
data_by_date = url,
redcap_uri = token,
token = as.POSIXct("2024/04/09",
datetime_range_begin format = "%Y/%m/%d")
)$data
Note: Need to specify date format as seen above using as.POSIXct
from datetime import datetime
=datetime.fromisoformat("2024-04-09"),
project.export_records(date_begin='df') format_type
redcap_repeat_instrument | redcap_repeat_instance | first_name | last_name | phone_num | zip_code | dob | age | ethnicity | race | ... | cc_phone | cc_email | close_contacts_complete | supervisor_name | supervisor_email | work_inperson_yesno | work_date | work_contagious | work_contagious_calc | work_information_complete | ||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
record_id | redcap_event_name | |||||||||||||||||||||
1 | personal_info_arm_1 | NaN | NaN | John | Doe | (999) 999-9999 | 98105.0 | 2006-04-11 | 18.0 | 1.0 | 4.0 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
notifications_arm_1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | Boss | NaN | 0.0 | NaN | 0.0 | NaN | 2.0 | |
case_intake_arm_1 | NaN | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | |
notifications_arm_1 | close_contacts | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | (999) 999-9999 | fake_email@gmail.com | 2.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | |
notifications_arm_1 | close_contacts | 2.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | (999) 999-9999 | fake_email@gmail.com | 2.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | |
2 | personal_info_arm_1 | NaN | NaN | Jane | Doe | (999) 999-9999 | 98105.0 | 1994-06-29 | 29.0 | 0.0 | 5.0 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
notifications_arm_1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | Boss | fake_email@gmail.com | 1.0 | 2023-10-10 | 1.0 | NaN | 2.0 | |
case_intake_arm_1 | NaN | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | |
case_intake_arm_1 | NaN | 2.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | |
notifications_arm_1 | close_contacts | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | (999) 999-9999 | fake_email@gmail.com | 2.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | |
notifications_arm_1 | close_contacts | 2.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | (999) 999-9999 | NaN | 0.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | |
3 | personal_info_arm_1 | NaN | NaN | John | Doe | (999) 999-9999 | 98105.0 | NaN | NaN | 1.0 | 4.0 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
notifications_arm_1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | Boss | NaN | 0.0 | NaN | 0.0 | NaN | 2.0 | |
case_intake_arm_1 | NaN | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | |
notifications_arm_1 | close_contacts | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | (999) 999-9999 | fake_email@gmail.com | 2.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | |
notifications_arm_1 | close_contacts | 2.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | (999) 999-9999 | fake_email@gmail.com | 2.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
16 rows × 46 columns
20.1.3 Filter By Field Value
Export records that reported ‘female’ for gender.
%%capture
%%R
<- redcap_read_oneshot(
data_by_gender = url,
redcap_uri = token,
token = "[gender] = '0'"
filter_logic %>%
)$data select(c(record_id, gender))
Note: When filtering on REDCap’s multiple choice variables, yes/no variables, and checkboxes, you must put quotes around the coded value when using the filter_logic
argument, otherwise REDCap will not perform the filtering correctly.
="[gender] = 0",
project.export_records(filter_logic='df'
format_type'record_id', 'gender']] ).reset_index()[[
record_id | gender | |
---|---|---|
0 | 2 | 0 |
1 | 4 | 0 |
2 | 6 | 0 |
Export records that reported an age greater than 20.
%%capture
%%R
<- redcap_read_oneshot(
data_by_age = url,
redcap_uri = token,
token = "[age] > 20 "
filter_logic %>%
)$data select(c(record_id, age))
Note: Because age is a numeric field in REDCap, it does not need quotes around the number 20.
="[age] > 20",
project.export_records(filter_logic='df'
format_type'record_id', 'age']] ).reset_index()[[
record_id | age | |
---|---|---|
0 | 2 | 29 |
1 | 4 | 28 |
2 | 5 | 34 |
3 | 6 | 25 |
20.2 Export Selected Fields
Export first and last name
%%capture
%%R
#specifying record_id automatically also pulls the event, instrument and instance columns (when applicable).
<- redcap_read_oneshot(
field_subset_1 = c("record_id","first_name","last_name"),
fields = url,
redcap_uri = token
token )$data
Note: if record_id
is not specified, no identifier fields will be exported. By including record_id
in the fields
argument, all variables that make up the unique key are automatically exported.
=['3','4'],
project.export_records(records=["first_name","last_name"],
fields='df') format_type
redcap_repeat_instrument | redcap_repeat_instance | first_name | last_name | ||
---|---|---|---|---|---|
record_id | redcap_event_name | ||||
3 | personal_info_arm_1 | NaN | NaN | John | Doe |
notifications_arm_1 | NaN | NaN | NaN | NaN | |
case_intake_arm_1 | NaN | 1.0 | NaN | NaN | |
4 | personal_info_arm_1 | NaN | NaN | Jane | Doe |
notifications_arm_1 | NaN | NaN | NaN | NaN | |
case_intake_arm_1 | NaN | 1.0 | NaN | NaN | |
case_intake_arm_1 | NaN | 2.0 | NaN | NaN |
Note: if record_id
is not specified, all fields that make up the unique key will still be exported.
20.3 Export Specific Instruments
See Chapter 13 to get a list of all the instrument names in your project. Specifying instruments to export will still export all rows of the project data (including rows not relevant to the desired instrument). However, it is useful because it will only export the fields (columns) in that instrument. Use the filter_logic
argument and the <form>_complete
variable (automatically created by REDCap for each form) to get the desired output.
In this example, we use the symptoms_complete
field to export the ‘symptoms’ form and all associated data.
%%capture
%%R
<- redcap_read_oneshot(redcap_uri = url,
records_dem = "record_id",
fields = "symptoms",
forms = '[symptoms_complete] <> ""',
filter_logic = token
token
)$data
<- gt(head(records_dem))
records_dem_tbl = 'export_records6.html', path = "./files/export_files/") gt::gtsave(records_dem_tbl, filename
Note: You must add record_id
to the fields
argument for the data to export with the record_id
and associated unique key attached.
='symptoms',
project.export_records(forms="[symptoms_complete] <>''",
filter_logic ='df') format_type
redcap_repeat_instrument | redcap_repeat_instance | symptoms_yesno | symptom_onset | symptoms_exp___1 | symptoms_exp___2 | symptoms_exp___3 | symptoms_exp___4 | symptoms_exp___5 | symptoms_exp___6 | symptoms_exp___7 | symptoms_exp___8 | symptoms_exp___9 | symptoms_exp___10 | symptoms_exp___11 | symptom_notes | symptoms_complete | ||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
record_id | redcap_event_name | |||||||||||||||||
1 | case_intake_arm_1 | NaN | 1 | 1 | 2023-10-09 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | NaN | 2 |
2 | case_intake_arm_1 | NaN | 1 | 1 | 2023-10-11 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | NaN | 2 |
case_intake_arm_1 | NaN | 2 | 1 | 2021-06-05 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 0 | NaN | 2 | |
3 | case_intake_arm_1 | NaN | 1 | 1 | 2023-10-09 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | NaN | 2 |
4 | case_intake_arm_1 | NaN | 1 | 1 | 2023-10-11 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | NaN | 2 |
case_intake_arm_1 | NaN | 2 | 1 | 2021-06-05 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 0 | NaN | 2 | |
5 | case_intake_arm_1 | NaN | 1 | 1 | 2023-10-08 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | NaN | 2 |
6 | case_intake_arm_1 | NaN | 1 | 1 | 2023-10-02 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | NaN | 2 |
Note: if record_id
is not specified, all fields that make up the unique key will still be exported.
20.4 Export Data as CSV (PyCap Only)
In Section 8.2 and Section 9.2, it was noted that to export labeled headers, the data needs to be exported as a csv. See the example below on how to do this.
from io import StringIO
= StringIO(project.export_records(records='2',
data_csv ='label',
raw_or_label='label',
raw_or_label_headers='csv'))
format_type= pd.read_csv(data_csv, sep=',')
df_csv df_csv
Record ID | Event Name | Repeat Instrument | Repeat Instance | First Name | Last Name | Phone Number | ZIP Code | Date of birth | Age (years) | ... | Phone number of close contact | Email of close contact | Complete?.3 | Supervisor Name | Supervisor email | Do you ever work in-person? | Date you last worked in-person? | Was this person at work while contagious? Symptom Onset Date: [case_intake_arm_1][symptom_onset] | Was this person at work while contagious? | Complete?.4 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2 | Personal Info | NaN | NaN | Jane | Doe | (999) 999-9999 | 98105.0 | 1994-06-29 | 29.0 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
1 | 2 | Notifications | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | Boss | fake_email@gmail.com | Yes | 2023-10-10 | Yes | NaN | Complete |
2 | 2 | Case Intake | NaN | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
3 | 2 | Case Intake | NaN | 2.0 | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
4 | 2 | Notifications | Close Contacts | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | ... | (999) 999-9999 | fake_email@gmail.com | Complete | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
5 | 2 | Notifications | Close Contacts | 2.0 | NaN | NaN | NaN | NaN | NaN | NaN | ... | (999) 999-9999 | NaN | Incomplete | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
6 rows × 48 columns
20.5 Creating a Reference Class (REDCapR only)
This Reference Class represents a REDCap project. Once some values are set that are specific to a REDCap project (such as the URI and token), later calls are less verbose (such as reading and writing data).
First, define the project:
%%capture
%%R
<- REDCapR::redcap_project$new(redcap_uri=url, token=token)
project <- project$read() ds_all
%%capture
%%R
<- project$read(fields = c("record_id", "gender", "first_name"))$data data
Pull the record_id
for all ‘female’ records:
%%R
<- data$record_id[data$gender=='0']
record_of_females record_of_females
[1] NA NA 2 NA NA NA NA NA 4 NA NA NA NA NA 6 NA NA
20.6 Clean Checkbox Choices (REDCapR Only)
REDCapR has a checkbox_choices
function that can be used to neatly list all answer options for a checkbox field.
First pull the metadata:
%%capture
%%R
<- redcap_metadata_read(
metadata = url,
redcap_uri = token
token )$data
Select the checkbox field that you would like to view:
%%R
<- metadata[metadata$field_name == "symptoms_exp",]$select_choices_or_calculations
symptoms_exp <- REDCapR::checkbox_choices(select_choices=symptoms_exp)
symptoms_exp_list symptoms_exp_list
# A tibble: 11 x 2
id label
<chr> <chr>
1 1 Sore Throat
2 2 Cough
3 3 Shortness of Breath
4 4 Chest Pain
5 5 Headache
6 6 Runny Nose
7 7 Congestion
8 8 Fever
9 9 Body Aches
10 10 Nausea/Vomiting
11 11 Diarrhea
20.7 REDCap Constants (REDCapR Only)
You can quickly search for the numerical values of several ‘constants’ within REDCap. For example, when exporting data from REDCap, all instruments end with a variable called <form_name>_complete
that when exported as raw data will take the values of 0, 1 or 2. You can see what each numerical value means by using constant("form_incomplete")
and constant("form_complete")
.
Other constants across all REDCap projects include the values for the various user rights settings. For a full list of constants available, refer to the following documentation.
View the <form_name>_complete
REDCap constant values:
%%R
REDCapR::constant(c("form_incomplete",
"form_complete",
"form_unverified"
))
[1] 0 2 1