This resource module is for researchers from organisations that hold a CPRD Multi-Study Licence (MSL), and for organisations interested in a CPRD MSL that want to understand how this licence option works.
By the end of this module, the reader will have learnt:
- What is a CPRD Multi-Study Licence (MSL)?
- What are the benefits to holding a CPRD MSL?
- How does the CPRD MSL work?
- What happens when a CPRD MSL ends?
- What is a nominated user?
- What is the CPRD data access portal?
- How do MSL holders request linked data for their study?
- How to select linkage eligible patients in CPRD primary care data?
- The linkage eligibility file (GOLD/Aurum_enhanced_eligibility_month_year.txt)
- The linkage coverage file (linkage_coverage_month_year.txt)
The CPRD Multi-Study Licence (MSL) is an access option that enables an organisation to conduct an unlimited number of permitted studies within the licence terms over a 12-month period.
The CPRD primary care data MSL includes direct online access to the CPRD GOLD and CPRD Aurum databases via nominated users.
There are different MSL options available to suit the requirements for different organisations. The pricing differs, along with the specific terms of the licence e.g. flexible terms and conditions, the number of nominated users included, and the ability to include linked data MSLs. The MSL options with their fees and what is included are shown on www.cprd.com/pricing.
Where MSLs can be purchased for linked datasets, these do not include direct online access to the linked dataset. Instead, an unlimited number of approved studies using the linked data can be requested from CPRD.
The key obligations for using CPRD data are mentioned in the How to Access CPRD Data module, and researchers should also liaise with the nominated users and key licence contacts within the organisation to understand the contractual terms of the CPRD MSL.
CPRD primary care data MSL
Linked data MSLs
More cost-effective if intending to carry out multiple studies using CPRD data.
No additional data fees for studies approved within the 12-month period, provided the terms (e.g. services, data, etc.) are covered in the MSL contract.
No need for individual contracts for each study, resulting in faster access to CPRD data, provided the terms are covered in the MSL contract.
Direct access to the database for organisations to explore the data internally, run patient counts and extract datasets from CPRD GOLD and CPRD Aurum, resulting in faster and more complex exploration of the data, and efficient access to CPRD data for research.
Organisations wishing to access CPRD data must first gain CPRD Client approval. More detail and the application form are available at www.cprd.com/Data-access.
CPRD will liaise with the client to ensure that our products and services meet the organisation's data and research requirements, and that the terms of the MSL contract are understood. The client must discuss and decide the details of the MSL, e.g. who the key contacts will be, how many nominated users, any additional services or data to be included, the affiliates to be included on the MSL, etc. before the CPRD MSL contract agreement is signed.
Once the CPRD MSL is signed, the organisation's nominated users complete training provided by CPRD on how to access the CPRD data portal and how to support their organisation's researchers. From the date when data access is first granted, nominated users then have 12 months of online access to CPRD GOLD and CPRD Aurum, able to explore and extract these data directly for research.
The organisation can submit an unlimited number of research applications via www.erap.cprd.com. Those approved within the 12-month licence period and covered by the terms of the contract agreement will be included in the CPRD MSL. Any data, services, or terms requested that are not included in the CPRD MSL will incur additional fees. The nominated users are able to extract the primary care data for these studies (with the exception of NCRAS studies requiring linkage to SACT and/or RTDS data). Please note that any research application that requires CPRD to extract primary care data for an MSL organisation will incur CPRD service fees. The Chief Investigator (CI) and Corresponding Applicant (CA) can then request any linked data required from CPRD by submitting a request for linked data via www.erap.cprd.com.
When a CPRD MSL is approaching the contract end date, CPRD will liaise with the organisation to confirm if the MSL will be renewed.
If an organisation does not renew the MSL, the MSL will terminate on the contract end date:
- Nominated users' online access to the CPRD data portal ends on the contract end date.
- Organisations must destroy any data extracted or provided that is not associated with an approved RDG protocol.
- Organisations that request that CPRD supply the study dataset for an RDG protocol that was approved during the MSL licence period will be charged the individual study dataset licence fee for all primary care data and linked data that will be provided as part of the study dataset, and must sign a Study Dataset Agreement.
- Any submitted study that is approved after the MSL contract end date will be charged the individual study dataset licence fee, and the organisation must sign a Study Dataset Agreement.
- Where a protocol is amended after the MSL contract end date, any data requests that are made as a result will not be covered by the MSL, and the individual study dataset licence fee will apply.
MSL organisations are encouraged to not wait until the end of the MSL to submit or amend their research applications. Sufficient time should be left for application approval and to extract the primary care data if they will not renew the licence.
- Nominated users are individuals who have completed CPRD training, hold secure personal CPRD data access portal login details, and are able to access the CPRD primary care databases directly online for the 12-month MSL period. With this access, they can explore the databases, run feasibility counts, and extract primary care data for studies for colleagues in their organisation.
- Nominated users can access the latest primary care data support files (i.e. code browser tools, denominator data, look-up files, and linkage source/eligibility files) on the Shared (L:) drive after logging into the CPRD Data Access Portal. Nominated users may share these files with colleagues who are authorised users within their organisation for the purposes of progressing a CPRD study.
- They will be key liaisons between CPRD and their organisation, able to cascade relevant information between organisations and share materials, as well as share expertise on using CPRD data and discuss research proposals.
- At least two nominated users are included as part of the CPRD MSL. Access for additional users can be purchased for Standard and Full MSLs.
- There are different ways of working: some organisations opt for researchers who will be conducting data management and analysis to become their nominated users; other organisations prefer to choose data managers, who will extract the data for other researchers to analyse, as their nominated users.
The CPRD data access portal (https://cprdportal.cprd.com/) is the platform through which nominated users access the CPRD primary care databases, CPRD GOLD and CPRD Aurum, online via the CPRD tools (Define, Extract, and Refine). Access requires an account created by CPRD.
For studies that do not request linked National Cancer Registration and Analysis Service (NCRAS) SACT and/or RTDS data, the nominated users can extract CPRD primary care data directly for their organisation's colleagues and use the linkage eligibility files. Nominated users can access these support files on the Shared (L:) drive after logging into the CPRD Data Access Portal.
The study CI or CA needs to submit a Request for Linked Data for the approved research application by logging into www.erap.cprd.com, completing the details requested and attaching the documents required, e.g. the list of CPRD patient identifiers (in the form of one tab-delimited text file) for which the study requires linked data.
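As a minimal sketch of preparing the patient list, the Python below writes a set of patient identifiers to a single tab-delimited text file. The function name, the `patid` column header, and the output file name are assumptions made for illustration, not a confirmed CPRD layout; the exact format required should be checked against the guidance on the eRAP request form.

```python
import csv


def write_patient_list(patids, path, header="patid"):
    """Write unique patient identifiers, one per row, to a tab-delimited file.

    `header` is a hypothetical column name, not a confirmed CPRD requirement.
    Identifiers are kept as strings so long IDs are never altered.
    """
    # Drop duplicates while preserving the original order.
    unique_ids = list(dict.fromkeys(str(p) for p in patids))
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh, delimiter="\t", lineterminator="\n")
        writer.writerow([header])
        writer.writerows([pid] for pid in unique_ids)
    return len(unique_ids)
```

For example, `write_patient_list(["1001", "1002", "1001"], "patient_list.txt")` would write a header row followed by two identifiers and return 2.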
Upon receipt of a Linked Data Request, CPRD will firstly evaluate whether the appropriate contractual agreements are in place for the study, and then review the details of the request in line with the approved protocol. The turnaround time for data delivery is 10 working days from receipt of a valid, approved request, i.e.:
- The Linked Data Request is correctly completed, including the database build and approved data sources,
- The patient/code list(s) are in the correct format,
- The patients are eligible for the linkages requested,
- There is a plan for data minimisation if the request is for >600,000 patients.
If all agreements are in place and the request is valid and approved, CPRD will provide the linked data within 10 working days. If any issues are highlighted, CPRD will be in touch directly. Please note that not addressing these issues will delay processing, and the request will not be processed until these issues are resolved. The progress of a Linked Data Request can be tracked by the CI or CA on the eRAP website.
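Two of the figures above lend themselves to a quick check before a request is submitted: the 600,000-patient threshold for a data minimisation plan and the 10-working-day turnaround. The sketch below is illustrative only. The function names are hypothetical, working days are assumed to be Monday to Friday with public holidays ignored, and counting is assumed to start on the day after a valid request is received; none of these conventions are confirmed by CPRD.

```python
from datetime import date, timedelta

MINIMISATION_THRESHOLD = 600_000  # patients, from the validity criteria above
TURNAROUND_WORKING_DAYS = 10      # stated turnaround for a valid, approved request


def needs_minimisation_plan(n_patients):
    """True if the request is for more than 600,000 patients."""
    return n_patients > MINIMISATION_THRESHOLD


def add_working_days(start, n_days):
    """Return the date n_days working days after start.

    Assumes a Monday-to-Friday week and ignores public holidays.
    """
    current = start
    remaining = n_days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current


# A request validated on Monday 8 January 2024 (an arbitrary example date).
expected_by = add_working_days(date(2024, 1, 8), TURNAROUND_WORKING_DAYS)
print(expected_by)  # 2024-01-22
```

Any date produced this way is only a rough planning aid; the actual delivery date depends on CPRD confirming that the request is valid and approved.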
For studies that request NCRAS Cancer Registration Tumour and Treatment data, the process outlined above now applies.
However, for NCRAS SACT and/or RTDS data, there are additional steps required to access these datasets prior to submission of the research application - a completed NCRAS Data Selection Form must first be approved by CPRD. This process is detailed in the NCRAS documentation available at https://cprd.com/cprd-linked-data#Cancer%20data. Nominated users cannot extract the CPRD primary care data and then request the NCRAS SACT and/or RTDS data and other linked data from CPRD due to additional governance requirements when providing these data. Even if the licence agreement to access CPRD data is via a CPRD MSL, CPRD must extract all primary care and linked data for NCRAS SACT and/or RTDS studies and provide this to the client.
For each linked dataset update that is released, a source linkage eligibility file ([database]_enhanced_eligibility_[month]_[year].txt) is produced which contains all patients from English practices who have not opted out of, or dissented from, the sharing of confidential patient information for planning and research, and whose identifiers were transferred to the trusted third party for linkage prior to processing. From this file, researchers can select (through a binary variable) patients who are eligible for linkage to the data source(s) of interest. Some patients will not be eligible for linkage to any of the datasets, whereas others may be eligible for linkage to some or all of them.
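The selection through a binary variable can be sketched in Python as follows. This is a minimal illustration that assumes the eligibility file is tab-delimited with one row per patient and one 0/1 flag column per data source. The column names used here (`patid`, `hes_apc_e`, `ons_death_e`, `lsoa_e`) and the sample rows are assumptions for illustration and do not represent real patients; the actual variable names should be taken from the documentation supplied with the file.

```python
import csv
import io


def select_eligible(eligibility_file, flag_columns, id_column="patid"):
    """Return IDs of patients flagged 1 for every requested data source.

    Column names are hypothetical and must match the real file header.
    """
    reader = csv.DictReader(eligibility_file, delimiter="\t")
    return [
        row[id_column]
        for row in reader
        if all(row[col] == "1" for col in flag_columns)
    ]


# Made-up rows for illustration only; not real patients.
sample = io.StringIO(
    "patid\thes_apc_e\tons_death_e\tlsoa_e\n"
    "1001\t1\t1\t1\n"
    "1002\t0\t0\t1\n"
    "1003\t1\t0\t0\n"
)
eligible = select_eligible(sample, ["hes_apc_e", "ons_death_e"])
print(eligible)  # ['1001']
```

Requiring every requested flag to be 1 keeps only patients who can be linked to all sources of interest; a study that needs any one of several sources would use `any` instead of `all`.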
The linkage coverage file (linkage_coverage_[month]_[year].txt) provides the start and end dates for each data source.
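One way the coverage dates might be used is to restrict a study period to the window in which a linked data source is available. The sketch below assumes a tab-delimited coverage file with columns named `data_source`, `start` and `end`, and dates written as DD/MM/YYYY. These names, the date format, and the sample dates are all assumptions for illustration and should be checked against the actual file.

```python
import csv
import io
from datetime import date, datetime


def read_coverage(coverage_file, date_format="%d/%m/%Y"):
    """Map each data source to its (start, end) coverage dates.

    Column names and date format are hypothetical.
    """
    reader = csv.DictReader(coverage_file, delimiter="\t")
    return {
        row["data_source"]: (
            datetime.strptime(row["start"], date_format).date(),
            datetime.strptime(row["end"], date_format).date(),
        )
        for row in reader
    }


def clip_to_coverage(study_start, study_end, coverage, source):
    """Return the part of the study period covered by the source, or None."""
    cov_start, cov_end = coverage[source]
    start = max(study_start, cov_start)
    end = min(study_end, cov_end)
    return (start, end) if start <= end else None


# Made-up coverage dates for illustration only.
sample = io.StringIO(
    "data_source\tstart\tend\n"
    "hes_apc\t01/04/1997\t31/03/2021\n"
)
coverage = read_coverage(sample)
window = clip_to_coverage(date(1995, 1, 1), date(2022, 12, 31), coverage, "hes_apc")
print(window)
```

Returning `None` when the study period and the coverage window do not overlap makes it explicit that no linked data would be available for that source.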
The linkage eligibility file (GOLD/Aurum_enhanced_eligibility_month_year.txt)
- CPRD’s trusted third party, NHS England, provides CPRD with a list of patient identifiers and eligibility flags.
- This includes all patients registered in English practices that consented to take part in the linkage process.
- It excludes patients who have opted out of, or dissented from, the sharing of confidential patient information for planning and research.
- Patients with a valid NHS number are eligible to be linked to HES, ONS death, COVID-19 and NCRAS data.
- Patients with a valid postcode are eligible to be linked to the patient-level small area data.
Note: All patient-level data examples used in this training pack were created for the purposes of this training and do not represent real patients.
The linkage coverage file (linkage_coverage_month_year.txt)
To ensure provision of the latest available data per data source, and to honour patient opt-outs, the latest linkage eligibility per patient for each requested linked dataset will be applied during the processing of a linkage request.