Accelerometer Physical Activity Datasets: A Cross-Country Scoping Review
Introduction
The benefits of physical activity have been known for decades. A range of studies, utilising both observational and intervention study designs, have examined whether an increased level of physical activity has a beneficial impact on the risk of an individual developing chronic diseases such as cardiovascular disease, cancer and type 2 diabetes. Although self-reported measures of physical activity have been used more extensively in the past, technological advances have led to device-based measures becoming cheap enough that they can be used at scale to measure physical activity. This has led to an increasing number of studies collecting a large volume of physical activity and accompanying health outcomes data.
Population-representative surveys, such as the National Health and Nutrition Examination Survey (NHANES) in the United States (US), have led to the production of very large cross-sectional datasets (n > 5,000), which provide suitable statistical power to investigate a number of associations between physical behaviour and health. Secondary analyses of these data allow newly defined research questions to be addressed without the need to collect new data. In addition to such observational studies, randomised controlled trials (RCTs) have collected physical activity and health data in large numbers of participants. Such studies perform a battery of health assessments before and after a physical activity intervention. The baseline data can be treated as a cross-sectional study, as all participants are performing their usual behaviour before undergoing randomisation. These data appear to be far less commonly re-analysed in secondary data analyses, likely due to the large amount of work that would be required to harmonise them.
This reliance on a small number of primary study datasets for a large volume of secondary analysis raises several potential issues. Firstly, as NHANES is based in the US, the ethnic groups measured are only those prevalent in the US. It therefore fails to capture the full range of ethnicities that exist in other parts of the world. Secondly, because data collection is isolated to one country, potential differences due to geographic location may not be fully assessed. For example, countries closer to the equator generally have more tropical climates than those closer to either pole. This could have a substantial impact on the physical activity patterns of individuals living in such countries.
Due to these limitations, we aim to map all the available primary study datasets that have collected accelerometry-measured physical activity. Once completed, this will provide a comprehensive resource that allows other researchers to identify which studies may have collected the data necessary to answer their research question. It will also help to highlight geographic locations and populations in which data have not yet been collected, and will allow the potential for future harmonised data analysis to be assessed.
In 2015, a review was conducted to describe the scope of accelerometry data collected internationally. The current study aims both to update the findings of this previous work and to extract more comprehensive data regarding the characteristics of each study, including the proportion of different ethnic groups measured, the accelerometry protocol utilised, the access status of each primary study dataset and the health outcomes collected alongside device-measured physical activity. It is intended that these data will have greater utility in highlighting the gaps that exist in the current body of evidence. This review will also look to highlight studies where harmonisation work may be warranted.
Therefore, this scoping review aims to identify the available datasets that have used accelerometers to collect physical activity and health outcomes data. To do this, we will systematically review the evidence to identify relevant studies and thereafter extract these datasets and their characteristics (e.g. study location, population(s) recruited, the accelerometer brand and model used to measure physical activity, the location at which the device is worn on the body, and the health outcomes assessed and recorded alongside physical activity). Our findings will provide a user-friendly library of datasets to which researchers can refer when conducting future studies of the health benefits of physical activity.
Methodology
Due to the aims of the study, it was decided that a scoping review methodology should be adopted. A scoping review, according to Arksey and O’Malley, is a form of knowledge synthesis that aims to:
a) Examine the extent or nature of research activity
b) Determine the value of undertaking a full systematic review
c) Summarise and disseminate research findings
d) Identify research gaps in the existing knowledge.
As the aim of this study is not to answer a specific research question, but instead to understand the extent and nature of the datasets that have been collected using accelerometry to measure physical activity, a traditional systematic review is not appropriate. Given that a key secondary aim of our study is to assess whether an individual participant data (IPD) meta-analysis is warranted and feasible, a scoping review is considered the most appropriate methodology. To ensure our methodology is consistent with that used in previous literature, the framework developed by Arksey and O’Malley, and further built on by Levac et al, will be followed. The paper will also be written in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) checklist.
Arksey and O’Malley laid out five key stages in the production of a scoping review. Our approach to each stage is discussed below.
Stage 1: Identify the question
Our main aim is to identify and characterise the datasets that have collected device-measured physical activity using accelerometers. To address this aim fully, four research aims were derived:
a) To identify datasets that have measured physical activity using accelerometers
b) To identify the key characteristics of these datasets such as: which country/countries they have been collected in, which populations they have focused on, which devices have been used to collect the physical activity data and any health outcomes that were simultaneously collected
c) To clarify the access status of the identified datasets in order to determine the feasibility of conducting harmonised data analysis
d) To identify any “emerging” relevant datasets. These will be defined as datasets for which data have been collected but not yet published, datasets that are currently collecting data or datasets for which methodologies are currently being constructed.
Stage 2: Search for literature
The search strategy will be created with the assistance of an Information Specialist. Keywords will be determined through consultation between the research team and the Information Specialist. The search strategy will be developed iteratively to ensure maximum coverage whilst keeping the review feasible in terms of the number of studies returned. Searches will be limited to human participants aged 18 years or older. No date limits will be applied. Four search methods will be used, as outlined below:
a) Peer-reviewed online literature databases will be searched. Searches will be performed in MEDLINE and MEDLINE In-Process, EMBASE, CINAHL, SPORTDiscus and CENTRAL.
b) Grey literature sources will also be searched. Searches will be performed in trials registries, OpenGrey and the Conference Proceedings Citation Index-Science (CPCI-S).
c) Experts in the field will also be contacted to contribute “emerging” studies. These will be defined as studies for which data have been collected but not yet published, studies that are currently collecting data or studies for which methodologies have been or are currently being constructed. Experts in the field will be determined by the research team and will be sent an email asking them to contribute any studies they feel fit the above criteria.
d) Forward and backward searching of all included articles will be conducted to identify further publications that may meet the inclusion criteria of this review.
A draft of the search strategy to be used in MEDLINE can be found in Appendix 1, alongside an explanation of each line of the search strategy.
Stage 3: Study selection
Searches will be conducted by an Information Specialist. Results of these searches will be exported to EndNote (Clarivate, Philadelphia, PA, USA) and duplicates will then be removed. The remaining studies will be imported into Covidence (Veritas Health Innovation, Melbourne, Australia). Subsequently, title and abstract screening will be performed. Each study will be screened independently by two researchers. Any disagreement between their decisions will be resolved through discussion between the two researchers. If the disagreement cannot be resolved, a third researcher will be brought into the discussion to make a final decision.
All studies will be screened based on the inclusion and exclusion criteria outlined in Table 1.
Full-text screening will then be conducted on the remaining studies and the reason for excluding studies at this stage will be recorded. All inclusion and exclusion decisions will be recorded and a PRISMA flow diagram will be produced and included in the final manuscript.
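The dual-screening rule described in this stage can be illustrated with a minimal Python sketch. It is an illustration only, not part of the review software: the function name, argument names and the decision labels "include" and "exclude" are assumptions made for the example, and in practice this logic is handled within the screening platform.

```python
VALID_DECISIONS = {"include", "exclude"}  # hypothetical decision labels


def final_decision(reviewer_a, reviewer_b, after_discussion=None, third_reviewer=None):
    """Return the final screening decision for one study.

    Two researchers screen each study. If they agree, that decision stands.
    If they disagree, the outcome of their discussion is used. If discussion
    does not resolve the disagreement, a third researcher decides.
    """
    for vote in (reviewer_a, reviewer_b):
        if vote not in VALID_DECISIONS:
            raise ValueError(f"unexpected decision: {vote!r}")

    if reviewer_a == reviewer_b:
        return reviewer_a
    if after_discussion in VALID_DECISIONS:
        return after_discussion
    if third_reviewer in VALID_DECISIONS:
        return third_reviewer
    raise ValueError("disagreement has not yet been resolved")
```

For example, `final_decision("include", "exclude", third_reviewer="include")` returns `"include"`, reflecting a disagreement that was settled by the third researcher.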
Stage 4: Charting the data
Once full-text screening is completed and eligible datasets have been identified, the name of the dataset used for the analysis will be extracted from each publication. Publications will then be grouped by the dataset collected and/or analysed, and data will be extracted for each dataset. As each publication is likely to have applied its own inclusion and exclusion criteria to the total dataset to answer its specific research questions, where possible we will consult dataset websites, methodology publications or publications that provide a description of the entire cohort. If such resources cannot be found, all the publications that have arisen from the dataset will be consulted to determine the characteristics of the dataset. If the information cannot be retrieved at this stage, the characteristic will be marked as unknown.
If the same dataset has been used by multiple publications, the extraction process will only be completed once. However, a variable will be created to indicate the number of publications which have stemmed from each dataset. Key characteristics of the datasets will be extracted including the number of participants, the country of data collection, the accelerometer brand and model used to measure physical activity and accessibility status of the dataset.
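A minimal Python sketch of this charting step is given below, under simplifying assumptions. The record layout, field names and example values are hypothetical, and the sketch fills each characteristic from the first publication that reports it, whereas the process described above first consults dataset websites and methodology publications.

```python
# Hypothetical characteristics to chart for each dataset.
FIELDS = ("country", "device", "n_participants", "access_status")

# Hypothetical publication records; all names and values are illustrative only.
publications = [
    {"pub_id": "pub-001", "dataset": "Dataset A", "country": "Country X"},
    {"pub_id": "pub-002", "dataset": "Dataset A", "device": "Device 1"},
    {"pub_id": "pub-003", "dataset": "Dataset B", "country": "Country Y"},
]


def chart_datasets(pubs):
    """Group publications by dataset and return one charted row per dataset."""
    charted = {}
    for pub in pubs:
        # Create the dataset row once, with every characteristic set to "unknown".
        row = charted.setdefault(
            pub["dataset"],
            {"n_publications": 0, **{field: "unknown" for field in FIELDS}},
        )
        # Count how many publications stem from this dataset.
        row["n_publications"] += 1
        # Fill any characteristic that is still unknown and reported here.
        for field in FIELDS:
            value = pub.get(field)
            if row[field] == "unknown" and value is not None:
                row[field] = value
    return charted


charted = chart_datasets(publications)
```

Here each dataset is extracted only once, the `n_publications` variable records how many publications stem from it, and characteristics that no source reports remain marked as unknown.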
Stage 5: Collating, summarising and reporting
Once data have been extracted, the data will be synthesised into a table in order to map the key characteristics of each dataset. From this, patterns and trends in key characteristics such as where datasets were collected, and the access status of the datasets will be explored. Further data analysis may be performed based on this preliminary analysis, to explore these relationships in greater detail. This could include the production of descriptive or summary statistics for key variables. Visualisations will be produced where it is felt they help to convey the central message of the data analysis.
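As an illustration of the kind of descriptive summary that could be produced from the charted table, the following Python sketch counts datasets by a chosen characteristic and computes a median sample size. The rows, field names and access status labels are hypothetical examples, not extracted data or a fixed analysis plan.

```python
from collections import Counter
from statistics import median

# Hypothetical charted rows; all names and values are illustrative only.
datasets = [
    {"name": "Dataset A", "country": "Country X", "access_status": "open", "n_participants": 5000},
    {"name": "Dataset B", "country": "Country X", "access_status": "on request", "n_participants": 300},
    {"name": "Dataset C", "country": "Country Y", "access_status": "unknown", "n_participants": None},
]


def count_by(rows, key):
    """Count how many datasets fall into each category of one characteristic."""
    return Counter(row[key] for row in rows)


def median_sample_size(rows):
    """Median number of participants, ignoring datasets where it is unknown."""
    sizes = [row["n_participants"] for row in rows if row["n_participants"] is not None]
    return median(sizes) if sizes else None


by_country = count_by(datasets, "country")
by_access = count_by(datasets, "access_status")
```

Counts of this kind could feed directly into the summary table or into simple visualisations, such as a bar chart of datasets per country.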
It is highly likely that sub-groups of datasets will be derived based on a number of factors, including the geographic location where the dataset was collected, the device used to collect the data and the population in which the data were collected. The key findings will be presented as part of the main text; however, additional summary tables and visualisations may be included in appendices or as online supplementary material if necessary.
A narrative summary of the above will then be produced highlighting the key findings.