Association of wearable device-measured vigorous intermittent lifestyle physical activity with mortality
Sample and design
Figure 1 describes the derivation of the analytic sample. The UK Biobank Study is a prospective cohort study of adults aged between 40 and 69 years whose baseline measurements took place between 2006 and 2010. Participants provided informed consent and ethical approval was provided by the UK’s National Health Service, National Research Ethics Service (Ethics Committee reference number: 11/NW/0382).
Between 2013 and 2015 (median 5.5 years after the baseline measurements), 103,684 UK Biobank participants wore a wrist-worn accelerometer for 7 days24,25. We excluded participants with missing covariates or insufficient valid wear days. Monitoring days were considered valid if wear time was greater than 16 h. To be included in the analysis, participants were required to have at least three valid monitoring days, with at least one of those days being a weekend day45,46. We also excluded participants who reported that they could not walk.
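The wear-time inclusion rule can be sketched in a few lines of Python. This is an illustrative sketch only, not the study's processing pipeline: the `MonitoringDay` record and the function names are hypothetical, and the thresholds are taken directly from the text above.

```python
from dataclasses import dataclass
from datetime import date

MIN_WEAR_HOURS = 16.0  # a day is valid if wear time is greater than 16 h
MIN_VALID_DAYS = 3     # at least three valid days are required


@dataclass
class MonitoringDay:
    """Hypothetical per-day summary of accelerometer wear."""
    day: date
    wear_hours: float


def is_valid_day(d: MonitoringDay) -> bool:
    # Strictly greater than 16 h, as stated in the text.
    return d.wear_hours > MIN_WEAR_HOURS


def meets_wear_criteria(days: list[MonitoringDay]) -> bool:
    valid = [d for d in days if is_valid_day(d)]
    # date.weekday(): Monday is 0, Saturday is 5, Sunday is 6.
    has_weekend_day = any(d.day.weekday() >= 5 for d in valid)
    return len(valid) >= MIN_VALID_DAYS and has_weekend_day
```

A participant with three valid weekdays but no valid weekend day would fail this check, as would one whose only weekend day had exactly 16 h of wear.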
To enable examination of VILPA in our study (brief bouts of nonexercise VPA occurring during daily living), we included only participants who reported no leisure time exercise participation and no more than one recreational walk per week. Participation in exercise and recreational walking was measured through a close-ended touch-screen questionnaire that asked participants to report if, how often, and for how long they participate in such activities (Supplementary Table 2). Among the included 14,982 participants who were walking for recreation once a week or less, the average spacing of VILPA bouts was 165.7 (47.0) min within days and 16.7 (5.5) h between days (last session of a day versus first session the day after). The modal median length of the (at most) one and only weekly walking session these participants reported was 30–60 min (32.5% of the 14,982 participants), effectively eliminating the possibility that the device-recorded VILPA bouts occurred during recreational walking.
To provide a comparison between effects of VILPA and (context-agnostic) VPA we repeated the main analyses among ‘exercisers’, defined as those UK Biobank accelerometry substudy participants who did not meet the above criteria to be considered nonexercisers; that is, those who reported any leisure time exercise or more than one recreational walking session per week (Supplementary Table 1).
Definition of VILPA and choice of bout length
We based the choice of VILPA bout length entered in our analyses on an ongoing study of 58 adults (mean age 55.7 (s.d. 10.1) years) aimed at developing an empirical definition of VILPA (M.N.A., N. Johnson, C.T.-N., M.J.G. and E.S., unpublished data). Participants completed five activities of daily living while wearing an indirect calorimetry unit (Cosmed K5) and Polar heart-rate monitor. The activities included: (1) walking on a flat surface at a self-selected ‘very fast’ pace; (2) walking on a flat surface while carrying shopping-like bags equivalent to 5% of body weight at a self-defined ‘fast’ pace; (3) walking on a flat surface while carrying shopping-like bags equivalent to 10% of body weight at a self-defined ‘fast’ pace; (4) walking at a 2.5% gradient at a self-defined ‘very fast’ pace (treadmill); and (5) walking at a 7.0% gradient at a self-defined ‘very fast’ pace (treadmill). The sequence of activities was randomized for each participant and counterbalanced across participants to prevent biases due to residual fatigue accumulating during the protocol.
Participants performed each activity until vigorous intensity was reached for two of three criteria: (1) %VO2max (percentage of maximal oxygen uptake) (≥64%); (2) %HRmax (percentage of maximal heart rate) (≥77%); and (3) rating of perceived exertion (Borg scale) ≥15. For %VO2max and %HRmax, the threshold had to be met for at least 30 consecutive seconds to minimize the effects of noise. VO2max was calculated using the Ebbeling treadmill test and HRmax was calculated using the Tanaka equation47. Between activities, participants had 5 min of seated recovery, or until heart rate and breathing returned to resting levels. Resting VO2 and heart rate were measured at the beginning of each session with the participant lying supine using 5 min of steady-state (coefficient of variation ≤ 10%). The duration to reach vigorous intensity across all five activities is shown in Supplementary Table 7. As the mean time required to reach vigorous intensity in two of the above three physiological intensity indices was 73.5 s (s.d. 26.2 s) across all activities, we decided to test VILPA bouts lasting up to 1 and up to 2 min in the present analyses. As the length of raw bouts within these two VILPA frequency exposures was highly variable, we length-standardized analytic bouts to 1 min (for raw bouts lasting up to 1 min) or 2 min (for raw bouts lasting up to 2 min) using a rolling sum on the time-series data until 1 or 2 min, respectively, was reached or exceeded. For example, a participant with five consecutive raw bouts lasting up to 1 min each (20, 30, 20, 40 and 10 s long) would be assigned 1.83 analytic bouts: the first three raw bouts (70 s in total) would count as one bout and the rolling sum would be reset; the last two raw bouts would then count as 0.83 length-standardized bouts (50 s divided by 60 s).
This bout handling has analytic and interpretational advantages: (a) it mitigates the problem of multicollinearity between raw VILPA frequency and daily VILPA duration; and (b) it permits a more concrete behavioral interpretation of the VILPA frequency findings than raw bouts, as each length-standardized bout can be specifically interpreted as lasting 1 min or 2 min.
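The length-standardization rule can be illustrated with a minimal Python sketch that reproduces the worked example. The function name is hypothetical. Two details are assumptions inferred from the example rather than stated rules: the rolling sum is reset to zero once the target is reached (any excess is discarded), and a trailing partial sum is credited as a fraction of the target length.

```python
def standardized_bouts(raw_bout_seconds, target_seconds=60):
    """Convert raw bout durations (s) into length-standardized bouts.

    Assumptions (inferred from the worked example, not confirmed):
    - the rolling sum resets to zero when the target is reached or exceeded;
    - any remainder at the end counts as a fraction of one bout.
    """
    completed = 0
    running = 0.0
    for duration in raw_bout_seconds:
        running += duration
        if running >= target_seconds:
            completed += 1
            running = 0.0  # reset the rolling sum
    return completed + running / target_seconds


# Worked example from the text: 20 + 30 + 20 = 70 s gives one bout,
# then 40 + 10 = 50 s gives 50/60 of a bout.
example = standardized_bouts([20, 30, 20, 40, 10], target_seconds=60)
print(round(example, 2))  # 1.83
```

For the 2-min exposure the same function would be called with `target_seconds=120`.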
Wearable device-based physical activity classification
The methods we describe here were used to classify physical activity intensity in both the nonexercisers (main analyses) and exercisers (additional analyses) strata. Supplementary Fig. 4 summarizes how activity intensity was classified using a previously validated random forest (RF) activity classifier33. RF is an ensemble of multiple decision trees. Each tree is learned on a bootstrap sample of training data and each node in the tree is split using the best among a randomly selected set of acceleration features. The decisions from each tree are aggregated and a final model prediction is based on majority vote. The RF model requires very little preprocessing of the data because the features do not need to be normalized. In addition, the model is resistant to over-fitting the training data because each tree within the forest is independently grown to maximum depth using a randomly selected subset of features.
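The three ingredients of a random forest described above (bootstrap resampling, random feature subsets and majority voting) can be sketched with the Python standard library. This is a conceptual illustration only and not the validated classifier; the function names and feature labels are hypothetical.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) training rows with replacement (one sample per tree)."""
    return [rng.choice(rows) for _ in rows]


def random_feature_subset(features, m, rng):
    """Pick m distinct candidate features to consider at a node split."""
    return rng.sample(features, m)


def majority_vote(tree_predictions):
    """Aggregate per-tree class labels; ties go to the label seen first."""
    return Counter(tree_predictions).most_common(1)[0][0]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
print(majority_vote(["walking", "walking", "sedentary"]))  # walking
```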
This two-stage classifier first categorized physical activity in 10-s windows into one of four activity classes: sedentary, standing utilitarian movements (for example, ironing a shirt, washing dishes), walking activities (for example, gardening, active commuting, mopping floors) and running/high energetic activities (for example, active playing with children). These activity classes were then assigned to one of four activity intensities: sedentary, light, moderate and vigorous. Walking activities were classified as light (an acceleration value of <100 mg), moderate (≥100 mg) and vigorous (≥400 mg) intensity48. For example, for a VILPA bout lasting up to 2 min, 12 consecutive 10-s windows needed to be classified as vigorous. When there were more than 12 consecutive vigorous activity windows, these bouts counted as long VPA sessions in the corresponding analyses (2.3% of all VPA bouts). Sleep36 and nonwear35 were differentiated using the change in tilt angle and the standard deviation of acceleration. Monitors were calibrated49 and corrected for orientation50 using previously published methods, although residual signal and alignment uncertainties may persist.
Activities in an independent sample of 98 participants (age 56.4 ± 15.7 years; 53.1% female) from the US51 (University of California Irvine Center for Machine Learning and Intelligent Systems Physical Activity Monitoring for Aging People study (published data), accessible at https://archive.ics.uci.edu/ml/datasets) and Australia52 (University of Queensland Where and When at Work study (published data) and University of Sydney Intermittent Lifestyle Physical Activity Study (unpublished data)) providing 103,607 activity samples from structured and free-living activities (17,267 min) were used to assess robustness and generalizability of the classifier (Supplementary Tables 8 and 9). For free-living activities, participant-worn or researcher-held Go-Pro video recordings were used to obtain ground-truth physical activity. Video files were imported into the Noldus Observer XT software v16.0 for continuous direct observation coding. A two-stage direct observation scheme was implemented in which the participant’s movement behavior was coded for activity type and then activity intensity based on the Compendium of Physical Activities53. The direct observation system generated a vector of date–time stamps corresponding to the start and finish of each movement event, which were used to assign the activity codes to the corresponding time segments of the accelerometer data. Interobserver reliability was assessed by dual coding. The intraclass correlation coefficient for coding activities was 0.91 (0.87–0.94).
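The second-stage intensity thresholds and the window-counting rule can be illustrated as follows. This is a simplified sketch under stated assumptions, not the published classifier: the function names are hypothetical, the thresholds apply to windows already labeled as walking activities, and runs of up to 12 consecutive vigorous windows are treated as bouts lasting up to 2 min, with longer runs set aside as long VPA sessions.

```python
from itertools import groupby

WINDOW_SECONDS = 10  # classification window length from the text


def walking_intensity(acceleration_mg: float) -> str:
    """Map a walking window's acceleration (mg) to an intensity label."""
    if acceleration_mg >= 400:
        return "vigorous"
    if acceleration_mg >= 100:
        return "moderate"
    return "light"


def vigorous_run_lengths(intensities):
    """Lengths, in windows, of each run of consecutive vigorous windows."""
    return [
        sum(1 for _ in group)
        for label, group in groupby(intensities)
        if label == "vigorous"
    ]


def split_bouts(intensities, max_minutes=2):
    """Return (short_bouts_s, long_sessions_s) as lists of durations in s."""
    max_windows = max_minutes * 60 // WINDOW_SECONDS  # 12 windows for 2 min
    runs = vigorous_run_lengths(intensities)
    short = [n * WINDOW_SECONDS for n in runs if n <= max_windows]
    long_sessions = [n * WINDOW_SECONDS for n in runs if n > max_windows]
    return short, long_sessions


labels = [walking_intensity(a) for a in (50, 120, 450, 500, 90)]
print(split_bouts(labels))  # ([20], [])
```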
Performance was further evaluated in a separate sample of 151 adults (age range 18–91 years, 65.6% female; Supplementary Fig. 5) recruited from the UK34 (University of Oxford Capture 24 study (published data), accessible at https://ora.ox.ac.uk/objects/uuid:99d7c092-d865-4a19-b096-cc16440cd001). Participants in this data set wore body cameras that provided pictures every 20 s to annotate ground-truth free-living activity labels. The picture-based activity coding scheme has been previously described34. A total of 172,360 activity samples (28,727 min) were provided by participants.
Because of the nature of rolling updates for the data linkage, participants were followed up to 31 October 2021, with deaths obtained through linkage with the National Health Service (NHS) Digital of England and Wales or the NHS Central Register and National Records of Scotland. CVD mortality was defined as death attributed to diseases of the circulatory system, excluding hypertension, diseases of arteries and lymph (ICD-10 codes: I0, I11, I13, I20–I51, I60–I69). Cancer mortality was defined as death attributed to any cancer, excluding in situ, benign, uncertain, nonmelanoma skin cancer or non-well-defined cancers (ICD-10 codes beginning ‘C0’, ‘C1’, ‘C2’, ‘C3’, ‘C4’ (excluding C49.9), ‘C5’, ‘C6’, ‘C70’, ‘C71’, ‘C72’, ‘C73’, ‘C74’, ‘C75’, ‘C7A’, ‘C8’ or ‘C9’).
In our study, the range of VILPA values (and context-agnostic VPA values in exercisers) was capped at the 97.5th percentile to minimize the influence of sparse data. To reduce the possibility of reverse causation through prodromal/undiagnosed disease, all analyses excluded those with an event within the first 2 years of follow-up. We also excluded those with prevalent CVD and prevalent cancer at baseline (CVD and cancer mortality analyses, respectively).
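The outcome definitions can be expressed as simple ICD-10 code matching. The sketch below is illustrative and follows the listed codes literally; the function names are hypothetical, ‘I0’ is read as a prefix covering I00–I09 (an assumption), and codes are compared after removing the decimal point.

```python
# Prefixes copied from the cancer mortality definition in the text.
CANCER_PREFIXES = (
    "C0", "C1", "C2", "C3", "C4", "C5", "C6",
    "C70", "C71", "C72", "C73", "C74", "C75", "C7A",
    "C8", "C9",
)


def normalize(code: str) -> str:
    """Upper-case an ICD-10 code and drop the dot, e.g. 'i21.9' -> 'I219'."""
    return code.strip().upper().replace(".", "")


def is_cvd_death(code: str) -> bool:
    c = normalize(code)
    if len(c) < 3 or c[0] != "I" or not c[1:3].isdigit():
        return False
    category = int(c[1:3])
    # I0 (read as I00-I09), I11, I13, I20-I51, I60-I69
    return (
        category <= 9
        or category in (11, 13)
        or 20 <= category <= 51
        or 60 <= category <= 69
    )


def is_cancer_death(code: str) -> bool:
    c = normalize(code)
    return c.startswith(CANCER_PREFIXES) and c != "C499"  # C49.9 excluded


print(is_cvd_death("I21.9"), is_cvd_death("I10"), is_cancer_death("C49.9"))
# True False False
```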
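These sample restrictions can be sketched as below. This is one possible reading and not the authors' code: the `Participant` record and function names are hypothetical, ‘event’ is taken to mean death from any cause, ‘within the first 2 years’ is implemented as follow-up shorter than 2 years, and ‘capped’ is implemented as clipping values to the 97.5th percentile (restricting the analyzed range to that percentile would be an equally plausible reading).

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """Hypothetical analysis record."""
    exposure: float          # e.g. daily VILPA duration (min)
    followup_years: float
    died: bool
    prevalent_cvd: bool = False
    prevalent_cancer: bool = False


def percentile(values, q):
    """Percentile with linear interpolation between order statistics."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def eligible(p: Participant, outcome: str) -> bool:
    """Apply exclusions for outcome in {'all_cause', 'cvd', 'cancer'}."""
    if p.died and p.followup_years < 2.0:
        return False  # early event: possible reverse causation
    if outcome == "cvd" and p.prevalent_cvd:
        return False
    if outcome == "cancer" and p.prevalent_cancer:
        return False
    return True


def cap_exposure(values, q=97.5):
    """Clip exposures at the q-th percentile (assumed meaning of 'capped')."""
    cutoff = percentile(values, q)
    return [min(v, cutoff) for v in values]
```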
We examined the dose–response of average daily duration and frequency of VILPA bouts lasting up to 1 min and up to 2 min using Cox proportional hazards (all-cause mortality) and Fine–Gray subdistribution hazards to account for competing mortality risks (CVD and cancer mortality)54. In all analyses, we set knots at the 10th, 50th and 90th percentiles. Departure from linearity was assessed by a Wald test. Proportional hazards assumptions were tested using Schoenfeld residuals in the models with all three outcomes and no violations were observed (all P > 0.05). Analyses were adjusted for age, sex, daily duration of light- and of moderate-intensity physical activity, mutual adjustment for daily duration and frequency of vigorous-intensity physical activity bouts lasting more than 1 or 2 min as appropriate, smoking, alcohol, accelerometry-estimated sleep duration35,36, fruit and vegetable consumption, education, parental history of CVD and cancer, and medication use (insulin, blood pressure, cholesterol). All-cause mortality analyses were also adjusted for prevalent CVD and cancer, CVD analyses were adjusted for prevalent cancer, and cancer analyses were adjusted for prevalent CVD (Supplementary Table 3 provides full covariate definitions).
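Knots at the 10th, 50th and 90th percentiles are the usual placement for a three-knot restricted cubic spline, so the sketch below assumes that form of dose–response term; the spline type is an assumption, as it is not named in this section. The basis follows Harrell's parameterization, normalized by the squared distance between the outer knots. Function names are hypothetical and this is not the code used for the analyses.

```python
def percentile(values, q):
    """Percentile with linear interpolation between order statistics."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def knots_10_50_90(values):
    return [percentile(values, q) for q in (10, 50, 90)]


def rcs_basis(x, knots):
    """Restricted cubic spline basis: [x, k - 2 nonlinear terms].

    The nonlinear terms are zero below the first knot and linear in x
    beyond the last knot, which constrains the tails of the curve.
    """
    t = knots
    k = len(t)
    norm = (t[-1] - t[0]) ** 2
    gap = t[k - 1] - t[k - 2]

    def cube_plus(u):
        return max(u, 0.0) ** 3

    terms = [x]
    for j in range(k - 2):
        term = (
            cube_plus(x - t[j])
            - cube_plus(x - t[k - 2]) * (t[k - 1] - t[j]) / gap
            + cube_plus(x - t[k - 1]) * (t[k - 2] - t[j]) / gap
        )
        terms.append(term / norm)
    return terms
```

With three knots the basis has one linear and one nonlinear column; under this assumed parameterization, a test of departure from linearity corresponds to testing whether the coefficient of the nonlinear column is zero.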
In the exercisers stratum of the UK Biobank accelerometry sample, we repeated the above multivariable-adjusted analyses for daily duration and frequency of (context-agnostic) VPA for bouts lasting up to 2 min, and we compared findings with the equivalent VILPA findings using overlay dose–response plots.
To assess the degree to which VILPA and VPA may contribute to mortality beyond the associations of overall movement volume, we also carried out a volume analysis based on energy expenditure using methods analogous to the study by Strain et al.18 We calculated physical activity energy expenditure for all VILPA and VPA bouts lasting up to 2 min.
To provide conservative point estimates we calculated the ‘minimal dose’, defined as VILPA volume/frequency associated with 50% of the optimal risk reduction37,38. We also present point estimates (HRs and 95% CI) associated with the median and maximum volume/frequency VILPA values. We calculated E-values to estimate the plausibility of bias from unmeasured confounding30,55.
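Both quantities can be illustrated numerically. The sketch below makes several assumptions that should be checked against the cited references: the dose–response curve is supplied as hypothetical (dose, HR) pairs with HR = 1 at the reference dose, ‘50% of the optimal risk reduction’ is computed on the HR scale, and the E-value uses the standard formula for a risk ratio, RR + sqrt(RR × (RR − 1)), applied to the HR (inverted when below 1), which is an approximation appropriate for rare outcomes.

```python
import math


def e_value(hr: float) -> float:
    """E-value for a point estimate; estimates below 1 are inverted first."""
    rr = 1.0 / hr if hr < 1.0 else hr
    return rr + math.sqrt(rr * (rr - 1.0))


def minimal_dose(curve):
    """Smallest dose reaching 50% of the optimal (largest) risk reduction.

    curve: list of (dose, hr) pairs sorted by dose. Doses between grid
    points are found by linear interpolation. Returns None if no segment
    of the curve crosses the target HR.
    """
    best_hr = min(hr for _, hr in curve)
    target = 1.0 - 0.5 * (1.0 - best_hr)
    for (d0, h0), (d1, h1) in zip(curve, curve[1:]):
        if h0 > target >= h1:
            return d0 + (d1 - d0) * (h0 - target) / (h0 - h1)
    return None


# Hypothetical curve: HR falls linearly from 1.0 at dose 0 to 0.5 at dose 4.
print(minimal_dose([(0.0, 1.0), (4.0, 0.5)]))  # 2.0
print(round(e_value(0.5), 2))                   # 3.41
```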
We conducted sensitivity analyses of VILPA with additional adjustment for body mass index. To investigate potential reverse causation bias we also excluded participants who had poor self-rated health. In another sensitivity analysis, we tested the influence of applying a conservative definition of ‘nonexercisers’ by restricting analyses to the 10,230 participants who reported no recreational walking and no leisure time exercise.
We performed all analyses using R statistical software v.4.2.1 with the rms package v.6.3.0 and the survival package v.3.3.1.
We reported this study as per the Strengthening the Reporting of Observational Studies in Epidemiology guidelines (Supplementary Table 10).
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.