There were sex and regional differences in physical activities
Our analysis included 1 382 284 physical activity geotagged tweets (80 million tweets collected) from 481 146 users (55.65% men and 44.35% women) in 2992 and 2932 counties, respectively, for men and women. We grouped our findings into four geographical regions in the USA: West, South, Northeast and Midwest.44 Sentiment towards exercise was distributed with some uniformity across the USA (online supplementary figure S2). Overall, men and women shared similar sentiments towards physical activity (sentiment scores 0.660 (95% CI 0.660 to 0.661) and 0.657 (95% CI 0.656 to 0.657), respectively).
The top exercise terms were ‘walk’, ‘dance’, ‘golf’, ‘workout’, ‘run’, ‘pool’, ‘hike’, ‘yoga’, ‘swim’ and ‘bowl’. Walking was the most popular physical activity for both groups across all regions. We note overall and sex-specific regional variations in preferred activities (figure 1). For women, hiking was the second most popular activity in the West, representing 15.18% (95% CI 15.01% to 15.35%) of tweets. This activity represented only 3.24% (95% CI 3.13% to 3.35%) to 3.79% (95% CI 3.71% to 3.87%) of tweets in the Midwest and South, respectively. Mentions of participation in yoga also varied by region for women, representing 6.97% (95% CI 6.83% to 7.12%) of tweets in the Northeast, but only 3.87% (95% CI 3.79% to 3.95%) of tweets in the South. We saw similar patterns for hiking among men, representing 12.23% (95% CI 12.10% to 12.36%) of tweets in the West but only 2.38% (95% CI 2.30% to 2.47%) to 4.31% (95% CI 4.20% to 4.42%) of tweets elsewhere. Golf also varied in popularity among men, representing 12.29% (95% CI 12.11% to 12.48%) of tweets in the Midwest but only 7.85% (95% CI 7.72% to 8.00%) of tweets in the Northeast.
Figure 1. The 10 most frequently mentioned activities and the proportion of tweets represented by sex and region.
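The activity shares above are proportions of tweets with 95% CIs. As a minimal sketch of how such an interval can be computed, the snippet below uses the normal-approximation (Wald) interval. This is an assumption for illustration: the text does not state which interval method was used, and the function name `proportion_ci` and the counts in the usage line are hypothetical.

```python
import math


def proportion_ci(successes, total, z=1.96):
    """Return (proportion, lower, upper) using a Wald interval.

    Assumption: a normal-approximation interval with z = 1.96 for 95%
    coverage. The source does not specify the interval method.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= successes <= total:
        raise ValueError("successes must lie between 0 and total")
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    # Clamp to [0, 1] so the interval stays a valid proportion range.
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical counts, not taken from the study data.
share, lower, upper = proportion_ci(1518, 10000)
print(f"{share:.2%} (95% CI {lower:.2%} to {upper:.2%})")
```

With tweet counts in the tens of thousands per region, an interval of this kind becomes narrow, which is consistent with the tight CIs reported here.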
Men and women mentioned performing gym-based activities at approximately equivalent rates (4.68% (95% CI 4.63% to 4.72%) and 4.13% (95% CI 4.08% to 4.18%) of tweets, respectively). CrossFit was the most popular workout class among men (14.91% (95% CI 14.52% to 15.31%) of gym-based tweets), whereas yoga was the most popular workout class among women (26.66% (95% CI 26.03% to 27.19%) of gym-based tweets). However, there were some differences, although not significant, in the estimated intensity of exercises reported by men and women as measured in calories burned (figure 2). The average number of reported calories burned per 30 min of reported exercise was 201.27 (95% CI 201.01 to 201.54) for men, and 191.66 (95% CI 191.37 to 191.95) for women. There were also regional variations in reported exercise intensity within sex. Women in the West reported exercises with the highest average caloric expenditure (ie, 194.78 (95% CI 194.25 to 195.31)), followed by the Northeast (193.26 (95% CI 192.60 to 193.92)), the Midwest (192.62 (95% CI 191.92 to 193.32)) and the South (187.96 (95% CI 187.48 to 188.44)). In contrast, men in the Midwest reported exercises with the highest caloric expenditure (202.71 (95% CI 202.07 to 203.36)), followed by the South (202.58 (95% CI 202.13 to 203.04)), the West (200.34 (95% CI 199.86 to 200.83)) and the Northeast (198.93 (95% CI 198.32 to 199.54)). The largest sex disparities were noted in counties within Southern states; the average difference between men and women was 8.51 calories per 30 min of activity.
Figure 2. Comparisons of self-reported calories burned, estimated from the physical activities mentioned by men and women on social media, by state and region. Sex-based disparities are on average larger in the South.
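The calorie figures above are group means with 95% CIs. The following is a minimal sketch of one way to obtain a mean and its interval from per-tweet calorie estimates, assuming a normal-approximation interval based on the sample standard error. The function name `mean_ci` and the sample values are hypothetical, and the text does not state how the intervals were actually derived.

```python
import math
import statistics


def mean_ci(values, z=1.96):
    """Return (mean, lower, upper) for a sample of numeric values.

    Assumption: normal-approximation interval, mean +/- z * SE, where
    SE is the sample standard deviation divided by sqrt(n).
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to estimate spread")
    mean = statistics.fmean(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return mean, mean - z * standard_error, mean + z * standard_error


# Hypothetical calories burned per 30 min for a handful of tweets.
sample = [150.0, 180.0, 210.0, 240.0, 190.0]
mean, lower, upper = mean_ci(sample)
print(f"{mean:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Applied separately to each sex and region, a routine like this yields group means whose difference can then be compared across regions.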
Overall, counties that reported higher levels of physical activity on Twitter also had lower physical inactivity prevalence
The proportion of exercise tweets in a county was negatively associated with leisure-time physical inactivity prevalence for both men and women across regions (see figure 3). These correlations were strongest in the Northeast (r=−0.234 and −0.373 for men and women, respectively) and in the West (r=−0.217 and −0.267 for men and women, respectively). The national association between tweet sentiment and physical inactivity was similar for both men and women (r=−0.115 for men and r=−0.116 for women), but regional disparities exist. This relationship was stronger for men in the West (r=−0.194 for men and r=−0.076 for women) and stronger for women in the Northeast (r=−0.271 for women and r=−0.063 for men). There was a weak negative association between exercise intensity and physical inactivity for both groups (r=−0.061 for men and −0.001 for women), but stratified by region, this effect was much stronger for men in the West (r=−0.203) and the Midwest (r=−0.123). The association between each Twitter variable and inactivity prevalence by sex and region can be found in online supplementary table S3.
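The r values above are correlations computed within each region and sex. As an illustrative sketch, assuming they are Pearson coefficients over county-level pairs (the text reports r without naming the coefficient), the stratified calculation could look like the following. The helper names and the toy rows are hypothetical and are not study data.

```python
import math
from collections import defaultdict


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def r_by_group(rows):
    """Compute r per (region, sex) stratum.

    rows: iterable of (region, sex, tweet_share, inactivity_prevalence),
    one row per county. This layout is an assumption for illustration.
    """
    groups = defaultdict(lambda: ([], []))
    for region, sex, tweet_share, inactivity in rows:
        xs, ys = groups[(region, sex)]
        xs.append(tweet_share)
        ys.append(inactivity)
    return {key: pearson_r(xs, ys) for key, (xs, ys) in groups.items()}


# Toy county rows, invented for the example.
rows = [
    ("West", "women", 0.02, 21.0),
    ("West", "women", 0.05, 18.5),
    ("West", "women", 0.03, 22.0),
    ("West", "women", 0.07, 17.0),
]
print(r_by_group(rows))
```

Stratifying before correlating is what allows regional relationships to differ from the national one, as described for sentiment and exercise intensity.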
Figure 3. The relationship between model-estimated and Centers for Disease Control and Prevention-forecasted inactivity prevalence based on mixed-effects linear models that control for measures of physical activity via Twitter, demographic variables and built environment contextual variables. Lines represent a linear fit.
The association between Twitter variables and inactivity remained in models that accounted for demographic, socioeconomic and built environment variables associated with physical inactivity (tables 1 and 2). This relationship was statistically significant for all regions for men, and all regions except the Midwest for women. Also, counties with more positive sentiment towards physical activity had lower inactivity prevalence in the West for both men and women, and in the Midwest for women. Furthermore, counties that reported high-intensity exercises on Twitter also had lower inactivity prevalence for men in the Midwest and the Northeast. There was no significant relationship between exercise intensity and physical inactivity prevalence for women.
We also observed different patterns in the association between Google searches for fitness centres and weight loss and physical inactivity prevalence in the two demographic groups. Specifically, counties in the Northeast with more searches for ‘fitness centres’ also had lower physical inactivity for women, while counties in the Northeast and South with more searches for weight loss had higher inactivity for women. Among men, counties with more searches for fitness centres had lower inactivity prevalence, while counties with more weight loss searches had higher inactivity prevalence in the Midwest. The directionality of these relationships suggests that populations seeking weight loss information online tend to have higher physical inactivity prevalence, while those seeking information on fitness centres are more likely to be active.
Notable disparities exist in the 2013 and 2015 (forecasted) prevalence of physical inactivity between men and women, with 79.7% and 78.4% of counties showing higher prevalence for women, respectively (figure 43). Our estimates of physical inactivity prevalence using physical activity tweet volume and sentiment, and Google search volume, while controlling for county demographics, and access to exercise space, are overall reflective of the disparities reported in CDC physical inactivity prevalence estimates. Overall, out-of-sample estimates are better for women than for men (average r=0.89 for women, average r=0.82 for men). Correlations between estimated and actual values were higher in the South for both men and women (r=0.79 and r=0.82, respectively).