Spirometry Measurements from NHANES 2007–2012

Description

Various spirometry measurements from the National Health and Nutrition Examination Survey (NHANES) 2007–2012 along with covariates providing demographics and basic body measurements.

Usage

data("SpirometryUS", package = "gamlss2")

Format

A data frame containing 16596 observations on 13 variables.

fvc
Numeric. Forced vital capacity (FVC) in liters, i.e., the volume of air that can forcibly be blown out after full inspiration.
fev1
Numeric. Forced expiratory volume in 1 second (FEV1) in liters, i.e., the volume of air that can forcibly be blown out in the first second, after full inspiration.
ratio
Numeric. Ratio of FEV1 to FVC.
pef
Numeric. Peak expiratory flow (PEF) in liters per second, i.e., the maximal flow (or speed) achieved during the maximally forced expiration initiated at full inspiration.
fef
Numeric. Forced expiratory flow (FEF) in liters per second, i.e., the flow (or speed) of air coming out of the lung during the middle portion (25% to 75%) of a forced expiration.
volume
Numeric. Extrapolated volume.
fet
Numeric. Forced expiratory time (FET) in seconds, i.e., the length of the expiration.
gender
Factor. Binary gender information with levels female and male.
age
Numeric. Age in years (rounded to quarters).
weight
Numeric. Body weight in kilograms.
height
Numeric. Body height in centimeters.
bmi
Numeric. Body mass index in kilograms per square meter, rounded to 2 decimal places.
ethnicity
Factor. Self-reported race and ethnicity information with levels white, black, mexican (Mexican American), hispanic (other Hispanic), and other (including multi-racial).

Details

In order to establish lung function reference equations, Zavorsky (2025) studies how three spirometry measurements (FVC, FEV1, and the FEV1/FVC ratio) depend on age, adjusting for height and weight and fitting separate models for females and males. He aims to show that a simple normal model with a (piecewise) linear mean equation and (piecewise) constant variance suffices for an adequate distributional fit, from which the 5% quantile can be obtained as the so-called lower limit of normal (LLN). However, his own comparison with GAMLSS, which uses flexible predictors for both mean and variance along with a Box-Cox-transformed normal distribution, shows that GAMLSS leads to a similar fit for the mean but a much better fit for the LLN.

Zavorsky’s (2025) analyses are based on a data set that he derived from the National Health and Nutrition Examination Survey (NHANES) in the United States 2007–2012. From all data available at https://wwwn.cdc.gov/nchs/nhanes/ he included those observations which met or exceeded the technical acceptability of the measurements for forced expiratory volume in 1 second (FEV1) and forced vital capacity (FVC). The data are described in a short communication published in the journal Data in Brief, and the accompanying spreadsheet in CSV format (comma-separated values) is available from Mendeley Data.

The data comprise observations from NHANES’ “Examination Data”, in particular from “Spirometry – Pre and Post-Bronchodilator” and “Body Measures”, plus the accompanying “Demographics Data”. See the variable descriptions above for more details. Basic information about spirometry can be found, for example, on Wikipedia at https://en.wikipedia.org/wiki/Spirometry.
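
Two of the variables described above are derived from others: ratio from fev1 and fvc, and bmi from weight and height. The following minimal sketch recomputes both and reports the largest deviation from the stored values. The helper names fev1_fvc_ratio() and body_mass_index() are hypothetical and not part of gamlss2, and whether the stored values agree exactly with the recomputed ones (e.g., due to rounding in the source data) is not guaranteed here.

```r
## Hypothetical helpers (not part of gamlss2) for the two derived variables.
## FEV1/FVC ratio: both inputs in liters.
fev1_fvc_ratio <- function(fev1, fvc) fev1 / fvc

## Body mass index: weight in kg, height in cm, rounded to 2 decimal places.
body_mass_index <- function(weight, height) round(weight / (height / 100)^2, 2)

## Compare with the stored variables, if the gamlss2 package is installed.
if (requireNamespace("gamlss2", quietly = TRUE)) {
  data("SpirometryUS", package = "gamlss2")
  ## Largest absolute deviations between stored and recomputed values.
  print(with(SpirometryUS, c(
    ratio = max(abs(ratio - fev1_fvc_ratio(fev1, fvc))),
    bmi   = max(abs(bmi - body_mass_index(weight, height)))
  )))
}
```

Deviations close to zero would indicate that the stored variables follow these definitions up to rounding.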

Source

Zavorsky GS (2024). “Refined NHANES 2007–2012 Spirometry Dataset for the Comparison of Segmented (Piecewise) Linear Models to That of GAMLSS”, Mendeley Data, V1. doi:10.17632/dwjykg3xww.1

References

Zavorsky GS (2024). “A Refined Spirometry Dataset for Comparing Segmented (Piecewise) Linear Models to that of GAMLSS”. Data in Brief, 57, 111062. doi:10.1016/j.dib.2024.111062

Zavorsky GS (2025). “Debunking the GAMLSS Myth: Simplicity Reigns in Pulmonary Function Diagnostics”. Respiratory Medicine, 236, 107836. doi:10.1016/j.rmed.2024.107836

Examples

library("gamlss2")

data("SpirometryUS", package = "gamlss2")
summary(SpirometryUS)
      fvc             fev1           ratio             pef        
 Min.   :0.704   Min.   :0.476   Min.   :0.2913   Min.   : 0.901  
 1st Qu.:2.815   1st Qu.:2.224   1st Qu.:0.7604   1st Qu.: 5.816  
 Median :3.630   Median :2.910   Median :0.8147   Median : 7.418  
 Mean   :3.676   Mean   :2.948   Mean   :0.8064   Mean   : 7.506  
 3rd Qu.:4.480   3rd Qu.:3.598   3rd Qu.:0.8620   3rd Qu.: 9.150  
 Max.   :9.361   Max.   :6.923   Max.   :1.0000   Max.   :19.024  
      fef            volume            fet            gender    
 Min.   :0.010   Min.   :  0.00   Min.   : 1.200   female:8303  
 1st Qu.:1.986   1st Qu.: 52.00   1st Qu.: 7.700   male  :8293  
 Median :2.832   Median : 69.00   Median : 9.000                
 Mean   :2.942   Mean   : 76.18   Mean   : 9.693                
 3rd Qu.:3.774   3rd Qu.: 93.00   3rd Qu.:11.500                
 Max.   :9.280   Max.   :321.00   Max.   :32.800                
      age            weight           height           bmi       
 Min.   : 6.00   Min.   : 16.40   Min.   :104.6   Min.   :12.50  
 1st Qu.:17.00   1st Qu.: 57.20   1st Qu.:155.9   1st Qu.:21.67  
 Median :34.00   Median : 72.30   Median :164.8   Median :26.06  
 Mean   :35.89   Mean   : 72.97   Mean   :162.9   Mean   :26.82  
 3rd Qu.:53.00   3rd Qu.: 88.20   3rd Qu.:173.2   3rd Qu.:30.90  
 Max.   :80.00   Max.   :218.20   Max.   :203.8   Max.   :84.87  
    ethnicity   
 white   :6607  
 black   :3598  
 mexican :3068  
 hispanic:1825  
 other   :1498
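
The lower limit of normal (LLN) described in the Details can be illustrated with a minimal sketch: under a normal model with constant variance, the LLN is the 5% quantile of the fitted distribution, i.e., the fitted mean plus qnorm(0.05) times the residual standard deviation. The helper lln_normal() is hypothetical and not part of gamlss2, and the model below (a plain linear model for FEV1 in males aged 25 and older) is an assumption chosen for illustration. It is deliberately simpler than the segmented models of Zavorsky (2025) and does not reproduce his analysis.

```r
## Hypothetical helper (not part of gamlss2): lower limit of normal (LLN)
## from a normal linear model with constant variance, fitted by lm().
lln_normal <- function(fit, newdata, p = 0.05) {
  mu <- predict(fit, newdata = newdata)  # fitted mean
  sigma <- summary(fit)$sigma            # residual standard deviation
  qnorm(p, mean = mu, sd = sigma)        # p-quantile of N(mu, sigma^2)
}

## Illustration on the spirometry data, if the gamlss2 package is installed.
if (requireNamespace("gamlss2", quietly = TRUE)) {
  data("SpirometryUS", package = "gamlss2")
  ## Assumption: restrict to adult males, where a linear age effect is plausible.
  d <- subset(SpirometryUS, gender == "male" & age >= 25)
  m <- lm(fev1 ~ age + height + weight, data = d)
  ## Hypothetical reference persons: 175 cm, 80 kg, at three ages.
  nd <- data.frame(age = c(30, 50, 70), height = 175, weight = 80)
  print(cbind(nd, mean = predict(m, newdata = nd), lln = lln_normal(m, nd)))
}
```

Because the variance is held constant, the LLN in this sketch is just a parallel shift of the mean, which is the restriction that flexible variance predictors in GAMLSS relax.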