Introduction: Studies have shown considerable variability in how readers interpret chest radiographs. Besides its clinical implications, such disagreement can affect the validity of inferences drawn from health studies. The factors contributing to this variability are not well understood. This study addressed three research questions. First, which worker-specific or test factors are significantly associated with differences in reader interpretations in a chest radiographic study? Second, does a change of one reader midway through a longitudinal chest radiographic study have a significant impact on study findings? Third, what were the readers' performance and agreement over this 20-year study?
Methods: Chest X-rays (n=6,392) were collected every three years between 1987 and 2008 from workers (n=1,570) participating in a longitudinal study assessing occupational exposure to Refractory Ceramic Fibers (RCF). RCF exposure has been shown to be associated with an elevated rate of pleural and interstitial changes. All films had personal identifiers masked and were interpreted by three B-readers. Workers with two chest X-rays between 1987 and 1997 (n=1,084) were used to assess associations between the probability of agreement among the three readers and several worker-specific and test factors. Worker-specific factors included age, body mass index (BMI), pack-years of cigarette smoking, cumulative RCF exposure, and asbestos exposure; the test factors evaluated were film quality and radiologist experience measured in years. The effect of replacing one of the three readers was investigated by having the new reader re-evaluate the films (n=193) that had the potential to modify a worker's positive or negative longitudinal radiographic finding. Reader performance was assessed by calculating inter-reader agreement (kappa), sensitivity, specificity, and positive predictive value (PPV).
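The performance measures named above can be illustrated with a minimal Python sketch. It rests on assumptions not stated in the abstract: Fleiss' kappa is used as one common multi-reader agreement statistic (the study's exact kappa variant is not specified), the reference standard behind sensitivity, specificity, and PPV is left to the caller, and the function names `fleiss_kappa` and `diagnostic_metrics` are hypothetical.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of films, each rated once by every reader.

    `ratings` is a list of rows; each row holds one category label per
    reader (for example 1 = change present, 0 = absent). Every row must
    have the same number of readers. Assumption: the abstract does not
    say which kappa statistic the study used.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    categories = sorted({label for row in ratings for label in row})

    # counts[i][j] = number of readers who assigned film i to category j
    counts = [[Counter(row)[c] for c in categories] for row in ratings]

    # Observed agreement for each film, then averaged over films
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_items

    # Agreement expected by chance, from the overall category proportions
    p_j = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(len(categories))
    ]
    p_e = sum(p * p for p in p_j)

    if p_e == 1.0:
        # Every rating fell in a single category; kappa is undefined
        return float("nan")
    return (p_bar - p_e) / (1.0 - p_e)


def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, and PPV from a 2x2 table.

    The counts compare one reader against a reference standard chosen
    by the caller (hypothetical here; the abstract does not define it).
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }
```

With few positive films, `tn` dominates the table, so specificity stays close to 1 even when PPV is modest; the illustrative inputs one would pass to these functions are not study data.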
Results: Reader disagreement was significantly associated with increases in worker age, pack-years of cigarette smoking, and cumulative RCF exposure. Agreement between readers was significantly associated with increases in reader experience. Film quality, BMI, and asbestos exposure were not significant in any model. The change in readers altered the longitudinal positive or negative determination for three workers but did not significantly affect the findings of the original study evaluating the association between occupational RCF exposure and radiographic changes. There was moderate agreement among the readers regarding pleural changes (kappa, k=0.58) and 1/0 or greater interstitial changes (k=0.43). Agreement was fair (k=0.29) for 0/1 or greater interstitial changes. Sensitivity was 78% for pleural changes and 69% for 0/1 or greater interstitial changes. Specificity for both pleural and interstitial findings was very high (99%), as expected for a study with few positive cases. PPV was 68% for pleural changes, 47% for 0/1 or greater interstitial changes, and 59% for 1/0 or greater interstitial changes.
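The verbal labels attached to the kappa values above ("moderate", "fair") are consistent with the widely used Landis and Koch bands. Whether the authors applied that exact scale is an assumption; the short Python sketch below, with a hypothetical helper name, only shows how such a mapping works.

```python
def landis_koch_label(kappa):
    """Map a kappa value to the Landis and Koch descriptive band.

    Assumption: the abstract does not state which interpretive scale
    was used; these bands simply reproduce the labels it reports.
    """
    if kappa < 0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")


# Kappa values reported in the Results
for name, k in [("pleural", 0.58), ("interstitial 1/0+", 0.43), ("interstitial 0/1+", 0.29)]:
    print(name, k, landis_koch_label(k))
```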
Conclusions: This study demonstrates that factors associated with radiographic changes, such as cigarette smoking, occupational exposures, and age, can increase disagreement among readers. Any research investigating subtle radiographic findings, such as 0/1 interstitial changes, should also anticipate disagreement among radiologists. Researchers are advised to use at least three experienced readers when conducting radiographic studies in order to minimize measurement error and increase precision.