Musculoskeletal disorders (MSDs), which encompass a wide variety of bone, soft tissue, and joint abnormalities, are a major healthcare challenge around the world. MSDs are typically diagnosed using radiographs; however, variations in diagnostic interpretation quality can often lead to diagnostic errors. This problem is often compounded by a lack of available tools to triage large volumes of unread examinations, which can result in numerous adverse downstream effects related to delay of diagnosis and treatment.
The recent revolution in deep learning techniques for image analysis suggests that convolutional neural networks (CNNs) can serve as an effective tool for computer-aided detection of radiograph abnormalities. To aid computational models in accurately identifying diverse abnormalities in highly variable radiographs of multiple body parts, we are releasing LERA (Lower Extremity RAdiographs). This dataset was used as the held-out test set in our recent study, which found that a single pre-trained CNN was effective in performing generalized abnormality detection in lower extremities [citation after publication].
Dataset Details: In this retrospective, HIPAA-compliant, IRB-approved study, we collected data from 182 patients who underwent a radiographic examination at the Stanford University Medical Center between 2003 and 2014. The dataset consists of images of the foot, knee, ankle, or hip associated with each patient.
Assignment of Labels: Our dataset includes a .csv file matching patient identification numbers to diagnosis labels and radiograph types. Diagnosis labels were assigned as follows. After prospective evaluation of all radiographs associated with a patient, the attending radiologist at the time of initial interpretation assigned each patient a binary classification of normal (y=0) or abnormal (y=1). A radiograph is designated normal when the attending radiologist interpreted it as normal given the age of the patient; all radiographs that fall outside this categorization are designated abnormal (a category that covers findings as varied as degeneration, hardware, arthritis, and fractures, among others). Because of these loose constraints, and because ground truth in radiographic examinations can be difficult to establish, we expected the dataset to contain a small percentage of incorrect labels. To correct for this, two board-certified radiologists, each with 6 years of post-graduate experience, independently labeled the images in this dataset, and final labels were determined by majority vote among the two radiologists and the prospective exam report. As a result, we are confident that the dataset is highly accurate and will serve as a suitable resource for testing deep learning models.
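The majority-vote step described above can be illustrated with a minimal sketch. It assumes each patient has exactly three binary votes (one per radiologist plus the prospective exam report); the patient identifiers and vote values below are invented for illustration and are not taken from the dataset.

```python
from collections import Counter


def majority_label(votes):
    """Return the binary label (0 = normal, 1 = abnormal) chosen by most voters.

    An odd number of binary votes is required so that a tie cannot occur.
    """
    if len(votes) % 2 == 0:
        raise ValueError("an odd number of votes is required to avoid ties")
    if any(v not in (0, 1) for v in votes):
        raise ValueError("votes must be 0 (normal) or 1 (abnormal)")
    label, _count = Counter(votes).most_common(1)[0]
    return label


# Hypothetical votes: (radiologist A, radiologist B, prospective exam report).
example_votes = {
    "patient_001": (1, 1, 0),
    "patient_002": (0, 0, 0),
    "patient_003": (0, 1, 1),
}

consensus = {pid: majority_label(v) for pid, v in example_votes.items()}
```

With three voters, a label is final as soon as any two sources agree, so a single dissenting read cannot change the outcome.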
Additional Information: Please note that all labels are assigned at the patient level, indicating that the same classification applies to all images for a particular patient. Also, since images were collected over a twelve-year period, the dataset includes highly variable images, with radiographs varying in size, resolution, and color; there may also be duplicate images.
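Because labels are assigned per patient and duplicate images may be present, a loading script typically needs to copy each patient's label to every image for that patient and to flag identical files. The following is a rough sketch under stated assumptions, not the dataset's official tooling: the column names (patient_id, label, radiograph_type), the file names, and the image-to-patient mapping are all hypothetical, and the actual .csv layout should be checked before use. Duplicates are detected here only as byte-identical files; re-encoded or resized copies would not be caught.

```python
import csv
import hashlib
import io
from collections import defaultdict


def load_patient_labels(csv_file, id_col="patient_id", label_col="label"):
    """Read patient-level labels from a CSV file object.

    Column names are assumptions; adjust them to match the real file.
    """
    reader = csv.DictReader(csv_file)
    return {row[id_col]: int(row[label_col]) for row in reader}


def label_images(image_to_patient, patient_labels):
    """Give every image the label of the patient it belongs to."""
    return {img: patient_labels[pid] for img, pid in image_to_patient.items()}


def find_exact_duplicates(image_bytes):
    """Group image names whose raw bytes are identical (SHA-256 match)."""
    groups = defaultdict(list)
    for name, data in image_bytes.items():
        groups[hashlib.sha256(data).hexdigest()].append(name)
    return sorted(sorted(names) for names in groups.values() if len(names) > 1)


# Toy stand-ins for the real files (all values are invented).
toy_csv = io.StringIO(
    "patient_id,label,radiograph_type\n"
    "p1,1,knee\n"
    "p2,0,foot\n"
)
toy_image_to_patient = {"a.png": "p1", "b.png": "p1", "c.png": "p2"}
toy_image_bytes = {"a.png": b"\x01\x02", "b.png": b"\x01\x02", "c.png": b"\x03"}

patient_labels = load_patient_labels(toy_csv)
image_labels = label_images(toy_image_to_patient, patient_labels)
duplicates = find_exact_duplicates(toy_image_bytes)
```

When splitting data for evaluation, grouping by patient rather than by image keeps all images from one patient, including any duplicates, on the same side of the split.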
This study was supported by the Stanford Center for Artificial Intelligence in Medicine and Imaging (AIMI). The research reported in this publication was supported by the National Library of Medicine of the National Institutes of Health under Award Number R01LM012966 and Stanford Child Health Research Institute (Stanford NIH-NCATS-CTSA Grant #UL1 TR001085). This research used data or services provided by STARR, “STAnford medicine Research data Repository,” a clinical data warehouse made possible by the Stanford School of Medicine Research Office.
Downloading the Dataset
Please read the Stanford University School of Medicine LERA- Lower Extremity RAdiographs Dataset Research Use Agreement. Once you register to download the LERA- Lower Extremity RAdiographs dataset, you will receive a link to the download over email. Note that you may not share the link to download the dataset with others.
Stanford University School of Medicine LERA- Lower Extremity RAdiographs Dataset Research Use Agreement
1. Permission is granted to view and use the LERA- Lower Extremity RAdiographs Dataset without charge for personal, non-commercial research purposes only. Any commercial use, sale, or other monetization is prohibited.
2. Other than the rights granted herein, the Stanford University School of Medicine (“School of Medicine”) retains all rights, title, and interest in the LERA- Lower Extremity RAdiographs Dataset.
3. You may make a verbatim copy of the LERA- Lower Extremity RAdiographs Dataset for personal, non-commercial research use as permitted in this Research Use Agreement. If another user within your organization wishes to use the LERA- Lower Extremity RAdiographs Dataset, they must register as an individual user and comply with all the terms of this Research Use Agreement.
4. YOU MAY NOT DISTRIBUTE, PUBLISH, OR REPRODUCE A COPY of any portion or all of the LERA- Lower Extremity RAdiographs Dataset to others without specific prior written permission from the School of Medicine.
5. YOU MAY NOT SHARE THE DOWNLOAD LINK to the LERA- Lower Extremity RAdiographs dataset to others. If another user within your organization wishes to use the LERA- Lower Extremity RAdiographs Dataset, they must register as an individual user and comply with all the terms of this Research Use Agreement.
6. You must not modify, reverse engineer, decompile, or create derivative works from the LERA- Lower Extremity RAdiographs Dataset. You must not remove or alter any copyright or other proprietary notices in the LERA- Lower Extremity RAdiographs Dataset.
7. The LERA- Lower Extremity RAdiographs Dataset has not been reviewed or approved by the Food and Drug Administration, and is for non-clinical, Research Use Only. In no event shall data or images generated through the use of the LERA- Lower Extremity RAdiographs Dataset be used or relied upon in the diagnosis or provision of patient care.
8. THE LERA- Lower Extremity RAdiographs DATASET IS PROVIDED "AS IS," AND STANFORD UNIVERSITY AND ITS COLLABORATORS DO NOT MAKE ANY WARRANTY, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, NOR DO THEY ASSUME ANY LIABILITY OR RESPONSIBILITY FOR THE USE OF THIS LERA- Lower Extremity RAdiographs DATASET.
9. You will not make any attempt to re-identify any of the individual data subjects. Re-identification of individuals is strictly prohibited. Any re-identification of any individual data subject shall be immediately reported to the School of Medicine.
10. Any violation of this Research Use Agreement or other impermissible use shall be grounds for immediate termination of use of this LERA- Lower Extremity RAdiographs Dataset. In the event that the School of Medicine determines that the recipient has violated this Research Use Agreement or other impermissible use has been made, the School of Medicine may direct that the undersigned data recipient immediately return all copies of the LERA- Lower Extremity RAdiographs Dataset and retain no copies thereof even if you did not cause the violation or impermissible use.
In consideration for your agreement to the terms and conditions contained here, Stanford grants you permission to view and use the LERA- Lower Extremity RAdiographs Dataset for personal, non-commercial research. You may not otherwise copy, reproduce, retransmit, distribute, publish, commercially exploit or otherwise transfer any material.
Limitation of Use
You may use LERA- Lower Extremity RAdiographs Dataset for legal purposes only.
You agree to indemnify and hold Stanford harmless from any claims, losses or damages, including legal fees, arising out of or resulting from your use of the LERA- Lower Extremity RAdiographs Dataset or your violation or role in violation of these Terms. You agree to fully cooperate in Stanford’s defense against any such claims. These Terms shall be governed by and interpreted in accordance with the laws of California.