MRA-MIDAS: Multimodal Image Dataset for AI-based Skin Cancer
Dataset Description
We introduce the Melanoma Research Alliance Multimodal Image Dataset for AI-based Skin Cancer (MRA-MIDAS), the first publicly available, prospectively recruited, systematically paired dermoscopic and clinical image dataset spanning a range of skin-lesion diagnoses. The dataset covers a wide array of skin lesions and includes well-annotated, patient-level clinical metadata. It aims to mirror real-world clinical scenarios more accurately than retrospectively curated datasets, and extensive histopathologic confirmation supports the integrity of its labels. This research was approved by the Institutional Review Boards at Stanford University (IRB#36050) and the Cleveland Clinic Foundation (IRB#20-666) and adhered to the Declaration of Helsinki.
Patients presenting to the dermatology clinics of participating dermatologists at Stanford Medicine or Cleveland Clinic Foundation between August 18, 2020, and April 17, 2023, were eligible for the study if 1) they had at least one solitary skin lesion of concern for which the dermatologist investigator deemed a skin biopsy medically necessary, or 2) they were directed to in-clinic evaluation for a lesion previously identified as concerning through a teledermatology encounter or through dermatologist review of a patient photo submitted via the electronic patient messaging portal.
Patients provided written informed consent with either the physician or a research coordinator, after which both clinical and dermoscopic digital photographs were obtained of any eligible skin lesions. Each lesion underwent standardized photography with a contemporary model iPhone or iPad device (iPhone SE to iPhone 12 Pro and iPod touch to iPad mini), without flash, at 15-cm and 30-cm distances, along with digital dermatoscope photography.
For each lesion, clinical information about the patient was recorded, including sex assigned at birth, age, Fitzpatrick skin type, personal history of melanoma, anatomic location, and the lesion’s length and width. Investigators had the discretion to enroll additional control lesions that appeared clinically benign and were located on a corresponding contralateral body site. These lesions were photographed in the same way and included in the dataset as un-biopsied controls, though model analysis was restricted to biopsied lesions. This dataset contains images obtained from patients at Stanford who provided consent for public release of their images and represents the near entirety of cases enrolled at this site.
At the time of first enrollment, the Stanford dermatologists at the specialized pigmented lesion and melanoma clinics had an average of 15.7 years of post-residency experience, while those in general medical dermatology clinics had an average of 3.9 years. Dermatologists recorded their top five ranked clinical impressions at the time of evaluation, along with a binary level of confidence (Yes/No) in their top impression.
For biopsied lesions, the associated final histopathologic diagnoses were recorded and categorized into a previously described taxonomy. Biopsy results were interpreted by three board-certified dermatopathologists at Stanford. A dermatopathology consensus conference reviewed any diagnosis of severely dysplastic melanocytic nevus or worse. Melanocytic lesions were grouped as follows: benign melanocytic nevi; melanomas (including melanoma in-situ and invasive melanoma); and surgically eligible intermediate melanocytic tumors for which complete excision is typically recommended (including severely dysplastic melanocytic nevi and melanocytomas such as typical/atypical Spitz tumors, BAP-1-inactivated melanocytic tumors, deep penetrating nevi/tumors, and cellular blue nevi with atypia).
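The three-way melanocytic grouping described above can be sketched as a simple lookup. This is only an illustration under stated assumptions: the diagnosis strings, the group names, and the function melanocytic_group are hypothetical and are not taken from the released metadata, which may label diagnoses differently.

```python
from typing import Optional


def _normalize(label: str) -> str:
    """Lowercase, treat hyphens as spaces, and collapse whitespace."""
    return " ".join(label.lower().replace("-", " ").split())


# Hypothetical diagnosis strings (assumption); adapt to the actual metadata values.
_RAW_GROUPS = {
    "benign melanocytic nevus": "benign_melanocytic_nevus",
    "melanoma in-situ": "melanoma",
    "invasive melanoma": "melanoma",
    "severely dysplastic melanocytic nevus": "intermediate_melanocytic_tumor",
    "typical Spitz tumor": "intermediate_melanocytic_tumor",
    "atypical Spitz tumor": "intermediate_melanocytic_tumor",
    "BAP-1-inactivated melanocytic tumor": "intermediate_melanocytic_tumor",
    "deep penetrating nevus": "intermediate_melanocytic_tumor",
    "deep penetrating tumor": "intermediate_melanocytic_tumor",
    "cellular blue nevus with atypia": "intermediate_melanocytic_tumor",
}

# Normalize the keys once so lookups tolerate case and hyphenation differences.
MELANOCYTIC_GROUPS = {_normalize(k): v for k, v in _RAW_GROUPS.items()}


def melanocytic_group(diagnosis: str) -> Optional[str]:
    """Return the melanocytic group for a diagnosis, or None if it is not listed."""
    return MELANOCYTIC_GROUPS.get(_normalize(diagnosis))
```

For example, melanocytic_group("Melanoma in situ") and melanocytic_group("invasive melanoma") both map to the same melanoma group, while a diagnosis outside the lookup table returns None and would need to be handled by the broader taxonomy.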
Cases were included in the dataset if a second, independent board-certified dermatologist reviewing the associated images agreed with the favored diagnosis.
Funding: This project is based on research supported by the Melanoma Research Alliance (MRA)-L’Oreal Dermatological Beauty Brands Team Science Award, along with philanthropic funding from the David Mair and Vanessa Vu-Mair Artificial Intelligence in Skin Cancer Fund and the Tal & Cinthia Simon Melanoma Research Fund at Stanford Medicine.
Acknowledgments: This material is the result of work supported with resources and the use of facilities at the Veterans Affairs Palo Alto Health Care System in Palo Alto, California.