Open Access to Medical Imaging Dataset Could Advance Computer-Aided Detection

BETHESDA, Md., July 23, 2018 — Researchers announce the open availability of the largest CT lesion-image database accessible to the public. DeepLesion, created by a team from the National Institutes of Health (NIH) Clinical Center, could help foster the development of deep-learning approaches for computer-aided detection (CADe) and diagnosis (CADx).

DeepLesion was developed by mining historical medical data from the NIH PACS (picture archiving and communication system), using the annotations, i.e., bookmarks, of clinically meaningful findings in medical images from the archive. The characteristics of the bookmarks were analyzed, harvested, and sorted to create the database.

The ground-truth lesion and two enlarged lymph nodes are correctly detected, even though the lymph nodes are not annotated in the dataset. Courtesy of SPIE.

In addition to building the database, the team developed a universal lesion detector based on it. This detector could serve as an initial screening tool for radiologists or for other specialist CADe systems in the future.

With more than 32,000 annotated lesions from more than 10,000 case studies, DeepLesion is now the largest publicly available CT lesion-image dataset. It contains multiple lesion types, including kidney lesions, bone lesions, lung nodules, and enlarged lymph nodes.

“We hope the dataset will benefit the medical imaging area just as ImageNet benefited the computer vision area,” said researcher Ke Yan.

Beyond lesion detection, the DeepLesion database could be used to classify lesions, retrieve lesions based on query strings, or predict lesion growth in new cases based on existing patterns in the database.

Future work will include extending the database to other image modalities, incorporating data from multiple hospitals, and improving the accuracy of the detector algorithm.

The database can be downloaded at

The research was published in the SPIE Digital Library (doi:10.1117/1.JMI.5.3.036501).

Published: July 2018