Imaging is a core activity for researchers in the service of basic biomedical science, for technicians in hospital labs, and for a host of doctors, nurses, administrators and academicians. It is used for every imaginable biomedical purpose, from the study of cells and genes to the search for the mechanisms of cancer and other diseases in the lab and for their presence in patients.
And it has a tendency to expand data storage and manipulation needs faster than most labs can handle.
Whatever imaging modality users rely on from day to day, their needs are similar but not identical. For example, a typical, if thorough, genetic analysis using a multiwell-plate imager can acquire up to 1 million images at 2.5 MB each, or 2.5 terabytes in all, during a single weeklong run.
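The arithmetic behind that figure is easy to check, and to extend when sizing a storage system. The short Python sketch below reproduces the 2.5-terabyte estimate using decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes). The function names and the figure of 20 runs per year are illustrative assumptions, not numbers reported by any lab.

```python
BYTES_PER_MB = 10**6   # decimal megabyte
BYTES_PER_TB = 10**12  # decimal terabyte


def run_size_bytes(image_count: int, image_size_mb: float) -> int:
    """Total bytes produced by one acquisition run."""
    return round(image_count * image_size_mb * BYTES_PER_MB)


def to_terabytes(size_bytes: int) -> float:
    """Convert a byte count to decimal terabytes."""
    return size_bytes / BYTES_PER_TB


# The weeklong run described above: 1 million images at 2.5 MB each.
weekly_run = run_size_bytes(1_000_000, 2.5)
print(to_terabytes(weekly_run))  # 2.5

# Hypothetical planning figure: 20 such runs per year (an assumption).
yearly_tb = to_terabytes(20 * weekly_run)
print(yearly_tb)  # 50.0
```

Changing the image count, image size or number of runs gives a quick first estimate of how fast a given instrument might fill a storage system.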
To get data in the format in which you need it, you can use the software included with your microscope or other imaging instrument, tie it to the digital storage system using the network supplied by your workplace and hope that your data is easily retrieved, simple to sort through and extremely hard to lose.
Storing a single image costs a tenth of a cent or less, far less than it did a decade ago. The low cost of storage media makes keeping gigabytes, or even terabytes, of information an inexpensive prospect. Other hardware is less expensive and easier to use than it was just a few years ago, meaning that data processing and networking between computers, whether across the room or across oceans, are faster and safer than ever before. It's not that simple, though; there are other issues to consider when establishing or scaling up a storage system for images and their associated content.
Anyone who acquires or needs access to bioimages also must stay on top of security issues. For example, patients' medical images, including x-rays, MRIs and PET scans, must be accessible at once, manipulable, analyzable and safe from prying eyes.
"Data storage is just part of the solution, where rapid data access architecture and availability and security are critical to the process," said Sudip Bhattacharjee, associate professor at the University of Connecticut's School of Business in Storrs. Several companies are jumping into the fray, creating novel data storage hardware and software systems. "However," Bhattacharjee added, "if designed and implemented badly, lawsuits will be the bane of such systems."
When the first badly designed large-scale system is hit with a lawsuit because of a security breach, the medical industry will take a long time to recover, Bhattacharjee said. "Hence, it is critical to build the system with a lot of thought and understanding."
The metadata associated with every acquired image also is essential to good research practices. Capturing it involves recording who created and uploaded the image; the subject's or patient's identity; the time and date of acquisition; the imaging equipment and lighting used (such as type, brand and manufacturer) and their settings; environmental conditions such as temperature and air pressure; and postproduction changes such as colorizing.
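As a rough illustration of what such a record might look like in software, the Python sketch below gathers the items listed above into a single structure that can be serialized to JSON. The class name, field names and sample values are all hypothetical; they do not follow any published metadata standard.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass
class AcquisitionMetadata:
    """Hypothetical record of the metadata captured with one image."""
    created_by: str
    uploaded_by: str
    subject_id: str
    acquired_at: datetime
    equipment_type: str
    equipment_brand: str
    equipment_manufacturer: str
    equipment_settings: Dict[str, str]
    lighting: str
    temperature_c: Optional[float] = None
    air_pressure_kpa: Optional[float] = None
    postproduction: List[str] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize the record, storing the timestamp as ISO 8601 text."""
        record = asdict(self)
        record["acquired_at"] = self.acquired_at.isoformat()
        return json.dumps(record, sort_keys=True)


# Placeholder values for illustration only.
example = AcquisitionMetadata(
    created_by="example_technician",
    uploaded_by="example_technician",
    subject_id="subject-0001",
    acquired_at=datetime(2011, 1, 15, 9, 30, tzinfo=timezone.utc),
    equipment_type="multiwell-plate imager",
    equipment_brand="ExampleBrand",
    equipment_manufacturer="Example Manufacturer",
    equipment_settings={"objective": "20x", "exposure_ms": "50"},
    lighting="LED",
    temperature_c=22.0,
    postproduction=["colorized"],
)
print(example.to_json())
```

Making the who, when and what fields mandatory while leaving environmental conditions optional is a design choice of this sketch; a real system would decide that according to its own research or clinical requirements.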
The metadata must be captured as quickly as possible and stored in such a way that it is permanently linked to the actual image, even if the two are stored in separate files. Keeping both image and metadata in the same file would be best; keeping metadata in a separate file risks losing the link between it and the original image, which would strip the image of almost all its value. Whether separated or not, an image and its metadata should be tracked in intelligent databases.
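One common way to keep a separate metadata file tied to its image is to record a cryptographic hash of the image inside the metadata file. The Python sketch below illustrates that idea under stated assumptions: the ".meta.json" sidecar naming, the function names and the placeholder file contents are all hypothetical, and the approach is not taken from any of the projects described in this article.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large images need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_sidecar(image_path: Path, metadata: dict) -> Path:
    """Write a JSON sidecar that records the image's hash with its metadata."""
    sidecar = image_path.with_name(image_path.name + ".meta.json")
    payload = {
        "image_name": image_path.name,
        "image_sha256": sha256_of(image_path),
        "metadata": metadata,
    }
    sidecar.write_text(json.dumps(payload, indent=2, sort_keys=True),
                       encoding="utf-8")
    return sidecar


def sidecar_matches(image_path: Path, sidecar_path: Path) -> bool:
    """True if the sidecar was written for exactly these image bytes."""
    payload = json.loads(sidecar_path.read_text(encoding="utf-8"))
    return payload["image_sha256"] == sha256_of(image_path)


with tempfile.TemporaryDirectory() as tmp:
    image = Path(tmp) / "plate_001.tif"
    image.write_bytes(b"placeholder pixel data")  # stand-in, not a real TIFF
    sidecar = write_sidecar(image, {"created_by": "example_user"})
    intact = sidecar_matches(image, sidecar)

    moved = Path(tmp) / "renamed.tif"
    image.rename(moved)  # renaming does not change the content hash
    still_linked = sidecar_matches(moved, sidecar)

    moved.write_bytes(b"edited pixel data")  # altering pixels breaks the match
    after_edit = sidecar_matches(moved, sidecar)

print(intact, still_linked, after_edit)  # True True False
```

Because the link depends on the image's content rather than its file name, it survives renaming and moving. A hash on its own only detects changes; it does not prevent them, and anyone able to rewrite the sidecar could also rewrite the hash.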
Part of the bioimaging community is developing an open image format standard, called OME-TIFF, that incorporates captured metadata within the file header of an image, said Steve Rawsthorne of the John Innes Centre in Norwich, UK. Rawsthorne and his colleagues Jerome Avondo and Paul Fretter lead the Data Management of Bio-Imaging project sponsored by the UK-based Joint Information Systems Committee, or JISC. OME refers to the Open Microscopy Environment, a confederation of academic and commercial equipment makers and users striving to develop open-source tool kits for microscopists.
The JISC and the OME are not alone in their efforts. Several other groups have initiated programs to organize image data from acquisition through analysis and storage, including the Center for Bio-Image Informatics at the University of California, Santa Barbara. The group is responsible for the Bisque (Bio-Image Semantic Query User Environment) program, which supports users through acquisition, analysis, metadata tagging and data mining.
The accuracy of image and metadata information, along with privacy protection and tamper-proofing, is vital but insufficient, the University of Connecticut's Bhattacharjee said. "At the end of the day, data management in such an atmosphere is essentially a risk-management exercise."
Organizations just starting to plan a new data storage system, whether for imaging, for medical records or for research data, have much to weigh in balancing such risks against the reward of good data flow. Many begin by considering the end users, who include not only researchers or clinicians but also anyone involved in the acquisition, processing and viewing of the information, including quality assurance and information technology staff.
Modern data storage hardware, such as IBM's new Storwize V7000, can hold multiple terabytes of information in packages that fit on a desktop. But the data must be secure, accessible and easy to analyze as well. (Photo: IBM)
"Begin with the end in mind," Rawsthorne said. "Define how you expect/believe the data will be accessed and used, and work backwards to design a data management and storage infrastructure."
But such a client-based approach is not universally appealing.
"It may seem counterintuitive, but get a law firm involved first," Bhattacharjee said. "They will ask all the relevant questions about the information, which will indicate the real value of the data and what level of protection it requires. Then technical personnel can get involved to design the protection level, which should be enforced technically as well as legally."
To further ensure data security, everyone who has access to the data should sign a contract that clearly states the boundaries of use, including dissemination, he said.
Legal issues are among the core areas being addressed by the new Euro-Bioimaging consortium, which is focused on establishing light microscopy, molecular imaging and medical imaging infrastructures throughout Europe. The group is chiefly composed of members associated with the European Institute for Biomedical Imaging Research based in Vienna, Austria, and the European Molecular Biology Laboratory in Heidelberg, Germany. It has formed several technical and strategic working groups whose goals are to establish and support efforts to make imaging data readily accessible throughout the continent. Besides legal considerations of data distribution, Euro-Bioimaging working groups are investigating data management, open access to novel technologies and training protocols.
Ultimately, a data storage and management system should be blind to the hardware systems that feed into it. Proprietary file formats and system requirements cannot always be avoided, but they can hamper total integration even if they offer some technical advantages. Cloud computing technologies, in which data from one or more sources are fed to off-site storage facilities owned and operated by entities not necessarily affiliated with the data originator, offer inexpensive capacity, but not without technological hurdles and security risks.
"Cloud storage and open data access are transforming infrastructures," Rawsthorne said.
Others, such as Bhattacharjee, disagree. "Cloud computing is a buzzword to cover the real fact that there are actually physical computers maintained in the least expensive data centers with cheapest possible manpower."
The major risk of both cloud computing and open-source data manipulation tools is that you can lose control of your data completely, and any hole in the host's security leaves open the possibility of data theft or loss of patient privacy.
The battle for superiority over the ever-growing abundance of biomedical images is just starting to take shape, and the many individuals and groups on the front lines are only beginning to design and build their weapons. Nonetheless, the battle seems winnable, and the effort promises an array of tools and technologies that will improve the storage and use of imaging data.
"We hope that the imaging equipment vendors will converge on robust and open standards for image and metadata representation and storage, freeing up everyone's time to focus on developing richer data mining, visualization and accessibility systems," Rawsthorne said.