[setup:
 desk1: firefox with Bio-Formats web site
 desk2: mac desktop
 desk5: windows desktop
 desk6: linux desktop
 All desktops running at 800x600.
 Folder called "data" on every desktop with the same sample data.
 All other desktop icons removed.]

{ INTRODUCTION }

[begin on Bio-Formats index page on desk1]
Hi, I'm Curtis Rueden, software architect at the Laboratory for Optical and Computational Instrumentation, LOCI for short. In this video I will be discussing our Bio-Formats project -- from both a philosophical standpoint, and from a practical perspective as I demonstrate how you can use it to work with microscopy data and other life sciences data in a variety of file formats.

[gesture with mouse at URL]
This is the Bio-Formats web page. We've divided the information into several topics, accessible from the menu on the right. As you can see from this front page, Bio-Formats is a library for working with data in life sciences file formats, particularly microscopy formats. As of this video -- November 1st, 2008 -- Bio-Formats is compatible with 61 different formats -- which I'll discuss in more detail in a moment.

{ ORIGINS + INTEROPERABILITY }

Bio-Formats began as a project here at LOCI. It branched out from my work on our VisBio multidimensional visualization application, when we realized that the community could really benefit from a tool that facilitated interoperability between different biological file formats. Over time Bio-Formats has grown beyond light microscopy to encompass formats in electron microscopy, high content screening, medical imaging, and other related life sciences fields.

Interoperability is really what LOCI is all about. We want to provide software that enables scientists to effectively combine their favorite tools. From our perspective, having more and better tools at your disposal will enable you to do better science.
Unfortunately, the commercial model has often been to invent a closed, proprietary file format, then release custom acquisition and analysis software designed to work with that format. This sort of "lock in" mechanism leaves the customer at the company's mercy -- if there are missing features, or bugs, or the company does not provide good support, or you want to use their format with a different analysis package -- you may be out of luck.

With Bio-Formats, we are changing that. The library's primary purpose is to [highlight relevant Purpose text] convert proprietary microscopy data into an open standard. This statement means more than just being able to work with specific file formats -- it means that the original file format of your data shouldn't matter, because once you open it up, you'll always be able to work with the information in the same way regardless of the source. I'll be talking more about this idea of standardization in a bit when I discuss metadata.

{ AUTHORS }

As I said, Bio-Formats began at LOCI, but it has since become a joint effort between us and our commercial partner, Glencoe Software.

[gesture at authors]
Bio-Formats was written primarily by me, of LOCI, and Melissa Linkert -- formerly of LOCI and now working for Glencoe -- with early contributions from Eric Kjellman -- also formerly of LOCI -- as well as Chris Allan and Brian Loranger from the Open Microscopy Environment project at the University of Dundee. The core Bio-Formats infrastructure was primarily architected by me, with substantial input from Melissa and Chris. Melissa is the lead Bio-Formats developer, and the one responsible for adding the vast majority of the file format support. We are very lucky to have such a talented full-time developer working on the project -- she really made it possible for Bio-Formats to grow into what it is today.
{ PLUGGABLE LIBRARY }

Since Bio-Formats is a library, it is designed to be "plugged in" to other software packages, to enable them to support these file formats.

[gesture at table of software packages]
There are quite a few programs capable of using Bio-Formats. For example, you can call Bio-Formats from MATLAB to read in your data there, or open your datasets using the Bio-Formats plugins for ImageJ. We have also written some command line tools for reading data from the console, as well as a simple graphical viewer program. I don't have time to discuss all of these packages in detail, but later in this video I'll show you how to use Bio-Formats with ImageJ, MATLAB and the command line.

{ SUPPORTED FILE FORMATS }

[scroll down the page]
If you scroll down the page, you'll find the list of the currently supported formats. Bio-Formats supports popular open image standards like [scrolling down] JPEG and JPEG 2000, PNG, and TIFF, of course -- but also many proprietary formats from microscopy hardware vendors like [scrolling back up] Zeiss, PerkinElmer, Olympus, Nikon and Leica.

By "support" we mean that you can read and parse the image pixels, as well as the associated metadata describing the experiment. Some formats are more metadata-rich than others, though, and Bio-Formats provides varying degrees of support for each format. That's where this handy chart on the right comes in.

[gesture at chart header with mouse]
This table indicates a few useful pieces of information about your format of interest.

[gesture over the LEGEND]
Taking a look at the legend, we see a description of the rating system. Gold star is the best rating, followed by silver plus, green check, gray minus, with red X being the worst rating. Each format is rated according to five different categories. The first two of these, Pixels and Metadata, are a rating of the Bio-Formats library itself for that format.
[gesture over Pixels heading]
The Pixels rating describes how effectively Bio-Formats will actually display your images without any errors. Pixels support is the most fundamental aspect of Bio-Formats -- if we can't read the pixels, we don't consider the format supported.

[gesture over Metadata heading]
The Metadata rating describes how effectively Bio-Formats processes all the other information associated with the dataset. Metadata is a broad term that means "data about data" and encompasses many different types of information. I'll talk more about metadata in a bit.

[gesture at Leica LIF]
Taking the Leica LIF format as an example, we can see that it has a gold star rating for Pixels, which means that we are confident that Bio-Formats will be able to open up your LIF datasets and display them to you. We are nearly as confident about the metadata, giving it a rating of four.

[gesture at Bio-Rad PIC]
The Bio-Rad PIC format is also really well supported, with a similarly good rating.

[gesture at VisiTech XYS]
Not all formats are so lucky, though -- we have coded support for VisiTech's XYS format, but we're not as confident that it is complete and correct -- so we currently list it with a 3-of-5 rating in both pixels and metadata.

Much of the reason for that stems from how much sample data we have in each format, and whether we have an official specification document from the format's controlling interest.

[gesture at Bio-Rad PIC INFO box]
Looking at Bio-Rad PIC again, notice that we have specification documents for two different versions of PIC, as well as a large number of PIC datasets. We actually have a system capable of producing PIC files here at LOCI, too, which makes it very easy to produce additional test data as needed.

[gesture at Leica LIF INFO box]
Similarly for Leica LIF, we have two different specifications, and a substantial number of LIF datasets, which has enabled us to do a lot of testing with this format.
[gesture at VisiTech XYS INFO box]
The VisiTech box is sadly much smaller, and you can see that although we have several VisiTech datasets, we don't have a specification document for XYS -- though we would like to have one.

If you notice a deficiency in support for your format of interest, you can check our notes by looking at the format's INFO box. If it says that we would like more information, you can help by sending us sample data, or contacting the people responsible for the format and requesting a specification document. Customer feedback is essential for companies to understand that the community demands open standards and compatibility for their data.

Another mechanism we're using to provide feedback on these file formats is the remainder of our rating system.

[gesture at three remaining categories in the legend]
These three categories -- Openness, Presence and Utility -- are really rating the file format itself, rather than the Bio-Formats library's support for it.

[gesture over Openness heading]
The first category, Openness, describes how accessible information about the format has been. A published open standard is going to get a rating of 5, whereas an actively obfuscated proprietary format will get a 1.

[gesture over Presence heading]
The next category, Presence, shows the popularity of the format. We haven't empirically measured the number of files in the wild, so this rating is merely a qualitative estimate from our perspective, rather than a quantitative ranking.

[gesture over Utility heading]
Lastly, Utility represents our opinion on the suitability of the format for storing microscopy image data. Obviously this is a microscopy-centric rating -- we should probably consider renaming this rating to "Richness" and generalizing it to describe the breadth of information capable of being stored in the format.

{ METADATA }

I'd like to briefly explain the Bio-Formats metadata model, before moving on to the fun stuff with ImageJ and MATLAB.
CTR START HERE

We can generally categorize metadata into six different kinds. The first kind is "core" metadata, which includes very basic but crucial information such as "how many focal planes are in my dataset?", or "what is the resolution of my data?", or "how many ---

{ IMAGEJ }

use Bio-Formats in ImageJ on all three OSes
- importer plugin and its various options
- data browser
- loci plugins configuration
- loci plugins shortcut window

{ MATLAB }

Bio-Formats can be used in a variety of other contexts, as well. For example, we provide a MATLAB script to allow import of supported formats into the MATLAB environment.

use Bio-Formats in MATLAB on windows
- briefly demonstrate how bfopen works

{ COMMAND LINE }

use Bio-Formats from the command line on all three OSes
- demonstrate showinf tool's various options
- demonstrate tiffcomment, xmlindent and xmlvalid tools

{ GRAPHICAL VIEWER }

[double click loci_tools.jar]

{ CONCLUSION }