Metabolic profiling is still an emerging field, a status underscored
by the fact that no large public database of metabolites exists yet. Instead, researchers in government, university, and
pharmaceutical company labs have been patching together off-the-shelf software with their own proprietary analytical tools
to help acquire, interpret, and manage metabolite data, which they keep in comparatively modest and often private databases.
Assembling that data, which comes from mass spectrometry (MS) and/or nuclear magnetic resonance (NMR) techniques,
is just the first hurdle. Scientists will face an even bigger challenge in the future when they try to compare metabolic profiling
data with that from proteomics, genomics, and other emerging "omics" fields such as kinomics.
One of the big challenges with metabolic
profiling is coping with the sheer volume of data and the statistical issues it raises, says Glenn Cantor, DVM, PhD, principal veterinary
pathologist at Bristol-Myers Squibb's Pharmaceutical Research Institute in Princeton, N.J. "There is a certain amount of noise
in NMR and MS systems, so it's very challenging to figure out what's noise and what's a real signal. There's a lot of data,
too much for a human to scan through, so we need software to help." Cantor adds that no out-of-the-box solution
can answer every question about a metabolite, so his company uses software from instrument or software manufacturers
and also develops its own software, which includes expert systems and novel ways of doing multivariate data analysis.
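
To make the noise-versus-signal problem Cantor describes concrete, here is a minimal sketch in Python. It is an illustration only, not the software used at Bristol-Myers Squibb: the intensity values are invented, and the rule (flag points more than k robust standard deviations above the median baseline) is one common convention, assumed here for simplicity.

```python
from statistics import median

def robust_noise(intensities):
    """Estimate baseline and noise level with the median and the
    median absolute deviation (MAD), which a few large peaks barely affect."""
    baseline = median(intensities)
    mad = median(abs(x - baseline) for x in intensities)
    # 1.4826 scales the MAD to match a standard deviation for Gaussian noise.
    return baseline, 1.4826 * mad

def candidate_signals(intensities, k=5.0):
    """Return indices of points more than k noise units above the baseline."""
    baseline, sigma = robust_noise(intensities)
    if sigma == 0:
        return []  # a perfectly flat trace has nothing to flag
    return [i for i, x in enumerate(intensities) if (x - baseline) / sigma > k]

# Hypothetical intensity trace: mostly noise near 10, with two real peaks.
trace = [10, 12, 11, 9, 10, 13, 11, 95, 10, 12, 9, 11, 10, 48, 11, 10]
print(candidate_signals(trace))  # prints [7, 13]
```

The threshold k is a judgment call: set it lower and more noise slips through, set it higher and weak but real signals are lost, which is the trade-off Cantor alludes to.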
What's in a name?
The field of studying
metabolites has several names, including metabolic profiling, metabonomics, and metabolomics. Researchers increasingly
use the terms interchangeably, but some still reserve each name for a different aspect of metabolite
study: for example, metabonomics for a systems approach to metabolite profiling and metabolomics for cell-based metabolite
profiling. Drug companies often use metabonomics, while metabolomics also serves as a general term for metabolite studies.
This article follows the terminology each researcher uses to describe his or her own work (see related story, "Metabolic Profiling:
Meet the Latest 'Omics'," on page 39).
For Cantor, the power of metabonomics is
more in hypothesis-generating than in getting complete answers. "It can yield new insights as to how things work, from which
we go forward and look at whether patients with a given disease have a particular protein or a particular combination of transcripts
or metabolites."
The Holy Grail, though,
will require integrating genomics, proteomics, and metabonomics, as well as toxicology and biology, and devising informatics
tools that turn the combined data into new biological insights, such as potential drug candidates and pathways, he says.
In five to 10 years, "I think we're going to have much more complete integration of all the 'omics,' and by that I don't only
mean metabonomics, genomics, and proteomics, but all the newer omics, like kinomics, glycomics, and so on. And I think we'll
refine our ability not just to accumulate data, but to insightfully interpret it." A research arm of the US Food and Drug
Administration already plans to create such an integrated database over the next several years (see "FDA Research Group Plans
Comprehensive Database," p. 49).
Getting back to basics
At the most basic level,
managing metabolite data begins with the ability to identify the metabolite. That may sound simple, but there are many molecules
in biofluids that remain a mystery, says Gary Siuzdak, PhD, senior director of the Center for Mass Spectrometry at The Scripps
Research Institute, La Jolla, Calif. Scripps maintains a database called METLIN, a repository for mass spectral data on about
1,000 endogenous and drug metabolite samples and tissues (http://metlin.scripps.edu/).
"It's striking, because
we believe we know so much already about what's going on in humans, but the reality is not only do we not know what's going
on, but we also don't even know what the molecular players are," Siuzdak says. "One of the main challenges of the metabolite
field is the one thing that's easy to do with proteomics, which is to identify the protein on the basis of the fragmentation
pattern that you generate from the MS data. With metabolites we're finding it's a lot easier to do quantitative analysis."
MS will reveal the mass, but identifying the molecule is tricky. "There's no comprehensive database right now," he says.
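
Why a mass alone does not pin down a molecule can be shown with a small lookup sketch. The table below is a hypothetical six-entry stand-in for a real database, not METLIN's contents or interface. The masses are standard monoisotopic values for the neutral molecules, and the 10 ppm tolerance is an assumed setting.

```python
# Hypothetical mini-table of monoisotopic neutral masses in daltons.
KNOWN_METABOLITES = {
    "glucose": 180.06339,        # C6H12O6
    "fructose": 180.06339,       # C6H12O6, an isomer of glucose
    "leucine": 131.09463,        # C6H13NO2
    "isoleucine": 131.09463,     # C6H13NO2, an isomer of leucine
    "phenylalanine": 165.07898,  # C9H11NO2
    "tyrosine": 181.07389,       # C9H11NO3
}

def match_mass(observed, table=KNOWN_METABOLITES, tol_ppm=10.0):
    """Return the names whose mass lies within tol_ppm of the observed mass."""
    hits = []
    for name, mass in table.items():
        error_ppm = abs(observed - mass) / mass * 1e6
        if error_ppm <= tol_ppm:
            hits.append(name)
    return sorted(hits)

print(match_mass(180.0635))  # prints ['fructose', 'glucose']
print(match_mass(165.0791))  # prints ['phenylalanine']
print(match_mass(250.0))     # prints []
```

The first query returns two candidates because isomers share an exact mass, and the last returns none because the molecule is simply not in the table. Both outcomes reflect the identification gap Siuzdak describes.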
Siuzdak adds that metabolite
data for 50 patients is about 20 GB in size, which is not a huge amount. "The more difficult part is processing the data,
looking for unique molecules that are indicative of whether an individual with a disease has certain ions that are disappearing
or ions that are appearing. So we're looking for metabolites that are going to tell us whether the disease is occurring or
whether it is in its early stages or not, and that's a real challenge."
John Ryals, PhD, agrees.
Ryals, CEO of Metabolon Inc., Research Triangle Park, N.C., says the data from metabolic profiling is not as dense
as that from RNA profiling or proteomics. That's because there are only about 2,400 to 2,600 metabolites, whereas there are
100,000 mRNAs and one million proteins. "These are not as dense a data set as you see in RNA profiling, because we just don't
have that many variables. So there isn't that big of a demand on the data processing or the databases," he says. The hard
part comes once the basic information on the metabolites is in the databases and statistical methods must be applied to
analyze it. Biologists aren't accustomed to handling this kind of data, so companies hire mathematicians and statisticians to
interpret it.
Metabolon developed software to analyze
metabolomics data. "We're actually able to name the molecule and get the quantitation on it," says Ryals. "We essentially
had to reengineer very high-end mass spectrometers. We've rewritten a lot of the software on how the data is gathered and
analyzed." One problem the software can handle is peak deconvolution during the liquid chromatography mass spectrometry (LC/MS)
process. When more than one molecule sits under a single peak, which happens often, the software must identify the molecules and
reassemble the signal that belongs to each.
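
A toy version of peak deconvolution gives a sense of what such software has to do. The sketch below is not Metabolon's method. It assumes two co-eluting compounds with Gaussian peak shapes whose retention times and widths are already known, so only the two peak heights need to be solved for by least squares. Real software must also discover the positions and shapes, which is far harder. All numbers are invented.

```python
import math

def gaussian(t, center, width):
    """Unit-height Gaussian peak shape."""
    return math.exp(-0.5 * ((t - center) / width) ** 2)

def unmix_two_peaks(times, signal, center1, center2, width):
    """Least-squares heights of two overlapping Gaussians with known
    centers and a shared width, via the 2x2 normal equations."""
    g1 = [gaussian(t, center1, width) for t in times]
    g2 = [gaussian(t, center2, width) for t in times]
    a11 = sum(x * x for x in g1)
    a12 = sum(x * y for x, y in zip(g1, g2))
    a22 = sum(y * y for y in g2)
    b1 = sum(x * s for x, s in zip(g1, signal))
    b2 = sum(y * s for y, s in zip(g2, signal))
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("peaks are indistinguishable at this resolution")
    height1 = (b1 * a22 - b2 * a12) / det
    height2 = (a11 * b2 - a12 * b1) / det
    return height1, height2

# Synthetic chromatogram: two compounds eluting 0.5 min apart, width 0.3 min.
times = [i * 0.05 for i in range(81)]  # 0 to 4 minutes
signal = [100.0 * gaussian(t, 1.8, 0.3) + 40.0 * gaussian(t, 2.3, 0.3)
          for t in times]

h1, h2 = unmix_two_peaks(times, signal, 1.8, 2.3, 0.3)
print(round(h1, 3), round(h2, 3))
```

On this noise-free synthetic trace the fit recovers the two heights used to build it. With real data, noise and imperfect peak shapes make the answer only approximate.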
Intelligent software
The Metabolon software has what Ryals calls
"chemical intelligence," which allows it to go into very complicated peak patterns and identify known molecules by the way
they ionize. It took a few years and talented software writers who thoroughly understood the MS instruments to code the software.
"If you try to do it by hand, the experiment will take a year just to crunch the numbers." Ryals says MS hardware is very
good but that, in general, the instrument software that carries the signals from acquisition to a final answer still
needs work.
Donald Chace, PhD, director of the Division
of BioAnalytical Chemistry and Mass Spectrometry, Pediatrix Screening Laboratory, Bridgeville, Pa., is using metabolomics
to detect and study 35 inborn disorders of metabolism in infants. Chace uses some off-the-shelf mass spectrometry software, plus
his own company's analytical tools, to examine the metabolite ratios used to diagnose a disease. The company looks at amino acids in
blood to get a picture of an infant's metabolism, as well as acylcarnitines, which are metabolites that transport long-chain fatty
acids across the mitochondrial membrane.
The challenge for a laboratory is to do MS
well on many samples a day. "The last analysis has to be as good as the first even if you've done 1,000," says Chace. Most
of his processing and sample analysis runs on personal computers, using software packages from instrument
companies, such as the NeoLynx Screening Application Manager from Waters Corp., Milford, Mass., for quantitative measurements
of phenylalanine and tyrosine in neonatal blood samples by tandem MS. Among other programs, he also uses the Analyst
software from MDS Sciex, Concord, Canada, for MS data acquisition and processing.
NMR versus MS
There is no shortage
of opinions among users and vendors about whether NMR or MS is better. MS is very sensitive and gathers more data, but NMR
is good at identifying unknown molecules by looking at signal changes. An emerging trend is for drug scientists to use both
NMR and MS. For example, researchers at Imperial College in London and six pharmaceutical companies collaborated in an effort
called COMET, the Consortium on Metabonomic Toxicology, which resulted in a database of NMR spectra covering 150 compounds
or biofluid fingerprints that can be used to check for toxicity. The second stage of COMET will perform more mechanistic toxicology
studies and supplement the NMR database with MS data.
"With NMR we may see
things we cannot see by MS and vice versa. So, it is a complementary technique," says Jose Castro-Perez, LC/MS market development
manager at Waters Corp. "When you look at any biological insult toxicity base, you are not looking at just one biomarker,
but a series of biomarkers, maybe 20 or 30. The more information you can get out of the system the better," Castro-Perez says of
the strength of MS.
But MS is not without
its weaknesses, says John Shockcor, business development manager for metabolic profiling at Bruker BioSpin. NMR yields strong
quantitative data, and everything in the sample is visible within the limits of detection, which is not the case in MS. "Say
our MS colleagues find a new lipid with a mass of 600 and they notice that it is two mass units lower than another lipid they've
just found. The first thing they think is there's unsaturation, that there's a double bond. They have no idea where that double
bond is and they don't know if it is [a] cis or trans [isomer], so they don't know anything about the stereochemistry. If
they want to answer those questions, they have to do NMR." Even so, NMR is still as much as four orders of magnitude less
sensitive than MS, he says. Shockcor adds that although NMR and MS data can be merged, he prefers to keep them apart so he
can examine each data set on its own.
Chace of Pediatrix says
that even with the vendor software to help process and analyze some of the data, it still is the job of the laboratory to
understand the relationship between metabolites and to diagnose a disease. "You get the software shell from the companies
and then you write the algorithms for picking up and deciding what to flag as abnormal or not. The software helps eliminate
the normals, so you're left with the abnormals," says Chace. "But a lot of clinical interpretation still is required."
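
The kind of flagging rule Chace describes might look something like the sketch below, which screens on the ratio of phenylalanine to tyrosine, the two amino acids mentioned earlier. This is a hypothetical illustration, not Pediatrix's algorithm. The sample values and the cutoff are invented and have no clinical meaning.

```python
def flag_samples(samples, ratio_cutoff):
    """Return (sample_id, ratio) for samples whose phenylalanine/tyrosine
    ratio exceeds the cutoff. Samples with no usable tyrosine value are
    flagged with a ratio of None so a person reviews them."""
    flagged = []
    for sample_id, phe, tyr in samples:
        if tyr <= 0:
            flagged.append((sample_id, None))
            continue
        ratio = phe / tyr
        if ratio > ratio_cutoff:
            flagged.append((sample_id, round(ratio, 2)))
    return flagged

# Invented concentrations (arbitrary units): (id, phenylalanine, tyrosine).
samples = [
    ("A", 60.0, 80.0),
    ("B", 400.0, 50.0),
    ("C", 55.0, 0.0),
    ("D", 90.0, 70.0),
]
HYPOTHETICAL_CUTOFF = 2.5  # placeholder value, not a clinical threshold

print(flag_samples(samples, HYPOTHETICAL_CUTOFF))
# prints [('B', 8.0), ('C', None)]
```

The rule only removes the clearly normal samples. Everything it flags, including the sample with a missing measurement, still goes to a person for the clinical interpretation Chace says is required.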
Scripps' Siuzdak adds,
"Everybody has unique problems they're going after or they see issues they haven't seen addressed by off-the-shelf packages."
Metabolites follow all kinds of statistical distributions that require unique analysis, says Metabolon's Ryals. He points
to estrogen as a good example. "It's going to be present in women but not very present in men, and so if you just go across
a row looking at estrogen in a population of people, you're going to see a really weird statistical distribution. So how do
you handle that kind of data? It's these types of issues that I don't think we have the proper statistical tools for yet."
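
Ryals' estrogen example can be mimicked with a few invented numbers. The sketch below compares z-scores computed across the pooled population with z-scores computed within each sex. It illustrates why a two-humped distribution misleads a pooled analysis and is not a proposed solution to the broader problem he raises. The values and group labels are made up.

```python
from statistics import mean, stdev

def pooled_zscores(records):
    """Z-score of each value against the whole population."""
    values = [value for _, value in records]
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def stratified_zscores(records):
    """Z-score of each value against its own group only."""
    groups = {}
    for group, value in records:
        groups.setdefault(group, []).append(value)
    stats = {g: (mean(vals), stdev(vals)) for g, vals in groups.items()}
    return [(value - stats[g][0]) / stats[g][1] for g, value in records]

# Invented hormone levels (arbitrary units). The last male value is
# unusually high for men but sits between the two groups overall.
records = [
    ("F", 100), ("F", 110), ("F", 90), ("F", 105), ("F", 95),
    ("M", 10), ("M", 12), ("M", 8), ("M", 11), ("M", 30),
]

pooled = pooled_zscores(records)
stratified = stratified_zscores(records)
print(round(pooled[-1], 2), round(stratified[-1], 2))
```

Scored against everyone, the last male sample looks unremarkable because the pooled spread is dominated by the gap between the sexes. Scored against other men, it stands out. Stratifying works here only because the grouping variable is known in advance, which is often not the case.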
For more on the organizations
mentioned here, refer to www.dddmag.com