(Circulation. 2000;101:e215.)
© 2000 American Heart Association, Inc.
Circulation Electronic Pages
From the Margret and H.A. Rey Laboratory for Nonlinear Dynamics in Medicine, Beth Israel Deaconess Medical Center/Harvard Medical School, Boston, Mass (A.L.G., J.M.H., J.E.M., C.-K.P.); the Center for Polymer Studies and Department of Physics, Boston University, Boston, Mass (L.A.N.A., P.Ch.I., H.E.S.); the Division of Health Sciences and Technology, Harvard University/Massachusetts Institute of Technology, Cambridge, Mass (R.G.M., G.B.M.); and the Centre for Nonlinear Dynamics in Physiology and Medicine, Department of Physiology, McGill University, Montréal, Québec, Canada (L.G.).
Correspondence to Ary L. Goldberger, MD, Cardiovascular Division, GZ-435, Beth Israel Deaconess Medical Center, 330 Brookline Avenue, Boston, MA 02215. E-mail ary{at}astro.bidmc.harvard.edu
| Abstract |
|---|
Key Words: aging; databases; death, sudden; electrophysiology; heart rate; nervous system, autonomic; nonlinear dynamics
| Introduction |
|---|
| Background and Objectives |
|---|
Recent findings3 4 5 6 7 8 indicate that such complex datasets may contain "hidden information," which is defined here as information that is neither visually apparent nor extractable with conventional methods of analysis. Such information promises to be of clinical value (eg, forecasting sudden cardiac death in ambulatory patients or cardiopulmonary catastrophes during surgical procedures). It may also relate to basic mechanisms in molecular biology and physiology.9 With the advent of sophisticated computational tools and powerful methods for storing and disseminating vast quantities of information, the biomedical research community seems to be on the cusp of a major breakthrough at both the clinical and basic levels of investigation.10
| The Challenges |
|---|
Data Resources
Researchers need, but generally lack access to, high-quality,
rigorously validated, and standardized databases of biomedical signals
obtained in a variety of healthy and pathological conditions. In many
cases, both experimental and clinical data are collected at
considerable expense to the public, analyzed once by their
collectors, and filed away indefinitely. As a result, federal agencies
and other research sponsors may fund repetitive and redundant
projects.
Analytic Resources
Significant effort is required to develop software for signal
processing, time-series analysis, and related functions needed
by researchers working with these datasets. Commercial software is
unavailable for many of these functions, and what little is available
is generally unsuitable for use with multigigabyte datasets.
Researchers frequently develop such software at considerable expense
for use within a single project. Furthermore, the validation of
signal processing and analysis algorithms (and of their
software implementations) is rarely performed in a way that permits
rigorous peer review. Frequently, researchers self-evaluate their
software using the same private dataset used for its development and
then report its behavior using ad hoc measures of
performance.
Human and Communications Resources
Advances in the field of complex biomedical signal
analysis have also been limited by the lack of concentrated and
concerted research efforts. Furthermore, advanced analytic techniques
developed by experts in the field are often not readily accessible to
end users, who may lack the background and technical skills needed for
the successful use of these new tools. Even among experts,
evaluating new algorithms and comparing research results are
complicated by subtle variations in software implementations of
algorithms that themselves may not be thoroughly specified in
research reports, and by a lack of standardized test data
and testing methods.
This set of problems and challenges is reminiscent of the status of research in genetics and molecular biology before the advent of GenBank, which is arguably the first successful large-scale example of a medium for the exchange and dissemination of raw research data. By allowing researchers to begin their work with instant access to all of the ever-increasing store of knowledge of DNA sequences, GenBank encourages innovative rather than redundant research, leverages research expenditures to promote the most efficient use of limited funds, and makes serendipitous discoveries more likely.
In much the same way, biomedical research in general, and
cardiovascular investigations in particular, rely on
large quantities of physiological data.
Intersubject variability is a major focus of research interest in
biomedical research (as it is in genetic research); hence, information
gathered from a variety of subjects has an added value quite beyond the
importance of verifying an initial set of findings. In contrast to the
relatively simple alphabet of DNA sequences, however, biomedical
signals are characterized by complex time-varying features and
interrelationships that require nontrivial computational techniques for
quantification and analysis (Figures 1 and 2).
The new resource offers researchers a medium for the exchange and dissemination of such biomedical signals and algorithms. It aims to bring to biomedical research the diverse and compelling benefits offered to molecular biology by GenBank. The central mission of the resource is to accelerate current research progress and to stimulate and bootstrap new investigations in the study of complex biomedical signals with an integrated approach.
| Structure of the Resource: Data, Software, Interchange |
|---|
PhysioBank is an archive of well-characterized biomedical signals for use by the research community. As we build PhysioBank, we collect, characterize, and document databases of multiparameter signals from healthy subjects and from patients with pathological conditions that have major public health implications (eg, epilepsy, congestive heart failure, sleep apnea, sudden cardiac death, myocardial infarction, movement disorders, and aging). PhysioBank will also include databases of signals obtained from selected in vitro and in vivo experiments, as well as from physiologically motivated algorithms that generate complex time series.11 A large and growing collection of these databases is now available to the scientific community via the PhysioNet website and on CD-ROM.
PhysioToolkit is a growing library of signal processing and analytical techniques implemented in open-source software. The PhysioToolkit library includes software for physiological signal processing and analysis; the detection of physiologically significant events using both classic methods and novel techniques from statistical physics, fractal scaling analysis, and nonlinear dynamics; the analysis of nonstationary processes; interactive display and characterization of signals; the creation of new databases to support further development of PhysioBank; the simulation of physiological and other signals, when such signals may be useful for the study of algorithm behavior; and the quantitative evaluation and comparison of analysis algorithms.
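The last capability listed above, the quantitative evaluation and comparison of analysis algorithms, can be illustrated with a minimal sketch. It assumes that the event times reported by an algorithm are compared with reference annotation times and summarized as sensitivity and positive predictivity. The function names, the 0.15-second matching tolerance, and the sample values are hypothetical illustrations and are not taken from PhysioToolkit.

```python
from typing import List, Tuple


def match_events(reference: List[float],
                 detected: List[float],
                 tolerance: float = 0.15) -> Tuple[int, int, int]:
    """Match detected event times to reference times, one to one.

    Times are in seconds. A detection counts as a true positive when it
    lies within `tolerance` seconds of a not-yet-matched reference event.
    The tolerance value is an assumption made for this illustration.
    Returns (true_positives, false_negatives, false_positives).
    """
    ref = sorted(reference)
    det = sorted(detected)
    i = j = true_pos = 0
    while i < len(ref) and j < len(det):
        diff = det[j] - ref[i]
        if abs(diff) <= tolerance:
            # Detection and reference agree: count it and advance both.
            true_pos += 1
            i += 1
            j += 1
        elif diff < 0:
            # Detection precedes any plausible reference: extra detection.
            j += 1
        else:
            # Reference event has no detection close enough: missed event.
            i += 1
    false_neg = len(ref) - true_pos
    false_pos = len(det) - true_pos
    return true_pos, false_neg, false_pos


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of reference events that the algorithm detected."""
    total = true_pos + false_neg
    return true_pos / total if total else float("nan")


def positive_predictivity(true_pos: int, false_pos: int) -> float:
    """Fraction of detections that correspond to a reference event."""
    total = true_pos + false_pos
    return true_pos / total if total else float("nan")


if __name__ == "__main__":
    # Hypothetical annotation times (seconds), invented for this sketch.
    reference_times = [1.0, 2.0, 3.0, 4.0]
    detected_times = [1.05, 2.5, 3.1, 4.0, 5.0]
    tp, fn, fp = match_events(reference_times, detected_times)
    print("TP, FN, FP:", tp, fn, fp)
    print("sensitivity:", sensitivity(tp, fn))
    print("positive predictivity:", positive_predictivity(tp, fp))
```

When every algorithm under comparison is scored in this way against the same shared reference annotations, the results are directly comparable, which is the kind of reproducible evaluation the resource is intended to support.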
PhysioNet provides a 2-way dynamic link between the resource and the research community for efficient retrieval and submission of data and software from and to PhysioBank and PhysioToolkit via the World Wide Web (http://www.physionet.org). PhysioNet is an on-line forum for the dissemination and exchange of recorded biomedical signals and the software for analyzing such signals; it provides facilities for the cooperative analysis of data and the evaluation of proposed new algorithms. It provides a meeting place for physiological data and algorithms, where both can be submitted, discussed, evaluated, reviewed, and examined in detail by any investigator willing to join this on-line community. PhysioNet also provides a means to resolve discrepancies in results that may arise from errors in the interpretation of algorithm descriptions, errors in algorithm implementation, or fundamental errors in algorithm design. As an educational component, PhysioNet provides on-line tutorials to assist clinicians, students, and basic researchers in making the best use of the resource (see Appendix). In conjunction with Computers in Cardiology 2000, PhysioNet is supporting a time-series competition focusing on the challenge of detecting obstructive sleep apnea from the ECG (http://www.physionet.org/cinc-challenge-2000.shtml).
Data and software that are available via PhysioNet fall into the following 3 categories:
| Potential Benefits |
|---|
Reference databases12 13 14 (http://ecg.mit.edu/ and
http://reylab.bidmc.harvard.edu/) are also essential resources for
developers and evaluators of algorithms that analyze biomedical
signals who need to test algorithms with realistic data and to perform
these tests repeatedly and reproducibly as algorithm refinements are
proposed. These databases also have value in medical education by
providing well-documented case studies of both common and rare but
clinically significant diseases. By making well-characterized clinical
data available to researchers, these databases will make it possible to
formulate and answer numerous physiological
questions (Figure 3) without the need to develop a new set of
reference data at great cost in each case.15 In this
regard, PhysioBank can serve as a final and permanent repository for
time-series data from publicly funded studies, such as large
multicenter clinical trials, or physiological
studies conducted by the National Aeronautics and Space Administration
(NASA). Such data are, by statute, in the public domain, yet often they
cannot be readily accessed by qualified investigators, even long after
the original investigators have completed their analysis.
Furthermore, irreplaceable physiological data, such
as electrocardiographic recordings from NASA's pre-Shuttle
missions, are no longer retrievable because of a lack of mechanisms for
data annotation, analysis, and archiving. Such mistakes should
not be repeated.16
Another source of concern in the biomedical community in recent years has been the problem of scientific misconduct,17 including the publication of fraudulent data. These lapses rob not only the scientific community, which relies on published findings, but also the taxpayers who support this research. Considerable effort has been directed at designing safeguards to prevent or detect such fraudulent science. Unfortunately, even the most careful peer review may fail to discover deliberate misrepresentation or unintentional mistakes. The willingness of investigators to deposit original datasets as part of a research resource may be one of the most potent assurances of the integrity of data. The fact that these datasets can be reanalyzed by the scientific community at large permits ready double-checking of the initial findings and serves as perhaps the most efficient remedy for unintentional errors. An additional benefit is that the data can also be restudied with new techniques as they become available, allowing for "data-leveraging" or "data-mining." For federally funded investigations, the investigators' consent to eventually bank relevant physiological signals in such a resource could become a standard part of certain research proposals, with provisions for the absolute protection of subject anonymity.
Without common databases, such as those provided by PhysioBank, it can
be impossible to resolve certain contradictory research results on
topics ranging from the dynamics of normal sinus rhythm to those of
life-threatening cardiac arrhythmias.9 18 A
specific example of how the absence of a well-characterized database
has impeded scientific progress and prevented the resolution of a
major, clinically relevant problem relates to the mechanism of
ventricular fibrillation (VF), the major cardiac
arrhythmia associated with sudden death (Figure 1).
Although multiple investigators have studied the dynamics of this
electrical disturbance, there remains a remarkable lack of
consensus about its underlying mechanism(s).9 19 20 21 A
probable source of disagreement has been that different investigators
have studied different sets of waveforms obtained in diverse
preparations. Furthermore, the analyses used to reach these
disparate conclusions have made use of different analytic techniques or
different implementations of similar algorithms. Without a standardized
database of high-quality signals accepted by the community of
investigators using the same algorithms, attempts to resolve this
central controversy are likely to be at best incomplete and at worst
reminiscent of the parable of the blind men and the elephant. To aid in
the development of new approaches to defibrillation,20 22
it would be invaluable to make available
electrophysiological mapping data collected
during VF in model systems. A more informed analysis of VF will
lead to a deeper understanding of the mechanism of complex wave
phenomena, which underlie not only sudden cardiac
death23 24 but possibly other
pathophysiological dynamics, such as seizure
disorders.
Finally, the peer review process itself has been shaped historically by the constraints of the publication process. It has never been feasible to publish the raw data that support research results, until now. The Internet and the near-universal availability of inexpensive, high-capacity, mass-storage media such as the CD-ROM have made it possible to consider a new paradigm for scientific publication and for peer review. Within a few years, it may not be considered acceptable for a study based on physiological signal analysis to be published in most peer-reviewed journals without making supporting raw data available for examination, and no peer review may be considered sufficiently rigorous unless it has included an examination of how the research results have been derived from these data. The resource now provides a site for authors to publish such "dynamic appendices" to accompany their articles, giving readers access to the actual time-series data on which statistical tests were performed. A precedent for the publication of such primary datasets has already been established with respect to high-resolution biomolecular structural data, which are now released at or before the time of publication of the articles describing these data.25 "Open-source research" is a powerful idea that may sweep aside entrenched patterns of behavior in research, just as increasing awareness of the benefits of open-source software is changing the practice of software development. We hope that this new National Institutes of Health resource will now help extend these benefits and their often unanticipated rewards to those with an interest in complex physiological signals.
| Acknowledgments |
|---|
| Footnotes |
|---|
| Appendix 1 |
|---|
PhysioToolkit software may be downloaded in source form or in precompiled versions for Linux/x86, Solaris/Sparc, or MS-DOS/MS-Windows (precompiled versions for other environments may also be available).
We invite your comments and contributions of data and software for review, discussion, and possible inclusion in PhysioBank and PhysioToolkit. Contributors are asked to review our guidelines at http://www.physionet.org/guidelines.shtml.
PhysioNet is supported by mirrored Web servers at multiple locations around the world to provide reliable access to the research community. We invite users to replicate the PhysioNet website locally and to add their sites to our list of mirrors; please visit http://www.physionet.org/mirrors/ for further information.
| References |
|---|