Reexamination Certificate
2003-06-26
2004-11-23
Hoff, Marc S. (Department: 2857)
Data processing: measuring, calibrating, or testing
Measurement system
Statistical measurement
C717S125000
active
06823286
ABSTRACT:
BACKGROUND OF INVENTION
The present invention relates to analyzing and interpreting multi-dimensional datasets. Examples of such datasets include optical recordings of neuronal cell slice fluorescence and differences in expression levels of multiple genes within a population of patients or subjects.
It is often desirable to understand the relationships among the various events occurring within such a multidimensional dataset. For example, various neurons in a neuronal cell slice may exhibit spontaneous activity in a time series of optical images. It would be desirable to determine which groups of neurons, if any, were ever coactive (i.e., active at the same time or at specific different times) or regularly coactive (i.e., coactive at multiple times over the period of observation), and which neuron, if any, consistently activates before or after another neuron activates. It would also be advantageous to know the statistical significance of the relationships between the various events, that is, whether the correlation among the various events is stronger than would be expected from random activity.
SUMMARY OF THE INVENTION
These and other advantages are achieved by the present invention, which provides a method and system for analyzing a multidimensional dataset and for detecting relationships between various events reflected in the dataset.
In an exemplary embodiment, a method is presented for analyzing a sequence of data arrays including selecting at least one type of region of interest and at least one region of interest for each type of region of interest chosen from said data arrays, and transforming the sequence of data arrays into a simplified data array with a first dimension equal to the number of selected regions of interest and a second dimension equal to the number of data arrays in the original sequence of data arrays. The simplified data array is then examined to detect events of interest in the regions of interest, and those events of interest are stored in a second simplified data array having the same dimensions as the first simplified data array, but the data in each element of the array is binary. The second simplified array is then analyzed to determine relationships between the events of interest and correspondingly, the regions of interest.
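By way of illustration only, the transformation and event-detection steps described above might be sketched in Python as follows. The function names `simplify` and `detect_events`, the representation of a region of interest as a list of pixel coordinates, the use of mean intensity as the per-region summary, and the fixed threshold are all assumptions made for this sketch; none of them is taken from the specification.

```python
from statistics import mean


def simplify(frames, rois):
    """Build the first simplified array: one row per region of interest,
    one column per data array (frame) in the original sequence.

    frames: list of 2-D lists, indexed as frame[y][x].
    rois:   list of regions, each a list of (y, x) pixel coordinates.
    Each element is the mean intensity of one region in one frame
    (the mean is an assumed summary statistic).
    """
    return [[mean(frame[y][x] for y, x in roi) for frame in frames]
            for roi in rois]


def detect_events(simplified, threshold):
    """Build the second simplified array: same dimensions, binary elements.

    An element is 1 where the summarized value exceeds `threshold`
    (an assumed detection rule) and 0 otherwise.
    """
    return [[1 if value > threshold else 0 for value in row]
            for row in simplified]


# Three 2x2 frames and two hypothetical regions of interest.
frames = [
    [[1, 3], [0, 0]],
    [[5, 7], [0, 9]],
    [[2, 2], [0, 1]],
]
rois = [[(0, 0), (0, 1)], [(1, 1)]]

simplified = simplify(frames, rois)
raster = detect_events(simplified, threshold=4)
```

In this sketch both arrays have two rows (regions of interest) and three columns (frames), which matches the dimensions recited above.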
In one exemplary embodiment, analyzing includes plotting a portion or all of the data in the first simplified array to allow visual examination of the relationships between the activities of interest in various regions of interest. In another exemplary embodiment, the analysis step involves detecting events of interest that are coactive and determining whether the number of coactive events is statistically significant. This embodiment may include detecting all such coactive events (i.e. events where at least two regions of interest are active simultaneously), detecting instances where many regions of interest are coactive simultaneously, or detecting instances where two or more regions of interest are each active in a certain temporal relationship with respect to one another (also referred to as coactivity).
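As a non-authoritative sketch, counting simultaneous coactive events and estimating their significance might look like the following. The shuffle-based Monte Carlo test is an assumption chosen for illustration: each region's events are randomly reassigned to frames, preserving that region's event count, and the observed count is compared with the shuffled counts. The names `count_coactive` and `coactivity_p_value` and the parameters `min_active`, `n_shuffles`, and `seed` are hypothetical.

```python
import random


def count_coactive(raster, min_active=2):
    """Number of frames in which at least `min_active` regions are active.

    raster: binary array, one row per region, one column per frame.
    """
    return sum(1 for column in zip(*raster) if sum(column) >= min_active)


def coactivity_p_value(raster, n_shuffles=1000, min_active=2, seed=0):
    """Return (observed count, estimated p-value).

    The p-value is the fraction of shuffled rasters whose coactive count
    is at least the observed count, with a +1 correction so that it is
    never exactly zero.
    """
    rng = random.Random(seed)
    observed = count_coactive(raster, min_active)
    at_least = 0
    for _ in range(n_shuffles):
        # rng.sample(row, len(row)) returns a shuffled copy of the row.
        shuffled = [rng.sample(row, len(row)) for row in raster]
        if count_coactive(shuffled, min_active) >= observed:
            at_least += 1
    return observed, (at_least + 1) / (n_shuffles + 1)


raster = [
    [1, 0, 0, 1, 0, 0, 0, 0],
    [1, 0, 0, 1, 0, 0, 0, 0],
]
observed, p_value = coactivity_p_value(raster, n_shuffles=200)
```

Raising `min_active` restricts the count to frames in which many regions are active at once. Detecting regions that are active at a fixed temporal offset would require shifting one row before counting, which this sketch does not do.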
In a further exemplary embodiment, the data analysis involves calculating a correlation coefficient between two regions of interest based on how often the regions of interest are coactive relative to how often the first region is active. A map of all such regions is displayed with lines between the regions having a thickness proportional to the correlation coefficient between the two regions.
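One plausible reading of this coefficient, offered only as a sketch, is the number of frames in which both regions are active divided by the number of frames in which the first region is active. Under that reading the coefficient is asymmetric. The names `correlation` and `edge_widths` and the `scale` parameter are hypothetical, and the drawing of the map itself is omitted.

```python
def correlation(raster, a, b):
    """Fraction of region a's active frames in which region b is also active.

    Returns 0.0 when region a is never active (an assumed convention).
    """
    active_a = sum(raster[a])
    if active_a == 0:
        return 0.0
    both = sum(1 for x, y in zip(raster[a], raster[b]) if x and y)
    return both / active_a


def edge_widths(raster, scale=1.0):
    """Line thickness for each ordered pair of distinct regions.

    Thickness is `scale` times the correlation coefficient; pairs with a
    zero coefficient are left out so that no line would be drawn for them.
    """
    widths = {}
    for a in range(len(raster)):
        for b in range(len(raster)):
            if a != b:
                c = correlation(raster, a, b)
                if c > 0:
                    widths[(a, b)] = scale * c
    return widths


raster = [
    [1, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
widths = edge_widths(raster, scale=3.0)
```

Because the denominator is the activity of the first region only, `correlation(raster, 0, 1)` and `correlation(raster, 1, 0)` generally differ. A display could draw either value or, for example, the larger of the two.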
Another exemplary embodiment includes plotting a cross-correlogram or histogram of events of interest in a particular region of interest with respect to events of interest in another region of interest, so that the histogram will reveal the number of times an event of interest in the first region of interest occurs a certain number of locations away from an event of interest in the second region of interest in the second simplified data array. The cross-correlogram can be plotted with respect to one region of interest, thus showing how many times an event of interest occurs before or after the occurrence of another event of interest in the same region of interest.
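The histogram described above can be sketched, under stated assumptions, as a count of event pairs at each lag. In this sketch the lag is the column index of the event in the first region minus that of the event in the second region, lags are limited to a window `max_lag`, and an event is not paired with itself when both regions are the same. The function name, the sign convention, and the windowing are assumptions, and plotting is left out.

```python
from collections import Counter


def cross_correlogram(raster, a, b, max_lag):
    """Histogram of lags between events in region a and events in region b.

    Returns a dict mapping each lag in [-max_lag, max_lag] to the number
    of event pairs separated by that many columns of the binary array.
    A positive lag means the event in region a occurs later.
    """
    times_a = [t for t, v in enumerate(raster[a]) if v]
    times_b = [t for t, v in enumerate(raster[b]) if v]
    counts = Counter()
    for ta in times_a:
        for tb in times_b:
            lag = ta - tb
            if a == b and lag == 0:
                continue  # skip pairing an event with itself
            if abs(lag) <= max_lag:
                counts[lag] += 1
    return {lag: counts[lag] for lag in range(-max_lag, max_lag + 1)}


raster = [
    [0, 1, 0, 0, 1, 0],
    [1, 0, 0, 1, 0, 0],
]
cross = cross_correlogram(raster, 0, 1, max_lag=2)
auto = cross_correlogram(raster, 0, 0, max_lag=3)
```

Passing the same index for both regions yields the single-region case, in which the histogram is symmetric about zero lag.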
Other exemplary embodiments include: performing Hidden Markov Modeling on the second simplified data array to determine a hidden Markov state sequence and displaying a cross-correlogram between events of interest occurring in one region of interest while that region is in one of the detected Markov states; and performing a singular value decomposition on the first simplified data array.
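As a hedged illustration of determining a hidden Markov state sequence, the sketch below applies standard Viterbi decoding to one row of the binary array. The model parameters `start`, `trans`, and `emit` are assumed to be known here and are illustrative values only; how the parameters would be estimated is not shown. The singular value decomposition is not sketched.

```python
import math


def viterbi(obs, start, trans, emit):
    """Most likely hidden state sequence for a discrete-observation HMM.

    obs:   list of observed symbols (here 0 or 1 for one region).
    start: start[s] is the probability of beginning in state s.
    trans: trans[r][s] is the probability of moving from state r to s.
    emit:  emit[s][o] is the probability of observing o in state s.
    All probabilities are assumed to be strictly positive.
    """
    n = len(start)
    score = [math.log(start[s]) + math.log(emit[s][obs[0]]) for s in range(n)]
    back = []  # back[t][s]: best predecessor of state s at step t + 1
    for o in obs[1:]:
        prev, score, ptr = score, [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] + math.log(trans[r][s]))
            ptr.append(best)
            score.append(prev[best] + math.log(trans[best][s])
                         + math.log(emit[s][o]))
        back.append(ptr)
    # Trace back from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


# Illustrative two-state model: state 0 is mostly silent, state 1 mostly active.
start = [0.5, 0.5]
trans = [[0.8, 0.2], [0.2, 0.8]]
emit = [[0.9, 0.1], [0.1, 0.9]]
states = viterbi([0, 0, 0, 1, 1, 1, 0, 0], start, trans, emit)
```

Given such a state sequence, events could be restricted to the frames in which the region is in a chosen state before a cross-correlogram is computed for that state.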
REFERENCES:
patent: 5215099 (1993-06-01), Haberl et al.
patent: 5608908 (1997-03-01), Barghouti et al.
patent: 6064770 (2000-05-01), Scarth et al.
patent: 6525712 (2003-02-01), Held
patent: 6728955 (2004-04-01), Berry et al.
patent: 9944062 (1999-09-01), None
“A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition” by Lawrence R. Rabiner, Proceedings of the IEEE, vol. 77, No. 2, Feb. 1989, pp. 257-285.
“The Segmental K-Means Algorithm for Estimating Parameters of Hidden Markov Models” by Juang et al., IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 38, No. 9, Sep. 1990, pp. 1639-1641.
“Hidden Markov Modelling of Simultaneously Recorded Cells in the Associative Cortex of Behaving Monkeys” by Gat et al., Comput. Neural Sys, vol. 8, 1997, pp. 297-322.
“Networks of Coactive Neurons in Developing Layer 1” by Schwartz et al., Neuron, vol. 20, Mar. 1998, pp. 541-552.
“Solution of Linear Algebraic Equations”, Chapter 2.0 Introduction, by Press, Numerical Recipes in C, 1992, pp. 29-31.
“Least-Squares Techniques” by Christopher Bishop, Neural Networks for Pattern Recognition, 1995, pp. 93, 171, and 260.
“Maximum-Likelihood Estimation for Mixture Multivariate Stochastic Observations of Markov Chains” by B. H. Juang, AT&T Technical Journal, vol. 64, No. 6, Jul.-Aug. 1985, pp. 1235-1249.
“Statistical Inference for Probabilistic Functions of Finite State Markov Chains” by Baum et al., The Annals of Mathematical Statistics, 1966, pp. 1554-1563.
“Mixture Autoregressive Hidden Markov Models for Speech Signals” by Juang et al., IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-33, No. 6, Dec. 1985, pp. 1404-1413.
“Maximum Likelihood Estimation for Multivariate Observations of Markov Sources” by Louis A. Liporace, IEEE Transactions on Information Theory, vol. IT-28, No. 5, Sep. 1982, pp. 729-734.
“Cluster Analysis and Data Visualization of Large-Scale Gene Expression Data” by George S. Michaels, Daniel B. Carr, Manor Askenazi, Stefanie Fuhrman, Xiling Wen, Roland Somogyi, XP-000974575, 1997, pp. 41-53.
Froemke Robert C.
Kumar Vikram S.
Yuste Rafael
Baker & Botts L.L.P.
Hoff Marc S.
Raymond Edward
The Trustees of Columbia University in the City of New York
Method and system for analyzing multi-dimensional data