Reexamination Certificate
2001-03-27
2004-01-06
Hoff, Marc S. (Department: 2857)
Data processing: measuring, calibrating, or testing
Measurement system
Statistical measurement
C706S013000, C704S236000
active
06675126
ABSTRACT:
CROSS-REFERENCE TO RELATED APPLICATIONS
Not Applicable.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a statistical analysis of data, specifically to a technology for estimating a measure of randomness of a function of at least one random variable.
2. Discussion of the Related Art
Frequently, performance measures of systems are based on the means of random variables. For example, the percentage of time that a machine is under repair is a function of the mean repair time divided by the mean time between the beginnings of repairs. It is important to distinguish between the mean of a function of at least one random variable and the function of the means of at least one random variable. In the case of the machine repair, it would be possible to divide the individual repair times by the individual times between the beginnings of repairs, and to obtain the mean of this ratio. However, this mean of the function would differ from the function of the means. Only the function of the means represents the correct percentage of time that the machine is under repair.
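The distinction can be illustrated with a minimal sketch. The data values below are hypothetical and are not taken from this specification; the sketch only shows that the function of the means agrees with the total repair time divided by the total elapsed time, while the mean of the per-cycle ratios does not.

```python
from statistics import mean

# Hypothetical data (an assumption for illustration): repair durations and
# the times between the starts of consecutive repairs, both in hours.
repair_times = [2.0, 1.0, 6.0, 3.0]
cycle_times = [10.0, 20.0, 12.0, 30.0]

# Function of the means: mean repair time divided by mean cycle time.
function_of_means = mean(repair_times) / mean(cycle_times)

# Mean of the function: average of the individual repair/cycle ratios.
mean_of_function = mean(r / c for r, c in zip(repair_times, cycle_times))

# Direct check: total time under repair divided by total elapsed time.
fraction_under_repair = sum(repair_times) / sum(cycle_times)

print(f"function of the means: {function_of_means:.4f}")
print(f"mean of the function:  {mean_of_function:.4f}")
print(f"total repair / total:  {fraction_under_repair:.4f}")
```

With these values the first and third quantities coincide, and the second differs from both.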
Frequently, the means of the random variables are not known exactly, but rather are estimated from a set of collected data. Therefore, these means may differ from the true means. Consequently, the function of the means may differ from the function of the true means. Frequently, there is interest in a measurement of the accuracy of the function of the means. This measurement of accuracy is usually expressed as a confidence interval around the mean or median, but may also be expressed as a variance, a standard deviation, or a quantile. While the calculation of such measures is well known in statistical analysis for individual random variables, it is more difficult for functions of the means.
Common uses of the function of at least one mean are frequencies of occurrences, where the mean frequency is the inverse of the mean time between occurrences. Another common use is percentages of time, where the mean percentage is the mean duration divided by the mean time between the starts of successive cycles.
One conventional method to calculate the confidence interval of the function of means is called batching, also known as the non-overlapping batch means method. In this method, a sufficiently large set of data is split into a number of subsets. The means for each subset are calculated, and subsequently the function of the means is calculated for each subset. A confidence interval can then be constructed from the different values of the function of means.
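The batching procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not an implementation taken from this specification: the function name `batch_means_interval`, the default of ten batches, the hard-coded Student-t critical value, and the simulated data are all hypothetical.

```python
import random
from statistics import mean, stdev

def batch_means_interval(xs, ys, func, num_batches=10, t_crit=2.262):
    """Non-overlapping batch means interval for func(mean(xs), mean(ys)).

    t_crit defaults to the two-sided 95% Student-t value for 9 degrees of
    freedom, which matches num_batches=10; change both together.
    """
    size = len(xs) // num_batches  # leftover observations are dropped
    values = []
    for b in range(num_batches):
        lo, hi = b * size, (b + 1) * size
        # Function of the means, evaluated separately on each subset.
        values.append(func(mean(xs[lo:hi]), mean(ys[lo:hi])))
    center = mean(values)
    half_width = t_crit * stdev(values) / num_batches ** 0.5
    return center - half_width, center + half_width

# Simulated data (an assumption): exponential repair times, and cycle times
# equal to the repair time plus an exponential running time.
rng = random.Random(0)
repairs = [rng.expovariate(1 / 3.0) for _ in range(1000)]
cycles = [r + rng.expovariate(1 / 15.0) for r in repairs]

low, high = batch_means_interval(repairs, cycles, lambda r, c: r / c)
print(f"95% interval for the fraction of time under repair: [{low:.4f}, {high:.4f}]")
```

Changing `num_batches` (with the matching critical value) changes the interval, which is the sensitivity to the number of subsets that this section describes.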
However, this conventional method is suitable only for data sets large enough to satisfy the central limit theorem. This method therefore cannot be used on small data sets. In addition, the confidence interval for a set of data can vary significantly with the number of subsets used. The selection of an unsuitable number of subsets may cause incorrect results. Furthermore, this method requires significant storage capacity and computational power as the size of the data set increases. Finally, due to the nature of the computation, these intensive calculations have to be repeated every time additional data becomes available.
Many approaches have been developed to assist the selection of the number of subsets for the above batching method. However, they are usually very complicated and require a high level of expertise. In addition, the results of these approaches may differ from one another. Furthermore, the computational requirements increase even further, as these approaches frequently require a significant statistical effort to analyze the subsets and the relations between them.
A variant of the above conventional batching method, known as the overlapping batch means method, creates overlapping subsets. While this variant may offer a slight improvement over the basic batching method, it still requires a large data set, the selection of a number of subsets, and significant storage and computational capacity. Furthermore, the variant remains complex and requires significant statistical knowledge.
BRIEF SUMMARY OF THE INVENTION
It is therefore an object of the present invention to permit the estimation of a measure of randomness of a function of at least one representative value of at least one random variable, even for a relatively small size of data set to be used, in a reduced time.
The object may be achieved according to any one of the following modes of this invention. Each of these modes of the invention is numbered like the appended claims, and depends from the other mode or modes, where appropriate. This type of explanation about the present invention is for better understanding of some instances of a plurality of technical features and a plurality of combinations thereof disclosed in this specification, and does not mean that the plurality of technical features and the plurality of combinations in this specification are interpreted to encompass only the following modes of this invention:
(1) A method of estimating a measure of randomness of a function of at least one representative value of at least one random variable, comprising:
a step of obtaining the at least one random variable;
a step of determining the at least one representative value of the obtained at least one random variable;
a step of determining a statistic of the obtained at least one random variable;
a step of determining a gradient of the function with respect to the determined at least one representative value; and
a step of transforming the obtained statistic of the at least one random variable into a statistic of the function, using the determined gradient.
As a result of his research, the inventor has found the following statistical characteristic. A statistic of a function of a random variable, which statistic may include a measure of randomness or dispersion, strongly tends to reflect the corresponding statistic of the random variable itself. Where the gradient of the function is steep, the statistic of the random variable appears enlarged in the statistic of the function; where the gradient is gentle, it appears reduced.
In addition, the above research revealed that exploiting this characteristic permits the estimation of a measure of randomness of a function of a representative value of a random variable with an accuracy almost equal to that of the conventional batching method described above, while using a smaller data set and requiring less time than the batching method.
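One way to read this characteristic is through the standard first-order (delta method) approximation from statistics. The relation below is offered as an illustrative assumption; it is not quoted from this specification, and the symbols $f$, $\bar{X}$, $\mu$, and $\Sigma$ are introduced here only for the illustration.

```latex
% Single random variable: \bar{X} is the sample mean, \mu its true mean.
\operatorname{Var}\!\left[f(\bar{X})\right] \approx \left(f'(\mu)\right)^{2} \operatorname{Var}\!\left[\bar{X}\right],
\qquad
\sigma_{f(\bar{X})} \approx \left|f'(\mu)\right| \, \sigma_{\bar{X}}

% Several random variables: \Sigma is the covariance matrix of the sample means.
\operatorname{Var}\!\left[f(\bar{X}_1,\dots,\bar{X}_k)\right] \approx \nabla f(\mu)^{\mathsf{T}} \, \Sigma \, \nabla f(\mu)
```

Under this reading, a gradient with magnitude greater than one enlarges the dispersion of the random variable in the function, and a gradient with magnitude less than one reduces it.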
On the basis of the above findings, in the above mode (1) of the present invention, at least one representative value of at least one random variable is determined and a statistic of the at least one random variable is determined. Furthermore, in the mode (1), a gradient of a function of the at least one random variable with respect to the determined at least one representative value is determined, and, by the use of the determined gradient, the determined statistic of the at least one random variable is transformed into a statistic of the function.
Hence, the mode (1) would permit the estimation of a measure of randomness of a function of at least one representative value, using a smaller data set and less time than the conventional batching method.
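The steps of mode (1) can be sketched for the machine-repair ratio as follows. This is a hedged illustration, not the inventor's implementation: it assumes the sample mean as the representative value, the variances and covariance of the sample means as the statistic, a first-order (delta method) transformation, independent observation pairs, and a normal quantile for the interval. The function name `delta_method_ratio` and the data values are hypothetical.

```python
from statistics import NormalDist, mean, variance

def delta_method_ratio(xs, ys, confidence=0.95):
    """Estimate mean(xs) / mean(ys) with a first-order confidence interval."""
    n = len(xs)
    # Representative values: the sample means.
    mx, my = mean(xs), mean(ys)
    # Statistic of the random variables: variances and covariance of the means.
    var_mx = variance(xs) / n
    var_my = variance(ys) / n
    cov_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1) / n
    # Gradient of f(x, y) = x / y evaluated at the representative values.
    gx = 1.0 / my
    gy = -mx / my ** 2
    # Transform the statistic of the variables into a statistic of the function.
    var_f = gx * gx * var_mx + 2.0 * gx * gy * cov_xy + gy * gy * var_my
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    estimate = mx / my
    half_width = z * max(var_f, 0.0) ** 0.5
    return estimate, estimate - half_width, estimate + half_width

# Hypothetical data: repair durations and times between repair starts (hours).
repair_times = [2.0, 1.0, 6.0, 3.0]
cycle_times = [10.0, 20.0, 12.0, 30.0]

est, low, high = delta_method_ratio(repair_times, cycle_times)
print(f"estimate {est:.4f}, interval [{low:.4f}, {high:.4f}]")
```

No splitting into subsets is needed, so the sketch runs on a data set far smaller than batching would accept; for very small samples a Student-t quantile may be more appropriate than the normal quantile assumed here.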
The term “representative value” may be defined, in the above mode (1) and other modes of the present invention, to mean a measure of central tendency of a distribution of a plurality of individual data values belonging to the at least one random variable or the function, for instance.
Further, in the case of a plurality of random variables or a plurality of functions, the term “representative value” may be defined, in the above mod
Charioui Mohamed
Hoff Marc S.
Kabushiki Kaisha Toyota Chuo Kenkyusho