Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
2001-11-28
2004-04-20
Rones, Charles (Department: 2175)
Data processing: database and file management or data structures
Database design
Data structure types
C707S793000, C707S793000
Reexamination Certificate
active
06725234
ABSTRACT:
FIELD OF THE INVENTION
The invention relates to a partitioning method of partitioning a set of objects, comprising:
a first iterative partitioning step, which utilizes a first partition formed by classes containing an object and which, at each iteration, combines two classes that maximize a criterion of similarity for creating a new partition used as an initial partition for the next step, until a partition is obtained that contains only a single class,
a selection step of selecting from the obtained partitions a partition which comprises an optimal number of classes.
The invention also relates to a multilevel partitioning method of partitioning a set of objects, which utilizes such a partitioning method.
The invention finally relates to a method of searching a predetermined number of objects which are closest to an example in a tree-like structure of objects created by using such a multilevel partitioning method.
The invention further relates to computer programs for implementing such methods. The invention also relates to equipment which comprises data storage and data processing means, of which notably means for implementing such methods. The invention finally relates to a data transmission system which comprises at least such equipment.
The invention has interesting applications in the field of data classification, notably data relating to audio and/or video, for example, MPEG-7 descriptions.
Data storage and transmission capacities augment considerably, so much so that in many fields including the field of consumer electronics, the user, however, has difficulty in managing the information he or she has at his or her disposal. In this context the methods of partitioning a set of objects become ever more important. They notably apply to the navigation in a set of objects and to the search for objects.
BACKGROUND OF THE INVENTION
The article <<Symbolic Clustering using a new dissimilarity measure>> by K. Chidananda Gowda and E. Diday, published in the journal <<Pattern Recognition>>, vol. 24 no. 6, pp. 567 -578, 1991, describes a partitioning method of a set of objects as cited in the introductory paragraph and called <<Hierarchical Agglomerative Clustering Methodology>>.
This partitioning method provides an optimal number of classes. But, the obtained partition is not well adapted to the search for objects.
SUMMARY OF THE INVENTION
The invention notably has for its object to propose a partitioning method of partitioning a set of objects particularly well adapted to the search for objects. For this purpose, a method according to the invention and as described in the opening paragraph is characterized in that it comprises:
a step of calculating elements which are representative of classes of the selected partition,
a second iterative partitioning step which utilizes the selected partition as a first current partition and which, at each iteration, creates a new partition by associating each object of said set with the element which represents the current partition it is closest to, and calculates elements which represent classes of the new partition, the new partition forming the current partition for the next iteration, until the elements which are representative of the new partition are identical with the elements which are representative of the current partition.
Thus the method according to the invention comprises determining in a first stage a partition that contains an optimal number of classes, and then modifying the classes so that their contents are adapted to the search for objects.
It should be observed that the article “An efficient K-means clustering algorithm” by K. Alsabti, S. Ranka and V. Singh, published on the occasion of the “IPPS/SPDP Workshop on High Performance Data Mining, 1998, Orlando, Fla.” describes a partitioning method called “K-means” which permits to divide a set of objects into a number N of classes known a priori, by choosing N start prototypes. This method is iterative. At each iteration each object is combined with the prototype it is closest to so as to determine classes, after which the prototype of each class is recalculated. The new obtained prototype serves as a start prototype for the next iteration. The final partition is obtained when the convergence is reached, that is to say, when all the new obtained prototypes correspond to the start prototypes at the end of an iteration.
This method makes it necessary to know a priori the number of classes of the partition to be constructed and to have a set of start prototypes.
The invention thus notably comprises the use of a “hierarchical agglomerative clustering” method for the initialization of a method of the type “K-means”, so as to provide a partitioning method whose number of classes and the contents of the classes are optimal as regards the search for objects.
In a general manner, the invention can be applied to any type of objects provided that a similarity measure f is defined for this type of objects and that this measure of similarity satisfies the following conditions:
f is an application which associates two objects with a real number,
this real number is identical whatever the order in which the two objects are considered,
the real number associated with two identical objects is higher than the real number associated with two different objects.
In a particularly interesting embodiment of the invention said objects are meta data, that is to say, structures containing a set of data. They are, for example, descriptions of video shots, notably descriptions of the MPEG-7 type. The MPEG-7 standard project defines indeed a certain number of descriptors for video shots (color descriptors, text descriptors, camera movement descriptors . . . ), as well as similarity measures associated with these descriptors. For more details reference is made to the document ISO/IEC JTC1/SC29/WG11 N3521 (July 2000) entitled “Coding of moving pictures and associated audio information”, which refers to the document “Visual Working Draft” version 4.0.
A second object of the invention is to propose a multilevel partitioning method of a set of objects. In accordance with the invention such a method comprises:
a partitioning step which utilizes a partitioning method as described above for determining a partition of a group of objects,
a step of determining elements which are representative of classes of the obtained partition,
a step of storing said representative elements in a tree-like structure,
said steps being executed a first time with a group of objects formed by said set, then repeated with groups of objects formed by the classes of the obtained partitions so as to obtain classes that satisfy a predetermined criterion,
a step of storing the last partition in said tree-like structure.
The tree-like structure thus obtained forms a partition of the set of objects which has several levels. This type of partition is particularly advantageous for searching in a set of a large number of objects, because it permits to accelerate the search. Actually, with a single-level partition, when the size of the set of objects increases significantly, this brings about either the increase of the number of classes, or the increase of the number of objects contained in each class. In either one of the two cases one is led to compare the example searched for with a much larger number of objects. The processing time thus increases considerably. On the contrary, with a multilevel partition, the example searched for is only compared with a restricted number of objects at each level of the partition. The increase of the size of the set thus has much less influence on the processing time of the search.
A third object of the invention is to propose a method of searching a predetermined number of objects which are closest to an example in a tree-like structure of objects, which comprises nodes and leaves and which has been created with a multilevel partitioning method as described above is used.
According to the invention such a search method comprises the following steps:
a step of passing thro
Mory Benoit
Santini Nicolas
Abel Jalil Neveen
Rones Charles
LandOfFree
Methods of partitioning a set of objects and search method... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Methods of partitioning a set of objects and search method..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Methods of partitioning a set of objects and search method... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-3207684