
4.4    Representation of Spatial Information

In the last section, an indexing method for generating the 1-D Class Key and the hierarchical classification was introduced. However, it mainly represents global colour information (or features) in an image. This section starts to explore the method of representing spatial information (or features) in an image.

4.4.1    The 2-D Index Vector

Section 3.4.3 introduced Harmon’s work (1973) on image representation by quantisation. That work inspired the author to apply quantisation to the development of an indexing method for representing the spatial information of an image.

Combination of Image Quantisation and the Perceptual Colour Model

This section presents a novel indexing method that combines image quantisation with the Perceptual Colour Model. To represent the information in quantised images, a 2-D Index Vector is defined.

The 2-D Index Vector contains 15 × 15 components, each of which is a Colour Descriptor. Each Colour Descriptor corresponds to the actual colour of one block in a quantised image. The definition of the Index Vector is given in Equation (4.2).
 
(4.2)

Thus the Index Vector can represent a maximum of 10^225 (that is, 10^(15 × 15)) different patterns, since there are 10 different Colour Descriptors in the Perceptual Colour Model.
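
The structure and the pattern count described above can be sketched in a few lines of Python. This is only an illustration: the integer codes 0 to 9 are an assumed stand-in for the ten Colour Descriptors, and the names SIZE, NUM_DESCRIPTORS, make_index_vector and is_valid_index_vector are hypothetical, not taken from the prototype system.

```python
# Assumption: integer codes 0..9 stand in for the ten Colour Descriptors.
SIZE = 15
NUM_DESCRIPTORS = 10


def make_index_vector(fill=0):
    """Return a SIZE x SIZE grid (a list of rows) of descriptor codes."""
    return [[fill] * SIZE for _ in range(SIZE)]


def is_valid_index_vector(vec):
    """Check the shape and that every component is a known descriptor code."""
    return (
        len(vec) == SIZE
        and all(len(row) == SIZE for row in vec)
        and all(0 <= code < NUM_DESCRIPTORS for row in vec for code in row)
    )


# Each of the 225 components independently takes one of 10 values.
pattern_count = NUM_DESCRIPTORS ** (SIZE * SIZE)
```

Here pattern_count equals 10^225, the maximum number of distinct patterns stated in the text.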

Issue of the Size of the Index Vector

The current size of the Index Vector, 15 × 15, was suggested by an informal investigation (cf. Figure 3.3 and Section 3.4.3) and by previous work (Harmon, 1973; Harmon and Julesz, 1973). Harmon (1973, p.73) stated that "Our informal investigation revealed that a spatial resolution of 16 x 16 squares was very close to the minimum resolution that allows identification."

Once the method has been implemented and evaluated, further study could explore the minimum size of the Index Vector in order to minimise storage.

4.4.2    The Indexing Scheme

To describe the indexing scheme based on image quantisation clearly, three "visually similar" images are used, shown in Figure 4.10(a). The author emphasises that all three images, which contain the "blue" of the sky and the "brown" of sand, architecture, and the cannon, appear similar in their overall colours.

Quantising Operations

First, the three original images were quantised to 15 × 15 and represented as 15 × 15 colour "blocks", as shown in Figure 4.10(b). Human eyes are likely still able to identify these fuzzy images, since the major colour appearances remain present in each image at their corresponding spatial positions. Several methods can assist in identifying quantised images, such as slightly closing one's eyes (cf. Section 3.4.3).
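
One simple way to produce such colour blocks is block averaging, sketched below. This is an assumed technique for illustration only: the prototype relied on the scaling functions of its development tool, whose algorithm is not specified here. The function name quantise and the representation of an image as a list of rows of (r, g, b) tuples are hypothetical.

```python
def quantise(image, grid=15):
    """Reduce an image to grid x grid colour blocks by block averaging.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    Assumes the image is at least `grid` pixels wide and high.
    """
    height, width = len(image), len(image[0])
    blocks = []
    for by in range(grid):
        # Row range covered by this block (integer arithmetic, no gaps).
        y0, y1 = by * height // grid, (by + 1) * height // grid
        row = []
        for bx in range(grid):
            x0, x1 = bx * width // grid, (bx + 1) * width // grid
            pixels = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            n = len(pixels)
            # Mean colour of the block, one channel at a time.
            row.append(tuple(sum(p[c] for p in pixels) // n for c in range(3)))
        blocks.append(row)
    return blocks
```

Because the result always has grid rows and grid columns, images with different aspect ratios map onto the same 15 × 15 layout, which is the property the square patterns rely on.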

Scaling into Square Patterns

To demonstrate clearly how the similarities between these quantised images are compared, they were scaled into a uniform visual representation: a "square pattern" of the same size and scale. The results are shown in Figure 4.10(c).

Although these quantised images are likely to be difficult for human eyes to identify, this operation merely changed the "aspect ratios" of the quantised images shown in Figure 4.10(b). The quantised images shown in Figure 4.10(c) contain the same number and colours of "square patterns" as those shown in Figure 4.10(b).
 


Figure 4.10: The Processes of the Indexing Scheme Based on Image Quantisation
(a) Three Similar Images;  (b) Transformation of Image Quantisation;
(c) Transformation of Scaling to Square; (d) Transformation of Index Vector.

Filtering and Indexing Operations

Figure 4.10(c) illustrates that these quantised images have the same size and scale (15 × 15). The filtering operation was used to transform all of the actual colours in the quantised images into the Colour Descriptors defined in the Perceptual Colour Model. Note that the algorithm of this filtering operation is the same as that of the filtering operation in Section 4.3.2.

Finally, the representations of the three "similar images", based on quantised images and using the Colour Descriptors, are illustrated in Figure 4.10(d). These are the 2-D Index Vectors presented in Section 4.4.1.
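
The step from colour blocks to an Index Vector can be sketched as a nearest-colour lookup. This is not the filtering algorithm of Section 4.3.2, which is not reproduced here. The palette below is a hypothetical placeholder: its ten names and RGB values are assumptions chosen only to make the example run, and they are not the Colour Descriptors of the Perceptual Colour Model.

```python
# HYPOTHETICAL palette: placeholder names and RGB values, not the actual
# Colour Descriptors of the Perceptual Colour Model.
PALETTE = {
    "black": (0, 0, 0),
    "grey": (128, 128, 128),
    "white": (255, 255, 255),
    "red": (255, 0, 0),
    "orange": (255, 165, 0),
    "yellow": (255, 255, 0),
    "green": (0, 128, 0),
    "blue": (0, 0, 255),
    "purple": (128, 0, 128),
    "brown": (150, 75, 0),
}


def nearest_descriptor(rgb):
    """Return the palette name closest to `rgb` (squared RGB distance)."""
    return min(
        PALETTE,
        key=lambda name: sum((a - b) ** 2 for a, b in zip(rgb, PALETTE[name])),
    )


def to_index_vector(blocks):
    """Map a grid of block colours to a grid of descriptor names."""
    return [[nearest_descriptor(colour) for colour in row] for row in blocks]
```

Applied to a 15 × 15 grid of block colours, to_index_vector yields a 15 × 15 grid of descriptor names, which has the shape of the 2-D Index Vector.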

4.4.3    Supportive Techniques on Image Processing

The image acquisition devices and image processing techniques that Harmon (1973) used to produce quantised images were not well developed (cf. Section 3.4.3). The method is likely to be much easier to perform with current techniques and devices, especially given their capability of dealing with high-resolution colour images.

On the other hand, Harmon did not state whether the method could still be used for images with different aspect ratios (i.e. different widths and heights). The generation of the 2-D Index Vector, however, includes a "scaling operation".

It is therefore necessary to investigate modern image processing techniques for producing quantised images. Some possible approaches are briefly outlined as follows.

Image scaling algorithms, which reduce an image from its actual size to a smaller size by horizontal and vertical transforms, will achieve the purpose. Some methods can be found in Lindley (1991, pp.426-438), Baxes (1994, pp.373-374), and Lindley (1995, pp.23-26).

Alternatively, existing wavelet transform algorithms achieve similar effects. Some work can be found in Castleman (1996, pp.303-349) and Parker (1997, pp.250-274).
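
As an illustration of how a wavelet transform reduces resolution, the sketch below applies one level of the Haar transform, in its unnormalised averaging form, as a horizontal pass followed by a vertical pass. The choice of the Haar wavelet is an assumption made for simplicity; the cited texts cover other wavelets, and the function names haar_step and haar_approximation are hypothetical.

```python
def haar_step(values):
    """One level of the unnormalised Haar transform on an even-length list.

    Returns (averages, details): the averages are the half-resolution
    signal, and the details hold the information that was removed.
    """
    pairs = list(zip(values[0::2], values[1::2]))
    averages = [(a + b) / 2 for a, b in pairs]
    details = [(a - b) / 2 for a, b in pairs]
    return averages, details


def haar_approximation(image):
    """Half-resolution approximation of a greyscale image.

    `image` is a list of rows of numbers with even width and height.
    """
    rows = [haar_step(row)[0] for row in image]                # horizontal pass
    cols = [haar_step(list(col))[0] for col in zip(*rows)]     # vertical pass
    return [list(row) for row in zip(*cols)]                   # back to rows
```

Each application halves the width and height, so repeated application moves an image towards a coarse grid of averaged blocks similar to a quantised image.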

Issue of Information Theory

Since the indexing scheme involves the transformation of image quantisation, which reduces the resolution of the original images in order to represent "spatial information", there is a need to discuss the influence of image resolution on image information content.

"Information theory" provides the basic concepts that are useful to deal with information representation and manipulation directly and quantitatively, which are discussed in the research community of image processing (Jain, 1989, pp.41-44; Gonzalez and Woods, 1993, pp.325-343). Gonzalez and Woods (1993, p.324) stated that "the fundamental premise of the information theory is that the generation of information can be modelled as a probabilistic process that can be measured in a manner that agrees with intuition."

The use of digital images, like other types of information media, involves the transformations of encoding and decoding (Gonzalez and Woods, 1992; Brebner, 1997). From the viewpoint of an image retrieval system, the indexing processes perform the encoding transformation and the retrieval processes perform the decoding transformation. Given the purpose of indexing, the encoding results in a loss of image information content.

To estimate the information content of a representation, the idea of entropy is used to measure the average information generated by a source, such as an image (Jain, 1989; Gonzalez and Woods, 1992; Brebner, 1997). Suppose that a piece of information from the source, such as image resolution, can take n different values, and that the probability of each value on any particular occasion is p. Mathematically, the entropy is defined as:
 
(4.3)
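
The body of Equation (4.3) is not reproduced above. Assuming it is the standard Shannon entropy, that is, the negative sum over the n values of p multiplied by the logarithm of p, a minimal Python sketch is given below. The base-2 logarithm (entropy in bits) and the function names entropy and entropy_of_values are assumptions.

```python
import math
from collections import Counter


def entropy(probabilities):
    """Shannon entropy in bits: -sum(p * log2(p)) over non-zero p."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def entropy_of_values(values):
    """Entropy of the empirical distribution of a sequence of values."""
    counts = Counter(values)
    total = len(values)
    return entropy(count / total for count in counts.values())
```

For example, if the ten Colour Descriptors were equally likely, each component of the Index Vector would carry log2(10), or about 3.32 bits. This gives an upper bound only, since real images do not use the descriptors uniformly.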

However, there is a limitation to the use of the idea of entropy. Brebner (1997, p.47) pointed out two main reasons why the results are usually not applicable in practical circumstances:

At present, the entropy of the indexing scheme for generating the 2-D Index Vector, which comprises quantising images and representing them by the Colour Descriptors, has not yet been investigated. Since some values needed for entropy measures, such as the actual image sizes, vary and are difficult to define, further research may be needed to overcome these practical limitations.

Issue of the Resolution for Image Representation

Ballard and Brown (1982, p.106) discussed the issues of the best spatial resolution for an image. They stated that "the sampling theorem states that the maximum spatial frequency in the image data must be less than half the sampling frequency in order that the sampled image represent the original unambiguously."

However, Ballard and Brown (1982, p.106) stated that "the sampling theorem is not a good predictor of how easily objects can be recognised by computer programs. Often objects can be more easily recognised in images that have a very low sampling rate." They gave two reasons:

Gonzalez and Woods (1992, p.33) also stated that "A ‘good’ image is difficult to define, because image quality not only is highly subjective, but it is also strongly dependent on the requirements of a given application."

Therefore, the resolution of 15 × 15 was chosen with reference to the work of Harmon (Harmon, 1973; Harmon and Julesz, 1973). The author emphasises that this resolution meets the current purpose of developing an algorithm of similarity measures for accessing the overall "spatial information" of general photographic images.

In addition, the odd number 15 was chosen because it may make "the centre of an image" easier to identify, which may be used to develop further algorithms for detecting objects in an image.

The Current Algorithm

To implement the prototype system, the author used the existing scaling functions provided by the software development tool. The 15 × 15 quantised images are produced by reducing the resolution to 15 × 15 in width and height. In other words, the operation can be regarded as shrinking the images to a size of 15 × 15.

For the purpose of demonstrating the method, the author notes that different algorithms may produce different results when representing an image, but believes that the slight differences would not have a great influence on retrieval effectiveness.

Assessing the Confidence Level of Image Quantisation

Since the size of 15 × 15 was originally chosen according to the author’s observation and experience, an informal experiment was conducted to assess this choice. In other words, the experiment aimed to discover how low the resolution of the quantised image may be while striking a balance between the loss of image information and the loss of differentiation between different images.

The experiment used the algorithm of similarity measures to calculate the similarities between a query image and 12 sample images at 30 different quantisation levels. Specifically, the method of "Query-by-Image-Example" was used (cf. Section 4.5.2).

These 12 sample images were randomly selected from an image collection and were expected to include, on average, some images that are "very similar", "similar", or "dissimilar" to the query image. The sample images are illustrated in Appendix D.1.

The similarities of these 12 sample images at 30 different levels of resolution are listed in Appendix D.2.

To examine the quantisation level of images, the experimental data were analysed using a two-tailed statistical test. The null hypothesis is that the similarity measure maintains the rank order of images regardless of the quantisation level used.

Given a 95% confidence interval for 30 different levels of resolution, the critical value of the statistic (t) that separates the rejection region from the acceptance region is 2.045, as the number of degrees of freedom is 29 (i.e. 30 − 1) (cf. Moore and McCabe, 1998, p.T-11). As Sims (1987, p.443) points out, "there is no scientific justification for testing hypothesis at the 5% significance level in levels is common because testing at fixed levels facilitates communication."

The t-scores of the data, obtained using the t-test (cf. Section 6.2.6), are listed in Appendix D.3. A result falls within the acceptance region when the absolute value of t is smaller than 2.045. Observing the results of the t-test for the 12 image samples, the sample data for quantisation levels greater than 13 are, overall, theoretically acceptable.
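
The comparison against the critical value can be sketched as follows. The statistic actually used is defined in Section 6.2.6 and is not reproduced here; the generic one-sample t-score below is an assumption, and the names t_score and in_acceptance_region are hypothetical. Only the critical value 2.045 (two-tailed, 5% level, 29 degrees of freedom) comes from the text.

```python
import math
import statistics

# Two-tailed critical value at the 5% level for 29 degrees of freedom,
# as quoted in the text.
T_CRITICAL = 2.045


def t_score(sample, hypothesised_mean):
    """One-sample t-score: (mean - mu) / (s / sqrt(n))."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    return (mean - hypothesised_mean) / (sd / math.sqrt(n))


def in_acceptance_region(t, critical=T_CRITICAL):
    """True when |t| is below the critical value (null hypothesis kept)."""
    return abs(t) < critical
```

Note that the critical value 2.045 applies to samples of 30 observations; for a different sample size the value would have to be looked up again in a t table.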

Finally, the author concludes from the results of the experiment that at quantisation level 13 there is 95% confidence (or better) that dissimilar images are in fact still dissimilar. That is, the quantised images may be used to distinguish similar images from dissimilar images for the purpose of image retrieval. Further, the amount of detail in an image is not correlated with retrieval performance at different quantisation levels, provided the quantisation level is greater than about 13. Therefore, the use of the size 15 × 15 in the prototype system was reasonable.


Sam Lai
28 January 2000