| 13 | | 4. Wide variety of customization capabilities. Basic index and storage characteristics like |
|---|
| 14 | | the page size, node capacity, minimum fan-out, splitting algorithm, etc. should be |
|---|
| 15 | | easy to customize. |
|---|
| 16 | | 5. Index persistence. Internal memory and external memory structures should be supported. |
|---|
| 17 | | Clustered and non-clustered indices should be easy to be persisted. |
|---|
| 18 | | |
|---|
| 19 | | 2. INSTALLING |
|---|
| 20 | | |
|---|
| 21 | | To install the library you need to do the |
|---|
| 22 | | following: |
|---|
| | 33 | 4. Wide variety of customization capabilities. Basic index and storage |
|---|
| | 34 | characteristics like the page size, node capacity, minimum fan-out, |
|---|
| | 35 | splitting algorithm, etc. should be easy to customize. |
|---|
| | 36 | 5. Index persistence. Internal memory and external memory structures |
|---|
| | 37 | should be supported. Clustered and non-clustered indices should |
|---|
| | 38 | be easy to be persisted. |
|---|
| | 39 | |
|---|
| | 40 | ------------------------------------------------------------------------------ |
|---|
| | 41 | Installation |
|---|
| | 42 | ------------------------------------------------------------------------------ |
|---|
| | 43 | |
|---|
| | 44 | To install the library you need to do the following: |
|---|
| 61 | | g++ MyFile.cc -o MyFile -L/home/marioh/usr/lib -I/home/marioh/usr/include -lpthread -ltools -lspatialindex |
|---|
| 62 | | |
|---|
| 63 | | If the library is installed in the default /usr/local |
|---|
| 64 | | path, then the -I and -L options are not necessary. |
|---|
| 65 | | |
|---|
| 66 | | If you are compiling on Mac OS X you will probably |
|---|
| 67 | | have to add the -bind_at_load option when linking |
|---|
| 68 | | against the dynamic link libraries. |
|---|
| 69 | | |
|---|
| 70 | | 4. LIBRARY OVERVIEW |
|---|
| | 85 | |
|---|
| | 86 | :: |
|---|
| | 87 | |
|---|
| | 88 | g++ MyFile.cc -o MyFile -L/home/marioh/usr/lib -I/home/marioh/usr/include -lpthread -lspatialindex |
|---|
| | 89 | |
|---|
| | 90 | If the library is installed in the default /usr/local path, then the |
|---|
| | 91 | -I and -L options are not necessary. |
|---|
| | 92 | |
|---|
| | 93 | If you are compiling on Mac OS X you will might need to add the -bind_at_load |
|---|
| | 94 | option when linking against the dynamic link libraries. OS X Tiger should |
|---|
| | 95 | work out of the box, however, with XCode 3.0. |
|---|
| | 96 | |
|---|
| | 97 | ------------------------------------------------------------------------------ |
|---|
| | 98 | Library Overview |
|---|
| | 99 | ------------------------------------------------------------------------------ |
|---|
| 80 | | I will briefly present the basic features supported by each package. For more details you will |
|---|
| 81 | | have to refer to the code, for now. |
|---|
| 82 | | |
|---|
| 83 | | 4.1 SPATIALINDEX UTILITIES |
|---|
| 84 | | |
|---|
| 85 | | To provide common constructors and uniform initialization for all objects provided by the library |
|---|
| 86 | | a PropertySet class is provided. A PropertySet associates strings with Variants. Each property |
|---|
| 87 | | corresponds to one string. |
|---|
| 88 | | |
|---|
| 89 | | A basic implementation of a Variant is also provided that supports a number of data types. The |
|---|
| 90 | | supported data types can be found in SpatialIndex.h |
|---|
| | 109 | I will briefly present the basic features supported by each package. |
|---|
| | 110 | For more details you will have to refer to the code, for now. |
|---|
| | 111 | |
|---|
| | 112 | Spatial Index Utilities |
|---|
| | 113 | ------------------------------------------------------------------------------ |
|---|
| | 114 | |
|---|
| | 115 | To provide common constructors and uniform initialization for all objects |
|---|
| | 116 | provided by the library a PropertySet class is provided. A PropertySet |
|---|
| | 117 | associates strings with Variants. Each property corresponds to one string. |
|---|
| | 118 | |
|---|
| | 119 | A basic implementation of a Variant is also provided that supports a |
|---|
| | 120 | number of data types. The supported data types can be found in SpatialIndex.h |
|---|
| 93 | | 1.getProperty returns the Variant associated with the given string. |
|---|
| 94 | | 2.setProperty associates the given Variant with the given string. |
|---|
| 95 | | 3.removeProperty removes the specified property from the PropertySet. |
|---|
| 96 | | |
|---|
| 97 | | A number of exceptions are also defined here. All exceptions extend Exception and thus provide |
|---|
| 98 | | the what() method that returns a string representation of the exception with useful comments. |
|---|
| 99 | | It is advisable to use enclosing try/catch blocks when using any library objects. Many constructors |
|---|
| 100 | | throw exceptions when invalid initialization properties are specified. |
|---|
| 101 | | |
|---|
| 102 | | A general IShape interface is defined. All shape classes should extend IShape. Basic Region |
|---|
| 103 | | and Point classes are already provided. Please check Region.h and Point.h for further details. |
|---|
| 104 | | |
|---|
| 105 | | 4.2 STORAGE MANAGER |
|---|
| 106 | | |
|---|
| 107 | | The library provides a common interface for storage management of all indices. It consists of the |
|---|
| 108 | | IStorageManager interface, which provides functions for storing and retrieving entities. An entity |
|---|
| 109 | | is viewed as a simple byte array; hence it can be an index entry, a data entry or anything else that |
|---|
| 110 | | the user wants to store. The storage manager interface is generic and does not apply only to spatial |
|---|
| 111 | | indices. |
|---|
| 112 | | |
|---|
| 113 | | Classes that implement the IStorageManager interface decide on how to store entities. A |
|---|
| 114 | | simple main memory implementation is provided, for example, that stores the entities using a vector, |
|---|
| 115 | | associating every entity with a unique ID (the entry's index in the vector). A disk based storage |
|---|
| 116 | | manager could choose to store the entities in a simple random access file, or a database storage manager |
|---|
| 117 | | could store them in a relational table, etc. as long as unique IDs are associated with every entity. |
|---|
| 118 | | Also, storage managers should implement their own paging, compaction and deletion policies transparently |
|---|
| | 123 | |
|---|
| | 124 | 1. getProperty returns the Variant associated with the given string. |
|---|
| | 125 | 2. setProperty associates the given Variant with the given string. |
|---|
| | 126 | 3. removeProperty removes the specified property from the PropertySet. |
|---|
| | 127 | |
|---|
| | 128 | A number of exceptions are also defined here. All exceptions extend |
|---|
| | 129 | Exception and thus provide the what() method that returns a string |
|---|
| | 130 | representation of the exception with useful comments. It is advisable to |
|---|
| | 131 | use enclosing try/catch blocks when using any library objects. Many |
|---|
| | 132 | constructors throw exceptions when invalid initialization properties are specified. |
|---|
| | 133 | |
|---|
| | 134 | A general IShape interface is defined. All shape classes should extend |
|---|
| | 135 | IShape. Basic Region and Point classes are already provided. Please |
|---|
| | 136 | check Region.h and Point.h for further details. |
|---|
| | 137 | |
|---|
| | 138 | Storage Manager |
|---|
| | 139 | ------------------------------------------------------------------------------ |
|---|
| | 140 | |
|---|
| | 141 | The library provides a common interface for storage management of all |
|---|
| | 142 | indices. It consists of the IStorageManager interface, which provides functions |
|---|
| | 143 | for storing and retrieving entities. An entity is viewed as a simple byte |
|---|
| | 144 | array; hence it can be an index entry, a data entry or anything else that the |
|---|
| | 145 | user wants to store. The storage manager interface is generic and does not apply |
|---|
| | 146 | only to spatial indices. |
|---|
| | 147 | |
|---|
| | 148 | Classes that implement the IStorageManager interface decide on how to |
|---|
| | 149 | store entities. simple main memory implementation is provided, for example, |
|---|
| | 150 | that stores the entities using a vector, associating every entity with a |
|---|
| | 151 | unique ID (the entry's index in the vector). A disk based storage manager |
|---|
| | 152 | could choose to store the entities in a simple random access file, or a |
|---|
| | 153 | database storage manager could store them in a relational table, etc. as long |
|---|
| | 154 | as unique IDs are associated with every entity. Also, storage managers should |
|---|
| | 155 | implement their own paging, compaction and deletion policies transparently |
|---|
| 121 | | The storeByteArray method gets a byte array and its length and an entity ID. If the caller |
|---|
| 122 | | specifies NewPage as the input ID, the storage manager allocates a new ID, stores the entity and returns |
|---|
| 123 | | the ID associated with the entity. If, instead, the user specifies an already existing ID the storage |
|---|
| 124 | | manager overwrites the old data. An exception is thrown if the caller requests an invalid ID to be overwritten. |
|---|
| 125 | | |
|---|
| 126 | | The loadByteArray method gets an entity ID and returns the associated byte array along with its length. If an |
|---|
| 127 | | invalid ID is requested, an exception is thrown. |
|---|
| | 158 | The storeByteArray method gets a byte array and its length and an entity ID. |
|---|
| | 159 | If the caller specifies NewPage as the input ID, the storage manager allocates |
|---|
| | 160 | a new ID, stores the entity and returns the ID associated with the entity. |
|---|
| | 161 | If, instead, the user specifies an already existing ID the storage manager |
|---|
| | 162 | overwrites the old data. An exception is thrown if the caller requests |
|---|
| | 163 | an invalid ID to be overwritten. |
|---|
| | 164 | |
|---|
| | 165 | The loadByteArray method gets an entity ID and returns the associated byte |
|---|
| | 166 | array along with its length. If an invalid ID is requested, an exception is thrown. |
|---|
| 131 | | The storage managers should have no information about the types of entities that are stored. |
|---|
| 132 | | There are three main reasons for this decision: |
|---|
| 133 | | 1.Any number of spatial indices can be stored in a single storage manager |
|---|
| 134 | | (i.e. the same relational table, or binary file, or hash table, etc., can be used to store many indices) |
|---|
| 135 | | using an arbitrary number of pages and a unique index ID per index (this will be discussed shortly). |
|---|
| 136 | | 2.Both clustered and non-clustered indices can be supported. A clustered index stores the data associated |
|---|
| 137 | | with the entries that it contains along with the spatial information that it indexes. A non-clustered |
|---|
| 138 | | index stores only the spatial information of its entries. Any associated data are stored separately |
|---|
| 139 | | and are associated with the index entries by a unique ID. To support both types of indices, the storage |
|---|
| 140 | | manager interface should be quite generic, allowing the index to decide how to store its data. |
|---|
| 141 | | Otherwise clustered and non-clustered indices would have to be implemented separately. |
|---|
| 142 | | 3.Decision flexibility. For example, the users can choose a clustered index that will take care of storing |
|---|
| 143 | | everything. They can choose a main memory non-clustered index and store the actual data in MySQL. |
|---|
| 144 | | They can choose a disk based non-clustered index and store the data manually in a separate binary file |
|---|
| 145 | | or even in the same storage manager but doing a low level customized data processing. |
|---|
| | 170 | The storage managers should have no information about the types of entities |
|---|
| | 171 | that are stored. There are three main reasons for this decision: |
|---|
| | 172 | |
|---|
| | 173 | 1. Any number of spatial indices can be stored in a single storage manager |
|---|
| | 174 | (i.e. the same relational table, or binary file, or hash table, etc., can |
|---|
| | 175 | be used to store many indices) using an arbitrary number of pages and |
|---|
| | 176 | a unique index ID per index (this will be discussed shortly). |
|---|
| | 177 | 2. Both clustered and non-clustered indices can be supported. A clustered |
|---|
| | 178 | index stores the data associated with the entries that it contains along |
|---|
| | 179 | with the spatial information that it indexes. A non-clustered index stores |
|---|
| | 180 | only the spatial information of its entries. Any associated data are |
|---|
| | 181 | stored separately and are associated with the index entries by a unique ID. |
|---|
| | 182 | To support both types of indices, the storage manager interface should be |
|---|
| | 183 | quite generic, allowing the index to decide how to store its data. |
|---|
| | 184 | Otherwise clustered and non-clustered indices would have to be |
|---|
| | 185 | implemented separately. |
|---|
| | 186 | 3. Decision flexibility. For example, the users can choose a clustered index |
|---|
| | 187 | that will take care of storing everything. They can choose a main memory |
|---|
| | 188 | non-clustered index and store the actual data in MySQL. They can choose |
|---|
| | 189 | a disk based non-clustered index and store the data manually in a |
|---|
| | 190 | separate binary file or even in the same storage manager but doing a low |
|---|
| | 191 | level customized data processing. |
|---|
| 148 | | 1.MemoryStorageManager |
|---|
| 149 | | 2.DiskStorageManager |
|---|
| 150 | | |
|---|
| 151 | | 4.2.1 MemoryStorageManager |
|---|
| 152 | | |
|---|
| 153 | | As it is implied be the name, this is a main memory implementation. Everything is stored in main memory using |
|---|
| 154 | | a simple vector. No properties are needed to initialize a MemoryStorageManager object. When a |
|---|
| 155 | | MemoryStorageManager instance goes out of scope, all data that it contains are lost. |
|---|
| 156 | | |
|---|
| 157 | | 4.2.2 DiskStorageManager |
|---|
| 158 | | |
|---|
| 159 | | The disk storage manager uses two random access files for storing information. One with extension .idx and |
|---|
| 160 | | the other with extension .dat. |
|---|
| 161 | | |
|---|
| 162 | | A list of all the supported properties that can be provided during initialization, follows: |
|---|
| | 194 | 1) MemoryStorageManager |
|---|
| | 195 | 2) DiskStorageManager |
|---|
| | 196 | |
|---|
| | 197 | |
|---|
| | 198 | MemoryStorageManager |
|---|
| | 199 | .............................................................................. |
|---|
| | 200 | |
|---|
| | 201 | As it is implied be the name, this is a main memory implementation. Everything |
|---|
| | 202 | is stored in main memory using a simple vector. No properties are needed to |
|---|
| | 203 | initialize a MemoryStorageManager object. When a MemoryStorageManager instance |
|---|
| | 204 | goes out of scope, all data that it contains are lost. |
|---|
| | 205 | |
|---|
| | 206 | DiskStorageManager |
|---|
| | 207 | .............................................................................. |
|---|
| | 208 | |
|---|
| | 209 | The disk storage manager uses two random access files for storing information. |
|---|
| | 210 | One with extension .idx and the other with extension .dat. |
|---|
| | 211 | |
|---|
| | 212 | A list of all the supported properties that can be provided during |
|---|
| | 213 | initialization, follows: |
|---|
| | 214 | |
|---|
| 166 | | 2. Overwrite VT_BOOL If Overwrite is true and a storage manager with the specified filename |
|---|
| 167 | | already exists, it will be truncated and overwritten. All data will be lost. |
|---|
| 168 | | 3. PageSize VT_ULONG The page size to use. If the specified filename already exists and Overwrite |
|---|
| 169 | | is false, PageSize is ignored. |
|---|
| 170 | | |
|---|
| 171 | | For entities that are larger than the page size, multiple pages are used. Although, the empty space |
|---|
| 172 | | on the last page is lost. Also, there is no effort whatsoever to use as many sequential pages as possible. |
|---|
| 173 | | A future version might support sequential I/O. Thus, real clustered indices cannot be supported yet. |
|---|
| 174 | | |
|---|
| 175 | | The purpose of the .idx file is to store vital information like the page size, the next available page, a |
|---|
| 176 | | list of empty pages and the sequence of pages associated with every entity ID. |
|---|
| 177 | | |
|---|
| 178 | | This class also provides a flush method that practically overwrites the .idx file and syncs both file |
|---|
| 179 | | pointers. |
|---|
| 180 | | |
|---|
| 181 | | The .idx file is loaded into main memory during initialization and is written to disk only after flushing the |
|---|
| 182 | | storage manager or during object destruction. In case of an unexpected failure changes to the storage manager |
|---|
| | 218 | 2. Overwrite VT_BOOL If Overwrite is true and a storage manager with the |
|---|
| | 219 | specified filename already exists, it will be |
|---|
| | 220 | truncated and overwritten. All data will be lost. |
|---|
| | 221 | 3. PageSize VT_ULONG The page size to use. If the specified filename |
|---|
| | 222 | already exists and Overwrite is false, PageSize is ignored. |
|---|
| | 223 | |
|---|
| | 224 | For entities that are larger than the page size, multiple pages are used. |
|---|
| | 225 | Although, the empty space on the last page is lost. Also, there is no effort |
|---|
| | 226 | whatsoever to use as many sequential pages as possible. A future version |
|---|
| | 227 | might support sequential I/O. Thus, real clustered indices cannot be supported yet. |
|---|
| | 228 | |
|---|
| | 229 | The purpose of the .idx file is to store vital information like the page size, |
|---|
| | 230 | the next available page, a list of empty pages and the sequence of pages |
|---|
| | 231 | associated with every entity ID. |
|---|
| | 232 | |
|---|
| | 233 | This class also provides a flush method that practically overwrites the |
|---|
| | 234 | .idx file and syncs both file pointers. |
|---|
| | 235 | |
|---|
| | 236 | The .idx file is loaded into main memory during initialization and is |
|---|
| | 237 | written to disk only after flushing the storage manager or during object |
|---|
| | 238 | destruction. In case of an unexpected failure changes to the storage manager |
|---|
| 185 | | 4.3 SPATIALINDEX INTERFACES |
|---|
| 186 | | |
|---|
| 187 | | A spatial index is any index structure that accesses spatial information efficiently. It could range from a |
|---|
| 188 | | simple grid file to a complicated tree structure. A spatial index indexes entries of type IEntry, which can |
|---|
| 189 | | be index nodes, leaf nodes, data etc. depending on the structure characteristics. The appropriate interfaces |
|---|
| 190 | | with useful accessor methods should be provided for all types of entries. |
|---|
| | 241 | SpatialIndex Interfaces |
|---|
| | 242 | ------------------------------------------------------------------------------ |
|---|
| | 243 | |
|---|
| | 244 | A spatial index is any index structure that accesses spatial information |
|---|
| | 245 | efficiently. It could range from a simple grid file to a complicated tree |
|---|
| | 246 | structure. A spatial index indexes entries of type IEntry, which can be index |
|---|
| | 247 | nodes, leaf nodes, data etc. depending on the structure characteristics. |
|---|
| | 248 | The appropriate interfaces with useful accessor methods should be provided |
|---|
| | 249 | for all types of entries. |
|---|
| 194 | | The containmentQuery method requires a query shape and a reference to a valid IVisitor instance (described shortly). |
|---|
| 195 | | The intersectionQuery method is the same. Both accept an IShape as the query. If the query shape is a simple |
|---|
| 196 | | Region, than a classic range query is performed. The user though has the ability to create her own shapes, thus |
|---|
| 197 | | defining her own intersection and containment methods making possible to run any kind of range query without |
|---|
| 198 | | having to modify the index. An example of a trapezoidal query is given in the regressiontest directory. Have |
|---|
| 199 | | in mind that it is the users responsibility to implement the correct intersection and containment methods |
|---|
| 200 | | between their shape and the type of shapes that are stored by the specific index that they are planning to use. |
|---|
| 201 | | For example, if an rtree index will be used, a trapezoid should define intersection and containment between |
|---|
| 202 | | itself and Regions, since all rtree nodes are of type Region. Hence, the user should have some knowledge |
|---|
| | 253 | The containmentQuery method requires a query shape and a reference to a |
|---|
| | 254 | valid IVisitor instance (described shortly). The intersectionQuery method |
|---|
| | 255 | is the same. Both accept an IShape as the query. If the query shape is a simple |
|---|
| | 256 | Region, than a classic range query is performed. The user though has the |
|---|
| | 257 | ability to create her own shapes, thus defining her own intersection and |
|---|
| | 258 | containment methods making possible to run any kind of range query without |
|---|
| | 259 | having to modify the index. An example of a trapezoidal query is given in the |
|---|
| | 260 | regressiontest directory. Have in mind that it is the users responsibility |
|---|
| | 261 | to implement the correct intersection and containment methods between their |
|---|
| | 262 | shape and the type of shapes that are stored by the specific index that they |
|---|
| | 263 | are planning to use. For example, if an rtree index will be used, a trapezoid |
|---|
| | 264 | should define intersection and containment between itself and Regions, since |
|---|
| | 265 | all rtree nodes are of type Region. Hence, the user should have some knowledge |
|---|
| 205 | | A point location query is performed using the pointLocationQuery method. It takes the query point and a visitor |
|---|
| 206 | | as arguments. |
|---|
| 207 | | |
|---|
| 208 | | Nearest neighbor queries can be performed with the nearestNeighborQuery method. Its first argument is the |
|---|
| 209 | | number k of nearest neighbors requested. This method also requires the query shape and a visitor object. |
|---|
| 210 | | The default implementation uses the getMinimumDistance function of IShape for calculating the distance |
|---|
| 211 | | of the query from the rectangular node and data entries stored in the tree. A more sophisticated |
|---|
| 212 | | distance measure can be used by implementing the INearestNeighborComparator interface and passing it |
|---|
| 213 | | as the last argument of nearestNeighborQuery. For example, a comparator is necessary when the query |
|---|
| 214 | | needs to be checked against the actual data stored in the tree, instead of the rectangular data entry |
|---|
| 215 | | approximations stored in the leaves. |
|---|
| 216 | | |
|---|
| 217 | | For customizing queries the IVisitor interface (based on the Visitor pattern [gamma94]) provides callback |
|---|
| 218 | | functions for visiting index and leaf nodes, as well as data entries. Node and data information can be obtained |
|---|
| 219 | | using the INode and IData interfaces (both extend IEntry). Examples of using this interface include visualizing |
|---|
| 220 | | a query, counting the number of leaf or index nodes visited for a specific query, throwing alerts when a |
|---|
| | 268 | A point location query is performed using the pointLocationQuery method. It |
|---|
| | 269 | takes the query point and a visitor as arguments. |
|---|
| | 270 | |
|---|
| | 271 | Nearest neighbor queries can be performed with the nearestNeighborQuery method. |
|---|
| | 272 | Its first argument is the number k of nearest neighbors requested. This |
|---|
| | 273 | method also requires the query shape and a visitor object. The default |
|---|
| | 274 | implementation uses the getMinimumDistance function of IShape for calculating |
|---|
| | 275 | the distance of the query from the rectangular node and data entries stored |
|---|
| | 276 | in the tree. A more sophisticated distance measure can be used by implementing |
|---|
| | 277 | the INearestNeighborComparator interface and passing it as the last argument |
|---|
| | 278 | of nearestNeighborQuery. For example, a comparator is necessary when the query |
|---|
| | 279 | needs to be checked against the actual data stored in the tree, instead of |
|---|
| | 280 | the rectangular data entry approximations stored in the leaves. |
|---|
| | 281 | |
|---|
| | 282 | For customizing queries the IVisitor interface (based on the Visitor |
|---|
| | 283 | pattern [gamma94]) provides callback functions for visiting index and |
|---|
| | 284 | leaf nodes, as well as data entries. Node and data information can be obtained |
|---|
| | 285 | using the INode and IData interfaces (both extend IEntry). Examples of using |
|---|
| | 286 | this interface include visualizing a query, counting the number of leaf |
|---|
| | 287 | or index nodes visited for a specific query, throwing alerts when a |
|---|
| 227 | | A data entry can be inserted using the insertData method. The insertion function will convert any shape into |
|---|
| 228 | | an internal representation depending on the index. Every inserted object should be assigned an ID |
|---|
| 229 | | (called object identifier) that will allow updating, deleting and reporting the object. |
|---|
| 230 | | It is the responsibility of the caller to provide the index with IDs (unique or not). Also, |
|---|
| 231 | | a byte array can be associated with an entry. The byte arrays are stored along with the spatial |
|---|
| 232 | | information inside the leaf nodes. Clustered indices can be supported in that way. The byte array can |
|---|
| 233 | | also by null (in which case the length field should be zero), and no extra space should be used per node. |
|---|
| 234 | | |
|---|
| 235 | | A data entry can be deleted using the deleteData method. The object shape and ID should be provided. |
|---|
| 236 | | Spatial indices cluster objects according to spatial characteristics and not IDs. Hence, the shape |
|---|
| 237 | | is essential for locating and deleting an entry. |
|---|
| 238 | | |
|---|
| 239 | | Useful statistics are provided through the IStatistics interface and the getStatistics method. |
|---|
| 240 | | |
|---|
| 241 | | Method getIndexProperties returns a PropertySet with all useful index properties like dimensionality etc. |
|---|
| 242 | | |
|---|
| 243 | | A NodeCommand interface is provided for customizing Node operations. Using the addWriteNodeCommand, |
|---|
| 244 | | addReadNodeCommand and addDeleteNodeCommand methods, custom command objects are added in listener lists |
|---|
| 245 | | and get executed after the corresponding operations. |
|---|
| 246 | | |
|---|
| 247 | | The isIndexValid method performs internal checks for testing the integrity of a structure. It is used for debugging |
|---|
| 248 | | purposes. |
|---|
| 249 | | |
|---|
| 250 | | When a new index is created a unique index ID should be assigned to it, that will be used when reloading |
|---|
| 251 | | the index from persistent storage. This index ID should be returned as an IndexIdentifier property in |
|---|
| 252 | | the instance of the PropsertySet that was used for constructing the index instance. Using index IDs, |
|---|
| 253 | | multiple indices can be stored in the same storage manager. It is the users responsibility to manager |
|---|
| 254 | | the index IDs. Associating the wrong index ID with the wrong storage manager or index type has undefined |
|---|
| | 295 | A data entry can be inserted using the insertData method. The insertion |
|---|
| | 296 | function will convert any shape into an internal representation depending on |
|---|
| | 297 | the index. Every inserted object should be assigned an ID (called object |
|---|
| | 298 | identifier) that will allow updating, deleting and reporting the object. |
|---|
| | 299 | It is the responsibility of the caller to provide the index with IDs |
|---|
| | 300 | (unique or not). Also, a byte array can be associated with an entry. The |
|---|
| | 301 | byte arrays are stored along with the spatial information inside the leaf |
|---|
| | 302 | nodes. Clustered indices can be supported in that way. The byte array can |
|---|
| | 303 | also by null (in which case the length field should be zero), and no extra |
|---|
| | 304 | space should be used per node. |
|---|
| | 305 | |
|---|
| | 306 | A data entry can be deleted using the deleteData method. The object shape |
|---|
| | 307 | and ID should be provided. Spatial indices cluster objects according to |
|---|
| | 308 | spatial characteristics and not IDs. Hence, the shape is essential for |
|---|
| | 309 | locating and deleting an entry. |
|---|
| | 310 | |
|---|
| | 311 | Useful statistics are provided through the IStatistics interface and |
|---|
| | 312 | the getStatistics method. |
|---|
| | 313 | |
|---|
| | 314 | Method getIndexProperties returns a PropertySet with all useful index |
|---|
| | 315 | properties like dimensionality etc. |
|---|
| | 316 | |
|---|
| | 317 | A NodeCommand interface is provided for customizing Node operations. Using |
|---|
| | 318 | the addWriteNodeCommand, addReadNodeCommand and addDeleteNodeCommand methods, |
|---|
| | 319 | custom command objects are added in listener lists and get executed after |
|---|
| | 320 | the corresponding operations. |
|---|
| | 321 | |
|---|
| | 322 | The isIndexValid method performs internal checks for testing the |
|---|
| | 323 | integrity of a structure. It is used for debugging purposes. |
|---|
| | 324 | |
|---|
| | 325 | When a new index is created a unique index ID should be assigned to it, that |
|---|
| | 326 | will be used when reloading the index from persistent storage. This index ID |
|---|
| | 327 | should be returned as an IndexIdentifier property in the instance of the |
|---|
| | 328 | PropsertySet that was used for constructing the index instance. Using |
|---|
| | 329 | index IDs, multiple indices can be stored in the same storage manager. |
|---|
| | 330 | It is the users responsibility to manager the index IDs. Associating the |
|---|
| | 331 | wrong index ID with the wrong storage manager or index type has undefined |
|---|
| 257 | | 4.4 THE RTREE PACKAGE |
|---|
| 258 | | |
|---|
| 259 | | The RTree index [guttman84] is a balanced tree structure that consists of index nodes, leaf nodes and data. |
|---|
| 260 | | Every node (leaf and index) has a fixed capacity of entries, (the node capacity) chosen at index creation |
|---|
| 261 | | An RTree abstracts the data with their Minimum Bounding Region (MBR) and clusters these MBRs according to |
|---|
| 262 | | various heuristics in the leaf nodes. Queries are evaluated from the root of the tree down the leaves. |
|---|
| 263 | | Since the index is balanced nodes can be under full. They cannot be empty though. A fill factor specifies |
|---|
| 264 | | the minimum number of entries allowed in any node. The fill factor is usually close to 70%. |
|---|
| | 334 | The RTree Package |
|---|
| | 335 | ------------------------------------------------------------------------------ |
|---|
| | 336 | |
|---|
| | 337 | The RTree index [guttman84] is a balanced tree structure that consists of |
|---|
| | 338 | index nodes, leaf nodes and data. Every node (leaf and index) has a fixed |
|---|
| | 339 | capacity of entries, (the node capacity) chosen at index creation An RTree |
|---|
| | 340 | abstracts the data with their Minimum Bounding Region (MBR) and clusters |
|---|
| | 341 | these MBRs according to various heuristics in the leaf nodes. Queries are |
|---|
| | 342 | evaluated from the root of the tree down the leaves. Since the index is |
|---|
| | 343 | balanced nodes can be under full. They cannot be empty though. A fill |
|---|
| | 344 | factor specifies the minimum number of entries allowed in any node. The |
|---|
| | 345 | fill factor is usually close to 70%. |
|---|
| 267 | | 1.Deciding if the index will be internal or external memory and selecting the appropriate storage manager. |
|---|
| 268 | | 2.Choosing the index and leaf capacity (also known as fan-out). |
|---|
| 269 | | 3.Choosing the fill factor (from 1% to 99% of the node capacity). |
|---|
| 270 | | 4.Choosing the dimensionality of the data. |
|---|
| 271 | | 5.Choosing the insert/update policy (the RTree variant). |
|---|
| 272 | | |
|---|
| 273 | | If an already stored RTree is being reloaded for reuse, only the index ID needs to be supplied during construction. |
|---|
| 274 | | In that case, some options cannot be modified. These include: the index and leaf capacity, the fill factor |
|---|
| 275 | | and the dimensionality. Note here, that the RTree variant can actually be modified. The variant affects only |
|---|
| 276 | | when and how splitting occurs, and thus can be changed at any time. |
|---|
| 277 | | |
|---|
| 278 | | An initialization PropertySet is used for setting the above options, complying with the following property strings: |
|---|
| 279 | | |
|---|
| 280 | | Property Type Description |
|---|
| 281 | | ---------------------------------------------- |
|---|
| 282 | | 1. IndexIndentifier VT_LONG If specified an existing index will be opened from the supplied |
|---|
| 283 | | storage manager with the given index id. Behavior is unspecified |
|---|
| | 348 | |
|---|
| | 349 | 1. Deciding if the index will be internal or external memory and selecting |
|---|
| | 350 | the appropriate storage manager. |
|---|
| | 351 | 2. Choosing the index and leaf capacity (also known as fan-out). |
|---|
| | 352 | 3. Choosing the fill factor (from 1% to 99% of the node capacity). |
|---|
| | 353 | 4. Choosing the dimensionality of the data. |
|---|
| | 354 | 5. Choosing the insert/update policy (the RTree variant). |
|---|
| | 355 | |
|---|
| | 356 | If an already stored RTree is being reloaded for reuse, only the index ID |
|---|
| | 357 | needs to be supplied during construction. In that case, some options cannot |
|---|
| | 358 | be modified. These include: the index and leaf capacity, the fill factor and |
|---|
| | 359 | the dimensionality. Note here, that the RTree variant can actually be |
|---|
| | 360 | modified. The variant affects only when and how splitting occurs, and |
|---|
| | 361 | thus can be changed at any time. |
|---|
| | 362 | |
|---|
| | 363 | An initialization PropertySet is used for setting the above options, |
|---|
| | 364 | complying with the following property strings: |
|---|
| | 365 | |
|---|
| | 366 | Property Type Description |
|---|
| | 367 | ----------------------- -------- ----------------------------------------------- |
|---|
| | 368 | 1. IndexIndentifier VT_LONG If specified an existing index will be |
|---|
| | 369 | opened from the supplied storage manager with |
|---|
| | 370 | the given index id. Behavior is unspecified |
|---|