Changeset 47

Show
Ignore:
Timestamp:
01/19/08 10:26:06
Author:
hobu
Message:

update to ReST

Files:

Legend:

Unmodified
Added
Removed
Modified
Copied
Moved
  • spatialindex/trunk/README

    • Property svn:keywords set to Id Rev Author
    r46 r47  
    1 1. INTRODUCTION 
    2  
    3 You have downloaded the Spatial Index Library. This is free software under LGPL. 
     1***************************************************************************** 
     2 SpatialIndex Reference  
     3***************************************************************************** 
     4 
     5:Author: Marios Hadjieleftheriou 
     6:Contact: marioh@research.att.com 
     7:Revision: $Revision$ 
     8:Date: $Date: 2007-09-05 17:53:35 -0500 (Wed, 05 Sep 2007) $ 
     9 
     10..  The next heading encountered becomes our H2 
     11.. 
     12 
     13.. sectnum:: 
     14 
     15.. contents:: 
     16    :depth: 2 
     17    :backlinks: top 
     18 
     19------------------------------------------------------------------------------ 
     20Introduction 
     21------------------------------------------------------------------------------ 
     22 
     23You have downloaded the SpatialIndex Library. This is free software under LGPL. 
    424The library is in beta testing stage. Use at your own risk. 
    525 
     
    1131    queries (defined by spatial constraints) should be easy to deploy and run. 
    1232 3. Easy to use interfaces for inserting, deleting and updating information. 
    13  4. Wide variety of customization capabilities. Basic index and storage characteristics like 
    14     the page size, node capacity, minimum fan-out, splitting algorithm, etc. should be 
    15     easy to customize. 
    16  5. Index persistence. Internal memory and external memory structures should be supported. 
    17     Clustered and non-clustered indices should be easy to be persisted. 
    18  
    19 2. INSTALLING 
    20  
    21 To install the library you need to do the 
    22 following: 
     33 4. Wide variety of customization capabilities. Basic index and storage  
     34    characteristics like the page size, node capacity, minimum fan-out,  
     35    splitting algorithm, etc. should be easy to customize. 
     36 5. Index persistence. Internal memory and external memory structures  
     37    should be supported.  Clustered and non-clustered indices should  
     38    be easy to be persisted. 
     39 
     40------------------------------------------------------------------------------ 
     41Installation 
     42------------------------------------------------------------------------------ 
     43 
     44To install the library you need to do the following: 
    2345 
    2446  1. Install the Tools library (download from 
     
    5476    make install 
    5577 
    56 3. USING THE LIBRARY 
     78------------------------------------------------------------------------------ 
     79Using the Library 
     80------------------------------------------------------------------------------ 
    5781 
    5882You are ready to use the library. All you have to 
    5983do is to include the file SpatialIndex.h in your source 
    6084files and then compile with the following options: 
    61   g++ MyFile.cc -o MyFile -L/home/marioh/usr/lib -I/home/marioh/usr/include -lpthread -ltools -lspatialindex 
    62  
    63 If the library is installed in the default /usr/local 
    64 path, then the -I and -L options are not necessary. 
    65  
    66 If you are compiling on Mac OS X you will probably 
    67 have to add the -bind_at_load option when linking 
    68 against the dynamic link libraries. 
    69  
    70 4. LIBRARY OVERVIEW 
     85 
     86:: 
     87  
     88  g++ MyFile.cc -o MyFile -L/home/marioh/usr/lib -I/home/marioh/usr/include -lpthread -lspatialindex 
     89 
     90If the library is installed in the default /usr/local path, then the  
     91-I and -L options are not necessary. 
     92 
     93If you are compiling on Mac OS X you will might need to add the -bind_at_load  
     94option when linking against the dynamic link libraries.  OS X Tiger should  
     95work out of the box, however, with XCode 3.0. 
     96 
     97------------------------------------------------------------------------------ 
     98Library Overview 
     99------------------------------------------------------------------------------ 
    71100 
    72101The library currently consists of six packages: 
     
    78107 6. The tprtree index. 
    79108 
    80 I will briefly present the basic features supported by each package. For more details you will 
    81 have to refer to the code, for now. 
    82  
    83 4.1 SPATIALINDEX UTILITIES 
    84  
    85 To provide common constructors and uniform initialization for all objects provided by the library 
    86 a PropertySet class is provided. A PropertySet associates strings with Variants. Each property 
    87 corresponds to one string. 
    88  
    89 A basic implementation of a Variant is also provided that supports a number of data types. The 
    90 supported data types can be found in SpatialIndex.h 
     109I will briefly present the basic features supported by each package.  
     110For more details you will have to refer to the code, for now. 
     111 
     112Spatial Index Utilities 
     113------------------------------------------------------------------------------ 
     114 
     115To provide common constructors and uniform initialization for all objects  
     116provided by the library a PropertySet class is provided. A PropertySet  
     117associates strings with Variants. Each property corresponds to one string. 
     118 
     119A basic implementation of a Variant is also provided that supports a  
     120number of data types. The supported data types can be found in SpatialIndex.h 
    91121 
    92122PropertySet supports three functions: 
    93  1.getProperty returns the Variant associated with the given string. 
    94  2.setProperty associates the given Variant with the given string. 
    95  3.removeProperty removes the specified property from the PropertySet. 
    96  
    97 A number of exceptions are also defined here. All exceptions extend Exception and thus provide 
    98 the what() method that returns a string representation of the exception with useful comments. 
    99 It is advisable to use enclosing try/catch blocks when using any library objects. Many constructors 
    100 throw exceptions when invalid initialization properties are specified. 
    101  
    102 A general IShape interface is defined. All shape classes should extend IShape. Basic Region 
    103 and Point classes are already provided. Please check Region.h and Point.h for further details. 
    104  
    105 4.2 STORAGE MANAGER 
    106  
    107 The library provides a common interface for storage management of all indices. It consists of the 
    108 IStorageManager interface, which provides functions for storing and retrieving entities.  An entity 
    109 is viewed as a simple byte array; hence it can be an index entry, a data entry or anything else that 
    110 the user wants to store. The storage manager interface is generic and does not apply only to spatial 
    111 indices. 
    112  
    113 Classes that implement the IStorageManager interface decide on how to store entities. A 
    114 simple main memory implementation is provided, for example, that stores the entities using a vector, 
    115 associating every entity with a unique ID (the entry's index in the vector). A disk based storage 
    116 manager could choose to store the entities in a simple random access file, or a database storage manager 
    117 could store them in a relational table, etc. as long as unique IDs are associated with every entity. 
    118 Also, storage managers should implement their own paging, compaction and deletion policies transparently 
     123 
     124 1. getProperty returns the Variant associated with the given string. 
     125 2. setProperty associates the given Variant with the given string. 
     126 3. removeProperty removes the specified property from the PropertySet. 
     127 
     128A number of exceptions are also defined here. All exceptions extend  
     129Exception and thus provide the what() method that returns a string  
     130representation of the exception with useful comments.  It is advisable to  
     131use enclosing try/catch blocks when using any library objects. Many  
     132constructors throw exceptions when invalid initialization properties are specified. 
     133 
     134A general IShape interface is defined. All shape classes should extend  
     135IShape. Basic Region and Point classes are already provided. Please  
     136check Region.h and Point.h for further details. 
     137 
     138Storage Manager 
     139------------------------------------------------------------------------------ 
     140 
     141The library provides a common interface for storage management of all  
     142indices. It consists of the IStorageManager interface, which provides functions  
     143for storing and retrieving entities.  An entity is viewed as a simple byte  
     144array; hence it can be an index entry, a data entry or anything else that the  
     145user wants to store. The storage manager interface is generic and does not apply  
     146only to spatial indices. 
     147 
     148Classes that implement the IStorageManager interface decide on how to  
     149store entities.  simple main memory implementation is provided, for example,  
     150that stores the entities using a vector, associating every entity with a  
     151unique ID (the entry's index in the vector). A disk based storage manager  
     152could choose to store the entities in a simple random access file, or a  
     153database storage manager could store them in a relational table, etc. as long  
     154as unique IDs are associated with every entity. Also, storage managers should  
     155implement their own paging, compaction and deletion policies transparently  
    119156from the callers (be it an index or a user). 
    120157 
    121 The storeByteArray method gets a byte array and its length and an entity ID. If the caller 
    122 specifies NewPage as the input ID, the storage manager allocates a new ID, stores the entity and returns 
    123 the ID associated with the entity. If, instead, the user specifies an already existing ID the storage 
    124 manager overwrites the old data. An exception is thrown if the caller requests an invalid ID to be overwritten. 
    125  
    126 The loadByteArray method gets an entity ID and returns the associated byte array along with its length. If an 
    127 invalid ID is requested, an exception is thrown. 
     158The storeByteArray method gets a byte array and its length and an entity ID.  
     159If the caller specifies NewPage as the input ID, the storage manager allocates  
     160a new ID, stores the entity and returns the ID associated with the entity.  
     161If, instead, the user specifies an already existing ID the storage manager  
     162overwrites the old data. An exception is thrown if the caller requests  
     163an invalid ID to be overwritten. 
     164 
     165The loadByteArray method gets an entity ID and returns the associated byte  
     166array along with its length. If an invalid ID is requested, an exception is thrown. 
    128167 
    129168The deleteByteArray method removes the requested entity from storage. 
    130169 
    131 The storage managers should have no information about the types of entities that are stored. 
    132 There are three main reasons for this decision: 
    133  1.Any number of spatial indices can be stored in a single storage manager 
    134    (i.e. the same relational table, or binary file, or hash table, etc., can be used to store many indices) 
    135    using an arbitrary number of pages and a unique index ID per index (this will be discussed shortly). 
    136  2.Both clustered and non-clustered indices can be supported. A clustered index stores the data associated 
    137    with the entries that it contains along with the spatial information that it indexes. A non-clustered 
    138    index stores only the spatial information of its entries. Any associated data are stored separately 
    139    and are associated with the index entries by a unique ID. To support both types of indices, the storage 
    140    manager interface should be quite generic, allowing the index to decide how to store its data. 
    141    Otherwise clustered and non-clustered indices would have to be implemented separately. 
    142  3.Decision flexibility. For example, the users can choose a clustered index that will take care of storing 
    143    everything. They can choose a main memory non-clustered index and store the actual data in MySQL. 
    144    They can choose a disk based non-clustered index and store the data manually in a separate binary file 
    145    or even in the same storage manager but doing a low level customized data processing. 
     170The storage managers should have no information about the types of entities  
     171that are stored. There are three main reasons for this decision: 
     172 
     173 1. Any number of spatial indices can be stored in a single storage manager 
     174    (i.e. the same relational table, or binary file, or hash table, etc., can  
     175    be used to store many indices) using an arbitrary number of pages and  
     176    a unique index ID per index (this will be discussed shortly). 
     177 2. Both clustered and non-clustered indices can be supported. A clustered  
     178    index stores the data associated with the entries that it contains along  
     179    with the spatial information that it indexes. A non-clustered index stores  
     180    only the spatial information of its entries. Any associated data are  
     181    stored separately and are associated with the index entries by a unique ID.  
     182    To support both types of indices, the storage manager interface should be  
     183    quite generic, allowing the index to decide how to store its data.   
     184    Otherwise clustered and non-clustered indices would have to be  
     185    implemented separately. 
     186 3. Decision flexibility. For example, the users can choose a clustered index  
     187    that will take care of storing everything. They can choose a main memory  
     188    non-clustered index and store the actual data in MySQL.  They can choose  
     189    a disk based non-clustered index and store the data manually in a  
     190    separate binary file or even in the same storage manager but doing a low  
     191    level customized data processing. 
    146192 
    147193Two storage managers are provided in the current implementation: 
    148  1.MemoryStorageManager 
    149  2.DiskStorageManager 
    150  
    151 4.2.1 MemoryStorageManager 
    152  
    153 As it is implied be the name, this is a main memory implementation. Everything is stored in main memory using 
    154 a simple vector. No properties are needed to initialize a MemoryStorageManager object. When a 
    155 MemoryStorageManager instance goes out of scope, all data that it contains are lost. 
    156  
    157 4.2.2 DiskStorageManager 
    158  
    159 The disk storage manager uses two random access files for storing information. One with extension .idx and 
    160 the other with extension .dat. 
    161  
    162 A list of all the supported properties that can be provided during initialization, follows: 
     194 1) MemoryStorageManager 
     195 2) DiskStorageManager 
     196 
     197 
     198 MemoryStorageManager 
     199.............................................................................. 
     200 
     201As it is implied be the name, this is a main memory implementation. Everything  
     202is stored in main memory using a simple vector. No properties are needed to  
     203initialize a MemoryStorageManager object. When a MemoryStorageManager instance  
     204goes out of scope, all data that it contains are lost. 
     205 
     206 DiskStorageManager 
     207.............................................................................. 
     208 
     209The disk storage manager uses two random access files for storing information.  
     210One with extension .idx and the other with extension .dat. 
     211 
     212A list of all the supported properties that can be provided during  
     213initialization, follows: 
     214 
    163215    Property      Type        Description 
    164     -------------------------------------- 
     216    ---------   -------- ------------------ 
    165217 1. FileName    VT_PCHAR The base name of the file to open (no extension) 
    166  2. Overwrite   VT_BOOL  If Overwrite is true and a storage manager with the specified filename 
    167                          already exists, it will be truncated and overwritten. All data will be lost. 
    168  3. PageSize    VT_ULONG The page size to use. If the specified filename already exists and Overwrite 
    169                          is false, PageSize is ignored. 
    170  
    171 For entities that are larger than the page size, multiple pages are used. Although, the empty space 
    172 on the last page is lost. Also, there is no effort whatsoever to use as many sequential pages as possible. 
    173 A future version might support sequential I/O. Thus, real clustered indices cannot be supported yet. 
    174  
    175 The purpose of the .idx file is to store vital information like the page size, the next available page, a 
    176 list of empty pages and the sequence of pages associated with every entity ID. 
    177  
    178 This class also provides a flush method that practically overwrites the .idx file and syncs both file 
    179 pointers. 
    180  
    181 The .idx file is loaded into main memory during initialization and is written to disk only after flushing the 
    182 storage manager or during object destruction. In case of an unexpected failure changes to the storage manager 
     218 2. Overwrite   VT_BOOL  If Overwrite is true and a storage manager with the  
     219                         specified filename already exists, it will be  
     220                         truncated and overwritten. All data will be lost. 
     221 3. PageSize    VT_ULONG The page size to use. If the specified filename  
     222                         already exists and Overwrite is false, PageSize is ignored. 
     223 
     224For entities that are larger than the page size, multiple pages are used.  
     225Although, the empty space on the last page is lost. Also, there is no effort  
     226whatsoever to use as many sequential pages as possible. A future version  
     227might support sequential I/O. Thus, real clustered indices cannot be supported yet. 
     228 
     229The purpose of the .idx file is to store vital information like the page size,  
     230the next available page, a list of empty pages and the sequence of pages  
     231associated with every entity ID. 
     232 
     233This class also provides a flush method that practically overwrites the  
     234.idx file and syncs both file pointers. 
     235 
     236The .idx file is loaded into main memory during initialization and is  
     237written to disk only after flushing the storage manager or during object  
     238destruction. In case of an unexpected failure changes to the storage manager 
    183239will be lost due to a stale .idx file. Avoiding such disasters is future work. 
    184240 
    185 4.3 SPATIALINDEX INTERFACES 
    186  
    187 A spatial index is any index structure that accesses spatial information efficiently. It could range from a 
    188 simple grid file to a complicated tree structure. A spatial index indexes entries of type IEntry, which can 
    189 be index nodes, leaf nodes, data etc. depending on the structure characteristics. The appropriate interfaces 
    190 with useful accessor methods should be provided for all types of entries. 
     241 SpatialIndex Interfaces 
     242------------------------------------------------------------------------------ 
     243 
     244A spatial index is any index structure that accesses spatial information  
     245efficiently. It could range from a simple grid file to a complicated tree  
     246structure. A spatial index indexes entries of type IEntry, which can be index  
     247nodes, leaf nodes, data etc. depending on the structure characteristics.  
     248The appropriate interfaces with useful accessor methods should be provided  
     249for all types of entries. 
    191250 
    192251A spatial index should implement the ISpatialIndex interface. 
    193252 
    194 The containmentQuery method requires a query shape and a reference to a valid IVisitor instance (described shortly). 
    195 The intersectionQuery method is the same. Both accept an IShape as the query. If the query shape is a simple 
    196 Region, than a classic range query is performed. The user though has the ability to create her own shapes, thus 
    197 defining her own intersection and containment methods making possible to run any kind of range query without 
    198 having to modify the index. An example of a trapezoidal query is given in the regressiontest directory. Have 
    199 in mind that it is the users responsibility to implement the correct intersection and containment methods 
    200 between their shape and the type of shapes that are stored by the specific index that they are planning to use. 
    201 For example, if an rtree index will be used, a trapezoid should define intersection and containment between 
    202 itself and Regions, since all rtree nodes are of type Region. Hence, the user should have some knowledge 
     253The containmentQuery method requires a query shape and a reference to a  
     254valid IVisitor instance (described shortly). The intersectionQuery method  
     255is the same. Both accept an IShape as the query. If the query shape is a simple 
     256Region, than a classic range query is performed. The user though has the  
     257ability to create her own shapes, thus defining her own intersection and  
     258containment methods making possible to run any kind of range query without 
     259having to modify the index. An example of a trapezoidal query is given in the  
     260regressiontest directory. Have in mind that it is the users responsibility  
     261to implement the correct intersection and containment methods between their  
     262shape and the type of shapes that are stored by the specific index that they  
     263are planning to use.  For example, if an rtree index will be used, a trapezoid  
     264should define intersection and containment between itself and Regions, since  
     265all rtree nodes are of type Region. Hence, the user should have some knowledge 
    203266about the index internal representation, to run more sophisticated queries. 
    204267 
    205 A point location query is performed using the pointLocationQuery method. It takes the query point and a visitor 
    206 as arguments. 
    207  
    208 Nearest neighbor queries can be performed with the nearestNeighborQuery method. Its first argument is the 
    209 number k of nearest neighbors requested. This method also requires the query shape and a visitor object. 
    210 The default implementation uses the getMinimumDistance function of IShape for calculating the distance 
    211 of the query from the rectangular node and data entries stored in the tree. A more sophisticated 
    212 distance measure can be used by implementing the INearestNeighborComparator interface and passing it 
    213 as the last argument of nearestNeighborQuery. For example, a comparator is necessary when the query 
    214 needs to be checked against the actual data stored in the tree, instead of the rectangular data entry 
    215 approximations stored in the leaves. 
    216  
    217 For customizing queries the IVisitor interface (based on the Visitor pattern [gamma94]) provides callback 
    218 functions for visiting index and leaf nodes, as well as data entries. Node and data information can be obtained 
    219 using the INode and IData interfaces (both extend IEntry). Examples of using this interface include visualizing 
    220 a query, counting the number of leaf or index nodes visited for a specific query, throwing alerts when a 
     268A point location query is performed using the pointLocationQuery method. It  
     269takes the query point and a visitor as arguments. 
     270 
     271Nearest neighbor queries can be performed with the nearestNeighborQuery method.  
     272Its first argument is the  number k of nearest neighbors requested. This  
     273method also requires the query shape and a visitor object.  The default  
     274implementation uses the getMinimumDistance function of IShape for calculating  
     275the distance of the query from the rectangular node and data entries stored  
     276in the tree. A more sophisticated distance measure can be used by implementing  
     277the INearestNeighborComparator interface and passing it as the last argument  
     278of nearestNeighborQuery. For example, a comparator is necessary when the query 
     279needs to be checked against the actual data stored in the tree, instead of  
     280the rectangular data entry approximations stored in the leaves. 
     281 
     282For customizing queries the IVisitor interface (based on the Visitor  
     283pattern [gamma94]) provides callback functions for visiting index and  
     284leaf nodes, as well as data entries. Node and data information can be obtained 
     285using the INode and IData interfaces (both extend IEntry). Examples of using  
     286this interface include visualizing a query, counting the number of leaf  
     287or index nodes visited for a specific query, throwing alerts when a 
    221288specific spatial region is accessed, etc. 
    222289 
    223 The queryStrategy method provides the ability to design more sophisticated queries. It uses the IQueryStrategy 
    224 interface as a callback that is called continuously until no more entries are requested. It can be used to 
     290The queryStrategy method provides the ability to design more sophisticated  
     291queries. It uses the IQueryStrategy interface as a callback that is called  
     292continuously until no more entries are requested. It can be used to 
    225293implement custom query algorithms (based on the strategy pattern [gamma94]). 
    226294 
    227 A data entry can be inserted using the insertData method. The insertion function will convert any shape into 
    228 an internal representation depending on the index. Every inserted object should be assigned an ID 
    229 (called object identifier) that will allow updating, deleting and reporting the object. 
    230 It is the responsibility of the caller to provide the index with IDs (unique or not). Also, 
    231 a byte array can be associated with an entry. The byte arrays are stored along with the spatial 
    232 information inside the leaf nodes. Clustered indices can be supported in that way. The byte array can 
    233 also by null (in which case the length field should be zero), and no extra space should be used per node. 
    234  
    235 A data entry can be deleted using the deleteData method. The object shape and ID should be provided. 
    236 Spatial indices cluster objects according to spatial characteristics and not IDs. Hence, the shape 
    237 is essential for locating and deleting an entry. 
    238  
    239 Useful statistics are provided through the IStatistics interface and the getStatistics method. 
    240  
    241 Method getIndexProperties returns a PropertySet with all useful index properties like dimensionality etc. 
    242  
    243 A NodeCommand interface is provided for customizing Node operations. Using the addWriteNodeCommand, 
    244 addReadNodeCommand and addDeleteNodeCommand methods, custom command objects are added in listener lists 
    245 and get executed after the corresponding operations. 
    246  
    247 The isIndexValid method performs internal checks for testing the integrity of a structure. It is used for debugging 
    248 purposes. 
    249  
    250 When a new index is created a unique index ID should be assigned to it, that will be used when reloading 
    251 the index from persistent storage. This index ID should be returned as an IndexIdentifier property in 
    252 the instance of the PropsertySet that was used for constructing the index instance. Using index IDs, 
    253 multiple indices can be stored in the same storage manager. It is the users responsibility to manager 
    254 the index IDs. Associating the wrong index ID with the wrong storage manager or index type has undefined 
     295A data entry can be inserted using the insertData method. The insertion  
     296function will convert any shape into an internal representation depending on  
     297the index. Every inserted object should be assigned an ID (called object  
     298identifier) that will allow updating, deleting and reporting the object. 
     299It is the responsibility of the caller to provide the index with IDs  
     300(unique or not). Also, a byte array can be associated with an entry. The  
     301byte arrays are stored along with the spatial information inside the leaf  
     302nodes. Clustered indices can be supported in that way. The byte array can 
     303also by null (in which case the length field should be zero), and no extra  
     304space should be used per node. 
     305 
     306A data entry can be deleted using the deleteData method. The object shape  
     307and ID should be provided. Spatial indices cluster objects according to  
     308spatial characteristics and not IDs. Hence, the shape is essential for  
     309locating and deleting an entry. 
     310 
     311Useful statistics are provided through the IStatistics interface and  
     312the getStatistics method. 
     313 
     314Method getIndexProperties returns a PropertySet with all useful index  
     315properties like dimensionality etc. 
     316 
     317A NodeCommand interface is provided for customizing Node operations. Using  
     318the addWriteNodeCommand, addReadNodeCommand and addDeleteNodeCommand methods,  
     319custom command objects are added in listener lists and get executed after  
     320the corresponding operations. 
     321 
     322The isIndexValid method performs internal checks for testing the  
     323integrity of a structure. It is used for debugging purposes. 
     324 
     325When a new index is created a unique index ID should be assigned to it, that  
     326will be used when reloading the index from persistent storage. This index ID  
     327should be returned as an IndexIdentifier property in the instance of the  
     328PropsertySet that was used for constructing the index instance. Using  
     329index IDs, multiple indices can be stored in the same storage manager.  
     330It is the users responsibility to manager the index IDs. Associating the  
     331wrong index ID with the wrong storage manager or index type has undefined 
    255332results. 
    256333 
    257 4.4 THE RTREE PACKAGE 
    258  
    259 The RTree index [guttman84] is a balanced tree structure that consists of index nodes, leaf nodes and data. 
    260 Every node (leaf and index) has a fixed capacity of entries, (the node capacity) chosen at index creation 
    261 An RTree abstracts the data with their Minimum Bounding Region (MBR) and clusters these MBRs according to 
    262 various heuristics in the leaf nodes. Queries are evaluated from the root of the tree down the leaves. 
    263 Since the index is balanced nodes can be under full. They cannot be empty though. A fill factor specifies 
    264 the minimum number of entries allowed in any node. The fill factor is usually close to 70%. 
     334 The RTree Package 
     335------------------------------------------------------------------------------ 
     336 
     337The RTree index [guttman84] is a balanced tree structure that consists of  
     338index nodes, leaf nodes and data. Every node (leaf and index) has a fixed  
     339capacity of entries, (the node capacity) chosen at index creation An RTree  
     340abstracts the data with their Minimum Bounding Region (MBR) and clusters  
     341these MBRs according to various heuristics in the leaf nodes. Queries are  
     342evaluated from the root of the tree down the leaves. Since the index is  
     343balanced nodes can be under full. They cannot be empty though. A fill 
     344factor specifies the minimum number of entries allowed in any node. The 
     345fill factor is usually close to 70%. 
    265346 
    266347RTree creation involves: 
    267  1.Deciding if the index will be internal or external memory and selecting the appropriate storage manager. 
    268  2.Choosing the index and leaf capacity (also known as fan-out). 
    269  3.Choosing the fill factor (from 1% to 99% of the node capacity). 
    270  4.Choosing the dimensionality of the data. 
    271  5.Choosing the insert/update policy (the RTree variant). 
    272  
    273 If an already stored RTree is being reloaded for reuse, only the index ID needs to be supplied during construction. 
    274 In that case, some options cannot be modified. These include: the index and leaf capacity, the fill factor 
    275 and the dimensionality. Note here, that the RTree variant can actually be modified. The variant affects only 
    276 when and how splitting occurs, and thus can be changed at any time. 
    277  
    278 An initialization PropertySet is used for setting the above options, complying with the following property strings: 
    279  
    280     Property                 Type      Description 
    281     ---------------------------------------------- 
    282  1. IndexIndentifier           VT_LONG   If specified an existing index will be opened from the supplied 
    283                                          storage manager with the given index id. Behavior is unspecified 
     348 
     349 1. Deciding if the index will be internal or external memory and selecting  
     350    the appropriate storage manager. 
     351 2. Choosing the index and leaf capacity (also known as fan-out). 
     352 3. Choosing the fill factor (from 1% to 99% of the node capacity). 
     353 4. Choosing the dimensionality of the data. 
     354 5. Choosing the insert/update policy (the RTree variant). 
     355 
     356If an already stored RTree is being reloaded for reuse, only the index ID  
     357needs to be supplied during construction. In that case, some options cannot  
     358be modified. These include: the index and leaf capacity, the fill factor and  
     359the dimensionality. Note here, that the RTree variant can actually be  
     360modified. The variant affects only when and how splitting occurs, and  
     361thus can be changed at any time. 
     362 
     363An initialization PropertySet is used for setting the above options,  
     364complying with the following property strings: 
     365 
     366    Property                    Type      Description 
     367    -----------------------    --------  ----------------------------------------------- 
     368 1. IndexIndentifier           VT_LONG   If specified an existing index will be  
     369                                         opened from the supplied storage manager with  
     370                                         the given index id. Behavior is unspecified 
    284371                                         if the index id or the storage manager are incorrect. 
    285372 2. Dimension                  VT_ULONG  Dimensionality of the data that will be inserted. 
     
    298385 
    299386 
    300 5. CONTACT INFORMATION 
    301  
    302 You can contact me at mhadji@gmail.com for further assistance. Please read the above information 
    303 carefully and also do not be afraid to browse through the code and especially the test files 
    304 inside regressiontest directory. 
    305  
    306 6. REFERENCES 
     387Contact Information 
     388------------------------------------------------------------------------------ 
     389 
     390You can contact me at mhadji@gmail.com for further assistance. Please read the  
     391above information carefully and also do not be afraid to browse through the  
     392code and especially the test files inside regressiontest directory. 
     393 
     394References 
     395------------------------------------------------------------------------------ 
    307396[guttman84] "R-Trees: A Dynamic Index Structure for Spatial Searching" 
    308397            Antonin Guttman, Proc. 1984 ACM-SIGMOD Conference on Management of Data (1985), 47-57.