Revision 948, 3.3 KB (checked in by seang, 3 years ago)

Add benchmark script and set eol-style to native on everything except the shapefile

Reading data
============

Create a workspace from the standard workspace factory using either a path to a
single data file or a path to a directory of similar files. A workspace maps
(in the Python sense) names to collections (or layers) of feature data,
providing access to them via items() and the other mapping methods.

    >>> from mill import workspace
    >>> w = workspace('docs/data')
    >>> w # doctest: +ELLIPSIS
    <mill.workspace.Workspace object at ...>
    >>> w.path
    'docs/data'
    >>> w.items() # doctest: +ELLIPSIS
    [('test_uk', <mill.collection.Collection object at ...>)]

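To illustrate what "maps in the Python sense" means, here is a minimal sketch
of a read-only mapping. ToyWorkspace is a hypothetical stand-in written for
this example; it is not the mill implementation, and the layer contents are
placeholders.

```python
try:
    from collections.abc import Mapping  # Python 3
except ImportError:
    from collections import Mapping  # Python 2


class ToyWorkspace(Mapping):
    """Hypothetical stand-in for a workspace: a read-only mapping of
    layer names to collection objects (plain lists here)."""

    def __init__(self, layers):
        self._layers = dict(layers)

    def __getitem__(self, name):
        return self._layers[name]

    def __iter__(self):
        return iter(self._layers)

    def __len__(self):
        return len(self._layers)


w = ToyWorkspace({'test_uk': ['feature 0', 'feature 1']})
names = sorted(w.keys())      # layer names
has_layer = 'test_uk' in w    # membership test, as with a dict
pairs = list(w.items())       # (name, collection) pairs
```

Defining only __getitem__, __iter__ and __len__ is enough; the Mapping base
class supplies keys(), items(), values(), get() and the "in" test.
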
A collection is a workspace item and can be obtained as from any mapping or
dict. The name of a collection corresponds to the OGR layer name, which is
the base of the data file's name. A collection's *schema* property is
currently a list of (name, integer type) tuples, but this is likely to change
in the future. The number of features in a collection, or its length, can be
obtained in the usual Python way.

    >>> c = w['test_uk']
    >>> c # doctest: +ELLIPSIS
    <mill.collection.Collection object at ...>
    >>> c.name
    'test_uk'
    >>> c.schema
    [('CAT', 2), ('FIPS_CNTRY', 4), ('CNTRY_NAME', 4), ('AREA', 2), ('POP_CNTRY', 2)]
    >>> len(c)
    48

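Because the schema is a list of pairs, it converts directly to a dict for
lookup by field name. The sketch below also labels the integer codes, on the
assumption (not confirmed by this document) that they are OGR field type
codes; OGR_FIELD_TYPES is a name made up for this example and lists only a
subset of the codes.

```python
# Schema as printed in the doctest above.
schema = [('CAT', 2), ('FIPS_CNTRY', 4), ('CNTRY_NAME', 4),
          ('AREA', 2), ('POP_CNTRY', 2)]

# Assumption: the integers are OGR field type codes (OGRFieldType).
# Only a subset of the codes is listed here.
OGR_FIELD_TYPES = {0: 'Integer', 2: 'Real', 4: 'String'}

# Field name -> integer type code.
field_types = dict(schema)

# (name, readable type) pairs, in schema order.
described = [(name, OGR_FIELD_TYPES.get(code, 'unknown'))
             for name, code in schema]
```
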
Features in a collection can be accessed as if the collection were a dict.

    >>> f = c['1']
    >>> f.id
    '1'
    >>> f.properties['CNTRY_NAME']
    'United Kingdom'

Users can control the type of object returned by binding a callable to the
collection's object hook. The default object hook (used above) is
mill.feature.Feature, a class modeled loosely on GeoJSON. A callable must take
three positional parameters (id, properties, and WKB geometry) and return a
Python object, like so:

    >>> from mill.feature import Feature
    >>> def testing_feature(id, properties, wkb):
    ...     d = {}
    ...     for key, val in properties.items():
    ...         # Convert byte strings to unicode; leave other types alone.
    ...         if isinstance(val, str):
    ...             val = unicode(val)
    ...         d[key] = val
    ...     # Hex-encode the raw WKB byte string.
    ...     return Feature(id, d, wkb.encode('hex'))

This hook converts all string properties to unicode and hex-encodes the WKB
byte string extracted from the data. It is used like so:

    >>> c.object_hook = testing_feature
    >>> f = c['1']
    >>> f.id
    '1'
    >>> f.properties['CNTRY_NAME']
    u'United Kingdom'
    >>> f.geometry # doctest: +ELLIPSIS
    '0103000000010000000d0000003d1059a48...'

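A hook does not have to return a Feature. As a further sketch of the
three-parameter signature, the hypothetical summary_feature hook below returns
a plain dict and reads the geometry type from the WKB header instead of
keeping the whole geometry. The names summary_feature and WKB_TYPES are
invented for this example. The sample bytes are the first 13 bytes of the hex
string shown above.

```python
import binascii
import struct

# Well-known binary geometry type codes for the basic 2D types.
WKB_TYPES = {1: 'Point', 2: 'LineString', 3: 'Polygon',
             4: 'MultiPoint', 5: 'MultiLineString', 6: 'MultiPolygon',
             7: 'GeometryCollection'}


def summary_feature(id, properties, wkb):
    """Hypothetical hook: return a plain dict instead of a Feature."""
    # Byte 0 is the byte-order flag: 1 = little-endian, 0 = big-endian.
    order = '<' if bytearray(wkb[:1])[0] == 1 else '>'
    # Bytes 1-4 hold the geometry type as an unsigned 32-bit integer.
    (type_code,) = struct.unpack(order + 'I', wkb[1:5])
    return {
        'id': id,
        'properties': dict(properties),
        'geometry_type': WKB_TYPES.get(type_code, 'Unknown'),
    }


# The first 13 bytes of the hex-encoded geometry shown above.
sample = binascii.unhexlify('0103000000010000000d000000')
summary = summary_feature('1', {'CNTRY_NAME': 'United Kingdom'}, sample)
```

If the collection calls its hook as described above, this function could be
bound with c.object_hook = summary_feature in the same way as
testing_feature.
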
More efficient access to features by id is provided by a collection's *all*
attribute, which opens a data access session:

    >>> features = c.all
    >>> f = features['1']
    >>> f.id
    '1'
    >>> f.properties['CNTRY_NAME']
    u'United Kingdom'

This attribute also provides the iterator protocol:

    >>> features = c.all
    >>> f = features.next()
    >>> f.id
    '0'
    >>> f.properties['CNTRY_NAME']
    u'United Kingdom'

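The sketch below shows one way a single object can offer both lookup by id and
the iterator protocol, as the *all* attribute does. ToySession is a
hypothetical stand-in, not mill's implementation, and its records are
placeholders.

```python
class ToySession(object):
    """Hypothetical stand-in for an object that supports both lookup
    by id and the iterator protocol."""

    def __init__(self, records):
        self._records = list(records)      # (id, value) pairs, in order
        self._by_id = dict(self._records)  # index for lookup by id
        self._pos = 0                      # position of the next record

    def __getitem__(self, fid):
        return self._by_id[fid]

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos >= len(self._records):
            raise StopIteration
        record = self._records[self._pos]
        self._pos += 1
        return record

    next = __next__  # Python 2 spelling of the same method


session = ToySession([('0', 'first'), ('1', 'second')])
by_id = session['1']
first = next(session)   # built-in next() works on Python 2.6+ and 3
rest = list(session)    # iteration resumes where next() left off
```

The built-in next(obj) is the portable spelling of obj.next().
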
Collections also provide a filtering iterator. The bbox positional parameter
causes the filter to pass only the features that intersect the specified
(minx, miny, maxx, maxy) bounding box tuple. The tuple values are understood
to be in the same coordinate system as the data.

    >>> query = c.filter(bbox=(-1.0, 50.0, -0.5, 50.5))
    >>> query # doctest: +ELLIPSIS
    <mill.collection.Iterator object at ...>
    >>> results = [f for f in query]
    >>> len(results)
    1
    >>> f = results[0]
    >>> f.id
    '45'
    >>> f.properties['CNTRY_NAME']
    u'United Kingdom'

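To make the (minx, miny, maxx, maxy) convention concrete, here is a minimal
sketch of a box-against-box intersection test. It is an illustration only:
how the filter itself tests features is not described here. bbox_intersects
is a hypothetical helper, and the feature envelopes are invented for the
example.

```python
def bbox_intersects(a, b):
    """Return True if two (minx, miny, maxx, maxy) boxes overlap or touch."""
    a_minx, a_miny, a_maxx, a_maxy = a
    b_minx, b_miny, b_maxx, b_maxy = b
    # The boxes intersect only if their extents overlap on both axes.
    return (a_minx <= b_maxx and b_minx <= a_maxx and
            a_miny <= b_maxy and b_miny <= a_maxy)


query_box = (-1.0, 50.0, -0.5, 50.5)  # the bbox used above

# Hypothetical feature envelopes, invented for illustration only.
envelopes = {
    'near': (-1.2, 50.2, -0.8, 50.9),
    'far': (2.0, 55.0, 3.0, 56.0),
}

hits = sorted(name for name, box in envelopes.items()
              if bbox_intersects(box, query_box))
```

Comparing envelopes is cheaper than comparing full geometries, but it can
report a match for a feature whose geometry does not actually cross the query
box.
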