casacore
IncrementalStMan.h
1//# IncrementalStMan.h: The Incremental Storage Manager
2//# Copyright (C) 1996,1997,1999
3//# Associated Universities, Inc. Washington DC, USA.
4//#
5//# This library is free software; you can redistribute it and/or modify it
6//# under the terms of the GNU Library General Public License as published by
7//# the Free Software Foundation; either version 2 of the License, or (at your
8//# option) any later version.
9//#
10//# This library is distributed in the hope that it will be useful, but WITHOUT
11//# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
12//# FITNESS FOR A PARTICULAR PURPOSE. See the GNU Library General Public
13//# License for more details.
14//#
15//# You should have received a copy of the GNU Library General Public License
16//# along with this library; if not, write to the Free Software Foundation,
17//# Inc., 675 Massachusetts Ave, Cambridge, MA 02139, USA.
18//#
19//# Correspondence concerning AIPS++ should be addressed as follows:
20//# Internet email: casa-feedback@nrao.edu.
21//# Postal address: AIPS++ Project Office
22//# National Radio Astronomy Observatory
23//# 520 Edgemont Road
24//# Charlottesville, VA 22903-2475 USA
25
26#ifndef TABLES_INCREMENTALSTMAN_H
27#define TABLES_INCREMENTALSTMAN_H
28
29//# Includes
30#include <casacore/casa/aips.h>
31#include <casacore/tables/DataMan/ISMBase.h>
32
33
34namespace casacore { //# NAMESPACE CASACORE - BEGIN
35
36// <summary>
37// The Incremental Storage Manager
38// </summary>
39
40// <use visibility=export>
41
42// <reviewed reviewer="UNKNOWN" date="before2004/08/25" tests="tIncrementalStMan.cc">
43// </reviewed>
44
45// <prerequisite>
46//# Classes you should understand before using this one.
47// <li> The Table Data Managers concept as described in module file
48// <linkto module="Tables:Data Managers">Tables.h</linkto>
49// <li> <linkto class=ROIncrementalStManAccessor>
50// ROIncrementalStManAccessor</linkto>
51// for a discussion of the cache size
52// </prerequisite>
53
54// <etymology>
55// IncrementalStMan is the data manager storing values in an incremental way
56// (similar to an incremental backup). A value is only stored when it
57// differs from the previous value.
58// </etymology>
59
60// <synopsis>
61// IncrementalStMan stores the data in a way that a value is only stored
62// when it is different from the value in the previous row. This storage
63// manager is very well suited for columns with slowly changing values,
64// because the resulting file can be much smaller. It is not suited at
65// all for columns with continuously changing data.
66// <p>
67// In general it can be advantageous to use this storage manager when
68// a value changes at most every 4 rows (although it depends on the length
69// of the data values themselves). The following simple example
70// shows the approximate savings that can be achieved when storing a column
71// with double values changing every CH rows.
72// <srcblock>
73// #rows    CH    normal length    ISM length    compress ratio
74// 50000     5          4000000       1606000               2.5
75// 50000    50          4000000        164000              24.5
76// 50000   500          4000000         32800             122
77// </srcblock>
78// There is a special test program <src>nISMBucket</src> in the Tables module
79// doing a simple, but usually adequate, simulation of the amount of
80// storage needed for a scenario.
81// <p>
82// IncrementalStMan stores the values (and associated indices) in
83// fixed-length buckets. A <linkto class=BucketCache>BucketCache</linkto>
84// object is used to read/write
85// the buckets. The default cache size is 1 bucket (which is fine for
86// sequential access), but for random access it can make sense to
87// increase the size of the cache. This can be done using
88// the class <linkto class=ROIncrementalStManAccessor>
89// ROIncrementalStManAccessor</linkto>.
90// <p>
91// The IncrementalStMan can hold values of any standard data type (thus
92// from Bool to String). It can handle scalars, direct and indirect
93// arrays. It can support an arbitrary number of columns. The values in
94// each of them can vary at its own speed.
95// <br>
96// A bucket contains the values of several consecutive rows.
97// At the beginning of a bucket the values of the starting row of all
98// columns for this storage manager are repeated. In this way the value
99// of a cell can always be found in the bucket and no references
100// to previous buckets are needed.
101// <br>A bucket should be big enough to hold all starting values and
102// a reasonable number of other values. As a rule of thumb it should be
103// big enough to hold at least 100 values of each column. In general the
104// default bucket size will do. Only in special cases (e.g. when storing
105// large variable length strings) should the bucket size be set explicitly.
106// Giving a zero bucket size means that a suitable default bucket size
107// will be calculated.
108// <br>
109// When a table is filled sequentially each bucket can be filled as
110// much as possible. When writing in a random way, buckets can contain
111// some unused space, because a bucket in the middle of the file
112// has to be split when a new value has to be put in it.
113// <p>
114// Each column in the IncrementalStMan has the following properties to
115// achieve the "store-different-values-only" behaviour.
116// <ul>
117// <li> When a row is not explicitly put, it has the same value as the
118// previous row.
119// The first row gets the standard undefined values when not put.
120// The order of put's and addRow's is not important.
121// <br>E.g. when a table has N rows and row N and the following M rows
122// have the same value, the following schematic code has the same effect:
123// <br><src> add 1 row; put value in row N; add M rows;</src>
124// <br><src> add M+1 rows; put value in row N;</src>
125// <li> When putting a scalar or direct array, it is tested if it matches
126// the previous row. If so, it is not stored again.
127// This test is not done for indirect arrays, because those can
128// be (very) big and it would be too time-consuming. So the only
129// way to save space for indirect arrays is by not putting them
130// as explained in the previous item.
131// <li> For indirect arrays the buckets contain a pointer only. The
132// arrays themselves are stored in a separate file.
133// <li> When a value of an existing row is updated, only that one row is
134// updated. The next row(s) keep their value, even if it was
135// shared with the row being updated.
136// <br>For scalars and direct arrays it will be tested if the
137// new value matches the value in the previous and/or next row.
138// If so, those rows will be combined to save storage.
139// <li> The IncrementalStMan is optimized for sequential access to a table.
140// <br>- A bucket is accessed only once, because a bucket contains
141// consecutive rows.
142// <br>- For each column a copy is kept of the last value read.
143// So the value for the next rows (with that same value)
144// is immediately available.
145// <br>For random access the performance can be improved by setting
146// the cache size using class
147// <linkto class=ROIncrementalStManAccessor>
148// ROIncrementalStManAccessor</linkto>.
149// </ul>
150//
151// <note>This class contains many public functions which are only used
152// by other ISM classes. The only useful function for the user is the
153// constructor.
154// </note>
155
156// <motivation>
157// IncrementalStMan can save a lot of storage space.
158// Unlike the old StManMirAIO it stores the values directly in the
159// file to save on memory usage.
160// </motivation>
161
162// <example>
163// This example shows how to create a table and how to attach
164// the storage manager to some columns.
165// <srcblock>
166// SetupNewTable newtab("name.data", tableDesc, Table::New);
167// IncrementalStMan stman; // define storage manager
168// newtab.bindColumn ("column1", stman); // bind column to st.man.
169// newtab.bindColumn ("column2", stman); // bind column to st.man.
170// Table tab(newtab); // actually create table
171// </srcblock>
172// </example>
173
174//# <todo asof="$DATE:$">
175//# A List of bugs, limitations, extensions or planned refinements.
176//# </todo>
177
178
179class IncrementalStMan : public ISMBase
180{
181public:
182 // Create an incremental storage manager with the given name.
183 // If no name is used, it is set to an empty string.
184 // The name can be used to construct a
185 // <linkto class=ROIncrementalStManAccessor>ROIncrementalStManAccessor
186 // </linkto> object (e.g. to set the cache size).
187 // <br>
188 // The bucket size has to be given in bytes and the cache size in buckets.
189 // Bucket size 0 means that the storage manager will set the bucket
190 // size such that it can contain about 100 rows
191 // (with a minimum size of 32768 bytes). However, if that results
192 // in a very large bucket size (>327680) it'll make it smaller.
193 // Note it uses 32 bytes for the size of variable length strings,
194 // so this heuristic may fail when a column contains large strings.
195 // When <src>checkBucketSize</src> is set and Bucket size > 0
196 // the storage manager throws an exception
197 // when the size is too small to hold the values of at least 2 rows.
198 // For this check it uses 0 for the length of variable length strings.
199 // <group>
200 IncrementalStMan (uInt bucketSize = 0,
201 Bool checkBucketSize = True,
202 uInt cacheSize = 1);
203 IncrementalStMan (const String& dataManagerName,
204 uInt bucketSize = 0,
205 Bool checkBucketSize = True,
206 uInt cacheSize = 1);
207 // </group>
208
210
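211 // Copy constructor cannot be used.
212 IncrementalStMan (const IncrementalStMan&) = delete;
213
214 // Assignment cannot be used.
215 IncrementalStMan& operator= (const IncrementalStMan&) = delete;
216};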
217
218
219
220} //# NAMESPACE CASACORE - END
221
222#endif
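
The "store-different-values-only" behaviour described in the synopsis can be illustrated with a small stand-alone sketch. It is not the casacore implementation: the class `IncrementalColumn` and its members `put`, `addRows`, `get`, `nrow` and `nstored` are hypothetical names invented for this illustration, and it ignores buckets, caching, arrays and updates of existing rows. It only shows the core idea that a value is stored when it differs from the previous row, and that a row which was never put shares the value of the previous row.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only; not part of casacore.
// Keeps one (start row, value) entry per run of equal consecutive values.
template <typename T>
class IncrementalColumn {
public:
    // Append one row holding `value`. A new entry is stored only when the
    // value differs from the value of the previous row.
    void put(const T& value) {
        if (values_.empty() || !(values_.back() == value)) {
            startRows_.push_back(nrow_);
            values_.push_back(value);
        }
        ++nrow_;
    }

    // Append `n` rows without putting a value. They share the value of the
    // previous row; if there is no previous row, a default-constructed T
    // stands in for the "undefined" value.
    void addRows(std::size_t n) {
        if (n > 0 && values_.empty()) {
            startRows_.push_back(0);
            values_.push_back(T());
        }
        nrow_ += n;
    }

    // Value of `row`: the entry with the largest start row <= row.
    const T& get(std::size_t row) const {
        assert(row < nrow_);
        auto it = std::upper_bound(startRows_.begin(), startRows_.end(), row);
        return values_[static_cast<std::size_t>(it - startRows_.begin()) - 1];
    }

    std::size_t nrow() const { return nrow_; }            // number of rows
    std::size_t nstored() const { return values_.size(); } // values kept

private:
    std::vector<std::size_t> startRows_;  // first row of each run, ascending
    std::vector<T> values_;               // value of each run
    std::size_t nrow_ = 0;
};
```

In this sketch, putting 1000 rows whose value changes every 50 rows keeps 20 stored values, and adding 10 more rows without a put leaves that count unchanged while the new rows read back the last value.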