#include <on-disk-point-set-list.h>
Inheritance diagram for OnDiskPointSetList: (diagram not shown; the class implements PointSetList)
This class is suitable for loading data that is too large to fit in memory. For performance, OnDiskPointSetList uses two caches: an LRU cache (of sorts), which remembers the most recently used PointSets, and what we call an "area" cache, which holds a contiguous block of PointSets. If an application is known to access data sequentially, it is generally more useful to make the LRU cache small (size 1) and the area cache large. If the application mostly uses random access, it is better to have a large LRU cache and a very small area cache.
Public Member Functions | |
OnDiskPointSetList (string filename) | |
Load from a particular file, using default cache sizes. | |
OnDiskPointSetList (string filename, int lru_cache_size, int area_cache_size) | |
Load from a particular file, with specified cache sizes. | |
virtual | ~OnDiskPointSetList () |
virtual int | GetNumPointSets () const |
Get the total number of PointSets in this PointSetList. | |
virtual int | GetNumPoints () const |
Get the total number of Features in this PointSetList. | |
virtual int | GetFeatureDim () const |
Get the dim of every Feature in this PointSetList. | |
virtual vector< int > | GetSetCardinalities () const |
Get the number of Features in each PointSet. | |
virtual const PointSet & | operator[] (int index) const |
Return a reference to the <index>th PointSet. READ WARNING BELOW! | |
const Feature & | GetFeature (int index) const |
Returns the <index>th Feature in the PointSetList. | |
pair< int, int > | GetFeatureIndices (int index) const |
Locate a Feature in a PointSet. | |
void | GetPointRefs (vector< PointRef > *out_refs) const |
Get PointRefs to every Feature in this PointSetList. | |
void | WriteToStream (ostream &output_stream) const |
Writes the entire PointSetList to a stream. | |
void | WriteHeaderToStream (ostream &output_stream) const |
Writes just the serialized header to a stream. | |
void | WritePointSetsToStream (ostream &output_stream) const |
Writes the point sets (without a header) sequentially to a stream. | |
void | WriteToFile (const char *output_file) const |
Writes the entire PointSetList to a file. | |
Static Public Attributes | |
static const int | DEFAULT_LRU_CACHE_SIZE |
Default value: 1500. | |
static const int | DEFAULT_AREA_CACHE_SIZE |
Default value: 1500. |
OnDiskPointSetList | ( | string | filename | ) |
Load from a particular file, using default cache sizes.
Makes one pass through the entire data set for preprocessing. Aborts if <filename> is not valid or not readable.
OnDiskPointSetList | ( | string | filename, | |
int | lru_cache_size, | |||
int | area_cache_size | |||
) |
Load from a particular file, with specified cache sizes.
Makes one pass through the entire data set for preprocessing. Aborts if <filename> is not valid or not readable.
~OnDiskPointSetList | ( | ) | [virtual] |
int GetNumPointSets | ( | ) | const [virtual] |
int GetNumPoints | ( | ) | const [virtual] |
Get the total number of Features in this PointSetList.
This is the sum of GetNumFeatures() over all PointSets in this PointSetList.
Implements PointSetList.
int GetFeatureDim | ( | ) | const [virtual] |
vector< int > GetSetCardinalities | ( | ) | const [virtual] |
Get the number of Features in each PointSet.
Returns a vector of size GetNumPointSets(), where the nth element is the number of Features in the nth PointSet in this PointSetList.
Implements PointSetList.
const PointSet & operator[] | ( | int | index | ) | const [virtual] |
Return a reference to the <index>th PointSet. READ WARNING BELOW!
Let A be the size of the area cache, and L be the size of the LRU cache. First, check whether the requested PointSet is in the area cache, and return it if it is there (O(1)). If it is not there, check the LRU cache (O(L)). If it is there, move it to the front of the LRU cache and return it. Otherwise, insert it into the LRU cache, and also fill up the area cache starting at the element just retrieved (O(A)).
Warning: The reference returned by this operator is no longer valid the next time operator[] is called. This means that whenever you want to access more than one PointSet at a time, you must make a copy of each one!
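The lookup order described above can be sketched as follows. This is only an illustration under stated assumptions, not the library's implementation: FakeDisk and TwoLevelCache are hypothetical names, the cached items are plain ints rather than PointSets, and the area cache is assumed to be a simple contiguous vector.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <utility>
#include <vector>

// Hypothetical stand-in for the on-disk file: reading item i yields i * 10.
struct FakeDisk {
  int reads;
  int size;
  explicit FakeDisk(int n) : reads(0), size(n) {}
  int Read(int index) { ++reads; return index * 10; }
};

// Hypothetical two-level cache following the steps in the text.
class TwoLevelCache {
 public:
  TwoLevelCache(FakeDisk* disk, std::size_t lru_size, std::size_t area_size)
      : disk_(disk), lru_size_(lru_size), area_size_(area_size),
        area_start_(0) {}

  int Get(int index) {
    assert(index >= 0 && index < disk_->size);
    // 1. Area cache: one contiguous block, so the check is O(1).
    if (index >= area_start_ &&
        index < area_start_ + static_cast<int>(area_.size())) {
      return area_[index - area_start_];
    }
    // 2. LRU cache: linear scan, O(L). On a hit, move the entry to the front.
    for (std::list<std::pair<int, int> >::iterator it = lru_.begin();
         it != lru_.end(); ++it) {
      if (it->first == index) {
        lru_.splice(lru_.begin(), lru_, it);
        return it->second;
      }
    }
    // 3. Miss: refill the area cache starting at index (O(A)) ...
    area_.clear();
    area_start_ = index;
    for (int i = index; i < disk_->size && area_.size() < area_size_; ++i) {
      area_.push_back(disk_->Read(i));
    }
    int value = area_.empty() ? disk_->Read(index) : area_.front();
    // ... and insert the item at the front of the LRU cache, evicting the
    // least recently used entry if the cache is over capacity.
    lru_.push_front(std::make_pair(index, value));
    if (lru_.size() > lru_size_) lru_.pop_back();
    return value;
  }

 private:
  FakeDisk* disk_;
  std::size_t lru_size_;
  std::size_t area_size_;
  int area_start_;
  std::vector<int> area_;                 // contiguous block of items
  std::list<std::pair<int, int> > lru_;   // (index, value), front = newest
};
```

In this sketch, with an LRU size of 2 and an area size of 4, calling Get(10) and then Get(11) costs four disk reads in total: the first call fills the area cache with items 10 through 13, and the second is served from it.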
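A minimal sketch of the copy-before-next-access pattern the warning calls for. ToyPointSet and ToyOnDiskList are hypothetical stand-ins, not library types: the toy container keeps a single reusable slot, so a stale reference merely shows new contents, whereas with the real class a stale reference should be treated as invalid.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a PointSet: just a vector of numbers.
typedef std::vector<double> ToyPointSet;

// Hypothetical container with one reusable slot, so each call to
// operator[] overwrites whatever the previous call returned.
class ToyOnDiskList {
 public:
  explicit ToyOnDiskList(int n) : size_(n) {}

  const ToyPointSet& operator[](int index) const {
    assert(index >= 0 && index < size_);
    slot_.assign(3, static_cast<double>(index));  // pretend to load from disk
    return slot_;
  }

 private:
  int size_;
  mutable ToyPointSet slot_;
};

// Uses two sets at once: the first is copied so that it survives the
// second call to operator[].
double SumFirstElements(const ToyOnDiskList& list, int i, int j) {
  ToyPointSet first_copy = list[i];     // copy: stays usable afterwards
  const ToyPointSet& second = list[j];  // reference: only until next access
  return first_copy[0] + second[0];
}
```

Holding both results as references instead would leave the first one referring to whatever the second access loaded.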
Implements PointSetList.
const Feature & GetFeature | ( | int | index | ) | const [inherited] |
Returns the <index>th Feature in the PointSetList.
We can also think of a PointSetList as one long vector of Features if we ignore the PointSet boundaries, so you can reference individual Features of a PointSetList through this function.
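The flat indexing described above can be related to per-set positions using the cardinalities returned by GetSetCardinalities(). The helper below is a hypothetical sketch, not part of the library; it assumes that a located Feature is described as a pair of (PointSet index, Feature index within that PointSet), which the source does not state explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: maps a flat Feature index to
// (point set index, index within that point set), given the number of
// Features in each point set. Returns (-1, -1) if index is out of range.
std::pair<int, int> LocateFeature(const std::vector<int>& cardinalities,
                                  int index) {
  assert(index >= 0);
  int remaining = index;
  for (std::size_t set = 0; set < cardinalities.size(); ++set) {
    if (remaining < cardinalities[set]) {
      return std::make_pair(static_cast<int>(set), remaining);
    }
    remaining -= cardinalities[set];  // skip past this whole point set
  }
  return std::make_pair(-1, -1);
}
```

Empty point sets are skipped naturally, because no index can fall inside a set of cardinality zero.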
pair< int, int > GetFeatureIndices | ( | int | index | ) | const [inherited] |
void GetPointRefs | ( | vector< PointRef > * | out_refs | ) | const [inherited] |
Get PointRefs to every Feature in this PointSetList.
Requires out_refs != NULL. Makes out_refs a vector of size GetNumPoints(), where the nth element is a PointRef pointing to the nth Feature (i.e., points to GetFeature(n)). If out_refs has something in it beforehand, it will be cleared prior to filling it.
void WriteToStream | ( | ostream & | output_stream | ) | const [inherited] |
Writes the entire PointSetList to a stream.
See the detailed description of PointSetList for the format.
void WriteHeaderToStream | ( | ostream & | output_stream | ) | const [inherited] |
Writes just the serialized header to a stream.
void WritePointSetsToStream | ( | ostream & | output_stream | ) | const [inherited] |
Writes the point sets (without a header) sequentially to a stream.
void WriteToFile | ( | const char * | output_file | ) | const [inherited] |
Writes the entire PointSetList to a file.
See the detailed description of PointSetList for the format.
const int DEFAULT_LRU_CACHE_SIZE [static] |
Default value: 1500.
const int DEFAULT_AREA_CACHE_SIZE [static] |
Default value: 1500.