util: SumTree implementation
The SumTree implements an efficient tree data structure for "roulette-wheel" sampling, or "sampling with fault expansion", i.e., sampling of trace entries / pilots without replacement and with a picking probability proportional to the entries' sizes.

For every sample, the naive approach picks a random number between 0 and the sum of all entry sizes minus one. It then iterates over all entries and sums their sizes until the sum exceeds the random number. The current entry gets picked. The main disadvantage is the linear complexity, which gets unpleasant for millions of entries.

The core idea behind the SumTree implementation is to maintain the size sum of groups of entries, kept in "buckets". Thereby, a bucket can be quickly jumped over. To keep bucket sizes (and thereby linear search times) bounded, more bucket hierarchy levels are introduced when a defined bucket size limit is reached.

Note that the current implementation is built for a pure growth phase (when the tree gets filled with pilots from the database), followed by a sampling phase when the tree gets emptied. It does not handle a mixed add/remove case very smartly, although it should remain functional.

Change-Id: If05e9700bc84761b5bc31006402641e7112b3a72
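The two approaches described in the commit message can be sketched roughly as follows. This is an illustration only, not the code in util/SumTree.hpp: the names Entry, pick_naive and BucketedSampler are hypothetical, and the sketch uses a single level of buckets, whereas the commit describes adding further hierarchy levels once a bucket size limit is reached.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical entry type for illustration; only its size (weight) matters.
struct Entry {
    uint64_t size;
};

// Naive approach: linear scan over all entries.
// pos must lie in [0, sum of all sizes). Removes and returns the entry
// at which the running size sum first exceeds pos.
Entry pick_naive(std::vector<Entry>& entries, uint64_t pos)
{
    uint64_t sum = 0;
    for (std::size_t i = 0; i < entries.size(); ++i) {
        sum += entries[i].size;
        if (sum > pos) {
            Entry e = entries[i];
            entries.erase(entries.begin() + static_cast<std::ptrdiff_t>(i));
            return e;
        }
    }
    assert(false && "pos out of range");
    return Entry{0};
}

// Bucket idea (one level only, an assumption of this sketch): each bucket
// caches the size sum of its entries, so a whole bucket can be skipped
// with a single comparison instead of visiting every entry in it.
struct Bucket {
    uint64_t size_sum = 0;
    std::vector<Entry> entries;
};

class BucketedSampler {
public:
    explicit BucketedSampler(std::size_t bucket_limit) : m_limit(bucket_limit) {}

    // Growth phase: append to the last bucket, open a new one when it is full.
    void add(const Entry& e)
    {
        if (m_buckets.empty() || m_buckets.back().entries.size() >= m_limit) {
            m_buckets.emplace_back();
        }
        m_buckets.back().entries.push_back(e);
        m_buckets.back().size_sum += e.size;
        m_total += e.size;
    }

    // Sum of the sizes of all entries still contained.
    uint64_t get_size() const { return m_total; }

    // Sampling phase: skip whole buckets, then scan linearly inside one.
    Entry get(uint64_t pos)
    {
        assert(pos < m_total);
        for (Bucket& b : m_buckets) {
            if (pos >= b.size_sum) {
                pos -= b.size_sum; // jump over the whole bucket
                continue;
            }
            Entry e = pick_naive(b.entries, pos);
            b.size_sum -= e.size;
            m_total -= e.size;
            return e;
        }
        assert(false && "unreachable if pos < m_total");
        return Entry{0};
    }

private:
    std::size_t m_limit;
    uint64_t m_total = 0;
    std::vector<Bucket> m_buckets;
};
```

With a random pos drawn uniformly from [0, get_size()), each get() call removes one entry with probability proportional to its size. In this sketch, zero-sized entries are never picked, and emptied buckets are simply left in place.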
34
src/core/util/testing/SumTreeTest.cc
Normal file
@@ -0,0 +1,34 @@
#include "util/SumTree.hpp"
#include <cstdint>   // uint32_t, uint64_t
#include <iostream>
#define LOG std::cerr

using std::endl;

struct Pilot { // test entry type; size() is the weight used for sampling
	uint32_t id;
	uint32_t instr2;
	uint32_t data_address;
	uint64_t duration;

	typedef uint64_t size_type;
	size_type size() const { return duration; }
};

int main()
{
	fail::SumTree<Pilot, 2> tree;
	for (int i = 0; i <= 20; ++i) { // growth phase: durations 0..20
		Pilot p = Pilot(); // value-initialize so the unused fields are zero
		p.duration = i;
		tree.add(p);
	}

	while (tree.get_size() > 0) { // sampling phase: empty the tree again
		uint64_t pos = tree.get_size() / 2; // deterministic: middle of the remaining size range
		LOG << "MAIN tree.get_size() = " << tree.get_size()
		    << ", trying to retrieve pos = " << pos << endl;
		Pilot p = tree.get(pos);
		LOG << "MAIN retrieved pilot with duration " << p.duration << endl;
	}
}