A partition of a set S is a separation of the set into an arbitrary number of no
ID: 648335 • Letter: A
Question
A partition of a set S is a separation of the set into an arbitrary number of non-empty, pairwise disjoint subsets whose union is exactly S. What manner of a data structure should be used to represent a partition of a set if the following methods need to be optimized:
moving an element from one part to another, possibly an entirely new one, and
iterating over the parts of the partition.
A naive way of prioritizing 1 would be a hash/tree/whatever mapping from set elements to "part labels", but iterating over the parts would require O(N) for first constructing the actual parts from the labels. 2 is naively prioritized as a hash/tree/whatever set of hash/tree/whatever sets, but then moving elements around, especially to new subsets, incurs that memory management overhead.
Is there a way to get the best of both worlds? The implementation I need is Python but I imagine this is a cross-language question.
Explanation / Answer
I'm not familiar with the topic, so all I can do is give a few pointers. Manipulating partitions of finite sets is known by several names:
union-find when you start with singletons and progressively merge sets;
partition refinement when you start with a single set and progressively split it;
union-split-find when sets can be both merged and split as in your case.
Katherine J. Lai's discusses the union-split-find problem and proposes an algorithm using B-trees that double as finger trees to represent each set. Each element is represented by a distinct integer; using consecutive values for elements that are likely to remain together is more efficient. To merge two sets, split them into non-overlapping intervals that are entirely contained in one of the sets, and join the pieces together. See her thesis for details.
Related Questions
drjack9650@gmail.com
Navigate
Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.