ddalg
Algorithms and data structures for my Python projects.
Interval tree
Interval tree is a data structure for holding intervals and to allow efficiently find intervals that overlap with given interval or point. Read more on Wikipedia.
Implementation note
This implementation uses halfopen intervals, where begin coordinate is excluded. Halfopen intervals are used in e.g. BED genomic format.
The current implementation needs to rebuild the tree after each insert
, hence the tree is not efficient for using in read/write fashion.
Usage
 implement your custom interval object while extending
Interval
. Two properties need to be overwritten:
begin
 0based (excluded) begin coordinate of the interval 
end
 0based (included) end coordinate of the intervalfrom ddalg.model import Interval class YourInterval(Interval): def __init__(self, begin: int, end: int): self._begin = begin self._end = end # anything else @property def begin(self): return self._begin @property def end(self): return self._end

 create a collection of your intervals and store them in the interval tree:
from ddalg.itree import IntervalTree itree = IntervalTree([YourInterval(0, 3), YourInterval(1, 4)]) # ...
 query
itree
: using 1based position:
itree.search(1)
returns
(0,3)
 using halfopen interval coordinates:
itree.get_overlaps(0, 1)
returns
(0,3)
, effectively the same query as above  for intervals with minimal required coverage
itree.fuzzy_query(0, 1, coverage=.90)
return intervals with >=.9 overlap with respect to query coordinates
 for intervals with minimal jaccard index
itree.jaccard_query(0, 1, min_jaccard=.90)
return intervals having jaccard_index>=.9 with respect to query coordinates
 using 1based position: