Tuesday 28 April 2015

Data Structure and Algorithm - Find the minimal intervals (Google Interview Question)`

1. Problem Description
This is a Google interview question for Interns from careercup. Here is the original description of this thread,

"
From a list of integer intervals, write a function to minimize the number of overlapping or consecutive ones. 

Test Input: [4, 8], [3, 5], [-1 2], [10, 12]
Test ouput: [-1, 8], [10,12]"
2. Analysis
As I introduced the sweep line algorithm in this blog entry, Data Structure and Algorithm - Sweep Line Algorithm, I believe that this problem can be solved with the similar idea.

Problem:
Given N intervals, [X1, Y1], [X2, Y2], ..., [Xn, Yn], where
    - Xi and Yi are all integers with i's range from 1 to N, and
    - the intervals are all inclusive on both ends.

Properties of the Sorted Lower and Upper Bounds:
According to the blog entry, Data Structure and Algorithm - Sweep Line Algorithm, we group the lower bounds and the upper bounds, {X1, X2, ..., Xn} and {Y1, Y2, ..., Yn}. Now we sort both lower bounds and upper bounds in the ascending order as,
    - Lower bounds: {L1, L2, ..., Ln}
    - Upper bounds: {U1, U2, ..., Un}
Because the lower and upper bounds are sorted, the following properties hold,
    - L1 <= L2 <= ... <= Ln
    - U1 <= U2 <= ... <= Un
    - Li <= Ui, for any given i between 1 to N
    - L1 <= L2 <= ... <= Li <= Ui, for any given i betwen 1 to N

Detect the Continuous/Overlap
Given the sorted lower and upper bounds,
    - Lower bounds: { L1, l2, ...,   Li, Li+1, ...., Ln}
    - Upper bounds: { U1, U2, ..., Ui, Ui+1, ..., Un]

Let's consider if Ui and Li+1 are overlapped, Based on the properties of the sorted lower and upper bounds, Ui is the largest value of {L1, U1, L2, U2, ..., Li, Ui} and Li+1 is the least value of { Li+1, Ui+1, Li+2, and Ui+2, ..., Ln, Un}. Therefore let's examine their relationship,
    - If Li+1 > 1 + Ui, then there is NO Continuous/Overlap here,
        * A interval with upper bound of Ui should stop here.
        * A new interval with the lower bound of Li+1 should start here
    - Otherwise there is either a CONTINUOUS or OVERLAP,
        * If Li+1 = 1 + Ui, then this is a CONTINUOUS  case.
        * Otherwise it is a OVERLAP case

3. Solution
Example:
Let's take a look at the example: [4, 8], [3, 5], [-1 2], [10, 12]
The sorted lower and upper bounds are,
    - Lower bounds: {-1, 3, 4, 10}
    - Upper bounds: { 2,  5, 8, 12}
Here is the algorithm runs
    - Take -1 as the lower bound of the first interval
    - Ui = 2, Li+1 = 3, where  i = 1. it is a CONTINUOUS case and continue
    - Ui = 5, Li+1 = 4, where i = 2, it is a OVERLAP case and continue
    - Ui = 8, Li+1 = 10, where i = 3, this is none of above cases
        * The first interval stops her as [-1, 8]
        * The second interval starts here with lower bound as 10
    - Reach the end, construct the second interval with the upper bound of Un, [10, 12]

Pseudo-code:
    - 1. Group lower bounds and upper bounds as {X1, ..., Xn} and {Y1, ..., Yn}
    - 2. Sort lower and upper bounds in ascending order as {L1, ..., Ln} and {U1, ..., Un}
    - 3. Take LowerBound = L1 as the lower bound of the first interval
    - 4. Compare Li+1 and Ui where i ranges from 1 to N-1
        * 4.1 if Li+1 > 1 + Ui, then 
                 construct the interval with Ui as the upper bound, [LowerBound, Ui]
                 update LowerBound = Li+1
                 go to Step 4.3
        * 4.2 if i = N - 1, then
                 construct the interval with Un as the upper bound, [LowerBound, Un]
                 Terminate the program
        * 4.3 Increment i by 1 and repeat Step 4         
This solution has O(NlogN) computation complexity and O(N) space complexity.

No comments:

Post a Comment