合并重叠区间

16

目前,我的间隔时间为:

temp_tuple = [[-25, -14], [-21, -16], [-20, -15], [-10, -7], [-8, -5], [-6, -3], [2, 4], [2, 3], [3, 6], [12, 15], [13, 18], [14, 17], [22, 27], [25, 30], [26, 29]]

按照下界升序排列。我的任务是合并重叠的区间,使结果为:

[-25, -14]
[-10, -3]
[2, 6]
[12, 18]
[22, 30]

我的第一次尝试是删除完全位于先前时间段内的时间段,比如[-21,-16],它在[-25,-14]内。但是,删除列表中的对象会干扰循环条件。我的第二次尝试是删除不必要的时间段:

i = 0
j = 1
while i < len(temp_tuples):
    while j < len(temp_tuples):
        if temp_tuples[i][1] > temp_tuples[j][1]:
            del temp_tuples[j]
        j += 1
    i += 1

但出于某些原因,这并不会删除所有不必要的间隔。我该怎么办?


3
不要试图从列表中删除元素,而是建立一个只包含必要元素的新列表。 - Paul Rooney
1
这个回答解决了你的问题吗?在Python中合并重叠的区间 - Guybrush
1
@Guybrush 应该应用闭包的另一个方向(已投票) - Tomerikoo
5个回答

28

如果您设置一个新列表,那么处理(即思考)会变得更容易一些。同时,您还可以保留原始数据。

temp_tuple.sort(key=lambda interval: interval[0])
merged = [temp_tuple[0]]
for current in temp_tuple:
    previous = merged[-1]
    if current[0] <= previous[1]:
        previous[1] = max(previous[1], current[1])
    else:
        merged.append(current)

如果您现在运行print(merged),将输出:

[[-25, -14], [-10, -3], [2, 6], [12, 18], [22, 30]]

我知道这是一个愚蠢的问题,如果它们是从temp_tuple列表中获取的,“当前”区间为什么会改变其上限? - Joon Choi
我不确定我完全理解。但是由于temp_tuple.sort(...)的存在,顺序并不重要。因此,我们只需要检查merged中的最后一个区间即可。 - vallentin

10
这是我想出来的一个numpy解决方案:
import numpy as np

def merge_intervals(intervals):
    starts = intervals[:,0]
    ends = np.maximum.accumulate(intervals[:,1])
    valid = np.zeros(len(intervals) + 1, dtype=np.bool)
    valid[0] = True
    valid[-1] = True
    valid[1:-1] = starts[1:] >= ends[:-1]
    return np.vstack((starts[:][valid[:-1]], ends[:][valid[1:]])).T

#example
a=[]
a.append([1,3])
a.append([4,10])
a.append([5,12])
a.append([6,8])
a.append([20,33])
a.append([30,35])

b = np.array(a)

print("intervals")
print(b)
print()
print("merged intervals")
print(merge_intervals(b))

输出:

intervals
[[ 1  3]
 [ 4 10]
 [ 5 12]
 [ 6  8]
 [20 33]
 [30 35]]

merged intervals
[[ 1  3]
 [ 4 12]
 [20 35]]

请注意,输入的数组必须按起始位置排序。

0
下面解决方案的显著特点是在输入列表并行运行一个名为final的列表。在进入for循环之前,final列表将第一个数组项插入其中。然后开始比较。就是这样!
def merger(a):

    final=[]

    final.append(a[0])

    for i in range(1,len(a)):

        if a[i][0]<=final[-1][1]:

           if a[i][1]>final[-1][1]:

               low_limit=final[-1][0]
               hi_limit=a[i][1]
               final.pop()
               final.append([low_limit,hi_limit])

           if a[i][1]<=final[-1][1]:

                low_limit=final[-1][0]
                hi_limit=final[-1][1]
                final.pop()
                final.append([low_limit,hi_limit])

          if final[-1][0]<a[i][1]<final[-1][1]:

           low_limit=a[i][0]
           hi_limit=final[-1][1]
           final.pop()
           final.append([low_limit,hi_limit])



        else:
            final.append(a[i])

    return final

if __name__=="__main__":

    array=[[-25, -14], [-21, -16], [-20, -15], [-10, -7], [-8, -5], [-6, -3], [2, 4], [2, 3], [3, 6], [12, 15], [13, 18], [14, 17], [22, 27], [25, 30], [26, 29]]
    print(merger(array))

输出:

[[-25, -14], [-10, -3], [2, 6], [12, 18], [22, 30]]

0
#Given an array of intervals in sorted order and a new interval, return a sorted array after merging the interval

def mergeinter(intervals,newinter):
    n = len(intervals)
    start = newinter[0]# we mark the start and end of the new interval to be merged
    end = newinter[1]
    right,left = 0,0
    while right < n:# we track where this new interval belongs, i.e. how many interval are to the left of it and how many are to the right
        if start <= intervals[right][1]:# we find the first interval before which it fits
            if end < intervals[right][0]:# this checks if the interval is disjoint and lies between two given intervals
                break# in this case we have nothing to do and go to line 29 directly
            start = min(start,intervals[right][0])# if it intersects with the given intervals then we just update and merge the ends
            end = max(end,intervals[right][1])
        else:# counting how many to the left continuing from line 20
            left += 1 
        right += 1# moving right to find the fit continuation of line 20 and even if we merge in line 25, we go to the next interval before
    return intervals[:left] + [(start,end)] + intervals[right:] # we return starting from the next interval

#Given a collection of intervals, merge all overlapping intervals and return sorted list of disjoint intervals.

def merge(I):
    I.sort(key:lambda i:i[0])# sorting according to the start of all intervals
    res = []# we start from the last of the given arr of lists and check the ends of the intervals and merge from the end
    for i in I:
        if not res or res[-1][0] < i[1]:# if res is empty then we put an elem in it from I
            res.append(i)# if there is no overlap, just add it
        else:
            res[-1][1] = max(i[1], res[-1][1])# here we merge from the end so that res remains sorted
    return res

0

我根据vallentin的回答做了一些改进,主要是使用copy.deepcopy来防止对输入进行任何更改。

请注意,使用sorted().copy()是不够的,因为在发生任何合并时,子元素会被更改。

我还添加了对每个区间的.sort()调用以处理未排序的输入,以及一个更明确的初始排序调用以提高清晰度。

代码:

import copy

def mergeIntervals(input_array, preserve_input=True):
    '''
    Python3 program for merging overlapping intervals.
    If preserve_input is False, the input_array can be modified! Not just sorted, but even sub-elements can change!

    See:
        https://www.educative.io/answers/what-is-the-merge-overlapping-intervals-problem
        https://www.geeksforgeeks.org/merging-intervals/
        https://dev59.com/uFcP5IYBdhLWcg3w8eW5
    '''
    if preserve_input:
        intervals = copy.deepcopy(input_array)
    else:
        intervals = input_array

    # Sort the given list of time intervals in ascending order of starting time.
    intervals.sort(key=lambda interval: interval[0])

    intervals[0].sort() # deal with unsorted input

    # Create a stack with the first interval
    merged = [intervals[0]]

    for current in intervals[1:]:
        current.sort() # deal with unsorted input
        previous = merged[-1]
        # Check for overlapping interval
        if current[0] <= previous[-1]: # If it’s overlapping, then merge them into one interval;
            previous[-1] = max(previous[-1], current[-1])
        else: # otherwise, push it in the stack.
            merged.append(current)

    return merged

##### Testing code:
A_list = []

# Example from original question:
input_array = [[-25, -14], [-21, -16], [-20, -15], [-10, -7], [-8, -5], [-6, -3], [2, 4], [2, 3], [3, 6], [12, 15], [13, 18], [14, 17], [22, 27], [25, 30], [26, 29]]
A_list.append(input_array)

# Example with unsorted elements and sub-elements:
input_array = [[4, 2], [3, 1]]
A_list.append(input_array)

for preserve_input in [False, True]:
    print(f'==> preserve_input = {preserve_input}')
    for idx, a in enumerate(A_list):
        print('------> idx = ', idx)
        input_array = copy.deepcopy(a)
        print('input before:', input_array)
        output_array = mergeIntervals(input_array, preserve_input=preserve_input)
        print('input after:', input_array)
        print('output', output_array)

        if input_array != a:
            print('Input modified!')
            if preserve_input:
                raise
        else:
            print('No change in input.')

输出:

请特别注意第一个示例中的[22, 27]在末尾附近变为[22, 30]

==> preserve_input = False
------> idx =  0
input before: [[-25, -14], [-21, -16], [-20, -15], [-10, -7], [-8, -5], [-6, -3], [2, 4], [2, 3], [3, 6], [12, 15], [13, 18], [14, 17], [22, 27], [25, 30], [26, 29]]
input after: [[-25, -14], [-21, -16], [-20, -15], [-10, -3], [-8, -5], [-6, -3], [2, 6], [2, 3], [3, 6], [12, 18], [13, 18], [14, 17], [22, 30], [25, 30], [26, 29]]
output [[-25, -14], [-10, -3], [2, 6], [12, 18], [22, 30]]
Input modified!
------> idx =  1
input before: [[4, 2], [3, 1]]
input after: [[1, 4], [2, 4]]
output [[1, 4]]
Input modified!
==> preserve_input = True
------> idx =  0
input before: [[-25, -14], [-21, -16], [-20, -15], [-10, -7], [-8, -5], [-6, -3], [2, 4], [2, 3], [3, 6], [12, 15], [13, 18], [14, 17], [22, 27], [25, 30], [26, 29]]
input after: [[-25, -14], [-21, -16], [-20, -15], [-10, -7], [-8, -5], [-6, -3], [2, 4], [2, 3], [3, 6], [12, 15], [13, 18], [14, 17], [22, 27], [25, 30], [26, 29]]
output [[-25, -14], [-10, -3], [2, 6], [12, 18], [22, 30]]
No change in input.
------> idx =  1
input before: [[4, 2], [3, 1]]
input after: [[4, 2], [3, 1]]
output [[1, 4]]
No change in input.

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接