56. Merge Intervals

題目 / Problem

中文： 給定一個區間陣列 intervals，其中 intervals[i] = [startᵢ, endᵢ] 代表第 i 個區間的起點與終點。請合併所有「重疊」的區間，並回傳一個由「互不重疊」的區間組成的陣列，使其涵蓋輸入中所有的原始區間。只要兩個區間有任何接觸（包含端點相等，例如 [1,4] 與 [4,5]），就視為重疊。

English: You are given an array intervals where intervals[i] = [startᵢ, endᵢ]. Merge every pair of overlapping intervals and return an array of non-overlapping intervals that together cover all the original intervals. Two intervals count as overlapping even if they only touch at an endpoint (e.g. [1,4] and [4,5] merge into [1,5]).

Constraints / 限制： - 1 <= intervals.length <= 10^4 - intervals[i].length == 2 - 0 <= startᵢ <= endᵢ <= 10^4

Worked example / 範例：

Input:  [[1,3],[2,6],[8,10],[15,18]]
Output: [[1,6],[8,10],[15,18]]

[1,3] 和 [2,6] 重疊（3 ≥ 2），合併成 [1,6]；其餘區間彼此不重疊，保持原樣。

名詞解釋 / Glossary

區間 / Interval： 一對數字 [start, end]，代表數線上從 start 到 end 的一段範圍。A pair [start, end] describing a segment of the number line from start to end.
重疊 / Overlap： 兩個區間有共同部分。判斷方式：若第二個區間的 start ≤ 第一個區間的 end，它們就重疊（本題端點相等也算）。Two intervals share points; here even touching endpoints counts, so start₂ <= end₁ means overlap.
排序 / Sorting： 把陣列依某個鍵值（這裡是 start）由小到大重新排列。Rearranging the array in increasing order of a key — here, the interval's start value.
比較函式 / Comparator： 傳給排序函式的一個小函式，告訴它「兩個元素誰應該排前面」。回傳負數代表第一個排前面。A small function handed to the sort routine that decides which of two elements comes first.
原地修改 / In-place： 直接在現有陣列上覆寫，而不另開等量的記憶體。Overwriting the existing array instead of allocating a second one of the same size.
malloc / free（C）： malloc 向系統借一塊記憶體並回傳指標；用完要 free 還回去，否則造成記憶體洩漏。malloc requests a block of memory and returns a pointer; free returns it to the system.
指標 / Pointer（C）： 一個存「記憶體位址」的變數。int** 是「指向 int 指標的指標」，常用來表示二維陣列。A variable holding a memory address; int** represents a 2-D array of ints.
vector（C++）： STL 的動態陣列，能自動增長，用 push_back 加元素、.size() 取長度。A growable array from the C++ standard library.

思路

最直覺的暴力法是：拿每一個區間去和其他所有區間比較，發現重疊就合併，然後重複整個過程直到沒有任何兩個區間能再合併為止。問題在於區間的順序是亂的，重疊的兩段可能隔得很遠，所以你必須反覆掃描，最壞情況是 O(n²) 甚至更糟，當 n 高達 10⁴ 時會很慢。關鍵的觀察是：如果我們先依照起點 start 由小到大排序，那麼所有可能和某段重疊的區間，一定都緊鄰在它後面。這樣就只需要從左到右掃一遍。排序後，我們維護一個「結果清單」，先把第一個區間放進去當作「目前正在累積的區間」。接著看下一個區間：因為已排序，它的 start 必定 ≥ 目前區間的 start，所以判斷重疊只要看「它的 start 是否 ≤ 目前區間的 end」。若是，就把目前區間的 end 更新成兩者 end 的較大值（延伸右端點）；若否，代表出現了一個缺口，目前區間已經結束，就把下一個區間當作新的「正在累積的區間」放入結果。這個「目前區間的 end」就是核心不變量——它永遠代表至今合併出的、尚未封閉的那一段的右邊界。

The brute-force idea is to compare every interval against every other one, merge any pair that overlaps, and repeat the whole sweep until nothing changes. The trouble is that the input order is arbitrary, so two overlapping pieces may sit far apart, forcing repeated scans — that's O(n²) or worse, which is sluggish at n = 10⁴. The key insight is that if we sort the intervals by their start value, then any interval that could possibly overlap a given one must come right after it in the sorted order. That lets us solve it in a single left-to-right pass. After sorting, keep a result list and seed it with the first interval as the "current" interval we are extending. For each following interval, since starts are sorted its start is ≥ the current start, so overlap reduces to one check: is its start <= the current end? If yes, extend the current interval's end to the larger of the two ends. If no, there is a gap — the current interval is finished, so append the new one and make it the current interval. The current interval's end is the invariant: it always marks the right edge of the still-open merged segment.

逐步走查 / Walkthrough

Example input: [[1,3],[2,6],[8,10],[15,18]]

Step 0 — 排序 / Sort by start: 已經是排好的順序 → [[1,3],[2,6],[8,10],[15,18]]. Seed result with the first interval.

步驟 / Step	看的區間 / Looking at	目前 end / current end	重疊？ start ≤ end?	動作 / Action	結果 / Result so far
init	`[1,3]`	3	—	放入第一個 / seed	`[[1,3]]`
1	`[2,6]`	3	2 ≤ 3 ✔	合併，end = max(3,6)=6 / merge	`[[1,6]]`
2	`[8,10]`	6	8 ≤ 6 �’✘	有缺口，新增 / append new	`[[1,6],[8,10]]`
3	`[15,18]`	10	15 ≤ 10 ✘	有缺口，新增 / append new	`[[1,6],[8,10],[15,18]]`

最終答案 / Final answer: [[1,6],[8,10],[15,18]] ✅

Solution — C

// 演算法：先依 start 排序，再從左到右掃描；若下一段的 start <= 目前段的 end 就合併
// （延伸右端點），否則開一個新段。排序 O(n log n)，掃描 O(n)。
// Algorithm: sort by start, sweep left to right; if the next start <= current end,
// merge by extending the right end, otherwise begin a new segment.

#include <stdlib.h>  // 提供 malloc / free / qsort / Provides malloc, free, qsort

// 比較函式：告訴 qsort 兩個區間如何排序，依 start 由小到大
// Comparator: tells qsort how to order two intervals, ascending by start
static int cmp(const void *a, const void *b) {
    // a、b 實際上是「指向 int* 的指標」，因為每個元素本身是 int*（一個區間）
    // a and b really point to int* elements (each element is an int* = one interval)
    int *ia = *(int **)a;            // 取出第一個區間的指標 / dereference to first interval
    int *ib = *(int **)b;            // 取出第二個區間的指標 / dereference to second interval
    return ia[0] - ib[0];            // 比較起點；負數表示 ia 排前面 / compare starts
}

// LeetCode 簽名：回傳二維陣列；用 returnSize 回傳列數，returnColumnSizes 回傳每列長度
// LeetCode signature: returns a 2-D array; report row count via returnSize,
// per-row lengths via returnColumnSizes.
int** merge(int** intervals, int intervalsSize, int* intervalsColSize,
            int* returnSize, int** returnColumnSizes) {
    // 先把區間依 start 排序，重疊者就會彼此相鄰
    // Sort intervals by start so overlapping ones become adjacent
    qsort(intervals, intervalsSize, sizeof(int *), cmp);

    // 最多就是輸入的列數（完全不重疊時），先配置這麼多空間
    // At most intervalsSize rows (when nothing merges); allocate that many
    int **result = (int **)malloc(sizeof(int *) * intervalsSize);
    *returnColumnSizes = (int *)malloc(sizeof(int) * intervalsSize);

    int k = 0;  // k 是結果中目前的列數，也是下一個寫入位置 / count of result rows so far

    for (int i = 0; i < intervalsSize; i++) {
        // 判斷是否要合併：k>0 表示已有段；且新 start <= 目前段的 end 代表重疊
        // Merge if there is a current segment (k>0) and new start <= its end
        if (k > 0 && intervals[i][0] <= result[k - 1][1]) {
            // 重疊：延伸右端點為兩者較大值 / overlap: extend end to the larger one
            if (intervals[i][1] > result[k - 1][1]) {
                result[k - 1][1] = intervals[i][1];
            }
        } else {
            // 不重疊：開新段，配置一個有 2 格的陣列存 [start, end]
            // No overlap: start a new segment; allocate a 2-slot array for [start, end]
            result[k] = (int *)malloc(sizeof(int) * 2);
            result[k][0] = intervals[i][0];   // 複製起點 / copy start
            result[k][1] = intervals[i][1];   // 複製終點 / copy end
            (*returnColumnSizes)[k] = 2;      // 這一列長度固定為 2 / this row has 2 columns
            k++;                              // 前進到下一個寫入位置 / advance write slot
        }
    }

    *returnSize = k;  // 回報實際合併後的列數 / report the real number of merged rows
    return result;
}

Solution — C++

// 演算法：依 start 排序後單趟掃描；新段 start <= 結果中最後一段的 end 就合併，
// 否則 push_back 一個新段。Sort by start, then one pass: merge when overlapping,
// otherwise append a new interval.

#include <vector>
#include <algorithm>   // 提供 std::sort 與 std::max / provides std::sort and std::max
using namespace std;

class Solution {
public:
    vector<vector<int>> merge(vector<vector<int>>& intervals) {
        // 依每個區間的 start（即 a[0]）由小到大排序
        // lambda 是匿名比較函式，告訴 sort 誰排前面
        // Sort by start (a[0]); the lambda is an inline comparator for std::sort
        sort(intervals.begin(), intervals.end(),
             [](const vector<int>& a, const vector<int>& b) {
                 return a[0] < b[0];   // a 的起點較小就排前面 / smaller start comes first
             });

        vector<vector<int>> result;  // 動態陣列，存放合併後的區間 / output, grows as needed

        for (const auto& iv : intervals) {   // range-for 逐一取出每個區間 / iterate each interval
            // 若結果為空，或目前區間 start > 最後一段的 end，代表沒重疊 → 新增
            // Empty result, or this start beyond the last end → no overlap, append
            if (result.empty() || iv[0] > result.back()[1]) {
                result.push_back(iv);        // 複製整段加到尾端 / append a copy
            } else {
                // 重疊：把最後一段的 end 延伸為兩者較大值 / overlap: extend the last end
                result.back()[1] = max(result.back()[1], iv[1]);
            }
        }

        return result;  // 回傳合併後互不重疊的區間 / return the merged, non-overlapping intervals
    }
};

複雜度 / Complexity

Time: O(n log n) — 排序佔主導，需要 O(n log n)；之後的單趟掃描只是 O(n)，兩者相加仍是 O(n log n)。n 是區間數量。Sorting dominates at O(n log n); the single sweep afterward is only O(n), so the total stays O(n log n), where n is the number of intervals.
Space: O(n)（或視排序實作為 O(log n) 額外空間）— 結果陣列最壞情況（完全不重疊）要存下全部 n 個區間。排序本身可能用到 O(log n) 的堆疊空間。The result holds up to n intervals when nothing merges; sorting itself may use O(log n) auxiliary stack space.

Pitfalls & Edge Cases

端點相等也算重疊 / Touching endpoints overlap： [1,4] 與 [4,5] 要合併成 [1,5]，所以判斷式必須用 start <= end（含等號），不能用 <。漏掉等號會錯誤地分成兩段。
必須先排序 / Must sort first： 若不排序，重疊區間不相鄰，單趟掃描會漏掉合併。排序是整個演算法成立的前提。Without sorting, overlapping intervals aren't adjacent and the one-pass logic breaks.
延伸 end 要取 max / Extend with max, not blind overwrite： 目前段可能完全包住新段（如目前 [1,10]、新 [2,3]），這時 end 不能被改小。用 max(currentEnd, newEnd) 保護。Use max so a fully-contained interval doesn't shrink the segment.
單一區間 / Single interval： 長度為 1 的輸入（如 [[1,2]]）必須原樣回傳；程式碼透過「先 seed / 結果為空時直接 append」自然處理，無需特例。
C 的記憶體與回傳大小 / C memory & return sizes： 一定要設定 *returnSize 與 *returnColumnSizes，否則 LeetCode 讀不到結果或讀到垃圾值。每列用 malloc 配置 2 格；只有真正新增段時才配置，避免浪費。
不要假設輸入已排序 / Don't assume sorted input： 範例 3 [[4,7],[1,4]] 的起點是遞減的，正凸顯排序的必要性。Example 3 has decreasing starts, showing why sorting is mandatory.