26. Remove Duplicates from Sorted Array

題目 / Problem

中文： 給你一個按 非遞減順序 排序的整數陣列 nums，請你原地刪除重複出現的元素，使每個元素 只出現一次，返回刪除後陣列的新長度 k。元素的 相對順序 應該保持一致。

要求： - 不要使用額外的陣列空間，必須在原地修改輸入陣列。 - 修改完成後，nums 的前 k 個元素應該按排序順序包含所有不重複的元素，後面的內容可以是任意值。

English: Given an integer array nums sorted in non-decreasing order, remove the duplicates in-place so that each unique element appears only once. The relative order of the elements should be kept the same. Return k, the number of unique elements. The first k slots of nums must hold the unique values in sorted order; anything after index k - 1 is ignored by the judge.

Constraints: - 1 <= nums.length <= 3 * 10^4 - -100 <= nums[i] <= 100 - nums is sorted in non-decreasing order.

Example:

Input:  nums = [0,0,1,1,1,2,2,3,3,4]
Output: 5, nums = [0,1,2,3,4,_,_,_,_,_]

The function returns k = 5, and the first five slots of nums are 0,1,2,3,4. The trailing slots (_) can be anything.

名詞解釋 / Glossary

In-place / 原地修改：直接在原本的陣列上做修改，而不是另外開一個新陣列。題目要求空間複雜度是 O(1) 的額外空間。 / Modifying the input array directly without allocating a new array — extra space must be O(1).
Two pointers / 雙指針：用兩個索引變數同時掃描陣列。常見模式包括「快慢指針」（兩個都從左走，但速度不同）和「對撞指針」（一個從左、一個從右）。本題用快慢指針。 / A technique using two index variables to traverse the array. The "fast/slow" variant (both moving left-to-right at different speeds) is what we use here.
Write pointer / 寫入指針：慢指針，指向「下一個要寫入唯一值的位置」。也就是目前已經整理好的唯一元素區段的長度。 / The slow pointer — it always points to the next slot where a new unique value should be written. Equivalently, it equals the count of unique values found so far.
Read pointer / 讀取指針：快指針，掃過整個陣列，負責檢查每個元素是不是新的唯一值。 / The fast pointer that scans every element to check whether it is a new unique value.
Sorted (non-decreasing) / 排序（非遞減）：相鄰元素滿足 a[i] <= a[i+1]。重要性質：所有相同的數一定連續出現，所以只要比較相鄰兩個元素就能判斷是否為重複。 / Adjacent elements satisfy a[i] <= a[i+1]. Crucially, all duplicates of the same value sit consecutively, so checking neighbors is enough to detect them.
Time / Space complexity / 時間與空間複雜度：分別衡量演算法跑多久、用多少額外記憶體（不算輸入本身）。O(n) 表示與輸入大小成正比。 / Measures of how runtime and extra memory grow with the input size n. O(n) means linear, O(1) means constant.

思路

因為陣列已經是排序好的，所有重複的元素一定是連續出現的，例如 [0,0,1,1,1,2,...]。最直觀的暴力做法是：每讀到一個元素就和前面所有保留的元素比較，若沒出現過就加入；但這樣是 O(n²)，而且若用一個新陣列就違反了「原地修改」的要求。我們需要更聰明的方法：利用「排序」這個性質——既然重複的元素都黏在一起，那只要比較目前的元素和「上一個保留下來的元素」是否相同就好。於是引入 雙指針（快慢指針）：慢指針 k 指向「下一個要寫入唯一值的位置」，快指針 i 從頭到尾掃過整個陣列。對於每個 nums[i]，如果它和最後一個保留的元素 nums[k-1] 不同，代表它是一個新的唯一值，就把它寫到 nums[k] 並把 k 加 1；否則跳過。第一個元素一定是唯一的，所以 k 從 1 開始，i 從 1 開始。最後 k 就是答案，且 nums[0..k-1] 已經是去重後的結果。整個流程只掃一次陣列，時間 O(n)，額外空間 O(1)。

Since the array is already sorted, every group of duplicates sits as a consecutive block, like [0,0,1,1,1,2,...]. A naive approach would compare each element to every kept element, which is O(n²) and typically needs a second array — neither acceptable here. The key insight is to exploit the sortedness: a value is a duplicate if and only if it equals the previous element. This unlocks the two-pointer (fast/slow) pattern. The slow pointer k marks the next write position (and equals the count of unique values found so far); the fast pointer i scans every element. Whenever nums[i] != nums[k-1], we have discovered a brand-new unique value, so we copy it into nums[k] and advance k. Otherwise we skip. The first element is trivially unique, so we start with k = 1 and i = 1. The invariant — nums[0..k-1] always contains the unique values seen so far in sorted order — holds throughout the loop. After one linear pass, k is the answer. Time O(n), extra space O(1).

逐步走查 / Walkthrough

Input: nums = [0,0,1,1,1,2,2,3,3,4], length n = 10. Initial state: k = 1 (the first element 0 is already unique and parked at index 0), i = 1.

step / 步驟	i	nums[i]	nums[k-1]	equal? / 相等？	action / 動作	k after	nums (前 k 項 / first k slots)
0	—	—	—	—	initial / 初始	1	`[0]`
1	1	0	0	yes	skip / 跳過	1	`[0]`
2	2	1	0	no	write `nums[1]=1`, k++ / 寫入並前進	2	`[0,1]`
3	3	1	1	yes	skip	2	`[0,1]`
4	4	1	1	yes	skip	2	`[0,1]`
5	5	2	1	no	write `nums[2]=2`, k++	3	`[0,1,2]`
6	6	2	2	yes	skip	3	`[0,1,2]`
7	7	3	2	no	write `nums[3]=3`, k++	4	`[0,1,2,3]`
8	8	3	3	yes	skip	4	`[0,1,2,3]`
9	9	4	3	no	write `nums[4]=4`, k++	5	`[0,1,2,3,4]`

Loop ends. Return k = 5. The first 5 slots of nums are [0,1,2,3,4]. ✅

Solution — C

/*
 * Algorithm / 演算法：
 *   Two pointers (fast/slow) on a sorted array.
 *   k = next write slot; i scans every element.
 *   Write nums[i] into nums[k] only when it differs from nums[k-1].
 *   雙指針：k 指向下一個寫入位置，i 掃過所有元素，遇到新值才寫入。
 */

int removeDuplicates(int* nums, int numsSize) {
    // 邊界情況：空陣列直接回傳 0 / Edge case: empty array → answer is 0.
    // 題目保證 numsSize >= 1，但寫上去比較保險。
    if (numsSize == 0) return 0;

    // k 是慢指針，代表「目前已經放好的唯一元素個數」，也是下一個要寫入的位置。
    // k is the slow pointer: count of unique elements stored so far AND the next write index.
    // 第一個元素一定是唯一的，所以 k 從 1 開始（nums[0] 自己就佔一個位置）。
    // The first element is trivially unique, so start k at 1 (nums[0] is already in place).
    int k = 1;

    // i 是快指針，從索引 1 開始掃到結尾 / Fast pointer i scans from index 1 to the end.
    for (int i = 1; i < numsSize; i++) {
        // 因為陣列已排序，重複值一定相鄰；只需比較目前值和「最後保留的值」nums[k-1]。
        // Since the array is sorted, duplicates are consecutive — compare nums[i] to nums[k-1].
        if (nums[i] != nums[k - 1]) {
            // 發現新的唯一值：把它寫到 nums[k]，再讓 k 前進一格。
            // Found a new unique value: copy it to nums[k], then advance k.
            // 註：nums[i] 和 nums[k] 可能是同一格（當 k == i），這時等於什麼都沒做，仍正確。
            // Note: when k == i, this is a self-assignment — harmless and still correct.
            nums[k] = nums[i];
            k++;  // 寫入後寫入位置往右移一格 / Move the write slot one step right.
        }
        // 若 nums[i] == nums[k-1]，代表是重複，直接跳過（i 自然會在下一輪前進）。
        // Otherwise it's a duplicate — skip it; the for-loop advances i automatically.
    }

    // 回傳唯一元素的個數；nums[0..k-1] 即為去重後的結果。
    // Return the count of unique elements; nums[0..k-1] holds the deduplicated values.
    return k;
}

Solution — C++

/*
 * Algorithm / 演算法：
 *   Same fast/slow two-pointer scan as the C version.
 *   Invariant: nums[0..k-1] always contains the unique values seen so far, in sorted order.
 *   不變量：nums[0..k-1] 永遠存放目前已遇到的所有唯一值，且仍維持排序。
 */

#include <vector>   // std::vector：動態陣列容器 / dynamic array container from the STL
using namespace std;

class Solution {
public:
    // LeetCode 簽章：傳入 vector<int>& （by reference 表示直接修改原陣列，不是複製）。
    // LeetCode signature: vector<int>& means "pass by reference" — we mutate the caller's array.
    int removeDuplicates(vector<int>& nums) {
        // .empty() 是 vector 的成員函式，O(1) 判斷是否為空。
        // .empty() is a vector member function returning true iff size() == 0; O(1).
        if (nums.empty()) return 0;

        // k：慢指針，下一個寫入位置；同時等於目前唯一元素的數量。
        // k: slow pointer / next write index / count of uniques so far.
        // 用 size_t（無號整數型別）以匹配 vector::size() 的回傳型別，避免有號/無號比較警告。
        // Use size_t to match vector::size()'s return type and avoid signed/unsigned warnings.
        size_t k = 1;

        // 範圍限制 for 迴圈：i 從 1 跑到 nums.size() - 1。
        // Classic indexed for-loop: i goes from 1 up to nums.size() - 1.
        for (size_t i = 1; i < nums.size(); ++i) {
            // 比較目前元素和最後一個已保留的唯一值。
            // Compare current element with the last kept unique value.
            if (nums[i] != nums[k - 1]) {
                // 發現新值，寫入並前進寫入指針。
                // New unique value found — write it and advance the write pointer.
                nums[k] = nums[i];
                ++k;  // 前置 ++ 在這裡和後置 ++ 等效，但前置習慣上略快（雖然對 int 沒差）。
                      // Pre-increment is idiomatic in C++ (matters for iterators, not for int).
            }
            // 否則是重複，略過。/ Otherwise it's a duplicate — skip.
        }

        // 回傳值要轉回 int 以符合題目簽章 / Cast back to int to match the required return type.
        return static_cast<int>(k);
    }
};

複雜度 / Complexity

Time: O(n) — 我們只用一個迴圈把陣列從頭到尾掃一遍，每個元素最多做 1 次比較和 1 次寫入。n 是 nums 的長度。 / We perform a single linear scan; each element is visited once, with O(1) work per element. n is the length of nums.
Space: O(1) — 只用了兩個整數變數 i 和 k，沒有額外配置任何陣列或容器。 / Only two integer variables (i and k) — no auxiliary arrays or containers.

Pitfalls & Edge Cases

回傳值 vs 陣列長度 / Return value vs array length：題目要回傳的是「唯一元素的數量 k」，不是陣列總長度 numsSize。判題程式只會檢查 nums[0..k-1]，後面的格子可以是任何值。 / Return k, not numsSize. The judge only inspects nums[0..k-1]; trailing slots are ignored.
單一元素 / Single element：numsSize == 1 時，k 從 1 開始，迴圈不會執行，直接回傳 1，正確。 / When the array has one element, the loop body never runs and we correctly return 1.
全部相同 / All duplicates：例如 [2,2,2,2]，每次比較都相等，k 永遠是 1，回傳 1，前 1 格是 [2]，正確。 / Inputs like [2,2,2,2] end with k = 1 and nums[0] = 2, as required.
全部不同 / All distinct：例如 [1,2,3,4]，每次都會寫入，但因為 k == i，寫入是「自己寫給自己」，無副作用，回傳 4，正確。 / Inputs like [1,2,3,4] perform a self-copy on each iteration (k == i), which is safe.
不要用 nums[i] != nums[i-1] / Don't compare against nums[i-1]：那是另一種寫法，但搭配寫入時要小心邏輯，因為「最後寫入的值」其實是 nums[k-1]，而不是 nums[i-1]。本題用 nums[k-1] 為基準比較最直接。 / A common alternative is comparing nums[i] to nums[i-1]. It works but is slightly more error-prone; comparing to nums[k-1] makes the invariant obvious.
Off-by-one on k：k 同時扮演「計數」和「下一個寫入位置」兩個角色，這只在「先寫入再 k++」的順序下成立。若先 k++ 再寫入會錯位一格。 / k doubles as both the count and the next write index — this only works if you write first then increment. Reversing the order causes an off-by-one bug.
空陣列保護 / Guard against empty input：題目雖保證 numsSize >= 1，但 if (numsSize == 0) return 0; 可避免 nums[k-1] 在 k = 1 時越界（雖然在保證下不會發生，仍是良好習慣）。 / The constraints guarantee n >= 1, but the empty-input guard prevents nums[k-1] from being read when no element exists — a defensive habit worth keeping.