80. Remove Duplicates from Sorted Array II
題目 / Problem
中文:
給定一個按 非遞減順序 排序的整數陣列 nums,請 原地 移除部分重複元素,使每個唯一元素 最多出現兩次。元素的相對順序必須保持不變。
由於某些語言無法改變陣列長度,你必須將結果放在陣列 nums 的 前段。具體來說,若移除重複後剩下 k 個元素,則 nums 的前 k 個元素應為最終結果。k 之後的內容是什麼都無所謂。
回傳 k。要求 不可 額外開新陣列,必須使用 O(1) 額外空間就地修改。
English:
Given an integer array nums sorted in non-decreasing order, remove some duplicates in-place so that each unique element appears at most twice, preserving the relative order. Place the result in the first k slots of nums and return k. You must use O(1) extra memory.
Constraints:
- 1 <= nums.length <= 3 * 10^4
- -10^4 <= nums[i] <= 10^4
- nums is sorted in non-decreasing order.
Example:
Input: nums = [1,1,1,2,2,3]
Output: 5, with nums = [1,1,2,2,3,_]
名詞解釋 / Glossary
- In-place / 原地修改:直接在輸入資料結構上修改,不開新陣列;只允許常數額外空間。Modifying the input directly without allocating a new array; only O(1) extra memory is allowed.
- Two pointers / 雙指針:用兩個索引同時掃描陣列。常見的一種是「快慢指針」:快指針
i讀取資料,慢指針k標記下一個寫入位置。Two indices traversing the array. A common form is "fast/slow": fast indexireads, slow indexkmarks the next write slot. - Write pointer / 寫入指標:慢指針的角色,指向目前可以寫入結果的位置。寫入後遞增。The slow pointer's role; points to the next slot where a kept element should be written, then advances.
- Sorted array invariant / 已排序陣列的不變量:相同元素必定相鄰。這讓我們不需要 hash 計數,只要看「往前 2 格」是否相同就能判斷重複次數。Equal elements are guaranteed to be adjacent, so we can detect duplicates by comparing with elements 1 or 2 slots earlier instead of using a hash map.
- k-th occurrence rule / 第 k 次出現規則:本題上限是 2,所以判斷條件是「目前元素是否與
nums[k-2]不同」;若不同代表此值出現次數仍 < 2,可以保留。The "at most twice" rule turns into the comparisonnums[i] != nums[k-2]; if true, this value has appeared fewer than twice so far and we keep it.
思路
最直觀的暴力做法是:開一個新陣列,計算每個值的出現次數,最多複製兩份回去。可是題目要求 O(1) 額外空間,所以不能新開陣列。第二個直覺是「邊掃邊用 hash map 計次」,但既然輸入已經排序,相同的值必定相鄰,根本不需要 hash 表。關鍵觀察是:我們只需要兩個指標。慢指標 k 代表「下一個合法元素該寫到哪裡」,快指標 i 從頭掃到尾讀資料。對每個 nums[i],問一個簡單的問題:如果我把它寫到位置 k,這個值在前面(合法區間 nums[0..k-1])會不會已經出現兩次?由於前面已經是排好序、最多重複兩次的合法狀態,要檢查只需要看 nums[k-2]:若 nums[i] != nums[k-2],代表此值還沒滿兩次,可以寫;否則跳過。前兩個元素 (k < 2) 一定可以無條件寫入,因為任何值最多出現兩次的限制此時還不可能違反。最後 k 就是答案。這個演算法 O(n) 時間、O(1) 空間,且只用一次掃描。
The brute-force idea is to allocate a new array and copy at most two of each value back, but the problem forbids extra space. A second instinct — using a hash map for counts — is also overkill, because the input is already sorted, so equal values are always adjacent. The key insight is that we only need two pointers. The slow pointer k marks where the next kept element should be written, and the fast pointer i scans the input. For each nums[i], ask: if I write it to position k, would this value already appear twice in the kept prefix nums[0..k-1]? Since the prefix is itself sorted and follows the "at most twice" rule, we just compare with nums[k-2]: if nums[i] != nums[k-2], the value has appeared fewer than two times and we keep it; otherwise we skip. The first two elements (k < 2) are always safe to keep, because no value can have exceeded the limit yet. At the end k is the answer. The algorithm runs in O(n) time and O(1) space in a single pass.
逐步走查 / Walkthrough
Input: nums = [1,1,1,2,2,3], initial k = 0.
| i | nums[i] | k before | Condition | Action | nums after | k after |
|---|---|---|---|---|---|---|
| 0 | 1 | 0 | k < 2 → keep | write nums[0] = 1 | [1,1,1,2,2,3] | 1 |
| 1 | 1 | 1 | k < 2 → keep | write nums[1] = 1 | [1,1,1,2,2,3] | 2 |
| 2 | 1 | 2 | nums[2]=1, nums[k-2]=nums[0]=1 → equal, skip | (no write) | [1,1,1,2,2,3] | 2 |
| 3 | 2 | 2 | nums[3]=2, nums[k-2]=nums[0]=1 → differ, keep | write nums[2] = 2 | [1,1,2,2,2,3] | 3 |
| 4 | 2 | 3 | nums[4]=2, nums[k-2]=nums[1]=1 → differ, keep | write nums[3] = 2 | [1,1,2,2,2,3] | 4 |
| 5 | 3 | 4 | nums[5]=3, nums[k-2]=nums[2]=2 → differ, keep | write nums[4] = 3 | [1,1,2,2,3,3] | 5 |
Final k = 5, prefix [1,1,2,2,3] ✅
Solution — C
// 演算法:用慢指針 k 當寫入位置;對每個 nums[i],若 k<2 或 nums[i]!=nums[k-2] 就保留。
// Algorithm: slow pointer k is the write index; keep nums[i] iff k<2 or nums[i]!=nums[k-2].
// 因為陣列已排序,前面區段最多出現兩次的不變量讓 nums[k-2] 比較足以判斷重複次數。
// Because the array is sorted, comparing with nums[k-2] is enough to enforce "at most twice".
int removeDuplicates(int* nums, int numsSize) {
// k 是下一個寫入位置,同時也是目前合法前綴的長度 / k = next write slot, also current kept length
int k = 0;
// 用 i 從頭掃到尾讀取每個元素 / iterate i from 0 to numsSize-1, reading each element
for (int i = 0; i < numsSize; i++) {
// 條件 1:前兩個元素無條件寫入(不可能違反「最多兩次」)
// Condition 1: the first two elements are always safe to keep
// 條件 2:若 nums[i] 與目前合法前綴倒數第二個 nums[k-2] 不同,代表此值尚未滿兩次
// Condition 2: if nums[i] differs from nums[k-2], this value has appeared fewer than twice
if (k < 2 || nums[i] != nums[k - 2]) {
// 把當前元素寫入位置 k;nums[i] 是讀,nums[k] 是寫 / write nums[i] into slot k
nums[k] = nums[i];
// 寫入後 k 前進一格 / advance write pointer after the write
k++;
}
// 若條件不成立就跳過此元素(相當於刪除)/ otherwise skip — effectively deletes this duplicate
}
// 回傳合法前綴的長度,即答案 k / return the length of the kept prefix
return k;
}
Solution — C++
// 演算法與 C 版完全相同:兩個指針,k 為寫入位置,逐一檢查 nums[i] 是否與 nums[k-2] 相同。
// Algorithm identical to the C version: two pointers, k as write index, compare with nums[k-2].
// 用 std::vector<int>& 接收參數;vector 是 C++ 的動態陣列,operator[] 提供 O(1) 隨機存取。
// We take a std::vector<int>& — vector is C++'s dynamic array with O(1) indexed access via [].
class Solution {
public:
int removeDuplicates(vector<int>& nums) {
// size() 回傳容器內元素個數,型別為 size_t(無號整數)/ size() returns element count (size_t)
int n = static_cast<int>(nums.size());
// k 是慢指針,代表下一個寫入位置 / k is the slow pointer = next write slot
int k = 0;
// range 不可變、簡單明瞭時用一般 for;此處需要索引 i,所以用傳統 for 迴圈
// Use a classic for-loop because we need the index i (not just the value)
for (int i = 0; i < n; i++) {
// 同樣的判斷:前兩個直接放,之後比 nums[k-2] / same rule as C version
if (k < 2 || nums[i] != nums[k - 2]) {
// 寫入:vector 的 [] 與 C 陣列一樣可以直接賦值 / vector[] supports direct assignment
nums[k] = nums[i];
// 寫入後 k 自增 / advance k after writing
++k;
}
}
// 回傳合法前綴長度 / return the kept prefix length
return k;
}
};
複雜度 / Complexity
- Time: O(n) — 只用一個 for 迴圈從頭掃到尾,每個元素僅被讀寫常數次。
n是nums的長度。A single pass over the array; each element is read and possibly written once.nis the length ofnums. - Space: O(1) — 只用了
i、k兩個整數變數,沒有額外配置陣列。Only two integer variables (i,k) are used; no auxiliary array is allocated, satisfying the in-place requirement.
Pitfalls & Edge Cases
- 誤用
k-1而非k-2/ Usingk-1instead ofk-2:比nums[k-1]等於要求「每個值只能出現一次」(這是 LeetCode 26 的版本)。本題上限是兩次,所以必須比nums[k-2]。Comparing againstnums[k-1]collapses the rule to "at most once" (problem 26). For "at most twice" you must look two slots back. - 忘了處理 k < 2 / Forgetting the
k < 2guard:若k < 2就直接存取nums[k-2]會讀到nums[-1]或nums[-2],是越界未定義行為。短路求值k < 2 ||在前面,C/C++ 都不會再評估右側。Without thek < 2short-circuit,nums[k-2]becomesnums[-1]ornums[-2]— undefined behavior. The||short-circuits in both C and C++. - 回傳長度 vs. 回傳陣列 / Returning length, not array:題目要求回傳整數
k,並要求nums前k格已被修改。如果只回傳k卻沒實際改寫陣列前段,judge 仍會失敗。You must both modifynumsin place and returnk; doing only one fails the custom judge. - 以為需要排序輸入 / Assuming input may be unsorted:演算法的正確性完全依賴「相同值相鄰」。若輸入未排序,
nums[k-2]比較不再有意義。題目保證已排序,所以可以直接用。Correctness depends on equal values being adjacent. The problem guarantees this; the algorithm would not work on an unsorted input. - 空陣列 / Empty array:題目保證
n >= 1,但本演算法仍能正確處理n == 0:迴圈不執行,回傳k = 0。The constraints rule it out, but the code still handlesn == 0correctly — the loop is skipped andk = 0is returned.