219. Contains Duplicate II
題目 / Problem
中文: 給定一個整數陣列 nums 和一個整數 k,請判斷陣列中是否存在兩個不同的索引 i 和 j,使得 nums[i] == nums[j] 並且 abs(i - j) <= k。如果存在,回傳 true,否則回傳 false。換句話說:是否有兩個「值相同」的元素,它們的位置間距不超過 k?
English: Given an integer array nums and an integer k, return true if there exist two distinct indices i and j such that nums[i] == nums[j] and abs(i - j) <= k. In plain words: are there two equal values whose positions are at most k apart?
Constraints / 限制條件:
- 1 <= nums.length <= 10^5
- -10^9 <= nums[i] <= 10^9(數值可正可負,範圍很大)
- 0 <= k <= 10^5
Worked example / 範例:
- Input: nums = [1,2,3,1], k = 3
- Output: true
- 理由 / Why: 索引 0 和索引 3 的值都是 1,而 abs(0 - 3) = 3 <= 3,符合條件。 / Indices 0 and 3 both hold 1, and abs(0-3)=3 <= 3.
名詞解釋 / Glossary
- 雜湊表 / hash map (hash table): 一種把「鍵 key」對應到「值 value」的資料結構,查詢、插入、更新平均只要 O(1) 時間。這裡我們用它記住「某個數值最後一次出現在哪個索引」。 / A structure mapping a key to a value with average O(1) lookup/insert/update. Here it remembers the most recent index where each value appeared.
- 索引 / index: 元素在陣列中的位置編號,從
0開始算。 / The position number of an element in the array, starting at0. - 滑動視窗 / sliding window: 一種只關注「最近
k個元素」的思考方式;當我們往右移動時,太舊的元素就會離開視窗。 / A technique that only cares about the most recentkelements; as we move right, stale elements fall out of the window. - 暴力解 / brute force: 不講技巧、直接檢查所有可能組合的做法,通常正確但很慢。 / The naive approach that checks every possible pair; correct but slow.
- 絕對值 / absolute value (
abs): 一個數去掉正負號後的大小,例如abs(-3) = 3。 / The magnitude of a number ignoring sign, e.g.abs(-3)=3.
思路
中文:最直接的想法是暴力解——對每一對索引 (i, j) 都檢查 nums[i] == nums[j] 且 abs(i - j) <= k。但這需要兩層迴圈,時間複雜度是 O(n²),當 n 高達 10⁵ 時會超時。我們需要更聰明的做法。關鍵觀察是:題目只關心「相同的值」之間的「最近距離」。對於任何一個值,如果它出現很多次,只有「相鄰的兩次出現」最有可能滿足 abs(i - j) <= k;更遠的兩次反而距離更大。所以我們只需要對每個值記住「它最後一次出現的索引」即可。我們用一個雜湊表 lastSeen,鍵是數值、值是該數值最後出現的索引。從左到右掃描陣列:對於目前的 nums[i],先看雜湊表裡有沒有這個值;如果有,代表之前出現過,計算現在的索引 i 和上次索引的差距,如果 <= k 就立刻回傳 true。不論有沒有命中,最後都把 lastSeen[nums[i]] 更新成目前的 i,因為下一次再遇到同樣的值時,跟「最近一次」比距離才是最小的。掃完都沒找到就回傳 false。這個做法只掃一遍,時間 O(n)。
English: The naive idea is brute force: for every pair of indices (i, j), check whether the values match and the distance is within k. That's two nested loops, O(n²), which times out when n reaches 10⁵. The key insight is that for any repeated value, only the closest pair of occurrences has a chance of satisfying abs(i - j) <= k — occurrences farther apart are only worse. So for each value we only need to remember its most recent index. We keep a hash map lastSeen mapping a value to the last index where we saw it. Scanning left to right, for the current nums[i] we check the map: if the value is already there, the gap between i and its stored index is the smallest possible gap so far, so if that gap is <= k we immediately return true. Whether or not it matched, we then overwrite lastSeen[nums[i]] = i, because the next time this value appears, comparing against the latest index gives the smallest distance. If the scan finishes with no hit, return false. One pass, O(n) time.
逐步走查 / Walkthrough
Example: nums = [1,2,3,1], k = 3. We keep a hash map lastSeen (value → last index), starting empty.
| Step / 步驟 | i | nums[i] | lastSeen before / 之前 |
In map? / 表中有嗎 | Check / 檢查 | Action / 動作 |
|---|---|---|---|---|---|---|
| 1 | 0 | 1 | {} |
No / 沒有 | — | 存入 lastSeen[1]=0 / store 1→0 |
| 2 | 1 | 2 | {1:0} |
No / 沒有 | — | 存入 lastSeen[2]=1 / store 2→1 |
| 3 | 2 | 3 | {1:0, 2:1} |
No / 沒有 | — | 存入 lastSeen[3]=2 / store 3→2 |
| 4 | 3 | 1 | {1:0, 2:1, 3:2} |
Yes, last index 0 / 有,上次是 0 | abs(3-0)=3 <= 3 ✅ |
return true |
在第 4 步,值 1 之前在索引 0 出現過,距離 3 不超過 k=3,立即回傳 true。 / At step 4, value 1 was last seen at index 0; the gap 3 is within k=3, so we return true.
Solution — C
// 演算法 / Algorithm:
// 用雜湊表記住「每個數值最後出現的索引」,從左到右掃一遍;
// 遇到重複值時,比較它和上次索引的距離是否 <= k。
// Use a hash map storing each value's last index; scan once and check the gap.
//
// LeetCode 沒有內建雜湊表,這裡用「開放定址法」自己實作一個簡單的雜湊表。
// LeetCode's C has no hash map, so we hand-roll one using open addressing.
#include <stdbool.h> // 提供 bool / true / false 型別 / gives us bool, true, false
#include <stdlib.h> // 提供 malloc, free / for malloc and free
#include <string.h> // 提供 memset / for memset
bool containsNearbyDuplicate(int* nums, int numsSize, int k) {
// 雜湊表容量取大於資料量的 2 的次方,減少碰撞 / capacity > n, power of two, fewer collisions
int cap = 1; // 容量從 1 開始 / start capacity at 1
while (cap < numsSize * 2) cap <<= 1; // 左移一位 = 乘 2,直到夠大 / shift left = ×2 until big enough
int mask = cap - 1; // cap 是 2 次方時,x & mask 等同 x % cap / fast modulo via bitmask
// keys 存數值,idxs 存對應的索引,used 標記某格是否被佔用
// keys holds values, idxs holds their indices, used marks whether a slot is taken
int* keys = (int*) malloc(sizeof(int) * cap); // 配置記憶體 / allocate memory
int* idxs = (int*) malloc(sizeof(int) * cap); // 配置記憶體 / allocate memory
bool* used = (bool*) calloc(cap, sizeof(bool)); // calloc 會把記憶體清成 0/false / zero-initialized
bool answer = false; // 預設答案為 false / default answer
for (int i = 0; i < numsSize; i++) { // 逐一掃描每個元素 / scan each element
int v = nums[i]; // 目前的數值 / current value
// 計算起始雜湊槽;轉成 unsigned 避免負數取模出問題 / hash slot; cast to unsigned to avoid negative issues
unsigned int h = ((unsigned int) v) & mask;
// 線性探測:若該槽被佔用且 key 不同,就往後找下一個空槽或相同 key
// Linear probing: if slot is taken by a different key, move to the next slot
while (used[h] && keys[h] != v) {
h = (h + 1) & mask; // 往後移一格並繞回開頭 / step forward, wrap around
}
if (used[h]) { // 找到相同的值,代表之前出現過 / value seen before
if (i - idxs[h] <= k) { // 因 i 一定比舊索引大,不需 abs / i is always larger, so no abs needed
answer = true; // 距離在範圍內,找到答案 / within range, found it
break; // 提早結束迴圈 / stop early
}
idxs[h] = i; // 否則更新成最新索引,下次比較才最近 / refresh to newest index
} else { // 此值第一次出現 / first time seeing this value
used[h] = true; // 標記此槽已佔用 / mark slot as used
keys[h] = v; // 存入數值 / store the value
idxs[h] = i; // 存入索引 / store the index
}
}
free(keys); // 釋放記憶體,避免外洩 / free memory to avoid leaks
free(idxs); // 同上 / same
free(used); // 同上 / same
return answer; // 回傳最終結果 / return the result
}
Solution — C++
// 演算法 / Algorithm:
// 用 unordered_map 記住每個數值最後出現的索引,掃描一遍;
// 遇到重複且距離 <= k 即回傳 true。
// Use unordered_map to store each value's last index; one pass, return true on a close repeat.
#include <vector> // 提供 std::vector / for std::vector
#include <unordered_map> // 提供 std::unordered_map(雜湊表)/ for std::unordered_map (a hash map)
using namespace std;
class Solution {
public:
bool containsNearbyDuplicate(vector<int>& nums, int k) {
// unordered_map:鍵=數值,值=該數值最後出現的索引;平均 O(1) 查詢
// unordered_map: key = value, value = its last index; average O(1) lookups
unordered_map<int, int> lastSeen;
// range-for + index:用傳統 for 才能拿到索引 i / classic for loop so we have index i
for (int i = 0; i < (int) nums.size(); i++) {
int v = nums[i]; // 目前數值 / current value
// find 回傳一個迭代器;若等於 end() 代表沒找到 / find returns an iterator; end() means not found
auto it = lastSeen.find(v);
if (it != lastSeen.end()) { // 這個值之前出現過 / this value appeared before
// it->second 是儲存的舊索引;i 一定較大,故不需 abs / it->second is the old index; i is larger
if (i - it->second <= k) {
return true; // 距離在 k 內,找到答案 / within k, found it
}
it->second = i; // 更新成最新索引 / update to the newest index
} else {
lastSeen[v] = i; // 第一次出現,記下索引 / first occurrence, record index
}
}
return false; // 掃完都沒找到 / nothing found after the full scan
}
};
複雜度 / Complexity
- Time: O(n) — 我們只把陣列掃描一遍(
n是nums的長度),每個元素的雜湊表查詢與更新平均都是 O(1),所以總時間由這一遍掃描主導。 / We scan the array once (n= length ofnums); each hash-map lookup/update is O(1) on average, so the single pass dominates. - Space: O(min(n, k+1)) — 雜湊表最多存下不同的數值。實作上最壞是 O(n)(全部不重複);但概念上只需保留視窗內的值,可視為 O(min(n, k))。 / The hash map stores at most the distinct values seen; worst case O(n) when all values differ. Conceptually only window-sized state is needed.
Pitfalls & Edge Cases
- 不需要
abs/ Noabsneeded: 因為我們由左往右掃描,目前的索引i一定大於存在表裡的舊索引,所以i - oldIndex必為正,直接比<= k即可。 / Because we scan left-to-right,iis always greater than the stored index, so the difference is positive — no need to callabs. - 記得更新最新索引 / Always refresh to the newest index: 若值重複但距離超過
k,必須把索引更新成目前的i,否則之後比較的是更舊、更遠的索引,會漏掉正確答案。 / When a repeat is too far, you must still update the stored index toi; otherwise later comparisons use a stale, farther index and miss valid answers. - 數值範圍大、可為負 / Large and negative values:
nums[i]範圍是 ±10⁹,不能拿來當陣列下標。雜湊表(或 C 裡的雜湊取模)正好處理任意整數鍵。 / Values span ±10⁹ and can be negative, so they can't index a plain array; a hash map handles arbitrary integer keys. k = 0的情況 / Thek = 0case: 此時要求abs(i-j) <= 0,即i == j,但i和j必須是不同索引,所以永遠回傳false。我們的程式自然處理:距離至少為 1,不會<= 0。 / Withk = 0the gap must be 0 but indices must differ, so the answer is alwaysfalse; our code handles this since any real gap is>= 1.- 整數溢位 / Overflow: 索引相減
i - oldIndex最多約 10⁵,遠在int範圍內,無溢位風險。 / The index difference is at most ~10⁵, safely withinint.