219. Contains Duplicate II

題目 / Problem

中文： 給定一個整數陣列 nums 和一個整數 k，請判斷陣列中是否存在兩個不同的索引 i 和 j，使得 nums[i] == nums[j] 並且 abs(i - j) <= k。如果存在，回傳 true，否則回傳 false。換句話說：是否有兩個「值相同」的元素，它們的位置間距不超過 k？

English: Given an integer array nums and an integer k, return true if there exist two distinct indices i and j such that nums[i] == nums[j] and abs(i - j) <= k. In plain words: are there two equal values whose positions are at most k apart?

Constraints / 限制條件： - 1 <= nums.length <= 10^5 - -10^9 <= nums[i] <= 10^9（數值可正可負，範圍很大） - 0 <= k <= 10^5

Worked example / 範例： - Input: nums = [1,2,3,1], k = 3 - Output: true - 理由 / Why: 索引 0 和索引 3 的值都是 1，而 abs(0 - 3) = 3 <= 3，符合條件。 / Indices 0 and 3 both hold 1, and abs(0-3)=3 <= 3.

名詞解釋 / Glossary

雜湊表 / hash map (hash table)： 一種把「鍵 key」對應到「值 value」的資料結構，查詢、插入、更新平均只要 O(1) 時間。這裡我們用它記住「某個數值最後一次出現在哪個索引」。 / A structure mapping a key to a value with average O(1) lookup/insert/update. Here it remembers the most recent index where each value appeared.
索引 / index： 元素在陣列中的位置編號，從 0 開始算。 / The position number of an element in the array, starting at 0.
滑動視窗 / sliding window： 一種只關注「最近 k 個元素」的思考方式；當我們往右移動時，太舊的元素就會離開視窗。 / A technique that only cares about the most recent k elements; as we move right, stale elements fall out of the window.
暴力解 / brute force： 不講技巧、直接檢查所有可能組合的做法，通常正確但很慢。 / The naive approach that checks every possible pair; correct but slow.
絕對值 / absolute value (abs)： 一個數去掉正負號後的大小，例如 abs(-3) = 3。 / The magnitude of a number ignoring sign, e.g. abs(-3)=3.

思路

中文：最直接的想法是暴力解——對每一對索引 (i, j) 都檢查 nums[i] == nums[j] 且 abs(i - j) <= k。但這需要兩層迴圈，時間複雜度是 O(n²)，當 n 高達 10⁵ 時會超時。我們需要更聰明的做法。關鍵觀察是：題目只關心「相同的值」之間的「最近距離」。對於任何一個值，如果它出現很多次，只有「相鄰的兩次出現」最有可能滿足 abs(i - j) <= k；更遠的兩次反而距離更大。所以我們只需要對每個值記住「它最後一次出現的索引」即可。我們用一個雜湊表 lastSeen，鍵是數值、值是該數值最後出現的索引。從左到右掃描陣列：對於目前的 nums[i]，先看雜湊表裡有沒有這個值；如果有，代表之前出現過，計算現在的索引 i 和上次索引的差距，如果 <= k 就立刻回傳 true。不論有沒有命中，最後都把 lastSeen[nums[i]] 更新成目前的 i，因為下一次再遇到同樣的值時，跟「最近一次」比距離才是最小的。掃完都沒找到就回傳 false。這個做法只掃一遍，時間 O(n)。

English: The naive idea is brute force: for every pair of indices (i, j), check whether the values match and the distance is within k. That's two nested loops, O(n²), which times out when n reaches 10⁵. The key insight is that for any repeated value, only the closest pair of occurrences has a chance of satisfying abs(i - j) <= k — occurrences farther apart are only worse. So for each value we only need to remember its most recent index. We keep a hash map lastSeen mapping a value to the last index where we saw it. Scanning left to right, for the current nums[i] we check the map: if the value is already there, the gap between i and its stored index is the smallest possible gap so far, so if that gap is <= k we immediately return true. Whether or not it matched, we then overwrite lastSeen[nums[i]] = i, because the next time this value appears, comparing against the latest index gives the smallest distance. If the scan finishes with no hit, return false. One pass, O(n) time.

逐步走查 / Walkthrough

Example: nums = [1,2,3,1], k = 3. We keep a hash map lastSeen (value → last index), starting empty.

Step / 步驟	i	nums[i]	`lastSeen` before / 之前	In map? / 表中有嗎	Check / 檢查	Action / 動作
1	0	1	`{}`	No / 沒有	—	存入 `lastSeen[1]=0` / store `1→0`
2	1	2	`{1:0}`	No / 沒有	—	存入 `lastSeen[2]=1` / store `2→1`
3	2	3	`{1:0, 2:1}`	No / 沒有	—	存入 `lastSeen[3]=2` / store `3→2`
4	3	1	`{1:0, 2:1, 3:2}`	Yes, last index 0 / 有，上次是 0	`abs(3-0)=3 <= 3` ✅	return `true`

在第 4 步，值 1 之前在索引 0 出現過，距離 3 不超過 k=3，立即回傳 true。 / At step 4, value 1 was last seen at index 0; the gap 3 is within k=3, so we return true.

Solution — C

// 演算法 / Algorithm:
// 用雜湊表記住「每個數值最後出現的索引」，從左到右掃一遍；
// 遇到重複值時，比較它和上次索引的距離是否 <= k。
// Use a hash map storing each value's last index; scan once and check the gap.
//
// LeetCode 沒有內建雜湊表，這裡用「開放定址法」自己實作一個簡單的雜湊表。
// LeetCode's C has no hash map, so we hand-roll one using open addressing.

#include <stdbool.h>   // 提供 bool / true / false 型別 / gives us bool, true, false
#include <stdlib.h>    // 提供 malloc, free / for malloc and free
#include <string.h>    // 提供 memset / for memset

bool containsNearbyDuplicate(int* nums, int numsSize, int k) {
    // 雜湊表容量取大於資料量的 2 的次方，減少碰撞 / capacity > n, power of two, fewer collisions
    int cap = 1;                          // 容量從 1 開始 / start capacity at 1
    while (cap < numsSize * 2) cap <<= 1; // 左移一位 = 乘 2，直到夠大 / shift left = ×2 until big enough
    int mask = cap - 1;                   // cap 是 2 次方時，x & mask 等同 x % cap / fast modulo via bitmask

    // keys 存數值，idxs 存對應的索引，used 標記某格是否被佔用
    // keys holds values, idxs holds their indices, used marks whether a slot is taken
    int*  keys = (int*)  malloc(sizeof(int)  * cap);  // 配置記憶體 / allocate memory
    int*  idxs = (int*)  malloc(sizeof(int)  * cap);  // 配置記憶體 / allocate memory
    bool* used = (bool*) calloc(cap, sizeof(bool));   // calloc 會把記憶體清成 0/false / zero-initialized

    bool answer = false;                  // 預設答案為 false / default answer

    for (int i = 0; i < numsSize; i++) {  // 逐一掃描每個元素 / scan each element
        int v = nums[i];                  // 目前的數值 / current value
        // 計算起始雜湊槽；轉成 unsigned 避免負數取模出問題 / hash slot; cast to unsigned to avoid negative issues
        unsigned int h = ((unsigned int) v) & mask;

        // 線性探測：若該槽被佔用且 key 不同，就往後找下一個空槽或相同 key
        // Linear probing: if slot is taken by a different key, move to the next slot
        while (used[h] && keys[h] != v) {
            h = (h + 1) & mask;           // 往後移一格並繞回開頭 / step forward, wrap around
        }

        if (used[h]) {                    // 找到相同的值，代表之前出現過 / value seen before
            if (i - idxs[h] <= k) {       // 因 i 一定比舊索引大，不需 abs / i is always larger, so no abs needed
                answer = true;            // 距離在範圍內，找到答案 / within range, found it
                break;                    // 提早結束迴圈 / stop early
            }
            idxs[h] = i;                  // 否則更新成最新索引，下次比較才最近 / refresh to newest index
        } else {                          // 此值第一次出現 / first time seeing this value
            used[h] = true;               // 標記此槽已佔用 / mark slot as used
            keys[h] = v;                  // 存入數值 / store the value
            idxs[h] = i;                  // 存入索引 / store the index
        }
    }

    free(keys);                           // 釋放記憶體，避免外洩 / free memory to avoid leaks
    free(idxs);                           // 同上 / same
    free(used);                           // 同上 / same
    return answer;                        // 回傳最終結果 / return the result
}

Solution — C++

// 演算法 / Algorithm:
// 用 unordered_map 記住每個數值最後出現的索引，掃描一遍；
// 遇到重複且距離 <= k 即回傳 true。
// Use unordered_map to store each value's last index; one pass, return true on a close repeat.

#include <vector>          // 提供 std::vector / for std::vector
#include <unordered_map>   // 提供 std::unordered_map（雜湊表）/ for std::unordered_map (a hash map)
using namespace std;

class Solution {
public:
    bool containsNearbyDuplicate(vector<int>& nums, int k) {
        // unordered_map：鍵=數值，值=該數值最後出現的索引；平均 O(1) 查詢
        // unordered_map: key = value, value = its last index; average O(1) lookups
        unordered_map<int, int> lastSeen;

        // range-for + index：用傳統 for 才能拿到索引 i / classic for loop so we have index i
        for (int i = 0; i < (int) nums.size(); i++) {
            int v = nums[i];                      // 目前數值 / current value

            // find 回傳一個迭代器；若等於 end() 代表沒找到 / find returns an iterator; end() means not found
            auto it = lastSeen.find(v);
            if (it != lastSeen.end()) {           // 這個值之前出現過 / this value appeared before
                // it->second 是儲存的舊索引；i 一定較大，故不需 abs / it->second is the old index; i is larger
                if (i - it->second <= k) {
                    return true;                  // 距離在 k 內，找到答案 / within k, found it
                }
                it->second = i;                   // 更新成最新索引 / update to the newest index
            } else {
                lastSeen[v] = i;                  // 第一次出現，記下索引 / first occurrence, record index
            }
        }

        return false;                             // 掃完都沒找到 / nothing found after the full scan
    }
};

複雜度 / Complexity

Time: O(n) — 我們只把陣列掃描一遍（n 是 nums 的長度），每個元素的雜湊表查詢與更新平均都是 O(1)，所以總時間由這一遍掃描主導。 / We scan the array once (n = length of nums); each hash-map lookup/update is O(1) on average, so the single pass dominates.
Space: O(min(n, k+1)) — 雜湊表最多存下不同的數值。實作上最壞是 O(n)（全部不重複）；但概念上只需保留視窗內的值，可視為 O(min(n, k))。 / The hash map stores at most the distinct values seen; worst case O(n) when all values differ. Conceptually only window-sized state is needed.

Pitfalls & Edge Cases

不需要 abs / No abs needed： 因為我們由左往右掃描，目前的索引 i 一定大於存在表裡的舊索引，所以 i - oldIndex 必為正，直接比 <= k 即可。 / Because we scan left-to-right, i is always greater than the stored index, so the difference is positive — no need to call abs.
記得更新最新索引 / Always refresh to the newest index： 若值重複但距離超過 k，必須把索引更新成目前的 i，否則之後比較的是更舊、更遠的索引，會漏掉正確答案。 / When a repeat is too far, you must still update the stored index to i; otherwise later comparisons use a stale, farther index and miss valid answers.
數值範圍大、可為負 / Large and negative values： nums[i] 範圍是 ±10⁹，不能拿來當陣列下標。雜湊表（或 C 裡的雜湊取模）正好處理任意整數鍵。 / Values span ±10⁹ and can be negative, so they can't index a plain array; a hash map handles arbitrary integer keys.
k = 0 的情況 / The k = 0 case： 此時要求 abs(i-j) <= 0，即 i == j，但 i 和 j 必須是不同索引，所以永遠回傳 false。我們的程式自然處理：距離至少為 1，不會 <= 0。 / With k = 0 the gap must be 0 but indices must differ, so the answer is always false; our code handles this since any real gap is >= 1.
整數溢位 / Overflow： 索引相減 i - oldIndex 最多約 10⁵，遠在 int 範圍內，無溢位風險。 / The index difference is at most ~10⁵, safely within int.