3. Longest Substring Without Repeating Characters

題目 / Problem

中文： 給定一個字串 s，請找出其中最長且不含重複字元的子字串的長度。子字串（substring）是指字串中連續的一段字元，不能跳著取。

English: Given a string s, find the length of the longest substring that contains no duplicate characters. A substring is a contiguous block of characters — you cannot skip positions.

約束 / Constraints: - 0 <= s.length <= 5 * 10^4（字串長度，可能為空 / length may be 0） - s 可包含英文字母、數字、符號和空格 / s consists of English letters, digits, symbols, and spaces.

範例 / Worked Example:

Input:  s = "pwwkew"
Output: 3
Explanation: 最長的不重複子字串是 "wke"，長度為 3。
             注意 "pwke" 是「子序列」(subsequence) 而非「子字串」(substring)，因為它跳過了字元，不合法。
             The longest valid substring is "wke" (length 3). "pwke" is a subsequence, not a substring, so it doesn't count.

名詞解釋 / Glossary

子字串 / Substring：字串中連續的一段字元。例如 "abc" 的子字串包含 "ab"、"bc"，但 "ac" 不是（它跳過了 b）。A contiguous run of characters from the string — no gaps allowed.
子序列 / Subsequence：可以跳著取但保持順序的字元序列。"ac" 是 "abc" 的子序列。本題不要子序列。Characters kept in order but with gaps allowed; not what this problem asks for.
滑動視窗 / Sliding Window：用兩個指標標出一段「視窗」（區間），右指標往右擴張納入新字元，當出現重複時左指標往右收縮，視窗整體在字串上「滑動」。A pair of indices marking a range that expands on the right and shrinks on the left as it slides across the string.
雙指標 / Two Pointers：用兩個索引變數（這裡是 left 和 right）共同描述視窗的左右邊界。Two index variables that together mark the boundaries of the window.
雜湊表 / Hash Map（查找表 / Lookup Table）：一種能用「鍵」快速查詢「值」的結構，平均 O(1) 時間。這裡我們用它記錄「某字元上次出現的位置」。A structure giving near-instant key→value lookup; here it stores each character's last seen position.
ASCII：把每個字元對應到一個 0–127（擴充到 256）整數編碼的標準。因此我們可以直接用一個大小 128/256 的陣列當作「字元 → 位置」的查找表，比通用雜湊表更快。A standard mapping from characters to small integers, letting us use a plain array indexed by the character as a fast lookup table.

思路

中文： 最直覺的暴力解法是列舉所有子字串：用兩層迴圈固定起點與終點，再檢查這段內有沒有重複字元。但這樣是 O(n³) 或 O(n²)，當字串長達 5×10⁴ 時太慢。關鍵的觀察是：當我們從左到右掃描時，沒必要每次都從頭重算。我們維護一個「視窗」[left, right]，保證視窗內所有字元都不重複。右指標 right 一格一格往右移、把新字元納入視窗；如果新字元 c 之前已經在視窗裡出現過，就把左指標 left 直接跳到「c 上次出現位置的下一格」，這樣就把重複的那一份排除掉，視窗又重新變得無重複。每一步用 right - left + 1 更新答案最大值。為了 O(1) 判斷「c 上次出現在哪」，我們用一張查找表 last[c] 記錄每個字元最後出現的索引。由於字元集是 ASCII，可以直接開一個大小 256 的整數陣列當查找表，初始化為 -1 表示「還沒出現過」。整個過程中 left 和 right 都只會往右移動、各自最多走 n 步，所以是線性時間。核心不變量（invariant）是：迴圈每次結束後，子字串 s[left..right] 內沒有任何重複字元。

English: The brute-force idea is to enumerate every substring with nested loops and check each for duplicates — that's O(n²) or worse, too slow for n up to 50,000. The key insight is that as we scan left to right we never need to recompute from scratch. We keep a window [left, right] that is guaranteed duplicate-free. We advance right one step at a time to absorb each new character. If the incoming character c was already seen inside the current window, we jump left forward to one past c's previous position, which evicts the old copy and restores the no-duplicate property. After each step we update the answer with the current window length right - left + 1. To check "where did c last appear?" in O(1), we keep a lookup table last[c] holding each character's most recent index. Because the alphabet is ASCII, a plain integer array of size 256 (initialized to -1, meaning "never seen") works as that table. Both left and right only ever move rightward, each at most n steps, giving linear time. The crucial invariant is: after every iteration, the substring s[left..right] contains no repeated character.

逐步走查 / Walkthrough

以 s = "abcabcbb" 為例。left 從 0 開始，best 記錄答案，last[] 全部初始化為 -1。 Tracing s = "abcabcbb". left starts at 0, best holds the answer, all of last[] start at -1.

right	字元 c / char	last[c] 之前值 / prev	left 如何變 / left update	新 left	視窗 / window	長度 / len	best
0	`a`	-1	不在視窗內，不動 / not in window	0	`"a"`	1	1
1	`b`	-1	不動 / no change	0	`"ab"`	2	2
2	`c`	-1	不動 / no change	0	`"abc"`	3	3
3	`a`	0	`0 >= left`，跳到 0+1 / jump to 1	1	`"bca"`	3	3
4	`b`	1	`1 >= left`，跳到 1+1 / jump to 2	2	`"cab"`	3	3
5	`c`	2	`2 >= left`，跳到 2+1 / jump to 3	3	`"abc"`	3	3
6	`b`	4	`4 >= left`，跳到 4+1 / jump to 5	5	`"cb"`	2	3
7	`b`	6	`6 >= left`，跳到 6+1 / jump to 7	7	`"b"`	1	3

每一步結束都更新 last[c] = right。最終 best = 3。 At each step we also set last[c] = right. Final answer best = 3.

注意 / Note: 判斷條件是 last[c] >= left。若 c 上次出現的位置在 left 左邊（已被踢出視窗），就不該移動 left。 The condition is last[c] >= left. If c's last position is left of left (already outside the window), we must not move left.

Solution — C

// 演算法：滑動視窗 + 「字元 → 上次位置」查找表。
// Algorithm: sliding window with a "char -> last index" lookup table.
// 右指標逐格擴張；遇到視窗內的重複字元就把左指標跳到其上次位置之後。
// Right pointer expands; on a duplicate inside the window, jump left past its prior spot.

int lengthOfLongestSubstring(char* s) {
    int last[256];                       // 查找表：last[c] = 字元 c 上次出現的索引 / lookup: last index of char c
                                         // 大小 256 涵蓋所有 ASCII（含符號、空格）/ size 256 covers all ASCII bytes
    for (int i = 0; i < 256; i++)        // 走訪每個格子做初始化 / loop over every slot to initialize
        last[i] = -1;                    // -1 代表「此字元尚未出現過」/ -1 means "this char not seen yet"

    int best = 0;                        // 目前找到的最長長度 / longest length found so far
    int left = 0;                        // 視窗左邊界（含）/ window's left edge (inclusive)

    // right 走訪每個字元，直到字串結尾 '' / right scans each char until the '' terminator
    for (int right = 0; s[right] != ''; right++) {
        // s[right] 是字元，轉成 unsigned char 當索引，避免負值 / cast to unsigned char so the index is 0..255
        unsigned char c = (unsigned char) s[right];

        // 若 c 上次出現位置仍在視窗內（>= left），代表視窗有重複 / if c's last spot is inside the window, it's a dup
        if (last[c] >= left)
            left = last[c] + 1;          // 左邊界跳到重複字元之後一格，踢掉舊的那份 / move left just past the old copy

        last[c] = right;                 // 更新 c 的最新位置為現在 / record c's newest position as right

        int len = right - left + 1;      // 目前視窗長度（兩端皆含）/ current window length (both ends inclusive)
        if (len > best)                  // 若更長就更新答案 / keep the maximum
            best = len;
    }
    return best;                         // 回傳最長不重複子字串的長度 / return the answer
}

Solution — C++

// 演算法：滑動視窗 + 「字元 → 上次位置」查找表，與 C 版相同思路。
// Algorithm: same sliding window + "char -> last index" table as the C version.
// 右指標擴張視窗，左指標在遇到視窗內重複時向右收縮。
// Right pointer grows the window; left pointer shrinks it when a duplicate appears inside.

class Solution {
public:
    int lengthOfLongestSubstring(string s) {
        // vector<int> 是會自動管理記憶體的動態陣列；這裡建 256 格、初值 -1。
        // vector<int> is a self-managing dynamic array; 256 slots all initialized to -1.
        // last[c] = 字元 c 上次出現的索引，-1 表示尚未出現 / last[c] = char c's last index, -1 = unseen.
        vector<int> last(256, -1);

        int best = 0;                    // 目前最長長度 / best length so far
        int left = 0;                    // 視窗左邊界（含）/ left edge of window (inclusive)

        // range 寫法配索引：用傳統 for 取得 right 索引 / classic for loop so we have the index `right`
        for (int right = 0; right < (int) s.size(); right++) {
            // 轉成 unsigned char 作索引，避免非 ASCII 變負數 / cast to unsigned char for a safe 0..255 index
            unsigned char c = (unsigned char) s[right];

            // c 上次位置在視窗內 → 視窗出現重複，左界跳到其後 / dup inside window -> jump left past it
            if (last[c] >= left)
                left = last[c] + 1;

            last[c] = right;             // 記錄 c 的最新出現位置 / update c's latest position

            // std::max 取兩數較大者，等價於 if 比較 / std::max returns the larger of the two values
            best = max(best, right - left + 1);  // right-left+1 是當前視窗長度 / current window length
        }
        return best;                     // 回傳結果 / return the answer
    }
};

複雜度 / Complexity

Time: O(n) — n 是字串長度。右指標 right 走訪每個字元各一次，左指標 left 只會往右、總共最多移動 n 步，兩者都不回頭，所以是線性時間。查找表存取為 O(1)。n is the string length; both pointers only move rightward and each advances at most n times, and every table access is O(1).
Space: O(1) — 查找表固定為 256 格，不隨輸入長度增長，因此是常數空間（若以字元集大小 k 計則為 O(k)）。The lookup table is a fixed 256 entries regardless of input size, so it's constant extra space (O(k) where k is the alphabet size).

Pitfalls & Edge Cases

空字串 / Empty string ("")：迴圈一次都不執行，best 維持 0，正確回傳 0。The loop never runs; best stays 0 — the correct answer.
左指標只能前進，不能後退 / left must never move backwards：必須用條件 last[c] >= left。若漏掉這個檢查、無腦設 left = last[c] + 1，當 c 的舊位置已在視窗外時，left 會被往回拉，導致重複字元混進視窗、答案偏大。The >= left guard prevents pulling left back to an already-evicted position, which would corrupt the window.
字元當索引要轉 unsigned / Cast char to unsigned before indexing：char 在某些平台是有號的，非 ASCII 位元組（值 > 127）會變成負數，拿來當陣列索引就會越界。轉成 unsigned char 確保索引落在 0–255。On platforms with signed char, bytes > 127 become negative and would index out of bounds; the cast keeps it in range.
子字串 vs 子序列 / Substring vs subsequence：答案必須是連續的一段。滑動視窗天生只看連續區間，自然滿足這點。例如 "pwwkew" 不能取 "pwke"。The window only ever represents a contiguous range, so it can never accidentally form a subsequence like "pwke".
長度公式的 off-by-one / Off-by-one in length：視窗 [left, right] 兩端都含，長度是 right - left + 1，別漏掉 +1。Both ends are inclusive, so the length needs the +1.
查找表大小 / Table size：題目含數字、符號、空格，不只 26 個小寫字母，所以用 256 而非 26，避免遺漏字元種類。Use 256, not 26 — the input includes digits, symbols, and spaces.