3. Longest Substring Without Repeating Characters
題目 / Problem
中文: 給定一個字串 s,請找出其中最長且不含重複字元的子字串的長度。子字串(substring)是指字串中連續的一段字元,不能跳著取。
English: Given a string s, find the length of the longest substring that contains no duplicate characters. A substring is a contiguous block of characters — you cannot skip positions.
約束 / Constraints:
- 0 <= s.length <= 5 * 10^4(字串長度,可能為空 / length may be 0)
- s 可包含英文字母、數字、符號和空格 / s consists of English letters, digits, symbols, and spaces.
範例 / Worked Example:
Input: s = "pwwkew"
Output: 3
Explanation: 最長的不重複子字串是 "wke",長度為 3。
注意 "pwke" 是「子序列」(subsequence) 而非「子字串」(substring),因為它跳過了字元,不合法。
The longest valid substring is "wke" (length 3). "pwke" is a subsequence, not a substring, so it doesn't count.
名詞解釋 / Glossary
- 子字串 / Substring:字串中連續的一段字元。例如
"abc"的子字串包含"ab"、"bc",但"ac"不是(它跳過了b)。A contiguous run of characters from the string — no gaps allowed. - 子序列 / Subsequence:可以跳著取但保持順序的字元序列。
"ac"是"abc"的子序列。本題不要子序列。Characters kept in order but with gaps allowed; not what this problem asks for. - 滑動視窗 / Sliding Window:用兩個指標標出一段「視窗」(區間),右指標往右擴張納入新字元,當出現重複時左指標往右收縮,視窗整體在字串上「滑動」。A pair of indices marking a range that expands on the right and shrinks on the left as it slides across the string.
- 雙指標 / Two Pointers:用兩個索引變數(這裡是
left和right)共同描述視窗的左右邊界。Two index variables that together mark the boundaries of the window. - 雜湊表 / Hash Map(查找表 / Lookup Table):一種能用「鍵」快速查詢「值」的結構,平均 O(1) 時間。這裡我們用它記錄「某字元上次出現的位置」。A structure giving near-instant key→value lookup; here it stores each character's last seen position.
- ASCII:把每個字元對應到一個 0–127(擴充到 256)整數編碼的標準。因此我們可以直接用一個大小 128/256 的陣列當作「字元 → 位置」的查找表,比通用雜湊表更快。A standard mapping from characters to small integers, letting us use a plain array indexed by the character as a fast lookup table.
思路
中文: 最直覺的暴力解法是列舉所有子字串:用兩層迴圈固定起點與終點,再檢查這段內有沒有重複字元。但這樣是 O(n³) 或 O(n²),當字串長達 5×10⁴ 時太慢。關鍵的觀察是:當我們從左到右掃描時,沒必要每次都從頭重算。我們維護一個「視窗」[left, right],保證視窗內所有字元都不重複。右指標 right 一格一格往右移、把新字元納入視窗;如果新字元 c 之前已經在視窗裡出現過,就把左指標 left 直接跳到「c 上次出現位置的下一格」,這樣就把重複的那一份排除掉,視窗又重新變得無重複。每一步用 right - left + 1 更新答案最大值。為了 O(1) 判斷「c 上次出現在哪」,我們用一張查找表 last[c] 記錄每個字元最後出現的索引。由於字元集是 ASCII,可以直接開一個大小 256 的整數陣列當查找表,初始化為 -1 表示「還沒出現過」。整個過程中 left 和 right 都只會往右移動、各自最多走 n 步,所以是線性時間。核心不變量(invariant)是:迴圈每次結束後,子字串 s[left..right] 內沒有任何重複字元。
English: The brute-force idea is to enumerate every substring with nested loops and check each for duplicates — that's O(n²) or worse, too slow for n up to 50,000. The key insight is that as we scan left to right we never need to recompute from scratch. We keep a window [left, right] that is guaranteed duplicate-free. We advance right one step at a time to absorb each new character. If the incoming character c was already seen inside the current window, we jump left forward to one past c's previous position, which evicts the old copy and restores the no-duplicate property. After each step we update the answer with the current window length right - left + 1. To check "where did c last appear?" in O(1), we keep a lookup table last[c] holding each character's most recent index. Because the alphabet is ASCII, a plain integer array of size 256 (initialized to -1, meaning "never seen") works as that table. Both left and right only ever move rightward, each at most n steps, giving linear time. The crucial invariant is: after every iteration, the substring s[left..right] contains no repeated character.
逐步走查 / Walkthrough
以 s = "abcabcbb" 為例。left 從 0 開始,best 記錄答案,last[] 全部初始化為 -1。
Tracing s = "abcabcbb". left starts at 0, best holds the answer, all of last[] start at -1.
| right | 字元 c / char | last[c] 之前值 / prev | left 如何變 / left update | 新 left | 視窗 / window | 長度 / len | best |
|---|---|---|---|---|---|---|---|
| 0 | a |
-1 | 不在視窗內,不動 / not in window | 0 | "a" |
1 | 1 |
| 1 | b |
-1 | 不動 / no change | 0 | "ab" |
2 | 2 |
| 2 | c |
-1 | 不動 / no change | 0 | "abc" |
3 | 3 |
| 3 | a |
0 | 0 >= left,跳到 0+1 / jump to 1 |
1 | "bca" |
3 | 3 |
| 4 | b |
1 | 1 >= left,跳到 1+1 / jump to 2 |
2 | "cab" |
3 | 3 |
| 5 | c |
2 | 2 >= left,跳到 2+1 / jump to 3 |
3 | "abc" |
3 | 3 |
| 6 | b |
4 | 4 >= left,跳到 4+1 / jump to 5 |
5 | "cb" |
2 | 3 |
| 7 | b |
6 | 6 >= left,跳到 6+1 / jump to 7 |
7 | "b" |
1 | 3 |
每一步結束都更新 last[c] = right。最終 best = 3。
At each step we also set last[c] = right. Final answer best = 3.
注意 / Note: 判斷條件是 last[c] >= left。若 c 上次出現的位置在 left 左邊(已被踢出視窗),就不該移動 left。
The condition is last[c] >= left. If c's last position is left of left (already outside the window), we must not move left.
Solution — C
// 演算法:滑動視窗 + 「字元 → 上次位置」查找表。
// Algorithm: sliding window with a "char -> last index" lookup table.
// 右指標逐格擴張;遇到視窗內的重複字元就把左指標跳到其上次位置之後。
// Right pointer expands; on a duplicate inside the window, jump left past its prior spot.
int lengthOfLongestSubstring(char* s) {
int last[256]; // 查找表:last[c] = 字元 c 上次出現的索引 / lookup: last index of char c
// 大小 256 涵蓋所有 ASCII(含符號、空格)/ size 256 covers all ASCII bytes
for (int i = 0; i < 256; i++) // 走訪每個格子做初始化 / loop over every slot to initialize
last[i] = -1; // -1 代表「此字元尚未出現過」/ -1 means "this char not seen yet"
int best = 0; // 目前找到的最長長度 / longest length found so far
int left = 0; // 視窗左邊界(含)/ window's left edge (inclusive)
// right 走訪每個字元,直到字串結尾 '