392. Is Subsequence
題目 / Problem
中文: 給定兩個字串 s 和 t,若 s 是 t 的子序列,回傳 true;否則回傳 false。
子序列的定義:從原字串中刪除一些(可以是零個)字元後,剩下的字元保持原本的相對順序所形成的新字串。例如 "ace" 是 "abcde" 的子序列(取 a、c、e),但 "aec" 不是(順序錯了)。
English: Given two strings s and t, return true if s is a subsequence of t, otherwise return false.
A subsequence is a string formed by deleting zero or more characters from the original string without changing the relative order of the remaining characters. For example "ace" is a subsequence of "abcde", but "aec" is not (wrong order).
Constraints:
- 0 <= s.length <= 100
- 0 <= t.length <= 10^4
- s and t consist only of lowercase English letters.
Example:
- Input: s = "abc", t = "ahbgdc" → Output: true (we can find a, then b, then c in order inside t).
- Input: s = "axc", t = "ahbgdc" → Output: false (no x exists in t).
名詞解釋 / Glossary
- 子序列 / Subsequence:從原字串中刪掉一些字元(不打亂剩餘字元的順序)所得到的字串。注意「子序列」不要求連續,這跟「子字串 / substring」不同。A string formed by deleting characters while preserving the order of the rest. Unlike a substring, a subsequence does not need to be contiguous.
- 雙指針 / Two Pointers:用兩個索引變數同時掃描一個或兩個序列,根據條件分別前進,常用來把暴力 O(n·m) 解法降到 O(n+m)。Two index variables that walk through one or two sequences, each advancing under different conditions — a standard trick to bring an O(n·m) brute force down to O(n+m).
- 貪心 / Greedy:每一步都做當下看起來最好的選擇,並能證明這個局部選擇不會讓全局答案變差。在本題中,遇到匹配就立刻消耗
s的字元是貪心策略。Making the locally best choice at each step, provably not hurting the global answer. Here: as soon as we find a matching character int, we consume one character ofs. - 指針 (C 中) / Pointer (in C):儲存記憶體位址的變數。
const char *s表示一個指向字串開頭的指標;s[i]等同於*(s + i),讀取第i個字元。A variable holding a memory address.s[i]is shorthand for*(s + i), reading the i-th character. bool型別 /booltype:表示true或false的布林型別。在 C 裡需要#include <stdbool.h>才能使用,C++ 內建支援。Boolean type; in C you must#include <stdbool.h>, in C++ it's built in.size_t:無號整數型別,常用來表示長度或陣列索引(不會是負數)。Unsigned integer type for sizes/indices, never negative.std::string::size():C++std::string取得長度的成員函式,O(1)。Member function onstd::stringthat returns the length in O(1).
思路
中文: 最直覺的暴力做法是枚舉 t 中所有可能的「挑選方式」去比對 s,但這在最差情況是指數級,完全不可行。第二種暴力是對 s 中每個字元去 t 裡搜尋,且每次都得記得上次搜到哪裡——這基本上就是雙指針的雛形了,所以乾脆一次寫對。我們用兩個指針 i(指向 s)和 j(指向 t),同時從 0 開始往右掃。每一步只看 t[j]:如果它剛好等於 s[i],代表我們在 t 裡找到了 s 當前需要的字元,就把 i 往前移一格;不管有沒有匹配,j 都要往前移一格(因為這個 t 的字元已經處理完了)。當 i 走到 s 的尾端,表示 s 的每個字元都依序在 t 裡找到了,回傳 true;如果 j 先走完而 i 還沒走完,就回傳 false。這個貪心是正確的,原因是:當我們遇到第一個能匹配 s[i] 的 t[j] 就立刻吃掉它,絕對不會比「跳過這個、留給後面用」差——因為留著也不可能讓後面的字元更早出現。時間複雜度 O(n+m),空間 O(1)。
English: The naive brute force — enumerating every possible way to pick |s| characters out of t — is exponential and hopeless. A slightly smarter brute force scans t for each character of s, remembering where it left off; that's already a two-pointer pattern in disguise, so we just write it directly. Keep i pointing into s and j pointing into t, both starting at 0. At every step we examine t[j]: if it matches s[i], we've located the next character s needs, so advance i. Either way, advance j, because that position in t is now consumed. If i reaches the end of s we return true (we matched everything in order); if j reaches the end of t first, we return false. The greedy choice — taking the earliest possible match — is safe because consuming the leftmost matching t[j] can never block a later match: any later occurrence of s[i] would also be available to a strategy that took the earliest one. Runtime is O(n+m), extra space O(1).
逐步走查 / Walkthrough
Example: s = "abc", t = "ahbgdc". Pointers i (into s) and j (into t) start at 0.
| Step | i |
j |
s[i] |
t[j] |
Match? | Action / 動作 |
|---|---|---|---|---|---|---|
| 0 | 0 | 0 | a |
a |
yes | i++, j++ (匹配,兩個都前進 / both advance) |
| 1 | 1 | 1 | b |
h |
no | j++ (只動 t 的指針 / only t advances) |
| 2 | 1 | 2 | b |
b |
yes | i++, j++ |
| 3 | 2 | 3 | c |
g |
no | j++ |
| 4 | 2 | 4 | c |
d |
no | j++ |
| 5 | 2 | 5 | c |
c |
yes | i++, j++ |
| 6 | 3 | 6 | — | — | — | i == |s|,s 已全部匹配完 → return true / all of s matched → return true |
Solution — C
// Algorithm / 演算法:
// 雙指針 i 走 s、j 走 t;t[j] == s[i] 時 i 前進,j 每步都前進。
// Two pointers: i over s, j over t. Advance i only on a match; j always advances.
// Returns true once i reaches the end of s (every char matched in order).
#include <stdbool.h> // 引入 bool / true / false 型別 / brings in the bool type
#include <string.h> // 引入 strlen / brings in strlen
bool isSubsequence(char* s, char* t) {
size_t n = strlen(s); // n = s 的長度,size_t 是無號整數 / length of s; size_t is unsigned
size_t m = strlen(t); // m = t 的長度 / length of t
size_t i = 0; // i 指向 s 中下一個要匹配的字元 / index of next char in s to match
size_t j = 0; // j 指向 t 中目前要檢查的字元 / index of current char in t
// 當兩個字串都還沒掃完時繼續 / loop while both pointers are still in range
while (i < n && j < m) {
// s[i] 取出 s 的第 i 個字元(等價於 *(s + i)) / read the i-th char of s
if (s[i] == t[j]) {
i++; // 找到匹配,i 前進去找 s 的下一個字元 / matched: move to next char of s
}
j++; // 不論匹配與否,t 的這個位置都用過了 / either way, this t position is done
}
// 若 i 已經走到 s 結尾,代表 s 每個字元都依序找到 / if i reached end of s, all chars were matched in order
return i == n;
}
Solution — C++
// Algorithm / 演算法:
// 與 C 版本相同的雙指針/貪心策略,改用 std::string 的 size() 取得長度。
// Same two-pointer greedy as the C version, using std::string::size() for lengths.
#include <string> // 引入 std::string / brings in std::string
class Solution {
public:
// const std::string& 用「常數參考」傳入,避免拷貝且承諾不修改 / pass by const reference: no copy, read-only
bool isSubsequence(const std::string& s, const std::string& t) {
size_t i = 0; // s 的指針 / pointer into s
size_t j = 0; // t 的指針 / pointer into t
const size_t n = s.size(); // s.size() 是 O(1) 取得長度 / O(1) length lookup on std::string
const size_t m = t.size(); // t 的長度 / length of t
// 同時兩個指針都還沒越界時繼續 / continue while both indices are in bounds
while (i < n && j < m) {
// s[i] 與 t[j] 都回傳該位置的 char / both subscripts return the char at that position
if (s[i] == t[j]) {
++i; // 前置 ++:先加 1 再使用,效果同 i++ / pre-increment: same effect as i++
}
++j; // t 每一步都前進 / always advance j
}
return i == n; // 全部 s 都吃光就成功 / success iff every char of s was consumed
}
};
複雜度 / Complexity
- Time: O(n + m) — 其中
n = |s|,m = |t|。每次迴圈j一定加 1,所以最多跑m次;i最多加到n。沒有巢狀掃描。jadvances every iteration so the loop runs at mostmtimes;iadvances at mostntimes. No nested scan, so it's linear in the combined length. - Space: O(1) — 只用了兩個索引變數,不依輸入長度成長。Only two index variables; no auxiliary structures that scale with input.
Pitfalls & Edge Cases
- 空字串
s/ Emptys:根據定義空字串是任何字串的子序列。我們的程式碼i == n在n == 0時一開始就為真,迴圈直接跳過並回傳true,符合預期。Empty string is a subsequence of anything; the loop conditioni < nis false immediately, andi == nreturnstrue. s比t長 /slonger thant:j會先走到m結束迴圈,此時i < n,回傳false。正確處理,不需要特判。jexhausts first,i == nis false → returnsfalsenaturally.- 無條件前進
j/ Always advancingj:初學者容易寫成「不匹配時j才前進,匹配時兩個都不動」之類的錯誤條件,導致無窮迴圈或漏字元。記住:j每一輪都要動,i只在匹配時才動。A common bug is to forget thatjmust advance every iteration regardless of match. Forgetting causes infinite loops or missed characters. size_t是無號的 /size_tis unsigned:若你改寫成倒序掃描並用size_t做減法,可能會 underflow 變成超大正數。本題只做加法所以沒事,但要記得這個陷阱。Watch out for underflow if you ever decrementsize_t; we only ever increment here, so it's safe.- 子序列 vs 子字串 / Subsequence vs substring:別誤以為要連續匹配。本題允許跳過
t中任意字元,所以t = "ahbgdc"中跳過h、g、d是合法的。Don't confuse with substring matching; gaps intare allowed and expected. - Follow-up(大量
s查詢)/ Follow-up (manysqueries):若有 10^9 個s要查同一個t,雙指針每次都要重掃t太慢。預處理t:建立pos[26],每個字母存它在t中所有出現位置的排序陣列;查詢時對s每個字元用二分搜尋找「大於目前位置的最小索引」,每次查詢降到 O(|s| log |t|)。Preprocesstintopos[26](sorted positions per letter), then for each query binary-search the next position greater than the current one — O(|s| log |t|) per query instead of O(|t|).