14. Longest Common Prefix
題目 / Problem
中文:給定一個字串陣列 strs,找出所有字串共同的「最長前綴」(longest common prefix)。如果不存在共同前綴,回傳空字串 ""。
English: Given an array of strings strs, find the longest string that is a prefix of every string in the array. If no common prefix exists, return the empty string "".
Constraints:
- 1 <= strs.length <= 200
- 0 <= strs[i].length <= 200
- strs[i] consists of only lowercase English letters if it is non-empty.
Example:
Input: strs = ["flower","flow","flight"]
Output: "fl"
解釋 / Explanation: 三個字串都以 "fl" 開頭,但第三個字 'i' ≠ 'o' ≠ 'w',所以共同前綴只到 "fl"。
All three start with "fl", but at index 2 the characters differ ('o' vs 'i'), so the LCP stops there.
名詞解釋 / Glossary
- prefix / 前綴:字串開頭的一段連續字元。例如
"fl"是"flower"的前綴,但"lo"不是(因為它不從位置 0 開始)。A contiguous substring that starts at index 0. - longest common prefix (LCP) / 最長共同前綴:對一組字串而言,每一個字串都共享的、最長的開頭子字串。The longest string that appears at the start of every input string.
- vertical scanning / 縱向掃描:把所有字串「直立疊起來」,逐欄(column by column)從左到右比較。一旦某一欄出現不同字元,前面的部分就是答案。Compare characters column-by-column across all strings; stop at the first column where they disagree.
- horizontal scanning / 橫向掃描(替代解法):先假設第一個字串就是 LCP,再用它跟後面每個字串逐一比較、不斷縮短。Start with the first string as the candidate prefix, then shrink it against each subsequent string.
- null terminator / 字串結尾符
'\0':在 C 語言中,字串以'\0'這個位元組標記結束。讀到'\0'就代表這個字串到此為止。In C, strings end with the byte'\0'; reaching it means we've hit the end of the string. malloc/ 動態配置記憶體:在「堆積(heap)」上要一塊記憶體。LeetCode 的 C 題要求回傳的字串必須用malloc(不能回傳 stack 上的局部陣列,因為函式結束後就無效了)。Allocates memory on the heap; LeetCode requires returned C strings to be heap-allocated so they outlive the function.std::string/ C++ 字串類別:C++ 的字串型別,會自動管理記憶體,可以用+、.size()、.substr()等方便的方法。C++'s managed string class with built-in length, slicing, and concatenation.substr(pos, len):C++string的成員函式,回傳從位置pos開始、長度為len的子字串。Returns a substring starting atposwith lengthlen.
思路
我們要找所有字串都共有的開頭。最直觀的「暴力」想法是:先猜第一個字串的全部就是答案,然後每次拿它跟下一個字串比,發現不一樣就把答案往後砍短,直到所有字串都比完。這個做法可行,但寫起來容易出錯。一個更直接、更易懂的方法叫做「縱向掃描」:把所有字串想像成由上到下排成一個矩形,我們從左到右、一欄一欄地檢查。對於第 i 欄,先看第一個字串的第 i 個字元 c,再去檢查其它每一個字串的第 i 個字元是不是也都是 c;只要有任何一個不是 c,或者有任何一個字串的長度根本不到 i+1(也就是 i 已經超出它的結尾),那麼答案就是前 i 個字元。為什麼這樣做正確?因為一個字元能進入 LCP 的條件很簡單——所有字串在那個位置都是同一個字元——只要找到第一個違反的位置,答案就確定了,後面的位置不可能再被加進去。複雜度方面,最壞情況下我們會掃完所有字元一次,總共 O(S),其中 S 是所有字串字元的總數。
The task is to find the shared opening characters across all strings. A natural brute-force idea ("horizontal scanning") is to assume the first string is the answer, then repeatedly shrink it against each remaining string. That works, but the bookkeeping is fiddly. A cleaner approach is vertical scanning: imagine the strings stacked into a rectangle and walk through it one column at a time from left to right. At column i, take the character c = strs[0][i] from the first string, then check every other string: if any of them either has a different character at position i or is too short to even have a position i, the LCP must end at column i, so we return the first i characters. This is correct because a character can only belong to the LCP if every string agrees at that position; the first column that breaks the agreement defines the answer. In the worst case (all strings identical) we examine every character once, giving O(S) time where S is the total number of characters across all strings.
逐步走查 / Walkthrough
Input: strs = ["flower", "flow", "flight"]. We use strs[0] = "flower" as the reference.
Column i |
c = strs[0][i] |
strs[1][i] ("flow") |
strs[2][i] ("flight") |
Match? / 是否全部相等? | Action |
|---|---|---|---|---|---|
| 0 | 'f' |
'f' |
'f' |
yes ✓ | continue / 繼續 |
| 1 | 'l' |
'l' |
'l' |
yes ✓ | continue / 繼續 |
| 2 | 'o' |
'o' |
'i' ✗ |
no — 'i' != 'o' |
stop, return strs[0][0..2) = "fl" / 停止,回傳 "fl" |
Result / 結果:"fl".
注意:如果 strs[1] 是 "fl"(長度只有 2),那麼在 i = 2 時 strs[1][2] 就已經是 '\0'(結尾符),這也算「不匹配」,所以也會回傳 "fl"。
Note: if strs[1] were just "fl", then at i = 2 we'd hit its null terminator — that also counts as a mismatch and we'd return "fl".
Solution — C
/*
* Algorithm: Vertical scanning / 縱向掃描
* 用第一個字串當參考,逐欄比對所有字串。第一個出現不一致 (或某字串到結尾) 的位置,
* 就是 LCP 的長度。把該長度的字首複製到一塊新配置的記憶體並回傳。
* Use strs[0] as the reference; scan column-by-column. The first column where any
* string disagrees (or ends) determines the LCP length. Copy that prefix into a
* freshly malloc'd buffer and return it.
*/
#include <stdlib.h> // malloc — 動態配置記憶體 / for malloc
#include <string.h> // strlen, memcpy — 字串長度與複製 / for strlen and memcpy
char* longestCommonPrefix(char** strs, int strsSize) {
// 邊界情況:陣列為空時回傳空字串 (constraint 保證 >=1,但保險起見還是處理)
// Edge case: empty array — by constraints strsSize >= 1, but we guard anyway.
if (strsSize == 0) {
char* empty = (char*)malloc(1); // 配置 1 個 byte 給結尾符 / 1 byte for '