14. Longest Common Prefix

題目 / Problem

中文：給定一個字串陣列 strs，找出所有字串共同的「最長前綴」（longest common prefix）。如果不存在共同前綴，回傳空字串 ""。

English: Given an array of strings strs, find the longest string that is a prefix of every string in the array. If no common prefix exists, return the empty string "".

Constraints: - 1 <= strs.length <= 200 - 0 <= strs[i].length <= 200 - strs[i] consists of only lowercase English letters if it is non-empty.

Example:

Input:  strs = ["flower","flow","flight"]
Output: "fl"
解釋 / Explanation: 三個字串都以 "fl" 開頭，但第三個字 'i' ≠ 'o' ≠ 'w'，所以共同前綴只到 "fl"。
                    All three start with "fl", but at index 2 the characters differ ('o' vs 'i'), so the LCP stops there.

名詞解釋 / Glossary

prefix / 前綴：字串開頭的一段連續字元。例如 "fl" 是 "flower" 的前綴，但 "lo" 不是（因為它不從位置 0 開始）。A contiguous substring that starts at index 0.
longest common prefix (LCP) / 最長共同前綴：對一組字串而言，每一個字串都共享的、最長的開頭子字串。The longest string that appears at the start of every input string.
vertical scanning / 縱向掃描：把所有字串「直立疊起來」，逐欄（column by column）從左到右比較。一旦某一欄出現不同字元，前面的部分就是答案。Compare characters column-by-column across all strings; stop at the first column where they disagree.
horizontal scanning / 橫向掃描（替代解法）：先假設第一個字串就是 LCP，再用它跟後面每個字串逐一比較、不斷縮短。Start with the first string as the candidate prefix, then shrink it against each subsequent string.
null terminator / 字串結尾符 '\0'：在 C 語言中，字串以 '\0' 這個位元組標記結束。讀到 '\0' 就代表這個字串到此為止。In C, strings end with the byte '\0'; reaching it means we've hit the end of the string.
malloc / 動態配置記憶體：在「堆積（heap）」上要一塊記憶體。LeetCode 的 C 題要求回傳的字串必須用 malloc（不能回傳 stack 上的局部陣列，因為函式結束後就無效了）。Allocates memory on the heap; LeetCode requires returned C strings to be heap-allocated so they outlive the function.
std::string / C++ 字串類別：C++ 的字串型別，會自動管理記憶體，可以用 +、.size()、.substr() 等方便的方法。C++'s managed string class with built-in length, slicing, and concatenation.
substr(pos, len)：C++ string 的成員函式，回傳從位置 pos 開始、長度為 len 的子字串。Returns a substring starting at pos with length len.

思路

我們要找所有字串都共有的開頭。最直觀的「暴力」想法是：先猜第一個字串的全部就是答案，然後每次拿它跟下一個字串比，發現不一樣就把答案往後砍短，直到所有字串都比完。這個做法可行，但寫起來容易出錯。一個更直接、更易懂的方法叫做「縱向掃描」：把所有字串想像成由上到下排成一個矩形，我們從左到右、一欄一欄地檢查。對於第 i 欄，先看第一個字串的第 i 個字元 c，再去檢查其它每一個字串的第 i 個字元是不是也都是 c；只要有任何一個不是 c，或者有任何一個字串的長度根本不到 i+1（也就是 i 已經超出它的結尾），那麼答案就是前 i 個字元。為什麼這樣做正確？因為一個字元能進入 LCP 的條件很簡單——所有字串在那個位置都是同一個字元——只要找到第一個違反的位置，答案就確定了，後面的位置不可能再被加進去。複雜度方面，最壞情況下我們會掃完所有字元一次，總共 O(S)，其中 S 是所有字串字元的總數。

The task is to find the shared opening characters across all strings. A natural brute-force idea ("horizontal scanning") is to assume the first string is the answer, then repeatedly shrink it against each remaining string. That works, but the bookkeeping is fiddly. A cleaner approach is vertical scanning: imagine the strings stacked into a rectangle and walk through it one column at a time from left to right. At column i, take the character c = strs[0][i] from the first string, then check every other string: if any of them either has a different character at position i or is too short to even have a position i, the LCP must end at column i, so we return the first i characters. This is correct because a character can only belong to the LCP if every string agrees at that position; the first column that breaks the agreement defines the answer. In the worst case (all strings identical) we examine every character once, giving O(S) time where S is the total number of characters across all strings.

逐步走查 / Walkthrough

Input: strs = ["flower", "flow", "flight"]. We use strs[0] = "flower" as the reference.

Column `i`	`c = strs[0][i]`	`strs[1][i]` (`"flow"`)	`strs[2][i]` (`"flight"`)	Match? / 是否全部相等？	Action
0	`'f'`	`'f'`	`'f'`	yes ✓	continue / 繼續
1	`'l'`	`'l'`	`'l'`	yes ✓	continue / 繼續
2	`'o'`	`'o'`	`'i'` ✗	no — `'i' != 'o'`	stop, return `strs[0][0..2)` = `"fl"` / 停止，回傳 `"fl"`

Result / 結果："fl".

注意：如果 strs[1] 是 "fl"（長度只有 2），那麼在 i = 2 時 strs[1][2] 就已經是 '\0'（結尾符），這也算「不匹配」，所以也會回傳 "fl"。 Note: if strs[1] were just "fl", then at i = 2 we'd hit its null terminator — that also counts as a mismatch and we'd return "fl".

Solution — C

/*
 * Algorithm: Vertical scanning / 縱向掃描
 * 用第一個字串當參考，逐欄比對所有字串。第一個出現不一致 (或某字串到結尾) 的位置，
 * 就是 LCP 的長度。把該長度的字首複製到一塊新配置的記憶體並回傳。
 * Use strs[0] as the reference; scan column-by-column. The first column where any
 * string disagrees (or ends) determines the LCP length. Copy that prefix into a
 * freshly malloc'd buffer and return it.
 */

#include <stdlib.h>   // malloc — 動態配置記憶體 / for malloc
#include <string.h>   // strlen, memcpy — 字串長度與複製 / for strlen and memcpy

char* longestCommonPrefix(char** strs, int strsSize) {
    // 邊界情況：陣列為空時回傳空字串 (constraint 保證 >=1，但保險起見還是處理)
    // Edge case: empty array — by constraints strsSize >= 1, but we guard anyway.
    if (strsSize == 0) {
        char* empty = (char*)malloc(1);   // 配置 1 個 byte 給結尾符 / 1 byte for ''
        empty[0] = '';                  // 設成空字串 / make it the empty string
        return empty;
    }

    // i 代表目前正在檢查第幾欄 (從 0 開始) / i is the current column index (0-based)
    int i = 0;

    // 外層迴圈會一直跑，直到我們在某個位置 break 出去 / loop until we break on mismatch or end
    while (1) {
        // 拿第一個字串第 i 個字元當作「期望值」 / take strs[0][i] as the expected character
        char c = strs[0][i];

        // 如果第一個字串已經到結尾，就無法再往前推了 / if strs[0] ended, LCP can't grow further
        if (c == '') break;

        // 從第二個字串開始，逐一檢查它們的第 i 個字元 / check every other string at column i
        for (int j = 1; j < strsSize; j++) {
            // 條件 1: strs[j] 在位置 i 已經是結尾 (字串太短) / strs[j] is too short
            // 條件 2: strs[j][i] 跟期望值 c 不同 / character differs from c
            // 任一條件成立 => 整個函式都該結束，回傳到目前為止的前綴
            // Either condition means the LCP stops here — bail out completely.
            if (strs[j][i] == '' || strs[j][i] != c) {
                goto done;   // 用 goto 一次跳出兩層迴圈 / break out of both loops
            }
        }

        // 這一欄全部相等，前綴可以延伸一個字元 / this column matched, extend the prefix
        i++;
    }

done:
    // i 現在等於 LCP 的長度 / i is now the length of the LCP

    // 配置 i+1 個 byte：i 個字元 + 1 個結尾符 '' / allocate i chars + null terminator
    // LeetCode 規定回傳值必須在 heap 上 (不能是 stack 上的區域陣列)
    // LeetCode requires heap-allocated return values; stack arrays vanish on return.
    char* result = (char*)malloc((size_t)i + 1);

    // 從 strs[0] 複製前 i 個字元到 result / copy the first i chars from strs[0]
    memcpy(result, strs[0], (size_t)i);

    // 在末尾加上 ''，這樣 C 才知道字串到這裡結束 / null-terminate so it's a valid C string
    result[i] = '';

    return result;   // 回傳給 LeetCode 評分系統 / hand the string back to LeetCode
}

Solution — C++

/*
 * Algorithm: Vertical scanning / 縱向掃描
 * 跟 C 版本同一套邏輯：以 strs[0] 為基準，逐欄掃描。
 * 用 std::string::substr 直接切出答案，不必手動管理記憶體。
 * Same logic as the C version; std::string::substr handles the slicing and memory.
 */

#include <string>     // std::string — C++ 字串型別 / C++ managed string type
#include <vector>     // std::vector — 動態陣列 / dynamic array

class Solution {
public:
    // strs 用 const& 傳入避免複製整個 vector / pass by const reference to avoid copying
    std::string longestCommonPrefix(std::vector<std::string>& strs) {
        // 題目保證 strs 至少有 1 個元素，但保險判斷一下 / guard against empty input
        if (strs.empty()) return "";

        // 取出第一個字串作為參考；用 const& 避免複製 / reference to strs[0], no copy
        const std::string& first = strs[0];

        // i 代表目前比對到第幾欄 / current column being checked
        for (size_t i = 0; i < first.size(); i++) {
            char c = first[i];   // 期望的字元 / expected character at column i

            // range-for 逐一拿出每個字串 (從 index 1 開始我們手動跳過第一個)
            // Iterate every string; we skip strs[0] inside the loop with a check.
            for (size_t j = 1; j < strs.size(); j++) {
                // 兩個 fail 條件: 字串長度不足 i+1，或第 i 個字元不對
                // Two failure conditions: string too short, or character mismatch.
                if (i >= strs[j].size() || strs[j][i] != c) {
                    // substr(0, i) 取出 first 的前 i 個字元 / first i chars of `first`
                    // 這就是當前能確定的最長共同前綴 / that's the LCP found so far
                    return first.substr(0, i);
                }
            }
        }

        // 走完整個 first 都沒有 mismatch，代表 first 本身就是 LCP
        // (例如 strs = ["abc","abcdef"] 時答案是 "abc")
        // If we exhausted first without any mismatch, first itself is the LCP.
        return first;
    }
};

複雜度 / Complexity

Time: O(S)，其中 S 是所有字串的字元總數 (sum of strs[i].length)。最壞情況 (所有字串完全相同) 需要把每個字元都檢查一遍。Worst case (all strings identical) we touch every character of every string exactly once, where S is the total character count.
Space: O(1) 額外空間（不算回傳值）。我們只用了幾個整數變數；C 版本另外配置 O(L) 的回傳緩衝區，L 是 LCP 長度，這是「輸出」必需的、不是演算法本身的開銷。Constant auxiliary space; the malloc'd return buffer is output, not algorithmic overhead.

Pitfalls & Edge Cases

空陣列 / Empty strs：題目保證 strsSize >= 1，但寫 if (strsSize == 0) 防呆是好習慣，避免存取 strs[0] 造成 segfault。Guard against zero-length input even if constraints rule it out — accessing strs[0] on an empty array would crash.
某個字串是空字串 / Empty string in the input：例如 ["", "abc"]，第一輪迴圈在 i = 0 就會發現 strs[0][0] == '\0' (C) 或 first.size() == 0 (C++)，立刻回傳 ""。正確處理。Handled naturally: we exit immediately at column 0.
C 必須回傳 heap 記憶體 / C return must be heap-allocated：千萬不要寫 char buf[256]; return buf;——那是 stack 上的記憶體，函式一返回就被釋放，LeetCode 拿到的會是垃圾。一定要用 malloc。Never return a local stack array; LeetCode's grader will read freed memory.
不要忘記 '\0' / Forgetting the null terminator：C 字串靠 '\0' 標記結尾，配置 i+1 byte 並把第 i 個位置設成 '\0'，否則 printf / strlen 會讀過頭。Always allocate length + 1 and write '\0' at the end.
off-by-one：i 是長度還是 index？/ Is i a length or an index?：在 mismatch 發生的瞬間，i 是「目前正在比對的欄位 index」，但同時也正好等於「已經確認相等的字元數」，所以可以直接拿來當前綴長度。This dual meaning is by design — the index where things break is the count of matches before it.
第一個字串就是最短的 / strs[0] is the shortest：例如 ["ab", "abcd"]，C 版本在 i = 2 時 strs[0][2] == '\0'，外層迴圈 break；C++ 版本 for 條件 i < first.size() 自然結束，回傳整個 first。兩種寫法都正確處理這個情況。Both versions return first itself when it's the shortest — handled by the loop bound (C++) or the c == '\0' check (C).