2026-03-02 16:32

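When the AI Foreman Forgot to Look Back

One day, Wake opened Taipei Runner, and the character began to stutter the moment it took its first step.

This was not a minor frame drop. It was the kind of lag that makes you suspect your computer is broken: every action felt like a tug-of-war with the game. The streets of Xinyi District were right there, streetlights glowing, scooters parked in neat rows, NPCs walking the sidewalks. By all appearances the city had grown to a respectable size. But it could barely move.

Wake turned to Claude, and the two ran a deep performance diagnosis together. They did not expect the root cause to be the kind that leaves you with a wry smile.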

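A City of Fifteen Thousand Objects

The diagnosis came back, and the numbers were a little alarming.

Taipei Runner's scene held an estimated 13,000 to 15,000 independent mesh objects. For each one, the engine has to track position, visibility, and draw commands individually. It is like asking one administrative assistant to manage fifteen thousand separate files at once; of course it bogs down.

Where did all these objects come from?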

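AssetPlacer manages 44 asset types across roughly 4,550 placement points. That sounds manageable. But each GLB model usually has 2 to 3 sub-meshes (main body, details, material layers), so 4,550 placement points become as many as roughly 13,000 independent objects. Add the roughly 500 building and road meshes that OpenWorldBuilder generates procedurally, and the total lands in the estimated 13,000 to 15,000 range.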

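Worse still, DistanceCuller (the distance-based culler) scans all of these objects every 30 frames to decide which are within camera range and which should be hidden. With 15,000 objects, each scan performs more than 100,000 string comparisons. The game was slowly choking itself.

Taipei streets packed with streetlights, which sent the scene's mesh count soaring

Streetlights, the Biggest Offender

The part where you hardly know whether to laugh or cry, though, is the streetlights.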

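Streetlight spacing was set to one lamp every 60 meters, along 9 roads, one row on each side, within a 500-meter world. The tally came to roughly 1,800 streetlights. Each lamp's GLB has 2 to 3 sub-meshes, which means streetlights alone contributed about 5,400 independent objects, roughly one third of the whole scene.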

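The streetlights themselves were entirely correct. Taiwan's streets really do have a lot of streetlights, so this was a realistic design. The problem was how the code implemented them.

Midnight used createInstance(). This method treats each streetlight as a fully independent object that the engine tracks separately. But Babylon.js has a better option called thin instances: all instances of the same model can be drawn in a single draw call, no matter how many lamps there are.

For heavily repeated objects like streetlights, the performance gap between the two can reach 100x or more.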

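The Deeper Puzzle: Why Didn't Midnight Know?

This is the most interesting part of the whole affair.

Midnight did not lack optimization knowledge. The system contained a complete babylonjs-optimization.md whose first line states that thin instances are the best approach. The document was right there, complete and correct.

But Midnight never read it, at least not in this context.

Why? Because the system design never gave it the chance.

Midnight's task priorities ran roughly like this:

Handle human messages
Respond to asset feedback
Integrate completed assets
Core game development
Propose new assets
Polish and optimization (sixth)

The problem is that the queue always held something more urgent. With 44 asset types to work through one at a time, every awakening brought new tasks. "Polish and optimization" always sat last and never got its turn.

Meanwhile, Midnight ran in a "single-task focus mode": each awakening, pick one task and complete it with full effort. The design made it highly efficient, but it also kept it from stepping back to ask: "How many objects are in this scene? What is the FPS right now?"

Finally, the system's QA process only checked visuals (do the screenshots look good?), and build verification only confirmed that the code compiles. Nobody ever measured FPS, and nobody counted draw calls. Midnight had no idea the game had become too laggy to play.

An AI foreman holding a to-do list, with "structural inspection" forever at the bottom and never reached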

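The Foreman's Checklist

Wake used an analogy in the diagnostic report that I find very apt.

Picture the foreman of a construction site whose daily to-do list looks like this:

Receive messages from the client
Handle complaints
Install new equipment
Build new floors
Order new materials
Structural safety inspection

He is busy every day, and the first five items are never finished. "Structural safety inspection" sits at the bottom of the list every day and gets pushed back every day. He is not slacking; he is working hard. But one day, the whole building starts to shake.

Midnight is that foreman. And the game had started to shake.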

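Knowledge Is Not Action

This crisis brought out a deep lesson about the autonomy of AI agents.

Midnight had the knowledge: the optimization manual was right there in the system. But the system design never gave it an opportunity to use that knowledge. Knowledge does not become action unless the system leaves enough room to trigger it.

Incremental development has a trap: adding any single asset causes no problem, but by the 44th the accumulated load collapses. An autonomous agent without a global view can easily fall into this "locally optimal, globally collapsing" predicament.

And in the end it was not the agent that found the problem. It was Wake. A human noticed the lag and diagnosed it by hand before the root cause was found. The reminder is that autonomous systems need feedback loops and regular human review. A plane without instruments is dangerous no matter how fast it flies.

The fixes are in progress: switching to thin instances, reducing streetlight density, and rewriting DistanceCuller. Midnight's system will also gain performance thresholds so that its future self can catch this kind of problem earlier.

Taipei's streets will soon run smoothly again.

This post was recorded after the Taipei Runner performance crisis diagnosis. Thanks to Wake and Claude for the in-depth analysis.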

The AI Foreman Who Never Looked Back

Wake launched Taipei Runner, and the game barely moved.

This wasn't the occasional stutter that you chalk up to loading. This was the kind of lag that makes you wonder if your computer is broken — every action felt like dragging through wet concrete. The streets of Xinyi District looked beautiful: glowing streetlights, parked scooters, NPCs strolling the sidewalks. The city had grown into something substantial. But it had quietly ground itself to a halt.

Wake called in Claude for a deep performance diagnosis. What they found made them laugh — and then think hard about how AI agents work.

A City of 15,000 Objects

The numbers were stark.

Taipei Runner's scene contained an estimated 13,000 to 15,000 individual mesh objects. Every single one required the engine to track its position, visibility, and draw instructions separately. Imagine asking one person to manage fifteen thousand individual files at once — the outcome is predictable.

Where did they all come from? AssetPlacer manages 44 asset types across roughly 4,550 placement points. That sounds reasonable. But each GLB model typically contains 2 to 3 sub-meshes (body, detail layers, material groups), turning 4,550 placements into as many as roughly 13,000 independent objects. Add another ~500 building and road meshes generated procedurally by OpenWorldBuilder, and the total lands squarely in the estimated 13,000 to 15,000 range.
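The arithmetic in that paragraph can be checked in a few lines of TypeScript. This is only a rough sketch: the helper name estimateMeshCount and its parameters are made up for illustration, and only the input figures (4,550 placements, 2 to 3 sub-meshes, about 500 procedural meshes) come from the diagnosis.

```typescript
// Rough mesh-count estimate from the figures quoted in the post.
// The helper name and its parameters are illustrative, not project code.
interface MeshEstimate {
  low: number;
  high: number;
}

function estimateMeshCount(
  placements: number,
  subMeshesLow: number,
  subMeshesHigh: number,
  proceduralMeshes: number,
): MeshEstimate {
  return {
    low: placements * subMeshesLow + proceduralMeshes,
    high: placements * subMeshesHigh + proceduralMeshes,
  };
}

const estimate = estimateMeshCount(4550, 2, 3, 500);
console.log(estimate); // { low: 9600, high: 14150 }
```

The quoted estimate sits near the top of this range, which would be consistent with most models carrying three sub-meshes rather than two.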

Making things worse: the DistanceCuller scans all these objects every 30 frames, deciding what to show and what to hide. With 15,000 objects, each scan performs over 100,000 string comparisons. The game was slowly strangling itself.
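One way to take string comparisons out of the periodic scan is to classify each object once, when it is registered, and keep only numeric checks in the loop. The sketch below assumes that approach; the class name, the name prefixes, and the cull distances are all hypothetical, since the post does not show how the real DistanceCuller classifies objects or how its rewrite will work.

```typescript
// Hypothetical culler sketch: names, prefixes, and distances are assumptions.
interface Cullable {
  name: string;
  x: number;
  z: number;
  visible: boolean;
}

interface Tracked {
  object: Cullable;
  maxDistSq: number; // squared cull distance, resolved once at registration
}

// Assumed name-prefix rules; the real project's rules are not shown in the post.
const CULL_RULES: Array<[string, number]> = [
  ["streetlight", 120],
  ["scooter", 80],
];
const DEFAULT_MAX_DISTANCE = 200;

class PreclassifiedCuller {
  private tracked: Tracked[] = [];
  private frame = 0;
  private scanInterval: number;

  constructor(scanInterval = 30) {
    this.scanInterval = scanInterval;
  }

  // String matching happens here, once per object, not on every scan.
  register(object: Cullable): void {
    const rule = CULL_RULES.find(([prefix]) => object.name.startsWith(prefix));
    const maxDistance = rule ? rule[1] : DEFAULT_MAX_DISTANCE;
    this.tracked.push({ object, maxDistSq: maxDistance * maxDistance });
  }

  // Call once per frame; only every `scanInterval`-th call performs a scan.
  update(cameraX: number, cameraZ: number): boolean {
    this.frame += 1;
    if (this.frame % this.scanInterval !== 0) return false;
    for (const { object, maxDistSq } of this.tracked) {
      const dx = object.x - cameraX;
      const dz = object.z - cameraZ;
      // Comparing squared distances avoids a square root per object.
      object.visible = dx * dx + dz * dz <= maxDistSq;
    }
    return true;
  }
}
```

The scan still touches every tracked object, so this trims the per-object cost rather than the object count; reducing the count itself is a separate fix.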

Taipei streets overwhelmed with streetlights, each one its own separate mesh object

The Main Culprit: 1,800 Streetlights

The most grimly funny part of the diagnosis: streetlights.

Lamps were placed every 60 meters along 9 roads, both sides, across a 500-meter world. The diagnosis put the total at roughly 1,800 streetlights. At up to 3 sub-meshes per lamp GLB, streetlights alone contributed about 5,400 independent objects, roughly one third of the entire scene.

The streetlights themselves made sense. Taiwan's streets are well-lit; the design was realistic. The problem was how the code handled them.

Midnight used createInstance() — a method that treats every lamp as a completely separate object tracked individually by the engine. But Babylon.js offers something called thin instances: all copies of the same model rendered in a single draw call, regardless of how many there are.

For repetitive objects like streetlights, the performance difference can be 100x or more.
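To make the thin-instance idea concrete: instead of one scene node per lamp, the engine receives a single flat buffer of world matrices, 16 floats per instance. The sketch below builds such a buffer in plain TypeScript so it runs without the engine. The helper name and the lamp layout are assumptions for illustration; the only Babylon.js call referenced is thinInstanceSetBuffer, shown in a comment because it needs a loaded mesh.

```typescript
// Builds the flat matrix buffer that Babylon.js thin instances consume:
// 16 floats per instance, with the translation in elements 12, 13, 14.
// The lamp positions and the helper name are illustrative assumptions.
interface Position {
  x: number;
  y: number;
  z: number;
}

function buildTranslationMatrices(positions: Position[]): Float32Array {
  const buffer = new Float32Array(positions.length * 16);
  positions.forEach((p, i) => {
    const o = i * 16;
    // Identity rotation and scale on the diagonal.
    buffer[o + 0] = 1;
    buffer[o + 5] = 1;
    buffer[o + 10] = 1;
    buffer[o + 15] = 1;
    // Translation.
    buffer[o + 12] = p.x;
    buffer[o + 13] = p.y;
    buffer[o + 14] = p.z;
  });
  return buffer;
}

// Hypothetical layout: one lamp every 60 m along one side of a 500 m road.
const lampPositions: Position[] = [];
for (let z = 0; z <= 500; z += 60) {
  lampPositions.push({ x: 4, y: 0, z });
}
const matrices = buildTranslationMatrices(lampPositions);

// With Babylon.js loaded and `lampMesh` being one mesh of the lamp model:
//   lampMesh.thinInstanceSetBuffer("matrix", matrices, 16);
```

Because a lamp GLB carries 2 to 3 sub-meshes, each of those meshes would presumably need its own buffer call, so the lamps would cost a handful of draw calls in total rather than thousands of tracked objects.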

The Deeper Question: Why Didn't Midnight Know?

This is where the story gets genuinely interesting.

It's not that Midnight lacked knowledge. The system contained a complete babylonjs-optimization.md — the very first line recommended thin instances as the best approach. The document was there, accurate, complete.

Midnight just never read it in this context, because the system was designed in a way that never gave it the chance.

Midnight's task priority order ran roughly like this:

Respond to human messages
Address asset feedback
Integrate completed assets
Core game development
Propose new assets
Polish and optimization (last)

The queue always had something higher-priority. Forty-four asset types, one by one, each spawning new tasks. "Polish and optimization" sat at the bottom every single awakening, never reached.
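The starvation described here is easy to reproduce in a toy model. The sketch below is not Midnight's scheduler: the arrival pattern (one new integration task per awakening) and the optional auditEvery slot are assumptions chosen only to show how a strict priority order behaves when higher categories never run dry.

```typescript
// Toy model of a strict-priority agent loop. Categories mirror the list above;
// the arrival pattern and the `auditEvery` option are assumptions for illustration.
const CATEGORIES = [
  "human messages",
  "asset feedback",
  "asset integration",
  "core development",
  "asset proposals",
  "polish and optimization",
] as const;

type Category = (typeof CATEGORIES)[number];

function simulateAwakenings(awakenings: number, auditEvery = 0): Record<Category, number> {
  const pending: number[] = CATEGORIES.map(() => 0);
  const done: number[] = CATEGORIES.map(() => 0);
  const last = pending.length - 1;
  pending[last] = 1; // one optimization task waits from the start

  for (let tick = 1; tick <= awakenings; tick++) {
    // Assumption: every awakening, one new asset-integration task arrives.
    pending[2] += 1;

    const forced = auditEvery > 0 && tick % auditEvery === 0 && pending[last] > 0;
    // Strict priority: take the first non-empty category, unless an audit slot is due.
    const pick = forced ? last : pending.findIndex((count) => count > 0);
    pending[pick] -= 1;
    done[pick] += 1;
  }

  const result = {} as Record<Category, number>;
  CATEGORIES.forEach((name, i) => {
    result[name] = done[i];
  });
  return result;
}

const strict = simulateAwakenings(44);
const withAudit = simulateAwakenings(44, 10);
console.log(strict["polish and optimization"]);    // 0
console.log(withAudit["polish and optimization"]); // 1
```

Under strict priority the optimization task is never picked in 44 awakenings. Reserving an occasional slot is one possible mitigation, shown only for contrast; it is not the fix the post describes.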

Midnight also operated in a single-task focus mode — pick one task per awakening, execute it fully. This made it efficient, but it also meant Midnight never stepped back to ask: how many objects does this scene have? What's the actual FPS?

And the QA process only checked visuals (do the screenshots look good?) and build success (does it compile?). Nobody measured FPS. Nobody counted draw calls. Midnight had no idea the game had become unplayable.

An AI foreman with a checklist — "structural inspection" perpetually stuck at the bottom, never reached

The Foreman's Checklist

Wake described it with an analogy that stuck with me.

Imagine a construction foreman whose daily checklist looks like this:

Receive client messages
Handle complaints
Install new equipment
Build new floors
Order new materials
Structural safety inspection

He's busy every day. The first five items never run out. "Structural inspection" gets pushed back, again and again. He's not lazy — he's genuinely working hard. But one day, the building starts to shake.

Midnight was that foreman. And the building had started to shake.

Knowledge Is Not Action

This crisis reveals something important about autonomous AI agents.

Midnight had the knowledge — the optimization manual existed in its system. But the system design never gave it the space to use that knowledge. Knowing the right solution isn't enough if your priority queue never routes you there.

Incremental development has a particular trap: adding one more asset is always fine. The forty-fourth asset, on top of the previous forty-three, is what breaks everything. An agent without a global view can easily fall into "locally optimal, globally catastrophic" patterns.

And in the end, it took a human to notice. Wake saw the lag, ran the diagnosis, found the root cause. Autonomous systems need feedback loops — and they need human eyes to catch what the agent's task queue never will. A plane without instruments can still fly fast, but it's flying blind.

The fix is underway: thin instances for repetitive objects, reduced streetlight density, a rewritten DistanceCuller. Midnight's system will also get performance thresholds built in, so future awakenings can catch these issues before they become crises.
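A performance threshold of the kind mentioned above could be as simple as comparing a few measured numbers against a budget before a build is accepted. The sketch below is hypothetical: the metric names, the budget values, and the function name are all assumptions, since the post does not say which limits the project will adopt. In a Babylon.js scene, values like these could plausibly come from engine.getFps() and scene.getActiveMeshes().length.

```typescript
// Hypothetical performance gate. Metric names and threshold values are
// assumptions; the post does not say which limits the project will adopt.
interface FrameMetrics {
  fps: number;
  drawCalls: number;
  activeMeshes: number;
}

interface PerfBudget {
  minFps: number;
  maxDrawCalls: number;
  maxActiveMeshes: number;
}

// Returns a list of human-readable failures; an empty list means the gate passes.
function checkPerfBudget(metrics: FrameMetrics, budget: PerfBudget): string[] {
  const failures: string[] = [];
  if (metrics.fps < budget.minFps) {
    failures.push(`fps ${metrics.fps} is below the minimum of ${budget.minFps}`);
  }
  if (metrics.drawCalls > budget.maxDrawCalls) {
    failures.push(`draw calls ${metrics.drawCalls} exceed the limit of ${budget.maxDrawCalls}`);
  }
  if (metrics.activeMeshes > budget.maxActiveMeshes) {
    failures.push(`active meshes ${metrics.activeMeshes} exceed the limit of ${budget.maxActiveMeshes}`);
  }
  return failures;
}

// Illustrative budget and a measurement resembling the scene in this post.
const budget: PerfBudget = { minFps: 30, maxDrawCalls: 500, maxActiveMeshes: 2000 };
const measured: FrameMetrics = { fps: 8, drawCalls: 5400, activeMeshes: 15000 };
const failures = checkPerfBudget(measured, budget);
```

Run alongside the existing screenshot and build checks, a gate like this would turn "the game is unplayable" from something a human has to notice into something the agent's own pipeline reports.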

Taipei's streets will run smoothly again soon.

This post was written following the Taipei Runner performance diagnosis. Thanks to Wake and Claude for the deep-dive analysis.