---
name: notebooklm-api
description: Expert assistant for the notebooklm-api project, a Puppeteer-based REST/WebSocket server that automates Google NotebookLM. Use when working on browser automation, Puppeteer selectors, API routes, queue logic, or debugging DOM interactions with NotebookLM.
triggers:
  - "notebooklm"
  - "puppeteer"
  - "selector"
  - "notebook"
  - "add source"
  - "chat stream"
  - "browser automation"
---

# NotebookLM API: Project Skill

## What this project is

A Node.js server (`src/server.js`) that drives **Google NotebookLM** (`https://notebooklm.google.com`) through a real Chrome window via Puppeteer with the Stealth plugin. It exposes a REST + WebSocket API so external systems (n8n, Python scripts, ERP, chatbots) can interact with NotebookLM programmatically.

**Key constraint:** NotebookLM has no official API. Every operation (listing notebooks, adding sources, chatting) is done by simulating real user clicks and keystrokes in the browser. The Google Angular UI changes frequently, so DOM selectors may need updating after UI changes.

## Architecture

```
src/
  server.js      - Express app, middleware, debug routes, graceful shutdown
  browser.js     - BrowserManager singleton (puppeteer-extra + stealth)
  nlm.js         - All NotebookLM automation logic (the core)
  queue.js       - AsyncQueue: serialises ALL Puppeteer ops (no concurrency)
  selectors.js   - CSS selectors confirmed from real DOM inspection
  swagger.js     - Swagger/OpenAPI spec
  routes/
    auth.js      - GET /api/auth/status, POST /api/auth/login
    notebooks.js - CRUD for notebooks + sources
    chat-ws.js   - WebSocket streaming chat
```

## The queue: critical invariant

`src/queue.js` is a single-lane async queue. **Every route wraps its nlm call in `queue.add(...)`**. This prevents Puppeteer race conditions: only one browser action runs at a time.

Never bypass the queue. If a new route is added, it MUST use the queue.

## Selectors: how to update them

All selectors live in `src/selectors.js`.
They were confirmed against the Vietnamese-language NotebookLM UI. When Google updates the UI and a selector breaks:

1. Use the debug endpoints in `server.js` to inspect the live DOM:
   - `GET /debug/home`: lists all visible buttons on the home page (find create/delete buttons)
   - `GET /debug/notebook-menu/:id`: hovers a card, clicks the "..." menu, returns menu items
   - `GET /debug/chat/:id`: lists all textarea/input elements on a notebook page
   - `GET /debug/sources/:id`: inspects the source panel DOM
   - `GET /debug/source-items/:id`: full HTML of source items (buttons, icons, spans)
   - `GET /debug/add-source-dialog/:id`: clicks "Add source" and snapshots the dialog
   - `GET /debug/add-url-flow/:id`: step-by-step trace of the full add-URL flow
   - `GET /debug/page-state/:id`: full overlay/dialog state before and after the click
   - `GET /debug/screenshot`: base64 PNG of the current Chrome window
2. Key known selectors (confirmed 2026-06-16):
   - Create notebook button: `button[aria-label="Tạo sổ ghi chú mới"], button.create-new-button`
   - Card menu button: `button[aria-label="Trình đơn thao tác trong dự án"]`
   - Delete menu item: find by text `/Xo[áa]/i` inside `.mat-mdc-menu-panel button.mat-mdc-menu-item`
   - Add source button: `.add-source-button`
   - Source items: `div.single-source-container` (NOT `.source-item-menu-button-visible`, which is a child)
   - Source title: `button.source-stretched-button[aria-label]` inside each container
   - Source type icon: `mat-icon.source-item-source-icon` text; or "url" if `img.favicon-icon` is present
   - Chat textarea: `textarea.query-box-input` (NOT the source search textarea)
   - AI response cards: `mat-card.to-user-message-card-content`
   - User message cards: `mat-card.from-user-message-card-content`
   - Thinking animation: `div.thinking-message, thinking-animation`
   - Dialogs: `mat-dialog-container` (multiple can exist; the emoji keyboard occupies one)
3.
   The UI language is **Vietnamese**, so button labels are strings like "Trang web" (Website), "Văn bản đã sao chép" (Copied text), "Chèn" (Insert), and "Tạo sổ ghi chú mới" (Create new notebook).

## Dialog handling: the tricky part

The Add Source dialog now has a **2-stage flow**:

**Stage 1 (initial):** Shows source-type buttons + a search textarea (`placeholder="Tìm nguồn mới trên web"`, class `query-box-textarea`). This textarea is always present and must be EXCLUDED from URL/text input detection.

**Stage 2 (after clicking a type button):** The dialog transitions in place to show:

- URL mode: a TEXTAREA with `aria-label="Nhập URL"` and `placeholder="Dán liên kết bất kỳ"`
- Text mode: a large textarea without the `query-box-textarea` class

**Submit button:** the `"Chèn"` button with `type="button"` (NOT `type="submit"`, which belongs to the back/close buttons).

Pattern in `nlm.addSourceUrl()`: wait for the dialog to TRANSITION before looking for the URL input.

```javascript
// Click the "Trang web" (Website) button in the topmost dialog
await page.evaluate(() => {
  const d = [...document.querySelectorAll('mat-dialog-container')].pop();
  if (!d) return; // no dialog open yet
  const btn = [...d.querySelectorAll('button')].find(b => /Trang web/i.test(b.textContent));
  btn?.click();
});

// Wait for the URL input to appear (not the search textarea)
await page.waitForFunction(() => {
  const d = [...document.querySelectorAll('mat-dialog-container')].pop();
  if (!d) return false; // keep polling until a dialog exists
  return [...d.querySelectorAll('input, textarea')].some(el =>
    el.offsetParent !== null && // visible elements only
    (el.getAttribute('aria-label')?.toLowerCase().includes('url') ||
      /liên kết|link|paste|http/i.test(el.placeholder))
  );
}, { timeout: 10_000, polling: 200 });

// (Typing the URL into that input happens here; omitted in this excerpt.)

// Click submit: match the "Chèn" text, NOT type="submit"
await page.evaluate(() => {
  const d = [...document.querySelectorAll('mat-dialog-container')].pop();
  if (!d) return;
  const btn = [...d.querySelectorAll('button')].find(b =>
    b.offsetParent !== null && !b.disabled && /Ch[eè]n/i.test(b.textContent)
  );
  btn?.click();
});
```

## Chat streaming (WebSocket)

`src/routes/chat-ws.js` handles `WS /api/notebooks/:id/chat/stream`.
The streaming works by **polling the DOM** every 300ms during the AI response, diffing against the last known text, and emitting `{ type: "chunk", data: "..." }` messages. This is not true server-sent streaming; it is DOM polling.

Message types the server emits:

- `{ type: "connected", notebookId }`: on WS open
- `{ type: "chunk", data: string }`: incremental text
- `{ type: "done", data: { answer } }`: response complete
- `{ type: "error", data: string }`: on failure

## waitForAiResponse: response completion detection

`nlm.waitForAiResponse(page, prevCount, timeout)` in `src/nlm.js` uses a 3-step approach:

1. Wait for `div.thinking-message` / `thinking-animation` to disappear
2. Wait for a new `mat-card.to-user-message-card-content` to appear (count > prevCount)
3. Wait for the text to be **stable for about 1.5s** (no changes across 4 consecutive 400ms polls)

This handles streaming responses that may still be updating after the spinner disappears.

## Browser session persistence

The Chrome profile is stored at `./chrome-profile/`, so Google login cookies persist between server restarts. On startup, `browser.isAuthenticated()` navigates to `https://notebooklm.google.com` and checks whether the URL stays there (not redirected to `accounts.google.com`).

**Important:** Always stop the server with `Ctrl+C` (SIGINT) or a SIGTERM. The graceful shutdown handler calls `browser.close()` to flush Chrome's cookie store. `kill -9` (SIGKILL) cannot be caught, so the handler never runs and the session may be corrupted.

## File upload (`addSourceFile`): xap uploader quirks

NotebookLM's "Tải tệp lên" (Upload file) button uses Google's internal **xap scotty uploader** (`[xapscottyuploadertrigger]` attribute). Key findings from 2026-06-16 debugging:

1. **Trigger button is in the INITIAL dialog.** The drop zone + "Tải tệp lên" button appear as soon as the add-source dialog opens. No need to click a secondary button to reveal the drop zone.
2. **Must use a trusted (CDP) click.** `trigger.click()` inside `page.evaluate()` generates `isTrusted: false`.
   The xap uploader silently ignores untrusted clicks. Use `page.mouse.click(x, y)` (Puppeteer's CDP method) to generate trusted clicks.
3. **Two upload paths.** Depending on browser context, the xap uploader may use either:
   - **Native file chooser** (`input[type=file]`) → caught by `page.waitForFileChooser()` + `fileChooser.accept([path])`
   - **`showOpenFilePicker()`** → override it BEFORE clicking: `window.showOpenFilePicker = async () => [{ getFile: () => file, ... }]`
4. **No "Chèn" button.** After file upload, the dialog auto-closes. No confirmation click is needed.
5. **Wait for network idle** after upload (the scotty upload finishes asynchronously).

```javascript
// Pattern (from nlm.addSourceFile):
// Start listening for the chooser BEFORE clicking, so the event is not missed.
const chooser = page.waitForFileChooser({ timeout: 8_000 }).catch(() => null);
await page.mouse.click(triggerX, triggerY);          // trusted click
const fc = await chooser;
if (fc) await fc.accept([absPath]);                  // native file chooser path
// else: the showOpenFilePicker override handles it
await page.waitForNetworkIdle({ timeout: 60_000 });  // wait for scotty upload
```

### Debugging "add source button not found" error

The add-source button (`button[aria-label="Thêm nguồn"]`) only exists on notebooks the USER OWNS. Public/shared notebooks (shown with a "Công khai" badge) don't have this button. Always test with a user-owned notebook.

## Common tasks

### Adding a new API endpoint

1. Create or edit a route file in `src/routes/`
2. Add automation logic to `src/nlm.js`
3. Register the route in `src/server.js` with `app.use()`
4. Wrap the nlm call in `queue.add(() => nlm.myFn(), 'label')`
5. Add to the Swagger spec in `src/swagger.js`
6. Update `API.md`

### Fixing a broken selector

1. Start the server: `npm start`
2. Hit the relevant `/debug/*` endpoint to see the live DOM (`/debug/home`, `/debug/source-items/:id`, `/debug/add-url-flow/:id`, etc.)
3. Update `src/selectors.js` with the corrected selector
4. If the logic (not just the selector) is wrong (e.g. timing, wrong element), update `src/nlm.js`
5.
   Test via the actual API endpoint

### Debugging "nút xác nhận không tìm thấy" in add source flow

- The Add Source dialog has a 2-stage flow. If you get "nút xác nhận không tìm thấy" ("confirm button not found"), it likely means:
  1. The URL was typed into the WRONG textarea (the search box `query-box-textarea`)
  2. The dialog never transitioned to stage 2, so the "Chèn" button never appeared
- Use `GET /debug/add-url-flow/:id` to trace each step and see what happened
- Key filter: exclude `el.placeholder?.includes('Tìm nguồn')` and `el.className?.includes('query-box-textarea')`

### Debugging a "dialog not found" error

- Use `GET /debug/notebook-menu/:id` to see what menu appears after hover+click
- Use `GET /debug/add-source-dialog/:id` to see the full overlay HTML after clicking add source

### Understanding queue state

`GET /health` returns `{ queue: { busy: boolean, pending: number } }`. If `busy` is always true, a previous operation is stuck (possibly waiting for a DOM element that never appeared). Restart the server.

## Environment variables

| Variable | Default | Notes |
|---|---|---|
| `PORT` | `3456` | HTTP/WS port |
| `HEADLESS` | `false` | Set `true` for CI (may break Google login) |
| `CHROME_PATH` | auto-detect | Prefers system Chrome over bundled Chromium |
| `API_KEY` | _(empty)_ | If set, all requests need `x-api-key` header |

## Running locally

```bash
npm install   # first time only
npm start     # starts at http://localhost:3456
npm run dev   # same but with --watch (auto-restarts on file change)
```

Swagger UI: `http://localhost:3456/docs`

## Files NOT to modify carelessly

- `chrome-profile/`: Google session data. Do not delete. Do not gitignore the entire directory (the profile must persist).
- `src/queue.js`: Changing the queue to allow concurrency will cause race conditions in Puppeteer.
- `src/nlm.js:waitForAiResponse`: The stability detection logic is tuned for NotebookLM's streaming behaviour; don't simplify it.
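
To make the single-lane guarantee of `src/queue.js` concrete, here is a minimal sketch of a queue with that behaviour. It is an illustration only, not the project's code: the `tail`, `current`, and `state()` names are assumptions, and only `add(fn, label)` plus the `{ busy, pending }` shape come from the text above.

```javascript
// Hypothetical single-lane queue for illustration; the real src/queue.js may differ.
class AsyncQueue {
  constructor() {
    this.tail = Promise.resolve(); // settles when the last queued task settles
    this.pending = 0;              // tasks added but not yet started
    this.busy = false;             // true while a task is executing
    this.current = null;           // label of the running task, useful for logs
  }

  // Queue `task` (a function returning a promise) behind all earlier tasks.
  add(task, label = 'anonymous') {
    this.pending += 1;
    const run = async () => {
      this.pending -= 1;
      this.busy = true;
      this.current = label;
      try {
        return await task();
      } finally {
        this.busy = false;
        this.current = null;
      }
    };
    const result = this.tail.then(run);
    // Keep the chain alive after a failure so later tasks still run.
    this.tail = result.catch(() => {});
    return result;
  }

  // Same shape as the `queue` field described for GET /health.
  state() {
    return { busy: this.busy, pending: this.pending };
  }
}
```

A route would call something like `queue.add(() => nlm.myFn(), 'label')` and await the returned promise. Because each task starts only after the previous one settles, two Puppeteer operations never overlap, which is the property lost if concurrency is introduced.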
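
The text-stability step of `waitForAiResponse` can be sketched in isolation as follows. This is a hedged illustration rather than the code in `src/nlm.js`: `waitForStableText`, `readText`, and the option names are hypothetical, and the defaults simply mirror the 400ms interval and 4 polls mentioned earlier.

```javascript
// Hypothetical helper; the real waitForAiResponse in src/nlm.js may differ.
// `readText` is any async function that returns the current response text.
async function waitForStableText(
  readText,
  { intervalMs = 400, stablePolls = 4, timeoutMs = 120_000 } = {}
) {
  const deadline = Date.now() + timeoutMs;
  let last = await readText();
  let unchanged = 0; // consecutive polls with identical, non-empty text

  while (Date.now() < deadline) {
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
    const current = await readText();
    if (current === last && current.length > 0) {
      unchanged += 1;
      if (unchanged >= stablePolls) return current; // text has settled
    } else {
      unchanged = 0; // text changed (or is still empty): restart the count
      last = current;
    }
  }
  throw new Error(`Text did not stabilise within ${timeoutMs}ms`);
}
```

With Puppeteer, `readText` could be a `page.evaluate()` call that returns the text of the last `mat-card.to-user-message-card-content` element. Resetting the counter on every change is what keeps a slow, still-streaming answer from being cut off early.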