notebooklm-api/.claude/skills/notebooklm-api.md

name: notebooklm-api
description: Expert assistant for the notebooklm-api project, a Puppeteer-based REST/WebSocket server that automates Google NotebookLM. Use when working on browser automation, Puppeteer selectors, API routes, queue logic, or debugging DOM interactions with NotebookLM.
triggers: notebooklm, puppeteer, selector, notebook, add source, chat stream, browser automation

NotebookLM API — Project Skill

What this project is

A Node.js server (src/server.js) that drives Google NotebookLM (https://notebooklm.google.com) through a real Chrome window via Puppeteer with the Stealth plugin. It exposes a REST + WebSocket API so external systems (n8n, Python scripts, ERP, chatbots) can interact with NotebookLM programmatically.

Key constraint: NotebookLM has no official API. Every operation (listing notebooks, adding sources, chatting) is done by simulating real user clicks and keystrokes in the browser. Google changes the Angular UI frequently, so DOM selectors can break and need updating.

Architecture

src/
  server.js      — Express app, middleware, debug routes, graceful shutdown
  browser.js     — BrowserManager singleton (puppeteer-extra + stealth)
  nlm.js         — All NotebookLM automation logic (the core)
  queue.js       — AsyncQueue: serialises ALL Puppeteer ops (no concurrency)
  selectors.js   — CSS selectors confirmed from real DOM inspection
  swagger.js     — Swagger/OpenAPI spec
  routes/
    auth.js      — GET /api/auth/status, POST /api/auth/login
    notebooks.js — CRUD for notebooks + sources
    chat-ws.js   — WebSocket streaming chat

The queue — critical invariant

src/queue.js is a single-lane async queue. Every route wraps its nlm call in queue.add(...). This prevents Puppeteer race conditions — only one browser action runs at a time. Never bypass the queue. If a new route is added, it MUST use the queue.
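
To make the invariant concrete, here is a minimal sketch of a single-lane queue. It is not the contents of src/queue.js; the class body, the current field and the state() method are assumptions chosen to match the queue.add(fn, label) call shape and the { busy, pending } fields mentioned elsewhere in this file.

```javascript
// Hypothetical single-lane queue; the real src/queue.js may differ.
class AsyncQueue {
  constructor() {
    this.tail = Promise.resolve(); // settles when the last queued task settles
    this.pending = 0;              // tasks added but not yet started
    this.busy = false;             // true while a task is executing
    this.current = null;           // label of the running task (for logs)
  }

  // Run fn only after every previously added task has settled.
  add(fn, label = 'task') {
    this.pending += 1;
    const run = async () => {
      this.pending -= 1;
      this.busy = true;
      this.current = label;
      try {
        return await fn();
      } finally {
        this.busy = false;
        this.current = null;
      }
    };
    const result = this.tail.then(run);
    // A failed task must not block the tasks queued behind it.
    this.tail = result.catch(() => {});
    return result;
  }

  state() {
    return { busy: this.busy, pending: this.pending };
  }
}
```

Because each task is chained onto the previous one, two routes that call add() at the same moment still touch the browser one after the other, and a rejected task is reported to its caller without stalling the lane.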

Selectors — how to update them

All selectors live in src/selectors.js. They were confirmed against the Vietnamese-language NotebookLM UI. When Google updates the UI and a selector breaks:

  1. Use the debug endpoints in server.js to inspect the live DOM:

    • GET /debug/home — lists all visible buttons on the home page (find create/delete buttons)
    • GET /debug/notebook-menu/:id — hovers a card, clicks the "..." menu, returns menu items
    • GET /debug/chat/:id — lists all textarea/input elements on a notebook page
    • GET /debug/sources/:id — inspects the source panel DOM
    • GET /debug/source-items/:id — full HTML of source items (buttons, icons, spans)
    • GET /debug/add-source-dialog/:id — clicks "Add source" and snapshots the dialog
    • GET /debug/add-url-flow/:id — step-by-step trace of the full add-URL flow
    • GET /debug/page-state/:id — full overlay/dialog state before+after click
    • GET /debug/screenshot — base64 PNG of current Chrome window
  2. Key known selectors (confirmed 2026-06-16):

    • Create notebook button: button[aria-label="Tạo sổ ghi chú mới"], button.create-new-button
    • Card menu button: button[aria-label="Trình đơn thao tác trong dự án"]
    • Delete menu item: find by text /Xo[áa]/i inside .mat-mdc-menu-panel button.mat-mdc-menu-item
    • Add source button: .add-source-button
    • Source items: div.single-source-container (NOT .source-item-menu-button-visible which is a child)
    • Source title: button.source-stretched-button[aria-label] inside each container
    • Source type icon: mat-icon.source-item-source-icon text; or "url" if img.favicon-icon present
    • Chat textarea: textarea.query-box-input (NOT the source search textarea)
    • AI response cards: mat-card.to-user-message-card-content
    • User message cards: mat-card.from-user-message-card-content
    • Thinking animation: div.thinking-message, thinking-animation
    • Dialogs: mat-dialog-container (multiple can exist — emoji keyboard occupies one)
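  3. The UI language is Vietnamese, so button labels appear as "Trang web" (Website), "Văn bản đã sao chép" (Copied text), "Chèn" (Insert) and "Tạo sổ ghi chú mới" (Create new notebook). Match on the Vietnamese strings, not the English meanings.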

Dialog handling — the tricky part

The Add Source dialog now has a 2-stage flow:

Stage 1 (initial): Shows source-type buttons + a search textarea (placeholder="Tìm nguồn mới trên web", class query-box-textarea). This textarea is always present and must be EXCLUDED from URL/text input detection.

Stage 2 (after clicking type button): Dialog transitions in-place to show:

  • URL mode: a TEXTAREA with aria-label="Nhập URL" and placeholder="Dán liên kết bất kỳ"
  • Text mode: a large textarea without the query-box-textarea class

Submit button: the "Chèn" button, which has type="button". Do NOT match on type="submit": in this dialog that attribute belongs to the back/close buttons.

Pattern in nlm.addSourceUrl() — wait for dialog to TRANSITION before finding URL input:

// Click "Trang web" button
await page.evaluate(() => {
  const d = [...document.querySelectorAll('mat-dialog-container')].pop();
  const btn = [...d.querySelectorAll('button')].find(b => /Trang web/i.test(b.textContent));
  btn?.click();
});

// Wait for URL input to appear (not the search textarea)
await page.waitForFunction(() => {
  const d = [...document.querySelectorAll('mat-dialog-container')].pop();
  if (!d) return false; // dialog not in the DOM yet: keep polling
  return [...d.querySelectorAll('input, textarea')].some(el =>
    el.offsetParent !== null &&
    (el.getAttribute('aria-label')?.toLowerCase().includes('url') ||
     /liên kết|link|paste|http/i.test(el.placeholder))
  );
}, { timeout: 10_000, polling: 200 });

// Click submit — use "Chèn" text, NOT type="submit"
await page.evaluate(() => {
  const d = [...document.querySelectorAll('mat-dialog-container')].pop();
  const btn = [...d.querySelectorAll('button')].find(b =>
    b.offsetParent !== null && !b.disabled && /Ch[eè]n/i.test(b.textContent)
  );
  btn?.click();
});

Chat streaming (WebSocket)

src/routes/chat-ws.js handles WS /api/notebooks/:id/chat/stream.

The streaming works by polling the DOM every 300ms during the AI response, diffing against the last known text, and emitting { type: "chunk", data: "..." } messages. This is not true server-sent streaming — it's DOM polling.
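
The poll-and-diff idea can be sketched as below. This is an illustration, not the code in src/routes/chat-ws.js: nextChunk, streamByPolling and the injected readText / isDone / emit callbacks are hypothetical names, and the sketch assumes the response text only grows by appending.

```javascript
// Return the part of currentText that follows the prefix shared with lastText.
// For append-only growth this is exactly the newly added text.
function nextChunk(lastText, currentText) {
  let i = 0;
  const max = Math.min(lastText.length, currentText.length);
  while (i < max && lastText[i] === currentText[i]) i += 1;
  return currentText.slice(i);
}

// Poll readText() until isDone() reports completion, emitting only the diffs.
// readText and isDone are assumed to wrap page.evaluate() calls in real use.
async function streamByPolling({ readText, isDone, emit, intervalMs = 300 }) {
  let last = '';
  for (;;) {
    const done = await isDone();       // check first so the final read sees all text
    const current = await readText();
    const chunk = nextChunk(last, current);
    if (chunk) emit({ type: 'chunk', data: chunk });
    last = current;
    if (done) break;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  emit({ type: 'done', data: { answer: last } });
}
```

If the UI rewrites text that was already emitted (for example when markdown is re-rendered), the concatenated chunks no longer equal the final answer, which is one reason the done message carries the full answer.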

Message types the server emits:

  • { type: "connected", notebookId } — on WS open
  • { type: "chunk", data: string } — incremental text
  • { type: "done", data: { answer } } — response complete
  • { type: "error", data: string } — on failure
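
On the client side, the four message types can be folded into a small state object. The sketch below is an assumption about how a consumer might do that; createChatCollector is a hypothetical helper, and the format of the question the client sends is not described in this section, so it is not shown.

```javascript
// Hypothetical client-side collector for the server's WebSocket messages.
function createChatCollector() {
  const state = { notebookId: null, text: '', answer: null, error: null, finished: false };
  return {
    state,
    // Accepts either a raw JSON string or an already parsed message object.
    handle(raw) {
      const msg = typeof raw === 'string' ? JSON.parse(raw) : raw;
      switch (msg.type) {
        case 'connected':
          state.notebookId = msg.notebookId;
          break;
        case 'chunk':
          state.text += msg.data;          // incremental text, appended in order
          break;
        case 'done':
          state.answer = msg.data.answer;  // authoritative full answer
          state.finished = true;
          break;
        case 'error':
          state.error = msg.data;
          state.finished = true;
          break;
        default:
          break;                           // ignore unknown message types
      }
      return state;
    },
  };
}
```

With a WebSocket client, each incoming message's payload would be passed to handle(); state.answer should be preferred over state.text once finished is true.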

waitForAiResponse — response completion detection

nlm.waitForAiResponse(page, prevCount, timeout) in src/nlm.js uses a 3-step approach:

  1. Wait for div.thinking-message / thinking-animation to disappear
  2. Wait for a new mat-card.to-user-message-card-content to appear (count > prevCount)
  3. Wait for the text to be stable for about 1.5s (no change across 4 consecutive 400ms polls)

This handles streaming responses that may still be updating after the spinner disappears.
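
Step 3 can be sketched as follows. This is not the body of nlm.waitForAiResponse: waitForStableText and its option names are hypothetical, and the sketch counts the first read as one observation, so how closely the window matches the real implementation is an assumption.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Resolve with the text once stablePolls consecutive reads are identical and
// non-empty. readText is assumed to wrap a page.evaluate() that returns the
// text of the newest AI response card.
async function waitForStableText(
  readText,
  { intervalMs = 400, stablePolls = 4, timeoutMs = 120_000 } = {},
) {
  const deadline = Date.now() + timeoutMs;
  let last = await readText();
  let unchanged = 1;                       // the first read is one observation
  while (unchanged < stablePolls) {
    if (Date.now() > deadline) throw new Error('Timed out waiting for stable text');
    await sleep(intervalMs);
    const current = await readText();
    if (current === last && current !== '') {
      unchanged += 1;
    } else {
      last = current;                      // text changed: restart the count
      unchanged = 1;
    }
  }
  return last;
}
```

Restarting the count on every change is what keeps a slow, still-streaming response from being cut off early.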

Browser session persistence

Chrome profile is stored at ./chrome-profile/. Google login cookies persist between server restarts. On startup, browser.isAuthenticated() navigates to https://notebooklm.google.com and checks if the URL stays there (not redirected to accounts.google.com).
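
The redirect check can be sketched like this. The helper names are hypothetical and the real browser.isAuthenticated() may differ, for example in the waitUntil option it uses; the sketch only illustrates comparing the final hostname after navigation.

```javascript
// True when the final URL is still on NotebookLM rather than a login page.
function looksAuthenticated(finalUrl) {
  let host;
  try {
    host = new URL(finalUrl).hostname;
  } catch {
    return false;                          // not a parseable URL
  }
  return host === 'notebooklm.google.com';
}

// page is a Puppeteer Page; 'networkidle2' is an assumed choice that lets the
// redirect to accounts.google.com finish before the URL is read.
async function isAuthenticated(page) {
  await page.goto('https://notebooklm.google.com', { waitUntil: 'networkidle2' });
  return looksAuthenticated(page.url());
}
```

Comparing the parsed hostname avoids a false positive when the login URL carries notebooklm.google.com inside its continue parameter.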

Important: Always stop the server with Ctrl+C (SIGINT) or a plain kill (SIGTERM). The graceful shutdown handler calls browser.close() to flush Chrome's cookie store. kill -9 (SIGKILL) cannot be caught, so the handler never runs and the session may be corrupted.
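
A shutdown handler of this kind might look like the sketch below. It is an assumption about the shape of the handler in src/server.js, not a copy of it; installShutdown and its injectable exit option are hypothetical and exist mainly so the logic can be exercised without killing the process.

```javascript
// Register SIGINT/SIGTERM handlers that close the browser before exiting.
// proc is normally the global process object; browser needs an async close().
function installShutdown(proc, browser, { exit = (code) => proc.exit(code) } = {}) {
  let closing = false;
  const shutdown = async (signal) => {
    if (closing) return;                   // ignore a second Ctrl+C while closing
    closing = true;
    try {
      await browser.close();               // lets Chrome flush its cookie store
      exit(0);
    } catch (err) {
      exit(1);
    }
  };
  for (const signal of ['SIGINT', 'SIGTERM']) {
    proc.on(signal, () => shutdown(signal));
  }
  return shutdown;
}
```

Typical use would be installShutdown(process, browser) once at startup. SIGKILL is deliberately absent from the list because Node.js cannot install a handler for it.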

File upload (addSourceFile) — xap uploader quirks

NotebookLM's "Tải tệp lên" button uses Google's internal xap scotty uploader ([xapscottyuploadertrigger] attribute). Key findings from 2026-06-16 debugging:

  1. Trigger button is in the INITIAL dialog — drop zone + "Tải tệp lên" button appear as soon as the add-source dialog opens. No need to click a secondary button to reveal the drop zone.

  2. Must use a trusted (CDP) click. trigger.click() inside page.evaluate() generates isTrusted: false, and the xap uploader silently ignores untrusted clicks. Use page.mouse.click(x, y), which Puppeteer dispatches through CDP, to generate trusted clicks.

  3. Two upload paths — depending on browser context, xap uploader may use either:

    • Native file chooser (input[type=file]) → caught by page.waitForFileChooser() + fileChooser.accept([path])
    • showOpenFilePicker() → override it BEFORE clicking: window.showOpenFilePicker = async () => [{ getFile: () => file, ... }]
  4. No "Chèn" button — after file upload, dialog auto-closes. No confirmation click needed.

  5. Wait for network idle after upload (the scotty upload finishes asynchronously).

// Pattern (from nlm.addSourceFile):
const chooser = page.waitForFileChooser({ timeout: 8_000 }).catch(() => null);
await page.mouse.click(triggerX, triggerY);           // trusted click
const fc = await chooser;
if (fc) await fc.accept([absPath]);                   // native file chooser path
// else showOpenFilePicker override handles it
await page.waitForNetworkIdle({ timeout: 60_000 });   // wait for scotty upload
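
The pattern above leaves open where triggerX and triggerY come from. One way to obtain them, sketched under the assumption that the trigger can be located by its [xapscottyuploadertrigger] attribute, is to click the centre of the element's bounding box. trustedClick is a hypothetical helper, not a function from src/nlm.js.

```javascript
// Click the centre of the first element matching selector using the mouse
// API, so the page receives a trusted event. page is a Puppeteer Page.
async function trustedClick(page, selector) {
  const handle = await page.$(selector);
  if (!handle) throw new Error(`Element not found: ${selector}`);
  const box = await handle.boundingBox();  // null when the element is not visible
  if (!box) throw new Error(`Element not visible: ${selector}`);
  const x = box.x + box.width / 2;
  const y = box.y + box.height / 2;
  await page.mouse.click(x, y);
  return { x, y };
}
```

For example, await trustedClick(page, '[xapscottyuploadertrigger]') would replace the page.mouse.click(triggerX, triggerY) line. If the dialog is still animating, the box can move between measurement and click, so waiting for the dialog to settle first is advisable.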

Debugging "add source button not found" error

The add-source button (button[aria-label="Thêm nguồn"]) only exists on notebooks the USER OWNS. Public/shared notebooks (shown with "Công khai" badge) don't have this button. Always test with a user-owned notebook.

Common tasks

Adding a new API endpoint

  1. Create or edit a route file in src/routes/
  2. Add automation logic to src/nlm.js
  3. Register the route in src/server.js with app.use()
  4. Wrap the nlm call in queue.add(() => nlm.myFn(), 'label')
  5. Add to the Swagger spec in src/swagger.js
  6. Update API.md
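
Steps 1 to 4 can be sketched as a route handler that routes its nlm call through the queue. Everything named here is illustrative: nlm.listSources, the 'listSources' label and the { ok, data } response shape are assumptions, not the project's actual API.

```javascript
// Build an Express-style handler. queue and nlm are injected so the handler
// can be exercised without a browser.
function makeListSourcesHandler(queue, nlm) {
  return async (req, res) => {
    try {
      // The nlm call goes through the queue, never directly.
      const data = await queue.add(() => nlm.listSources(req.params.id), 'listSources');
      res.json({ ok: true, data });
    } catch (err) {
      res.status(500).json({ ok: false, error: err.message });
    }
  };
}
```

In a route file this would be attached with something like router.get('/:id/sources', makeListSourcesHandler(queue, nlm)), and the router registered in src/server.js with app.use().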

Fixing a broken selector

  1. Start the server: npm start
  2. Hit the relevant /debug/* endpoint to see the live DOM (/debug/home, /debug/source-items/:id, /debug/add-url-flow/:id, etc.)
  3. Update src/selectors.js with the corrected selector
  4. If logic (not just selector) is wrong (e.g. timing, wrong element), update src/nlm.js
  5. Test via the actual API endpoint

Debugging "nút xác nhận không tìm thấy" ("confirm button not found") in add source flow

  • The Add Source dialog has a 2-stage flow. If you get the "nút xác nhận không tìm thấy" error, it likely means one of:
    1. The URL was typed into the WRONG textarea (the search box query-box-textarea)
    2. The dialog never transitioned to stage 2, so "Chèn" button never appeared
  • Use GET /debug/add-url-flow/:id to trace each step and see what happened
  • Key filter: exclude el.placeholder?.includes('Tìm nguồn') and el.className?.includes('query-box-textarea')
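
The key filter can be written as a predicate over a candidate element. This is a sketch rather than the exact check in src/nlm.js; isUrlInputCandidate is a hypothetical name, and the positive match reuses the aria-label and placeholder test from the addSourceUrl pattern earlier in this file.

```javascript
// Decide whether el is the stage-2 URL input rather than the always-present
// search textarea. el is a DOM element (or any object of the same shape).
function isUrlInputCandidate(el) {
  const placeholder = el.placeholder || '';
  const className = typeof el.className === 'string' ? el.className : '';
  const ariaLabel = (el.getAttribute && el.getAttribute('aria-label')) || '';

  // Exclusions: the stage-1 search box.
  if (placeholder.includes('Tìm nguồn')) return false;
  if (className.includes('query-box-textarea')) return false;

  // Positive match: labelled as a URL field or a "paste a link" placeholder.
  return ariaLabel.toLowerCase().includes('url') ||
    /liên kết|link|paste|http/i.test(placeholder);
}
```

In real use the predicate has to run in the page context, so it would be defined inside the page.evaluate() or page.waitForFunction() callback and combined with the el.offsetParent !== null visibility check.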

Debugging a "dialog not found" error

  • Use GET /debug/notebook-menu/:id to see what menu appears after hover+click
  • Use GET /debug/add-source-dialog/:id to see the full overlay HTML after clicking add source

Understanding queue state

GET /health returns { queue: { busy: boolean, pending: number } }. If busy is always true, a previous operation is stuck (possibly waiting for a DOM element that never appeared). Restart the server.

Environment variables

Variable      Default       Notes
PORT          3456          HTTP/WS port
HEADLESS      false         Set true for CI (may break Google login)
CHROME_PATH   auto-detect   Prefers system Chrome over bundled Chromium
API_KEY       (empty)       If set, all requests need x-api-key header
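
A minimal sketch of reading these variables with the defaults listed above. loadConfig and the returned field names are hypothetical; how src/server.js actually parses its environment is not shown in this file.

```javascript
// Read configuration from an env-like object (defaults to process.env).
function loadConfig(env = process.env) {
  const port = Number.parseInt(env.PORT ?? '3456', 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }
  return {
    port,
    // Only the literal string "true" (any case) enables headless mode.
    headless: String(env.HEADLESS ?? 'false').toLowerCase() === 'true',
    chromePath: env.CHROME_PATH || null,   // null means auto-detect
    apiKey: env.API_KEY || '',             // empty string means no key check
  };
}
```

Validating PORT at startup turns a typo into an immediate error instead of a server listening on an unexpected port.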

Running locally

npm install    # first time only
npm start      # starts at http://localhost:3456
npm run dev    # same but with --watch (auto-restarts on file change)

Swagger UI: http://localhost:3456/docs

Files NOT to modify carelessly

  • chrome-profile/ — Google session data. Do not delete. Do not gitignore the entire directory (the profile must persist).
  • src/queue.js — Changing the queue to allow concurrency will cause race conditions in Puppeteer.
  • src/nlm.js:waitForAiResponse — The stability detection logic is tuned for NotebookLM's streaming behaviour; don't simplify it.