---
name: notebooklm-api
description: Expert assistant for the notebooklm-api project, a Puppeteer-based REST/WebSocket server that automates Google NotebookLM. Use when working on browser automation, Puppeteer selectors, API routes, queue logic, or debugging DOM interactions with NotebookLM.
triggers:
  - "notebooklm"
  - "puppeteer"
  - "selector"
  - "notebook"
  - "add source"
  - "chat stream"
  - "browser automation"
---

# NotebookLM API: Project Skill

## What this project is

A Node.js server (`src/server.js`) that drives **Google NotebookLM** (`https://notebooklm.google.com`) through a real Chrome window via Puppeteer with the Stealth plugin. It exposes a REST + WebSocket API so external systems (n8n, Python scripts, ERP, chatbots) can interact with NotebookLM programmatically.

**Key constraint:** NotebookLM has no official API. Every operation (listing notebooks, adding sources, chatting) is done by simulating real user clicks and keystrokes in the browser. Google's Angular UI changes frequently, so DOM selectors can break and need updating.

## Architecture

```
src/
  server.js        # Express app, middleware, debug routes, graceful shutdown
  browser.js       # BrowserManager singleton (puppeteer-extra + stealth)
  nlm.js           # All NotebookLM automation logic (the core)
  queue.js         # AsyncQueue: serialises ALL Puppeteer ops (no concurrency)
  selectors.js     # CSS selectors confirmed from real DOM inspection
  swagger.js       # Swagger/OpenAPI spec
  routes/
    auth.js        # GET /api/auth/status, POST /api/auth/login
    notebooks.js   # CRUD for notebooks + sources
    chat-ws.js     # WebSocket streaming chat
```

## The queue: critical invariant

`src/queue.js` is a single-lane async queue. **Every route wraps its nlm call in `queue.add(...)`**. This prevents Puppeteer race conditions: only one browser action runs at a time. Never bypass the queue. If a new route is added, it MUST use the queue.

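
As a rough illustration of the single-lane idea, the sketch below chains each task onto the previous one so that at most one runs at a time. The class name `SerialQueue` and its `busy`/`pending` fields are assumptions made for this example; the real `src/queue.js` may be implemented differently.

```javascript
// Hypothetical single-lane queue (illustration only, not the project's queue.js).
class SerialQueue {
  constructor() {
    this.tail = Promise.resolve(); // settles when the last scheduled task settles
    this.pending = 0;              // tasks that are waiting or currently running
  }

  // Run `task` after every previously added task has settled.
  // `label` mirrors the documented call shape and is unused in this sketch.
  add(task, label = 'task') {
    this.pending += 1;
    const run = this.tail.then(() => task());
    // Swallow the error on the internal chain only, so one failed task does not
    // block later ones. The caller still sees the rejection through `run`.
    this.tail = run
      .catch(() => {})
      .finally(() => { this.pending -= 1; });
    return run;
  }

  get busy() {
    return this.pending > 0;
  }
}
```

Chaining on a single promise keeps the ordering guarantee without a worker loop: a task cannot start until the one before it has resolved or rejected.
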
## Selectors: how to update them

All selectors live in `src/selectors.js`. They were confirmed against the Vietnamese-language NotebookLM UI. When Google updates the UI and a selector breaks:

1. Use the debug endpoints in `server.js` to inspect the live DOM:
   - `GET /debug/home`: lists all visible buttons on the home page (find create/delete buttons)
   - `GET /debug/notebook-menu/:id`: hovers a card, clicks the "..." menu, returns menu items
   - `GET /debug/chat/:id`: lists all textarea/input elements on a notebook page
   - `GET /debug/sources/:id`: inspects the source panel DOM
   - `GET /debug/source-items/:id`: full HTML of source items (buttons, icons, spans)
   - `GET /debug/add-source-dialog/:id`: clicks "Add source" and snapshots the dialog
   - `GET /debug/add-url-flow/:id`: step-by-step trace of the full add-URL flow
   - `GET /debug/page-state/:id`: full overlay/dialog state before and after the click
   - `GET /debug/screenshot`: base64 PNG of the current Chrome window

2. Key known selectors (confirmed 2026-06-16):
   - Create notebook button: `button[aria-label="Tạo sổ ghi chú mới"], button.create-new-button`
   - Card menu button: `button[aria-label="Trình đơn thao tác trong dự án"]`
   - Delete menu item: find by text `/Xo[áa]/i` inside `.mat-mdc-menu-panel button.mat-mdc-menu-item`
   - Add source button: `.add-source-button`
   - Source items: `div.single-source-container` (NOT `.source-item-menu-button-visible`, which is a child)
   - Source title: `button.source-stretched-button[aria-label]` inside each container
   - Source type: text content of `mat-icon.source-item-source-icon`, or "url" if `img.favicon-icon` is present
   - Chat textarea: `textarea.query-box-input` (NOT the source search textarea)
   - AI response cards: `mat-card.to-user-message-card-content`
   - User message cards: `mat-card.from-user-message-card-content`
   - Thinking animation: `div.thinking-message, thinking-animation`
   - Dialogs: `mat-dialog-container` (multiple can exist; the emoji keyboard occupies one)

3. The UI language is **Vietnamese**, so button labels appear as "Trang web" (Website), "Văn bản đã sao chép" (Copied text), "Chèn" (Insert), "Tạo sổ ghi chú mới" (Create new notebook).

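
Matching menu items by label text is sensitive to how accented Vietnamese characters are encoded. The sketch below is one way to make such a match more tolerant. It assumes a helper named `findLabelIndex` (hypothetical) and a pattern that also accepts the spelling "Xóa" and decomposed accents. Those extra variants are an assumption about what the UI might render, not something confirmed in this document, which only lists `/Xo[áa]/i`.

```javascript
// Accepts "Xoá" and "Xoa" (covered by the documented pattern) plus "Xóa" (assumed variant).
const DELETE_LABEL = /x(?:\u00f3a|o\u00e1|oa)/i;

// NFC-normalise so a decomposed accent (o + U+0301) equals the precomposed letter,
// then collapse whitespace that Angular templates often leave around labels.
function normalizeLabel(text) {
  return String(text).normalize('NFC').replace(/\s+/g, ' ').trim();
}

// Index of the first label matching `pattern`, or -1 if none matches.
function findLabelIndex(labels, pattern) {
  return labels.findIndex((label) => pattern.test(normalizeLabel(label)));
}
```

Inside `page.evaluate`, the `labels` array would typically come from the `textContent` of the `.mat-mdc-menu-panel button.mat-mdc-menu-item` elements, and the returned index identifies which button to click.
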
## Dialog handling: the tricky part

The Add Source dialog now has a **2-stage flow**:

**Stage 1 (initial):** Shows source-type buttons + a search textarea (`placeholder="Tìm nguồn mới trên web"`, class `query-box-textarea`). This textarea is always present and must be EXCLUDED from URL/text input detection.

**Stage 2 (after clicking a type button):** The dialog transitions in-place to show:
- URL mode: a TEXTAREA with `aria-label="Nhập URL"` and `placeholder="Dán liên kết bất kỳ"`
- Text mode: a large textarea without the `query-box-textarea` class

**Submit button:** the `"Chèn"` button has `type="button"` (NOT `type="submit"`, which is used by the back/close buttons).

Pattern in `nlm.addSourceUrl()`: wait for the dialog to TRANSITION before looking for the URL input:

```javascript
// Click the "Trang web" (Website) button in the topmost dialog
await page.evaluate(() => {
  const d = [...document.querySelectorAll('mat-dialog-container')].pop();
  if (!d) return; // no dialog open, nothing to click
  const btn = [...d.querySelectorAll('button')].find(b => /Trang web/i.test(b.textContent));
  btn?.click();
});

// Wait for the URL input to appear (not the search textarea)
await page.waitForFunction(() => {
  const d = [...document.querySelectorAll('mat-dialog-container')].pop();
  if (!d) return false; // dialog not in the DOM yet, keep polling
  return [...d.querySelectorAll('input, textarea')].some(el =>
    el.offsetParent !== null &&
    (el.getAttribute('aria-label')?.toLowerCase().includes('url') ||
     /liên kết|link|paste|http/i.test(el.placeholder))
  );
}, { timeout: 10_000, polling: 200 });

// Click submit: match the "Chèn" text, NOT type="submit"
await page.evaluate(() => {
  const d = [...document.querySelectorAll('mat-dialog-container')].pop();
  if (!d) return; // dialog already closed
  const btn = [...d.querySelectorAll('button')].find(b =>
    b.offsetParent !== null && !b.disabled && /Ch[eè]n/i.test(b.textContent)
  );
  btn?.click();
});
```

## Chat streaming (WebSocket)

`src/routes/chat-ws.js` handles `WS /api/notebooks/:id/chat/stream`.

Streaming works by **polling the DOM** every 300ms during the AI response, diffing against the last known text, and emitting `{ type: "chunk", data: "..." }` messages. This is DOM polling, not true server-sent streaming.

Message types the server emits:
- `{ type: "connected", notebookId }`: on WS open
- `{ type: "chunk", data: string }`: incremental text
- `{ type: "done", data: { answer } }`: response complete
- `{ type: "error", data: string }`: on failure

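
The diff-and-emit step and the message types above can be sketched as two small pure functions. The names `nextChunk` and `applyMessage` are hypothetical, and the choice to resend the whole text when the DOM content is rewritten rather than appended is an assumption, not confirmed behaviour of `chat-ws.js`.

```javascript
// Server side: given the text seen on the previous poll and the current DOM text,
// return the part that should go into the next { type: "chunk" } message.
function nextChunk(prev, curr) {
  if (curr === prev) return '';                               // nothing new this poll
  if (curr.startsWith(prev)) return curr.slice(prev.length);  // text was appended
  return curr;                                                // text was rewritten (assumed policy: resend all)
}

// Client side: fold one server message into the accumulated state.
function applyMessage(state, msg) {
  switch (msg.type) {
    case 'connected':
      return { ...state, notebookId: msg.notebookId };
    case 'chunk':
      return { ...state, text: state.text + msg.data };
    case 'done':
      // The final answer replaces whatever was pieced together from chunks.
      return { ...state, text: msg.data.answer, done: true };
    case 'error':
      return { ...state, error: msg.data, done: true };
    default:
      return state; // ignore unknown message types
  }
}
```

A client would start from `{ notebookId: null, text: '', done: false, error: null }` and call `applyMessage` with each parsed WebSocket message.
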
## waitForAiResponse: response completion detection

`nlm.waitForAiResponse(page, prevCount, timeout)` in `src/nlm.js` uses a 3-step approach:
1. Wait for `div.thinking-message` / `thinking-animation` to disappear
2. Wait for a new `mat-card.to-user-message-card-content` to appear (count > prevCount)
3. Wait for the text to be **stable for about 1.5s** (unchanged across 4 consecutive 400ms polls)

This handles streaming responses that may still be updating after the spinner disappears.

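
Step 3 can be sketched as a small counter that resets whenever the text changes. This is a minimal illustration, assuming that "4 consecutive polls" means four identical non-empty readings in a row. `StabilityTracker`, `waitUntilStable`, and the `readText` callback are hypothetical names, and the real `waitForAiResponse` may count differently.

```javascript
// Tracks how many consecutive polls returned the same non-empty text.
class StabilityTracker {
  constructor(required = 4) {
    this.required = required;
    this.last = null;
    this.unchanged = 0;
  }

  // Feed one poll result. Returns true once `required` identical readings were seen in a row.
  observe(text) {
    if (text !== '' && text === this.last) {
      this.unchanged += 1;
    } else {
      this.last = text;
      this.unchanged = text === '' ? 0 : 1; // empty text never counts as stable
    }
    return this.unchanged >= this.required;
  }
}

// Poll `readText` (an async function returning the current response text)
// until it is stable or the timeout expires.
async function waitUntilStable(readText, { intervalMs = 400, required = 4, timeoutMs = 60_000 } = {}) {
  const tracker = new StabilityTracker(required);
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const text = await readText();
    if (tracker.observe(text)) return text;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error('Timed out waiting for a stable response');
}
```

With Puppeteer, `readText` would typically wrap a `page.evaluate` call that returns the text of the last `mat-card.to-user-message-card-content` element.
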
## Browser session persistence

Chrome profile is stored at `./chrome-profile/`. Google login cookies persist between server restarts. On startup, `browser.isAuthenticated()` navigates to `https://notebooklm.google.com` and checks if the URL stays there (not redirected to `accounts.google.com`).

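
The redirect check can be expressed as a comparison on the final URL's hostname. The helper name `isAuthenticatedUrl` is hypothetical, and the real `browser.isAuthenticated()` may check more than this.

```javascript
// True when the page ended up on NotebookLM itself rather than on a login redirect.
// Comparing the parsed hostname avoids a false positive from a sign-in URL that
// merely mentions notebooklm.google.com in its query string.
function isAuthenticatedUrl(finalUrl) {
  const { hostname } = new URL(finalUrl);
  return hostname === 'notebooklm.google.com';
}
```

After `await page.goto('https://notebooklm.google.com')`, the value to test would be `page.url()`.
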
**Important:** Always stop the server with `Ctrl+C` (SIGINT) or a plain `kill` (SIGTERM). The graceful shutdown handler then calls `browser.close()` to flush Chrome's cookie store. `kill -9` (SIGKILL) cannot be caught, so the handler never runs and the session may be corrupted.

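
A shutdown handler of this kind might look like the sketch below. `makeShutdown` is a hypothetical factory. The close and exit functions are injected so the logic can be exercised without a real browser, and the wiring in the trailing comments is an assumption about how `server.js` could use it.

```javascript
// Build a shutdown function that closes the browser once, then exits.
function makeShutdown(closeBrowser, exit) {
  let closing = null; // set on the first call so repeated signals do not close twice
  return function shutdown() {
    if (!closing) {
      closing = Promise.resolve()
        .then(() => closeBrowser())
        .then(() => exit(0), () => exit(1)); // non-zero exit code if closing failed
    }
    return closing;
  };
}

// Possible wiring (assumed; `browser` stands for the project's BrowserManager):
//   const shutdown = makeShutdown(() => browser.close(), (code) => process.exit(code));
//   process.on('SIGINT', shutdown);   // Ctrl+C
//   process.on('SIGTERM', shutdown);  // plain `kill`
```

Making the handler idempotent matters because an impatient second `Ctrl+C` would otherwise start a second `browser.close()` while the first is still flushing.
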
## File upload (`addSourceFile`): xap uploader quirks

NotebookLM's "Tải tệp lên" (Upload file) button uses Google's internal **xap scotty uploader** (`[xapscottyuploadertrigger]` attribute). Key findings from 2026-06-16 debugging:

1. **Trigger button is in the INITIAL dialog.** The drop zone and the "Tải tệp lên" button appear as soon as the add-source dialog opens. No need to click a secondary button to reveal the drop zone.

2. **Must use a trusted (CDP) click.** `trigger.click()` inside `page.evaluate()` produces an event with `isTrusted: false`, which the xap uploader silently ignores. Use `page.mouse.click(x, y)` (dispatched by Puppeteer over CDP) to generate a trusted click.

3. **Two upload paths.** Depending on browser context, the xap uploader may use either:
   - **Native file chooser** (`input[type=file]`) → caught by `page.waitForFileChooser()` + `fileChooser.accept([path])`
   - **`showOpenFilePicker()`** → override it BEFORE clicking: `window.showOpenFilePicker = async () => [{ getFile: () => file, ... }]`

4. **No "Chèn" button.** After file upload, the dialog auto-closes. No confirmation click needed.

5. **Wait for network idle** after upload (the scotty upload finishes asynchronously).

```javascript
// Pattern (from nlm.addSourceFile):
// Start listening BEFORE the click so the chooser event is not missed.
const chooser = page.waitForFileChooser({ timeout: 8_000 }).catch(() => null);
await page.mouse.click(triggerX, triggerY);     // trusted click
const fc = await chooser;
if (fc) await fc.accept([absPath]);             // native file chooser path
// else showOpenFilePicker override handles it
await page.waitForNetworkIdle({ timeout: 60_000 }); // wait for scotty upload
```

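
The pattern above takes `triggerX` and `triggerY` as given. One way to obtain them is to click the centre of the trigger element's bounding box, as in the sketch below. `centerOf` and `trustedClick` are hypothetical helpers; how `nlm.addSourceFile` actually computes the coordinates is not shown in this document.

```javascript
// Centre point of a bounding box shaped like Puppeteer's { x, y, width, height }.
function centerOf(box) {
  if (!box) throw new Error('Element has no bounding box (hidden or not rendered)');
  return { x: box.x + box.width / 2, y: box.y + box.height / 2 };
}

// Click the element matching `selector` through page.mouse, which produces a trusted event.
async function trustedClick(page, selector) {
  const handle = await page.$(selector);
  if (!handle) throw new Error(`No element matches ${selector}`);
  const { x, y } = centerOf(await handle.boundingBox());
  await page.mouse.click(x, y);
  return { x, y };
}
```

For the upload trigger this would be called as `trustedClick(page, '[xapscottyuploadertrigger]')` after the file-chooser listener has been set up.
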
### Debugging "add source button not found" error

The add-source button (`button[aria-label="Thêm nguồn"]`, "Add source") only exists on notebooks the USER OWNS. Public/shared notebooks (shown with a "Công khai", i.e. "Public", badge) don't have this button. Always test with a user-owned notebook.

## Common tasks

### Adding a new API endpoint
1. Create or edit a route file in `src/routes/`
2. Add automation logic to `src/nlm.js`
3. Register the route in `src/server.js` with `app.use()`
4. Wrap the nlm call in `queue.add(() => nlm.myFn(), 'label')`
5. Add to the Swagger spec in `src/swagger.js`
6. Update `API.md`

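
Steps 1 to 4 might come together as in the sketch below. Everything specific here is hypothetical: the endpoint path, the `nlm.getNotebookTitle` function, the response shape, and the `registerTitleRoute` name. The router, queue, and nlm module are passed in as arguments so the example has no dependencies; in the project they would be an Express router, the shared queue from `src/queue.js`, and `src/nlm.js`.

```javascript
// Hypothetical endpoint: GET /api/notebooks/:id/title
function registerTitleRoute(router, queue, nlm) {
  router.get('/api/notebooks/:id/title', async (req, res) => {
    try {
      // The nlm call goes through the queue so it never overlaps another browser action.
      const title = await queue.add(() => nlm.getNotebookTitle(req.params.id), 'get-title');
      res.json({ title });
    } catch (err) {
      res.status(500).json({ error: err.message });
    }
  });
}
```

The important part is that the handler never calls `nlm` directly; the only path to the browser is the function handed to `queue.add`.
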
### Fixing a broken selector
1. Start the server: `npm start`
2. Hit the relevant `/debug/*` endpoint to see the live DOM (`/debug/home`, `/debug/source-items/:id`, `/debug/add-url-flow/:id`, etc.)
3. Update `src/selectors.js` with the corrected selector
4. If the logic, not just the selector, is wrong (e.g. timing, or the wrong element is targeted), update `src/nlm.js`
5. Test via the actual API endpoint

### Debugging "nút xác nhận không tìm thấy" ("confirm button not found") in add source flow
- The Add Source dialog has a 2-stage flow. If you get "nút xác nhận không tìm thấy", it likely means one of:
  1. The URL was typed into the WRONG textarea (the search box `query-box-textarea`)
  2. The dialog never transitioned to stage 2, so the "Chèn" button never appeared
- Use `GET /debug/add-url-flow/:id` to trace each step and see what happened
- Key filter: exclude elements where `el.placeholder?.includes('Tìm nguồn')` or `el.className?.includes('query-box-textarea')`

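
The key filter can be combined with the stage-2 detection into a single predicate. The sketch below works on plain descriptor objects with `placeholder`, `className`, and `ariaLabel` fields, a shape chosen for this example so the logic can be checked outside a browser. `isUrlInput` is a hypothetical name.

```javascript
const SEARCH_PLACEHOLDER = 'Tìm nguồn'.normalize('NFC');

// Decide whether an input/textarea descriptor is the stage-2 URL field.
function isUrlInput(el) {
  const placeholder = String(el.placeholder || '').normalize('NFC');
  const className = String(el.className || '');
  const ariaLabel = String(el.ariaLabel || '').toLowerCase();

  // Exclude the always-present stage-1 search box first.
  if (placeholder.includes(SEARCH_PLACEHOLDER)) return false;
  if (className.includes('query-box-textarea')) return false;

  // Then accept anything that looks like the URL field.
  return ariaLabel.includes('url') || /liên kết|link|paste|http/i.test(placeholder);
}
```

In the page context the descriptors would be built from each visible `input, textarea` in the topmost `mat-dialog-container`, reading `el.getAttribute('aria-label')` for `ariaLabel`.
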
### Debugging a "dialog not found" error
- Use `GET /debug/notebook-menu/:id` to see what menu appears after hover+click
- Use `GET /debug/add-source-dialog/:id` to see the full overlay HTML after clicking add source

### Understanding queue state
`GET /health` returns `{ queue: { busy: boolean, pending: number } }`. If `busy` stays true across repeated checks, a previous operation is stuck (possibly waiting for a DOM element that never appeared). Restart the server.

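
A monitoring script could turn "stays true across repeated checks" into a simple heuristic over successive `/health` responses. The sketch below is an assumption-laden illustration: `looksStuck`, `sampleHealth`, the window of 5 samples, and the rule that `pending` must never drop are all choices made for this example. A long but healthy operation would also trip it, so the result is a hint, not proof.

```javascript
// Heuristic: the queue looks stuck if, over the last `windowSize` samples,
// it was busy every time and the pending count never went down.
function looksStuck(samples, windowSize = 5) {
  if (samples.length < windowSize) return false;
  const recent = samples.slice(-windowSize);
  const alwaysBusy = recent.every((s) => s.queue.busy === true);
  const neverDrained = recent.every(
    (s, i) => i === 0 || s.queue.pending >= recent[i - 1].queue.pending
  );
  return alwaysBusy && neverDrained;
}

// Fetch one sample. `fetchImpl` defaults to the global fetch (Node 18+).
async function sampleHealth(baseUrl, fetchImpl = fetch) {
  const res = await fetchImpl(`${baseUrl}/health`);
  return res.json();
}
```

Samples would be collected on a timer, for example every 30 seconds against `http://localhost:3456`, and passed to `looksStuck` as an array.
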
## Environment variables

| Variable | Default | Notes |
|---|---|---|
| `PORT` | `3456` | HTTP/WS port |
| `HEADLESS` | `false` | Set `true` for CI (may break Google login) |
| `CHROME_PATH` | auto-detect | Prefers system Chrome over bundled Chromium |
| `API_KEY` | _(empty)_ | If set, all requests need `x-api-key` header |

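
Reading these variables with their defaults could look like the sketch below. `loadConfig` and `isAuthorized` are hypothetical helpers; how `server.js` actually parses the environment and checks the key is not shown in this document.

```javascript
// Build a config object from environment variables, applying the documented defaults.
function loadConfig(env = process.env) {
  return {
    port: Number.parseInt(env.PORT || '3456', 10),
    headless: String(env.HEADLESS || 'false').toLowerCase() === 'true',
    chromePath: env.CHROME_PATH || null, // null means "auto-detect"
    apiKey: env.API_KEY || '',           // empty string disables the key check
  };
}

// Check the x-api-key header. Node lower-cases incoming header names,
// so the lookup key is lower-case as well.
function isAuthorized(headers, apiKey) {
  if (!apiKey) return true; // no key configured: every request is allowed
  return headers['x-api-key'] === apiKey;
}
```

An Express middleware would call `isAuthorized(req.headers, config.apiKey)` and respond with 401 when it returns false.
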
## Running locally

```bash
npm install       # first time only
npm start         # starts at http://localhost:3456
npm run dev       # same but with --watch (auto-restarts on file change)
```

Swagger UI: `http://localhost:3456/docs`

## Files NOT to modify carelessly

- `chrome-profile/`: Google session data. Do not delete. Do not gitignore the entire directory (the profile must persist).
- `src/queue.js`: Changing the queue to allow concurrency will cause race conditions in Puppeteer.
- `src/nlm.js:waitForAiResponse`: The stability detection logic is tuned for NotebookLM's streaming behaviour; don't simplify it.