Timber on Vercel: serverless /v1 API over Neon Postgres#3
Timber on Vercel: serverless /v1 API over Neon Postgres#3ItsmeBlackOps wants to merge 20 commits into
Conversation
… no .query method
📝 WalkthroughWalkthroughThis PR re-hosts Timber on Vercel with Neon Postgres, adding shared API libraries (auth, keyring, env, db, cursor, respond, validate, where, scope), SQL query modules for logs/stats/groupby/facets/events/jobs, endpoint handlers, project CRUD, healthz, retention cron, Logflare dual-write/proxy, REST log clients, tests, and migration documentation. ChangesVercel/Neon API implementation
Estimated code review effort: 4 (Complex) | ~75 minutes REST log clients and migration documentation
Estimated code review effort: 2 (Simple) | ~10 minutes Related PRs: None identified. Suggested labels: api, database, documentation Suggested reviewers: None identified. 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request implements the re-hosting of Timber on Vercel, transitioning the backend from MongoDB to Neon Postgres with serverless functions and introducing lightweight Python and Node.js logging clients. The review feedback highlights several critical issues: a wildcard rewrite rule in vercel.json that breaks API routing; potential query crashes in where.js due to unsafe numeric casting on JSONB fields; performance degradation from bypassing GIN indexes with the #>> operator; memory exhaustion risks in the retention cron due to bulk DELETE ... RETURNING 1; a type resolution error in stats.js with date_trunc; and socket leaks in the Python client from unclosed HTTP connections.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| "rewrites": [ | ||
| { "source": "/v1/:path*", "destination": "/api/v1/:path*" }, | ||
| { "source": "/healthz", "destination": "/api/healthz" }, | ||
| { "source": "/(.*)", "destination": "/index.html" } |
There was a problem hiding this comment.
The wildcard rewrite rule /(.*) is too broad and will match all requests that do not match the previous rules, including direct requests to /api/... (such as the retention cron at /api/cron/retention). This will rewrite those API requests to /index.html, completely breaking the retention cron and any direct API access.
As specified in the design document, the wildcard rewrite should explicitly exclude the /api path using a negative lookahead.
| { "source": "/(.*)", "destination": "/index.html" } | |
| { "source": "/((?!api/).*)", "destination": "/index.html" } |
| if (pathStr.length === 0) return fail(`unknown parameter "${name}"`); | ||
| if (!NUMERIC_RE.test(v)) return fail(`${name} requires a numeric value`); | ||
| const path = pathStr.split('.'); | ||
| clauses.push(`(data #>> ${push(path)})::numeric ${rangeOp} ${push(Number(v))}`); |
There was a problem hiding this comment.
Casting the JSONB value directly to ::numeric without validation is extremely risky. In a semi-structured log database, JSONB fields like data.status might occasionally contain non-numeric string values (e.g., "unknown", "failed", or "OK"). If even a single row contains a non-numeric value at the queried path, this cast will throw a Postgres error and crash the entire query.
To make this robust, use a safe regex-guarded comparison to ensure the value is numeric before casting.
| clauses.push(`(data #>> ${push(path)})::numeric ${rangeOp} ${push(Number(v))}`); | |
| clauses.push(`CASE WHEN (data #>> ${push(path)}) ~ '^-?[0-9]+(\\.[0-9]+)?$' THEN (data #>> ${push(path)})::numeric END ${rangeOp} ${push(Number(v))}`); |
| } else if (name.startsWith('ids.') && name.length > 'ids.'.length) { | ||
| const path = name.slice('ids.'.length).split('.'); | ||
| clauses.push(`ids #>> ${push(path)} = ${push(v)}`); | ||
| } else if (name.startsWith('data.') && name.length > 'data.'.length) { | ||
| const rangeOp = name.endsWith('__gte') ? '>=' : name.endsWith('__lte') ? '<=' : null; | ||
| if (rangeOp === null) { | ||
| const path = name.slice('data.'.length).split('.'); | ||
| clauses.push(`data #>> ${push(path)} = ${push(v)}`); |
There was a problem hiding this comment.
Using the #>> operator for ids.* and data.* equality filters completely bypasses the GIN indexes on the ids and data columns, forcing Postgres to perform full table scans on large datasets.
To utilize the GIN indexes efficiently, use the @> JSONB containment operator by constructing a nested JSONB object dynamically in JavaScript.
| } else if (name.startsWith('ids.') && name.length > 'ids.'.length) { | |
| const path = name.slice('ids.'.length).split('.'); | |
| clauses.push(`ids #>> ${push(path)} = ${push(v)}`); | |
| } else if (name.startsWith('data.') && name.length > 'data.'.length) { | |
| const rangeOp = name.endsWith('__gte') ? '>=' : name.endsWith('__lte') ? '<=' : null; | |
| if (rangeOp === null) { | |
| const path = name.slice('data.'.length).split('.'); | |
| clauses.push(`data #>> ${push(path)} = ${push(v)}`); | |
| } else if (name.startsWith('ids.') && name.length > 'ids.'.length) { | |
| const path = name.slice('ids.'.length).split('.'); | |
| let obj = v; | |
| for (let i = path.length - 1; i >= 0; i--) obj = { [path[i]]: obj }; | |
| clauses.push(`ids @> ${push(JSON.stringify(obj))}::jsonb`); | |
| } else if (name.startsWith('data.') && name.length > 'data.'.length) { | |
| const rangeOp = name.endsWith('__gte') ? '>=' : name.endsWith('__lte') ? '<=' : null; | |
| if (rangeOp === null) { | |
| const path = name.slice('data.'.length).split('.'); | |
| let obj = NUMERIC_RE.test(v) ? Number(v) : v === 'true' ? true : v === 'false' ? false : v === 'null' ? null : v; | |
| for (let i = path.length - 1; i >= 0; i--) obj = { [path[i]]: obj }; | |
| clauses.push(`data @> ${push(JSON.stringify(obj))}::jsonb`); |
| const rows = await db()`DELETE FROM events WHERE expires_at < now() RETURNING 1`; | ||
| return json(res, 200, { ok: true, deleted: rows.length }); |
There was a problem hiding this comment.
Using RETURNING 1 on a bulk DELETE query is highly dangerous for a daily retention cron. If millions of expired logs are deleted at once, Postgres will construct and transmit a massive result set of millions of rows, and Node.js will attempt to parse a huge JSON array. This can easily cause the serverless function to run out of memory (OOM) or exceed Vercel's execution/payload limits.
Since the exact count of deleted rows is not critical for the cron's success, it is much safer to omit the RETURNING clause entirely.
| const rows = await db()`DELETE FROM events WHERE expires_at < now() RETURNING 1`; | |
| return json(res, 200, { ok: true, deleted: rows.length }); | |
| await db()`DELETE FROM events WHERE expires_at < now()`; | |
| return json(res, 200, { ok: true }); |
| const lat = "coalesce(data->>'latencyMs', data->>'durationMs')"; | ||
| const text = ` | ||
| SELECT | ||
| date_trunc($1, received_at) AS bucket, |
There was a problem hiding this comment.
In Postgres, the first argument of date_trunc must be a non-parameterized string literal or explicitly cast. Passing $1 directly as a parameter will cause Postgres to fail with a type resolution error (function date_trunc(unknown, timestamp with time zone) is not unique) because the type of the parameter is ambiguous.
Explicitly cast the parameter to text to resolve this.
| date_trunc($1, received_at) AS bucket, | |
| date_trunc($1::text, received_at) AS bucket, |
| }, | ||
| method="POST", | ||
| ) | ||
| urllib.request.urlopen(req, timeout=5).read() |
There was a problem hiding this comment.
Calling urllib.request.urlopen without closing the returned response object can leak sockets and HTTP connections. In a long-running background thread, this can eventually exhaust system resources.
Use a with statement to ensure the connection is properly closed after reading.
| urllib.request.urlopen(req, timeout=5).read() | |
| with urllib.request.urlopen(req, timeout=5) as response: | |
| response.read() |
…e retry, bigint-safe cursor id, encodeCursor guard, README batch note)
There was a problem hiding this comment.
Actionable comments posted: 20
Note
Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.
🟡 Minor comments (2)
web/tsconfig.app.build.json-4-7 (1)
4-7: 📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick winEnable incremental builds here
tsBuildInfoFilealone won’t persist a useful cache. Addincremental: truehere (or switch this build path to a compositetsc -bflow) sotsc -p tsconfig.app.build.jsonavoids rechecking everything on every run.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@web/tsconfig.app.build.json` around lines 4 - 7, Enable incremental compilation in the build config so the existing tsBuildInfoFile cache is actually used. Update tsconfig.app.build.json to set incremental to true (or otherwise move this path to a tsc -b/composite setup) so the tsc -p tsconfig.app.build.json build can reuse state instead of rechecking everything. Focus on the build config keys around tsBuildInfoFile, include, and exclude.clients/README.md-21-23 (1)
21-23: 🗄️ Data Integrity & Integration | 🟡 Minor | ⚡ Quick winDocument the full ingest response.
/v1/logsreturns more thanaccepted; the current API also includesrejected(and fallback paths can addstore). Please update the example so client code doesn't hard-code a narrower contract.💡 Suggested edit
- The response is `201 {"accepted":N}`. + The response is `201 {"accepted":N,"rejected":0}` (and fallback variants may add `store`).🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@clients/README.md` around lines 21 - 23, The `/v1/logs` ingest response description in README is too narrow and only mentions `accepted`, which can mislead client code. Update the documented response in the README section for the ingest API to reflect the full contract, including `rejected` and any fallback-added `store` fields, so the example matches what the client actually receives.
🧹 Nitpick comments (1)
web/api/cron/retention.js (1)
22-27: 🚀 Performance & Scalability | 🔵 Trivial | ⚡ Quick winCount deleted rows in SQL instead of materializing every
RETURNING 1row.The batch is bounded, but this still transfers up to 10,000 rows per iteration just to compute
rows.length. A small CTE can keep the same behavior while returning a single count row.Suggested refactor
for (let i = 0; i < MAX_BATCHES; i++) { const rows = await db()` - DELETE FROM events - WHERE id IN (SELECT id FROM events WHERE expires_at < now() LIMIT ${BATCH}) - RETURNING 1`; - deleted += rows.length; - if (rows.length < BATCH) break; + WITH doomed AS ( + SELECT id + FROM events + WHERE expires_at < now() + LIMIT ${BATCH} + ), deleted_rows AS ( + DELETE FROM events e + USING doomed d + WHERE e.id = d.id + RETURNING 1 + ) + SELECT count(*)::int AS n FROM deleted_rows`; + deleted += rows[0].n; + if (rows[0].n < BATCH) break; }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@web/api/cron/retention.js` around lines 22 - 27, The retention batch delete in the cron job is materializing every returned row just to compute a count. Update the delete query in the retention handler to use a CTE or similar SQL construct that returns a single deleted-row count instead of `RETURNING 1`, and adjust the `deleted` accumulation and batch-termination check in the same flow to use that count while preserving the existing `BATCH` behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@clients/timber_client.py`:
- Around line 74-97: The TimberClient.flush method is dropping buffered events
before the POST succeeds, and close only flushes once, so logs can be lost on
shutdown or transient HTTP failures. Update flush to only remove items from _buf
after urllib.request.urlopen succeeds, and on any exception leave the batch
queued for retry instead of discarding it; then make close in TimberClient
trigger repeated flushing until the buffer is empty or the queue can no longer
be drained.
In `@clients/timber-client.js`:
- Around line 39-64: The shutdown path in createLogger/flush only sends one
MAX_BATCH chunk and marks it gone even if delivery fails, so buffered events can
be lost and HTTP error responses are ignored. Update flush() so it checks the
fetch response and only removes items from buf after a successful 2xx send, and
make close() drain the queue by repeatedly calling flush() until buf is empty.
Keep the behavior in clients/timber-client.js and preserve the existing
non-crashing error handling.
In `@docs/superpowers/plans/2026-06-30-logflare-dual-write.md`:
- Around line 397-440: normalizeRow() is not preserving the Timber log item
shape because it drops _id and expiresAt, even though handler returns these
items as interchangeable with /v1/logs. Update normalizeRow() to carry through
_id and expiresAt from the upstream row when present, alongside the existing
metadata mappings, so the Console can consume both sources without special
cases.
- Around line 492-499: The curl example in the log sending section uses an
invalid top-level batch wrapper, so update the example under the test event
instructions to send either a single envelope or a raw JSON array as accepted by
/v1/logs. Locate the example around the send-test-event snippet and replace the
request body shape so it matches the endpoint contract used elsewhere in this
plan.
- Around line 237-273: The ingest path still waits for Logflare because
`Promise.allSettled` blocks until both promises settle, so update `ingest` in
`web/api/v1/logs.js` to start `forwardToLogflare` in the background and await
only the Neon insert from `db()(text, params)` before returning. Keep
`buildInsert`, `validateBatch`, and `requireWrite` flow intact, but make
Logflare best-effort by not awaiting its promise and handling any rejection
separately so it never delays the response.
In `@docs/superpowers/specs/2026-06-22-timber-on-vercel-design.md`:
- Around line 189-193: Update the AI Usage and user/grouping query descriptions
in the specs doc to match the shipped client contract: use
inputTokens/outputTokens instead of promptTokens/completionTokens, and reference
ids.userId or data.userId instead of user_id. Make the same field-name alignment
in the related “By User” and “By Service” metrics text so the documented SQL
matches what the client helpers emit, using the AI Usage section and its
adjacent metrics as the unique anchors.
In `@web/api/_lib/cursor.js`:
- Around line 10-16: decodeCursor currently accepts digit-only timestamps that
can still produce an invalid Date, which later breaks buildWhere’s toISOString()
path instead of being rejected as an invalid cursor. Update decodeCursor in
cursor.js to validate the parsed ms value by checking the Date is valid before
returning { receivedAt, id }, and return null for out-of-range timestamps so
buildWhere can keep handling it as invalid cursor.
In `@web/api/_lib/logflare.js`:
- Around line 31-35: The Logflare write in the fetch call should be bounded so a
slow or hung request does not keep the ingest path open. Update the Logflare
request logic in the helper that sends the POST to LOGFLARE_URL to use an abort
timeout, and ensure the caller degrades cleanly to false when the timeout is
hit. Keep the change localized to the Logflare send function and preserve the
existing success/failure behavior for the Promise.allSettled flow.
In `@web/api/_lib/respond.js`:
- Around line 21-28: `readJson()` is still vulnerable to hanging on aborted
uploads because it only resolves on `end` or `error`. Update the request-body
promise in `respond.js` to also listen for `close` on the `IncomingMessage` and
resolve immediately when the connection is disconnected, alongside the existing
`req.on('data')`, `req.on('end')`, and `req.on('error')` handling.
In `@web/api/_lib/sql/events.js`:
- Around line 6-13: The events query validation in parseEventsQuery is rejecting
scoped requests because KNOWN_PARAMS only allows app, but web/api/v1/events.js
forwards a resolved project param for scoped /v1/events calls. Update
parseEventsQuery to accept project as a valid parameter and preserve the
existing app handling so scoped requests pass validation while still returning
the resolved app value.
In `@web/api/_lib/sql/facets.js`:
- Around line 7-17: Allow the internal project query param to pass validation in
parseFacetsQuery, since facets scope resolution already consumes it before
parsing. Update the KNOWN_PARAMS whitelist in facets.js to include project so
project-scoped requests do not fail with unknown parameter, while keeping the
existing handling in parseFacetsQuery and related facets handler flow unchanged.
In `@web/api/_lib/sql/groupby.js`:
- Line 16: Treat project as an internal query param in the groupby parser by
adding it to OWN_PARAMS in parseGroupByQuery so it is not forwarded into
buildWhere. This should be done in the groupby query handling logic used by
parseGroupByQuery and any callers like web/api/v1/groupby.js, ensuring scoped
requests with project resolve correctly instead of failing validation.
In `@web/api/_lib/sql/jobs.js`:
- Around line 9-19: The jobs query validation in parseJobsQuery still rejects
the project parameter because the handler is passing the original search params
through unchanged. Update the jobs API flow so the project key is removed or
ignored before parseJobsQuery validates OWN_PARAMS, and make sure the existing
resolveScope/app filtering path remains the source of truth for project-scoped
requests.
In `@web/api/_lib/sql/stats.js`:
- Line 12: The numeric guard in NUM_RE is too restrictive and rejects valid
exponent-form numbers like 1e-7, causing the casting logic in the stats path to
drop those values. Update the regex used by the numeric check in stats.js so it
accepts exponent notation, and make sure the casts in the stats aggregation path
continue to flow through the existing guard. Also add a regression test in
stats.test.js covering an exponent-form metric value such as costUsd.
In `@web/api/_lib/validate.js`:
- Around line 106-113: The truncation check in validate() is using JSON string
length instead of byte size, so multibyte payloads can slip past the
maxDataBytes cap. Update the logic around serialized/raw.data to measure the
UTF-8 byte length of the serialized JSON before comparing against maxDataBytes,
and use that same byte count for _originalBytes in the truncated object. Keep
the existing truncation flow and symbols like serialized, TRUNCATED_HEAD_CHARS,
and value.data, but ensure the byte-based measurement is used consistently.
In `@web/api/_lib/where.js`:
- Around line 19-23: parseDateValue currently accepts digit-only values via new
Date(Number(v)) even when the result is Invalid Date, which later breaks
whereQuery/toISOString handling. Update parseDateValue to validate the
constructed Date before returning it and treat out-of-range epoch inputs as
null, so invalid from/to values are rejected as validation errors instead of
reaching the later ISO conversion path.
- Around line 191-201: The numeric range handling in where.js for the
data.*__gte/__lte branch casts `(data #>> ...)::numeric` directly, which can
fail on rows where the JSON value is non-numeric. Update the clause-building
logic in the `name.startsWith('data.')` path so `buildWhere` only applies the
numeric comparison when the extracted JSON text is safely numeric, and otherwise
routes the condition through a guarded check or equivalent validation before
casting. Keep the fix localized to the `rangeOp` / `pathStr` branch so
`validateEnvelope()` can still accept arbitrary `data` without a bad row
aborting the query.
In `@web/api/healthz.js`:
- Around line 10-12: The /healthz handler is doing an expensive full-table count
on events, so replace the db() query in the health check with a cheap
reachability probe and avoid deriving flushedTotal from count(*). Update the
healthz logic to use the existing healthz path with a lightweight query, and
source flushedTotal from an approximate or cached metric in the same health
response instead of scanning the full events table.
- Around line 17-22: The /healthz handler is always returning HTTP 200 from the
json(res, 200, ...) response even when db() fails, so update the health check
logic to derive the status code from the DB probe result. In the healthz
endpoint, ensure the db() call is wrapped/checked so that a failed probe returns
a non-200 failure status while keeping the existing response shape (ok, wal,
flusher, mongo) in the healthz response.
In `@web/api/v1/logflare.js`:
- Around line 57-68: The Logflare forwarding path in the GET handler is using
raw query params without enforcing project scope, so it can query another
project by changing app/env. Update the request flow to derive the forwarded
scope through resolveScope(...) before calling buildEndpointUrl, and use that
resolved scope instead of trusting parseParams(req.url) values. Keep the
existing read check and config validation, but ensure the forwarded app/env
always come from the scoped resolution used by GET /v1/logs.
---
Minor comments:
In `@clients/README.md`:
- Around line 21-23: The `/v1/logs` ingest response description in README is too
narrow and only mentions `accepted`, which can mislead client code. Update the
documented response in the README section for the ingest API to reflect the full
contract, including `rejected` and any fallback-added `store` fields, so the
example matches what the client actually receives.
In `@web/tsconfig.app.build.json`:
- Around line 4-7: Enable incremental compilation in the build config so the
existing tsBuildInfoFile cache is actually used. Update tsconfig.app.build.json
to set incremental to true (or otherwise move this path to a tsc -b/composite
setup) so the tsc -p tsconfig.app.build.json build can reuse state instead of
rechecking everything. Focus on the build config keys around tsBuildInfoFile,
include, and exclude.
---
Nitpick comments:
In `@web/api/cron/retention.js`:
- Around line 22-27: The retention batch delete in the cron job is materializing
every returned row just to compute a count. Update the delete query in the
retention handler to use a CTE or similar SQL construct that returns a single
deleted-row count instead of `RETURNING 1`, and adjust the `deleted`
accumulation and batch-termination check in the same flow to use that count
while preserving the existing `BATCH` behavior.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 631d29fd-5251-4de3-91ef-cac23d70a7b0
⛔ Files ignored due to path filters (1)
web/package-lock.jsonis excluded by!**/package-lock.json
📒 Files selected for processing (45)
clients/README.mdclients/timber-client.jsclients/timber_client.pydocs/superpowers/plans/2026-06-22-timber-on-vercel.mddocs/superpowers/plans/2026-06-30-logflare-dual-write.mddocs/superpowers/specs/2026-06-22-timber-on-vercel-design.mdtest/api/facets-events-jobs.test.jstest/api/groupby.test.jstest/api/ingest.test.jstest/api/lib.test.jstest/api/logflare.test.jstest/api/projects.test.jstest/api/stats.test.jstest/api/where.test.jsweb/api/_lib/auth.jsweb/api/_lib/cursor.jsweb/api/_lib/db.jsweb/api/_lib/env.jsweb/api/_lib/ingest.jsweb/api/_lib/keyring.jsweb/api/_lib/logflare.jsweb/api/_lib/projects.jsweb/api/_lib/respond.jsweb/api/_lib/scope.jsweb/api/_lib/sql/events.jsweb/api/_lib/sql/facets.jsweb/api/_lib/sql/groupby.jsweb/api/_lib/sql/jobs.jsweb/api/_lib/sql/logs.jsweb/api/_lib/sql/stats.jsweb/api/_lib/validate.jsweb/api/_lib/where.jsweb/api/cron/retention.jsweb/api/healthz.jsweb/api/v1/events.jsweb/api/v1/facets.jsweb/api/v1/groupby.jsweb/api/v1/jobs.jsweb/api/v1/logflare.jsweb/api/v1/logs.jsweb/api/v1/projects.jsweb/api/v1/stats.jsweb/package.jsonweb/tsconfig.app.build.jsonweb/vercel.json
| def flush(self): | ||
| with self._lock: | ||
| if not self._buf: | ||
| return | ||
| batch = self._buf[:_MAX_BATCH] | ||
| self._buf = self._buf[_MAX_BATCH:] | ||
| try: | ||
| req = urllib.request.Request( | ||
| self.url + "/v1/logs", | ||
| data=json.dumps(batch).encode("utf-8"), | ||
| headers={ | ||
| "content-type": "application/json", | ||
| "authorization": "Bearer " + self.key, | ||
| }, | ||
| method="POST", | ||
| ) | ||
| urllib.request.urlopen(req, timeout=5).read() | ||
| except Exception: | ||
| pass # never raise into the caller | ||
|
|
||
| def close(self): | ||
| self._stop.set() | ||
| self.flush() | ||
|
|
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | 🏗️ Heavy lift
Don't drop the queued batch on shutdown or HTTP errors.
flush() removes up to 50 events before the request completes, and close() only calls it once. That means buffered logs are lost on shutdown, and any failed POST is discarded instead of retried.
🔧 Suggested fix
def flush(self):
with self._lock:
if not self._buf:
return
batch = self._buf[:_MAX_BATCH]
self._buf = self._buf[_MAX_BATCH:]
try:
req = urllib.request.Request(
self.url + "/v1/logs",
data=json.dumps(batch).encode("utf-8"),
@@
)
urllib.request.urlopen(req, timeout=5).read()
except Exception:
- pass # never raise into the caller
+ with self._lock:
+ self._buf = batch + self._buf🧰 Tools
🪛 ast-grep (0.44.0)
[warning] 89-89: Request-controlled URL passed to urlopen; validate against an allowlist to prevent SSRF.
Context: urllib.request.urlopen(req, timeout=5)
Note: [CWE-918] Server-Side Request Forgery (SSRF).
(urlopen-unsanitized-data)
[info] 82-82: use jsonify instead of json.dumps for JSON output
Context: json.dumps(batch)
Note: [CWE-116] Improper Encoding or Escaping of Output.
(use-jsonify)
🪛 Ruff (0.15.20)
[error] 81-89: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
[error] 90-90: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
[error] 91-92: try-except-pass detected, consider logging the exception
(S110)
[warning] 91-91: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@clients/timber_client.py` around lines 74 - 97, The TimberClient.flush method
is dropping buffered events before the POST succeeds, and close only flushes
once, so logs can be lost on shutdown or transient HTTP failures. Update flush
to only remove items from _buf after urllib.request.urlopen succeeds, and on any
exception leave the batch queued for retry instead of discarding it; then make
close in TimberClient trigger repeated flushing until the buffer is empty or the
queue can no longer be drained.
| async function flush() { | ||
| if (!url || !key || buf.length === 0) return; | ||
| const batch = buf.slice(0, MAX_BATCH); | ||
| buf = buf.slice(MAX_BATCH); | ||
| try { | ||
| await fetch(url + '/v1/logs', { | ||
| method: 'POST', | ||
| headers: { 'content-type': 'application/json', authorization: 'Bearer ' + key }, | ||
| body: JSON.stringify(batch), | ||
| }); | ||
| } catch { | ||
| // swallow: a logging failure must never crash the app | ||
| } | ||
| } | ||
|
|
||
| const timer = url && key ? setInterval(flush, intervalMs) : null; | ||
| if (timer && typeof timer.unref === 'function') timer.unref(); | ||
|
|
||
| return { | ||
| log, | ||
| flush, | ||
| close: () => { | ||
| if (timer) clearInterval(timer); | ||
| return flush(); | ||
| }, | ||
| }; |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | 🏗️ Heavy lift
Drain the queue before shutdown and only dequeue on success.
close() only triggers one flush(), so anything beyond the first 50 buffered events is lost on shutdown. Also, fetch() won't throw on 4xx/5xx, so bad responses are silently dropped today.
🔧 Suggested fix
async function flush() {
if (!url || !key || buf.length === 0) return;
const batch = buf.slice(0, MAX_BATCH);
buf = buf.slice(MAX_BATCH);
try {
- await fetch(url + '/v1/logs', {
+ const res = await fetch(url + '/v1/logs', {
method: 'POST',
headers: { 'content-type': 'application/json', authorization: 'Bearer ' + key },
body: JSON.stringify(batch),
});
+ if (!res.ok) throw new Error(`log upload failed: ${res.status}`);
} catch {
- // swallow: a logging failure must never crash the app
+ buf = batch.concat(buf);
}
}🧰 Tools
🪛 ast-grep (0.44.0)
[warning] 53-53: Avoid using the initial state variable in setState
Context: setInterval(flush, intervalMs)
Note: [CWE-710] Improper Adherence to Coding Standards. Security best practice.
(setstate-same-var)
[error] 53-53: React's useState should not be directly called
Context: setInterval(flush, intervalMs)
Note: [CWE-710] Improper Adherence to Coding Standards. Security best practice.
(usestate-direct-usage)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@clients/timber-client.js` around lines 39 - 64, The shutdown path in
createLogger/flush only sends one MAX_BATCH chunk and marks it gone even if
delivery fails, so buffered events can be lost and HTTP error responses are
ignored. Update flush() so it checks the fetch response and only removes items
from buf after a successful 2xx send, and make close() drain the queue by
repeatedly calling flush() until buf is empty. Keep the behavior in
clients/timber-client.js and preserve the existing non-crashing error handling.
| The key constraint: Logflare forward must not block the response. We use `Promise.allSettled` so Neon insert and Logflare forward race in parallel - the response is sent as soon as Neon confirms, regardless of Logflare. | ||
|
|
||
| Actually, we want Neon to be the source of truth - respond after Neon succeeds. Logflare is best-effort. So: start both in parallel, await Neon, ignore Logflare result. | ||
|
|
||
| - [ ] **Step 1: Modify `web/api/v1/logs.js` ingest function** | ||
|
|
||
| Replace the current `ingest` function (lines 13-25): | ||
|
|
||
| ```js | ||
| import { json, badRequest, methodNotAllowed, readJson } from '../_lib/respond.js'; | ||
| import { requireWrite, requireRead } from '../_lib/auth.js'; | ||
| import { validateBatch } from '../_lib/validate.js'; | ||
| import { ttlDays, limits } from '../_lib/env.js'; | ||
| import { buildInsert } from '../_lib/ingest.js'; | ||
| import { db } from '../_lib/db.js'; | ||
| import { parseLogsQuery, runLogs } from '../_lib/sql/logs.js'; | ||
| import { resolveScope } from '../_lib/projects.js'; | ||
| import { forwardToLogflare } from '../_lib/logflare.js'; | ||
|
|
||
| async function ingest(req, res) { | ||
| const principal = requireWrite(req, res); | ||
| if (!principal) return; | ||
| const body = await readJson(req); | ||
| if (!body.ok) return badRequest(res, 'invalid or empty JSON body'); | ||
| const v = validateBatch(body.value, limits()); | ||
| if (!v.ok) { | ||
| return json(res, v.status ?? 400, v.index != null ? { error: v.error, index: v.index } : { error: v.error }); | ||
| } | ||
| const { text, params } = buildInsert(v.events, principal, ttlDays(), new Date()); | ||
| // Run Neon insert and Logflare forward in parallel. Only Neon result matters | ||
| // for the response; Logflare is best-effort and must never block the caller. | ||
| const [neonResult] = await Promise.allSettled([ | ||
| db()(text, params), | ||
| forwardToLogflare(v.events, principal), | ||
| ]); | ||
| if (neonResult.status === 'rejected') throw neonResult.reason; | ||
| return json(res, 201, { accepted: v.events.length, rejected: 0 }); |
There was a problem hiding this comment.
🚀 Performance & Scalability | 🟠 Major | ⚡ Quick win
Don't wait on Logflare in the ingest path.
Promise.allSettled() still blocks until the Logflare request settles, so the client pays that latency despite the "fire-and-forget" intent. Kick off Logflare without awaiting it, and await only the Neon write before responding.
⚙️ Suggested fix
- const [neonResult] = await Promise.allSettled([
- db()(text, params),
- forwardToLogflare(v.events, principal),
- ]);
- if (neonResult.status === 'rejected') throw neonResult.reason;
+ void forwardToLogflare(v.events, principal);
+ await db()(text, params);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| The key constraint: Logflare forward must not block the response. We use `Promise.allSettled` so Neon insert and Logflare forward race in parallel - the response is sent as soon as Neon confirms, regardless of Logflare. | |
| Actually, we want Neon to be the source of truth - respond after Neon succeeds. Logflare is best-effort. So: start both in parallel, await Neon, ignore Logflare result. | |
| - [ ] **Step 1: Modify `web/api/v1/logs.js` ingest function** | |
| Replace the current `ingest` function (lines 13-25): | |
| ```js | |
| import { json, badRequest, methodNotAllowed, readJson } from '../_lib/respond.js'; | |
| import { requireWrite, requireRead } from '../_lib/auth.js'; | |
| import { validateBatch } from '../_lib/validate.js'; | |
| import { ttlDays, limits } from '../_lib/env.js'; | |
| import { buildInsert } from '../_lib/ingest.js'; | |
| import { db } from '../_lib/db.js'; | |
| import { parseLogsQuery, runLogs } from '../_lib/sql/logs.js'; | |
| import { resolveScope } from '../_lib/projects.js'; | |
| import { forwardToLogflare } from '../_lib/logflare.js'; | |
| async function ingest(req, res) { | |
| const principal = requireWrite(req, res); | |
| if (!principal) return; | |
| const body = await readJson(req); | |
| if (!body.ok) return badRequest(res, 'invalid or empty JSON body'); | |
| const v = validateBatch(body.value, limits()); | |
| if (!v.ok) { | |
| return json(res, v.status ?? 400, v.index != null ? { error: v.error, index: v.index } : { error: v.error }); | |
| } | |
| const { text, params } = buildInsert(v.events, principal, ttlDays(), new Date()); | |
| // Run Neon insert and Logflare forward in parallel. Only Neon result matters | |
| // for the response; Logflare is best-effort and must never block the caller. | |
| const [neonResult] = await Promise.allSettled([ | |
| db()(text, params), | |
| forwardToLogflare(v.events, principal), | |
| ]); | |
| if (neonResult.status === 'rejected') throw neonResult.reason; | |
| return json(res, 201, { accepted: v.events.length, rejected: 0 }); | |
| import { json, badRequest, methodNotAllowed, readJson } from '../_lib/respond.js'; | |
| import { requireWrite, requireRead } from '../_lib/auth.js'; | |
| import { validateBatch } from '../_lib/validate.js'; | |
| import { ttlDays, limits } from '../_lib/env.js'; | |
| import { buildInsert } from '../_lib/ingest.js'; | |
| import { db } from '../_lib/db.js'; | |
| import { parseLogsQuery, runLogs } from '../_lib/sql/logs.js'; | |
| import { resolveScope } from '../_lib/projects.js'; | |
| import { forwardToLogflare } from '../_lib/logflare.js'; | |
| async function ingest(req, res) { | |
| const principal = requireWrite(req, res); | |
| if (!principal) return; | |
| const body = await readJson(req); | |
| if (!body.ok) return badRequest(res, 'invalid or empty JSON body'); | |
| const v = validateBatch(body.value, limits()); | |
| if (!v.ok) { | |
| return json(res, v.status ?? 400, v.index != null ? { error: v.error, index: v.index } : { error: v.error }); | |
| } | |
| const { text, params } = buildInsert(v.events, principal, ttlDays(), new Date()); | |
| // Run Neon insert and Logflare forward in parallel. Only Neon result matters | |
| // for the response; Logflare is best-effort and must never block the caller. | |
| void forwardToLogflare(v.events, principal); | |
| await db()(text, params); | |
| return json(res, 201, { accepted: v.events.length, rejected: 0 }); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/superpowers/plans/2026-06-30-logflare-dual-write.md` around lines 237 -
273, The ingest path still waits for Logflare because `Promise.allSettled`
blocks until both promises settle, so update `ingest` in `web/api/v1/logs.js` to
start `forwardToLogflare` in the background and await only the Neon insert from
`db()(text, params)` before returning. Keep `buildInsert`, `validateBatch`, and
`requireWrite` flow intact, but make Logflare best-effort by not awaiting its
promise and handling any rejection separately so it never delays the response.
| function normalizeRow(row) { | ||
| const doc = { | ||
| app: row['metadata.app'] ?? row.app ?? '', | ||
| env: row['metadata.env'] ?? row.env ?? '', | ||
| event: row.event_message ?? row.event ?? '', | ||
| level: row['metadata.level'] ?? row.level ?? '', | ||
| receivedAt: row.timestamp ?? row['metadata.receivedAt'] ?? null, | ||
| }; | ||
| if (row['metadata.ts']) doc.ts = row['metadata.ts']; | ||
| if (row['metadata.message']) doc.message = row['metadata.message']; | ||
| if (row['metadata.ids']) doc.ids = row['metadata.ids']; | ||
| if (row['metadata.data']) doc.data = row['metadata.data']; | ||
| return doc; | ||
| } | ||
|
|
||
| export default async function handler(req, res) { | ||
| if (req.method !== 'GET') return methodNotAllowed(res, 'GET'); | ||
| if (!requireRead(req, res)) return; | ||
|
|
||
| const { endpointId, apiKey } = logflareConfig(); | ||
| if (!endpointId || !apiKey) { | ||
| return json(res, 503, { error: 'logflare not configured' }); | ||
| } | ||
|
|
||
| const sp = new url("https://github.com/ItsmeBlackOps/timber/pull/req.url,%20'http://localhost'").searchParams; | ||
| const parsed = parseParams(sp); | ||
| if (!parsed.ok) return badRequest(res, parsed.error); | ||
|
|
||
| const url = buildEndpointurl("https://github.com/ItsmeBlackOps/timber/pull/endpointId,%20parsed.value"); | ||
| let upstream; | ||
| try { | ||
| upstream = await fetch(url, { headers: { 'X-API-KEY': apiKey } }); | ||
| } catch (err) { | ||
| return json(res, 502, { error: 'logflare unreachable' }); | ||
| } | ||
|
|
||
| if (!upstream.ok) { | ||
| const text = await upstream.text().catch(() => ''); | ||
| return json(res, 502, { error: 'logflare error', detail: text }); | ||
| } | ||
|
|
||
| const body = await upstream.json(); | ||
| const items = (body.result ?? []).map(normalizeRow); | ||
| return json(res, 200, { items, nextCursor: null, source: 'logflare' }); |
There was a problem hiding this comment.
🗄️ Data Integrity & Integration | 🟠 Major | 🏗️ Heavy lift
Preserve the Timber log shape here.
normalizeRow() drops _id and expiresAt, but this endpoint claims to return the same item shape as /v1/logs. If the Console consumes both sources interchangeably, this will force a special-case path or break pagination/rendering.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/superpowers/plans/2026-06-30-logflare-dual-write.md` around lines 397 -
440, normalizeRow() is not preserving the Timber log item shape because it drops
_id and expiresAt, even though handler returns these items as interchangeable
with /v1/logs. Update normalizeRow() to carry through _id and expiresAt from the
upstream row when present, alongside the existing metadata mappings, so the
Console can consume both sources without special cases.
| Send a test event (replace URL and key with your values): | ||
|
|
||
| ```bash | ||
| curl -s -X POST https://timber-console.vercel.app/v1/logs \ | ||
| -H "Authorization: Bearer YOUR_WRITE_KEY" \ | ||
| -H "Content-Type: application/json" \ | ||
| -d '{"batch":[{"event":"logflare.test","level":"info","data":{"hello":"world"}}]}' | ||
| ``` |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
Send a raw event array here, not a batch wrapper.
/v1/logs accepts either a single envelope or a JSON array. This curl example will be rejected because batch is an unknown top-level key.
🧪 Suggested fix
- -d '{"batch":[{"event":"logflare.test","level":"info","data":{"hello":"world"}}]}'
+ -d '[{"event":"logflare.test","level":"info","data":{"hello":"world"}}]'📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| Send a test event (replace URL and key with your values): | |
| ```bash | |
| curl -s -X POST https://timber-console.vercel.app/v1/logs \ | |
| -H "Authorization: Bearer YOUR_WRITE_KEY" \ | |
| -H "Content-Type: application/json" \ | |
| -d '{"batch":[{"event":"logflare.test","level":"info","data":{"hello":"world"}}]}' | |
| ``` | |
| Send a test event (replace URL and key with your values): | |
🧰 Tools
🪛 Betterleaks (1.6.0)
[high] 495-496: Discovered a potential authorization token provided in a curl command header, which could compromise the curl accessed resource.
(curl-auth-header)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/superpowers/plans/2026-06-30-logflare-dual-write.md` around lines 492 -
499, The curl example in the log sending section uses an invalid top-level batch
wrapper, so update the example under the test event instructions to send either
a single envelope or a raw JSON array as accepted by /v1/logs. Locate the
example around the send-test-event snippet and replace the request body shape so
it matches the endpoint contract used elsewhere in this plan.
| function parseDateValue(v) { | ||
| if (EPOCH_MS_RE.test(v)) return new Date(Number(v)); | ||
| const t = Date.parse(v); | ||
| return Number.isNaN(t) ? null : new Date(t); | ||
| } |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
Treat out-of-range epoch values as invalid input.
new Date(Number(v)) can return Invalid Date for huge digit-only from/to values, but parseDateValue() currently treats that as success. The later toISOString() calls on Lines 211-212 then throw, so from=99999999999999999999 becomes a 500 instead of a validation error.
Suggested fix
function parseDateValue(v) {
- if (EPOCH_MS_RE.test(v)) return new Date(Number(v));
+ if (EPOCH_MS_RE.test(v)) {
+ const d = new Date(Number(v));
+ return Number.isNaN(d.getTime()) ? null : d;
+ }
const t = Date.parse(v);
return Number.isNaN(t) ? null : new Date(t);
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| function parseDateValue(v) { | |
| if (EPOCH_MS_RE.test(v)) return new Date(Number(v)); | |
| const t = Date.parse(v); | |
| return Number.isNaN(t) ? null : new Date(t); | |
| } | |
| function parseDateValue(v) { | |
| if (EPOCH_MS_RE.test(v)) { | |
| const d = new Date(Number(v)); | |
| return Number.isNaN(d.getTime()) ? null : d; | |
| } | |
| const t = Date.parse(v); | |
| return Number.isNaN(t) ? null : new Date(t); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@web/api/_lib/where.js` around lines 19 - 23, parseDateValue currently accepts
digit-only values via new Date(Number(v)) even when the result is Invalid Date,
which later breaks whereQuery/toISOString handling. Update parseDateValue to
validate the constructed Date before returning it and treat out-of-range epoch
inputs as null, so invalid from/to values are rejected as validation errors
instead of reaching the later ISO conversion path.
| } else if (name.startsWith('data.') && name.length > 'data.'.length) { | ||
| const rangeOp = name.endsWith('__gte') ? '>=' : name.endsWith('__lte') ? '<=' : null; | ||
| if (rangeOp === null) { | ||
| const path = name.slice('data.'.length).split('.'); | ||
| clauses.push(`data #>> ${push(path)} = ${push(v)}`); | ||
| } else { | ||
| const pathStr = name.slice('data.'.length, -'__gte'.length); | ||
| if (pathStr.length === 0) return fail(`unknown parameter "${name}"`); | ||
| if (!NUMERIC_RE.test(v)) return fail(`${name} requires a numeric value`); | ||
| const path = pathStr.split('.'); | ||
| clauses.push(`(data #>> ${push(path)})::numeric ${rangeOp} ${push(Number(v))}`); |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
Guard the JSON numeric cast before applying __gte/__lte.
(data #>> ... )::numeric will abort the whole query if any row stores that path as non-numeric text. Since validateEnvelope() accepts arbitrary data objects, one bad event can turn a user filter into a 500.
Suggested fix
} else {
const pathStr = name.slice('data.'.length, -'__gte'.length);
if (pathStr.length === 0) return fail(`unknown parameter "${name}"`);
if (!NUMERIC_RE.test(v)) return fail(`${name} requires a numeric value`);
const path = pathStr.split('.');
- clauses.push(`(data #>> ${push(path)})::numeric ${rangeOp} ${push(Number(v))}`);
+ const pathRef = push(path);
+ const valueRef = push(Number(v));
+ clauses.push(
+ `CASE
+ WHEN (data #>> ${pathRef}) ~ '^-?\\d+(\\.\\d+)?$'
+ THEN (data #>> ${pathRef})::numeric ${rangeOp} ${valueRef}
+ ELSE false
+ END`
+ );
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| } else if (name.startsWith('data.') && name.length > 'data.'.length) { | |
| const rangeOp = name.endsWith('__gte') ? '>=' : name.endsWith('__lte') ? '<=' : null; | |
| if (rangeOp === null) { | |
| const path = name.slice('data.'.length).split('.'); | |
| clauses.push(`data #>> ${push(path)} = ${push(v)}`); | |
| } else { | |
| const pathStr = name.slice('data.'.length, -'__gte'.length); | |
| if (pathStr.length === 0) return fail(`unknown parameter "${name}"`); | |
| if (!NUMERIC_RE.test(v)) return fail(`${name} requires a numeric value`); | |
| const path = pathStr.split('.'); | |
| clauses.push(`(data #>> ${push(path)})::numeric ${rangeOp} ${push(Number(v))}`); | |
| } else if (name.startsWith('data.') && name.length > 'data.'.length) { | |
| const rangeOp = name.endsWith('__gte') ? '>=' : name.endsWith('__lte') ? '<=' : null; | |
| if (rangeOp === null) { | |
| const path = name.slice('data.'.length).split('.'); | |
| clauses.push(`data #>> ${push(path)} = ${push(v)}`); | |
| } else { | |
| const pathStr = name.slice('data.'.length, -'__gte'.length); | |
| if (pathStr.length === 0) return fail(`unknown parameter "${name}"`); | |
| if (!NUMERIC_RE.test(v)) return fail(`${name} requires a numeric value`); | |
| const path = pathStr.split('.'); | |
| const pathRef = push(path); | |
| const valueRef = push(Number(v)); | |
| clauses.push( | |
| `CASE | |
| WHEN (data #>> ${pathRef}) ~ '^-?\\d+(\\.\\d+)?$' | |
| THEN (data #>> ${pathRef})::numeric ${rangeOp} ${valueRef} | |
| ELSE false | |
| END` | |
| ); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@web/api/_lib/where.js` around lines 191 - 201, The numeric range handling in
where.js for the data.*__gte/__lte branch casts `(data #>> ...)::numeric`
directly, which can fail on rows where the JSON value is non-numeric. Update the
clause-building logic in the `name.startsWith('data.')` path so `buildWhere`
only applies the numeric comparison when the extracted JSON text is safely
numeric, and otherwise routes the condition through a guarded check or
equivalent validation before casting. Keep the fix localized to the `rangeOp` /
`pathStr` branch so `validateEnvelope()` can still accept arbitrary `data`
without a bad row aborting the query.
| try { | ||
| const rows = await db()`SELECT count(*)::int AS n FROM events`; | ||
| count = rows[0].n; |
There was a problem hiding this comment.
🚀 Performance & Scalability | 🟠 Major | ⚡ Quick win
Avoid count(*) on the full events table in /healthz.
This makes every health probe grow with table size, so the endpoint gets slower and more expensive as retention grows. Use a cheap reachability probe here, and source flushedTotal from an approximate or cached metric instead.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@web/api/healthz.js` around lines 10 - 12, The /healthz handler is doing an
expensive full-table count on events, so replace the db() query in the health
check with a cheap reachability probe and avoid deriving flushedTotal from
count(*). Update the healthz logic to use the existing healthz path with a
lightweight query, and source flushedTotal from an approximate or cached metric
in the same health response instead of scanning the full events table.
| return json(res, 200, { | ||
| ok: connected, | ||
| wal: { totalBytes: 0, backlogBytes: 0, overBudget: false }, | ||
| flusher: { running: true, caughtUp: true, flushedTotal: count, lastError: null }, | ||
| mongo: { connected }, | ||
| }); |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
Return a failure status when the DB probe fails.
/healthz still emits HTTP 200 when db() throws, so status-code-based monitors will treat an unusable API instance as healthy.
Suggested fix
- return json(res, 200, {
+ return json(res, connected ? 200 : 503, {
ok: connected,
wal: { totalBytes: 0, backlogBytes: 0, overBudget: false },
flusher: { running: true, caughtUp: true, flushedTotal: count, lastError: null },
mongo: { connected },
});📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| return json(res, 200, { | |
| ok: connected, | |
| wal: { totalBytes: 0, backlogBytes: 0, overBudget: false }, | |
| flusher: { running: true, caughtUp: true, flushedTotal: count, lastError: null }, | |
| mongo: { connected }, | |
| }); | |
| return json(res, connected ? 200 : 503, { | |
| ok: connected, | |
| wal: { totalBytes: 0, backlogBytes: 0, overBudget: false }, | |
| flusher: { running: true, caughtUp: true, flushedTotal: count, lastError: null }, | |
| mongo: { connected }, | |
| }); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@web/api/healthz.js` around lines 17 - 22, The /healthz handler is always
returning HTTP 200 from the json(res, 200, ...) response even when db() fails,
so update the health check logic to derive the status code from the DB probe
result. In the healthz endpoint, ensure the db() call is wrapped/checked so that
a failed probe returns a non-200 failure status while keeping the existing
response shape (ok, wal, flusher, mongo) in the healthz response.
| if (!requireRead(req, res)) return; | ||
|
|
||
| const { endpointId, apiKey } = logflareConfig(); | ||
| if (!endpointId || !apiKey) { | ||
| return json(res, 503, { error: 'logflare not configured' }); | ||
| } | ||
|
|
||
| const sp = new url("https://github.com/ItsmeBlackOps/timber/pull/req.url,%20'http://localhost'").searchParams; | ||
| const parsed = parseParams(sp); | ||
| if (!parsed.ok) return badRequest(res, parsed.error); | ||
|
|
||
| const url = buildEndpointurl("https://github.com/ItsmeBlackOps/timber/pull/endpointId,%20parsed.value"); |
There was a problem hiding this comment.
🔒 Security & Privacy | 🟠 Major | ⚡ Quick win
Apply project scope here before forwarding app/env.
This endpoint skips the resolveScope gate that GET /v1/logs uses, so any valid read key can change ?app= / ?env= and query the shared Logflare endpoint for another project. Please derive the forwarded scope from resolveScope(...) instead of trusting raw query params.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@web/api/v1/logflare.js` around lines 57 - 68, The Logflare forwarding path in
the GET handler is using raw query params without enforcing project scope, so it
can query another project by changing app/env. Update the request flow to derive
the forwarded scope through resolveScope(...) before calling buildEndpointUrl,
and use that resolved scope instead of trusting parseParams(req.url) values.
Keep the existing read check and config validation, but ensure the forwarded
app/env always come from the scoped resolution used by GET /v1/logs.
Rebuilds Timber as a fully Vercel-hosted log service: the existing React Console plus the
/v1/*API as Vercel serverless functions backed by Neon serverless Postgres. Replaces the Docker + self-hosted MongoDB deployment for the hosted use case (that artifact stays in the repo).What is included
POST /v1/logsingest (single or batch) into Neon, language-agnostic REST (no SDK).GET /v1/logs(filter + keyset pagination),/v1/stats(date_trunc buckets, percentile_cont latency, error rate, AI cost/token sums),/v1/groupby,/v1/facets,/v1/events,/v1/jobs./v1/projectsCRUD + per-project scoping; all six lenses (Errors, AI Usage, By User, By Service, Slow Operations, Cron and Jobs)./healthzadapted to Postgres; daily retention via Vercel Cron (level-based TTL)./v1/*contract).clients/).How it works
parse*Queryported verbatim fromsrc/query/*; only the Mongo pipeline became SQL, preserving the response shapes inweb/src/lib/types.ts.@neondatabase/serverless), stateless per query; functions inweb/api/.Verification
Notes
DATABASE_URL,TIMBER_KEYS,CRON_SECRET.web/tsconfig.app.build.json(app-only) so tests are excluded from the deployed typecheck.Spec:
docs/superpowers/specs/2026-06-22-timber-on-vercel-design.mdPlan:
docs/superpowers/plans/2026-06-22-timber-on-vercel.mdUpdate: Logflare dual-write + fallback (Neon at capacity)
Neon hit its hard 512 MB project size limit (error 53100) and was rejecting all inserts, so ingest was returning 500s and losing writes. This adds Logflare as a parallel store and a fallback path.
What changed
POST /v1/logsnow dual-writes each batch to Neon and Logflare in parallel (Promise.allSettled). Neon stays primary.201 {store:"logflare"}. Only when both fail does it error. This stops the 500s and prevents log loss while Neon is full.forwardToLogflarereports success (checksres.ok, returns a boolean);buildLogflarePayloadis a pure mapper (event name tomessage, everything else tometadata).GET /v1/logflareproxies filter queries (same param surface as/v1/logs) to a Logflare Endpoint and normalizes rows to the/v1/logsshape. Returns a clean503untilLOGFLARE_ENDPOINT_IDis set (Logflare Endpoints are still in beta for this account).LOGFLARE_SOURCE_ID,LOGFLARE_API_KEY,LOGFLARE_ENDPOINT_ID(vialogflareConfig()).Verification
201 {store:"logflare"}with Neon full;GET /v1/logflarereturns clean 503;GET /v1/logsstill reads existing Neon data.Follow-up (not in this PR)
dailyDashboardlogs (skipinfo-leveldb.querypolling; thenotification_outboxpoll alone is ~9M events/month) and/or freeing Neon space.Plan:
docs/superpowers/plans/2026-06-30-logflare-dual-write.mdSummary by CodeRabbit
New Features
Bug Fixes