Tags: triggerdotdev/trigger.dev

re2-test-snapshot-cancel-rc3

This commit was signed with the committer’s verified signature.
myftija Saadi Myftija
fix: suffix retried instance create names to dodge stale registrations

A failed create can leave its instance name registered gateway/fcrun-side
until async cleanup runs, so a same-name retry can 409 against our own
residue (observed: tap-EBUSY 500 at 18:29Z followed by 409 name_conflict
on the retry 2.7s later, costing the full redrive anyway). Give retry
attempts a deterministic -rN suffix; attempt 1 keeps the unsuffixed name
so the non-retry path is unchanged. The suffixed name flows into both the
instance name and TRIGGER_RUNNER_ID from the same variable - every
downstream flow (suspend scheduling, snapshot dispatch, cancel guards,
run-engine fields) treats it as one opaque self-reported token, and
restored VMs already carry deterministic name suffixes.

Temporary measure (TRI-10293): the proper fix is gateway-side cleanup of
failed-create registrations.
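
The naming rule described above can be sketched as follows. This is a minimal illustration, not the actual supervisor code: `instanceNameForAttempt` and `buildCreateParams` are hypothetical helpers, and mapping attempt N directly to `-rN` is an assumption (the message only says retries get a deterministic `-rN` suffix).

```typescript
// Hypothetical helper: derive the instance name for a given create attempt.
// Attempt 1 keeps the base name so the non-retry path is unchanged.
// Assumption: attempt N (N >= 2) maps to the suffix "-rN".
function instanceNameForAttempt(baseName: string, attempt: number): string {
  if (!Number.isInteger(attempt) || attempt < 1) {
    throw new RangeError(`attempt must be a positive integer, got ${attempt}`);
  }
  return attempt === 1 ? baseName : `${baseName}-r${attempt}`;
}

// Hypothetical request shape: the point is that the instance name and
// TRIGGER_RUNNER_ID are populated from the same variable.
function buildCreateParams(baseName: string, attempt: number) {
  const name = instanceNameForAttempt(baseName, attempt);
  return { name, env: { TRIGGER_RUNNER_ID: name } };
}
```

Because the suffix depends only on the base name and the attempt number, a retry never reuses a name that a failed earlier attempt may still hold.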

re2-test-snapshot-cancel-rc2

This commit was signed with the committer’s verified signature.
myftija Saadi Myftija
fix: retry transient instance create failures instead of abandoning the run

ComputeWorkloadManager.create swallows gateway errors by design, so a
cold start that fails placement (e.g. a netns slot with a busy tap, a
full node disk) silently abandons the dequeued run until the run
engine's PENDING_EXECUTING timeout redrives it minutes later. These
failures are transient per placement - redriven runs virtually always
succeed - so retry the create up to 3 times with short backoff before
giving up. Gateway 5xx and network-level fetch failures are retried;
4xx responses (won't heal) and timeouts (the instance may still be
provisioning) are not.
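
A rough sketch of that retry policy, under stated assumptions: the `CreateOutcome` type, `isRetryable`, and `createWithRetry` are hypothetical names, "up to 3 times" is read as three attempts in total, and the backoff values are placeholders rather than the real ones.

```typescript
// Hypothetical classification of a single create attempt.
type CreateOutcome =
  | { kind: "ok" }
  | { kind: "http"; status: number }
  | { kind: "network" } // fetch failed before any response arrived
  | { kind: "timeout" }; // the instance may still be provisioning

// Gateway 5xx and network-level failures are retried; 4xx and timeouts are not.
function isRetryable(outcome: CreateOutcome): boolean {
  switch (outcome.kind) {
    case "http":
      return outcome.status >= 500 && outcome.status <= 599;
    case "network":
      return true;
    case "timeout":
    case "ok":
      return false;
  }
}

// Assumption: 3 attempts total with a short linear backoff (values are illustrative).
async function createWithRetry(
  attemptCreate: (attempt: number) => Promise<CreateOutcome>,
  maxAttempts = 3,
  backoffMs = 250,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<CreateOutcome> {
  let last: CreateOutcome = { kind: "network" };
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    last = await attemptCreate(attempt);
    if (!isRetryable(last)) return last; // success, 4xx, or timeout: stop here
    if (attempt < maxAttempts) await sleep(backoffMs * attempt);
  }
  return last; // retries exhausted; the run engine's timeout redrive still applies
}
```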

re2-test-snapshot-cancel-rc1

This commit was signed with the committer’s verified signature.
myftija Saadi Myftija
fix: cancel pending delayed snapshots when the run completes or disconnects

The compute suspend flow delays snapshots by snapshotDelayMs to avoid
wasted work on short-lived waitpoints, with the intent that a run which
continues before the delay expires cancels the pending snapshot. But the
only cancel() call site is the /continue workload action, which runners
only invoke when restoring from an already-taken snapshot - so a pending
snapshot is never actually cancelled (zero snapshot.canceled events in
prod). When a run resumes and completes within the delay window, the
stale snapshot fires anyway and fcrun pauses the VM for ~6-13s while its
controller is mid warm-start long-poll. The frozen guest can't fire its
abort timer or send a FIN, so firestarter keeps the connection claimable
past the client deadline and dispatches runs into it - each one a ~300s
stall (TRI-10293).

Cancel the pending snapshot when the attempt completes and when the run
socket disconnects. Genuine waitpoint suspensions keep the runner socket
connected and the attempt incomplete, so neither hook cancels a snapshot
that is still wanted. Cancellation is guarded by runnerId so a stale
duplicate runner for a reassigned run can't cancel the new runner's
pending snapshot.
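
The runnerId guard can be illustrated with a small scheduler. This is a sketch only: `DelayedSnapshotScheduler` and its method names are hypothetical, and the real suspend flow has more state than a single timer per run.

```typescript
type PendingSnapshot = { runnerId: string; timer: ReturnType<typeof setTimeout> };

// Hypothetical scheduler: one pending delayed snapshot per run.
class DelayedSnapshotScheduler {
  private readonly pending = new Map<string, PendingSnapshot>();
  private readonly delayMs: number;
  private readonly takeSnapshot: (runId: string, runnerId: string) => void;

  constructor(delayMs: number, takeSnapshot: (runId: string, runnerId: string) => void) {
    this.delayMs = delayMs;
    this.takeSnapshot = takeSnapshot;
  }

  // Called when a run suspends: the snapshot fires only after delayMs.
  schedule(runId: string, runnerId: string): void {
    this.clear(runId);
    const timer = setTimeout(() => {
      this.pending.delete(runId);
      this.takeSnapshot(runId, runnerId);
    }, this.delayMs);
    this.pending.set(runId, { runnerId, timer });
  }

  // Called from both hooks (attempt completed, run socket disconnected).
  // Returns true only when the caller owns the pending snapshot, so a stale
  // duplicate runner cannot cancel the new runner's snapshot.
  cancel(runId: string, runnerId: string): boolean {
    const entry = this.pending.get(runId);
    if (!entry || entry.runnerId !== runnerId) return false;
    this.clear(runId);
    return true;
  }

  private clear(runId: string): void {
    const entry = this.pending.get(runId);
    if (entry) {
      clearTimeout(entry.timer);
      this.pending.delete(runId);
    }
  }
}
```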

re2-test-client-deqeueue-metrics-2

This commit was signed with the committer’s verified signature.
myftija Saadi Myftija
feat(supervisor): 60s dequeue latency bucket to bracket the retry-exhausted error envelope

re2-test-client-deqeueue-metrics

This commit was signed with the committer’s verified signature.
myftija Saadi Myftija
chore: add changeset for dequeue latency histogram

re2-prod-client-deqeueue-metrics

This commit was created on GitHub.com and signed with GitHub’s verified signature.
feat(supervisor): publish client-side dequeue API latency as a Prometheus histogram (#3887)

The supervisor's dequeue round-trip time (`POST
/engine/v1/worker-actions/dequeue`) was measured but only flowed into
wide events and OTel span attributes — there was no Prometheus series,
so latency percentiles and error rates weren't queryable. This adds
`queue_consumer_pool_dequeue_duration_seconds` (histogram, label
`outcome=success|empty|error`) to the existing consumer-pool metrics,
scraped automatically by the existing ServiceMonitors on
queue-raider/schedule-raider/supervisor.

- Records every dequeue call, including failed ones, which previously
emitted no timing at all
- The pool's shared `ConsumerPoolMetrics` instance is injected into each
consumer (mirrors the `BackpressureMetrics` → `BackpressureMonitor`
wiring)
- Buckets extend to 30s because `wrapZodFetch` retries internally (5
attempts, ≥7.5s backoff before a retryable error surfaces)
- Existing `dequeueResponseMs` wide-event/span behavior unchanged
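
To make the recording behaviour concrete, here is a dependency-free sketch of a cumulative histogram with an `outcome` label. It does not use the real metrics classes: `DurationHistogram` and `timed` are hypothetical, and the bucket boundaries a caller passes in are illustrative (the source only states that they extend to 30s).

```typescript
type Outcome = "success" | "empty" | "error";

// Minimal Prometheus-style histogram: buckets are cumulative, so one
// observation increments every bucket whose upper bound is >= the value.
class DurationHistogram {
  private readonly upperBounds: number[];
  private readonly counts = new Map<Outcome, number[]>();

  constructor(upperBounds: number[]) {
    this.upperBounds = upperBounds.slice().sort((a, b) => a - b);
  }

  observe(outcome: Outcome, seconds: number): void {
    const row = this.rowFor(outcome);
    for (let i = 0; i < this.upperBounds.length; i++) {
      if (seconds <= this.upperBounds[i]) row[i] += 1;
    }
    row[this.upperBounds.length] += 1; // the implicit +Inf bucket
  }

  bucketCount(outcome: Outcome, le: number): number {
    const i = le === Infinity ? this.upperBounds.length : this.upperBounds.indexOf(le);
    if (i < 0) throw new RangeError(`no bucket with le=${le}`);
    return this.rowFor(outcome)[i];
  }

  private rowFor(outcome: Outcome): number[] {
    let row = this.counts.get(outcome);
    if (!row) {
      row = new Array<number>(this.upperBounds.length + 1).fill(0);
      this.counts.set(outcome, row);
    }
    return row;
  }
}

// Records every call, including failed ones: the outcome defaults to "error"
// and is only overwritten when the wrapped call resolves.
async function timed<T>(
  hist: DurationHistogram,
  classify: (result: T) => Outcome,
  fn: () => Promise<T>,
): Promise<T> {
  const start = Date.now();
  let outcome: Outcome = "error";
  try {
    const result = await fn();
    outcome = classify(result);
    return result;
  } finally {
    hist.observe(outcome, (Date.now() - start) / 1000);
  }
}
```

The `try`/`finally` shape is what guarantees a timing sample for calls that throw, which is the gap the commit message describes.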

v4.5.0-rc.5

This commit was created on GitHub.com and signed with GitHub’s verified signature.
chore: release v4.5.0-rc.5 (#3808)

## Summary
1 new feature, 8 improvements, 1 bug fix.

## Highlights

- Add optional `shouldPauseScaling` to the supervisor consumer pool
scaling options to freeze scale-up while it returns true (scale-down
stays allowed).
([#3836](#3836))
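
As an illustration of the intended semantics, the sketch below freezes scale-up while `shouldPauseScaling` returns true and still lets the pool shrink. Only the option name comes from the changelog entry; `ScalingOptions`, `min`, `max`, and `nextConsumerCount` are hypothetical.

```typescript
// Hypothetical options shape; only shouldPauseScaling is named in the changelog.
interface ScalingOptions {
  min: number;
  max: number;
  shouldPauseScaling?: () => boolean;
}

// Decide the next consumer count from the current count and the desired count.
function nextConsumerCount(current: number, desired: number, opts: ScalingOptions): number {
  const clamped = Math.min(opts.max, Math.max(opts.min, desired));
  const isScaleUp = clamped > current;
  if (isScaleUp && opts.shouldPauseScaling?.() === true) {
    return current; // scale-up is frozen while the predicate holds
  }
  return clamped; // scale-down (or no change) is always allowed
}
```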

## Improvements
- The MCP server no longer tells the AI agent to wait for a run to
complete after every `trigger_task` call. Waiting is now opt-in: the
agent only waits when you ask it to (for example "trigger and then wait
for it to finish"). This avoids burning tokens polling runs you didn't
need to block on and keeps responses clearer.
([#3838](#3838))
- Update the bundled OpenTelemetry packages to their latest releases
(`@opentelemetry/sdk-node` 0.218.0, `@opentelemetry/core` 2.7.1,
`@opentelemetry/host-metrics` 0.38.3).
([#3810](#3810))
- `envvars.upload` now accepts an optional `isSecret` flag, letting you
create the imported variables as secret (redacted) environment
variables. When omitted, variables default to non-secret.
([#3809](#3809))
- Offload large trigger payloads to object storage before sending the
trigger API request. The SDK uploads packets at or above the existing
128KB limit and sends an `application/store` pointer instead of
embedding large JSON in the request body. `TriggerTaskRequestBody` now
validates that `application/store` payloads are non-empty storage paths.
([#3785](#3785))
- Make mollifier buffer and drainer internals configurable.
`MollifierBuffer` now accepts `ackGraceTtlSeconds`,
`maxRetriesPerRequest`, `reconnectStepMs`, and `reconnectMaxMs` options,
and `MollifierDrainer` accepts `maxBackoffMs` and `backoffFloorMs`. All
default to their previous hardcoded values, so existing behaviour is
unchanged.
([#3822](#3822))
- `MollifierDrainer` accepts a `drainBatchSize` option (default 1) that
controls how many entries are popped per env per tick — in-flight
handlers remain capped by the global `concurrency`. `MollifierBuffer`
also gains `getDrainingCount()` / `listStaleDraining()`, backed by a new
`mollifier:draining` ZSET maintained atomically with
pop/ack/fail/requeue (observability-only).
([#3797](#3797))
- Adds AI SDK 7 support. The `ai` peer range now includes v7, and the
`chat.agent` / chat surfaces work against v7's ESM-only build. On v7,
install `@ai-sdk/otel` alongside `ai` and the SDK registers it for you
so `experimental_telemetry` spans keep flowing into your run traces (v7
stopped emitting them from `ai` core). v5 and v6 keep working unchanged.
([#3833](#3833))
- `useTriggerChatTransport` now recovers when restored session state
points at a session that no longer exists in the current environment
([#3816](#3816))
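
For the payload offloading item in the list above, the decision can be sketched roughly as below. This is not the SDK's implementation: the `Packet` shape, `shouldOffload`, and `toStorePointer` are hypothetical, and treating 128KB as 128 × 1024 bytes is an assumption.

```typescript
// Assumption: "128KB" means 128 * 1024 bytes of UTF-8 encoded payload.
const OFFLOAD_THRESHOLD_BYTES = 128 * 1024;

// Hypothetical packet shape for illustration.
type Packet = { data: string; dataType: string };

// Packets at or above the limit are uploaded instead of embedded in the request.
function shouldOffload(packet: Packet, thresholdBytes = OFFLOAD_THRESHOLD_BYTES): boolean {
  return new TextEncoder().encode(packet.data).length >= thresholdBytes;
}

// After the upload, the request carries a pointer rather than the JSON itself.
// The path must be non-empty, mirroring the validation the changelog mentions.
function toStorePointer(storagePath: string): Packet {
  if (storagePath.trim() === "") {
    throw new Error("application/store payloads must be non-empty storage paths");
  }
  return { data: storagePath, dataType: "application/store" };
}
```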

## Bug fixes
- Fix `@trigger.dev/core` build: cast the underlying log record exporter
when calling `forceFlush` so it typechecks against the updated
OpenTelemetry `LogRecordExporter` type (which no longer declares
`forceFlush`).
([#3829](#3829))
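
The kind of cast this fix describes looks roughly like the sketch below. The types here are local stand-ins, not the OpenTelemetry ones: `ExporterLike` plays the role of an exporter type that does not declare `forceFlush`, and `forceFlushExporter` is a hypothetical helper.

```typescript
// Stand-in for an exporter type that no longer declares forceFlush.
interface ExporterLike {
  export(records: unknown[]): void;
  shutdown(): Promise<void>;
}

// Widened view used only at the call site.
type MaybeFlushable = ExporterLike & { forceFlush?: () => Promise<void> };

// Cast to the widened type, then call forceFlush only if it exists at runtime.
function forceFlushExporter(exporter: ExporterLike): Promise<void> {
  const flushable = exporter as MaybeFlushable;
  return typeof flushable.forceFlush === "function"
    ? flushable.forceFlush()
    : Promise.resolve();
}
```

The runtime check keeps the call safe for exporters that never implemented the method, while the cast satisfies the type checker for those that do.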

<details>
<summary>Raw changeset output</summary>

⚠️⚠️⚠️⚠️⚠️⚠️

`main` is currently in **pre mode** so this branch has prereleases
rather than normal releases. If you want to exit prereleases, run
`changeset pre exit` on `main`.

⚠️⚠️⚠️⚠️⚠️⚠️

# Releases
## @trigger.dev/build@4.5.0-rc.5

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.5`

## trigger.dev@4.5.0-rc.5

### Patch Changes

- The MCP server no longer tells the AI agent to wait for a run to
complete after every `trigger_task` call. Waiting is now opt-in: the
agent only waits when you ask it to (for example "trigger and then wait
for it to finish"). This avoids burning tokens polling runs you didn't
need to block on and keeps responses clearer.
([#3838](#3838))
- Update the bundled OpenTelemetry packages to their latest releases
(`@opentelemetry/sdk-node` 0.218.0, `@opentelemetry/core` 2.7.1,
`@opentelemetry/host-metrics` 0.38.3).
([#3810](#3810))
-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.5`
    -   `@trigger.dev/build@4.5.0-rc.5`
    -   `@trigger.dev/schema-to-json@4.5.0-rc.5`

## @trigger.dev/core@4.5.0-rc.5

### Patch Changes

- Add optional `shouldPauseScaling` to the supervisor consumer pool
scaling options to freeze scale-up while it returns true (scale-down
stays allowed).
([#3836](#3836))

- Fix `@trigger.dev/core` build: cast the underlying log record exporter
when calling `forceFlush` so it typechecks against the updated
OpenTelemetry `LogRecordExporter` type (which no longer declares
`forceFlush`).
([#3829](#3829))

- `envvars.upload` now accepts an optional `isSecret` flag, letting you
create the imported variables as secret (redacted) environment
variables. When omitted, variables default to non-secret.
([#3809](#3809))

    ```ts
    await envvars.upload("proj_1234", "prod", {
      variables: { STRIPE_SECRET_KEY: "sk_live_..." },
      isSecret: true,
    });
    ```

- Offload large trigger payloads to object storage before sending the
trigger API request. The SDK uploads packets at or above the existing
128KB limit and sends an `application/store` pointer instead of
embedding large JSON in the request body. `TriggerTaskRequestBody` now
validates that `application/store` payloads are non-empty storage paths.
([#3785](#3785))

Payload uploads use the same resolved `ApiClient` as the trigger call
(including `requestOptions.clientConfig`), not only the global
`apiClientManager.client` — so custom `baseURL`, access token, and
preview branch apply to both presign and trigger.

- Update the bundled OpenTelemetry packages to their latest releases
(`@opentelemetry/sdk-node` 0.218.0, `@opentelemetry/core` 2.7.1,
`@opentelemetry/host-metrics` 0.38.3).
([#3810](#3810))

## @trigger.dev/plugins@4.5.0-rc.5

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.5`

## @trigger.dev/python@4.5.0-rc.5

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/sdk@4.5.0-rc.5`
    -   `@trigger.dev/core@4.5.0-rc.5`
    -   `@trigger.dev/build@4.5.0-rc.5`

## @trigger.dev/react-hooks@4.5.0-rc.5

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.5`

## @trigger.dev/redis-worker@4.5.0-rc.5

### Patch Changes

- Make mollifier buffer and drainer internals configurable.
`MollifierBuffer` now accepts `ackGraceTtlSeconds`,
`maxRetriesPerRequest`, `reconnectStepMs`, and `reconnectMaxMs` options,
and `MollifierDrainer` accepts `maxBackoffMs` and `backoffFloorMs`. All
default to their previous hardcoded values, so existing behaviour is
unchanged.
([#3822](#3822))
- `MollifierDrainer` accepts a `drainBatchSize` option (default 1) that
controls how many entries are popped per env per tick — in-flight
handlers remain capped by the global `concurrency`. `MollifierBuffer`
also gains `getDrainingCount()` / `listStaleDraining()`, backed by a new
`mollifier:draining` ZSET maintained atomically with
pop/ack/fail/requeue (observability-only).
([#3797](#3797))
-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.5`
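
The per-env pop behaviour of `drainBatchSize` can be pictured with a toy in-memory model. This is only a sketch: the real buffer is Redis-backed, `popForTick` is a hypothetical function, and the global `concurrency` cap on in-flight handlers is not modelled here.

```typescript
// Toy model: one FIFO array of entry IDs per environment.
// Each tick pops at most drainBatchSize entries from every env.
function popForTick(buffers: Map<string, string[]>, drainBatchSize = 1): string[] {
  if (!Number.isInteger(drainBatchSize) || drainBatchSize < 1) {
    throw new RangeError(`drainBatchSize must be a positive integer, got ${drainBatchSize}`);
  }
  const popped: string[] = [];
  buffers.forEach((queue) => {
    // splice removes the entries from the env's queue and returns them in order
    queue.splice(0, drainBatchSize).forEach((entry) => popped.push(entry));
  });
  return popped;
}
```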

## @trigger.dev/rsc@4.5.0-rc.5

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.5`

## @trigger.dev/schema-to-json@4.5.0-rc.5

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.5`

## @trigger.dev/sdk@4.5.0-rc.5

### Patch Changes

- Adds AI SDK 7 support. The `ai` peer range now includes v7, and the
`chat.agent` / chat surfaces work against v7's ESM-only build. On v7,
install `@ai-sdk/otel` alongside `ai` and the SDK registers it for you
so `experimental_telemetry` spans keep flowing into your run traces (v7
stopped emitting them from `ai` core). v5 and v6 keep working unchanged.
([#3833](#3833))

- `useTriggerChatTransport` now recovers when restored session state
points at a session that no longer exists in the current environment
([#3816](#3816))

- Offload large trigger payloads to object storage before sending the
trigger API request. The SDK uploads packets at or above the existing
128KB limit and sends an `application/store` pointer instead of
embedding large JSON in the request body. `TriggerTaskRequestBody` now
validates that `application/store` payloads are non-empty storage paths.
([#3785](#3785))

Payload uploads use the same resolved `ApiClient` as the trigger call
(including `requestOptions.clientConfig`), not only the global
`apiClientManager.client` — so custom `baseURL`, access token, and
preview branch apply to both presign and trigger.

- Update the bundled OpenTelemetry packages to their latest releases
(`@opentelemetry/sdk-node` 0.218.0, `@opentelemetry/core` 2.7.1,
`@opentelemetry/host-metrics` 0.38.3).
([#3810](#3810))

-   Updated dependencies:
    -   `@trigger.dev/core@4.5.0-rc.5`

</details>

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

v.docker.4.5.0-rc.5

This commit was created on GitHub.com and signed with GitHub’s verified signature.
chore: release v4.5.0-rc.5 (#3808)


re2-test-supervisor-backpressure-rc1

fix(supervisor): fail open on engaged verdict with no fresh timestamp

When maxVerdictAgeMs is set, an engaged verdict must carry a fresh ts; a missing
or stale ts can't be trusted (a dead producer could otherwise pin the brake), so
treat it as not-engaged.
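
A minimal sketch of that fail-open rule, with assumptions called out: the `Verdict` shape and `isBrakeEngaged` are hypothetical, and `ts` is assumed to be an epoch timestamp in milliseconds.

```typescript
// Hypothetical verdict shape; ts is assumed to be epoch milliseconds.
type Verdict = { engaged: boolean; ts?: number };

function isBrakeEngaged(verdict: Verdict, nowMs: number, maxVerdictAgeMs?: number): boolean {
  if (!verdict.engaged) return false;
  // No age limit configured: trust the verdict as-is.
  if (maxVerdictAgeMs === undefined) return true;
  // With an age limit, an engaged verdict needs a usable timestamp...
  if (typeof verdict.ts !== "number" || !Number.isFinite(verdict.ts)) return false;
  // ...and that timestamp must be fresh; otherwise fail open (not engaged).
  return nowMs - verdict.ts <= maxVerdictAgeMs;
}
```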

re2-prod-supervisor-backpressure

fix(supervisor): also strip DOCKER_REGISTRY_PASSWORD from debug env log

Pre-existing secret that wasn't excluded from envWithoutSecrets; add it to the
strip-list alongside the backpressure redis password.
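
The strip-list approach amounts to something like the sketch below. `stripSecrets` is a hypothetical stand-in for the real helper, and `EXAMPLE_BACKPRESSURE_REDIS_PASSWORD` is a placeholder because the actual variable name for the backpressure redis password is not given here.

```typescript
// DOCKER_REGISTRY_PASSWORD is named in the commit; the second key is a placeholder.
const SECRET_ENV_KEYS = ["DOCKER_REGISTRY_PASSWORD", "EXAMPLE_BACKPRESSURE_REDIS_PASSWORD"];

// Return a copy of env that is safe to print in a debug log.
function stripSecrets(
  env: Record<string, string | undefined>,
  secretKeys: string[] = SECRET_ENV_KEYS,
): Record<string, string | undefined> {
  const secrets = new Set(secretKeys);
  const safe: Record<string, string | undefined> = {};
  Object.keys(env).forEach((key) => {
    if (!secrets.has(key)) safe[key] = env[key];
  });
  return safe;
}
```

A strip-list only protects the keys it names, which is how a pre-existing secret can be missed until someone adds it explicitly.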