Laravel Background Jobs: 12 Best Practices for Production Queues
· 11 min read · Boring Observability
Queued jobs are the part of a Laravel app that runs when you're not looking — after the request is gone, during a deploy, against data that may have changed or vanished, on code one version newer than the code that dispatched them. Most job bugs aren't logic errors; they're assumptions about when and how many times a job runs that quietly hold in development and fall apart under production load.
Here are twelve practices that keep a queue healthy under real traffic. They build on each other: small arguments make idempotency easier, idempotency makes retries safe, safe retries make backoff useful, and so on. The eleventh, backwards compatibility across deploys, is the one that bites hardest, and no linter catches it.
1. Keep arguments small
Every byte of constructor arguments is a byte in Redis, multiplied by every queued instance. Don't pass big arrays,
blobs, or rich DTOs "to save a query." Pass the minimal identifiers and re-read what you need inside handle().
A queue backlog of fat jobs is how you OOM Redis.
If you pass Eloquent models, Laravel can serialize just the class and the ID and automatically reload the record on
execution, as long as you include the SerializesModels trait.
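A minimal sketch of the identifier-only pattern, assuming a hypothetical Report model and GenerateReportPdf job (neither comes from this article). The job carries a single integer and loads the record when it runs:

```php
<?php

namespace App\Jobs;

use App\Models\Report; // hypothetical model, for illustration only
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;

class GenerateReportPdf implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable;

    // Only the primary key ends up in the serialized payload.
    public function __construct(public int $reportId) {}

    public function handle(): void
    {
        // Re-read current state; the row may have changed or vanished since dispatch.
        $report = Report::find($this->reportId);

        if ($report === null) {
            return; // nothing left to do for a deleted record
        }

        // ... render the PDF from $report
    }
}
```

Returning quietly on a missing record is one possible policy; a job that must not lose work might prefer to fail loudly instead.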
2. Every job should declare its queue
class GenerateImageThumbnails implements ShouldQueue
{
    public $queue = 'media';
}
Horizon routes work by queue name, and each supervisor is tuned for a workload — concurrency, timeout, balance strategy.
A job with no $queue lands on default, where a slow image resize sits behind — or ahead of — a
latency-sensitive job.
Mixing fast and slow, or urgent and bulk, on one queue defeats the entire point of having supervisors.
Pick the queue that matches the work's shape and SLA, and make sure a matching Horizon supervisor processes it.
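One caveat, sketched under the assumption that the job uses Laravel's Queueable trait: that trait already declares a public $queue property with no default, and PHP raises a fatal error when a class redeclares a trait property with a different initial value. In that situation the queue can be declared in the constructor through onQueue() instead. The $imageId argument is hypothetical:

```php
<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;

class GenerateImageThumbnails implements ShouldQueue
{
    use Dispatchable, Queueable;

    public function __construct(public int $imageId) // hypothetical argument
    {
        // Queueable::onQueue() sets $this->queue for every dispatch of this job.
        $this->onQueue('media');
    }
}
```

The effect is the same as a property default: every instance lands on media unless the caller overrides it at dispatch time.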
3. Make jobs idempotent
Horizon delivers at-least-once. A worker that's killed mid-job — deploy, OOM, timeout — leaves the job to be retried, so the same job body can run twice. Design every job so that running it twice produces the same end state as running it once:
- Check before you create (firstOrCreate, updateOrCreate, "already sent?" guards).
- Use unique constraints as a backstop, not wishful thinking.
- Never assume "this ran, therefore it ran exactly once."
The realistic failure mode isn't a clean re-run from the top — it's a job that did partial work and then got
killed before completing. Redis only removes a job from its reserved set on success, so a hard kill
(SIGKILL from a per-job timeout, or the supervisor force-killing a worker that outlived the shutdown grace
window during a deploy) leaves the job to be migrated back and re-run from scratch — with whatever side effects the first
attempt already committed still in place.
Because attempts is incremented when the job is reserved, not when it fails, each kill-and-recover
cycle burns one try; after maxTries such cycles the job is marked failed having run its side effects several
times with no clean completion. So idempotency has to cover resuming after a partial run, not just "don't
duplicate a fully-successful run."
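Resuming after a partial run can be illustrated without Laravel at all. The sketch below is plain PHP with hypothetical names throughout (StepLedger stands in for a table with a unique key per step, FulfilOrder for a job); the exception simulates a worker being killed mid-job:

```php
<?php
declare(strict_types=1);

// Stand-in for durable storage, e.g. a table with a unique key per order and step.
final class StepLedger
{
    /** @var array<string, true> */
    private array $done = [];

    public function isDone(string $key): bool
    {
        return isset($this->done[$key]);
    }

    public function markDone(string $key): void
    {
        $this->done[$key] = true;
    }
}

final class FulfilOrder
{
    /** @var list<string> side effects actually performed, kept for inspection */
    public static array $effects = [];

    public function __construct(private int $orderId) {}

    public function handle(StepLedger $ledger, ?string $dieAfter = null): void
    {
        foreach (['charge', 'email', 'mark_shipped'] as $step) {
            $key = "order:{$this->orderId}:{$step}";

            if ($ledger->isDone($key)) {
                continue; // an earlier attempt already committed this step
            }

            self::$effects[] = $step; // the real side effect would go here
            $ledger->markDone($key);

            if ($step === $dieAfter) {
                throw new RuntimeException('worker killed'); // simulated hard kill
            }
        }
    }
}

$ledger = new StepLedger();

try {
    (new FulfilOrder(7))->handle($ledger, dieAfter: 'email'); // attempt 1 dies mid-job
} catch (RuntimeException) {
    // the queue would re-run the job from the top
}

(new FulfilOrder(7))->handle($ledger); // attempt 2 skips the committed steps

print_r(FulfilOrder::$effects); // charge, email, mark_shipped: each exactly once
```

The sketch still leaves a gap between performing a side effect and recording it. For database writes, doing both in one transaction closes that gap; for external calls, the usual answer is an idempotency key that the remote side honours.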
4. Keep jobs atomic and short
A deploy restarts workers; a job that exceeds its $timeout is killed mid-flight. If your job does five writes
and dies after three, the retry redoes all five — and the first three had better be idempotent (see above).
Prefer one job per record over one job that loops 10,000 records. A per-record job that dies retries one record; a mega-job retries everything and may never finish inside the timeout window. Fan out:
Item::where(...)->eachById(fn (Item $i) => SyncItemJob::dispatch($i));
5. Retries and backoff — let the framework do it
Don't manually re-run failed jobs and don't catch-and-swallow. Configure retries declaratively:
public $tries = 5;

// escalating backoff, required for anything hitting an external API
public function backoff(): array
{
    return [10, 30, 60, 300];
}
Hammering an API with immediate retries turns a transient blip into an outage. Back off. Let jobs that genuinely can't
succeed land in failed_jobs — that's a signal to fix the root cause, not a thing to mass-retry. Use
$this->fail($e) to bail early when you know a retry won't help.
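As an illustration of bailing early, here is a sketch that treats a validation rejection as permanent and everything else as retryable. The job name, the endpoint URL, and the choice of status 422 as the "retry won't help" signal are all assumptions made for the example:

```php
<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Support\Facades\Http;

class PushOrderToPartner implements ShouldQueue // hypothetical job
{
    use Dispatchable, InteractsWithQueue, Queueable;

    public $tries = 5;

    public function __construct(public int $orderId) {}

    public function backoff(): array
    {
        return [10, 30, 60, 300];
    }

    public function handle(): void
    {
        // Placeholder URL, not a real endpoint.
        $response = Http::post('https://partner.example/orders', ['id' => $this->orderId]);

        if ($response->status() === 422) {
            // The payload itself was rejected; sending it again cannot succeed.
            $this->fail(new \RuntimeException('Partner rejected order '.$this->orderId));

            return;
        }

        // Any other client or server error throws, so the framework schedules the next retry.
        $response->throw();
    }
}
```

The job never catches and swallows: it either fails permanently on purpose or lets the exception reach the worker so the declared tries and backoff apply.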
6. Dispatch after the transaction commits
DB::transaction(function () {
    $booking = Booking::create([...]);
    SendBookingConfirmation::dispatch($booking->id)->afterCommit();
});
Without afterCommit() (or the connection-level 'after_commit' => true), a fast worker can
pick up the job and try to load a booking that hasn't been committed yet: find() returns null, findOrFail() throws
ModelNotFoundException, and either way it is a spurious failure that only reproduces under load. Dispatch the side effect only
once the data it depends on is durable.
7. The ShouldBeUnique contract — three ways to get it wrong
Unique jobs are where the most expensive, silent bugs live.
Set uniqueFor. A plain ShouldBeUnique lock is acquired at dispatch and
released only when processing reaches a terminal state (success, or final failure after maxTries) — it is
held across the whole processing window and across backoff retries. Without uniqueFor the lock is
created with no TTL (forever()), so it is reclaimed only by that eventual terminal run. Normally
that's fine: a worker killed mid-job leaves the job to be retried, and the retry's terminal state clears the lock. The
exposure is twofold:
- Until that retry completes, every new dispatch of the same unique job is silently dropped: not errored, just gone. uniqueFor bounds this window to a known TTL instead of "however long the kill→retry→finish cycle takes."
- If the job never reaches a terminal run (a uniqueId() that reads external mutable state, so the release computes a different key than the acquire; a lost reserved entry; or maxTries: 0 with a job that's killed every attempt), the forever() lock is never reclaimed and the job is silently undispatchable until the key is cleared by hand. uniqueFor is the only thing that auto-recovers it.
Caveat: uniqueFor must exceed your worst-case total processing-plus-retry time, or the lock expires
mid-legitimate-run and a duplicate can be dispatched. It's a ceiling on the deadlock risk, not a free safety net
— keep uniqueId() a pure function of the job's own data.
public int $uniqueFor = 3600; // lock self-heals after an hour
uniqueId is mandatory for parameterized jobs. The lock key is
laravel_unique_job:<class>:<uniqueId>. Omit uniqueId on a job that takes arguments
and the key collapses to the class name alone — so SyncCompany(1) and SyncCompany(2) share one
lock and one of them is silently dropped at dispatch. Lost work, no error.
public function uniqueId(): string
{
    return (string) $this->companyId;
}
A genuinely class-wide singleton is fine — make the intent explicit by returning a constant from uniqueId().
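To see the key collapse concretely, here is a plain-PHP sketch that rebuilds the lock key in the format quoted above. The uniqueLockKey() helper and both job classes are hypothetical; the helper only mimics the described format and is not Laravel's own code:

```php
<?php
declare(strict_types=1);

// Hypothetical helper mirroring the format laravel_unique_job:<class>:<uniqueId>.
function uniqueLockKey(object $job): string
{
    $id = method_exists($job, 'uniqueId') ? $job->uniqueId() : '';

    return 'laravel_unique_job:'.get_class($job).':'.$id;
}

final class SyncCompanyWithoutId
{
    public function __construct(public int $companyId) {}
}

final class SyncCompanyWithId
{
    public function __construct(public int $companyId) {}

    public function uniqueId(): string
    {
        return (string) $this->companyId;
    }
}

// Same key for different companies: the second dispatch would be dropped.
var_dump(uniqueLockKey(new SyncCompanyWithoutId(1)) === uniqueLockKey(new SyncCompanyWithoutId(2))); // bool(true)

// Distinct keys: both jobs can be queued.
var_dump(uniqueLockKey(new SyncCompanyWithId(1)) === uniqueLockKey(new SyncCompanyWithId(2))); // bool(false)
```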
8. Never bulk or batch a unique job
Queue::bulk() / Bus::bulk() push raw payloads straight to Redis, skipping the dispatcher that
acquires the lock — uniqueness silently does nothing. And batching a unique job means a dropped duplicate desyncs the
batch's up-front job count, so its then/finally callbacks never fire and the batch hangs as
"pending" forever. Dispatch unique jobs individually.
Periodic / scheduled jobs should implement ShouldBeUnique so a slow run doesn't overlap the next tick.
9. A batchable job must honour cancellation
Cancelling a batch only stops future dispatches — jobs already on the queue still wake up and run their full body unless they check. For anything that mutates state (writes files, calls APIs, charges cards), that's wasted work against a batch the caller abandoned. Guard it:
public function handle(): void
{
    if ($this->batch()?->cancelled()) {
        return;
    }
    // ... heavy work
}
…or centralise it with the Illuminate\Queue\Middleware\SkipIfBatchCancelled middleware.
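A sketch of the middleware variant, assuming a Laravel version that ships SkipIfBatchCancelled; the job name is hypothetical:

```php
<?php

namespace App\Jobs;

use Illuminate\Bus\Batchable;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\Middleware\SkipIfBatchCancelled;

class ImportCsvChunk implements ShouldQueue // hypothetical job
{
    use Batchable, Dispatchable, InteractsWithQueue, Queueable;

    public function middleware(): array
    {
        // handle() is skipped entirely when the owning batch has been cancelled.
        return [new SkipIfBatchCancelled];
    }

    public function handle(): void
    {
        // ... heavy work, no manual cancellation check needed
    }
}
```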
10. Don't sleep() in a job
A sleeping job pins a worker doing nothing, starving every other job behind it. If you need to wait — rate limits, a not-yet-ready upstream — release the job back with a delay instead:
$this->release(60); // re-queue, free the worker, try again in a minute
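In context, that might look like the sketch below. The job name and the cache key used as a readiness flag are assumptions for illustration. Since attempts are counted when a job is reserved, every release uses up one try, so $tries needs enough headroom for the waiting you expect:

```php
<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Support\Facades\Cache;

class DownloadExport implements ShouldQueue // hypothetical job
{
    use Dispatchable, InteractsWithQueue, Queueable;

    public $tries = 10; // each release consumes one attempt

    public function __construct(public int $exportId) {}

    public function handle(): void
    {
        // Hypothetical readiness flag written by whatever produces the export.
        if (! Cache::has("export-ready:{$this->exportId}")) {
            $this->release(60); // back on the queue; this worker moves on

            return;
        }

        // ... download and store the export
    }
}
```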
11. The queue outlives your deploy — backwards compatibility
This is the one that bites hardest and isn't covered by any linter, so read it twice.
When you deploy, there are already jobs sitting in the queue, serialized against the old code. A worker on the new code has to deserialize and run them. Two ways this goes wrong.
Renaming or moving a job class
The serialized payload stores the fully-qualified class name — App\Products\Jobs\SyncStockJob.
Rename the class, move it to another namespace, or delete it, and every already-queued instance becomes unresolvable:
deserialization throws, the job lands in failed_jobs, the work is lost.
So:
- Two-phase it. Keep the old class (even as a thin subclass of the new one) for one deploy cycle, let the queue drain, then remove it in a follow-up deploy.
- Or drain the queue of that job type before shipping the rename.
- Don't rename a hot job class in the same PR that changes its behaviour.
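The thin-subclass approach can be tried in plain PHP. In this sketch SyncInventoryJob is a hypothetical new name, the namespace is dropped to keep the hand-written payload short, and the payload string only imitates what a queued job serialized under the old name would contain:

```php
<?php
declare(strict_types=1);

// New name for the job (hypothetical), carrying the real implementation.
class SyncInventoryJob
{
    public function __construct(public int $productId) {}

    public function handle(): string
    {
        return "synced product {$this->productId}";
    }
}

// Old name kept for one deploy cycle so already-queued payloads still resolve.
class SyncStockJob extends SyncInventoryJob {}

// Hand-written payload shaped like one serialized under the old class name.
$queued = 'O:12:"SyncStockJob":1:{s:9:"productId";i:7;}';

$job = unserialize($queued);

var_dump($job instanceof SyncInventoryJob); // bool(true)
echo $job->handle(), PHP_EOL;               // synced product 7
```

Once the queue has drained of the old name, the empty subclass can be deleted in a follow-up deploy.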
Changing constructor arguments
The subtle one. PHP's unserialize() does not call the constructor — it restores the saved
properties directly. So your constructor's default values do nothing for jobs already in the queue.
class SyncStockJob implements ShouldQueue
{
    // ⚠️ promoted, no class-level default.
    // An old payload serialized before $force existed restores WITHOUT it →
    // accessing $this->force throws "must not be accessed before initialization".
    public function __construct(
        public int $productId,
        public bool $force = false,
    ) {}
}
The constructor default = false only helps new dispatches. An old queued job never goes through the
constructor, so its $force stays uninitialized, and the new handle() that reads it crashes. Fix:
give the property a class-level default so deserialized old jobs fall back gracefully:
class SyncStockJob implements ShouldQueue
{
    public bool $force = false; // ← real default, survives unserialize

    public function __construct(public int $productId, bool $force = false)
    {
        $this->force = $force;
    }
}
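The difference is easy to reproduce outside Laravel. In this plain-PHP sketch, PromotedJob and DefaultedJob are hypothetical stand-ins for the two versions of the class, and the payload strings are hand-written to look like jobs serialized before $force existed:

```php
<?php
declare(strict_types=1);

final class PromotedJob
{
    public function __construct(
        public int $productId,
        public bool $force = false, // constructor default only
    ) {}
}

final class DefaultedJob
{
    public bool $force = false; // class-level default

    public function __construct(public int $productId, bool $force = false)
    {
        $this->force = $force;
    }
}

// Old payloads: they carry productId but know nothing about $force.
$oldPromoted  = 'O:11:"PromotedJob":1:{s:9:"productId";i:42;}';
$oldDefaulted = 'O:12:"DefaultedJob":1:{s:9:"productId";i:42;}';

$defaulted = unserialize($oldDefaulted);
var_dump($defaulted->force); // bool(false): the class-level default applies

$promoted = unserialize($oldPromoted);
try {
    var_dump($promoted->force);
} catch (Error $e) {
    // Typed property PromotedJob::$force must not be accessed before initialization
    echo $e->getMessage(), PHP_EOL;
}
```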
The general rules for a safe job change:
- New properties get a class-level default value (or are nullable with an explicit = null default), never a bare typed property. A nullable typed property without a default is still uninitialized after unserialize.
- Don't remove or rename a property a queued payload still carries without a deprecation window.
- handle() must tolerate both the old and the new payload shape for the length of one deploy cycle.
- When in doubt, two-phase: ship the backwards-compatible change, let the old jobs drain, clean up in a later deploy.
12. Watch it run
Use Horizon. Watch queue wait times, failure rates, and throughput per queue — a backlog on one supervisor is invisible from the others. A job that's "working" in the sense of not erroring can still be silently 20 minutes behind. Instrumentation is how you find out before a customer does.
The core takeaway
Write every job assuming it will run twice, run late, run against a deleted record, and run on code that's one version
newer than the code that dispatched it. Everything above follows from that: idempotency, atomicity, backoff, afterCommit, an explicit queue, the
full ShouldBeUnique / batch contracts, and, above all, backwards compatibility across deploys.