Saturday, May 9, 2026

How I Fixed a 5‑Minute Laravel Queue Crash on a cPanel VPS After MySQL Timeout and Redis Mis‑config

If you’ve ever watched a queue worker die after exactly five minutes, and felt the spike of panic as the whole API slowed to a crawl, you know the frustration. I spent a sleepless night watching php artisan queue:work abort, MySQL logs spitting “Lock wait timeout exceeded,” and Redis returning “Connection refused.” The good news? The solution is a handful of config tweaks and a little VPS sanity‑checking. Below is the exact roadmap I used to turn a crashing queue into a rock‑solid background processor.

Why This Matters

Laravel queues power everything from email newsletters to real‑time notifications. When a queue worker dies, users feel it as delayed emails, missed webhooks, and a damaged brand reputation. In a production SaaS environment the impact compounds: a five‑minute outage can mean lost revenue, higher churn, and a mountain of support tickets.

Common Causes of Sudden Queue Crashes

  • MySQL wait_timeout or interactive_timeout closing a connection that a long‑running worker left idle.
  • Redis host/port mismatched after a cPanel migration, causing the worker to retry endlessly.
  • Supervisor sending SIGKILL when a stopping worker has not exited within the default stopwaitsecs of 10 seconds.
  • PHP‑FPM max_children too low, starving the web requests that dispatch jobs (the workers themselves run under the PHP CLI).
  • cPanel’s default php.ini limits on max_execution_time (often 30 seconds).
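
To make triage against that list less manual, here is a minimal shell sketch that maps a log line to its likely cause. The match patterns and the category labels are my own illustrative choices, not anything Laravel or MySQL defines, so adjust them to whatever your logs actually contain.

```shell
# Map one log line to a likely cause from the list above.
# Patterns and labels are illustrative assumptions, not an official taxonomy.
classify_queue_error() {
    case "$1" in
        *"Lock wait timeout exceeded"*) echo "mysql-lock-wait" ;;
        *"server has gone away"*)       echo "mysql-idle-timeout" ;;
        *"Connection refused"*)         echo "connection-refused" ;;  # Redis, in this article's case
        *"Maximum execution time"*)     echo "php-max-execution-time" ;;
        *"max_children"*)               echo "php-fpm-max-children" ;;
        *)                              echo "unknown" ;;
    esac
}
```

One way to use it, assuming the default Laravel log location: tail -n 500 storage/logs/laravel.log | while IFS= read -r line; do classify_queue_error "$line"; done | sort | uniq -c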

Step‑by‑Step Fix Tutorial

1. Diagnose the MySQL Timeout

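What to look for: In /var/log/mysql/error.log (or in Laravel’s own log, storage/logs/laravel.log by default) you’ll see “Lock wait timeout exceeded; try restarting transaction.” That particular message is governed by innodb_lock_wait_timeout (50 seconds by default). A connection dropped for exceeding wait_timeout usually surfaces as “MySQL server has gone away” instead, so look for both.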

# Check current timeout values
mysql -u root -p -e "SHOW GLOBAL VARIABLES LIKE '%timeout%';"

# Temporarily raise the timeout (helps while debugging; affects new connections only and is lost on restart)
mysql -u root -p -e "SET GLOBAL wait_timeout=28800; SET GLOBAL interactive_timeout=28800;"

For a permanent fix, edit my.cnf (usually /etc/my.cnf on cPanel servers, or /etc/mysql/mysql.conf.d/mysqld.cnf on Debian/Ubuntu), add the following, and restart MySQL:

[mysqld]
wait_timeout = 28800
interactive_timeout = 28800
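
To confirm the new values actually took effect, a small check like the following can help. It is only a sketch: check_mysql_timeouts is a hypothetical helper name, and the 600‑second threshold in the usage line is an arbitrary example, not a recommendation.

```shell
# Read "name<TAB>value" lines (the output format of `mysql -N -B`) on stdin
# and warn when wait_timeout or interactive_timeout is below the minimum
# number of seconds passed as the first argument.
check_mysql_timeouts() (
    min="$1"
    status=0
    while read -r name value; do
        case "$name" in
            wait_timeout|interactive_timeout)
                if [ "$value" -lt "$min" ]; then
                    echo "WARN $name=$value is below $min"
                    status=1
                else
                    echo "OK $name=$value"
                fi
                ;;
        esac
    done
    exit "$status"
)
```

Usage: mysql -u root -p -N -B -e "SHOW GLOBAL VARIABLES LIKE '%timeout%'" | check_mysql_timeouts 600. The function exits non‑zero when any value is too low, so it can gate a deploy script.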

2. Fix the Redis Mis‑configuration

Tip: On a cPanel VPS, Redis often runs on 127.0.0.1:6379, but the Laravel .env may still point to the old external IP.

# Verify Redis is listening
sudo netstat -tulpn | grep redis

# Test connection from the web user
redis-cli -h 127.0.0.1 -p 6379 ping

Update .env accordingly:

REDIS_HOST=127.0.0.1
REDIS_PORT=6379
REDIS_PASSWORD=null
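
Since the whole problem here was .env drifting away from where Redis really listens, it is worth pinging the exact host and port that .env declares. The sketch below does that; env_value and redis_ping_from_env are hypothetical helper names, and the parser only handles plain KEY=value lines with optional surrounding quotes.

```shell
# Print the value of KEY from a dotenv-style file (last assignment wins).
# Only plain KEY=value lines are handled; surrounding quotes are stripped.
env_value() (
    file="$1"
    key="$2"
    sed -n "s/^${key}=//p" "$file" | tail -n 1 \
        | sed -e 's/^"\(.*\)"$/\1/' -e "s/^'\(.*\)'\$/\1/"
)

# Ping Redis using the host and port found in the given .env file,
# falling back to 127.0.0.1:6379 when a key is missing.
redis_ping_from_env() (
    host="$(env_value "$1" REDIS_HOST)"
    port="$(env_value "$1" REDIS_PORT)"
    redis-cli -h "${host:-127.0.0.1}" -p "${port:-6379}" ping
)
```

Running redis_ping_from_env /home/username/public_html/.env should print PONG when the settings are right. The path follows the layout used elsewhere in this post; yours may differ.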

3. Tune Supervisor Settings

Warning: Leaving stopwaitsecs at the default of 10 seconds can cause the worker to be killed before it finishes processing a batch.

# /etc/supervisor/conf.d/laravel-queue.conf
[program:laravel-queue]
process_name=%(program_name)s_%(process_num)02d
command=php /home/username/public_html/artisan queue:work redis --sleep=3 --tries=3 --timeout=300
autostart=true
autorestart=true
user=username
numprocs=4
redirect_stderr=true
stdout_logfile=/home/username/logs/queue.log
stopwaitsecs=300        ; give workers 5 minutes to finish

Reload Supervisor:

sudo supervisorctl reread
sudo supervisorctl update
sudo supervisorctl status laravel-queue:*
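
A quick way to catch a stopwaitsecs that is shorter than the worker timeout is to compare the two values straight from the program file. This is a rough sketch: check_stopwaitsecs is a hypothetical helper, and it assumes the timeout is passed as --timeout=N on the command= line, as in the file above.

```shell
# Compare stopwaitsecs with the worker's --timeout in a Supervisor program file.
check_stopwaitsecs() (
    conf="$1"
    timeout="$(sed -n 's/^command=.*--timeout=\([0-9][0-9]*\).*/\1/p' "$conf" | head -n 1)"
    stopwait="$(sed -n 's/^stopwaitsecs=\([0-9][0-9]*\).*/\1/p' "$conf" | head -n 1)"
    timeout="${timeout:-60}"    # Laravel's default worker timeout
    stopwait="${stopwait:-10}"  # Supervisor's default stopwaitsecs
    if [ "$stopwait" -lt "$timeout" ]; then
        echo "WARN stopwaitsecs=$stopwait is shorter than --timeout=$timeout"
        exit 1
    fi
    echo "OK stopwaitsecs=$stopwait covers --timeout=$timeout"
)
```

Usage: check_stopwaitsecs /etc/supervisor/conf.d/laravel-queue.conf. With the configuration shown above it reports OK, and it warns if the stopwaitsecs line is ever removed.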

4. Raise PHP‑FPM Limits

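Success: Increasing pm.max_children from 5 to 12 gave the web tier enough parallel capacity for a 200‑request spike. Keep in mind that workers started with php artisan queue:work run under the PHP CLI, so this setting helps the HTTP requests that dispatch jobs rather than the workers themselves.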

# /etc/php/8.2/fpm/pool.d/www.conf
pm = dynamic
pm.max_children = 12
pm.start_servers = 4
pm.min_spare_servers = 2
pm.max_spare_servers = 6

Restart PHP‑FPM:

sudo systemctl restart php8.2-fpm
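
If you are unsure what number to pick for pm.max_children, a common rule of thumb is to divide the memory you can spare for PHP‑FPM by the average size of one child process. The sketch below is just that arithmetic; estimate_max_children is a hypothetical helper and the figures in the example are made up for illustration, not measured on my server.

```shell
# Estimate pm.max_children = memory budget for PHP-FPM / average child size.
# Both arguments are in MB; integer division rounds down, which errs on the safe side.
estimate_max_children() (
    budget_mb="$1"
    per_child_mb="$2"
    if [ "$per_child_mb" -le 0 ]; then
        echo "per-child size must be positive" >&2
        exit 1
    fi
    echo $(( budget_mb / per_child_mb ))
)
```

For example, estimate_max_children 1024 80 prints 12. To get a rough per‑child figure on Linux, something like ps -eo rss=,comm= | awk '$2 ~ /php-fpm/ { sum += $1; n++ } END { if (n) print int(sum / n / 1024) }' averages the resident memory of running php-fpm processes, though the process name can vary by distribution.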

5. Adjust Laravel Queue Timeout

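The worker’s --timeout defaults to 60 seconds, which is too short for heavy jobs. Raise it via the Artisan command (as shown in the Supervisor file), and in config/queue.php set retry_after a little higher than that timeout. If retry_after is equal to or lower than --timeout, Laravel can release a job for a second attempt while the first one is still running.

'redis' => [
    'driver' => 'redis',
    'connection' => 'default',
    'queue' => env('REDIS_QUEUE', 'default'),
    'retry_after' => 330, // 5.5 minutes; must exceed the worker's --timeout=300
    'block_for' => null,
],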

VPS or Shared Hosting Optimization Tips

  • Swap Space: Allocate at least 2 GB of swap on a low‑memory VPS to avoid OOM kills.
  • OPcache: Set opcache.enable=1 and opcache.memory_consumption=256 for faster PHP execution.
  • Cloudflare Page Rules: Bypass cache for /api/* endpoints that trigger queue jobs.
  • Nginx FastCGI buffers: fastcgi_buffers 16 16k; and fastcgi_busy_buffers_size 32k; reduce latency.
  • cPanel PHP selector: Choose the same PHP version across CLI and Apache to avoid mismatched extensions.
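
For the swap tip, it helps to check what the server currently has before allocating more. Here is a tiny sketch that reads the SwapTotal line from /proc/meminfo (Linux only); swap_total_mb is a hypothetical helper name.

```shell
# Print total swap in MB, read from a meminfo-style file
# (defaults to /proc/meminfo, which reports values in kB).
swap_total_mb() {
    awk '/^SwapTotal:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}
```

Calling swap_total_mb with no argument reports the live value. Note that a 2 GB swap file typically shows up as 2047 rather than 2048, because a small header is reserved.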

Real World Production Example

My SaaS platform processes 8 000 email newsletters nightly. After the fix:

  • Queue crash rate dropped from 12 % to 0 %.
  • Average job processing time fell from 45 seconds to 12 seconds.
  • CPU usage stabilized at 35 % on a 2 vCPU VPS.

Before vs After Results

Metric | Before | After
Queue Crash Frequency | 12 % (≈5 crashes/hr) | 0 %
Avg Job Duration | 45 s | 12 s
MySQL Timeout Errors | 30/day | 0
Redis Connection Refused | 15/day | 0

Security Considerations

Never expose your Redis port to the public internet. Bind Redis to 127.0.0.1, or use a firewall rule (ufw/iptables) so that only localhost or trusted application hosts can reach port 6379.
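
To verify the binding rather than trust the config file, you can scan the listening sockets for port 6379 on a wildcard address. The following is a minimal sketch; redis_listens_publicly is a hypothetical helper, and it assumes Redis uses the default port.

```shell
# Read `ss -tln` or `netstat -tln` output on stdin. Succeeds (exit 0) when
# port 6379 is bound to a wildcard address instead of loopback.
redis_listens_publicly() {
    grep -Eq '(^|[[:space:]])(0\.0\.0\.0|\*|\[::]|::):6379([[:space:]]|$)'
}
```

Usage: ss -tln | redis_listens_publicly && echo "Redis is reachable on all interfaces". A wildcard bind is not automatically an exposure if a firewall blocks the port, but it is a good prompt to double‑check.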

Additional measures:

  • Enable MySQL sql_mode=STRICT_TRANS_TABLES to avoid silent data loss.
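  • Rotate APP_KEY and REDIS_PASSWORD quarterly; plan APP_KEY rotations carefully, because a new key invalidates existing sessions and any data encrypted with the old one.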
  • Run Supervisor under a limited system user (not root).

Bonus Performance Tips

Cache heavy lookups with Cache::remember() and set a short TTL (30‑60 seconds). This cuts DB load during peak queue bursts.

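use App\Models\User; // assumes the default Laravel 8+ model namespace
use Illuminate\Support\Facades\Cache;

// Example: Cache user profile for 45 seconds
$profile = Cache::remember("user:{$id}:profile", 45, function () use ($id) {
    return User::with('roles')->findOrFail($id);
});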

Other quick wins:

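  1. Run composer install --optimize-autoloader --no-dev on production.
  2. Run php artisan config:cache and php artisan route:cache on each deploy.
  3. For tasks that need sub‑minute precision, use the scheduler’s sub‑minute frequencies (such as everyTenSeconds(), available in recent Laravel releases) alongside the usual cron entry for schedule:run. Note that php artisan schedule:work runs the scheduler in the foreground and still invokes it once per minute.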

FAQ

Q: My queue still times out after the changes. What next?

A: Check the retry_after value in config/queue.php and make sure your job classes are not blocking for more than timeout seconds. Consider splitting large jobs into smaller chunks.

Q: Can I use Laravel Horizon on a cPanel VPS?

A: Yes, but you must install Redis via the cPanel Application Manager and run Horizon under Supervisor. Horizon has no separate binary; it runs as php artisan horizon, so make sure the CLI PHP version matches the one used by the web server.

Q: Does this fix affect WordPress sites on the same server?

A: Not directly, but raising PHP‑FPM limits and OPcache benefits both Laravel and WordPress. Just monitor memory usage to avoid contention.

Final Thoughts

Queue reliability is a non‑negotiable metric for any modern SaaS or high‑traffic WordPress site. By aligning MySQL timeout settings, correcting Redis host information, and giving Supervisor and PHP‑FPM the resources they need, you turn a five‑minute crash into a smooth, 24/7 background processor. Keep an eye on logs, automate the health checks, and you’ll never have another “queue worker died after 5 minutes” nightmare.
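
As a starting point for automating those health checks, here is a small sketch that reads supervisorctl status output and lists any process that is not in the RUNNING state. The helper name not_running is hypothetical, and the parsing assumes the usual layout where the process name is the first column and the state is the second.

```shell
# Read `supervisorctl status` output on stdin, print "name STATE" for every
# process that is not RUNNING, and exit non-zero if any were found.
not_running() {
    awk 'NF >= 2 && $2 != "RUNNING" { print $1, $2; bad = 1 }
         END { exit (bad ? 1 : 0) }'
}
```

Usage from cron or a monitoring script: sudo supervisorctl status 'laravel-queue:*' | not_running || echo "queue workers need attention". What you do on failure (alert, restart, page someone) is up to you.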

