Deploying a Rails application is often a routine task, but when you throw Sidekiq into the mix, there are some nuances to consider. Sidekiq, a popular background processing tool, can introduce challenges, especially during restarts. In this article, we'll delve into the intricacies of handling Sidekiq during deployments and offer some best practices.
The Basics of Deployment Restart
When you deploy a new version of your Rails app, both the main application and Sidekiq need to restart. If you're using a strategy like rolling restarts, your application server will gracefully finish current user requests before initiating the restart. This ensures minimal disruption for your users. But what happens with Sidekiq and its background jobs?
Sidekiq's Background Jobs: The Heart of the Matter
Sidekiq's background jobs are where the real challenges emerge:
- Jobs in Queue: These are the jobs waiting for their turn to run. They are stored in Redis rather than in the Sidekiq process, so the restart does not affect them and they run as expected once Sidekiq is back up.
- Running Jobs: Jobs that are actively running during a deploy or restart can be interrupted. Sidekiq gives them a shutdown timeout to finish (25 seconds by default in recent versions); jobs still running after that are stopped and pushed back to Redis so they can run again. Jobs that fail are retried by default. However, if you've set a job not to retry, a failure is final, and in no case does a job pick up where it left off: a re-run starts from the beginning.
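Because a re-run starts from the top, it helps if a job can be executed twice without doing its work twice. The sketch below is one minimal way to picture that, not a prescribed pattern: `SendReceiptJob`, `RECEIPTS_SENT`, and `deliver_receipt` are hypothetical names, the in-memory set stands in for a database record, and the Sidekiq lines are guarded so the file also runs as plain Ruby when the gem is not loaded.

```ruby
require "set"

# Hypothetical in-memory stand-in for a table that records sent receipts.
RECEIPTS_SENT = Set.new

class SendReceiptJob
  # Only wire up Sidekiq when the gem is loaded, so this sketch runs standalone.
  if defined?(Sidekiq::Job)
    include Sidekiq::Job
    # Retry up to 5 times on failure; `retry: false` would make a failure final.
    sidekiq_options retry: 5
  end

  # A re-run always starts here, so check whether the work was already done.
  def perform(order_id)
    return :skipped if RECEIPTS_SENT.include?(order_id)

    deliver_receipt(order_id)
    RECEIPTS_SENT.add(order_id)
    :sent
  end

  private

  # Placeholder for the real side effect (email, API call, and so on).
  def deliver_receipt(order_id)
    puts "Sending receipt for order #{order_id}"
  end
end

job = SendReceiptJob.new
job.perform(42) # first run does the work
job.perform(42) # a re-run after an interrupted deploy is a no-op
```

The check and the write here are not atomic, so an interruption between them could still send a receipt twice; a real job would likely lean on a database transaction or a unique constraint to close that gap.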
Sidekiq's Challenges and Solutions
Deployments often bring code changes. But what if a background job is running old code while new code is being deployed?
- Ensure that critical jobs are set to retry and that their database writes are wrapped in transactions. This way, if they're interrupted, partial work is rolled back and they get another shot at completing.
- Be careful with database schema changes. Imagine a job that expects a specific database column, but the new deployment removes it. Even database transactions, which ensure a set of operations either fully completes or does not happen at all, can't protect against such structural changes.
- Be aware of jobs that take a long time to complete. Consider scheduling deployments during off-peak times or when such jobs are less likely to run.
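The schema problem from the list above can be pictured in plain Ruby. In this sketch, `UserBeforeDeploy` and `UserAfterDeploy` are hypothetical Structs standing in for the same model before and after a deploy that drops a `nickname` column, and `Greeting.build` is an invented helper that a job's `perform` might call after loading the record. In a Rails app the equivalent check would probably use ActiveRecord's `has_attribute?`; that mapping is an assumption, not something this article prescribes.

```ruby
# Hypothetical stand-ins for one model before and after a migration
# that removes the `nickname` column.
UserBeforeDeploy = Struct.new(:id, :name, :nickname)
UserAfterDeploy  = Struct.new(:id, :name)

module Greeting
  # Works against either schema: `nickname` is only read when the
  # attribute still exists, with `name` as the fallback.
  def self.build(user)
    display_name =
      if user.respond_to?(:nickname) && user.nickname
        user.nickname
      else
        user.name
      end
    "Hello, #{display_name}!"
  end
end

puts Greeting.build(UserBeforeDeploy.new(1, "Ada Lovelace", "Ada"))
puts Greeting.build(UserAfterDeploy.new(2, "Grace Hopper"))
```

A defensive check like this pairs naturally with splitting the change across two deploys: first ship code that no longer reads the column, then remove the column once no old jobs remain.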
What challenges have you encountered during your deployments, and how did you overcome them?