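I ran into the same problem when deploying on Docker Swarm; here is a solution, part of which comes from someone else.
Rails already has a mechanism for detecting concurrent migrations: it takes a lock in the database. When the lock is already held, though, it raises ActiveRecord::ConcurrentMigrationError instead of waiting.
One solution is a loop: whenever ConcurrentMigrationError is raised, wait 5 seconds and run the migration again.
This matters all the more because every container runs the migration, and if the migration fails, every container has to fail.
The solution from coffejumper: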

    namespace :db do
      namespace :migrate do
        desc 'Run db:migrate and monitor ActiveRecord::ConcurrentMigrationError errors'
        task monitor_concurrent: :environment do
          loop do
            puts 'Invoking Migrations'
            # reenable lets db:migrate be invoked again after a failed attempt
            Rake::Task['db:migrate'].reenable
            Rake::Task['db:migrate'].invoke
            puts 'Migrations Successful'
            break
          rescue ActiveRecord::ConcurrentMigrationError
            # Another container holds the migration lock: wait, then retry
            puts 'Migrations Sleeping 5'
            sleep(5)
          end
        end
      end
    end

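Sometimes other steps also have to run in sequence with the migration, such as after_party or cron setup. The solution is to use the same mechanism as Rails and wrap the rake tasks in a database lock.
Below is an example based on the Rails 6 code: migrate_without_lock runs the required migrations, while with_advisory_lock acquires the database lock (and raises ConcurrentMigrationError if the lock cannot be acquired).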

    require 'zlib'

    module Swarm
      class Migration
        def migrate
          with_advisory_lock { migrate_without_lock }
        end

        private

        # The tasks that must run in one container at a time
        def migrate_without_lock
          puts "Database migration"
          Rake::Task['db:migrate'].invoke
          puts "After_party migration"
          Rake::Task['after_party:run'].invoke
          # ...
          puts "Migrations successful"
        end

        # Takes the advisory lock, yields, and releases the lock afterwards
        def with_advisory_lock
          lock_id = generate_migrator_advisory_lock_id
          MyAdvisoryLockBase.establish_connection(ActiveRecord::Base.connection_config) unless MyAdvisoryLockBase.connected?
          connection = MyAdvisoryLockBase.connection
          got_lock = connection.get_advisory_lock(lock_id)
          raise ActiveRecord::ConcurrentMigrationError unless got_lock
          yield
        ensure
          if got_lock && !connection.release_advisory_lock(lock_id)
            raise ActiveRecord::ConcurrentMigrationError.new(
              ActiveRecord::ConcurrentMigrationError::RELEASE_LOCK_FAILED_MESSAGE
            )
          end
        end

        # Salt for this lock's key; keep it different from the salt Rails uses
        # internally so this lock does not collide with the one db:migrate takes
        MIGRATOR_SALT = 1942351734

        def generate_migrator_advisory_lock_id
          db_name_hash = Zlib.crc32(ActiveRecord::Base.connection_config[:database])
          MIGRATOR_SALT * db_name_hash
        end
      end

      # Dedicated connection class, so the lock is held on its own connection
      class MyAdvisoryLockBase < ActiveRecord::AdvisoryLockBase
        self.connection_specification_name = "MyAdvisoryLockBase"
      end
    end

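To make the lock key derivation easier to see in isolation, here is a minimal sketch that runs without Rails. It reuses the salt from the example; the helper name `advisory_lock_id` and the two database names are made up for illustration. The point is only that the key is a deterministic function of the database name, so every container connecting to the same database competes for the same lock.

```ruby
require 'zlib'

# Same salt value as in the example above.
MIGRATOR_SALT = 1942351734

# Hypothetical helper: derives a numeric lock key from a database name,
# the same way generate_migrator_advisory_lock_id does.
def advisory_lock_id(database_name)
  Zlib.crc32(database_name) * MIGRATOR_SALT
end

# The database names are invented for this sketch.
id_a = advisory_lock_id('shop_production')
id_b = advisory_lock_id('shop_production')
id_c = advisory_lock_id('blog_production')

puts id_a == id_b # => true  (same database, same lock key)
puts id_a == id_c # => false (different database, different lock key)
```

Since a CRC32 value is below 2^32 and the salt is below 2^31, the product stays below 2^63.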
As before, wrap it in a loop that waits and retries.

    namespace :swarm do
      desc 'Run migration tasks after acquisition of lock on database'
      task migrate: :environment do
        result = 1 # exit status; stays 1 unless an attempt succeeds
        (1..10).each do |i|
          Swarm::Migration.new.migrate
          puts "Attempt #{i} successfully terminated"
          result = 0
          break
        rescue ActiveRecord::ConcurrentMigrationError
          # Random delay so waiting containers do not all retry at the same moment
          seconds = rand(3..10)
          puts "Attempt #{i} another migration is running => sleeping #{seconds}s"
          sleep(seconds)
        rescue => e
          puts e
          e.backtrace.each { |m| puts m }
          break
        end
        exit(result)
      end
    end

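The control flow of that task (retry only when the lock is busy, give up on any other error, return a non-zero status when no attempt succeeds) can be tried without Rails. The following is only a sketch under stated assumptions: `LockBusyError` is a hypothetical stand-in for ActiveRecord::ConcurrentMigrationError, and `run_with_retries` with its injectable `wait` callable is invented here so the example does not really sleep.

```ruby
# Hypothetical stand-in for ActiveRecord::ConcurrentMigrationError.
class LockBusyError < StandardError; end

# Calls the block up to max_attempts times.
# Retries (after waiting 3 to 10 seconds) only on LockBusyError.
# Returns 0 on success, 1 on any other error or when attempts run out.
def run_with_retries(max_attempts: 10, wait: ->(seconds) { sleep(seconds) })
  max_attempts.times do
    begin
      yield
      return 0
    rescue LockBusyError
      wait.call(rand(3..10))
    rescue StandardError => e
      warn e.message
      return 1
    end
  end
  1
end

# Usage: the lock is busy on the first two attempts, then free.
calls = 0
waits = []
status = run_with_retries(wait: ->(seconds) { waits << seconds }) do
  calls += 1
  raise LockBusyError if calls < 3
end

puts status       # => 0
puts calls        # => 3
puts waits.length # => 2
```

Returning the status instead of calling exit inside the helper keeps it easy to test; the rake task can pass the result to exit.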
Then, in your startup script, just run the rake task.

    set -e
    bundle exec rails swarm:migrate
    exec bundle exec rails server -b "0.0.0.0"

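Finally, since every container runs your migration tasks, each task needs a way to tell that its work has already been done, so that it then does nothing (as db:migrate does).
With this solution, the order in which Swarm starts the containers no longer matters, and if something goes wrong, every container knows about it :-)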
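To illustrate that last requirement, here is a minimal sketch of a step that only runs once. The `StepLedger` class and the step name are hypothetical, and the record is kept in memory purely so the example is runnable; a real task would persist it in the database, the way db:migrate records applied versions in its schema_migrations table.

```ruby
require 'set'

# Hypothetical in-memory record of completed steps.
class StepLedger
  def initialize
    @done = Set.new
  end

  # Runs the block only if `name` has not been recorded yet.
  # Returns true when the block ran, false when it was skipped.
  def run_once(name)
    return false if @done.include?(name)

    yield
    @done << name
    true
  end
end

ledger = StepLedger.new
runs = 0
first  = ledger.run_once('seed_admin_user') { runs += 1 }
second = ledger.run_once('seed_admin_user') { runs += 1 }

puts first  # => true
puts second # => false
puts runs   # => 1
```

The step is recorded only after the block returns, so a step that raises is attempted again on the next run.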