Developing custom functions in Postgres with Rails

Suppose I have a complex formula for calculating the quality of a widget:

quality = 0.4(factory_quality) + 0.3(1/days_since_manufacture) + 0.3(materials_quality)

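Each of these three factors is itself a function. Computing them requires a join to the factories table, and possibly to a bills_of_materials join table and on to materials, where the associated records are averaged or otherwise aggregated.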
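To make the arithmetic concrete, here is a minimal plain-Ruby sketch of the weighted sum. The input values are made up for illustration, and the `WEIGHTS` constant and `quality` method are hypothetical names that do not come from the original code.

```ruby
# Hypothetical weights and inputs, chosen only to illustrate the arithmetic.
WEIGHTS = { factory: 0.4, age: 0.3, materials: 0.3 }.freeze

# Weighted sum of the three factors; 1.0 forces floating-point division.
def quality(factory_quality:, days_since_manufacture:, materials_quality:)
  WEIGHTS[:factory] * factory_quality +
    WEIGHTS[:age] * (1.0 / days_since_manufacture) +
    WEIGHTS[:materials] * materials_quality
end

score = quality(factory_quality: 80.0, days_since_manufacture: 10.0, materials_quality: 50.0)
puts score.round(2)
```

With these inputs the age term contributes only 0.3 * 0.1 = 0.03, so the relative scale of the three factors matters as much as the weights.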

In a Rails project, how would you manage this architecturally? a) How do you produce the correct query, and b) how do you organize the Rails code?

For the SQL, I am currently using a subquery in the FROM clause:

SELECT *,
  (0.4 * factory_quality + 0.3 * (1.0 / days_since_manufacture) + 0.3 * materials_quality) AS quality
FROM (
  SELECT widgets.*,
    ((factories.last_inspection_score + factories.variance_score) / 2.0) AS factory_quality,
    -- age in days as a number; now() - created_at alone yields an interval
    (EXTRACT(EPOCH FROM (now() - widgets.created_at)) / 86400.0) AS days_since_manufacture,
    SUM(materials.quality_score) AS materials_quality
  FROM widgets
  JOIN factories ON widgets.factory_id = factories.id
  JOIN bills_of_materials ON widgets.id = bills_of_materials.widget_id
  JOIN materials ON bills_of_materials.material_id = materials.id
  GROUP BY widgets.id, factories.id
) AS widgets;

In Rails, I have implemented this mostly with ActiveRecord:

class Widget < ActiveRecord::Base
  belongs_to :factory
  has_many :bills_of_material
  has_many :materials, through: :bills_of_material

  class << self
    # Widgets with a computed quality column, evaluated in the database.
    def with_quality
      select([
        "widgets.*",
        "(0.4 * factory_quality + 0.3 * (1.0 / days_since_manufacture) + 0.3 * materials_quality) AS quality"
      ].join(","))
      .from("(#{subquery}) AS widgets")
    end

    private

    # Inner query that computes the three factors for each widget.
    def subquery
      select([
        "widgets.*",
        "((factories.last_inspection_score + factories.variance_score) / 2.0) AS factory_quality",
        "(EXTRACT(EPOCH FROM (now() - widgets.created_at)) / 86400.0) AS days_since_manufacture",
        "SUM(materials.quality_score) AS materials_quality"
      ].join(","))
        .joins(:factory, :materials)
        .group("widgets.id, factories.id")
        .to_sql
    end
  end
end

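That said, I feel I could create a custom function in Postgres, move all of this SQL into that function, install it with a migration, and clean up the Rails code so that it looks like: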

def with_scores
  select("*,quality_score_func(id) AS quality")
end

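or something similar. But I feel that managing a constantly changing formula through database migrations would be a pain, not to mention working out what the current form of the formula is (and it would be hard to test).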
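For reference, the function-based option might look roughly like the sketch below. It is plain Ruby that only builds the DDL string. The module name, the `p_widget_id` parameter, the `bigint` id type, and the day conversion are all assumptions, and the SQL has not been run against a real schema.

```ruby
# Sketch only: the module name and the idea of keeping the function body in
# application code are assumptions, not something from the original post.
module QualityScoreFunction
  NAME = "quality_score_func"

  # Returns the DDL for the current formula. CREATE OR REPLACE lets a later
  # migration redefine the function without dropping it first.
  def self.definition
    <<~SQL
      CREATE OR REPLACE FUNCTION #{NAME}(p_widget_id bigint)
      RETURNS double precision AS $$
        SELECT (
          0.4 * ((f.last_inspection_score + f.variance_score) / 2.0)
          + 0.3 * (1.0 / (EXTRACT(EPOCH FROM (now() - w.created_at)) / 86400.0))
          + 0.3 * SUM(m.quality_score)
        )::double precision
        FROM widgets w
        JOIN factories f ON f.id = w.factory_id
        JOIN bills_of_materials bom ON bom.widget_id = w.id
        JOIN materials m ON m.id = bom.material_id
        WHERE w.id = p_widget_id
        GROUP BY w.id, f.id
      $$ LANGUAGE sql STABLE;
    SQL
  end
end

puts QualityScoreFunction.definition
```

Inside a migration the string could be passed to `execute`. A migration that references the live module would pick up later edits to the formula, so a frozen copy of the SQL per migration is probably safer, which is exactly the bookkeeping overhead described above.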

How have other people solved this? Any suggestions or tips?


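Just to clarify, I'm not trying to avoid SQL, and I believe the calculation belongs in the database for better performance. I'm just wondering whether anyone has developed a design pattern for organizing this code. - rurabe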
1 Answer

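This is the least SQL-like way I can think of to do this. I can't really test it, but hopefully it at least helps. As far as I know, if you use includes, Rails will combine the joins and eagerly load all of the associated data in one query.
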
# All of these are additional Widget instance methods; you decide if they are private
#
# Example use:
#
#   @widget = Widget.includes(:factory, :materials).find(1)
#   puts @widget.quality_score
# or
#   @widgets = Widget.includes(:factory, :materials).all
#   @widgets.each { |widget| puts widget.quality_score }

# Consider making these weights named constants
def quality_score
  0.4 * factory_quality +
    0.3 * (1.0 / days_since_manufacture) +
    0.3 * materials_quality_score
end

# Age in days; subtracting two Time values yields seconds.
def days_since_manufacture
  (Time.now - created_at) / 86_400.0
end

def factory_quality
  (factory.last_inspection_score + factory.variance_score) / 2.0
end

def materials_quality_score
  materials.inject(0) {|sum, material| sum + material.quality_score }
end

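Thanks for the answer, but I'm not looking to do more of the work in Ruby; quite the opposite. The code would be cleaner if I did the calculation in Ruby, but to run a query like "top 5 by quality" I would have to instantiate every record, compute the score for each one, sort in memory, and then throw away all but 5. I'm willing to live with ugly code to avoid that performance hit, but this seems like a common problem, and there may be a less ugly way than what I'm doing. - rurabe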
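Sorry for the misunderstanding. If that's the case, I agree that defining a Postgres function is a bad idea. Instead, I think I would define Ruby methods that generate the appropriate SQL fragments and compose them into something that can be sent to find_by_sql. I don't expect it to look much better, but it lets you separate the calculations that are likely to change from the rest of the query plumbing. - Steve Rowley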
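
One way to read the fragment-composition suggestion in the last comment is sketched below. This is plain Ruby that only builds a SQL string. The module, constant, and method names are hypothetical, and the query assumes the table and column names used in the question.

```ruby
# Plain-Ruby sketch: each factor is a named SQL fragment, and the weights
# live in one place. All module and method names here are hypothetical.
module WidgetQualitySql
  WEIGHTS = { factory_quality: 0.4, recency: 0.3, materials_quality: 0.3 }.freeze

  FACTORS = {
    factory_quality:   "(factories.last_inspection_score + factories.variance_score) / 2.0",
    recency:           "1.0 / (EXTRACT(EPOCH FROM (now() - widgets.created_at)) / 86400.0)",
    materials_quality: "SUM(materials.quality_score)"
  }.freeze

  JOINS = <<~SQL.strip
    JOIN factories ON factories.id = widgets.factory_id
    JOIN bills_of_materials ON bills_of_materials.widget_id = widgets.id
    JOIN materials ON materials.id = bills_of_materials.material_id
  SQL

  # Weighted sum of the factor fragments, e.g. "0.4 * (...) + 0.3 * (...)".
  def self.quality_expression
    WEIGHTS.map { |name, weight| "#{weight} * (#{FACTORS.fetch(name)})" }.join(" + ")
  end

  # Full query; limit is coerced with Integer() so it is safe to interpolate.
  def self.top_query(limit)
    <<~SQL
      SELECT widgets.*, #{quality_expression} AS quality
      FROM widgets
      #{JOINS}
      GROUP BY widgets.id, factories.id
      ORDER BY quality DESC
      LIMIT #{Integer(limit)}
    SQL
  end
end

puts WidgetQualitySql.top_query(5)
```

In a Rails app the result could be passed along as `Widget.find_by_sql(WidgetQualitySql.top_query(5))`, so the sorting and the limit happen in Postgres rather than in memory. Changing a weight or a factor then touches only the `WEIGHTS` or `FACTORS` hash.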
