Carbon聚合器旨在将多个指标的值汇总到一个输出指标中,通常是为了提高图形呈现性能。而Statsd旨在将多个数据点汇总到一个指标中,否则Graphite只会存储最后一个报告的值在最小存储分辨率中。
发送大约100个数据点,值为1。如果没有statsd,graphite将仅在每个10秒桶中存储收到的最后一个计数。在接收到第100条消息之后,其他99个数据点将被丢弃。但是,如果您在应用程序和graphite之间放置了statsd,则它将在将总计发送到graphite之前将所有100个数据点相加。因此,没有statsd时,您的图表仅指示在10秒间隔期间是否发生了登录。有了statsd,则表示在该间隔期间发生了多少次登录。(
等)。在您的运营仪表板上,您使用此Graphite查询显示所有服务器的平均值图表:weightedAverage(services.login.server*.response.time, services.login.server*.response.count, 2)。不幸的是,呈现此图表需要10秒钟。为解决此问题,您可以添加一个碳汇聚器规则来预先计算所有服务器的平均值并将该值存储在新指标中。现在,您可以更新您的仪表板以简单地拉取单个指标(例如
the aggregation rules in storage-aggregation.conf apply to all storage intervals in storage-schemas.conf except the first (smallest) retention period for each retention string. it is possible to use carbon-aggregator to aggregate data points within a metric for that first retention period. unfortunately, aggregation-rules.conf uses "blob" patterns rather than regex patterns. so you need to add a separate aggregation-rules.conf file entry for every path depth and aggregation type. the advantage of statsd is that the client sending the metric can specify the aggregation type rather than encoding it in the metric path. that gives you the flexibility to add a new metric on the fly regardless of metric path depth. if you wanted to configure carbon-aggregator to do statsd-like aggregation automatically when you add a new metric, your aggregation-rules.conf file would look something like this:
<n1>.avg (10)= avg <n1>.avg$
<n1>.count (10)= sum <n1>.count$
<n1>.<n2>.avg (10)= avg <n1>.<n2>.avg$
<n1>.<n2>.count (10)= sum <n1>.<n2>.count$
<n1>.<n2>.<n3>.avg (10)= avg <n1>.<n2>.<n3>.avg$
<n1>.<n2>.<n3>.count (10)= sum <n1>.<n2>.<n3>.count$
...
<n1>.<n2>.<n3> ... <n99>.count (10)= sum <n1>.<n2>.<n3> ... <n99>.count$
notes: the trailing "$" is not needed in graphite 0.10+ (currently pre-release) here is the relevant patch on github and here is the standard documentation on aggregation rules
the weightedAverage function is new in graphite 0.10, but generally the averageSeries function will give a very similar number as long as your load is evenly balanced. if you have some servers that are both slower and service fewer requests or you are just a stickler for precision, then you can still calculate a weighted average with graphite 0.9. you just need to build a more complex query like this:
divideSeries(sumSeries(multiplySeries(a.time,a.count), multiplySeries(b.time,b.count)),sumSeries(a.count, b.count))
if statsd is run on the client box this also reduces network load. although, in theory, you could run carbon-aggregator on the client side too. however, if you use one of the statsd client libraries, you can also use sampling to reduce the load on your application machine's cpu (e.g. creating loopback udp packets). furthermore, statsd can automatically perform multiple different aggregations on a single input metric (sum, mean, min, max, etcetera)
if you use statsd on each app server to aggregate response times, and then re-aggregate those values on the graphite server using carbon aggregator, you end up with an average response time weighted by app server rather than request. obviously, this only matters for aggregating using a mean or top_90 aggregation rule, and not min, max or sum. also, it only matters for mean if your load is unbalanced. as an example: assume you have a cluster of 100 servers, and suddenly 1 server is sent 99% of the traffic. consequentially, the response times quadruple on that 1 server, but remain steady on the other 99 servers. if you use client side aggregation, your overall metric would only go up about 3%. but if you do all your aggregation in a single server-side carbon aggregator, then your overall metric would go up by about 300%.
carbon-c-relay is essentially a drop-in replacement for carbon-aggregator written in c. it has improved performance and regex-based matching rules. the upshot being that you can do both statsd-style datapoint aggregation and carbon-relay style metric aggregation and other neat stuff like multi-layered aggregation all in the same simple regex-based config file.
if you use the cyanite back-end instead of carbon-cache, then cyanite will do the intra-metric averaging for you in memory (as of version 0.5.1) or at read time (in the version <0.1.3 architecture).