Removing old indices in Elasticsearch

39

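I have many of my logs indexed in logstash-Year-Week format. If I want to delete indices older than a few weeks, how can I do that in Elasticsearch? Is there an easy, seamless way to do it?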

10 Answers

34

This is exactly what I was looking for. Do you have documentation for the Curator application? - steven johns
9
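This does not work with Curator v4 or newer. Those versions need a configuration file and an action file that describes what Curator should do. - Eduardo Bezerra
See @sachchit-bansal's answer for a working Curator 4.2 example. - chrisan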

28
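If you are using Elasticsearch 5.x, you need to install Curator 4.x. The documentation lists version compatibility and the installation steps.
Once it is installed, just run the command: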
curator --config path/config_file.yml [--dry-run] path/action_file.yml

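Curator provides a --dry-run flag that only reports what Curator would have done. The output goes to the log file defined in config_file.yml; if no logfile key is defined under logging, Curator writes to the console instead. To actually delete the indices, run the command above without the --dry-run flag.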

The configuration file config_file.yml is:

---
client:
  hosts:
   - 127.0.0.1
  port: 9200
logging:
  loglevel: INFO
  logfile: "/root/curator/logs/actions.log"
  logformat: default
  blacklist: ['elasticsearch', 'urllib3']

The action file action_file.yml is:

---
actions:
  1:
    action: delete_indices
    description: >-
      Delete indices older than 7 days (based on index name), for logstash-
      prefixed indices. Ignore the error if the filter does not result in an
      actionable list of indices (ignore_empty_list) and exit cleanly.
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: logstash-
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 7
      exclude:
If you want to delete indices automatically on a weekly, monthly, or other schedule, write a bash script like this:

#!/bin/bash
# Script to delete old Elasticsearch log indices on a schedule

# Deletes the logstash- indices older than 7 days, as defined in action_file.yml
curator --config /path/config_file.yml /path/action_file.yml

Put the shell script in one of these folders: /etc/cron.daily, /etc/cron.hourly, /etc/cron.monthly, or /etc/cron.weekly, make it executable, and your job is done.

NOTE: Make sure to use the correct indentation in your configuration and action files; otherwise Curator will not work.


2
Thanks, this is the currently working (2017) answer for Curator 4.2 :) - chrisan
That's the way Curator works! Vineeth Mohan's answer is outdated as of Curator 4.x; this one should fit most Elasticsearch installations (currently 5.x). - jonashackt

18

I use a bash script; just change the 30 to the number of days you want to keep.

#!/bin/bash

# Zero padded days using %d instead of %e
DAYSAGO=$(date --date="30 days ago" +%Y%m%d)
# Quote the URL so the shell does not treat "?" as a glob character
ALLLINES=$(/usr/bin/curl -s -XGET "http://127.0.0.1:9200/_cat/indices?v" | grep -E logstash)

echo
echo "THIS IS WHAT SHOULD BE DELETED FOR ELK:"
echo

echo "$ALLLINES" | while read LINE
do
  FORMATEDLINE=`echo $LINE | awk '{ print $3 }' | awk -F'-' '{ print $2 }' | sed 's/\.//g' ` 
  if [ "$FORMATEDLINE" -lt "$DAYSAGO" ]
  then
    TODELETE=`echo $LINE | awk '{ print $3 }'`
    echo "http://127.0.0.1:9200/$TODELETE"
  fi
done

echo
echo -n "If this makes sense, Y to continue, N to exit [Y/N]: "
read INPUT
if [ "$INPUT" == "Y" ] || [ "$INPUT" == "y" ] || [ "$INPUT" == "yes" ] || [ "$INPUT" == "YES" ]
then
  echo "$ALLLINES" | while read LINE
  do
    FORMATEDLINE=`echo $LINE | awk '{ print $3 }' | awk -F'-' '{ print $2 }' | sed 's/\.//g' `
    if [ "$FORMATEDLINE" -lt "$DAYSAGO" ]
    then
      TODELETE=`echo $LINE | awk '{ print $3 }'`
      /usr/bin/curl -XDELETE http://127.0.0.1:9200/$TODELETE
      sleep 1
    fi
  done
else 
  echo SCRIPT CLOSED BY USER, BYE ...
  echo
  exit
fi
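
The script above hits an "integer expression expected" error in the -lt test whenever an index name carries no date (for example logstash-monitor), because the extracted string is empty. A minimal sketch of a stricter check, assuming daily indices whose names end in YYYY.MM.DD; the function names index_date and is_older_than are made up for illustration:

```shell
#!/bin/bash
# Hypothetical helpers; they assume index names end in YYYY.MM.DD.

# Print the date part of an index name as YYYYMMDD, or nothing at all
# if the name does not end in a YYYY.MM.DD date.
index_date() {
  printf '%s\n' "$1" |
    sed -n 's/^.*\([0-9]\{4\}\)\.\([0-9]\{2\}\)\.\([0-9]\{2\}\)$/\1\2\3/p'
}

# Succeed only if the index name has a date and that date is before the
# cutoff (YYYYMMDD, e.g. from: date --date="30 days ago" +%Y%m%d).
is_older_than() {
  _d=$(index_date "$1")
  [ -n "$_d" ] && [ "$_d" -lt "$2" ]
}

# Usage inside the loops above:
#   if is_older_than "$TODELETE" "$DAYSAGO"; then ... fi
```

Indices without a date are simply skipped instead of producing an error on every iteration.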

1
It actually works perfectly, without installing an extra tool such as Curator. - Piotr Dawidiuk

16

As of Elasticsearch 6.6, Index Lifecycle Management is included in the basic (free) version of Elasticsearch, and it does what Curator used to do, but in a more graceful way.

The steps below are reproduced without permission from Martin Ehrnhöfer's excellent and concise blog post.

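Assumptions (heads up for copy-pasters):

  • Your Elasticsearch server is reachable at http://elasticsearch:9200
  • You want your indices to be purged after 30 days (30d)
  • Your policy will be named cleanup_policy
  • Your filebeat index names start with filebeat-
  • Your logstash index names start with logstash-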

1. Create a policy that deletes indices after one month

curl -X PUT "http://elasticsearch:9200/_ilm/policy/cleanup_policy?pretty" \
     -H 'Content-Type: application/json' \
     -d '{
      "policy": {                       
        "phases": {
          "hot": {                      
            "actions": {}
          },
          "delete": {
            "min_age": "30d",           
            "actions": { "delete": {} }
          }
        }
      }
    }'

2. Apply this policy to all existing filebeat and logstash indices

curl -X PUT "http://elasticsearch:9200/logstash-*/_settings?pretty" \
     -H 'Content-Type: application/json' \
     -d '{ "lifecycle.name": "cleanup_policy" }'
curl -X PUT "http://elasticsearch:9200/filebeat-*/_settings?pretty" \
     -H 'Content-Type: application/json' \
     -d '{ "lifecycle.name": "cleanup_policy" }'

3. Create a template that applies this policy to new filebeat and logstash indices

curl -X PUT "http://elasticsearch:9200/_template/logging_policy_template?pretty" \
     -H 'Content-Type: application/json' \
     -d '{
      "index_patterns": ["filebeat-*", "logstash-*"],                 
      "settings": { "index.lifecycle.name": "cleanup_policy" }
    }'
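
To confirm that the policy is actually attached, the ILM explain API (GET <index>/_ilm/explain) reports the policy name and current phase of each matching index. A small sketch, assuming the same http://elasticsearch:9200 address as in the steps above; the function name ilm_explain and the ES_URL variable are illustrative only:

```shell
#!/bin/bash
# Assumed base URL, matching the examples above; override via ES_URL.
ES_URL="${ES_URL:-http://elasticsearch:9200}"

# Print the ILM status of the indices matching the given pattern.
# The pattern is quoted so it reaches Elasticsearch, not the shell glob.
ilm_explain() {
  curl -s -X GET "${ES_URL}/$1/_ilm/explain?pretty"
}

# Usage (requires a reachable cluster):
#   ilm_explain 'logstash-*'
```

An index that lists cleanup_policy as its policy has picked up the setting from step 2 or step 3.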

7

Take a look at Curator, a tool developed specifically for this kind of use case.

Here is a sample command from the documentation:

curator --host 10.0.0.2 delete indices --older-than 30 --time-unit days \
   --timestring '%Y.%m.%d'

6
You can use curl:
 curl -X DELETE http://localhost:9200/filebeat-$(date +"%Y.%m.%d" -d "last Month")

Put the command above in an xxx.sh file, then create the crontab entry with crontab -e:

00 00 * * * /etc/elasticsearch/xxx.sh

This cron job runs every day at 12 AM (midnight) and deletes the old logs.
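
Note that the curl command removes only the single index dated exactly one month ago, so a missed cron run leaves that day's index behind. A hedged sketch that lists the index names for a range of days instead, assuming GNU date and daily filebeat-YYYY.MM.DD indices; old_index_names and its parameters are invented for illustration:

```shell
#!/bin/bash
# old_index_names PREFIX REF_DATE FIRST LAST
# Print PREFIX-YYYY.MM.DD for every day from FIRST to LAST days before
# REF_DATE (given as YYYY-MM-DD). Requires GNU date.
old_index_names() {
  _n=$3
  while [ "$_n" -le "$4" ]; do
    printf '%s-%s\n' "$1" "$(date -u -d "$2 $_n days ago" +%Y.%m.%d)"
    _n=$((_n + 1))
  done
}

# Usage: delete the indices from 30 to 37 days ago (requires a cluster):
#   for idx in $(old_index_names filebeat "$(date +%F)" 30 37); do
#     curl -X DELETE "http://localhost:9200/$idx"
#   done
```

A DELETE for an index that no longer exists just returns a 404, so overlapping ranges between runs are harmless.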

1
this is all you need. - visualex
Thanks, this helped me! In my case I used it like this, running echo first to check the date: echo logstash-$(date +"%Y.%m" -d "-4 month").* - Sabuhi Shukurov

1
curator_cli delete_indices --filter_list '{"filtertype":"none"}' 

will delete everything. To filter instead, use:

 --filter_list '[{"filtertype":"age","source":"creation_date","direction":"older","unit":"days","unit_count":13},{"filtertype":"pattern","kind":"prefix","value":"logstash"}]'
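
Quoting the JSON filter by hand is error-prone. A small sketch that builds the same --filter_list value from a prefix and an age in days; make_filter_list is a hypothetical helper, not part of Curator, and it assumes the prefix contains no characters that need JSON escaping:

```shell
#!/bin/bash
# make_filter_list PREFIX DAYS
# Print a Curator filter list (JSON) that matches indices whose names
# start with PREFIX and whose creation date is older than DAYS days.
make_filter_list() {
  printf '[{"filtertype":"pattern","kind":"prefix","value":"%s"},' "$1"
  printf '{"filtertype":"age","source":"creation_date","direction":"older","unit":"days","unit_count":%d}]\n' "$2"
}

# Usage (requires Curator and a reachable cluster):
#   curator_cli delete_indices --filter_list "$(make_filter_list logstash 13)"
```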

0

yanb (yet another bash)

#!/bin/bash
searchIndex=logstash-monitor
elastic_url=localhost
elastic_port=9200

date2stamp () {
    date --utc --date "$1" +%s
}

dateDiff (){
    case $1 in
        -s)   sec=1;      shift;;
        -m)   sec=60;     shift;;
        -h)   sec=3600;   shift;;
        -d)   sec=86400;  shift;;
        *)    sec=86400;;
    esac
    dte1=$(date2stamp $1)
    dte2=$(date2stamp $2)
    diffSec=$((dte2-dte1))
    if ((diffSec < 0)); then abs=-1; else abs=1; fi
    echo $((diffSec/sec*abs))
}

for index in $(curl -s "${elastic_url}:${elastic_port}/_cat/indices?v" | grep -E " ${searchIndex}-20[0-9][0-9]\.[0-1][0-9]\.[0-3][0-9]" | awk '{ print $3 }'); do
  date=$(echo ${index: -10} | sed 's/\./-/g')
  cond=$(date +%Y-%m-%d)
  diff=$(dateDiff -d $date $cond)
  echo -n "${index} (${diff})"
  if [ $diff -gt 1 ]; then
    echo " / DELETE"
    # curl -XDELETE "${elastic_url}:${elastic_port}/${index}?pretty"
  else
    echo ""
  fi
done    

0

Curator didn't help me

Running Curator with the command below now gives me an error:

curator --config config_file.yml action_file.yml

Error:

Error: Elasticsearch version 7.9.1 incompatible with this version of Curator (5.2.0)

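I could not find a Curator version compatible with Elasticsearch 7.9.1, and I cannot simply upgrade or downgrade Elasticsearch. So I took @Alejandro's answer and got the job done with the script below, which I modified slightly.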

Script solution

#!/bin/bash

# Zero padded days using %d instead of %e
DAYSAGO=$(date --date="30 days ago" +%Y%m%d)
# Quote the URL so the shell does not treat "?" as a glob character
ALLLINES=$(/usr/bin/curl -s -XGET "http://127.0.0.1:9200/_cat/indices?v")
# Add -u <username>:<password> to the curl commands if your Elasticsearch requires credentials. You can also append a grep to keep only specific indices.

echo
echo "THIS IS WHAT SHOULD BE DELETED FOR ELK:"
echo

echo "$ALLLINES" | while read LINE
do
  FORMATEDLINE=`echo $LINE | awk '{ print $3 }' | grep -Eo "[0-9]{4}.[0-9]{2}.[0-9]{2}" | sed 's/\.//g'`
  if [ "$FORMATEDLINE" -lt "$DAYSAGO" ]
  then
    TODELETE=`echo $LINE | awk '{ print $3 }'`
    echo "http://127.0.0.1:9200/$TODELETE"
  fi
done

echo
echo -n "Y to continue N to exit [Y/N]:"
read INPUT
if [ "$INPUT" == "Y" ] || [ "$INPUT" == "y" ] || [ "$INPUT" == "yes" ] || [ "$INPUT" == "YES" ]
then
  echo "$ALLLINES" | while read LINE
  do
    FORMATEDLINE=`echo $LINE | awk '{ print $3 }' | grep -Eo "[0-9]{4}.[0-9]{2}.[0-9]{2}" | sed 's/\.//g'`
    if [ "$FORMATEDLINE" -lt "$DAYSAGO" ]
    then
      TODELETE=`echo -n $LINE | awk '{ print $3 }'`
      /usr/bin/curl -XDELETE http://127.0.0.1:9200/$TODELETE
      sleep 1
    fi
  done
else
  echo SCRIPT CLOSED BY USER, BYE ...
  echo
  exit
fi

-1
In my case, removing the old indices was necessary because I had upgraded from 5.x to 7.5. So I followed a simple step to clear them:
rm -rf /var/lib/elasticsearch/nodes/0/indices/*

2
But doing anything that bypasses the official API is a bad idea. - ipeacocks
