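I benchmarked Elasticsearch using elasticsearch-php. I compared the time needed to index 10,000 documents one by one with the time needed to index the same 10,000 documents in bulk requests of 1,000 documents each.
On my VPN server, which has 3 cores and 2 GB of memory, performance is about the same with or without bulk indexing.
My PHP code (inspired by a post):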
<?php
set_time_limit(0); // no timeout
require 'vendor/autoload.php';

$es = new Elasticsearch\Client([
    'hosts' => ['127.0.0.1:9200']
]);
$max = 10000;

// ELASTICSEARCH BULK INDEX
$temps_debut = microtime(true);
for ($i = 0; $i <= $max; $i++) {
    $params['body'][] = array(
        'index' => array(
            '_index' => 'articles',
            '_type' => 'article',
            '_id' => 'cle' . $i
        )
    );
    $params['body'][] = array(
        'my_field' => 'my_value' . $i
    );
    if ($i % 1000) { // Every 1000 documents stop and send the bulk request
        $responses = $es->bulk($params);
        $params = array(); // erase the old bulk request
        unset($responses); // unset to save memory
    }
}
$temps_fin = microtime(true);
echo 'Elasticsearch bulk: ' . round($i / round($temps_fin - $temps_debut, 4)) . ' per sec <br>';

// ELASTICSEARCH WITHOUT BULK INDEX
$temps_debut = microtime(true);
for ($i = 1; $i <= $max; $i++) {
    $params = array();
    $params['index'] = 'my_index';
    $params['type'] = 'my_type';
    $params['id'] = "key" . $i;
    $params['body'] = array('testField' => 'valeur' . $i);
    $ret = $es->index($params);
}
$temps_fin = microtime(true);
echo 'Elasticsearch One by one : ' . round($i / round($temps_fin - $temps_debut, 4)) . 'per sec <br>';
?>
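Results:
Elasticsearch bulk: 1209 per sec
Elasticsearch one by one: 1197 per sec
Is there something wrong with my bulk indexing that I could fix to improve performance?
Thanks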
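One possible explanation, offered as a sketch rather than a confirmed diagnosis: in PHP, `if ($i % 1000)` is true whenever the remainder is non-zero, so the posted loop would send a bulk request on almost every iteration instead of once per 1,000 documents. That would make the "bulk" run issue nearly as many HTTP requests as the one-by-one run, which fits the similar rates. The snippet below does not talk to Elasticsearch at all. `simulateFlushes` is a hypothetical helper, not part of elasticsearch-php, that only counts how many requests each condition would trigger for `$i` running from 0 to 10000 as in the posted loop.

```php
<?php
// Hypothetical helper for illustration only (not an elasticsearch-php API).
// It replays the posted loop bounds ($i = 0 .. $max) and counts how many
// times the flush condition fires, plus how many documents stay buffered.
function simulateFlushes($max, callable $shouldFlush)
{
    $requests = 0; // number of bulk requests that would be sent
    $pending = 0;  // documents buffered since the last request
    for ($i = 0; $i <= $max; $i++) {
        $pending++;
        if ($shouldFlush($i)) {
            $requests++;
            $pending = 0;
        }
    }
    return array('requests' => $requests, 'unsent' => $pending);
}

// Condition as posted: true for every $i that is NOT a multiple of 1000.
$asPosted = simulateFlushes(10000, function ($i) {
    return (bool) ($i % 1000);
});

// Condition that matches the code comment: true only on multiples of 1000.
$strict = simulateFlushes(10000, function ($i) {
    return $i % 1000 === 0;
});

echo "as posted: {$asPosted['requests']} requests, {$asPosted['unsent']} unsent\n";
// as posted: 9990 requests, 1 unsent
echo "strict:    {$strict['requests']} requests, {$strict['unsent']} unsent\n";
// strict:    11 requests, 0 unsent
```

If this is indeed the cause, comparing against zero (`$i % 1000 === 0`) would be the change to try before re-running the benchmark. It would also be worth sending whatever is still buffered once the loop ends, since a different range or batch size can leave documents unsent.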