When working with CQL scripts, is there a way to pass a variable into a CQL command, for example:
select * from "Column Family Name" where "ColumnName"='A variable which takes different values';
Any suggestions are welcome.
No, CQL really has no way to define variables, run loops, or drive updates/queries off of those variables.
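If you must stay with cqlsh, one common workaround is to have an outer script generate the CQL text and then run it with cqlsh -f. Here is a minimal sketch in Python; the products table, the color column, and the list of values are placeholders for illustration:

```python
import csv

# Values the "variable" should take on each run; in practice these
# might come from a CSV file or another query.
values = ["red", "green", "blue"]

# Build one SELECT per value. Single quotes inside CQL string
# literals are escaped by doubling them.
statements = [
    "select * from products where color='%s';" % v.replace("'", "''")
    for v in values
]

# Write the generated statements to a script that cqlsh can execute:
#   cqlsh -f generated.cql
with open("generated.cql", "w") as f:
    f.write("\n".join(statements) + "\n")
```

This keeps the "variable" logic in the outer script, while cqlsh itself only ever sees fully formed statements.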
As an alternative, I typically use the DataStax Python driver for simple tasks/scripts like this. Below is an excerpt from a Python script I used a while back to populate product colors from a CSV file.
import csv

from cassandra.auth import PlainTextAuthProvider
from cassandra.cluster import Cluster

# connect to Cassandra
auth_provider = PlainTextAuthProvider(username='username', password='currentHorseBatteryStaple')
cluster = Cluster(['127.0.0.1'], auth_provider=auth_provider)
session = cluster.connect('products')

# prepare statements
preparedUpdate = session.prepare(
    """
    UPDATE products.productsByItemID SET color=? WHERE itemid=? AND productid=?;
    """
)
# end prepare statements

counter = 0

# read csv file
with open(csvfilename) as csvfile:
    dataFile = csv.DictReader(csvfile, delimiter=',')
    for csvRow in dataFile:
        itemid = csvRow['itemid']
        color = csvRow['customcolor']
        productid = csvRow['productid']
        # update product color
        session.execute(preparedUpdate, [color, itemid, productid])
        counter = counter + 1

# close Cassandra connection
session.shutdown()
cluster.shutdown()

print("updated %d colors" % counter)
For more information, check out the DataStax tutorial Getting Started with Apache Cassandra and Python.
Yes, you can pass variables in the following way:
import com.datastax.spark.connector._
import com.datastax.spark.connector.cql.CassandraConnector
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, Row, SQLContext}
import org.apache.spark.sql.cassandra._
val myvar = 1
csc.setKeyspace("test_keyspace")
val query = """select a.col1, c.col4, b.col2 from test_keyspace.table1 a inner join test_keyspace.table2 b on a.col1=b.col2 inner join test_keyspace.table3 c on b.col3=c.col4 where a.col1=""" + myvar.toString
val results = csc.sql(query)
results.show()