How do I get past "requires authentication" when connecting to a remote Cassandra cluster using SparkConf?

9

I am trying to use Apache Spark and Cassandra for data analysis, so I wrote Java code to access Cassandra running on a remote machine. This is the code I used.

import java.io.Serializable;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

import com.datastax.driver.core.Session;
import com.datastax.spark.connector.cql.CassandraConnector;

public class JavaDemo implements Serializable {
    private transient SparkConf conf;

    private JavaDemo(SparkConf conf) {
        this.conf = conf;
    }

    private void run() {
        JavaSparkContext sc = new JavaSparkContext(conf);
        generateData(sc);
        compute(sc);
        showResults(sc);
        sc.stop();
    }

    private void generateData(JavaSparkContext sc) {
        CassandraConnector connector = CassandraConnector.apply(sc.getConf());

        // Prepare the schema; the session is closed when the block exits
        try (Session session = connector.openSession()) {
            session.execute("DROP KEYSPACE IF EXISTS java_api");
            session.execute("CREATE KEYSPACE java_api WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}");
            session.execute("CREATE TABLE java_api.products (id INT PRIMARY KEY, name TEXT, parents LIST<INT>)");
            session.execute("CREATE TABLE java_api.sales (id UUID PRIMARY KEY, product INT, price DECIMAL)");
            session.execute("CREATE TABLE java_api.summaries (product INT PRIMARY KEY, summary DECIMAL)");
        }
    }

    private void compute(JavaSparkContext sc) {
        System.out.println("IN compute");
    }

    private void showResults(JavaSparkContext sc) {
        System.out.println("IN showResults");
    }

    public static void main(String[] args) {
        SparkConf conf = new SparkConf();
        conf.setAppName("Java API demo");
        conf.setMaster("local[1]");
        System.out.println("---------------------------------");
        // Address of the remote Cassandra node
        conf.set("spark.cassandra.connection.host", "192.168.1.219");

        JavaDemo app = new JavaDemo(conf);
        app.run();
    }
}

My remote host is 192.168.1.219, with Cassandra running on it, and the default port is 9160. When I run this program, I get the following error.

15/01/29 10:14:26 INFO ui.SparkUI: Started Spark Web UI at http://Justin:4040
15/01/29 10:14:27 WARN core.FrameCompressor: Cannot find LZ4 class, you should make sure the LZ4 library is in the classpath if you intend to use it. LZ4 compression will not be available for the protocol.
Exception in thread "main" com.datastax.driver.core.exceptions.AuthenticationException: Authentication error on host /192.168.1.219:9042: Host /192.168.1.219:9042 requires authentication, but no authenticator found in Cluster configuration
    at com.datastax.driver.core.AuthProvider$1.newAuthenticator(AuthProvider.java:38)
    at com.datastax.driver.core.Connection.initializeTransport(Connection.java:139)
    at com.datastax.driver.core.Connection.<init>(Connection.java:111)
    at com.datastax.driver.core.Connection$Factory.open(Connection.java:445)
    at com.datastax.driver.core.ControlConnection.tryConnect(ControlConnection.java:216)
    at com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:172)
    at com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:80)
    at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1145)
    at com.datastax.driver.core.Cluster.getMetadata(Cluster.java:313)
    at com.datastax.spark.connector.cql.CassandraConnector$.com$datastax$spark$connector$cql$CassandraConnector$$createSession(CassandraConnector.scala:166)
    at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$4.apply(CassandraConnector.scala:151)
    at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$4.apply(CassandraConnector.scala:151)
    at com.datastax.spark.connector.cql.RefCountedCache.createNewValueAndKeys(RefCountedCache.scala:36)
    at com.datastax.spark.connector.cql.RefCountedCache.acquire(RefCountedCache.scala:61)
    at com.datastax.spark.connector.cql.CassandraConnector.openSession(CassandraConnector.scala:72)
    at com.datastax.spark.demo.JavaDemo.generateData(JavaDemo.java:42)
    at com.datastax.spark.demo.JavaDemo.run(JavaDemo.java:34)
    at com.datastax.spark.demo.JavaDemo.main(JavaDemo.java:73)

Am I missing something? It connects directly to port 9042. How can I connect to it?


1
What is unclear about "Authentication error on host /192.168.1.219:9042: Host /192.168.1.219:9042 requires authentication, but no authenticator found in Cluster configuration"? - Sotirios Delimanolis
1 Answer

14

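It looks like your Cassandra cluster is configured with authentication. Since you did not provide credentials, it does not let you connect. You can pass the credentials with the spark.cassandra.auth.username and spark.cassandra.auth.password properties (the original answer links to pages that describe them in more detail).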

So adding the following to your code should make it work:

conf.set("spark.cassandra.auth.username", "cassandra");
conf.set("spark.cassandra.auth.password", "cassandra");

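If you have enabled authentication and have not yet created or changed any users, you can use 'cassandra' as both the username and the password. In production, though, you should create a separate account to use instead and change the password of the cassandra user, because that user has access to everything.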
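To keep real credentials out of the source code, one option is to read them from the environment and collect the connector properties in one place. The following is only a minimal sketch under stated assumptions: the class name `CassandraAuthSettings` and the environment variable names `CASSANDRA_USER` and `CASSANDRA_PASSWORD` are hypothetical, and the fallback to 'cassandra' is meant for a fresh test cluster only. The three property keys are the ones already used in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CassandraAuthSettings {

    // Collects the connector properties for one host and one set of credentials.
    static Map<String, String> build(String host, String username, String password) {
        if (username == null || username.isEmpty() || password == null || password.isEmpty()) {
            throw new IllegalArgumentException("Cassandra username and password must be set");
        }
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("spark.cassandra.connection.host", host);
        settings.put("spark.cassandra.auth.username", username);
        settings.put("spark.cassandra.auth.password", password);
        return settings;
    }

    public static void main(String[] args) {
        // Hypothetical variable names; the default only suits an untouched test cluster.
        String user = System.getenv().getOrDefault("CASSANDRA_USER", "cassandra");
        String pass = System.getenv().getOrDefault("CASSANDRA_PASSWORD", "cassandra");

        Map<String, String> settings = build("192.168.1.219", user, pass);

        // Print the settings with the password masked.
        settings.forEach((key, value) ->
                System.out.println(key + "=" + (key.endsWith(".password") ? "****" : value)));

        // In the Spark job itself, each entry would then be applied to the SparkConf:
        // settings.forEach((key, value) -> conf.set(key, value));
    }
}
```

In the question's `main` method, the commented-out `forEach` line would replace the individual `conf.set(...)` calls, before the `JavaSparkContext` is created.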

