How to pass a tuple to the 'where in' clause of read_sql in pandas (Python)

Question

4

I am converting a tuple to a string and concatenating it into the query that I pass to read_sql:

sql = "select * from table1 where col1 in " + str(tuple1) + " and col2 in " + str(tuple2)

df = pd.read_sql(sql, conn)

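This worked fine, but when a tuple contains only one value the SQL fails with ORA-00936: missing expression, because a single-element tuple is rendered with a trailing comma. For example: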

tuple1 = (4011,)
tuple2 = (23,24)

The SQL statement then looks like this:

select * from table1 where col1 in (4011,) and col2 in (23,24)
                                        ^
ORA-00936: missing expression
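
The trailing comma comes from Python itself: str() of a one-element tuple keeps the comma that distinguishes it from a plain parenthesized value. A minimal sketch that reproduces the query string, using the table and column names from the question:

```python
tuple1 = (4011,)
tuple2 = (23, 24)

# str() on a one-element tuple keeps the trailing comma
single = str(tuple1)
pair = str(tuple2)

# Same concatenation as in the question
sql = "select * from table1 where col1 in " + single + " and col2 in " + pair
print(sql)
```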

Is there a better way to do this than stripping the comma with string operations?

Is there a better way to parameterize the read_sql call?

- pratish_v

3 Answers

3

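The error you are getting is caused by a SQL syntax problem.

In a WHERE col IN (...) list, a trailing comma causes a syntax error.

Either way, using string concatenation to put values into a SQL statement is frowned upon and will ultimately cause more problems.

Most Python SQL libraries support parameterized queries. I don't know which library you are using to connect, so I can't link the exact documentation, but the principle is the same as for psycopg2: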

http://initd.org/psycopg/docs/usage.html#passing-parameters-to-sql-queries

pd.read_sql exposes this as well, so to do what you want safely you can write:

sql = "select * from table1 where col1 in %s and col2 in %s"

df = pd.read_sql(sql, conn, params = [tuple1, tuple2])
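
Whether a single placeholder can stand for a whole tuple depends on the driver. A more portable variant generates one placeholder per value and passes the values as a flat sequence. The sketch below is only an illustration: it uses the standard-library sqlite3 module with its `?` placeholder style (an assumption, since the question's database is different), and `placeholders` is a hypothetical helper, not part of pandas or any driver.

```python
import sqlite3


def placeholders(values):
    """Return '(?, ?, ...)' with one qmark per value (hypothetical helper)."""
    if not values:
        raise ValueError("IN list must not be empty")
    return "(" + ", ".join("?" for _ in values) + ")"


tuple1 = (4011,)
tuple2 = (23, 24)

# In-memory table standing in for table1 from the question
conn = sqlite3.connect(":memory:")
conn.execute("create table table1 (col1 integer, col2 integer)")
conn.executemany(
    "insert into table1 values (?, ?)",
    [(4011, 23), (4011, 99), (5000, 24)],
)

# Only placeholders are formatted into the SQL text; values are bound separately
sql = "select * from table1 where col1 in {} and col2 in {}".format(
    placeholders(tuple1), placeholders(tuple2)
)
rows = conn.execute(sql, tuple1 + tuple2).fetchall()
print(sql)
print(rows)
```

The same sql string and flattened values could presumably be handed to pd.read_sql(sql, conn, params=list(tuple1 + tuple2)), with the placeholder style adjusted to whatever the driver in use expects.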

- greg_data

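Nice approach, thanks, but I still get an error with this method: pandas.io.sql.DatabaseError: Execution failed on sql: Variable_TypeByValue(): unhandled data type tuple. - pratish_v

I gather from your comment that you are probably using cx_Oracle. I can't find anything in the documentation about why tuples wouldn't work; http://cx-oracle.readthedocs.io/en/latest/cursor.html#Cursor.execute says sequences are fine. Maybe just try converting each tuple to a list first? - greg_data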
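
The error above suggests the driver is trying to bind each whole tuple as a single value. One possible workaround, assuming an Oracle-style driver that accepts numbered bind placeholders (:1, :2, ...), is to emit one placeholder per element and pass a flat list of values. `build_in_query` is a hypothetical helper written for this sketch; it only builds the SQL text and the parameter list and does not talk to a database.

```python
def build_in_query(tuple1, tuple2):
    """Build SQL with numbered binds plus a flat parameter list (hypothetical helper)."""
    if not tuple1 or not tuple2:
        raise ValueError("IN lists must not be empty")
    values = list(tuple1) + list(tuple2)
    n1 = len(tuple1)
    # Placeholders :1..:n1 for col1, then :n1+1..:n for col2
    binds1 = ", ".join(":{}".format(i) for i in range(1, n1 + 1))
    binds2 = ", ".join(":{}".format(i) for i in range(n1 + 1, len(values) + 1))
    sql = "select * from table1 where col1 in ({}) and col2 in ({})".format(
        binds1, binds2
    )
    return sql, values


sql, params = build_in_query((4011,), (23, 24))
print(sql)
print(params)
```

The resulting pair would then be passed along as pd.read_sql(sql, conn, params=params); whether the driver accepts this placeholder style is an assumption to verify against its documentation.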

-1

select * from table_name where 1=1 and (column_a, column_b) not in ((28,1),(25,1))

- Ashish Kumar

- Clíodhna · Accepted Answer

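There may be a better way, but I would add an if statement around the query and use .format() instead of + to parameterize the query.

A possible if statement: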

if len(tuple1) < 2:
    tuple1 = tuple1[0]

This will vary depending on your input. If you have a list of tuples, you could do this:

tuples = [(4011,), (23, 24)]
new_t = []
for t in tuples:
    if len(t) == 2:
        new_t.append(t)
    elif len(t) == 1:
        new_t.append(t[0])

Output:

[4011, (23, 24)]

A way to parameterize the query better using .format():

sql = "select * from table1 where col1 in {} and col2 in {}".format(str(tuple1), str(tuple2))

Hope this helps!
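
If the values really must be formatted into the SQL text, one way to avoid the trailing comma without special-casing the tuple length is to join the elements yourself. This is a sketch only: `sql_in_list` is a hypothetical helper, it assumes the values are integers, and it is not a substitute for bind parameters when the values come from untrusted input.

```python
def sql_in_list(values):
    """Render integers as a parenthesized SQL list, e.g. (4011) (hypothetical helper)."""
    if not values:
        raise ValueError("IN list must not be empty")
    # int() rejects anything that is not numeric before it reaches the SQL text
    return "(" + ", ".join(str(int(v)) for v in values) + ")"


tuple1 = (4011,)
tuple2 = (23, 24)

sql = "select * from table1 where col1 in {} and col2 in {}".format(
    sql_in_list(tuple1), sql_in_list(tuple2)
)
print(sql)
```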