How do I map Python pandas data types to a PostgreSQL table?

Question

python database pandas postgresql

3

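I have a pandas.DataFrame whose columns have different data types, such as object, int64, and so on.

I have already created a PostgreSQL table with the appropriate data types. I want to insert all of the data from the DataFrame into that PostgreSQL table. How should I go about this?

Note: the data in pandas comes from another source, so the data types are not specified manually by me.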

- Pranav Barve

2 Answers

2

The simplest way is to use SQLAlchemy:

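from sqlalchemy import create_engine

# Connection URL format: postgresql://user:password@host:port/database
engine = create_engine('postgresql://abc:def@localhost:5432/database')
# df is the existing DataFrame. 'replace' drops and recreates the table;
# use if_exists='append' to keep an existing table and its column types.
df.to_sql('table_name', engine, if_exists='replace')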

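You can use the if_exists option to choose what happens when the table already exists.

if_exists {'fail', 'replace', 'append'}, default 'fail'

If the table does not exist, a new table is created with the corresponding data types.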
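Since the question is about a table that already exists, here is a minimal sketch of if_exists='append'. To keep it runnable without a server, it assumes an in-memory SQLite database as a stand-in for PostgreSQL; the table name `measurements` and its columns are made up for illustration. With a SQLAlchemy engine for PostgreSQL the `to_sql` call would look the same.

```python
import sqlite3

import pandas as pd

# SQLite stands in for PostgreSQL here so the example runs without a server.
conn = sqlite3.connect(":memory:")

# The table is created up front with the column types we want (hypothetical schema).
conn.execute("CREATE TABLE measurements (name TEXT, qty INTEGER)")

df = pd.DataFrame({"name": ["a", "b"], "qty": [1, 2]})

# 'append' inserts rows into the existing table and leaves its schema alone.
# index=False keeps the DataFrame index from being written as an extra column.
df.to_sql("measurements", conn, if_exists="append", index=False)

rows = conn.execute("SELECT name, qty FROM measurements ORDER BY name").fetchall()
# Declared column types as stored in the database, unchanged by the insert.
declared = [(col[1], col[2]) for col in conn.execute("PRAGMA table_info(measurements)")]
```

The DataFrame column names have to match the table's column names, otherwise the insert fails.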

- NYC Coder

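It sounds like the table has already been created and they want to append data to it. In that case the data types are already taken care of. If I remember correctly, an exception is raised if a data entry cannot be converted to the table's data type. The more challenging problem is creating the table automatically with the correct data types, but that function does it well! - J.Warren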

Worked perfectly for me; the data types were identified and converted correctly. - Hrvoje

- rokdd · Accepted Answer

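Perhaps you have the same problem I had: I wanted to create new columns on an existing table, so the solutions that replace or append to the table did not work for me. In short, this is what it looks like in my case (I guess there is no general solution for converting the data types, and you should adapt it to your own needs):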

# Log each column together with its pandas dtype
lg.debug('table gets extended with the columns: ' + ', '.join(f'{col} ({dtype})' for col, dtype in dataframe.dtypes.items()))
# Check whether we have to add a field. Keys are the strings returned by str(dtype).
df_postgres = {'object': 'text', 'int64': 'bigint', 'float64': 'numeric', 'bool': 'boolean', 'datetime64[ns]': 'timestamp', 'timedelta64[ns]': 'interval'}
for col in dataframe.columns:
    # Convert the column's dtype to its PostgreSQL type
    dtype_name = str(dataframe.dtypes[col])
    if dtype_name in df_postgres:
        dbo.table_column_if_not_exists(self.table_name, col, df_postgres[dtype_name], original_endpoint)
    else:
        lg.error('Fieldtype ' + dtype_name + ' is not configured')
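One thing to watch with a lookup like this: str(dtype) includes the unit for time columns, for example 'datetime64[ns]', 'timedelta64[ns]' or 'datetime64[ns, UTC]'. A rough sketch of a more tolerant lookup follows. The helper name `postgres_type` and the choice of 'timestamp with time zone' for tz-aware columns are my own assumptions, not part of the answer above.

```python
import re

# Mapping keyed by the dtype name without its bracketed parameters.
# The values mirror the dictionary in the answer; adjust them to your needs.
PANDAS_TO_POSTGRES = {
    "object": "text",
    "int64": "bigint",
    "float64": "numeric",
    "bool": "boolean",
    "datetime64": "timestamp",
    "timedelta64": "interval",
}


def postgres_type(dtype_name: str) -> str:
    """Return a PostgreSQL type for a pandas dtype name such as 'datetime64[ns]'."""
    # Drop a trailing parameter list: 'datetime64[ns]' -> 'datetime64'
    base = re.sub(r"\[.*\]$", "", dtype_name)
    if base == "datetime64" and "," in dtype_name:
        # 'datetime64[ns, UTC]' carries a time zone (assumed target type)
        return "timestamp with time zone"
    try:
        return PANDAS_TO_POSTGRES[base]
    except KeyError:
        raise ValueError(f"Fieldtype {dtype_name} is not configured") from None
```

You would call it as postgres_type(str(dataframe.dtypes[col])) in place of the dictionary lookup.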

And the function that creates the column:

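def table_column_if_not_exists(self, table, name, dtype, original_endpoint=''):
    # Identifiers are concatenated unquoted, so table and name must come from trusted input
    self.query(query='ALTER TABLE ' + table + ' ADD COLUMN IF NOT EXISTS ' + name + ' ' + dtype)
    # Add a comment when we know which source created this column
    if original_endpoint != '':
        # Double any single quotes so the endpoint stays a valid SQL string literal
        comment = original_endpoint.replace("'", "''")
        self.query(query='COMMENT ON COLUMN ' + table + '.' + name + " IS '" + comment + "'")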
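Because the statements are built by string concatenation, a column name with a space or a quote in it would break them. Below is a sketch of how the SQL could be built with quoted identifiers. The helper names are hypothetical, and the code only builds the strings; it does not run them. Note that quoted identifiers are case-sensitive in PostgreSQL, so "UserId" and userid become different columns.

```python
def quote_ident(name: str) -> str:
    """Wrap an identifier in double quotes, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value: str) -> str:
    """Wrap a string in single quotes, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def add_column_statements(table: str, column: str, pg_type: str, comment: str = "") -> list:
    """Build the ALTER TABLE statement and, if a comment is given, the COMMENT ON statement."""
    # pg_type is not quoted: it is assumed to come from a fixed mapping, never from user input.
    statements = [
        f"ALTER TABLE {quote_ident(table)} ADD COLUMN IF NOT EXISTS "
        f"{quote_ident(column)} {pg_type}"
    ]
    if comment:
        statements.append(
            f"COMMENT ON COLUMN {quote_ident(table)}.{quote_ident(column)} "
            f"IS {quote_literal(comment)}"
        )
    return statements
```

Each returned string can then be passed to whatever executes your queries, such as the query method above.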