pg_dump序列数据类型问题

Question

pg_dump序列数据类型问题

7

请问有人可以解释一下为什么使用以下脚本创建的PostgreSQL表：

CREATE TABLE users
(
  "id" serial NOT NULL,
  "name" character varying(150) NOT NULL,
  "surname" character varying (250) NOT NULL,
  "dept_id" integer NOT NULL,
  CONSTRAINT users_pkey PRIMARY KEY ("id")
)

pg_dump以以下格式转储：

CREATE TABLE users(
      "id" integer NOT NULL,
      "name" character varying(150) NOT NULL,
      "surname" character varying (250) NOT NULL,
      "dept_id" integer NOT NULL
    );

ALTER TABLE users OWNER TO postgres;

CREATE SEQUENCE "users_id_seq"
        START WITH 1
        INCREMENT BY 1
        NO MINVALUE
        NO MAXVALUE
        CACHE 1;

ALTER TABLE "users_id_seq" OWNER TO postgres;
ALTER SEQUENCE "users_id_seq" OWNED BY users."id";
ALTER TABLE ONLY users
ADD CONSTRAINT users_pkey PRIMARY KEY ("id");

很显然，以上只是转储文件的一个小片段。

为什么pg_dump会将数据类型serial转换为integer？当我从转储的SQL文件还原数据库时，它几乎变得无用，因为自增长停止工作，当从前端表单添加新记录时，它会失败并显示“id字段不能为空”的错误消息，很明显这是由于它是一个非空的主键，但自动递增应该启动并使用序列中的下一个值填充字段。

我错过了什么吗？

- Peter

真是巧合：https://dev59.com/rdTvs4cB2Jgan1znIo8C（我猜现在还是九月吧？）回答这个问题：序列*不存在*，序列只是整数的简写形式，带有默认值（序列）。通常，pg_dump输出以下代码：（1）表DDL，然后（2）插入数据，然后（3）创建序列+ 将其附加到整数字段+（4）将序列设置为最大数字。 - joop

哇！现在不是九月，但今天也有人问了一个几乎和我的问题一模一样的问题 :）。在我的情况下，序列也未设置（SELECT pg_catalog.setval('"users_id_seq"', 1, false)），而它应该是178。那么，在恢复后，我如何确保数据库的自动递增与恢复之前一样正常工作？ - Peter

显然，以上只是转储文件的一个小片段。请滚动到转储文件的末尾（使用您喜欢的编辑器;-)），并且：也许将postgres + pg_dump版本号添加到您的问题中。顺便问一下，您的输出是来自完整的pg_dump，还是带有“--schema-only”标志？ - joop

实际上在顶部而不是底部：-- 从数据库版本9.4.4转储 - 由pg_dump版本9.4.4转储。我进行整个转储，我的唯一选项是-F p --插入。 - Peter

3个回答

1

好的，在这里它可以工作。测试片段：

DROP SCHEMA tmpdump ;
CREATE SCHEMA tmpdump ;
set search_path = tmpdump ;

-- SELECT version();
DROP TABLE lusers;
CREATE TABLE lusers
(
  "id" serial NOT NULL
  , "name" character varying(150) NOT NULL
  , "surname" character varying (250) NOT NULL
  , "dept_id" integer NOT NULL
  , CONSTRAINT lusers_pkey PRIMARY KEY ("id")
);
INSERT INTO lusers ("name", "surname", "dept_id") VALUES
         ('Joop', 'Zoetemelk', 2) , ('Jan', 'Jansen', 3)
         , ('Ard', 'Schenk', 4) , ('Kees', 'Verkerk', 5);

仅使用以下命令转储tmpdump模式：

pg_dump -U someusername yourdbname -n tmpdump -F p --inserts | less

- joop

我认为我找到了问题所在。转储是使用--inserts完成的，而ID是插入语句的一部分。但是，如果我不这样做，我就会面临数据完整性问题的风险。 - Peter

0

这是我编写的一个脚本，用于将数据库导出为手动编写的格式：

#!/usr/bin/env python2
from __future__ import print_function

import re
import sys
from subprocess import check_output

re_seq_name = re.compile(r'OWNED BY .*\.(.+);')
re_pk = re.compile(r'\s*ADD (CONSTRAINT [\w_]+ PRIMARY KEY .*);')
re_fk = re.compile(
    r'\s*ADD (CONSTRAINT [\w_]+ FOREIGN KEY .*);')
re_fk_tbl = re.compile(r'FOREIGN KEY .* REFERENCES ([\w_]+)\s*\([\w_]+\)')
re_unique = re.compile(r'\s*ADD (CONSTRAINT [\w_]+ UNIQUE .*);')
re_tbl = re.compile(r'\s*CREATE TABLE IF NOT EXISTS ([\w_]+)')

env = {"PGHOST": "127.0.0.1",
       "PGDATABASE": "kadir",
       "PGUSER": "postgres",
       "PGPASSWORD": "password"}


def main():
    result = check_output(
        ["psql", "-tA", "-F,", "-c", r"\dt public.*"], env=env)

    table_names = []

    for line in result.split('\n'):
        if not line:
            continue
        table_name = line.split(",")[1]
        table_names.append(table_name)

    create_stmts = {}  # record all create table statement for topological sort
    tables_deps = []

    for table_name in table_names:
        result = check_output(["pg_dump", "-sx", "-t", table_name], env=env)

        sequences = {}
        constraints = []
        indexes = []
        lines = []
        for line in result.split("\n"):
            # remove unnecessary lines
            if not line:
                continue
            if line.startswith("--"):
                continue
            if line.startswith("SET"):
                continue
            if line.startswith("SELECT"):
                continue
            line = line.replace("CREATE TABLE", "CREATE TABLE IF NOT EXISTS")
            line = line.replace("public.", "")
            line = line.replace("timestamp without time zone", "timestamp")
            line = line.replace("character varying", "varchar")
            lines.append(line)

        # record sequences, constraints, indexes and fk deps
        deps = []
        for line in lines:
            if line.strip().startswith("CREATE SEQUENCE"):
                seq_name = line.replace("CREATE SEQUENCE", "").strip()
                seq_col_line = [
                    l for l in lines if seq_name in l and "ALTER SEQUENCE" in l]
                if len(seq_col_line) != 1:
                    raise Exception("expect one element")
                seq_col = re_seq_name.findall(seq_col_line[0])[0]
                sequences[seq_name] = seq_col
            if "PRIMARY KEY" in line:
                constraints.append(re_pk.findall(line)[0])
            if "FOREIGN KEY" in line:
                constraints.append(re_fk.findall(line)[0])
                deps.append(re_fk_tbl.findall(line)[0])
            if "UNIQUE" in line:
                constraints.append(re_unique.findall(line)[0])
            if line.strip().startswith("CREATE INDEX"):
                line = line.replace("USING btree ", "")
                line = line.replace(
                    "CREATE INDEX", "CREATE INDEX IF NOT EXISTS")
                indexes.append(line)

        tables_deps.append((table_name, deps))

        # extract create table statement
        start_index = (i for i, s in enumerate(lines)
                       if "CREATE TABLE" in s).next()
        end_index = (i for i, s in enumerate(lines)
                     if s.strip().startswith(");")).next()
        create_stmt = lines[start_index:end_index+1]

        # populate sequences
        for seq, col in sequences.items():
            (index, line) = ((i, s) for i, s in enumerate(create_stmt)
                             if s.strip().startswith(col)).next()
            if "bigint" in line:
                line = line.replace("bigint", "bigserial")
            elif "integer" in line:
                line = line.replace("integer", "serial")
            create_stmt[index] = line

        # insert constraints
        constraints = ["    "+c.strip() for c in constraints]
        constraints[:-1] = [c+"," for c in constraints[:-1]]

        create_stmt[end_index-1] = create_stmt[end_index-1]+",\n"
        create_stmt[end_index:end_index] = constraints
        create_stmt.extend(indexes)

        create_stmts[table_name] = create_stmt

    result = topological_sort(tables_deps)

    for table_name in result:
        for line in create_stmts[table_name]:
            print(line)
        print("\n")


def topological_sort(items):
    """shape of items: [(item1, (item2, item3))]"""
    result = []
    provided = set()
    remaining_items = list(items)
    all_items = set([i[0] for i in items])
    while remaining_items:
        emitted = False
        for i in remaining_items:
            item, dependencies = i
            dependencies = set(dependencies)
            if dependencies.issubset(provided):
                result.append(item)
                remaining_items.remove(i)
                provided.add(item)
                emitted = True
                break
        if not emitted:
            print("[Error]Dependency not found or cyclic dependency found:")
            # print("Found dependencies:", ", ".join(provided))
            for i in remaining_items:
                item, dependencies = i
                not_met = set(dependencies).difference(all_items)
                if not_met:
                    print("  ", item, "depends on", ", ".join(dependencies))
                    print("  dependency not met:", ", ".join(not_met))
            sys.exit(1)

    return result


if __name__ == '__main__':
    main()

示例输出：

CREATE TABLE IF NOT EXISTS competency (
    id bigserial NOT NULL,
    competency_title varchar(64) DEFAULT NULL::varchar,
    competency_description varchar(2048),
    competency_category_id bigint,
    created_by bigint,
    created_date timestamp DEFAULT now(),
    changed_date timestamp DEFAULT now(),
    deleted_date timestamp,

    CONSTRAINT competency_competency_title_unique UNIQUE (competency_title),
    CONSTRAINT competency_pk PRIMARY KEY (id),
    CONSTRAINT competency_competency_category_id_fk FOREIGN KEY (competency_category_id) REFERENCES competency_category(id),
    CONSTRAINT competency_created_by_fk FOREIGN KEY (created_by) REFERENCES user_profile(user_id)
);
CREATE INDEX competency_competency_title_index ON competency (competency_title);

- nemo

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- Valery Viktorovsky · Accepted Answer

来自文档:

smallserial、serial 和 bigserial 数据类型不是真正的类型，仅仅是为了方便创建唯一标识列（类似于其他一些数据库支持的 AUTO_INCREMENT 属性）。在当前实现中，指定：

CREATE TABLE tablename (
    colname SERIAL
);

等同于指定：

CREATE SEQUENCE tablename_colname_seq;
CREATE TABLE tablename (
    colname integer NOT NULL DEFAULT nextval('tablename_colname_seq')
);
ALTER SEQUENCE tablename_colname_seq OWNED BY tablename.colname;

因此，我们创建了一个整数列，并安排其默认值从序列生成器中分配。应用NOT NULL约束以确保不能插入空值。（在大多数情况下，您还需要附加UNIQUE或PRIMARY KEY约束，以防止意外插入重复值，但这不是自动的。）最后，序列被标记为“拥有者”，因此如果删除该列或表，则序列将被删除。