Merging multiple CSV files

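I have a folder containing multiple CSV files. Each file has two columns: date and value. I'd like to merge all of them into a single file in which the first column holds the dates (identical in every file) and the remaining columns hold the values from each individual file, i.e. (date, value_file1, value_file2, ...).
Do you have any suggestions for doing this with a simple Python script, or even a Unix command?
Thanks for your help!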

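Possibly a duplicate of the question on merging two CSV files in Python. - A.J. Uppal
There, the second file's columns are simply appended to the first file's columns. I want the second file's column to be placed in a new column of the merged file, not appended to the existing column. - user3551674
This isn't very clear. Can you explain more precisely what the input CSV files look like? Where/when are the dates the same? - StackG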
1 Answer


I'd suggest using a tool like csvjoin from csvkit.

$ pip install csvkit
$ csvjoin --help
usage: csvjoin [-h] [-d DELIMITER] [-t] [-q QUOTECHAR] [-u {0,1,2,3}] [-b]
               [-p ESCAPECHAR] [-z MAXFIELDSIZE] [-e ENCODING] [-S] [-v] [-l]
               [--zero] [-c COLUMNS] [--outer] [--left] [--right]
               [FILE [FILE ...]]

Execute a SQL-like join to merge CSV files on a specified column or columns.

positional arguments:
  FILE                  The CSV files to operate on. If only one is specified,
                        it will be copied to STDOUT.

optional arguments:
  -h, --help            show this help message and exit
  -d DELIMITER, --delimiter DELIMITER
                        Delimiting character of the input CSV file.
  -t, --tabs            Specifies that the input CSV file is delimited with
                        tabs. Overrides "-d".
  -q QUOTECHAR, --quotechar QUOTECHAR
                        Character used to quote strings in the input CSV file.
  -u {0,1,2,3}, --quoting {0,1,2,3}
                        Quoting style used in the input CSV file. 0 = Quote
                        Minimal, 1 = Quote All, 2 = Quote Non-numeric, 3 =
                        Quote None.
  -b, --doublequote     Whether or not double quotes are doubled in the input
                        CSV file.
  -p ESCAPECHAR, --escapechar ESCAPECHAR
                        Character used to escape the delimiter if --quoting 3
                        ("Quote None") is specified and to escape the
                        QUOTECHAR if --doublequote is not specified.
  -z MAXFIELDSIZE, --maxfieldsize MAXFIELDSIZE
                        Maximum length of a single field in the input CSV
                        file.
  -e ENCODING, --encoding ENCODING
                        Specify the encoding the input CSV file.
  -S, --skipinitialspace
                        Ignore whitespace immediately following the delimiter.
  -v, --verbose         Print detailed tracebacks when errors occur.
  -l, --linenumbers     Insert a column of line numbers at the front of the
                        output. Useful when piping to grep or as a simple
                        primary key.
  --zero                When interpreting or displaying column numbers, use
                        zero-based numbering instead of the default 1-based
                        numbering.
  -c COLUMNS, --columns COLUMNS
                        The column name(s) on which to join. Should be either
                        one name (or index) or a comma-separated list with one
                        name (or index) for each file, in the same order that
                        the files were specified. May also be left
                        unspecified, in which case the two files will be
                        joined sequentially without performing any matching.
  --outer               Perform a full outer join, rather than the default
                        inner join.
  --left                Perform a left outer join, rather than the default
                        inner join. If more than two files are provided this
                        will be executed as a sequence of left outer joins,
                        starting at the left.
  --right               Perform a right outer join, rather than the default
                        inner join. If more than two files are provided this
                        will be executed as a sequence of right outer joins,
                        starting at the right.

Note that the join operation requires reading all files into memory. Don't try
this on very large files.
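
If installing csvkit isn't an option, the same kind of join can be sketched with only the Python standard library. This is a rough sketch, not a tested solution for your data: it assumes every file has a header row, that the date is the first column and the value the second, and that each date appears at most once per file. The function name merge_csv_columns and the value_<filename> column naming are made up here for illustration. Like csvjoin, it holds everything in memory.

```python
import csv
import os


def merge_csv_columns(paths, out_path):
    """Join the value column of each CSV onto a shared date column."""
    header = ["date"]
    merged = {}  # date -> list of values, one slot per input file
    for i, path in enumerate(paths):
        # Name each output column after its source file (illustrative convention).
        name = os.path.splitext(os.path.basename(path))[0]
        header.append("value_" + name)
        with open(path, newline="") as f:
            reader = csv.reader(f)
            next(reader, None)  # skip the header row (assumed to be present)
            for row in reader:
                if len(row) < 2:
                    continue  # ignore blank or malformed lines
                date, value = row[0], row[1]
                # A date missing from some file keeps an empty cell there,
                # which behaves like a full outer join.
                slots = merged.setdefault(date, [""] * len(paths))
                slots[i] = value
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        # Plain string sort; chronological only for ISO-style dates (YYYY-MM-DD).
        for date in sorted(merged):
            writer.writerow([date] + merged[date])
```

You would call it with the list of input files and an output path, e.g. merge_csv_columns(sorted(glob.glob("data/*.csv")), "merged.csv") after import glob, where "data/*.csv" is a placeholder for your folder. Keep the output file outside that folder, or a second run will pick it up as an input.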

Thank you very much! This is exactly what I was looking for! - user3551674
