Concatenating multiple shapefiles via geopandas

Question

14

I'm trying to combine multiple shapefiles with the following code:

import geopandas as gpd
import pandas as pd

for i in range(10,56):
    interesting_files = "/Users/m3105/Downloads/area/tl_2015_{}_arealm.shp".format(i)
    gdf_list = []
    for filename in sorted(interesting_files):
        gdf_list.append(gpd.read_file((filename)))
        full_gdf = pd.concat(gdf_list)

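The directory /Users/m3105/Downloads/area contains several shapefiles, for example tl_2015_01_arealm.shp, tl_2015_02_arealm.shp, and so on up to tl_2015_56_arealm.shp. I'd like to concatenate all of these shapefiles without repeating their headers. However, whenever I try to concatenate the files with the code above, I get the following error: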

ValueError: Null layer: u''

I know how to concatenate csv files, but I'm not sure how to concatenate shapefiles. Any help would be greatly appreciated.

- M3105

1

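Your code has a few errors that prevent it from running: interesting_files is a single string, so looping over it with for filename in sorted(interesting_files): iterates over the individual characters of that file name. Also, pd.concat(gdf_list) should be outside the for loop. - joris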
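
The two points in this comment can be sketched as follows. This is only an illustration: the path pattern is copied from the question, the range 1 to 56 is assumed from the question's description of the file names, and read_and_concat is a hypothetical helper name. Not every number in that range necessarily has a file, so missing paths are skipped.

```python
import os

# Path pattern copied from the question.
pattern = "/Users/m3105/Downloads/area/tl_2015_{}_arealm.shp"

# One formatted path is a single string; looping over it yields characters.
one_path = pattern.format(10)
pieces = list(one_path)

# Build the full list of candidate paths first (two-digit codes: 01, 02, ...).
paths = [pattern.format(f"{i:02d}") for i in range(1, 57)]


def read_and_concat(paths):
    """Read every existing shapefile in `paths` and concatenate them once."""
    # Imported here so the path handling above can be tried without geopandas.
    import geopandas as gpd
    import pandas as pd

    frames = [gpd.read_file(p) for p in paths if os.path.exists(p)]
    # Concatenate after the loop has collected all frames, not inside it.
    return pd.concat(frames, ignore_index=True)
```

Calling read_and_concat(paths) would then return one combined frame, provided at least one of the listed files exists.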

3 Answers

18

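If you use pandas.concat as in @Paul H's answer, some geographic information, such as the coordinate reference system (crs), is not preserved by default. The following approach works around that: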

import os
import geopandas as gpd
import pandas as pd

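# Keep only names ending in ".shp"; a substring test would also match sidecars such as ".shp.xml"
file = os.listdir("Your folder")
path = [os.path.join("Your folder", i) for i in file if i.endswith(".shp")]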

gdf = gpd.GeoDataFrame(pd.concat([gpd.read_file(i) for i in path], 
                        ignore_index=True), crs=gpd.read_file(path[0]).crs)

This way, the geodataframe will have the CRS you need.

- lemon

2

I have large datasets, and this works really fast! - g123456k

1

That is so efficient! - icypy

1

I don't have enough reputation to comment on the previous answer, but after testing with input files that have different CRSs,

gdf = gpd.GeoDataFrame(pd.concat([gpd.read_file(i) for i in path], 
                        ignore_index=True), crs=gpd.read_file(path[0]).crs)

should be

gdf = gpd.GeoDataFrame(pd.concat([gpd.read_file(i).to_crs(gpd.read_file(path[0]).crs) for i in path], 
                        ignore_index=True), crs=gpd.read_file(path[0]).crs)

- Steven Gregg
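
The reprojection idea in this answer can also be written so that each file is read only once. The following is a minimal sketch, not a tested replacement: concat_to_common_crs and its target_crs parameter are hypothetical names, and it assumes every input file has a CRS defined (to_crs raises an error otherwise).

```python
def concat_to_common_crs(paths, target_crs=None):
    """Read each shapefile, reproject it to one CRS, and concatenate."""
    paths = list(paths)
    if not paths:
        raise ValueError("no shapefiles to concatenate")

    # Imported here so the check above works even without geopandas installed.
    import geopandas as gpd
    import pandas as pd

    # Read every file exactly once.
    frames = [gpd.read_file(p) for p in paths]

    # Default to the CRS of the first file, as in the answer above.
    if target_crs is None:
        target_crs = frames[0].crs

    # Reproject each frame before concatenating so the geometries agree.
    aligned = [frame.to_crs(target_crs) for frame in frames]
    return gpd.GeoDataFrame(pd.concat(aligned, ignore_index=True), crs=target_crs)
```

With the path list from the earlier answer, usage would look like gdf = concat_to_common_crs(path).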

- Paul H · Accepted Answer

I can't test this since I don't have your data, but you want something like this (assuming you're using Python 3):

from pathlib import Path
import pandas
import geopandas

folder = Path("/Users/m3105/Downloads/area")
shapefiles = folder.glob("tl_2015_*_arealm.shp")
gdf = pandas.concat([
    geopandas.read_file(shp)
    for shp in shapefiles
]).pipe(geopandas.GeoDataFrame)
gdf.to_file(folder / 'compiled.shp')
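
The glob pattern can be checked without any real shapefile data. The sketch below uses empty placeholder files in a temporary directory; the file names are made up to mimic the naming in the question. It suggests that the pattern matches only names ending in .shp, whereas a plain substring test on ".shp" would also pick up sidecar files such as .shp.xml.

```python
from pathlib import Path
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    # Empty placeholder files; the names are hypothetical examples.
    for name in [
        "tl_2015_02_arealm.shp",
        "tl_2015_01_arealm.shp",
        "tl_2015_01_arealm.dbf",
        "tl_2015_01_arealm.shp.xml",
    ]:
        (folder / name).touch()

    # glob yields paths in arbitrary order, so sort for a reproducible result.
    matched = sorted(p.name for p in folder.glob("tl_2015_*_arealm.shp"))

    # A substring test, by contrast, also matches the .shp.xml sidecar.
    substring = sorted(p.name for p in folder.iterdir() if ".shp" in p.name)

print(matched)
print(substring)
```

Sorting the glob results also makes the row order of the concatenated frame reproducible between runs.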