Creating a dictionary of lists from Pandas rows?

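I have an odd problem. I have a dataframe with an index and a bunch of columns. I want the index to become the keys, with the values from all the other columns collected in a list. Here is an example:
Dataframe: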
    0   1   2   3
Barker Minerals Ltd             
Blackout Media Corp             
Booking Holdings Inc    Booking Holdings Inc    Booking Holdings Inc 4.10 04/13/2025    BOOKING HOLDINGS INC    
Baker Hughes Company    Baker Hughes Company    BAKER HUGHES A GE COMPANY LLC-3.34%-12-15-2027  BAKER HUGHES A GE COMPANY LLC-3.14%-11-7-2029   
Bank of Queensland Limited  Bank of Queensland Limited  Bank of Queensland Limited FRN 10-MAY-2026 3.50% 05/10/26   Bank of Queensland Limited FRN 26-OCT-2020 1.27% 10/26/20   Bank of Queensland Limited FRN 16-NOV-2021 1.12% 11/16/21

If I run this, it converts everything into lists, but I want a dictionary of lists:
df.to_numpy().tolist()

I want a dictionary where each key maps to the list of values from the other columns (something like this):
{
Barker Minerals Ltd:    
Blackout Media Corp:    
Booking Holdings Inc:   [Booking Holdings Inc ,Booking Holdings Inc 4.10 04/13/2025,BOOKING HOLDINGS INC]
Baker Hughes Company:   [Baker Hughes Company ,BAKER HUGHES A GE COMPANY LLC-3.34%-12-15-2027,BAKER HUGHES A GE COMPANY LLC-3.14%-11-7-2029]
Bank of Queensland Limited: [Bank of Queensland Limited ,Bank of Queensland Limited FRN 10-MAY-2026 3.50% 05/10/26,Bank of Queensland Limited FRN 26-OCT-2020 1.27% 10/26/20, Bank of Queensland Limited FRN 16-NOV-2021 1.12% 11/16/21]
}

Is there a way to do this conversion?


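df.T.to_dict('list') will produce lists that contain empty strings for the empty values (assuming the empty cells are empty strings). - Michael Szczesny
@MichaelSzczesny I'm not clear on what should go in 'list'? The above is the DF I'm trying to convert into a dictionary of lists. - Lostsoul
Maybe the picture of the Excel sheet was confusing, but it seemed like an easy way to show the contents of the dataframe, since copying and pasting that much data and that many columns is hard. I've removed it for clarity. - Lostsoul
Yes. The column headers are just numbers based on the number of items in the columns. - Lostsoul
Sorry, I deleted my comment, but I saw your Excel edit! - Mark Moretto
1 Answer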

The simplest answer, as Michael Szczesny pointed out in the comments:

df.T.to_dict(orient="list")

Output:

{'Barker Minerals Ltd': [nan, nan, nan, nan],
 'Blackout Media Corp': [nan, nan, nan, nan],
 'Booking Holdings Inc': ['Booking Holdings Inc',
  'Booking Holdings Inc 4.10 04/13/2025',
  'BOOKING HOLDINGS INC',
  nan],
 'Baker Hughes Company': ['Baker Hughes Company',
  'BAKER HUGHES A GE COMPANY LLC-3.34%-12-15-2027',
  'BAKER HUGHES A GE COMPANY LLC-3.14%-11-7-2029',
  nan],
 'Bank of Queensland Limited': ['Bank of Queensland Limited',
  'Bank of Queensland Limited FRN 10-MAY-2026 3.50% 05/10/26',
  'Bank of Queensland Limited FRN 26-OCT-2020 1.27% 10/26/20',
  ' Bank of Queensland Limited FRN 16-NOV-2021 1.12% 11/16/21']}
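To try the transposition without the original CSV, here is a minimal sketch on a hypothetical two-row frame. The data and the variable names are made up for illustration; only the index-as-company-name layout is taken from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the question's frame: the index holds the
# company names, the numbered columns hold the related names.
df = pd.DataFrame(
    {
        0: [np.nan, "Booking Holdings Inc"],
        1: [np.nan, "BOOKING HOLDINGS INC"],
    },
    index=["Barker Minerals Ltd", "Booking Holdings Inc"],
)

# Transposing turns each row into a column; orient="list" then maps
# every column label to the list of that column's values.
result = df.T.to_dict(orient="list")
print(result)
```

The transpose is needed because to_dict(orient="list") builds one list per column, while the question asks for one list per row.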

Additionally, if you want to remove all the nan values, the code is as follows:

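import pandas as pd

# The first CSV column holds the company names, so use it as the index.
df = pd.read_csv("df_to_dict.csv", index_col=0)
val = df.T.to_dict(orient="list")

# Keep only the non-missing entries of each list.
cleaned_val = {key: [v for v in values if pd.notna(v)] for key, values in val.items()}

cleaned_val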

The output is as follows:
{'Barker Minerals Ltd': [],
 'Blackout Media Corp': [],
 'Booking Holdings Inc': ['Booking Holdings Inc',
  'Booking Holdings Inc 4.10 04/13/2025',
  'BOOKING HOLDINGS INC'],
 'Baker Hughes Company': ['Baker Hughes Company',
  'BAKER HUGHES A GE COMPANY LLC-3.34%-12-15-2027',
  'BAKER HUGHES A GE COMPANY LLC-3.14%-11-7-2029'],
 'Bank of Queensland Limited': ['Bank of Queensland Limited',
  'Bank of Queensland Limited FRN 10-MAY-2026 3.50% 05/10/26',
  'Bank of Queensland Limited FRN 26-OCT-2020 1.27% 10/26/20',
  ' Bank of Queensland Limited FRN 16-NOV-2021 1.12% 11/16/21']}
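As a possible alternative, the cleaned dictionary can also be built in one pass over the rows, skipping the transpose. This is only a sketch on hypothetical data; it also drops empty strings, on the assumption raised in the comments that blank cells may be empty strings rather than nan.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: one row is blank (a nan and an empty string),
# the other row is fully populated.
df = pd.DataFrame(
    {
        0: [np.nan, "Booking Holdings Inc"],
        1: ["", "BOOKING HOLDINGS INC"],
    },
    index=["Barker Minerals Ltd", "Booking Holdings Inc"],
)

# iterrows() yields (index label, row Series) pairs; keep only the cells
# that are neither missing nor an empty string.
cleaned = {
    key: [v for v in row if pd.notna(v) and v != ""]
    for key, row in df.iterrows()
}
print(cleaned)
```

If the blank cells are guaranteed to be nan, row.dropna().tolist() would be a shorter way to write the inner list.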
See the pandas documentation for DataFrame.to_dict() for details.
