Pretty printing newlines inside a string in a Pandas DataFrame

Question

27

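I have a Pandas DataFrame in which one column contains string elements, and those strings contain newlines that I would like to print literally. In the output, however, they only show up as \n.

That is, I want to print this: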

  pos     bidder
0   1
1   2
2   3  <- alice
       <- bob
3   4

But this is what I get:

  pos            bidder
0   1
1   2
2   3  <- alice\n<- bob
3   4

How can I accomplish what I want? Can I use a DataFrame, or do I have to print padded columns manually, one row at a time?

Here's what I have so far:

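import pandas as pd

n = 4
output = pd.DataFrame({
    'pos': range(1, n + 1),
    'bidder': [''] * n
})
bids = {'alice': 3, 'bob': 3}
used_pos = []
for bidder, pos in bids.items():
    row = pos - 1  # positions are 1-based, index labels are 0-based
    if pos in used_pos:
        # A second bidder at the same position goes on a new line in the same cell
        # (.ix was removed from pandas; .loc is the label-based replacement)
        arrow = output.loc[row, 'bidder']
        output.loc[row, 'bidder'] = arrow + "\n<- %s" % bidder
    else:
        output.loc[row, 'bidder'] = "<- %s" % bidder
        used_pos.append(pos)
print(output)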

- shadowtalker

4 Answers

26

Using Pandas' `.set_properties()` and the CSS `white-space` property

[For use in IPython notebooks]

Another approach is to use Pandas' pandas.io.formats.style.Styler.set_properties() method together with the CSS "white-space": "pre-wrap" property:

from IPython.display import display

# Assuming the variable df contains the relevant DataFrame
display(df.style.set_properties(**{
    'white-space': 'pre-wrap',
}))

To keep the text left-aligned, you can also add 'text-align': 'left', as below.

from IPython.display import display

# Assuming the variable df contains the relevant DataFrame
display(df.style.set_properties(**{
    'text-align': 'left',
    'white-space': 'pre-wrap',
}))

- yongjieyongjie

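Unfortunately, it doesn't work for large DataFrames: "Object <class 'pandas.io.formats.style.Styler'> is too large to serialize; estimated 104360172 bytes; limit 20000000". (Even though the DataFrame otherwise behaves fine.) - max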

6

Somewhat in line with unsorted's answer:

import pandas as pd

# Save the original `to_html` function to call it later
pd.DataFrame.base_to_html = pd.DataFrame.to_html
# Call it here in a controlled way
pd.DataFrame.to_html = (
    lambda df, *args, **kwargs: 
        (df.base_to_html(*args, **kwargs)
           .replace(r"\n", "<br/>"))
)

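This way, you don't need to call any explicit function in Jupyter notebooks, because to_html is already called internally. If you want the original function, call base_to_html (or whatever you named it). I'm using jupyter 1.0.0, notebook 5.7.6.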

- Roger d'Amiens

Can this be used in a Python script, without a Jupyter notebook? - Craig Nathan

5

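From the pandas.DataFrame documentation:

Two-dimensional size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). Arithmetic operations align on both row and column labels. Can be thought of as a dict-like container for Series objects. The primary pandas data structure.

So you cannot have a row in a DataFrame without an index. The newline "\n" doesn't work in a DataFrame either.

You could overwrite 'pos' with an empty value and output the next 'bidder' on the following row. But each time you do this, the index and 'pos' shift. For example: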

  pos    bidder
0   1          
1   2          
2   3  <- alice
3        <- bob
4   5

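If a bidder named "frank" had the value 4, he would overwrite "bob". This causes problems as more bidders are added. It may be possible to use a DataFrame and write code to work around this, but it's probably better to consider other solutions.

Here's the code that produces the output structure above.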

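import pandas as pd

n = 5
output = pd.DataFrame({'pos': range(1, n + 1),
                      'bidder': [''] * n},
                      columns=['pos', 'bidder'])
# 'pos' will hold both ints and '' below, so store it as generic objects
output['pos'] = output['pos'].astype(object)
bids = {'alice': 3, 'bob': 3}
used_pos = []
for bidder, pos in bids.items():
    if pos in used_pos:
        # Position already taken: put this bidder on the next row and blank its 'pos'
        # (.ix was removed from pandas; .loc is the label-based replacement)
        output.loc[pos, 'bidder'] = "<- %s" % bidder
        output.loc[pos, 'pos'] = ''
    else:
        output.loc[pos - 1, 'bidder'] = "<- %s" % bidder
        used_pos.append(pos)
print(output)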
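
One way to avoid the overwrite problem described above is to keep the multi-line strings in the DataFrame and only split them into extra rows at print time. The following is a rough sketch of that idea, not part of the original answer: the sample data (including the "frank" row) is made up for illustration, and it relies on `Series.str.split` and `DataFrame.explode`, which need pandas 0.25 or later.

```python
import pandas as pd

# Illustrative data: position 3 has two bidders stored in one multi-line cell.
df = pd.DataFrame({
    "pos": [1, 2, 3, 4],
    "bidder": ["", "", "<- alice\n<- bob", "<- frank"],
})

# One row per line of text; the original index label repeats for each extra line.
exploded = df.assign(bidder=df["bidder"].str.split("\n")).explode("bidder")

# Rows whose index label was already seen are continuation lines.
is_continuation = exploded.index.duplicated(keep="first")
exploded = exploded.reset_index(drop=True)

# Show each position only once by blanking it on continuation rows.
exploded["pos"] = exploded["pos"].astype(str).mask(is_continuation, "")

print(exploded.to_string(index=False))
```

Because the split happens on a copy, the original `df` keeps one row per position, so adding more bidders never shifts or overwrites existing rows.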

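Edit:

Another option is to restructure the data and the output. You can make pos the columns and create a new row for each key/person in the data. The code example below prints the DataFrame with NaN values replaced by empty strings.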

import pandas as pd

data = {'johnny\nnewline': 2, 'alice': 3, 'bob': 3,
        'frank': 4, 'lisa': 1, 'tom': 8}
n = range(1, max(data.values()) + 1)

# Create DataFrame with columns = pos
output = pd.DataFrame(columns=n, index=[])

# Populate DataFrame with rows
for index, (bidder, pos) in enumerate(data.items()):
    output.loc[index, pos] = bidder

# Print the DataFrame and remove NaN to make it easier to read.
print(output.fillna(''))

# Fetch and print every non-empty element in column 2
# (the original loop over range(1, 5) skipped row 0, the only row with a value here)
for value in output[2].dropna():
    print(value)

That said, it depends on what you want to do with the data. Good luck :)

- oystein-hr

1

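Interesting, although I don't see anything in that definition that precludes a DataFrame from containing newlines within its elements. In R, for example, this works perfectly well in principle. In any case, I'll probably end up doing it row by row with string formatting. - shadowtalker

If you fetch an element containing 'johnny\nnewline' from a DataFrame and print it, it prints 'johnny' on one line and 'newline' on the next. I've added another option and a print example to address the question. - oystein-hr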
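
The row-by-row string formatting mentioned in the comments could look roughly like the sketch below. `render_multiline` is a hypothetical helper written for this illustration, not a pandas function. It assumes every cell can be converted with `str()`, left-aligns all columns, and leaves out the index.

```python
import pandas as pd

def render_multiline(df: pd.DataFrame) -> str:
    """Render df as plain text, showing embedded newlines as extra lines in a cell."""
    # Split the header and every cell into a list of lines.
    header = [[str(col)] for col in df.columns]
    rows = [[str(value).split("\n") for value in row]
            for row in df.itertuples(index=False)]
    table = [header] + rows

    # Column width is the longest single line found in that column.
    widths = [
        max(len(line) for cells in table for line in cells[i])
        for i in range(len(df.columns))
    ]

    out = []
    for cells in table:
        height = max(len(cell) for cell in cells)  # text lines needed for this row
        for k in range(height):
            parts = [
                (cell[k] if k < len(cell) else "").ljust(width)
                for cell, width in zip(cells, widths)
            ]
            out.append("  ".join(parts).rstrip())
    return "\n".join(out)

df = pd.DataFrame({
    "pos": [1, 2, 3, 4],
    "bidder": ["", "", "<- alice\n<- bob", ""],
})
text = render_multiline(df)
print(text)
```

This works in a plain Python script, since it produces ordinary text rather than HTML.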


- unsorted · Accepted Answer

If you are trying to do this in an IPython notebook, you can do the following:

from IPython.display import display, HTML

def pretty_print(df):
    return display( HTML( df.to_html().replace("\\n","<br>") ) )
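
The same replacement can be tried outside a notebook by working with the HTML string directly. This is a small sketch under a few assumptions: the sample data is invented, `to_html_with_breaks` is a hypothetical helper name, and it relies on `to_html()` writing an embedded newline as the two characters backslash and n, which is the behavior the answer above depends on.

```python
import pandas as pd

def to_html_with_breaks(df: pd.DataFrame) -> str:
    # to_html() emits embedded newlines as a literal backslash followed by 'n',
    # so that two-character sequence is what gets swapped for <br>.
    return df.to_html().replace("\\n", "<br>")

df = pd.DataFrame({
    "pos": [1, 2, 3, 4],
    "bidder": ["", "", "<- alice\n<- bob", ""],
})
html = to_html_with_breaks(df)
print(html)
```

The resulting string can be written to a file and opened in a browser, or passed to `HTML(...)` in a notebook. Note that `to_html()` escapes `<` as `&lt;` by default, so the arrows still display as text.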
