Convert CSV to an HTML table using Python

8
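I am trying to take data from a .csv file and import it into an HTML table using Python.
This is the CSV file: https://www.mediafire.com/?mootyaa33bmijiq Background: the CSV file is filled with data from a football team [age group, round, opponent, team score, opponent score, location]. I need to be able to select a specific age group and display only those details in a separate table.
This is all I have so far...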
infile = open("Crushers.csv","r")

for line in infile:
    row = line.split(",")
    age = row[0]
    week = row [1]
    opp = row[2]
    ACscr = row[3]
    OPPscr = row[4]
    location = row[5]

if age == 'U12':
   print(week, opp, ACscr, OPPscr, location)

You can use the pandas library for this. pandas has a method called to_html. Here is the link: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_html.html - arnold
7 Answers

22

First, install pandas:

pip install pandas

Then run:

import pandas as pd

columns = ['age', 'week', 'opp', 'ACscr', 'OPPscr', 'location']
df = pd.read_csv('Crushers.csv', names=columns)

# Change this filter to select whatever you want
age_15 = df[df['age'] == 'U15']
# Other examples:
bye = df[df['opp'] == 'Bye']
# If pandas parsed the score columns as integers, compare against 0 instead of '0'
crushed_team = df[df['ACscr'] == '0']
crushed_visitor = df[df['OPPscr'] == '0']
# Experiment with these filters

# Use .to_html() to get the table as HTML
print(crushed_visitor.to_html())

You will get something like the following:

<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>age</th>
      <th>week</th>
      <th>opp</th>
      <th>ACscr</th>
      <th>OPPscr</th>
      <th>location</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>34</th>
      <td>U17</td>
      <td>1</td>
      <td>Banyo</td>
      <td>52</td>
      <td>0</td>
      <td>Home</td>
    </tr>
    <tr>
      <th>40</th>
      <td>U17</td>
      <td>7</td>
      <td>Aspley</td>
      <td>62</td>
      <td>0</td>
      <td>Home</td>
    </tr>
    <tr>
      <th>91</th>
      <td>U12</td>
      <td>7</td>
      <td>Rochedale</td>
      <td>8</td>
      <td>0</td>
      <td>Home</td>
    </tr>
  </tbody>
</table>


5

First, install pandas:

pip install pandas

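Then:
import pandas as pd

# header=None because Crushers.csv has no header row (the data starts on the first line)
a = pd.read_csv("Crushers.csv", header=None)
# save as an HTML file named "Table.htm"
a.to_html("Table.htm")
# or assign the HTML to a variable (string)
html_file = a.to_html()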

4
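The following function takes a file name, headers (optional), and a delimiter (optional) as input, converts the CSV to an HTML table, and returns it as a string. If no headers are provided, it assumes the header row is already present in the CSV file.

Convert CSV file content to an HTML table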

def csv_to_html_table(fname, headers=None, delimiter=","):
    # read the file content into a list of lines
    with open(fname) as f:
        content = f.readlines()
    rows = [x.strip() for x in content]
    table = "<table>"
    # create the HTML header row from `headers` if provided
    if headers is not None:
        table += "<tr>" + "".join(["<th>" + cell + "</th>" for cell in headers.split(delimiter)]) + "</tr>\n"
    else:
        # otherwise treat the first line of the CSV as the header row
        table += "<tr>" + "".join(["<th>" + cell + "</th>" for cell in rows[0].split(delimiter)]) + "</tr>\n"
        rows = rows[1:]
    # convert the CSV to HTML row by row
    for row in rows:
        table += "<tr>" + "".join(["<td>" + cell + "</td>" for cell in row.split(delimiter)]) + "</tr>\n"
    table += "</table><br>"
    return table

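In your case, the function call would look like the following. Note that this does not filter the entries in the CSV; it converts the whole CSV file to an HTML table.
filename = "Crushers.csv"
myheader = 'age,week,opp,ACscr,OPPscr,location'
html_table = csv_to_html_table(filename, myheader)

Note: to keep only the entries with a specific value, add a conditional statement inside the for loop.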
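
The note about filtering inside the loop could look roughly like this. It is a minimal sketch rather than a drop-in replacement: the function name csv_to_html_table_filtered and its column/wanted parameters are made up for illustration, it reads rows with csv.reader instead of str.split, and it works on an in-memory sample in the Crushers.csv column layout rather than on the real file.

```python
import csv
import io
from html import escape

def csv_to_html_table_filtered(lines, headers, column=0, wanted="U12"):
    """Build an HTML table, keeping only rows where row[column] == wanted."""
    parts = ["<table>"]
    parts.append("<tr>" + "".join(f"<th>{escape(h)}</th>" for h in headers) + "</tr>")
    for row in csv.reader(lines):
        # the condition inside the loop does the filtering
        if row and row[column] == wanted:
            parts.append("<tr>" + "".join(f"<td>{escape(c)}</td>" for c in row) + "</tr>")
    parts.append("</table>")
    return "\n".join(parts)

# Two sample rows in the Crushers.csv layout (assumed, not read from the file)
sample = io.StringIO("U12,1,Waterford,0,4,Home\nU13,1,Waterford,22,36,Home\n")
headers = ["age", "week", "opp", "ACscr", "OPPscr", "location"]
filtered_html = csv_to_html_table_filtered(sample, headers)
print(filtered_html)
```

To run it against the real file, pass an open file object (for example open("Crushers.csv", newline="")) in place of the StringIO sample.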

1
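If you want to add a border to the table, replace "<table>" with "<table border = 1>". - Atul6.Singh
Thank you very much for the script, it helped me a lot. I was wondering: once the CSV file has been converted to an HTML table, is it possible to turn the cells that contain text into HTML <a href> links? Do you have any guide or link you could point me to? - mollieda
1
You need to change the else block and expand the for-loop comprehension. Add a conditional in the for loop that uses a regular expression to check whether the cell content matches a URL format. If it does, wrap the cell data in an <a href> tag. - Yash
Hi, I tried to follow your suggestion in the following link, could you take a look and help me? https://stackoverflow.com/questions/71280410/how-to-convert-automatically-a-string-into-a-clickable-link-a-href-url-a - mollieda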
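
One way the regular-expression suggestion in this comment thread might be sketched is shown below. The helper names cell_to_html and row_to_html and the URL pattern are assumptions for illustration only; the pattern is deliberately simple and treats a cell as a link only when the whole cell is an http(s) URL.

```python
import re
from html import escape

# Deliberately simple: the entire cell must be an http:// or https:// URL
URL_RE = re.compile(r"https?://\S+")

def cell_to_html(cell):
    """Return a <td>, wrapping the content in <a href> if the cell looks like a URL."""
    text = cell.strip()
    if URL_RE.fullmatch(text):
        url = escape(text, quote=True)  # escape &, <, > and quotes for the attribute
        return f'<td><a href="{url}">{url}</a></td>'
    return f"<td>{escape(text)}</td>"

def row_to_html(cells):
    """Render one CSV row (a list of strings) as an HTML table row."""
    return "<tr>" + "".join(cell_to_html(c) for c in cells) + "</tr>"

linked_row = row_to_html(["Banyo", "https://example.com/match?id=1&week=2"])
print(linked_row)
```

In the function from the answer, row_to_html(row.split(delimiter)) would take the place of the inline "<td>" comprehension.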

2

First, some imports:

import csv
from html import escape
import io

Now for the building blocks: let's create one function for reading the CSV file and another for generating the HTML table:
def read_csv(path, column_names):
    with open(path, newline='') as f:
        # why newline='': see footnote at the end of https://docs.python.org/3/library/csv.html
        reader = csv.reader(f)
        for row in reader:
            record = {name: value for name, value in zip(column_names, row)}
            yield record

def html_table(records):
    # records is expected to be a list of dicts
    column_names = []
    # first detect all possible keys (field names) that are present in records
    for record in records:
        for name in record.keys():
            if name not in column_names:
                column_names.append(name)
    # create the HTML line by line
    lines = []
    lines.append('<table>\n')
    lines.append('  <tr>\n')
    for name in column_names:
        lines.append('    <th>{}</th>\n'.format(escape(name)))
    lines.append('  </tr>\n')
    for record in records:
        lines.append('  <tr>\n')
        for name in column_names:
            value = record.get(name, '')
            lines.append('    <td>{}</td>\n'.format(escape(value)))
        lines.append('  </tr>\n')
    lines.append('</table>')
    # join the lines to a single string and return it
    return ''.join(lines)

Now just put them together :)
records = list(read_csv('Crushers.csv', 'age week opp ACscr OPPscr location'.split()))

# Print first record to see whether we are loading correctly
print(records[0])
# Output:
# {'age': 'U13', 'week': '1', 'opp': 'Waterford', 'ACscr': '22', 'OPPscr': '36', 'location': 'Home'}

records = [r for r in records if r['age'] == 'U12']

print(html_table(records))
# Output:
# <table>
#   <tr>
#     <th>age</th>
#     <th>week</th>
#     <th>opp</th>
#     <th>ACscr</th>
#     <th>OPPscr</th>
#     <th>location</th>
#   </tr>
#   <tr>
#     <td>U12</td>
#     <td>1</td>
#     <td>Waterford</td>
#     <td>0</td>
#     <td>4</td>
#     <td>Home</td>
#   </tr>
#   <tr>
#     <td>U12</td>
#     <td>2</td>
#     <td>North Lakes</td>
#     <td>12</td>
#     <td>18</td>
#     <td>Away</td>
#   </tr>
#   ...
# </table>

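A few notes:

  • csv.reader works better than splitting lines, because it also handles quoted values, even quoted values that contain newlines

  • html.escape is used to escape strings that might contain the characters < or >

  • dicts are often easier to work with than tuples

  • CSV files usually contain a header (the first line holds the column names) and can then be loaded easily with csv.DictReader; but Crushers.csv has no header (the data starts on the first line), so we build the dicts ourselves in the function read_csv

  • both functions, read_csv and html_table, are generic and work with any data; the column names are not "hardcoded" into them

  • yes, you could use pandas read_csv and to_html instead :) But it is good to know how to do it without pandas in case you need some customization, or simply as a programming exercise.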
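
To illustrate the first few notes, here is a small sketch on invented data (the opponent name "Smith, Jones & Co" is made up, and this sample has a header row, unlike Crushers.csv). It shows how a plain split breaks a quoted value, how csv.DictReader loads a file that does have a header, and what html.escape does to a special character.

```python
import csv
import io
from html import escape

# Invented sample with a header row, a quoted comma, and an HTML-special character
data = 'age,opp,location\nU12,"Smith, Jones & Co",Home\n'

# Naive splitting cuts the quoted field in two
naive = data.splitlines()[1].split(",")
print(naive)

# csv.DictReader uses the first row as keys and respects the quotes
records = list(csv.DictReader(io.StringIO(data)))
print(records[0]["opp"])

# html.escape makes the value safe to place inside a <td>
print(escape(records[0]["opp"]))
```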


2

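Before you start printing the rows you want, output some HTML to set up the proper table structure.

When you find a row you want to print, output it in HTML table row format.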

# begin the table
print("<table>")

# column headers: a <tr> row containing <th> cells
print("<tr>")
print("<th>Week</th>")
print("<th>Opp</th>")
print("<th>ACscr</th>")
print("<th>OPPscr</th>")
print("<th>Location</th>")
print("</tr>")

infile = open("Crushers.csv","r")

for line in infile:
    row = line.strip().split(",")  # strip the trailing newline before splitting
    age = row[0]
    week = row [1]
    opp = row[2]
    ACscr = row[3]
    OPPscr = row[4]
    location = row[5]

    if age == 'U12':
        print("<tr>")
        print("<td>%s</td>" % week)
        print("<td>%s</td>" % opp)
        print("<td>%s</td>" % ACscr)
        print("<td>%s</td>" % OPPscr)
        print("<td>%s</td>" % location)
        print("</tr>")

# end the table
print("</table>")
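
Since the question asks for a separate table per age group, the same loop can be extended to collect rows by age first. The following is only a sketch under assumptions: the function name tables_by_age is hypothetical, it uses csv.reader and html.escape rather than the split and print calls shown here, and it runs on three sample rows instead of the real Crushers.csv.

```python
import csv
import io
from collections import defaultdict
from html import escape

def tables_by_age(lines):
    """Return {age group: HTML table string}, one table per age group."""
    groups = defaultdict(list)
    for row in csv.reader(lines):
        if len(row) >= 6:  # skip blank or short lines
            groups[row[0]].append(row[1:6])
    headers = ["Week", "Opp", "ACscr", "OPPscr", "Location"]
    tables = {}
    for age, rows in groups.items():
        out = ["<table>", "<tr>" + "".join(f"<th>{h}</th>" for h in headers) + "</tr>"]
        for r in rows:
            out.append("<tr>" + "".join(f"<td>{escape(c.strip())}</td>" for c in r) + "</tr>")
        out.append("</table>")
        tables[age] = "\n".join(out)
    return tables

# Sample rows in the Crushers.csv layout (assumed, not read from the file)
sample = io.StringIO(
    "U12,1,Waterford,0,4,Home\n"
    "U13,1,Waterford,22,36,Home\n"
    "U12,2,North Lakes,12,18,Away\n"
)
tables = tables_by_age(sample)
print(tables["U12"])
```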

2
This should also work:


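# Note: HTML comes from the third-party "html" package on PyPI, not from the
# standard-library html module, which has no HTML class.
from html import HTML
import csv

def to_html(csvfile):
    H = HTML()
    t = H.table(border='2')
    r = t.tr
    with open(csvfile) as f:
        reader = csv.DictReader(f)
        # header row
        for column in reader.fieldnames:
            r.td(column)
        for row in reader:
            r = t.tr  # start a new table row
            # iterate over fieldnames to keep the column order
            for column in reader.fieldnames:
                r.td(row.get(column) or "")
    return t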

Call the function by passing the CSV file to it.

This is the most straightforward solution, but you need to change a few things to make it work: you need to initialize H and preserve the order of the columns in each row. `from html import HTML; import csv; def csv_to_html(csvfile): H = HTML() t=H.table(border='2') r = t.tr with open(csvfile) as csvfile: reader = csv.DictReader(csvfile) for column in reader.fieldnames: r.td(column) for row in reader: t.tr for column in reader.fieldnames: t.td(row.get(column, "empty")) return H` - George
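
A similar builder-style approach is possible with only the standard library, which may be convenient on Python 3 where the name html already refers to a built-in module. The sketch below swaps in xml.etree.ElementTree for the HTML class used in this answer; the function name csv_to_table_element is made up, and like the answer it assumes the CSV has a header row.

```python
import csv
import io
import xml.etree.ElementTree as ET

def csv_to_table_element(lines):
    """Build a <table> element tree from CSV lines whose first row is a header."""
    reader = csv.DictReader(lines)
    table = ET.Element("table", border="2")
    header_row = ET.SubElement(table, "tr")
    for name in reader.fieldnames:
        ET.SubElement(header_row, "th").text = name
    for record in reader:
        tr = ET.SubElement(table, "tr")
        # iterate over fieldnames so the column order matches the header
        for name in reader.fieldnames:
            ET.SubElement(tr, "td").text = record.get(name) or ""
    return table

# Invented two-column sample with a header row
sample = io.StringIO("age,opp\nU12,A & B\n")
table_html = ET.tostring(csv_to_table_element(sample), encoding="unicode", method="html")
print(table_html)
```

ElementTree escapes text content when serializing, so no separate escaping step is needed here.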

1

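Other answers suggest using pandas, but that may be overkill if all you need is to format a CSV as an HTML table. If you want to use an existing package just for this purpose, you can use tabulate: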

import csv

from tabulate import tabulate

with open("Crushers.csv") as file:
    reader = csv.reader(file)
    u12_rows = [row for row in reader if row[0] == "U12"]
print(tabulate(u12_rows, tablefmt="html"))
