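I would like to represent an audio file with an image no larger than 180×180 pixels.
I want to generate this image so that it gives some representation of the audio file, something like SoundCloud's waveform (amplitude graph).
I first posted this on ux.stackexchange.com; this is my attempt to reach any programmers working on this problem.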
pydub is a lightweight wrapper written around the wave module:
from pydub import AudioSegment
# first I'll open the audio file
sound = AudioSegment.from_mp3("some_song.mp3")
# break the sound into 180 even chunks (or however
# many pixels wide the image should be)
chunk_length = len(sound) / 180
loudness_of_chunks = []
for i in range(180):
    start = i * chunk_length
    end = start + chunk_length
    chunk = sound[start:end]
    loudness_of_chunks.append(chunk.rms)
That for loop can be written as the following list comprehension; I just wanted to spell it out clearly first:
loudness_of_chunks = [
    sound[i * chunk_length : (i + 1) * chunk_length].rms
    for i in range(180)
]
max_rms = max(loudness_of_chunks)
scaled_loudness = [ (loudness / max_rms) * 180 for loudness in loudness_of_chunks]
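To make the arithmetic behind the chunk-and-scale steps concrete, here is a minimal sketch that works on a plain list of integer samples instead of a pydub segment. The helper names `chunk_rms` and `scale_to_height` are hypothetical and are not part of pydub; the sketch only assumes that `.rms` is the root mean square of a chunk's samples.

```python
import math


def chunk_rms(samples, n_chunks):
    """Split samples into n_chunks nearly equal runs and return each run's RMS."""
    total = len(samples)
    result = []
    for i in range(n_chunks):
        # integer boundaries, so every sample lands in exactly one chunk
        start = i * total // n_chunks
        end = (i + 1) * total // n_chunks
        chunk = samples[start:end]
        if chunk:
            rms = math.sqrt(sum(s * s for s in chunk) / float(len(chunk)))
        else:
            rms = 0.0  # more chunks than samples: treat the empty chunk as silence
        result.append(rms)
    return result


def scale_to_height(values, height):
    """Scale values so that the loudest one maps to `height`."""
    peak = max(values) if values else 0
    if peak == 0:
        return [0.0] * len(values)  # all-silent input: avoid dividing by zero
    # float() keeps this a true division under Python 2 as well
    return [v / float(peak) * height for v in values]
```

For example, `scale_to_height(chunk_rms(samples, 180), 180)` would give one bar height per pixel column of a 180-pixel-wide image.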
I'll leave the actual pixel drawing to you, since I'm not very familiar with PIL or ImageMagick :/
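For the drawing step, one possible sketch, assuming Pillow is installed and that `scaled_loudness` holds one value per pixel column in the range 0 to 180. It centers each bar vertically to get a SoundCloud-like look. The function names `column_extents` and `draw_waveform`, the output path, and the gray level are illustrative choices, not something taken from the answer above.

```python
def column_extents(scaled_loudness, height=180):
    """Return (top, bottom) pixel rows for each column, centered vertically."""
    mid = height / 2.0
    extents = []
    for value in scaled_loudness:
        # clamp to the image height, then split the bar around the center line
        half = max(0.0, min(float(value), height)) / 2.0
        top = int(mid - half)
        bottom = min(height - 1, int(mid + half))
        extents.append((top, bottom))
    return extents


def draw_waveform(scaled_loudness, path, height=180, color=80):
    """Draw one vertical line per column and save the image to `path`."""
    # imported here so column_extents stays usable without Pillow
    from PIL import Image, ImageDraw

    width = len(scaled_loudness)
    im = Image.new("L", (width, height), color=255)  # white grayscale canvas
    draw = ImageDraw.Draw(im)
    for x, (top, bottom) in enumerate(column_extents(scaled_loudness, height)):
        draw.line((x, top, x, bottom), fill=color, width=1)
    im.save(path)  # format is inferred from the file extension
```

Calling `draw_waveform(scaled_loudness, "waveform.png")` with 180 values would produce a 180×180 image.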
Convert max_rms to a float. That would help. - Remco
Based on Jiaaro's answer (thanks for writing pydub!), and built for web2py, here's my take:
import io
import os

import pydub
from PIL import Image, ImageDraw, ImageFilter

# `request` and `response` are globals that web2py provides inside a controller

def generate_waveform():
    img_width = 1170
    img_height = 140
    line_color = 180
    filename = os.path.join(request.folder, 'static', 'sounds', 'adg3.mp3')
    # first I'll open the audio file
    sound = pydub.AudioSegment.from_mp3(filename)
    # break the sound into img_width even chunks, one per pixel column
    chunk_length = len(sound) / img_width
    loudness_of_chunks = [
        sound[i * chunk_length : (i + 1) * chunk_length].rms
        for i in range(img_width)
    ]
    max_rms = float(max(loudness_of_chunks))
    scaled_loudness = [round(loudness * img_height / max_rms) for loudness in loudness_of_chunks]
    # now convert the scaled_loudness to an image
    im = Image.new('L', (img_width, img_height), color=255)
    draw = ImageDraw.Draw(im)
    for x, rms in enumerate(scaled_loudness):
        y0 = img_height - rms
        y1 = img_height
        draw.line((x, y0, x, y1), fill=line_color, width=1)
    del draw
    im = im.filter(ImageFilter.SMOOTH).filter(ImageFilter.DETAIL)
    # PNG data is binary, so use a bytes buffer (replaces Python 2's cStringIO)
    buffer = io.BytesIO()
    im.save(buffer, 'PNG')
    buffer.seek(0)
    return response.stream(buffer, filename=filename + '.png')