Reading a single channel from a multi-channel wav file

Question

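I need to extract the samples of a single channel from a wav file that contains up to 12 (11.1 format) channels. I know that within a normal stereo file the samples are interleaved, first left, then right, like so: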

[1st L] [1st R] [2nd L] [2nd R]...

So, to read the left channel I'd do this,

for (var i = 0; i < myByteArray.Length; i += (bitDepth / 8) * 2)
{
    // Get bytes and convert to actual samples.
}

And to get the right channel I'd simply do for (var i = (bitDepth / 8)...

But what order is used for files with more than 2 channels?

- Sam

1 Answer

Content originally from Stack Overflow.

- Sam · Accepted Answer

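Microsoft have created a standard that covers up to 18 channels. According to them, the wav file needs to have a special meta sub-chunk (under the "Extensible Format" section) that specifies a "channel mask" (dwChannelMask). This field is 4 bytes long (a uint) and contains the corresponding bit of each channel that is present, therefore indicating which of the 18 channels are used within the file.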

The Master Channel Layout

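Below is the MCL, that is, the order in which existing channels should be interleaved, along with the bit value for each channel. If a channel is not present, the next channel that is present will "drop down" into the place of the missing channel and use its order number instead, but never its bit value. (Bit values are unique to each channel regardless of whether the channel is present.)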

Order | Bit | Channel

 1.     0x1  Front Left
 2.     0x2  Front Right
 3.     0x4  Front Center
 4.     0x8  Low Frequency (LFE)
 5.    0x10  Back Left (Surround Back Left)
 6.    0x20  Back Right (Surround Back Right)
 7.    0x40  Front Left of Center
 8.    0x80  Front Right of Center
 9.   0x100  Back Center
10.   0x200  Side Left (Surround Left)
11.   0x400  Side Right (Surround Right)
12.   0x800  Top Center
13.  0x1000  Top Front Left
14.  0x2000  Top Front Center
15.  0x4000  Top Front Right
16.  0x8000  Top Back Left
17. 0x10000  Top Back Center
18. 0x20000  Top Back Right

Example: if the channel mask is 0x63F (1599), this indicates that the file contains 8 channels (FL, FR, FC, LFE, BL, BR, SL & SR).
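The example above can be reproduced with a small sketch that walks the 18 MCL bits and lists the channels present in a mask, in interleave order. The class name, method name and the abbreviations for the "Top" channels (TC, TFL, and so on) are made up for illustration; only the bit positions come from the table.

```csharp
using System;
using System.Collections.Generic;

public static class ChannelMaskDecoder
{
    // Channel abbreviations in MCL order; index n corresponds to bit (1 << n).
    // The "T..." abbreviations are illustrative names for the Top channels.
    private static readonly string[] MclNames =
    {
        "FL", "FR", "FC", "LFE", "BL", "BR", "FLoC", "FRoC", "BC",
        "SL", "SR", "TC", "TFL", "TFC", "TFR", "TBL", "TBC", "TBR"
    };

    // Returns the channels present in the mask, in the order they are interleaved.
    public static List<string> Decode(uint channelMask)
    {
        var present = new List<string>();

        for (var bit = 0; bit < MclNames.Length; bit++)
        {
            if ((channelMask & (1u << bit)) != 0)
            {
                present.Add(MclNames[bit]);
            }
        }

        return present;
    }
}
```

Calling ChannelMaskDecoder.Decode(0x63F) yields FL, FR, FC, LFE, BL, BR, SL, SR, matching the 8 channels listed above.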

Reading and checking the channel mask

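To get the mask, you'll need to read the 40th, 41st, 42nd and 43rd bytes (assuming a base index of 0, and that you're reading a standard wav header). These offsets only hold when the "fmt " chunk directly follows the RIFF header and its format tag (bytes 20 and 21) is 0xFFFE, i.e. WAVE_FORMAT_EXTENSIBLE. For example,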

var bytes = new byte[50];

using (var stream = new FileStream("filepath...", FileMode.Open))
{
    stream.Read(bytes, 0, 50);
}

// dwChannelMask is stored little-endian, which is what BitConverter reads on little-endian machines.
var speakerMask = BitConverter.ToUInt32(bytes, 40);
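Because bytes 40 to 43 only hold the mask in an extensible header, it may be worth validating the header before trusting them. The following is a minimal sketch under the assumption that the "fmt " chunk comes first in the file; WavHeaderCheck and TryGetChannelMask are hypothetical names, and the code relies on BitConverter running on a little-endian machine (see BitConverter.IsLittleEndian).

```csharp
using System;

public static class WavHeaderCheck
{
    // Format tag that marks an extensible "fmt " chunk (WAVE_FORMAT_EXTENSIBLE).
    private const ushort WaveFormatExtensible = 0xFFFE;

    // Returns the channel mask, or null when the header does not match
    // the fixed layout that puts dwChannelMask at bytes 40-43.
    public static uint? TryGetChannelMask(byte[] header)
    {
        if (header == null || header.Length < 44)
        {
            return null;
        }

        // "fmt " must directly follow the 12-byte RIFF/WAVE preamble.
        var isFmtFirst = header[12] == 'f' && header[13] == 'm'
                      && header[14] == 't' && header[15] == ' ';
        if (!isFmtFirst)
        {
            return null;
        }

        // The format tag sits at bytes 20-21.
        if (BitConverter.ToUInt16(header, 20) != WaveFormatExtensible)
        {
            return null;
        }

        return BitConverter.ToUInt32(header, 40);
    }
}
```

A null result is the cue to fall back to guessing a mask from the channel count.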

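Then you need to check whether the desired channel actually exists. To do this, I'd suggest creating an enum (defined with [Flags]) that contains all the channels and their respective values.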

[Flags]
public enum Channels : uint
{
    FrontLeft = 0x1,
    FrontRight = 0x2,
    FrontCenter = 0x4,
    Lfe = 0x8,
    BackLeft = 0x10,
    BackRight = 0x20,
    FrontLeftOfCenter = 0x40,
    FrontRightOfCenter = 0x80,
    BackCenter = 0x100,
    SideLeft = 0x200,
    SideRight = 0x400,
    TopCenter = 0x800,
    TopFrontLeft = 0x1000,
    TopFrontCenter = 0x2000,
    TopFrontRight = 0x4000,
    TopBackLeft = 0x8000,
    TopBackCenter = 0x10000,
    TopBackRight = 0x20000
}

Finally, check to see if the channel is present.
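The check itself is a single bitwise AND against the mask. A minimal sketch follows; ChannelPresence and IsPresent are hypothetical names, and the helper takes the raw bit value so that it stays self-contained (with the enum above you would pass, for example, (uint)Channels.SideLeft).

```csharp
public static class ChannelPresence
{
    // channelBit is a single MCL bit value, e.g. 0x200 for Side Left.
    public static bool IsPresent(uint speakerMask, uint channelBit)
    {
        return (speakerMask & channelBit) == channelBit;
    }
}
```

With the mask 0x63F, IsPresent(0x63F, 0x200) is true (Side Left), while IsPresent(0x63F, 0x40) is false (Front Left of Center).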

What if the channel mask doesn't exist?

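Create one yourself! Based on the file's channel count, you will either have to guess which channels are used or just blindly follow the MCL. In the code snippet below we're doing a bit of both.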

public static uint GetSpeakerMask(int channelCount)
{
    // Assume setup of: FL, FR, FC, LFE, BL, BR, SL & SR. Otherwise MCL will use: FL, FR, FC, LFE, BL, BR, FLoC & FRoC.
    if (channelCount == 8)
    {
        return 0x63F; 
    }

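    // Otherwise follow MCL: set the bits of the first channelCount channels, in MCL order.
    uint mask = 0;
    var channels = Enum.GetValues(typeof(Channels)).Cast<uint>().ToArray();

    // Clamp the count so that more than 18 channels cannot index past the enum values.
    for (var i = 0; i < Math.Min(channelCount, channels.Length); i++)
    {
        mask |= channels[i];
    }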

    return mask;
}

Extracting the samples

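To read the samples of a particular channel, you do what you'd do for a stereo file, that is, increment your loop's counter by the frame size (in bytes).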

frameSize = (bitDepth / 8) * channelCount

You also need to offset your loop's starting index. This is where things become more complicated, as you have to start reading from the channel's position based on existing channels: its order number minus one (the first channel starts at byte 0 of the frame), times the byte depth.

What do I mean "based on existing channels"? Well, you need to reassign the existing channels' order number from 1, incrementing the order for each channel that is present. For example, the channel mask 0x63F indicates that the FL, FR, FC, LFE, BL, BR, SL & SR channels are used, therefore the new channel order numbers for the respective channels would look like this (note, the bit values are not and should not ever be changed):

Order | Bit | Channel

 1.     0x1  Front Left
 2.     0x2  Front Right
 3.     0x4  Front Center
 4.     0x8  Low Frequency (LFE)
 5.    0x10  Back Left (Surround Back Left)
 6.    0x20  Back Right (Surround Back Right)
 7.   0x200  Side Left (Surround Left)
 8.   0x400  Side Right (Surround Right)

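You'll notice that FLoC, FRoC & BC are all missing, therefore the SL & SR channels "drop down" into the next lowest available order numbers, rather than using SL & SR's default order (10, 11).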
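The "drop down" rule means a channel's position in the frame equals the number of present channels with a lower bit value. As a hedged sketch of that idea (ChannelOrder and its method names are hypothetical, and channelBit is assumed to have exactly one bit set):

```csharp
public static class ChannelOrder
{
    // Returns the zero-based interleave position of channelBit within
    // speakerMask, or -1 when the channel is not present.
    public static int GetInterleaveIndex(uint speakerMask, uint channelBit)
    {
        if ((speakerMask & channelBit) == 0)
        {
            return -1;
        }

        // Every present channel with a lower bit value is interleaved first.
        var lowerChannels = speakerMask & (channelBit - 1);
        var index = 0;

        while (lowerChannels != 0)
        {
            lowerChannels &= lowerChannels - 1; // clear the lowest set bit
            index++;
        }

        return index;
    }

    // Byte offset of the channel inside one frame, or -1 when absent.
    public static int GetByteOffsetInFrame(uint speakerMask, uint channelBit, int bitDepth)
    {
        var index = GetInterleaveIndex(speakerMask, channelBit);
        return index < 0 ? -1 : index * (bitDepth / 8);
    }
}
```

For the mask 0x63F, Side Left (0x200) gets index 6, i.e. order number 7 in the table above, and at 24 bits per sample it starts 18 bytes into each 24-byte frame.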

Summing up

So, to read the bytes of a single channel you'd need to do something similar to this.

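// This code will only return the bytes of a particular channel. It's up to you to convert the bytes to actual samples.
// Note: the inner loop indexes audioBytes relative to sampleStartIndex, so audioBytes is expected
// to begin at the frame sampleStartIndex rather than at the start of the data chunk.
public static byte[] GetChannelBytes(byte[] audioBytes, uint speakerMask, Channels channelToRead, int bitDepth, uint sampleStartIndex, uint sampleEndIndex)
{
    var channels = FindExistingChannels(speakerMask);
    var ch = GetChannelNumber(channelToRead, channels);

    // GetChannelNumber returns -1 when the requested channel is not in the mask.
    if (ch < 0)
    {
        throw new ArgumentException("The requested channel is not present in the speaker mask.", nameof(channelToRead));
    }

    var byteDepth = bitDepth / 8;
    var chOffset = ch * byteDepth;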
    var frameBytes = byteDepth * channels.Length;
    var startByteIncIndex = sampleStartIndex * byteDepth * channels.Length;
    var endByteIncIndex = sampleEndIndex * byteDepth * channels.Length;
    var outputBytesCount = endByteIncIndex - startByteIncIndex;
    var outputBytes = new byte[outputBytesCount / channels.Length];
    var i = 0;

    startByteIncIndex += chOffset;

    for (var j = startByteIncIndex; j < endByteIncIndex; j += frameBytes)
    {
        for (var k = j; k < j + byteDepth; k++)
        {
            outputBytes[i] = audioBytes[(k - startByteIncIndex) + chOffset];
            i++;
        }
    }

    return outputBytes;
}

private static Channels[] FindExistingChannels(uint speakerMask)
{
    var foundChannels = new List<Channels>();

    foreach (var ch in Enum.GetValues(typeof(Channels)))
    {
        if ((speakerMask & (uint)ch) == (uint)ch)
        {
            foundChannels.Add((Channels)ch);
        }
    }

    return foundChannels.ToArray();
}

private static int GetChannelNumber(Channels input, Channels[] existingChannels)
{
    for (var i = 0; i < existingChannels.Length; i++)
    {
        if (existingChannels[i] == input)
        {
            return i;
        }
    }

    return -1;
}
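As for converting the returned bytes to actual samples, here is a minimal sketch for the 16-bit PCM case only, where each sample is a signed little-endian 16-bit integer. SampleConverter and ToInt16Samples are hypothetical names; other bit depths (24-bit, 32-bit float) need their own conversion.

```csharp
using System;

public static class SampleConverter
{
    // Converts little-endian 16-bit PCM bytes to samples. Assumes bitDepth == 16.
    public static short[] ToInt16Samples(byte[] channelBytes)
    {
        // A trailing odd byte, if any, is ignored.
        var samples = new short[channelBytes.Length / 2];

        for (var i = 0; i < samples.Length; i++)
        {
            // Assemble the value explicitly so the result does not depend
            // on the endianness of the machine running the code.
            samples[i] = (short)(channelBytes[2 * i] | (channelBytes[2 * i + 1] << 8));
        }

        return samples;
    }
}
```

For example, the bytes 01 00 FF FF 00 80 convert to the samples 1, -1 and -32768.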