Real-time audio streaming in Java

Question

java sockets audio-streaming audio-recording

13

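I am streaming live audio from a microphone to a Java server on another computer, but all I hear is white noise.

I have attached both the client and the server programs.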

Client:

import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketException;
import java.net.UnknownHostException;

import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.TargetDataLine;

public class Mic 
{
    public byte[] buffer;
    private int port;
    static AudioInputStream ais;

    public static void main(String[] args)
    {
        TargetDataLine line;
        DatagramPacket dgp; 

        AudioFormat.Encoding encoding = AudioFormat.Encoding.PCM_SIGNED;
        float rate = 44100.0f;
        int channels = 2;
        int sampleSize = 16;
        boolean bigEndian = true;
        InetAddress addr;


        AudioFormat format = new AudioFormat(encoding, rate, sampleSize, channels, (sampleSize / 8) * channels, rate, bigEndian);

        DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);
        if (!AudioSystem.isLineSupported(info)) {
            System.out.println("Line matching " + info + " not supported.");
            return;
        }

        try
        {
            line = (TargetDataLine) AudioSystem.getLine(info);

            int buffsize = line.getBufferSize()/5;
            buffsize += 512; 

            line.open(format);

            line.start();   

            int numBytesRead;
            byte[] data = new byte[buffsize];

            addr = InetAddress.getByName("127.0.0.1");
            DatagramSocket socket = new DatagramSocket();
            while (true) {
                   // Read the next chunk of data from the TargetDataLine.
                   numBytesRead =  line.read(data, 0, data.length);
                   // Save this chunk of data.
                   dgp = new DatagramPacket (data,data.length,addr,50005);

                   socket.send(dgp);
                }

        }catch (LineUnavailableException e) {
            e.printStackTrace();
        }catch (UnknownHostException e) {
            // TODO: handle exception
        } catch (SocketException e) {
            // TODO: handle exception
        } catch (IOException e2) {
            // TODO: handle exception
        }
    }
}

The server side has no problems; it works perfectly with an Android client that uses AudioRecord.

Server:

import java.io.ByteArrayInputStream;
import java.net.DatagramPacket;
import java.net.DatagramSocket;

import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.SourceDataLine;

public class Server {

    AudioInputStream audioInputStream;
    static AudioInputStream ais;
    static AudioFormat format;
    static boolean status = true;
    static int port = 50005;
    static int sampleRate = 44100;

    static DataLine.Info dataLineInfo;
    static SourceDataLine sourceDataLine;

    public static void main(String args[]) throws Exception 
    {
        System.out.println("Server started at port:"+port);

        DatagramSocket serverSocket = new DatagramSocket(port);

        /**
         * Formula for lag = (byte_size/sample_rate)*2
         * Byte size 9728 will produce ~ 0.45 seconds of lag. Voice slightly broken.
         * Byte size 1400 will produce ~ 0.06 seconds of lag. Voice extremely broken.
         * Byte size 4000 will produce ~ 0.18 seconds of lag. Voice slightly more broken then 9728.
         */

        byte[] receiveData = new byte[4096];

        format = new AudioFormat(sampleRate, 16, 1, true, false);
        dataLineInfo = new DataLine.Info(SourceDataLine.class, format);
        sourceDataLine = (SourceDataLine) AudioSystem.getLine(dataLineInfo);
        sourceDataLine.open(format);
        sourceDataLine.start();

        //FloatControl volumeControl = (FloatControl) sourceDataLine.getControl(FloatControl.Type.MASTER_GAIN);
        //volumeControl.setValue(1.00f);

        DatagramPacket receivePacket = new DatagramPacket(receiveData, receiveData.length);

        ByteArrayInputStream baiss = new ByteArrayInputStream(receivePacket.getData());

        while (status == true) 
        {
            serverSocket.receive(receivePacket);
            ais = new AudioInputStream(baiss, format, receivePacket.getLength());
            toSpeaker(receivePacket.getData());
        }

        sourceDataLine.drain();
        sourceDataLine.close();
    }

    public static void toSpeaker(byte soundbytes[]) {
        try 
        {
            System.out.println("At the speaker");
            sourceDataLine.write(soundbytes, 0, soundbytes.length);
        } catch (Exception e) {
            System.out.println("Not working in speakers...");
            e.printStackTrace();
        }
    }
}

- user4488923

5 Answers

4

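This is an old question, but solving it helped me somewhat, and I think what I found may help others, so here is how I solved the problem you describe:

On my computer, changing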

boolean bigEndian = true;

to

boolean bigEndian = false;

solved the white-noise problem (it is clearly a byte-order problem).
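
To illustrate why a byte-order mismatch sounds like noise, here is a minimal sketch. The class and method names (`EndianDemo`, `decode`) are hypothetical and only serve the illustration. It decodes the same two bytes as one 16-bit sample in both byte orders; the client in the question captures big-endian samples, while the server's format is declared little-endian.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class EndianDemo {
    /** Decodes the first two bytes of pcm as one signed 16-bit sample. */
    static short decode(byte[] pcm, ByteOrder order) {
        return ByteBuffer.wrap(pcm).order(order).getShort();
    }
}
```

For the bytes `{0x01, 0x00}`, big-endian decoding gives 256 and little-endian decoding gives 1, so every sample is interpreted with its high and low bytes swapped when the two sides disagree.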

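If that is the only change you make, the resulting audio will have a lower pitch, because two channels are captured on the microphone side while only one channel is played on the speaker side.

To fix this, simply change this line: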

format = new AudioFormat(sampleRate, 16, 1, true, false);

to

format = new AudioFormat(sampleRate, 16, 2, true, false);

The audio should then be clear and intelligible.
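
One way to keep the two sides from drifting apart is to build the format in a single place and use it in both programs. The following is only a sketch under that assumption; `StreamFormat` and `create` are hypothetical names, and the values are the ones arrived at above (44100 Hz, 16-bit, stereo, signed, little-endian).

```java
import javax.sound.sampled.AudioFormat;

final class StreamFormat {
    static final float SAMPLE_RATE = 44100.0f;
    static final int SAMPLE_SIZE_BITS = 16;
    static final int CHANNELS = 2;

    /** Signed 16-bit little-endian stereo PCM, shared by client and server. */
    static AudioFormat create() {
        // Arguments: sample rate, bits per sample, channels, signed, bigEndian
        return new AudioFormat(SAMPLE_RATE, SAMPLE_SIZE_BITS, CHANNELS, true, false);
    }
}
```

Both `Mic` and `Server` could then call `StreamFormat.create()` instead of constructing their own `AudioFormat`.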

- Don Joe

3

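When the client and server use data buffers of different sizes, one of them may be truncated, which can produce artifacts in one or both.

Your server buffer size is set with byte[] receiveData = new byte[4096]; For some reason your client buffer size is dynamic and is set with byte[] data = new byte[buffsize]; Set the client buffer to a fixed 4096 bytes to match the server: byte[] data = new byte[4096]; or otherwise make sure the two sizes are the same.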
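
The sizing advice can be sketched as follows. `PacketSizing`, `frameAligned`, and `packetFor` are hypothetical helpers, not part of the original programs. The sketch keeps each payload within the server's 4096-byte receive buffer, rounds it to whole audio frames, and sends only the bytes that were actually read.

```java
import java.net.DatagramPacket;
import java.net.InetAddress;

final class PacketSizing {
    /** Matches the server's receive buffer size in the question. */
    static final int MAX_PAYLOAD = 4096;

    /** Largest size not exceeding maxBytes that is a whole number of frames. */
    static int frameAligned(int maxBytes, int frameSize) {
        return maxBytes - (maxBytes % frameSize);
    }

    /** Wraps only the bytes actually read, not the whole buffer. */
    static DatagramPacket packetFor(byte[] data, int numBytesRead, InetAddress addr, int port) {
        return new DatagramPacket(data, numBytesRead, addr, port);
    }
}
```

On the receiving side, `DatagramSocket.receive` truncates any message longer than the packet's buffer, and `receivePacket.getLength()` reports how many bytes actually arrived, so writing only that many bytes to the `SourceDataLine` avoids playing stale data from the end of the array.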

- user3674935

1

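I suggest you first write the audio recorded by the client to a file, so you can verify that the captured audio is correct. You can use a tool such as sox to convert the PCM to WAV.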
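
As an alternative to converting with sox, the captured bytes can be written straight to a WAV file from Java. This is a rough sketch rather than the approach described above; `WavDump` and `write` are hypothetical names, and it assumes the captured PCM is already collected in a byte array.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;

final class WavDump {
    /** Writes raw PCM bytes, described by format, to a WAV file. */
    static void write(byte[] pcm, AudioFormat format, File out) throws IOException {
        // The stream length is given in frames, not bytes.
        long frames = pcm.length / format.getFrameSize();
        try (AudioInputStream ais =
                 new AudioInputStream(new ByteArrayInputStream(pcm), format, frames)) {
            AudioSystem.write(ais, AudioFileFormat.Type.WAVE, out);
        }
    }
}
```

Opening the resulting file in any audio player shows whether the capture side is already producing noise before the network is involved.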

- Tim

0

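It is very important that the audio formats of the client and the server match. For example, in Client.java, change the format to: format = new AudioFormat(sampleRate, 16, 1, true, false); The two programs also need to use the same buffer size.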

- Mauricio

- Michael Macha · Accepted Answer

I fed a sine wave (or something like one) in place of the microphone input, and your program worked fine.

My specific changes are as follows:

package audioclient;

import java.io.*;
import java.net.*;
import java.nio.ByteBuffer;

import javax.sound.sampled.*;

public class Mic {
    public byte[] buffer;
    private int port;
    static AudioInputStream ais;

    public static void main(String[] args) {
        TargetDataLine line;
        DatagramPacket dgp;

        AudioFormat.Encoding encoding = AudioFormat.Encoding.PCM_SIGNED;
        float rate = 44100.0f;
        int channels = 2;
        int sampleSize = 16;
        boolean bigEndian = true;
        InetAddress addr;

        AudioFormat format = new AudioFormat(encoding, rate, sampleSize, channels, (sampleSize / 8) * channels, rate, bigEndian);

        DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);
        if (!AudioSystem.isLineSupported(info)) {
            System.out.println("Line matching " + info + " not supported.");
            return;
        }

        try {
            line = (TargetDataLine) AudioSystem.getLine(info);

            //TOTALLY missed this.
            int buffsize = line.getBufferSize() / 5;
            buffsize += 512;

            line.open(format);

            line.start();

            int numBytesRead;
            byte[] data = new byte[buffsize];

            /*
             * MICK's injection: We have a buffsize of 512; it is best if the frequency
             * evenly fits into this (avoid skips, bumps, and pops). Additionally, 44100 Hz,
             * with two channels and two bytes per sample. That's four bytes; divide
             * 512 by it, you have 128.
             * 
             * 128 samples, 44100 per second; that's a minimum of 344 samples, or 172 Hz.
             * Well within hearing range; slight skip from the uneven division. Maybe
             * bump it up to 689 Hz.
             * 
             * That's a sine wave of shorts, repeated twice for two channels, with a
             * wavelength of 32 samples.
             * 
             * Note: Changed my mind, ignore specific numbers above.
             * 
             */
            {
                final int λ = 16;
                ByteBuffer buffer = ByteBuffer.allocate(λ * 2 * 8);
                for(int j = 0; j < 2; j++) {
                    for(double i = 0.0; i < λ; i++) {
                        System.out.println(j + " " + i);
                        //once for each sample
                        buffer.putShort((short)(Math.sin(Math.PI * (λ/i)) * Short.MAX_VALUE));
                        buffer.putShort((short)(Math.sin(Math.PI * (λ/i)) * Short.MAX_VALUE));
                    }
                }

                data = buffer.array();
            }

            addr = InetAddress.getByName("127.0.0.1");
            try(DatagramSocket socket = new DatagramSocket()) {
                while (true) {
                    for(byte b : data) System.out.print(b + " ");

                    // Read the next chunk of data from the TargetDataLine.
//                  numBytesRead = line.read(data, 0, data.length);

                    for(int i = 0; i < 64; i++) {
                        byte b = data[i];
                        System.out.print(b + " ");
                    }
                    System.out.println();

                    // Save this chunk of data.
                    dgp = new DatagramPacket(data, data.length, addr, 50005);    

                    for(int i = 0; i < 64; i++) {
                        byte b = dgp.getData()[i];
                        System.out.print(b + " ");
                    }
                    System.out.println();

                    socket.send(dgp);
                }
            }

        } catch (LineUnavailableException e) {
            e.printStackTrace();
        } catch (UnknownHostException e) {
            // TODO: handle exception
        } catch (SocketException e) {
            // TODO: handle exception
        } catch (IOException e2) {
            // TODO: handle exception
        }
    }
}

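Obviously I misread it as a 512-byte-long fragment and botched the sine wave, but the point is that it produced exactly the sound it should have: a mind-numbing rattle at a particular frequency.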
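
For anyone who wants a cleaner test signal, a sine tone can be generated along these lines. This is a sketch and not the code used above; `SineTone` and `generate` are hypothetical names, and it assumes 16-bit signed stereo PCM with the byte order passed in so that it can match the `bigEndian` flag of the `AudioFormat`.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class SineTone {
    /**
     * Returns 16-bit signed stereo PCM for a sine tone.
     * Each frame holds the same sample twice (left, then right).
     */
    static byte[] generate(double frequencyHz, float sampleRate, int frames, ByteOrder order) {
        ByteBuffer buffer = ByteBuffer.allocate(frames * 2 * 2).order(order);
        for (int n = 0; n < frames; n++) {
            double phase = 2.0 * Math.PI * frequencyHz * n / sampleRate;
            short sample = (short) Math.round(Math.sin(phase) * Short.MAX_VALUE);
            buffer.putShort(sample); // left channel
            buffer.putShort(sample); // right channel
        }
        return buffer.array();
    }
}
```

Choosing a frame count that covers a whole number of periods (for example 100 frames of a 441 Hz tone at 44100 Hz) lets the same buffer be sent repeatedly without a click at the seam.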

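Given that, I don't think the problem is strictly in your code. The first thing I would check is which line your system is using for audio. Do you have more than one microphone connected? Perhaps a webcam microphone? You can check with a tool such as PulseAudio Volume Control. If you haven't already verified that the microphone works, do that too; they do have a limited lifespan.

Scrambling the bits in an audio stream is neither uncommon nor difficult, but I don't see anywhere that you might be doing that.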

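One idea is to modify your program to play the sound locally before sending it to the server. That way you can at least determine whether the problem is before or after Mic.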
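
A local playback check could look roughly like this. It is a sketch, assuming the default capture and playback devices support the given format; `LocalMonitor`, `chunkBytes`, and `monitor` are hypothetical names, and the 20 ms chunk length is an arbitrary choice.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.SourceDataLine;
import javax.sound.sampled.TargetDataLine;

final class LocalMonitor {
    /** Bytes for roughly millis ms of audio, rounded down to whole frames. */
    static int chunkBytes(AudioFormat format, int millis) {
        int frames = (int) (format.getFrameRate() * millis / 1000.0f);
        return frames * format.getFrameSize();
    }

    /** Copies microphone input straight to the default output for about the given time. */
    static void monitor(AudioFormat format, int seconds) throws LineUnavailableException {
        TargetDataLine mic = AudioSystem.getTargetDataLine(format);
        SourceDataLine speaker = AudioSystem.getSourceDataLine(format);
        try {
            mic.open(format);
            speaker.open(format);
            mic.start();
            speaker.start();
            byte[] chunk = new byte[chunkBytes(format, 20)];
            long end = System.currentTimeMillis() + seconds * 1000L;
            while (System.currentTimeMillis() < end) {
                int n = mic.read(chunk, 0, chunk.length);
                if (n > 0) {
                    speaker.write(chunk, 0, n); // play exactly what was captured
                }
            }
            speaker.drain();
        } finally {
            mic.close();
            speaker.close();
        }
    }
}
```

If the sound is already noisy here, the problem lies in the capture device or format; if it is clean, the network path or the server's format is the more likely culprit.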