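I am trying to implement Nexmo's Voice API using WebSockets in a .NET Core 2 Web API. The API needs to:
- Receive audio from a phone call through Nexmo
- Run it through the Microsoft Cognitive speech-to-text API
- Send the text to a bot
- Run the bot's reply through Microsoft Cognitive text-to-speech
- Send the speech back to Nexmo through their Voice API WebSocket
For now I am bypassing the bot step, because I first want to get the WebSocket connection working. When I try an echo method (sending the received audio straight back over the WebSocket), it works fine. But when I try to send the speech produced by Microsoft text-to-speech, the phone call ends.
I have not found any documentation that implements anything other than an echo.
The TextToSpeech and SpeechToText methods work as expected when used outside the WebSocket.
Here is the WebSocket handler with speech-to-text: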
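    public static async Task Echo(HttpContext context, WebSocket webSocket)
    {
        var buffer = new byte[1024 * 4];
        WebSocketReceiveResult result = await webSocket.ReceiveAsync(new ArraySegment<byte>(buffer), CancellationToken.None);
        while (!result.CloseStatus.HasValue)
        {
            // Keep reading until the current message is complete.
            // Each receive overwrites the buffer, so only the last fragment is kept.
            while (!result.EndOfMessage)
            {
                result = await webSocket.ReceiveAsync(new ArraySegment<byte>(buffer), CancellationToken.None);
            }
            // Transcribe the received audio and print the recognized text.
            var text = await SpeechToText.RecognizeSpeechFromBytesAsync(buffer);
            Console.WriteLine(text);
            // Read the next message so the loop can observe new audio or a close frame.
            result = await webSocket.ReceiveAsync(new ArraySegment<byte>(buffer), CancellationToken.None);
        }
        await webSocket.CloseAsync(result.CloseStatus.Value, result.CloseStatusDescription, CancellationToken.None);
    }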
And here is the WebSocket handler with text-to-speech:
    public static async Task Echo(HttpContext context, WebSocket webSocket)
    {
        var buffer = new byte[1024 * 4];
        WebSocketReceiveResult result = await webSocket.ReceiveAsync(new ArraySegment<byte>(buffer), CancellationToken.None);
        while (!result.CloseStatus.HasValue)
        {
            // Synthesize a fixed test phrase and send the whole audio as a single binary message.
            var ttsAudio = await TextToSpeech.TransformTextToSpeechAsync("Hello, this is a test", "en-US");
            await webSocket.SendAsync(new ArraySegment<byte>(ttsAudio, 0, ttsAudio.Length), WebSocketMessageType.Binary, true, CancellationToken.None);
            result = await webSocket.ReceiveAsync(new ArraySegment<byte>(buffer), CancellationToken.None);
        }
        await webSocket.CloseAsync(result.CloseStatus.Value, result.CloseStatusDescription, CancellationToken.None);
    }
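Update, 1 March 2019
In reply to Sam Machin's comment: I tried splitting the array into chunks of 640 bytes each (I am using a 16 kHz sample rate), but Nexmo still hangs up the call and I still hear nothing.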
    public static async Task NexmoTextToSpeech(HttpContext context, WebSocket webSocket)
    {
        var ttsAudio = await TextToSpeech.TransformTextToSpeechAsync("This is a test", "en-US");
        var buffer = new byte[1024 * 4];
        WebSocketReceiveResult result = await webSocket.ReceiveAsync(new ArraySegment<byte>(buffer), CancellationToken.None);
        while (!result.CloseStatus.HasValue)
        {
            // Send the synthesized speech after every message received from Nexmo.
            await SendSpeech(context, webSocket, ttsAudio);
            result = await webSocket.ReceiveAsync(new ArraySegment<byte>(buffer), CancellationToken.None);
        }
        await webSocket.CloseAsync(WebSocketCloseStatus.NormalClosure, "Closing Socket", CancellationToken.None);
    }

    private static async Task SendSpeech(HttpContext context, WebSocket webSocket, byte[] ttsAudio)
    {
        const int chunkSize = 640;
        var chunkCount = 1;
        var offset = 0;
        // True once fewer than chunkSize bytes remain after the current offset.
        var lastFullChunk = ttsAudio.Length < (offset + chunkSize);
        try
        {
            // Send every full 640-byte chunk as a non-final fragment (endOfMessage = false).
            while (!lastFullChunk)
            {
                await webSocket.SendAsync(new ArraySegment<byte>(ttsAudio, offset, chunkSize), WebSocketMessageType.Binary, false, CancellationToken.None);
                offset = chunkSize * chunkCount;
                lastFullChunk = ttsAudio.Length < (offset + chunkSize);
                chunkCount++;
            }
            // Send the remaining bytes as the final fragment (endOfMessage = true).
            var lastMessageSize = ttsAudio.Length - offset;
            await webSocket.SendAsync(new ArraySegment<byte>(ttsAudio, offset, lastMessageSize), WebSocketMessageType.Binary, true, CancellationToken.None);
        }
        catch (Exception ex)
        {
            // Note: exceptions are silently swallowed here.
        }
    }
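For reference, this is the arithmetic behind the 640-byte chunk size, sketched as a standalone helper. It assumes 16-bit mono linear PCM and 20 ms frames, which is my understanding of what the Nexmo WebSocket expects at 16 kHz and may be wrong. `PcmFraming`, `FrameSizeBytes` and `SplitIntoFrames` are hypothetical names, not part of my project or of any Nexmo library, and zero-padding the last frame is also an assumption on my part.

```csharp
using System;
using System.Collections.Generic;

public static class PcmFraming
{
    // Number of bytes in one frame of mono PCM audio.
    // Assumption: frameMs = 20 and bytesPerSample = 2 (16-bit) for the Nexmo case.
    public static int FrameSizeBytes(int sampleRateHz, int frameMs, int bytesPerSample)
    {
        return sampleRateHz * frameMs / 1000 * bytesPerSample;
    }

    // Splits audio into frames of exactly frameSize bytes.
    // The last frame is zero-padded, which is silence for signed 16-bit PCM.
    public static List<byte[]> SplitIntoFrames(byte[] audio, int frameSize)
    {
        if (frameSize <= 0) throw new ArgumentOutOfRangeException(nameof(frameSize));
        var frames = new List<byte[]>();
        for (var offset = 0; offset < audio.Length; offset += frameSize)
        {
            var frame = new byte[frameSize]; // new arrays are zero-initialized
            var count = Math.Min(frameSize, audio.Length - offset);
            Buffer.BlockCopy(audio, offset, frame, 0, count);
            frames.Add(frame);
        }
        return frames;
    }

    public static void Main()
    {
        var frameSize = FrameSizeBytes(16000, 20, 2);
        var frames = SplitIntoFrames(new byte[1500], frameSize);
        // Prints "640 bytes per frame, 3 frames"
        Console.WriteLine($"{frameSize} bytes per frame, {frames.Count} frames");
    }
}
```

With these assumptions, 16000 samples per second times 0.02 seconds times 2 bytes per sample gives 640 bytes, and the same formula gives 320 bytes at 8 kHz.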
Here is the exception that sometimes appears in the logs:
System.Net.WebSockets.WebSocketException (0x80004005): The remote party closed the WebSocket connection without completing the close handshake.