How to decompress a gzipped HTTP GET response in C#

11

I want to decompress a GZip-compressed response that I get from an API. I tried the code below, but it always returns something like this:

\u001f�\b\0\0\0\0\0\0\0�Y]o........

My code is:

 private string GetResponse(string sData, string sUrl)
 {
      try
      {
           string script = null;
           try
           {
                string urlStr = @"" + sUrl + "?param=" + sData;

                Uri url = new Uri(urlStr, UriKind.Absolute);

                HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
                request.Method = "GET";
                request.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;

                using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
                using (StreamReader reader = new StreamReader(response.GetResponseStream()))
                {
                     script = reader.ReadToEnd();
                }      
           }
           catch (System.Net.Sockets.SocketException)
           {
                // The remote site is currently down. Try again next time. 
           }
           catch (UriFormatException)
           {
                // Only valid absolute URLs are accepted 
           }

           return script;
      }
      catch (Exception ex)
      {
           throw new Exception(ex.ToString());
      }
 }

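I found the automatic-decompression code above in many references, but in the end it did not work for me. So, to decompress the data myself, I tried the function below: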

 private string DecompressGZIP(string compressedText)
 {
      byte[] gZipBuffer = Convert.FromBase64String(compressedText);
      using (var memoryStream = new MemoryStream())
      {
           int dataLength = BitConverter.ToInt32(gZipBuffer, 0);
           memoryStream.Write(gZipBuffer, 4, gZipBuffer.Length - 4);

           var buffer = new byte[dataLength];

           memoryStream.Position = 0;
           using (var gZipStream = new GZipStream(memoryStream, CompressionMode.Decompress))
           {
                gZipStream.Read(buffer, 0, buffer.Length);
           }

           return Encoding.UTF8.GetString(buffer);
      }
 }
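However, it fails on its very first line with the following exception:
System.FormatException: "The input is not a valid Base-64 string as it contains a non-base 64 character, more than two padding characters, or an illegal character among the padding characters."
I am a beginner, so I hope you can guide me. Thanks in advance.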

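What? Where did you find that second code block? Why are you reading the response as a string? Why are you Base64-decoding it? That is not decompression; request.AutomaticDecompression = DecompressionMethods.GZip should already have done that for you. Please read [ask] and start over, creating a [mcve]. Also, in English you don't have to capitalize random words. - CodeCaster
3 Answers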

16

This is the key part; it takes care of decoding the gzip stream:

var clientHandler = new HttpClientHandler() { AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate }; 
var client = new HttpClient(clientHandler); 
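
For context, here is a minimal sketch of how those two lines might fit into a complete GET helper. The class name `GzipAwareClient`, the methods `CreateHandler`, `BuildUrl` and `GetStringAsync`, and the escaping of the `param` value are assumptions added for illustration; they are not part of the question's code.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

public static class GzipAwareClient
{
    // Hypothetical helper: builds a handler that transparently decodes
    // gzip and deflate response bodies.
    public static HttpClientHandler CreateHandler()
    {
        return new HttpClientHandler
        {
            AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate
        };
    }

    // Hypothetical helper: builds the same "?param=" URL as the question,
    // but escapes the value so it cannot break the query string.
    public static string BuildUrl(string baseUrl, string data)
    {
        return baseUrl + "?param=" + Uri.EscapeDataString(data);
    }

    // One shared client for the lifetime of the application.
    private static readonly HttpClient Client = new HttpClient(CreateHandler());

    public static async Task<string> GetStringAsync(string baseUrl, string data)
    {
        using (HttpResponseMessage response = await Client.GetAsync(BuildUrl(baseUrl, data)))
        {
            // Throws HttpRequestException for non-2xx status codes.
            response.EnsureSuccessStatusCode();

            // The handler has already decompressed the body at this point.
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

A call would then look like `string json = await GzipAwareClient.GetStringAsync(sUrl, sData);`, where `sUrl` and `sData` are the parameters from the question.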

You are awesome!! - Fadi
This does not work in Blazor; you get the error System.PlatformNotSupportedException: Operation is not supported on this platform. - Tailslide

6

I changed my function to the following, and it works perfectly for me:

private JObject PostingToPKFAndDecompress(string sData, string sUrl)
        {
            var jOBj = new JObject();
            try
            {

                try
                {
                    string urlStr = @"" + sUrl + "?param=" + sData;


                    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(urlStr);
                    HttpWebResponse response = (HttpWebResponse)request.GetResponse();
                    Stream resStream = response.GetResponseStream();

                    var t = ReadFully(resStream);
                    var y = Decompress(t);

                    using (var ms = new MemoryStream(y))
                    using (var streamReader = new StreamReader(ms))
                    using (var jsonReader = new JsonTextReader(streamReader))
                    {
                        jOBj = (JObject)JToken.ReadFrom(jsonReader);
                    }


                }
                catch (System.Net.Sockets.SocketException)
                {
                    // The remote site is currently down. Try again next time. 
                }

            }
            catch (Exception ex)
            {
                throw new Exception(ex.ToString());
            }
            return jOBj;
        }

        public static byte[] ReadFully(Stream input)
        {
            byte[] buffer = new byte[16 * 1024];
            using (MemoryStream ms = new MemoryStream())
            {
                int read;
                while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
                {
                    ms.Write(buffer, 0, read);
                }
                return ms.ToArray();
            }
        }

        public static byte[] Decompress(byte[] data)
        {
            using (var compressedStream = new MemoryStream(data))
            using (var zipStream = new GZipStream(compressedStream, CompressionMode.Decompress))
            using (var resultStream = new MemoryStream())
            {
                zipStream.CopyTo(resultStream);
                return resultStream.ToArray();
            }
        }
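
The garbled output in the question starts with `\u001f`, which matches the first of the two gzip magic bytes (`0x1f 0x8b`). That suggests the body was raw gzip data rather than Base64 text, which is why `Convert.FromBase64String` threw. Below is a small offline sketch of the same manual approach: check the magic bytes, then decompress. The names `GzipDemo`, `Compress`, `LooksLikeGzip` and `DecodeBody` are made up for this example, and it assumes the decompressed payload is UTF-8 text.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class GzipDemo
{
    // Produces gzip data; used here only to create sample input.
    public static byte[] Compress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(data, 0, data.Length);
            } // disposing the GZipStream writes the gzip trailer

            // ToArray still works after the MemoryStream has been closed.
            return output.ToArray();
        }
    }

    // A gzip stream always starts with the bytes 0x1f 0x8b.
    public static bool LooksLikeGzip(byte[] data)
    {
        return data != null && data.Length >= 2 && data[0] == 0x1f && data[1] == 0x8b;
    }

    // Decompresses the body if it is gzip; otherwise treats it as plain UTF-8 text.
    public static string DecodeBody(byte[] body)
    {
        if (!LooksLikeGzip(body))
        {
            return Encoding.UTF8.GetString(body);
        }

        using (var input = new MemoryStream(body))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

With this in place, `GzipDemo.DecodeBody(ReadFully(resStream))` would handle both compressed and uncompressed responses, so the caller does not need to know in advance whether the server applied gzip.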

5
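The answer to a complicated question should be complicated too. The comments on the question point to a better solution. To clarify: var clientHandler = new HttpClientHandler() { AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate }; var client = new HttpClient(clientHandler); will take care of decoding the compressed stream. - henon
Yes, in theory what you say should work. But it does not always work when endless redirects such as HTTP 301 and 302 are involved. Then, for some incomprehensible reason, decompression is "ignored" and you have to handle it yourself! - Richard Hammond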
It turned out the response was using Brotli encoding (what on earth is that?) - Richard Hammond
@henon, you should add your comment as an answer, because it leads to much cleaner code. - Berend Engelbrecht
@BerendEngelbrecht Done; feel free to edit it if you can improve it. - henon

0
This is my solution, using AutomaticDecompression with GZip, Deflate, Brotli, or All.
Note: you do not necessarily need to send any headers.
string url = "https://my.example.com";

using HttpClientHandler handler = new HttpClientHandler();

// Set the AutomaticDecompression property to DecompressionMethods.GZip
handler.AutomaticDecompression = DecompressionMethods.GZip;
// handler.AutomaticDecompression = DecompressionMethods.Deflate;
// handler.AutomaticDecompression = DecompressionMethods.Brotli;
// handler.AutomaticDecompression = DecompressionMethods.All;

// Create an instance of HttpClient with the handler
using HttpClient client = new HttpClient(handler);

// Send a GET request to a URL that returns compressed content
using HttpResponseMessage response = await client.GetAsync(url);

// Check if the response is successful
if (response.IsSuccessStatusCode)
{
    // Read the response content as a string
    // The content will be automatically decompressed by the handler
    string content = await response.Content.ReadAsStringAsync();

    Console.WriteLine(content);

    HtmlDocument htmlDocument = new();
    htmlDocument.LoadHtml(content);

    List<HtmlNode> allRows = htmlDocument.DocumentNode.Descendants("div").ToList();
    Console.WriteLine(allRows[0]);
}
else
{
    // Handle the error
    _logger.LogError($"Error accessing the URL. Is TSETMC server down? Status: {response.StatusCode}");
}
