Download a PDF file from an API using C#

Question

c# .net asp.net-web-api

7

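I am building a console application that does the following two things:

Connect to a vendor API to get the voucher numbers of expenses submitted between two dates
Download a PDF copy of the receipts submitted with those expenses

I have the first part working. I can connect to the vendor API and parse the returned XML to build an array of voucher numbers (needed to fetch the PDF images) with the following code.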

static async Task RunAsyncCR()
        {
            using (var client = new HttpClient())
            {
                var values = new Dictionary<string, string>
                {
                    {"un","SomeUser"},
                    {"pw","SomePassword"},
                    {"method","getVoucherInvoices"},
                    {"fromDate","05/30/2016"},
                    {"toDate", "06/13/2016"}
                };

                var content = new FormUrlEncodedContent(values);

                Console.WriteLine("Connecting...");

                var response = await client.PostAsync("https://www.chromeriver.com/receipts/doit", content);

                Console.WriteLine("Connected...");

                var responseString = await response.Content.ReadAsStringAsync();

                char[] DelimiterChars = {'<'};

                String[] xmlReturn = responseString.Split(DelimiterChars);

                string[] VoucherNumber = new string[500];

                int i = 0;

                foreach (string s in xmlReturn)
                {
                    if (s.Contains("voucherInvoice>") && s != "/voucherInvoice>\n    ")
                    {
                        VoucherNumber[i] = s.Substring(15, 16);

                        i++;
                    }
                }

                Array.Resize(ref VoucherNumber, i);

Yes, there is probably a better way to do this, but it works and returns the values I expect.
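One possible "better way" is to let an XML parser find the elements instead of splitting the string on `<`. The following is only a sketch: it assumes the response is well-formed XML and that the `voucherInvoice` elements (the name comes from the string check above) carry no XML namespace. `VoucherParser` and `ParseVoucherNumbers` are hypothetical names introduced for illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class VoucherParser
{
    // Returns the trimmed text of every <voucherInvoice> element, at any depth.
    // Assumption: the elements are not in an XML namespace.
    public static string[] ParseVoucherNumbers(string xml)
    {
        return XDocument.Parse(xml)
            .Descendants("voucherInvoice")
            .Select(e => e.Value.Trim())
            .Where(v => v.Length > 0)
            .ToArray();
    }
}
```

With this in place, `string[] VoucherNumber = VoucherParser.ParseVoucherNumbers(responseString);` would replace the split loop and the fixed-size array of 500.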

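The problem I have now is that when I reconnect to the API to retrieve the files, I cannot seem to download a file to a specified file path.

I can reconnect to the API with the following code: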

            i = 0;

            foreach (string x in VoucherNumber)
            {
                Console.WriteLine("Get receipt: " + x);

                var NewValues = new Dictionary<string, string>
                {
                    {"un","SomeUser"},
                    {"pw","SomePassword"},
                    {"method","getReceiptsWithCoverPage"},
                    {"voucherInvoiceForPdf", VoucherNumber[i]}
                };

                var NewContent = new FormUrlEncodedContent(NewValues);

                var NewResponse = await client.PostAsync("https://www.chromeriver.com/receipts/doit", NewContent);

                string NewResponseString = await NewResponse.Content.ReadAsStringAsync();

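But I cannot write the response to a valid file (PDF).

Here is a screenshot of my Autos window as I step through the code, showing the file I need to download:

My question is: from this point, how do I save the file to my system?

I tried taking the encoded response that I get from Console.WriteLine(NewResponseString); and writing it to a file at a specified path/name with System.IO.File.WriteAllLines(), but the result is a blank file. I have also spent some time digging through Google/Stack Overflow, but I don't understand how to implement what I found.

Any help would be greatly appreciated.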

- Jeff Beese

1

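Can you paste the first 20 bytes of that stream here, in hex? - Thomas Weller

Do you have experience with streams? If not, that is where you should start. If that is not your problem, maybe you can clarify: do you need help saving a stream to disk, or help extracting the stream/byte array from the response? - Igor

@Igor - I have no experience with streams - this is the first time I have needed to connect to an API to retrieve a file... - Jeff Beese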

5 Answers

2

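First, are you sure a file actually exists? I would recommend the open-source library PdfSharp. I use it myself and it works well. As for downloading the file, maybe this will help...

Synchronous download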

using System.Net;
WebClient webClient = new WebClient();
// In a verbatim (@) string a backslash is not an escape character, so use a single one.
webClient.DownloadFile("http://example.com/myfile.txt", @"c:\myfile.txt");

http://www.csharp-examples.net/download-files/

- user6454724

Yes, if you look at my screenshot, there is definitely a file there, with the returned file information including the file name. - Jeff Beese

If you want to go with PdfSharp, here is the download link. Below is a sample I wrote that writes a PDF. http://www.pdfsharp.net/Downloads.ashx - user6454724

PdfDocument test = new PdfDocument(); test.Info.Title = "Test PDF"; PdfPage page = test.AddPage(); XGraphics graph = XGraphics.FromPdfPage(page); XFont font = new XFont("Verdana", 20, XFontStyle.Bold); graph.DrawString(strPairs, font, XBrushes.Black, new XRect(0, 0, page.Width.Point, page.Height.Point), XStringFormats.Center); string pdfFilename = "testPairs.pdf"; test.Save(pdfFilename); Process.Start(pdfFilename); - user6454724

The content being returned is a PDF structure. The OP just does not know how to save that content to disk. Unless the OP also needs help creating PDF files programmatically, a PDF library is of no help here. - Igor

1

First, create a StreamReader from NewResponse.

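// HttpResponseMessage has no GetResponseStream(); read the stream from its Content instead.
Stream receiveStream = await NewResponse.Content.ReadAsStreamAsync();
StreamReader readStream = new StreamReader(receiveStream, Encoding.UTF8);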

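Then define a StreamWriter to write to the file. (Caution: passing binary PDF data through a text reader and writer can corrupt it.)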

using (var writer = new StreamWriter(@"C:\MyNewFile.pdf", append: false))
{
    writer.Write(readStream.ReadToEnd());
}
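To illustrate why a text-based reader and writer pair is risky for binary data, here is a minimal sketch. It decodes a few bytes as UTF-8 and encodes them again, which is roughly what happens when a PDF passes through a StreamReader and a StreamWriter. `TextRoundTripDemo` and `RoundTrip` are hypothetical names, and the sample bytes are made up for the demonstration.

```csharp
using System;
using System.Text;

public static class TextRoundTripDemo
{
    // Decodes bytes as UTF-8 text and encodes the text back to bytes.
    // Bytes that are not valid UTF-8 are replaced with U+FFFD while decoding,
    // so the result can differ from the input.
    public static byte[] RoundTrip(byte[] data)
    {
        string asText = Encoding.UTF8.GetString(data);
        return Encoding.UTF8.GetBytes(asText);
    }
}
```

For example, the six bytes `25 50 44 46 FF FE` ("%PDF" followed by two bytes that are not valid UTF-8) come back as ten bytes, because each invalid byte is replaced by the three-byte encoding of U+FFFD. A real PDF typically contains many such bytes, which is why copying the raw stream is the safer route.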

Another approach, which copies the raw bytes without any text decoding:

var httpContent = NewResponse.Content; 

using(var newFile = System.IO.File.Create(@"C:\MyNewFile.pdf"))
{ 
    var stream = await httpContent.ReadAsStreamAsync();
    await stream.CopyToAsync(newFile);
}

- Atanu Sarkar

1

Here is what I did; I could not find any other solution that fit my situation:

using (var client = new System.Net.Http.HttpClient())
{
  client.DefaultRequestHeaders.Add("Authorization", "someapikey");
  client.BaseAddress = new Uri("https://someurl.com");
  // Blocks synchronously until the whole body has been downloaded
  byte[] bytes = client.GetByteArrayAsync(client.BaseAddress).ConfigureAwait(false).GetAwaiter().GetResult();

  string pdfFilePath = @"c:\somepath";
  System.IO.File.WriteAllBytes(pdfFilePath, bytes);

  //Note that below is only to open PDF in standard viewer, not necessary
  var process = new System.Diagnostics.Process();
  var startInfo = new System.Diagnostics.ProcessStartInfo()
  {
    FileName = pdfFilePath,
    WorkingDirectory = System.IO.Path.GetDirectoryName(pdfFilePath),
    UseShellExecute = true
  };

  process.StartInfo = startInfo;
  process.Start();
}

- vktr

What happens if several users call this method at the same time? Won't c:\somepath be overwritten? Couldn't that cause corruption or affect other resources? - undefined
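One way to avoid concurrent calls writing to the same path is to give every download its own file name. The sketch below is an assumption about how that could look, not part of the answer above: `ReceiptPaths` and `UniquePdfPath` are hypothetical names, and combining the voucher number with a GUID is just one possible naming scheme.

```csharp
using System;
using System.IO;

public static class ReceiptPaths
{
    // Builds "<directory>/<voucher>_<guid>.pdf".
    // Characters that are not allowed in file names are replaced with '_'.
    public static string UniquePdfPath(string directory, string voucherNumber)
    {
        foreach (char c in Path.GetInvalidFileNameChars())
        {
            voucherNumber = voucherNumber.Replace(c, '_');
        }
        string fileName = voucherNumber + "_" + Guid.NewGuid().ToString("N") + ".pdf";
        return Path.Combine(directory, fileName);
    }
}
```

The result of `ReceiptPaths.UniquePdfPath(directory, voucherNumber)` could then be passed to `System.IO.File.WriteAllBytes` in place of a fixed path.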

0

Use this code to download a PDF from an API. It converts the string data to bytes and gives you the solution you need.

HttpWebRequest request = (HttpWebRequest) WebRequest.Create(URL);

request.ContentType = "application/pdf;charset=UTF-8";
request.Method = "GET";

using(HttpWebResponse response = (HttpWebResponse) request.GetResponse()) {

    BinaryReader bin = new BinaryReader(response.GetResponseStream());

    byte[] buffer = bin.ReadBytes((Int32) response.ContentLength);

    Response.Buffer = true;
    Response.Charset = "";

    Response.AppendHeader("Content-Disposition", "attachment; filename=" + filename);

    Response.Cache.SetCacheability(HttpCacheability.NoCache);

    Response.ContentType = "application/pdf";

    Response.BinaryWrite(buffer);

    Response.Flush();

    Response.End();
}

- vishal singh

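I like seeing contributions from new participants, but I encourage you to test your solution before posting. This code does not even compile, let alone run. I suggest you fix it so that it works; then it could become a valuable answer to this question. - Tim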


- Igor · Accepted Answer

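So I think what you need is help with Streams. The returned HttpContent is actually a System.Net.Http.StreamContent instance, which shows that you are getting content back. All you need to do is get the stream (the content) from that instance and save it to a file.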

var NewResponse = await client.PostAsync("https://www.chromeriver.com/receipts/doit", NewContent);

System.Net.Http.HttpContent content = NewResponse.Content; // actually a System.Net.Http.StreamContent instance but you do not need to cast as the actual type does not matter in this case

using(var file = System.IO.File.Create("somePathHere.pdf")){ // create a new file to write to
    var contentStream = await content.ReadAsStreamAsync(); // get the actual content stream
    await contentStream.CopyToAsync(file); // copy that stream to the file stream
}

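I recommend reading up on how streams work; they are a common construct in many programming languages, and you will very likely need to deal with them in the near future.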
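The stream-copy idea can be wrapped in a small reusable helper. This is a sketch under stated assumptions rather than a definitive implementation: `PdfDownloader`, `SaveContentAsync`, and `DownloadAsync` are hypothetical names, and the status check with `EnsureSuccessStatusCode` is an addition that the code above does not have.

```csharp
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

public static class PdfDownloader
{
    // Copies the raw bytes of an HTTP body to a file, with no text decoding.
    public static async Task SaveContentAsync(HttpContent content, string path)
    {
        using (Stream source = await content.ReadAsStreamAsync())
        using (FileStream target = File.Create(path))
        {
            await source.CopyToAsync(target);
        }
    }

    // Posts the form and saves the response body to the given path.
    // Throws HttpRequestException if the status code is not a success code.
    public static async Task DownloadAsync(HttpClient client, string url, HttpContent form, string path)
    {
        using (HttpResponseMessage response = await client.PostAsync(url, form))
        {
            response.EnsureSuccessStatusCode();
            await SaveContentAsync(response.Content, path);
        }
    }
}
```

Inside the question's loop this might be called as `await PdfDownloader.DownloadAsync(client, "https://www.chromeriver.com/receipts/doit", NewContent, somePath);`, where `somePath` is whatever file path you choose for that voucher.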