ASP.NET Core health check randomly fails with TaskCanceledException or OperationCanceledException

33
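I have implemented health checks in my ASP.NET Core application. One health check runs two checks: a DbContext connection check and a custom check that tests an NpgsqlConnection.
Everything works fine in over 99% of cases. Occasionally the health check fails, throwing a TaskCanceledException or an OperationCanceledException. From my logs I can see that these exceptions are thrown after roughly 2ms-25ms, so a timeout cannot be the cause.
Important hint:
When I hit the health check many times in a row (simply pressing F5 in the browser), it throws the exception. It looks as if you cannot hit the /health endpoint before the previous health check has completed. If that is the case, why? Even if I put Thread.Sleep(5000); in the custom health check (with no DB connection check at all), it fails if I hit the /health endpoint again within those 5 seconds.
Question: are health checks somehow "magically" single-threaded (so that hitting the endpoint again cancels the previous health check invocation)?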
Startup.cs ConfigureServices
services
    .AddHealthChecks()
    .AddCheck<StorageHealthCheck>("ReadOnly Persistance")
    .AddDbContextCheck<MyDbContext>("EFCore persistance");

Startup.cs Configure

if (env.IsDevelopment())
{
    app.UseDeveloperExceptionPage();
}
else
{
    app.UseHsts();
}

app.UseHttpsRedirection();
app.UseCors(options => options.AllowAnyOrigin().AllowAnyMethod().AllowAnyHeader());

app.UseMiddleware<RequestLogMiddleware>();
app.UseMiddleware<ErrorLoggingMiddleware>();

if (!env.IsProduction())
{
    app.UseSwagger();

    app.UseSwaggerUI(c =>
    {
        c.SwaggerEndpoint("/swagger/v1/swagger.json", "V1");
        c.SwaggerEndpoint($"/swagger/v2/swagger.json", $"V2");
    });
}

app.UseHealthChecks("/health", new HealthCheckOptions()
{
    ResponseWriter = WriteResponse
});

app.UseMvc();

StorageHealthCheck.cs

public class StorageHealthCheck : IHealthCheck
{
    private readonly IMediator _mediator;

    public StorageHealthCheck(IMediator mediator)
    {
        _mediator = mediator;
    }

    public async Task<HealthCheckResult> CheckHealthAsync(HealthCheckContext context, CancellationToken cancellationToken = default(CancellationToken))
    {
        var isReadOnlyHealthy = await _mediator.Send(new CheckReadOnlyPersistanceHealthQuery());

        return new HealthCheckResult(isReadOnlyHealthy ? HealthStatus.Healthy : HealthStatus.Unhealthy, null);
    }
}

CheckReadOnlyPersistanceHealthQueryHandler:

NpgsqlConnectionStringBuilder csb = new NpgsqlConnectionStringBuilder(_connectionString.Value);

string sql = $@"
    SELECT * FROM pg_database WHERE datname = '{csb.Database}'";

try
{
    using (IDbConnection connection = new NpgsqlConnection(_connectionString.Value))
    {
        connection.Open();

        var stateAfterOpening = connection.State;
        if (stateAfterOpening != ConnectionState.Open)
        {
            return false;
        }

        connection.Close();
        return true;
    }
}
catch
{
    return false;
}

TaskCanceledException:

System.Threading.Tasks.TaskCanceledException: A task was canceled.
   at Npgsql.TaskExtensions.WithCancellation[T](Task`1 task, CancellationToken cancellationToken)
   at Npgsql.NpgsqlConnector.ConnectAsync(NpgsqlTimeout timeout, CancellationToken cancellationToken)
   at Npgsql.NpgsqlConnector.RawOpen(NpgsqlTimeout timeout, Boolean async, CancellationToken cancellationToken)
   at Npgsql.NpgsqlConnector.Open(NpgsqlTimeout timeout, Boolean async, CancellationToken cancellationToken)
   at Npgsql.NpgsqlConnection.<>c__DisplayClass32_0.<<Open>g__OpenLong|0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at Npgsql.EntityFrameworkCore.PostgreSQL.Storage.Internal.NpgsqlDatabaseCreator.ExistsAsync(CancellationToken cancellationToken)
   at Microsoft.Extensions.Diagnostics.HealthChecks.DbContextHealthCheck`1.CheckHealthAsync(HealthCheckContext context, CancellationToken cancellationToken)
   at Microsoft.Extensions.Diagnostics.HealthChecks.DefaultHealthCheckService.CheckHealthAsync(Func`2 predicate, CancellationToken cancellationToken)
   at Microsoft.AspNetCore.Diagnostics.HealthChecks.HealthCheckMiddleware.InvokeAsync(HttpContext httpContext)
   at Microsoft.AspNetCore.Builder.Extensions.MapWhenMiddleware.Invoke(HttpContext context)

OperationCanceledException:

System.OperationCanceledException: The operation was canceled.
   at System.Threading.CancellationToken.ThrowOperationCanceledException()
   at Microsoft.Extensions.Diagnostics.HealthChecks.DefaultHealthCheckService.CheckHealthAsync(Func`2 predicate, CancellationToken cancellationToken)
   at Microsoft.AspNetCore.Diagnostics.HealthChecks.HealthCheckMiddleware.InvokeAsync(HttpContext httpContext)
   at Microsoft.AspNetCore.Builder.Extensions.MapWhenMiddleware.Invoke(HttpContext context)

1
It's hard to match the exception stacks with the code you posted. Are you sure your code is ignoring the `CancellationToken`? - Stephen Cleary
8
Running into the same problem. - Ben
3
Yes, I think the problem lies with HttpContext.RequestAborted. In the health checks source code I can see that `HttpContext.RequestAborted` is used as the cancellation token (line 59, https://github.com/aspnet/Diagnostics/blob/master/src/Microsoft.AspNetCore.Diagnostics.HealthChecks/HealthCheckMiddleware.cs), but I don't know how to handle it properly. - Maciej Pszczolinski
1
Did you ever solve this? - jjxtra
jjxtra - No, I didn't. I am still getting these errors. - Maciej Pszczolinski
3 Answers

28

I've finally found the answer.

The root cause is that when the HTTP request is aborted, the httpContext.RequestAborted CancellationToken is triggered, and it throws an exception (OperationCanceledException).
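To make the mechanism concrete, here is a minimal sketch that uses only the base class library. It is an illustration, not the health check middleware itself: `AbortedRequestDemo`, `SlowCheckAsync`, and the 20 ms delay are hypothetical names and values, and the `CancellationTokenSource` merely stands in for `HttpContext.RequestAborted`.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class AbortedRequestDemo
{
    // Hypothetical stand-in for a health check that honors the token it receives.
    public static async Task<string> SlowCheckAsync(CancellationToken token)
    {
        // Task.Delay ends in a canceled state if the token fires before the delay elapses.
        await Task.Delay(TimeSpan.FromSeconds(5), token);
        return "Healthy";
    }

    // Returns "Healthy" if the check finishes, otherwise the name of the cancellation exception.
    public static async Task<string> RunAsync()
    {
        // Stand-in for HttpContext.RequestAborted (assumption for illustration).
        using (var requestAborted = new CancellationTokenSource())
        {
            Task<string> check = SlowCheckAsync(requestAborted.Token);

            // Simulate the browser dropping the connection shortly after the request starts.
            requestAborted.CancelAfter(TimeSpan.FromMilliseconds(20));

            try
            {
                return await check;
            }
            catch (OperationCanceledException ex)
            {
                // TaskCanceledException derives from OperationCanceledException,
                // so this handler sees both exception types from the question.
                return ex.GetType().Name;
            }
        }
    }
}
```

Calling `await AbortedRequestDemo.RunAsync()` returns long before the 5-second delay would have elapsed, which is consistent with the exceptions in the question appearing after only a few milliseconds: the work is cut short by the token, not by a timeout.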

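My application has a global exception handler, and I had been converting every unhandled exception into a 500 error. Even though the client had aborted the request and never received the 500 response, my logs still recorded it.

The solution I implemented looks like this: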

public async Task Invoke(HttpContext context)
{
    try
    {
        await _next(context);
    }
    catch (Exception ex)
    {
        if (context.RequestAborted.IsCancellationRequested)
        {
            _logger.LogWarning(ex, "RequestAborted. " + ex.Message);
            return;
        }

        _logger.LogCritical(ex, ex.Message);
        await HandleExceptionAsync(context, ex);
        throw;
    }
}

private static Task HandleExceptionAsync(HttpContext context, Exception ex)
{
    var code = HttpStatusCode.InternalServerError; // 500 if unexpected

    //if (ex is MyNotFoundException) code = HttpStatusCode.NotFound;
    //else if (ex is MyUnauthorizedException) code = HttpStatusCode.Unauthorized;
    //else if (ex is MyException) code = HttpStatusCode.BadRequest;

    var result = JsonConvert.SerializeObject(new { error = ex.Message });
    context.Response.ContentType = "application/json";
    context.Response.StatusCode = (int)code;
    return context.Response.WriteAsync(result);
}
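A narrower variant of the same idea can be sketched with a C# exception filter. This is an alternative illustration under stated assumptions, not the code from the answer above: `AbortAwareRunner` and its string results are hypothetical, `next` stands in for the rest of the pipeline, and `requestAborted` stands in for `HttpContext.RequestAborted`. Unlike the middleware above, it treats only cancellation exceptions as client aborts, so other exceptions raised during an aborted request would still count as failures.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class AbortAwareRunner
{
    // Runs 'next' and reports how it ended: "completed", "client-aborted", or "failed".
    public static async Task<string> RunAsync(
        Func<CancellationToken, Task> next,
        CancellationToken requestAborted)
    {
        try
        {
            await next(requestAborted);
            return "completed";
        }
        catch (OperationCanceledException) when (requestAborted.IsCancellationRequested)
        {
            // The client went away, so there is nobody left to read a response.
            // A real middleware would log a warning here and return.
            return "client-aborted";
        }
        catch (Exception)
        {
            // Anything else is a genuine failure; a real middleware would map it to a 500.
            return "failed";
        }
    }
}
```

The filter keeps a cancellation that did not come from the request token (for example, an internal timeout) on the failure path, which is a design choice rather than a requirement.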

Hope this helps somebody.


1
This helped me a lot! Thanks! - Robert Christ

0

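After testing in a large production environment, my best theory is that in a health check you need to await any writer that writes to the HTTP context output stream. I was getting this error in a method that returned a task that was not awaited. Awaiting the task seems to have fixed the problem. A nice side effect of await is that you can also catch TaskCanceledException and ignore it.

Example: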


// map health checks
endpoints.MapHealthChecks("/health-check", new HealthCheckOptions
{
    ResponseWriter = HealthCheckExtensions.WriteJsonResponseAsync,
    Predicate = check => check.Name == "default"
});

/// <summary>
/// Write a json health check response
/// </summary>
/// <param name="context">Http context</param>
/// <param name="report">Report</param>
/// <returns>Task</returns>
public static async Task WriteJsonResponseAsync(HttpContext context, HealthReport report)
{
    try
    {
        HealthReportEntry entry = report.Entries.Values.FirstOrDefault();
        context.Response.ContentType = "application/json; charset=utf-8";
        await JsonSerializer.SerializeAsync(context.Response.Body, entry.Data, entry.Data.GetType());
    }
    catch (TaskCanceledException)
    {
        // Deliberately ignored: the write was canceled, so no response is needed.
    }
}


1
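Unfortunately, that did not help. The exception is: System.IO.IOException: The client reset the request stream. And then sometimes there is a TaskCanceledException. - Maciej Pszczolinski
I suppose you could just catch all exceptions. - jjxtra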

0
If you are using Serilog's RequestLoggingMiddleware, the following lets you stop aborted health check requests from being logged as errors:
app.UseSerilogRequestLogging(options =>
{
    options.GetLevel = (ctx, _, ex) =>
    {
        if (ex == null && ctx.Response.StatusCode <= 499)
        {
            return LogEventLevel.Information;
        }

        if (ctx.Request.Path.StartsWithSegments("/healthcheck"))
        {
            // If the incoming HTTP request for a healthcheck is aborted, don't log the resultant OperationCanceledException
            // as an error. Note that the ASP.NET DefaultHealthCheckService ensures that if the exception occurs
            // within the healthcheck implementation (and the request wasn't aborted) a failed healthcheck is logged
            // see https://github.com/dotnet/aspnetcore/blob/ce9e1ae5500c3f0c4b9bd682fd464b3493e48e61/src/HealthChecks/HealthChecks/src/DefaultHealthCheckService.cs#L121
            if (ex is OperationCanceledException)
            {
                return LogEventLevel.Information;
            }
        }

        return LogEventLevel.Error;
    };
});
