How to download a large file at FastAPI startup without blocking the event loop?

Question

python python-3.x asynchronous fastapi downloadfileasync

3

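I would like to download a large file when the application starts up, but the download should run in parallel. That is, the actual application startup should not wait for the file download to complete.

What I am currently doing is: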

from fastapi import FastAPI


app = FastAPI()

items = {}


@app.on_event("startup")
def startup_event():
    ...  # download the file here

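This seems to work, but I get a lot of critical worker timeout errors. I would like to know whether there is a way to start the download when the application starts up, without making the application wait for the download to finish.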

- AP1709

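You can use Python's built-in threading or asyncio to run the download task asynchronously.

Please have a look at this answer, this answer, as well as this and this. You may also find this answer and this answer helpful.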
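
The threading suggestion in the comment above could be sketched roughly as follows. This is only an illustration using the standard library: download_file, download_done and the simulated delay are hypothetical stand-ins for a real blocking download, and in a FastAPI app startup_event would be the function registered as the startup handler.

```python
import threading
import time

# Hypothetical flag that is set once the (simulated) download has finished
download_done = threading.Event()


def download_file():
    # Placeholder for a blocking download (for example with urllib or requests);
    # time.sleep only simulates the time the transfer would take
    time.sleep(0.05)
    download_done.set()


def startup_event() -> threading.Thread:
    # Start the download in a daemon thread and return straight away,
    # so that startup does not wait for the download to complete
    thread = threading.Thread(target=download_file, daemon=True)
    thread.start()
    return thread
```

Because the thread is a daemon, it will not keep the process alive on shutdown, which also means an unfinished download may leave a partial file behind.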

2 Answers

- Chris · Answer 1

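The code and information in this answer come from the following answers. Hence, please have a look at them for more details and explanations:

How to initialise a global object or variable and reuse it in every FastAPI endpoint?
What is the proper way to make downstream HTTPS requests inside of Uvicorn/FastAPI?
Is having a concurrent.futures.ThreadPoolExecutor call dangerous in a FastAPI endpoint?
FastAPI python: How to run a thread in the background?
Return File/Streaming response from online video URL in FastAPI
FastAPI UploadFile is slow compared to Flask
How to download a large file using FastAPI?
How to run another application within the same running event loop?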

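The solutions provided below use the httpx library, which is a powerful HTTP client library for Python that offers an async API and supports both HTTP/1.1 and HTTP/2. The aiofiles library is also used, for handling file operations (such as writing files to disk) in asyncio applications. Public videos (large files) for testing the solutions can be found here.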

Solution 1

Use this solution if you would like to reuse the HTTP client across the application.

from fastapi import FastAPI, Request
from contextlib import asynccontextmanager
from fastapi.responses import StreamingResponse
from starlette.background import BackgroundTask
import asyncio
import aiofiles
import httpx


async def download_large_file(client: httpx.AsyncClient):
    large_file_url = 'http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/BigBuckBunny.mp4'
    path = 'save_to/video.mp4'
    req = client.build_request('GET', large_file_url)
    r = await client.send(req, stream=True)
    async with aiofiles.open(path, 'wb') as f:
        async for chunk in r.aiter_raw():
            await f.write(chunk)
    await r.aclose()

    
@asynccontextmanager
async def lifespan(app: FastAPI):
    # Initialise the Client on startup and add it to the state
    async with httpx.AsyncClient() as client:
        # Keep a reference to the task, so that it is not garbage-collected mid-download
        download_task = asyncio.create_task(download_large_file(client))
        yield {'client': client}
        # The Client closes on shutdown


app = FastAPI(lifespan=lifespan)


@app.get('/')
async def home():
    return 'Hello World!'


@app.get('/download')
async def download_some_file(request: Request):
    client = request.state.client  # reuse the HTTP client
    req = client.build_request('GET', 'https://www.example.com')
    r = await client.send(req, stream=True)
    return StreamingResponse(r.aiter_raw(), background=BackgroundTask(r.aclose))

Solution 2

Use this solution if you do not need to reuse the HTTP client and only need it at startup.

from fastapi import FastAPI
from contextlib import asynccontextmanager
import asyncio
import aiofiles
import httpx


async def download_large_file():
    large_file_url = 'http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/BigBuckBunny.mp4'
    path = 'save_to/video.mp4'
    async with httpx.AsyncClient() as client:
        async with client.stream('GET', large_file_url) as r:
            async with aiofiles.open(path, 'wb') as f:
                async for chunk in r.aiter_raw():   
                    await f.write(chunk)


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Keep a reference to the task, so that it is not garbage-collected mid-download
    download_task = asyncio.create_task(download_large_file())
    yield


app = FastAPI(lifespan=lifespan)


@app.get('/')
async def home():
    return 'Hello World!'
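
To see why asyncio.create_task does not hold up startup, here is a minimal sketch that uses only the standard library. The names fake_download, lifespan, main and the events list are hypothetical and merely mimic the structure of the solutions above: the download is replaced by asyncio.sleep and the "application" by a few lines inside the async with block.

```python
import asyncio
from contextlib import asynccontextmanager


async def fake_download(events: list, delay: float = 0.05):
    # Stand-in for the real download; awaiting hands control back to the event loop
    events.append('download started')
    await asyncio.sleep(delay)
    events.append('download finished')


@asynccontextmanager
async def lifespan(events: list):
    # Schedule the download and carry on; keep a reference to the task
    task = asyncio.create_task(fake_download(events))
    events.append('startup complete')
    yield task
    # In this sketch, shutdown waits for the download (an app could cancel it instead)
    await task


async def main() -> list:
    events = []
    async with lifespan(events):
        # Startup has already completed while the download is still running
        await asyncio.sleep(0)  # give the scheduled task a chance to start
        events.append('handling request')
    return events


if __name__ == '__main__':
    print(asyncio.run(main()))
```

The recorded order shows startup completing before the download has even begun, and a request being handled while the download is still in progress.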

- Prudhviraj · Answer 2

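Suppose we take a 10 GB file (https://speed.hetzner.de/10GB.bin) as an example to download at startup.

When the application starts, it triggers an asynchronous download task using aiohttp, which fetches the file from https://speed.hetzner.de/10GB.bin and saves it as downloaded_file.

The download happens in chunks, and because it runs as a background task, the application can carry on starting up and respond to incoming requests without waiting for the download to finish.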

import asyncio
from fastapi import FastAPI
import aiohttp

app = FastAPI()

async def download_large_file():
    async with aiohttp.ClientSession() as session:
        url = "https://speed.hetzner.de/10GB.bin"
        async with session.get(url) as response:
            if response.status == 200:
                with open('downloaded_file', 'wb') as file:
                    while True:
                        chunk = await response.content.read(1024)
                        if not chunk:
                            break
                        file.write(chunk)

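@app.on_event("startup")
async def startup_event():
    # Schedule the download on the running event loop and keep a reference to
    # the task on app.state, so that it is not garbage-collected before it finishes
    app.state.download_task = asyncio.create_task(download_large_file())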

Hope this code helps you.
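
One caveat about the code above: open() and file.write() are blocking calls, so each write briefly occupies the event loop. A possible way around that, sketched here with the standard library only (Python 3.9+ for asyncio.to_thread), is to push the file operations to a worker thread. The names fake_chunks, save_stream and main are hypothetical, and fake_chunks merely imitates the chunks an HTTP response would yield.

```python
import asyncio
import os
import tempfile


async def fake_chunks(n_chunks: int = 4, chunk_size: int = 64 * 1024):
    # Hypothetical stand-in for the chunks read from an HTTP response
    for _ in range(n_chunks):
        await asyncio.sleep(0)
        yield b'\0' * chunk_size


async def save_stream(chunks, path: str) -> int:
    # Open, write and close in worker threads, so the event loop stays free
    written = 0
    f = await asyncio.to_thread(open, path, 'wb')
    try:
        async for chunk in chunks:
            written += await asyncio.to_thread(f.write, chunk)
    finally:
        await asyncio.to_thread(f.close)
    return written


async def main() -> int:
    # Write the simulated download into a temporary directory and report its size
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'downloaded_file')
        written = await save_stream(fake_chunks(), path)
        assert os.path.getsize(path) == written
        return written
```

Larger chunks (64 KiB here, rather than 1024 bytes) keep the number of thread hand-offs low; the aiofiles library used in the other answer serves the same purpose.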