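As a disclaimer: when working from the command line, the bq tool is usually sufficient, and for more complex use cases the BigQuery client libraries let you program against BigQuery from several languages. Even so, it can be useful to make simple requests with the REST API to see how certain APIs work at a low level.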
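First, make sure that you have installed the Google Cloud SDK. This should include the gcloud and bq command-line tools. If you have not yet authorized your account, run this command from a terminal to do so: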
gcloud auth login
This should prompt you to log in and then give you an access code that you can paste into the terminal. (The exact process may change over time.)
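Now let's try a query through the BigQuery REST API by calling the jobs.query method. Modify this script to use your own project name (its project ID, as shown in the Google Cloud console), and then paste the script into a terminal: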
PROJECT="YOUR_PROJECT_NAME"
QUERY="\"SELECT 1 AS x, 'foo' AS y;\""
REQUEST="{\"kind\":\"bigquery#queryRequest\",\"useLegacySql\":false,\"query\":$QUERY}"
echo "$REQUEST" | \
  curl -X POST -d @- -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://www.googleapis.com/bigquery/v2/projects/$PROJECT/queries"
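Hand-escaping the query inside a JSON string becomes error-prone once the SQL itself contains double quotes or backslashes. The following is a minimal sketch of a helper that builds the same jobs.query request body from a plain SQL string. The function names json_escape and build_query_request are hypothetical (they are not part of gcloud, bq, or the API), and the sketch assumes a single-line query.

```shell
# Hypothetical helper: escape backslashes and double quotes so that a
# single-line string can be embedded in a JSON string literal.
# Assumption: the input contains no newlines or other control characters.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Hypothetical helper: print a jobs.query request body for a standard SQL query.
build_query_request() {
  printf '{"kind":"bigquery#queryRequest","useLegacySql":false,"query":"%s"}\n' \
    "$(json_escape "$1")"
}
```

The output of build_query_request "SELECT 1 AS x, 'foo' AS y;" can be piped into the same curl command in place of echo "$REQUEST".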
If it worked, you should see output similar to this:
{
  "kind": "bigquery#queryResponse",
  "schema": {
    "fields": [
      {
        "name": "x",
        "type": "INTEGER",
        "mode": "NULLABLE"
      },
      {
        "name": "y",
        "type": "STRING",
        "mode": "NULLABLE"
      }
    ]
  },
  "jobReference": {
    "projectId": "<your project ID>",
    "jobId": "<your job ID>"
  },
  "totalRows": "1",
  "rows": [
    {
      "f": [
        {
          "v": "1"
        },
        {
          "v": "foo"
        }
      ]
    }
  ],
  "totalBytesProcessed": "0",
  "jobComplete": true,
  "cacheHit": false
}
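If you have not set up the bq command-line tool yet, you can do so by running bq init in the terminal. Once it is set up, you can try running the same query with it: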
bq query --use_legacy_sql=False "SELECT 1 AS x, 'foo' AS y;"
You can see the REST API requests that the bq tool makes by passing the --apilog= option:
bq --apilog= query --use_legacy_sql=False "SELECT [1, 2, 3] AS x;"
Now let's try an example that uses the jobs.insert method instead of the query API. Run this script, replacing YOUR_PROJECT_NAME with your project name:
PROJECT="YOUR_PROJECT_NAME"
QUERY="\"SELECT 1 AS x, 'foo' AS y;\""
REQUEST="{\"configuration\":{\"query\":{\"useLegacySql\":false,\"query\":${QUERY}}}}"
echo "$REQUEST" | \
  curl -X POST -d @- -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://www.googleapis.com/bigquery/v2/projects/$PROJECT/jobs"
Unlike the query API, this API returns immediately, without waiting for the query to finish, and you will see a result similar to this:
{
  "kind": "bigquery#job",
  "etag": "\"<etag string>\"",
  "id": "<project name>:<job ID>",
  "selfLink": "https://www.googleapis.com/bigquery/v2/projects/<project name>/jobs/<job ID>",
  "jobReference": {
    "projectId": "<project name>",
    "jobId": "<job ID>"
  },
  "configuration": {
    "query": {
      "query": "SELECT 1 AS x, 'foo' AS y;",
      "destinationTable": {
        "projectId": "<project name>",
        "datasetId": "<anonymous dataset>",
        "tableId": "<anonymous table>"
      },
      "createDisposition": "CREATE_IF_NEEDED",
      "writeDisposition": "WRITE_TRUNCATE",
      "useLegacySql": false
    }
  },
  "status": {
    "state": "RUNNING"
  },
  "statistics": {
    "creationTime": "<timestamp millis>",
    "startTime": "<timestamp millis>"
  },
  "user_email": "<your email address>"
}
Note the status:
  "status": {
    "state": "RUNNING"
  },
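The jobs.insert response also carries the job ID that any follow-up request needs. Rather than copying it by hand, a small filter can pull it out of a saved response. This is only a sketch: extract_job_id is a hypothetical name, and the sed pattern assumes the pretty-printed layout shown above, with one field per line. A real JSON parser would be more robust.

```shell
# Hypothetical helper: read a job resource on stdin and print the first
# "jobId" value found.
# Assumption: the JSON is pretty-printed with one field per line.
extract_job_id() {
  sed -n 's/^[[:space:]]*"jobId"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}
```

For example, if the response was saved to a file named response.json (a hypothetical file name), JOB_ID=$(extract_job_id < response.json) sets the variable for later requests.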
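If you want to check on the job now, you can use the jobs.get method. As before, run this command from a terminal, using the job ID from the output of the previous step: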
PROJECT="YOUR_PROJECT_NAME"
JOB_ID="YOUR_JOB_ID"
curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://www.googleapis.com/bigquery/v2/projects/$PROJECT/jobs/$JOB_ID"
If the query has completed, the response will show it in the status:
  ...
  "status": {
    "state": "DONE"
  },
  ...
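If the job is still running, the jobs.get request can simply be repeated until the state changes. The following sketch wraps that in a loop. The names job_state, fetch_job, wait_for_job, and the POLL_INTERVAL variable are all hypothetical, and the sed-based state extraction is an assumption that works for responses shaped like the ones shown here, not a general JSON parser.

```shell
# Hypothetical helper: read a job resource on stdin and print its first "state" value.
job_state() {
  sed -n 's/.*"state"[[:space:]]*:[[:space:]]*"\([A-Z_]*\)".*/\1/p' | head -n 1
}

# Hypothetical helper: fetch a job resource with jobs.get.
# $1 is the project, $2 is the job ID.
fetch_job() {
  curl -s -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://www.googleapis.com/bigquery/v2/projects/$1/jobs/$2"
}

# Hypothetical helper: poll until the job reports DONE or the attempts run out.
# $1 is the project, $2 is the job ID, $3 is the maximum number of attempts (default 30).
# Returns 0 once the state is DONE and 1 otherwise.
# Note that DONE only means the job has finished; a failed job also ends in
# DONE, with an errorResult in its status.
wait_for_job() {
  local project="$1" job_id="$2" attempts="${3:-30}" state
  while [ "$attempts" -gt 0 ]; do
    state=$(fetch_job "$project" "$job_id" | job_state)
    if [ "$state" = "DONE" ]; then
      return 0
    fi
    attempts=$((attempts - 1))
    sleep "${POLL_INTERVAL:-2}"
  done
  return 1
}
```

With PROJECT and JOB_ID set as above, wait_for_job "$PROJECT" "$JOB_ID" returns once the job has finished or after 30 attempts.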
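Finally, we can make a request through the REST API to fetch the query results with the jobs.getQueryResults method, reusing the PROJECT and JOB_ID variables from the previous step: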
curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://www.googleapis.com/bigquery/v2/projects/$PROJECT/queries/$JOB_ID"
The output will be similar to the result of the jobs.query method that we used earlier:
{
  "kind": "bigquery#getQueryResultsResponse",
  "etag": "\"<etag string>\"",
  "schema": {
    "fields": [
      {
        "name": "x",
        "type": "INTEGER",
        "mode": "NULLABLE"
      },
      {
        "name": "y",
        "type": "STRING",
        "mode": "NULLABLE"
      }
    ]
  },
  "jobReference": {
    "projectId": "<project ID>",
    "jobId": "<job ID>"
  },
  "totalRows": "1",
  "rows": [
    {
      "f": [
        {
          "v": "1"
        },
        {
          "v": "foo"
        }
      ]
    }
  ],
  "totalBytesProcessed": "0",
  "jobComplete": true,
  "cacheHit": true
}
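In these responses the row values are positional: each entry under "f" holds one cell "v", in the same order as the entries of schema.fields. As a rough illustration, the sketch below pairs column names with cell values for a single-row response saved to a file. The function names field_names, cell_values, and print_single_row are hypothetical, and the sed patterns assume the pretty-printed layout shown above with simple string values, so this is not a substitute for a real JSON parser.

```shell
# Hypothetical helper: print the column names from schema.fields, one per line.
field_names() {
  sed -n 's/^[[:space:]]*"name"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Hypothetical helper: print the cell values from rows, one per line.
cell_values() {
  sed -n 's/^[[:space:]]*"v"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Hypothetical helper: print name=value pairs for a response file ($1).
# Assumption: the response contains exactly one row.
print_single_row() {
  local names
  names=$(mktemp) || return 1
  field_names < "$1" > "$names"
  cell_values < "$1" | paste -d '=' "$names" -
  rm -f "$names"
}
```

Applied to the response above, this pairs x with 1 and y with foo, one pair per line.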