如何查找哪些列没有任何数据(所有值都为NULL)?

14

我有一个数据库中的多个表。我想找出哪些列(在哪些表中)没有任何值(一个列中全部为NULL)。在下面的示例中,结果应该是

TestTable1 --> Var2
TestTable2 --> Variable1

我不知道如何创建这种类型的查询。非常感谢你的帮助!

--create first table
create table dbo.TestTable1 (
sur_id int identity(1,1) not null primary key,
var1 int null,
var2 int null
)
go

--insert some values
insert into dbo.TestTable1 (var1) 
    select 1 union all select 2 union all select 3

--create second table
create table dbo.TestTable2 (
sur_id int identity(1,1) not null primary key,
variable1 int null,
variable2 int null
)

--and insert some values
insert into dbo.TestTable2 (variable2) 
    select 1 union all select 2 union all select 3
6个回答

12

对于单个列,count(ColumnName) 返回该列不为 null 的行数:

select  count(TheColumn)
from    YourTable

您可以生成所有列的查询。根据Martin的建议,您可以使用 is_nullable = 1 来排除无法为null的列。例如:

select  'count(' + name + ') as ' + name + ', '
from    sys.columns
where   object_id = object_id('YourTable')
        and is_nullable = 1
如果表的数量较大,您可以以类似的方式为所有表生成查询。 所有表的列表在sys.tables中。

2
为什么不用COUNT(col)呢?那只会计算“非空”值。 - Martin Smith
@Martin:说得好,我想我更喜欢显式的case。已在答案中编辑。 - Andomar
很酷 - 如果表的列数据类型为“text”,我会遇到一个错误,通过转换它来解决。当system_type_id = 35时,CASE语句如下:'count(CAST(' + name + ' as VARCHAR(MAX))) as ' + name + ', ',否则,'count(' + name + ') as ' + name + ', '。 - cmartin

7
这是一个我编写的脚本,用于执行同样的功能,它是一个两步手动过程:
1.在SSMS中运行此脚本,并选择结果窗格中的所有行:
SELECT '选择 COUNT( DISTINCT [' + COLUMN_NAME + ']) AS UniqueValues, ''' + TABLE_NAME + '.' + COLUMN_NAME + ''' AS ColumnName FROM [' + TABLE_SCHEMA + '].[' + TABLE_NAME + ']
UNION ALL ' FROM INFORMATION_SCHEMA.COLUMNS ORDER BY TABLE_NAME, COLUMN_NAME
2.将结果粘贴到新的查询窗口中。滚动到最底部并删除尾随的 "UNION ALL" 语句。它将看起来像这样:
SELECT COUNT( DISTINCT [ModifiedByUserId]) AS UniqueValues, 'Inspections.ModifiedByUserId' AS ColumnName FROM [dbo].[Inspections] UNION ALL SELECT COUNT( DISTINCT [Notes]) AS UniqueValues, 'Inspections.Notes' AS ColumnName FROM [dbo].[Inspections] UNION ALL SELECT COUNT( DISTINCT [PublicPassword]) AS UniqueValues, 'Inspections.PublicPassword' AS ColumnName FROM [dbo].[Inspections] UNION ALL SELECT COUNT( DISTINCT [ShopId]) AS UniqueValues, 'Inspections.ShopId' AS ColumnName FROM [dbo].[Inspections] UNION ALL SELECT COUNT( DISTINCT [Status]) AS UniqueValues, 'Inspections.Status' AS ColumnName FROM [dbo].[Inspections] UNION ALL SELECT COUNT( DISTINCT [SupervisorUserId]) AS UniqueValues, 'Inspections.SupervisorUserId' AS ColumnName FROM [dbo].[Inspections] UNION ALL
3.运行查询。在一个有300个列的数据库中,需要大约6分钟运行完毕。它将更快或更慢,具体取决于使用了多少索引。

如果出现错误“查询处理器耗尽了内部资源,无法生成查询计划。这是一个罕见事件,仅在极其复杂的查询或引用非常多的表或分区的查询中才会发生。请简化查询。如果您认为收到了此消息,请联系客户支持服务获取更多信息。”只需创建一个具有两列的表,并将动态SQL更改为在开头包括INSERT INTO TEMPYTABLE(COL1,COL2),并删除末尾的UNION ALL。 - adolf garlic
1
如果您只想了解所有空列,可以删除“distinct”,这样运行速度会更快。 - StuartN
干得好。这是唯一可行的答案,而且无需安装任何东西。虽然需要两个步骤,但省了我很多时间。它在一晚上运行了大约五百万行和一千列的数据。 - Philip Rego

4

更新....好吧,我对此太过着迷了

该过程接受两个参数:要搜索的表和要应用的条件。您可以将本质上的where子句传递给第二个参数。我编写了该过程以将双引号解释为单引号....再次说明,这是基于原始开发人员的概念构建的。

GO
/****** Object:  StoredProcedure [dbo].[SearchAllTables]    Script Date: 05/04/2011 14:29:48 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
alter PROC [dbo].[SearchAllTables2] ( @SEARCH_TABLE NVARCHAR(255), @CONDITION AS NVARCHAR(MAX) ) AS
BEGIN

-- Copyright © 2002 Narayana Vyas Kondreddi. All rights reserved.
-- Purpose: To search all columns of all tables for a given search string
-- Written by: Narayana Vyas Kondreddi -- Site: http://vyaskn.tripod.com
-- Tested on: SQL Server 7.0 and SQL Server 2000
-- Date modified: 28th July 2002 22:50 GMT
CREATE TABLE #Results (ColumnName nvarchar(370), ColumnValue nvarchar(3630))
SET NOCOUNT ON
DECLARE @TableName nvarchar(256), @ColumnName nvarchar(128), @SearchStr2 nvarchar(110), @COND_STR NVARCHAR(MAX)
SET @TableName = '' 
--SET @SearchStr2 = QUOTENAME('%' + @SearchStr + '%','''')
WHILE @TableName IS NOT NULL
BEGIN
SET @ColumnName = ''
SET @CONDITION = REPLACE(@CONDITION,'"','''')
SET @TableName =
(
SELECT MIN(QUOTENAME(TABLE_SCHEMA) + '.' + QUOTENAME(TABLE_NAME))
FROM INFORMATION_SCHEMA.TABLES
WHERE
TABLE_TYPE = 'BASE TABLE'
AND
QUOTENAME(TABLE_SCHEMA) + '.' + QUOTENAME(TABLE_NAME) > @TableName
AND OBJECTPROPERTY(
OBJECT_ID(
QUOTENAME(TABLE_SCHEMA) + '.' + QUOTENAME(TABLE_NAME)
), 'IsMSShipped'
) = 0 AND TABLE_NAME = @SEARCH_TABLE
) WHILE (@TableName IS NOT NULL) AND (@ColumnName IS NOT NULL) 
BEGIN SET @ColumnName = ( 
SELECT MIN(QUOTENAME(COLUMN_NAME)) 
FROM INFORMATION_SCHEMA.COLUMNS 
WHERE TABLE_SCHEMA = PARSENAME(@TableName, 2) AND 
TABLE_NAME = PARSENAME(@TableName, 1) AND QUOTENAME(COLUMN_NAME) > @ColumnName ) 
IF @ColumnName IS NOT NULL 
BEGIN 
SET @COND_STR = REPLACE(@CONDITION,'''','"')
INSERT INTO #Results 
EXEC ( 'SELECT  DISTINCT ''' + @TableName + '.' + @ColumnName + ''',''' + @COND_STR + ''' AS CONDITION FROM ' + @TableName + ' (NOLOCK) ' + ' WHERE ' + @ColumnName + ' ' + @CONDITION)
PRINT ( 'SELECT  DISTINCT ''' + @TableName + '.' + @ColumnName + ''',''' + @COND_STR + ''' AS CONDITION FROM ' + @TableName + ' (NOLOCK) ' + ' WHERE ' + @ColumnName + ' ' + @CONDITION)
END 
END 
END 
    SELECT ColumnName, ColumnValue 
    FROM #Results 
END
GO
-- to execute

exec [SearchAllTables2] 'TABLENAME','LIKE "%DOUG%"' -- double quotes are automatically escaped to single quotes...

从下面版权中修改的原始代码....仅使用部分。

GO
/****** Object:  StoredProcedure [dbo].[SearchAllTables]    Script Date: 05/04/2011 14:29:48 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
alter PROC [dbo].[SearchAllTables2] ( @TABLE_NAME NVARCHAR(255) ) AS
BEGIN

-- Copyright © 2002 Narayana Vyas Kondreddi. All rights reserved.
-- Purpose: To search all columns of all tables for a given search string
-- Written by: Narayana Vyas Kondreddi -- Site: http://vyaskn.tripod.com
-- Tested on: SQL Server 7.0 and SQL Server 2000
-- Date modified: 28th July 2002 22:50 GMT
CREATE TABLE #Results (ColumnName nvarchar(370), ColumnValue nvarchar(3630))
SET NOCOUNT ON
DECLARE @TableName nvarchar(256), @ColumnName nvarchar(128), @SearchStr2 nvarchar(110)
SET @TableName = '' 
--SET @SearchStr2 = QUOTENAME('%' + @SearchStr + '%','''')
WHILE @TableName IS NOT NULL
BEGIN
SET @ColumnName = ''
SET @TableName =
(
SELECT MIN(QUOTENAME(TABLE_SCHEMA) + '.' + QUOTENAME(TABLE_NAME))
FROM INFORMATION_SCHEMA.TABLES
WHERE
TABLE_TYPE = 'BASE TABLE'
AND
QUOTENAME(TABLE_SCHEMA) + '.' + QUOTENAME(TABLE_NAME) > @TableName
AND OBJECTPROPERTY(
OBJECT_ID(
QUOTENAME(TABLE_SCHEMA) + '.' + QUOTENAME(TABLE_NAME)
), 'IsMSShipped'
) = 0 AND TABLE_NAME = @TABLE_NAME
) WHILE (@TableName IS NOT NULL) AND (@ColumnName IS NOT NULL) 
BEGIN SET @ColumnName = ( 
SELECT MIN(QUOTENAME(COLUMN_NAME)) 
FROM INFORMATION_SCHEMA.COLUMNS 
WHERE TABLE_SCHEMA = PARSENAME(@TableName, 2) AND 
TABLE_NAME = PARSENAME(@TableName, 1) AND QUOTENAME(COLUMN_NAME) > @ColumnName ) 
IF @ColumnName IS NOT NULL 
BEGIN 
INSERT INTO #Results 
EXEC ( 'SELECT  DISTINCT ''' + @TableName + '.' + @ColumnName + ''', ''IS NULL'' FROM ' + @TableName + ' (NOLOCK) ' + ' WHERE ' + @ColumnName + ' IS NULL ')--LIKE ' + @SearchStr2 ) 
--PRINT ( 'SELECT  DISTINCT ''' + @TableName + '.' + @ColumnName + ''', ''IS NOT NULL'' FROM ' + @TableName + ' (NOLOCK) ' + ' WHERE  ' + @ColumnName + ' IS NOT NULL ')--LIKE ' + @SearchStr2 ) 
END 
END 
END 
    SELECT ColumnName, ColumnValue 
    FROM #Results 
END
GO
-- to execute

exec [SearchAllTables2] 'Master'

2

这个东西非常有用,必须放到一个方便的系统处理程序中 - 特别是当你继承了一个遗留数据库时,不知道哪些列可以删除或忽略。

/*
Show the count of not-null values in a table
*/
create proc sp_aaShowAllNullColumns @tableName varchar(255) as
begin
set nocount on

declare @sql nvarchar(4000)
declare @cols nvarchar(4000)
declare @tcols table( colbit nvarchar(255) )

insert @tcols
select 'count(' + name + ') as ' + name + ', ' as colbit
from    sys.columns
where   object_id = object_id(@tableName)
and is_nullable = 1 

select @cols = coalesce( @cols, ', ', '' ) + colbit from @tcols
select @cols = substring( @cols, 1, (len(@cols) - 1) )
select @cols = isnull( @cols, '' )

select @sql = 'select count(*) as Rows' + @cols + ' from ' + @tableName
exec sp_executeSql @sql

end
go

exec sys.sp_MS_marksystemobject 'sp_aaShowAllNullColumns'
go

use Bookshop
go
exec sp_aaShowAllNullColumns 'Books'
go

1
以下是一段bash脚本,用于在SQLite数据库中的所有非空表格(或在此类数据库中指定的所有表格)中识别所有NULL列。只要编程语言能够与SQLite数据库通信,就可以使用相同的技术。
#!/bin/bash

function help {
    cat <<EOF
Syntax: $0 databasefile [table ...]

If no tables are specified, then for each non-empty user table in the
specified SQLite database row, this script will emit the column names
of those columns in which all the values are NULL.  If any tables are
specified, only the specified tables are scanned.

The script is written to make it easy to modify the criteria and the output.

Thanks to SQL, two passes are required per table, and if no tables are
specified, an additional invocation of sqlite3 is required.

Column names are written in the form: tablename.columnname

Requirements:
  sqlite3 on the \$PATH

Options: 
 -v | --verbose :: emit additional information

EOF
}

while [ "$1" ]
do case "$1" in
      -h | --help | "" ) help
           exit
           ;;
      -v | --verbose ) VERBOSE=1
           shift
           ;;
      * ) break
      ;;
  esac
done

function verbose { if [ "$VERBOSE" ] ; then echo "$@" >&2 ; fi ; }

db="$1"
shift

if [ ! -s "$db" ] ; then echo "$0 : $db not found" ; exit ; fi

# To prevent loading ~/.sqliterc specify -init ""

# Global: db
function nullcolumns {
  local table="$1"
  local count column field nulls
  ( read count
    if [ -n "$count" ] ; then
        verbose "Row count for $table: $count"
    if [ "$count" -gt 0 ] ; then
          while read column ; do
          echo "SELECT '$column', * FROM
                  (SELECT COUNT(*) FROM $table WHERE '$column' IS NULL);"
          done |
          sqlite3 -readonly "$db" | while IFS="|" read field nulls ; do
            verbose $table.$field ... $nulls
            if [ "$nulls" -eq $count ] ; then echo "$table.$field" ; fi
          done
        else cat > /dev/null
        fi
    else cat > /dev/null
    fi ) < <(sqlite3 -readonly "$db" "select count(*) from '$table'; 
                                      select name from pragma_table_info( '$table' )")
}

if [ $# = 0 ] ; then
    sqlite3 -readonly "$db" .tables | while read table ; do
    nullcolumns "$table"
    done
else
    for table ; do
    nullcolumns "$table"
    done
fi

0
DROP TABLE IF EXISTS #null_check
CREATE TABLE   #null_check (TableName VARCHAR(30), ColumnName varchar(100), Percent_Null decimal)

--columns with 1 are all NULL, 0 has some populated values
--Below generates 1 sql statement per column for every column and let rip.


SELECT CONCAT('INSERT INTO #NULL_CHECK SELECT ''', t.name, ''',''',c.name, ''',sum(case when ISNULL(CONVERT(VARCHAR(10),',c.name, '),''',''') IN (''',''',''*'') THEN 1.00 else 0.00 end)/count(*) percent_NULL FROM ',t.name) FROM sys.tables t JOIN sys.columns c ON t.object_id = c.object_id


--See returns
SELECT * FROM #null_check

2
目前你的回答不够清晰,请编辑并添加更多细节,以帮助其他人理解它如何回答问题。你可以在帮助中心找到有关如何撰写好答案的更多信息。 - Community
请不要发布仅包含代码的答案。未来的读者会感激您解释为什么它回答了问题,而不是从代码中推断出来。另外,您能解释一下这如何补充其他答案吗? - Gert Arnold

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接