How do I remove empty arrays and all-zero arrays from a list of arrays in Python?

3

I'm working with some Python data that comes as a list of arrays:

LA=
[array([  99.08322813,  253.42371683,  300.792029  ])
array([  51.55274095,  106.29707418,  0])
array([0, 0 ,0 , 0, 0])
array([ 149.07283952,  191.45513754,  251.19610503,  393.50806493, 453.56783459])
array([ 105.61643877,  442.76668729,  450.37335607])
array([ 348.84179544])
array([], dtype=float64)
array([0, 0 , 0])
array([ 295.05603151,  0,  451.77083268,  500.81771919])
array([ 295.05603151,  307.37232315,  451.77083268,  500.81771919])
array([  91.86758237,  148.70156948,  488.70648486,  507.31389766])
array([ 353.68691095])
array([ 208.21919198,  246.57665959,  0,  251.33820305, 394.34266882])
array([], dtype=float64)]

Some of the arrays in my data are empty:
array([], dtype=float64)

and some are filled with zeros:

array([0, 0, 0])

How can I automatically get rid of both kinds of arrays and end up with the following result:
LA=
[array([  99.08322813,  253.42371683,  300.792029  ])
array([  51.55274095,  106.29707418,  0])
array([ 149.07283952,  191.45513754,  251.19610503,  393.50806493, 453.56783459])
array([ 105.61643877,  442.76668729,  450.37335607])
array([ 348.84179544])
array([ 295.05603151,  0,  451.77083268,  500.81771919])
array([ 295.05603151,  307.37232315,  451.77083268,  500.81771919])
array([  91.86758237,  148.70156948,  488.70648486,  507.31389766])
array([ 353.68691095])
array([ 208.21919198,  246.57665959,  0,  251.33820305, 394.34266882])]

Finally, I'd like to keep the list-of-arrays format while also removing the zeros inside the remaining arrays:
LA=
[array([  99.08322813,  253.42371683,  300.792029  ])
array([  51.55274095,  106.29707418])
array([ 149.07283952,  191.45513754,  251.19610503,  393.50806493, 453.56783459])
array([ 105.61643877,  442.76668729,  450.37335607])
array([ 348.84179544])
array([ 295.05603151,  451.77083268,  500.81771919])
array([ 295.05603151,  307.37232315,  451.77083268,  500.81771919])
array([  91.86758237,  148.70156948,  488.70648486,  507.31389766])
array([ 353.68691095])
array([ 208.21919198,  246.57665959,  251.33820305, 394.34266882])]

Thanks in advance.


1
Have you tried anything? Show us your code. - user1907906
2 Answers

5

Using NumPy and a list comprehension:

>>> from numpy import *

Solution 1:

>>> [x[x!=0] for x in LA if len(x) and len(x[x!=0])]                          
[array([  99.08322813,  253.42371683,  300.792029  ]),                                           
 array([  51.55274095,  106.29707418]),                                                          
 array([ 149.07283952,  191.45513754,  251.19610503,  393.50806493,                              
        453.56783459]),                                                                          
 array([ 105.61643877,  442.76668729,  450.37335607]),                                           
 array([ 348.84179544]),                                                                         
 array([ 295.05603151,  451.77083268,  500.81771919]),                                           
 array([ 295.05603151,  307.37232315,  451.77083268,  500.81771919]),                            
 array([  91.86758237,  148.70156948,  488.70648486,  507.31389766]),                            
 array([ 353.68691095]),                                                                         
 array([ 208.21919198,  246.57665959,  251.33820305,  394.34266882])]    

Solution 2:

>>> [x[x!=0] for x in LA if count_nonzero(x)]                          
[array([  99.08322813,  253.42371683,  300.792029  ]),                                           
 array([  51.55274095,  106.29707418]),                                                          
 array([ 149.07283952,  191.45513754,  251.19610503,  393.50806493,                              
        453.56783459]),                                                                          
 array([ 105.61643877,  442.76668729,  450.37335607]),                                           
 array([ 348.84179544]),                                                                         
 array([ 295.05603151,  451.77083268,  500.81771919]),                                           
 array([ 295.05603151,  307.37232315,  451.77083268,  500.81771919]),                            
 array([  91.86758237,  148.70156948,  488.70648486,  507.31389766]),                            
 array([ 353.68691095]),                                                                         
 array([ 208.21919198,  246.57665959,  251.33820305,  394.34266882])]    
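For readers who want to try Solution 2 without the original data, here is a minimal self-contained sketch. The list `LA` below is a shortened stand-in for the question's data, the helper name `drop_zeros` is hypothetical, and NumPy is imported as `np` instead of using the star import above.

```python
import numpy as np


def drop_zeros(arrays):
    """Drop empty and all-zero arrays, then strip zeros from the rest."""
    # np.count_nonzero returns 0 for an empty array and for an all-zero
    # array, so one truthiness test filters out both kinds.
    return [x[x != 0] for x in arrays if np.count_nonzero(x)]


# Shortened sample data (an assumption, not the full list from the question).
LA = [
    np.array([99.08322813, 253.42371683, 300.792029]),
    np.array([51.55274095, 106.29707418, 0]),
    np.array([0, 0, 0, 0, 0]),
    np.array([348.84179544]),
    np.array([], dtype=np.float64),
]

cleaned = drop_zeros(LA)
for arr in cleaned:
    print(arr)
```

The result is still a plain Python list of arrays, which is the format the question asks to keep.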

Timing comparison:

In [56]: %timeit  [x[x!=0] for x in LA if len(x) and len(x[x!=0])]                     
10000 loops, best of 3: 176 µs per loop                                                          

In [88]: %timeit [x[x!=0] for x in LA if count_nonzero(x)]                                   
10000 loops, best of 3: 89.7 µs per loop   

#@gnibbler's solution:

In [82]: %timeit [x.compress(x) for x in LA if x.any()]                                          
10000 loops, best of 3: 138 µs per loop  

Timings for larger arrays:

In [140]: LA = [resize(x, 10**5) for x in LA]                                                    

In [142]: %timeit [x[x!=0] for x in LA if len(x) and len(x[x!=0])]                               
10 loops, best of 3: 26.7 ms per loop                                                            

In [143]: %timeit [x[x!=0] for x in LA if count_nonzero(x) > 0]                                  
10 loops, best of 3: 26 ms per loop                                                              

In [144]: %timeit [x.compress(x) for x in LA if x.any()]                                         
10 loops, best of 3: 42.7 ms per loop                                                            

In [145]: %timeit [x.compress(x) for x in LA if count_nonzero(x)]                                
10 loops, best of 3: 45.8 ms per loop                                                            

In [146]: %timeit [x[x!=0] for x in LA if x.any()]                                               
10 loops, best of 3: 22.9 ms per loop                                                            

In [147]: %timeit [x[x!=0] for x in LA if count_nonzero(x)]                                      
10 loops, best of 3: 26.2 ms per loop  
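The `%timeit` runs above need IPython. A rough equivalent with the standard-library `timeit` module might look like the sketch below. The sample data, the `candidates` dictionary, and the `time_all` helper are all assumptions made for illustration, and the numbers will depend on the machine and the NumPy version, so no expected timings are shown.

```python
import timeit

import numpy as np

# Small stand-in data set (an assumption, not the question's full list).
small = [
    np.array([51.55274095, 106.29707418, 0.0]),
    np.array([0.0, 0.0, 0.0]),
    np.array([348.84179544]),
]
# Same trick as in the answer: stretch every array to 10**5 elements.
large = [np.resize(x, 10**5) for x in small]

# The variants compared in this thread, keyed by a descriptive label.
candidates = {
    "mask + count_nonzero": lambda data: [x[x != 0] for x in data if np.count_nonzero(x)],
    "mask + any": lambda data: [x[x != 0] for x in data if x.any()],
    "compress + any": lambda data: [x.compress(x) for x in data if x.any()],
}


def time_all(data, number=20):
    """Return the average seconds per call for each candidate."""
    return {
        name: timeit.timeit(lambda: fn(data), number=number) / number
        for name, fn in candidates.items()
    }


for label, data in (("small", small), ("large", large)):
    for name, seconds in time_all(data).items():
        print(f"{label:5s} {name:22s} {seconds * 1e6:10.1f} µs per call")
```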

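Could you time my answer as well? - John La Rooy
@gnibbler Your compress version takes about 138 µs. - Ashwini Chaudhary
When you use count_nonzero, you don't need the len(x) check. - DSM
Aha... any is the slow part. count_nonzero is much faster. - John La Rooy
@gnibbler I actually expected it to be faster, i.e. that it would short-circuit like Python's any. By the way, I resized all the items to 100000 elements and timed everything again: this time [x[x!=0] for x in LA if x.any()] was the fastest and, surprisingly, [x.compress(x) for x in LA if count_nonzero(x)] was the slowest. - Ashwini Chaudhary
@AshwiniChaudhary That makes sense. I just didn't expect any to be that slow for short arrays. - John La Rooy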

5
A list comprehension takes care of the first part.
[x for x in LA if x.any()]
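As a small illustration of why this works: `x.any()` is False both for an empty array and for an array that contains only zeros, so one test removes both kinds while leaving zeros inside the surviving arrays untouched. The sample values below are made up for the sketch.

```python
import numpy as np

# Made-up sample: one mixed array, one all-zero array, one empty array.
LA = [
    np.array([51.5, 106.3, 0.0]),
    np.array([0, 0, 0]),
    np.array([], dtype=np.float64),
    np.array([348.8]),
]

# Keep only arrays that have at least one nonzero element.
kept = [x for x in LA if x.any()]
print(kept)
```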

You can use compress for the second part.

[x.compress(x) for x in LA if x.any()]

A faster version, based on Ashwini's idea:
[x.compress(x) for x in LA if count_nonzero(x)]

Timings:

In [89]: %timeit [x.compress(x) for x in LA if count_nonzero(x)]  #clear winner                                
10000 loops, best of 3: 20.2 µs per loop     
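A short sketch of what `x.compress(x)` does: the array is passed as its own condition, so every nonzero element counts as true and is kept. For a 1-D array this should give the same result as the boolean-mask form `x[x != 0]` from the other answer. The sample values are made up.

```python
import numpy as np

x = np.array([208.2, 246.5, 0.0, 251.3, 394.3])

# The array serves as its own condition: nonzero entries are truthy.
by_compress = x.compress(x)
# Explicit boolean mask, as used in the other answer.
by_mask = x[x != 0]

print(by_compress)
print(np.array_equal(by_compress, by_mask))
```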
