x = [8,2,3,4,5]
y = [6,3,7,2,1]
如何以简洁优雅的方式找到两个列表中的第一个共同元素(在此例中为“2”)? 任何一个列表都可以为空,或者没有共同的元素-这种情况下使用None即可。我需要向一个刚接触Python的人展示这个,所以越简单越好。
更新:对于我的目的来说,顺序并不重要,但假设我正在寻找x中也出现在y中的第一个元素。
x = [8,2,3,4,5]
y = [6,3,7,2,1]
如何以简洁优雅的方式找到两个列表中的第一个共同元素(在此例中为“2”)? 任何一个列表都可以为空,或者没有共同的元素-这种情况下使用None即可。使用排序并不是完成此操作的最快方法,通过使用一个集合(哈希映射),可以在O(N)时间内完成。
>>> x = [8,2,3,4,5]
>>> y = [6,3,7,2,1]
>>> set_y = set(y)
>>> next((a for a in x if a in set_y), None)
2
或者:
next(ifilter(set(y).__contains__, x), None)
这是它的功能:
>>> def foo(x, y):
seen = set(y)
for item in x:
if item in seen:
return item
else:
return None
>>> foo(x, y)
2
为了展示不同方法之间的时间差异(朴素方法,二分查找和集合),以下是一些计时结果。我必须这样做来证明二分查找比许多人认为的更快...
from itertools import ifilter
from bisect import bisect_left
a = [1, 2, 3, 9, 1, 1] * 100000
b = [44, 11, 23, 9, 10, 99] * 10000
c = [1, 7, 2, 4, 1, 9, 9, 2] * 1000000 # repeats early
d = [7, 6, 11, 13, 19, 10, 19] * 1000000
e = range(50000)
f = range(40000, 90000) # repeats in the middle
g = [1] * 10000000 # no repeats at all
h = [2] * 10000000
from random import randrange
i = [randrange(10000000) for _ in xrange(5000000)] # some randoms
j = [randrange(10000000) for _ in xrange(5000000)]
def common_set(x, y, ifilter=ifilter, set=set, next=next):
return next(ifilter(set(y).__contains__, x), None)
pass
def common_b_sort(x, y, bisect=bisect_left, sorted=sorted, min=min, len=len):
sorted_y = sorted(y)
for a in x:
if a == sorted_y[min(bisect_left(sorted_y, a),len(sorted_y)-1)]:
return a
else:
return None
def common_naive(x, y):
for a in x:
for b in y:
if a == b: return a
else:
return None
from timeit import timeit
from itertools import repeat
import threading, thread
print 'running tests - time limit of 20 seconds'
for x, y in [('a', 'b'), ('c', 'd'), ('e', 'f'), ('g', 'h'), ('i', 'j')]:
for func in ('common_set', 'common_b_sort', 'common_naive'):
try:
timer = threading.Timer(20, thread.interrupt_main) # 20 second time limit
timer.start()
res = timeit(stmt="print '[', {0}({1}, {2}), ".format(func, x, y),
setup='from __main__ import common_set, common_b_sort, common_naive, {0}, {1}'.format(x, y),
number=1)
except:
res = "Too long!!"
finally:
print '] Function: {0}, {1}, {2}. Time: {3}'.format(func, x, y, res)
timer.cancel()
测试数据为:
a = [1, 2, 3, 9, 1, 1] * 100000
b = [44, 11, 23, 9, 10, 99] * 10000
c = [1, 7, 2, 4, 1, 9, 9, 2] * 1000000 # repeats early
d = [7, 6, 11, 13, 19, 10, 19] * 1000000
e = range(50000)
f = range(40000, 90000) # repeats in the middle
g = [1] * 10000000 # no repeats at all
h = [2] * 10000000
from random import randrange
i = [randrange(10000000) for _ in xrange(5000000)] # some randoms
j = [randrange(10000000) for _ in xrange(5000000)]
结果:
running tests - time limit of 20 seconds
[ 9 ] Function: common_set, a, b. Time: 0.00569520707241
[ 9 ] Function: common_b_sort, a, b. Time: 0.0182240340602
[ 9 ] Function: common_naive, a, b. Time: 0.00978832505249
[ 7 ] Function: common_set, c, d. Time: 0.249175872911
[ 7 ] Function: common_b_sort, c, d. Time: 1.86735751332
[ 7 ] Function: common_naive, c, d. Time: 0.264309220865
[ 40000 ] Function: common_set, e, f. Time: 0.00966861710078
[ 40000 ] Function: common_b_sort, e, f. Time: 0.0505980508696
[ ] Function: common_naive, e, f. Time: Too long!!
[ None ] Function: common_set, g, h. Time: 1.11300018578
[ None ] Function: common_b_sort, g, h. Time: 14.9472068377
[ ] Function: common_naive, g, h. Time: Too long!!
[ 5411743 ] Function: common_set, i, j. Time: 1.88894859542
[ 5411743 ] Function: common_b_sort, i, j. Time: 6.28617268396
[ 5411743 ] Function: common_naive, i, j. Time: 1.11231867458
这能让你了解它在处理更大的输入时会如何缩放,O(N)比O(N log N)和O(N^2)更快。
result = None
# Go trough one array
for i in x:
# The element repeats in the other list...
if i in y:
# Store the result and break the loop
result = i
break
更加优雅的做法是将相同的功能封装到函数中使用类似PEP 8的编码风格规范:
def get_first_common_element(x,y):
''' Fetches first element from x that is common for both lists
or return None if no such an element is found.
'''
for i in x:
if i in y:
return i
# In case no common element found, you could trigger Exception
# Or if no common element is _valid_ and common state of your application
# you could simply return None and test return value
# raise Exception('No common element found')
return None
如果您想获取所有常见元素,可以简单地按照以下方式进行:
>>> [i for i in x if i in y]
[1, 2, 3]
None
即可。此外,您最后三行的缩进有误。 - user2032433None
(当你编写每天需要处理成千上万个文件的大型脚本时,很容易忘记检查某些错误状态... 对我来说 最好让脚本崩溃,这样你就会得到完整的通知并处理它)...当然,这不是终极规则。如果他想让用户输入相关列表,那么当没有公共元素被找到时,实际上是异常。如果它是应用程序的常见且有效状态,则返回 None
。 - VyktorO(1)
(除非你使用了某种索引),我认为它只可能在 O(log(N))
的时间内完成。 - Vyktor使用next
从生成器中获取第一个项的一行代码:
x = [8,2,3,4,5]
y = [6,3,7,2,1]
first = next((a for a in x if a in y), None)
set.__contains__
,因为它比list.__contains__
更快:set_y = set(y)
first = next((a for a in x if a in set_y), None)
first = next((lambda set_y: a for a in x if a in set_y)(set(y)), None)
set_y = set(y)
,然后它就完美了。 - jamylaknext((a for a in x if a in set(y)), None)
这段代码会导致集合每次都被构建吗? - Ericnext(ifilter(set(y).__contains__, x), None)
。(不要在家尝试,因为你不应该使用双下划线方法) - jamylakin
的for
循环将导致O(N^2)
的复杂度,但您可以在这里对y
进行排序并使用二分查找来改善时间复杂度为O(NlogN)
。def binary_search(lis,num):
low=0
high=len(lis)-1
ret=-1 #return -1 if item is not found
while low<=high:
mid=(low+high)//2
if num<lis[mid]:
high=mid-1
elif num>lis[mid]:
low=mid+1
else:
ret=mid
break
return ret
x = [8,2,3,4,5]
y = [6,3,7,2,1]
y.sort()
for z in x:
ind=binary_search(y,z)
if ind!=-1
print z
break
输出:2
使用bisect模块执行与上述相同的操作:
import bisect
x = [8,2,3,4,5]
y = [6,3,7,2,1]
y.sort()
for z in x:
ind=bisect.bisect(y,z)-1 #or use `ind=min(bisect.bisect_left(y, z), len(y) - 1)`
if ind!=-1 and y[ind] ==z:
print z #prints 2
break
for z in x: if z in y
实际上等同于两个 for
循环。 - Ashwini Chaudharybisect
模块,它非常被忽视,但它可以在一行中完成你想要的操作。 - jamylak我假设您想教这个人Python,而不仅仅是编程。因此,我毫不犹豫地使用zip
代替丑陋的循环变量;它是Python中非常有用的一部分,而且不难解释。
def first_common(x, y):
common = set(x) & set(y)
for current_x, current_y in zip(x, y):
if current_x in common:
return current_x
elif current_y in common:
return current_y
print first_common([8,2,3,4,5], [6,3,7,2,1])
zip
,那么无需使用它的方法如下:def first_common2(x, y):
common = set(x) & set(y)
for i in xrange(min(len(x), len(y))):
if x[i] in common:
return x[i]
elif y[i] in common:
return y[i]
对于那些感兴趣的人,以下是如何将其扩展到任意数量的序列:
def first_common3(*seqs):
common = set.intersection(*[set(seq) for seq in seqs])
for current_elements in zip(*seqs):
for element in current_elements:
if element in common:
return element
最后请注意,与其他一些解决方案不同的是,如果第一个公共元素首先出现在第二个列表中,这也同样适用。
我刚刚注意到你的更新,这使得解决方案变得更加简单:
def first_common4(x, y):
ys = set(y) # We don't want this to be recreated for each element in x
for element in x:
if element in ys:
return element
上述内容可以说比生成器表达式更易读。
可惜Python中没有内置的有序集合。这使得解决方案不够优雅。
izip_longest()
可以被更常见的 zip()
替代:当最短序列用尽时,没有必要继续在最长序列中搜索,因为任何共同元素都已经在最短列表中找到。 - Eric O. Lebigot使用for循环似乎最容易向新手解释。
for number1 in x:
for number2 in y:
if number1 == number2:
print number1, number2
print x.index(number1), y.index(number2)
exit(0)
print "No common numbers found."
NB 没有测试,只是我脑海中的想法。
def findcommon(x,y):
common = None
for i in range(0,max(len(x),len(y))):
common = set(x[0:i]).intersection(set(y[0:i]))
if common: break
return list(common)[0] if common else None
def first_common_element(x,y):
common = set(x).intersection(set(y))
if common:
return x[min([x.index(i)for i in common])]
仅供娱乐(可能不太高效),另一种使用itertools
的版本:
from itertools import dropwhile, product
from operator import __ne__
def accept_pair(f):
"Make a version of f that takes a pair instead of 2 arguments."
def accepting_pair(pair):
return f(*pair)
return accepting_pair
def get_first_common(x, y):
try:
# I think this *_ unpacking syntax works only in Python 3
((first_common, _), *_) = dropwhile(
accept_pair(__ne__),
product(x, y))
except ValueError:
return None
return first_common
x = [8, 2, 3, 4, 5]
y = [6, 3, 7, 2, 1]
print(get_first_common(x, y)) # 2
y = [6, 7, 1]
print(get_first_common(x, y)) # None
使用lambda pair: pair[0] != pair[1]
比accept_pair(__ne__)
更简单,但不够有趣。
使用 set - 这是任意数量列表的通用解决方案:
def first_common(*lsts):
common = reduce(lambda c, l: c & set(l), lsts[1:], set(lsts[0]))
if not common:
return None
firsts = [min(lst.index(el) for el in common) for lst in lsts]
index_in_list = min(firsts)
trgt_lst_index = firsts.index(index_in_list)
return lsts[trgt_lst_index][index_in_list]
def first_common(*lsts):
common = reduce(lambda c, l: c & set(l), lsts[1:], set(lsts[0]))
if not common:
return None
for lsts_slice in itertools.izip_longest(*lsts):
slice_intersection = common.intersection(lsts_slice)
if slice_intersection:
return slice_intersection.pop()
x,y
是完全有效的数学表达式,对于迭代器和类似迭代器的变量,i,j,k
也是一样。此外,在处理几何图形时,a,b,c,d
是参数方程中属性的常见名称。 - Vyktork,v
,并且在x,y
中使用z
。此外,n
通常代表任何自然数。 - user2032433