I have a pandas DataFrame named A with a column keywords, as follows:
keywords
['loans','mercedez','bugatti','a4']
['trump','usa','election','president']
['galaxy','7s','canon','macbook']
['beiber','spiderman','marvels','ironmen']
.........................................
.........................................
.........................................
I also have another pandas DataFrame B with the columns category and words, where words is a comma-separated string, as shown below:
category        words
audi            audi a4,audi a6
bugatti         bugatti veyron, bugatti chiron
mercedez        mercedez s-class, mercedez e-class
dslr            canon, nikon
apple           iphone 7s,iphone 6s,iphone 5
finance         sales,loans,sales price
politics        donald trump, election, votes
entertainment   spiderman,captain america, ironmen
music           justin beiber, rihana,drake
........        ..............
.........       .........
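I want to map the keywords column of DataFrame A against the words column of DataFrame B and assign the corresponding category. Each keyword should be compared with the individual words inside each string, not with the whole string. For example, the keyword a4 should be checked against both words in the string audi a4 from the words column, so it matches the category audi. The expected result would be: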
keywords                                      matched_category
['loans','mercedez','bugatti','a4']           ['finance','mercedez','mercedez','bugatti','bugatti','audi']
['trump','usa','election','president']        ['politics','politics']
['galaxy','7s','canon','macbook']             ['apple','dslr']
['beiber','spiderman','marvels','ironmen']    ['music','entertainment','entertainment','entertainment']
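
To make the matching rule concrete, here is a minimal sketch of one possible reading of it, not a definitive solution. It assumes that keywords holds Python lists, that words holds plain strings, and that a keyword counts as a match once for every comma-separated phrase in which it appears as a whitespace-separated word. The names pairs and match_categories are hypothetical helpers introduced only for this illustration.

```python
import pandas as pd

# Sample data copied from the tables above (elided rows omitted).
A = pd.DataFrame({
    "keywords": [
        ["loans", "mercedez", "bugatti", "a4"],
        ["trump", "usa", "election", "president"],
        ["galaxy", "7s", "canon", "macbook"],
        ["beiber", "spiderman", "marvels", "ironmen"],
    ]
})

B = pd.DataFrame({
    "category": ["audi", "bugatti", "mercedez", "dslr", "apple",
                 "finance", "politics", "entertainment", "music"],
    "words": [
        "audi a4,audi a6",
        "bugatti veyron, bugatti chiron",
        "mercedez s-class, mercedez e-class",
        "canon, nikon",
        "iphone 7s,iphone 6s,iphone 5",
        "sales,loans,sales price",
        "donald trump, election, votes",
        "spiderman,captain america, ironmen",
        "justin beiber, rihana,drake",
    ],
})

# Hypothetical helper: one (token, category) pair per word per phrase.
# str.split() with no argument also discards the stray spaces after commas.
pairs = [
    (token, category)
    for category, words in zip(B["category"], B["words"])
    for phrase in words.split(",")
    for token in phrase.split()
]

def match_categories(keywords):
    """Return the category of every token equal to a keyword, in keyword order."""
    return [category
            for keyword in keywords
            for token, category in pairs
            if token == keyword]

A["matched_category"] = A["keywords"].apply(match_categories)
print(A.to_string())
```

Under these assumptions the first three rows come out as in the expected result. The last row yields three entries (music, entertainment, entertainment) rather than four, because marvels does not occur anywhere in the sample rows of B. The nested loop compares every keyword with every token, which is fine for small frames; for larger ones, grouping pairs into a dictionary keyed by token would avoid the repeated scan.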