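If space is a concern, you are probably better off with a std::map<int, std::vector<string>>. The code below is simple (though it could be improved by lowercasing every word and stripping punctuation):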
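    #include <cstdlib>    // EXIT_FAILURE
    #include <algorithm>  // std::copy
    #include <fstream>
    #include <iostream>
    #include <iterator>   // std::ostream_iterator
    #include <map>
    #include <string>
    #include <vector>
    using namespace std;

    int main(int argc, char *argv[])
    {
        if (argc < 2)
            return EXIT_FAILURE;

        // Count how often each whitespace-delimited token occurs.
        std::map<string, int> strs;
        ifstream inf(argv[1]);
        string str;
        while (inf >> str)
            ++strs[str];

        // Invert the mapping: count -> every word seen that many times.
        std::map<int, std::vector<string>> vals;
        for (const auto& it : strs)
            vals[it.second].push_back(it.first);

        // Print each count followed by its words.
        for (const auto& it : vals)
        {
            cout << "Count: " << it.first << ": ";
            std::copy(it.second.begin(), it.second.end(),
                      ostream_iterator<string>(cout, " "));
            cout << endl;
        }
    }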
Sample Input
I chose William Shakespeare's monologue from As You Like It, which has some interesting properties, as you will see shortly:
All the world's a stage,
And all the men and women merely players:
They have their exits and their entrances;
And one man in his time plays many parts,
His acts being seven ages. At first, the infant,
Mewling and puking in the nurse's arms.
And then the whining school-boy, with his satchel
And shining morning face, creeping like snail
Unwillingly to school. And then the lover,
Sighing like furnace, with a woeful ballad
Made to his mistress' eyebrow. Then a soldier,
Full of strange oaths and bearded like the pard,
Jealous in honour, sudden and quick in quarrel,
Seeking the bubble reputation
Even in the cannon's mouth. And then the justice,
In fair round belly with good capon lined,
With eyes severe and beard of formal cut,
Full of wise saws and modern instances;
And so he plays his part. The sixth age shifts
Into the lean and slipper'd pantaloon,
With spectacles on nose and pouch on side,
His youthful hose, well saved, a world too wide
For his shrunk shank; and his big manly voice,
Turning again toward childish treble, pipes
And whistles in his sound. Last scene of all,
That ends this strange eventful history,
Is second childishness and mere oblivion,
Sans teeth, sans eyes, sans taste, sans everything.
Sample Output
Count: 1: All At Even For In Into Is Jealous Last Made Mewling Sans Seeking Sighing That The Then They Turning Unwillingly acts again age ages. all all, arms. ballad beard bearded being belly big bubble cannon's capon childish childishness creeping cut, ends entrances; eventful everything. exits eyebrow. eyes eyes, face, fair first, formal furnace, good have he history, honour, hose, infant, instances; justice, lean lined, lover, man manly many men mere merely mistress' modern morning mouth. nose nurse's oaths oblivion, one pantaloon, pard, part. parts, pipes players: pouch puking quarrel, quick reputation round satchel saved, saws scene school-boy, school. second seven severe shank; shifts shining shrunk side, sixth slipper'd snail so soldier, sound. spectacles stage, sudden taste, teeth, this time too toward treble, voice, well whining whistles wide wise woeful women world world's youthful
Count: 2: Full His With on plays strange their to
Count: 3: like sans then with
Count: 4: a of
Count: 6: in
Count: 7: his
Count: 8: And
Count: 11: and the
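It is interesting how many of the word strings in that monologue are unique. Almost like he planned it that way. The numbers are obviously different, however, once you fold case and strip punctuation. Fortunately that is easy too; only the first while loop needs to change: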
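    // Requires <algorithm>, <cctype>, and <iterator>.
    while (inf >> str)
    {
        // Lowercase in place; the unsigned char cast keeps tolower well-defined.
        for_each(str.begin(), str.end(),
            [](char& c){c=tolower(static_cast<unsigned char>(c));});
        // Keep only the alphabetic characters, dropping punctuation.
        string alpha;
        copy_if(str.begin(), str.end(), back_inserter(alpha),
            [](const char& c){return isalpha(static_cast<unsigned char>(c));});
        // Skip tokens that contained no letters at all.
        if (!alpha.empty())
            ++strs[alpha];
    }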
This gives us the following results:
Count: 1: acts again age ages arms at ballad beard bearded being belly big bubble cannons capon childish childishness creeping cut ends entrances even eventful everything exits eyebrow face fair first for formal furnace good have he history honour hose infant instances into is jealous justice last lean lined lover made man manly many men mere merely mewling mistress modern morning mouth nose nurses oaths oblivion one pantaloon pard part parts pipes players pouch puking quarrel quick reputation round satchel saved saws scene school schoolboy second seeking seven severe shank shifts shining shrunk side sighing sixth slipperd snail so soldier sound spectacles stage sudden taste teeth that they this time too toward treble turning unwillingly voice well whining whistles wide wise woeful women world worlds youthful
Count: 2: eyes full on plays strange their to
Count: 3: all like
Count: 4: a of sans then
Count: 5: with
Count: 7: in
Count: 9: his
Count: 12: the
Count: 19: and
Still pretty impressive, Billy.
Because the first map is sorted by key, the words listed under each count come out in alphabetical order. A nice bonus.
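The alphabetical ordering falls out of walking the word-to-count map in key order while filling the count-to-words map. Here is a minimal sketch of that, assuming two hypothetical helpers, normalize and group_by_count, that are not part of the code above; group_by_count reads from any std::istream so it can be tried on an in-memory string instead of a file:

```cpp
#include <cctype>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: lowercase a token and keep only its alphabetic characters.
std::string normalize(const std::string& token)
{
    std::string out;
    for (unsigned char c : token)
        if (std::isalpha(c))
            out.push_back(static_cast<char>(std::tolower(c)));
    return out;
}

// Hypothetical helper: count normalized words, then group the words by count.
std::map<int, std::vector<std::string>> group_by_count(std::istream& in)
{
    std::map<std::string, int> counts;
    std::string token;
    while (in >> token)
    {
        const std::string word = normalize(token);
        if (!word.empty())
            ++counts[word];
    }

    // counts is visited in ascending key order, so every vector is filled
    // alphabetically without an explicit sort.
    std::map<int, std::vector<std::string>> grouped;
    for (const auto& entry : counts)
        grouped[entry.second].push_back(entry.first);
    return grouped;
}
```

Fed a std::istringstream holding "The cat and the dog. And THE bird!", this should group bird, cat, dog under count 1, and under count 2, and the under count 3.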
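map<int, vector<string> > is one way to do it. - nhahtdh
std::multimap<int, std::string>, though I personally prefer std::map<int, std::vector<std::string>>. - WhozCraig
vector and map would both work, but beyond this scale I expect storing the strings in memory to become a bigger problem than the choice of data structure. - nhahtdh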
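For comparison, the std::multimap<int, std::string> alternative mentioned in the comments might look like the sketch below. The name invert_counts is hypothetical, and the function assumes the word counts have already been collected into a std::map<std::string, int> as in the answer:

```cpp
#include <map>
#include <string>

// Hypothetical helper: flip word -> count into count -> word, one entry per word.
std::multimap<int, std::string> invert_counts(const std::map<std::string, int>& counts)
{
    std::multimap<int, std::string> by_count;
    // Since C++11, entries with equal keys keep their insertion order, so
    // words sharing a count stay in the alphabetical order of the source map.
    for (const auto& entry : counts)
        by_count.emplace(entry.second, entry.first);
    return by_count;
}
```

One difference worth weighing: the multimap stores the count key once per word, while the map of vectors stores it once per distinct count.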