C++: Finding how often each word occurs in a sentence

3
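Which STL container is best suited to this task? I have been using a map, but I can't get it to work. I'm not sure how to check how many times the same word occurs in a sentence, for example:

I love him, I love her, he love her.

I want the program to prompt the user for an integer. If I enter 3, the output should be "love", because that word occurs 3 times in the sentence. What approach should I use to write a program like this?

At the moment my program prompts the user for a word and returns the number of times it occurs, for example 3 for the word "love". Now I want to do the reverse. Is that possible, and which STL container would be the better choice?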

4 Answers

3

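I assume you are using a map to store the number of occurrences. The first thing to understand is that, in a map, the keys are unique but the stored values need not be. Consider a map named x with the following contents: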

x["I"]=3
x["Love"]=3
x["C"]=5

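The mapping is unique from key to value, not the other way round: here both "I" and "Love" map to 3. If you need a one-to-one mapping in both directions, I would suggest a different data structure. If you want to keep the map and still search by value, you can use an STL algorithm such as std::find_if, or write your own search loop: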

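// x is the map<string,int> of word counts shown above.
map<string,int>::iterator ser;
int check;                 // the count the user is looking for
cin>>check;
for(ser=x.begin();ser!=x.end();++ser)
{
    if(ser->second==check)
    {
       cout<<"Word: "<<ser->first<<endl;
       break;              // stops at the first match only
    }
}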

Here ser is an iterator into the map. - elricL
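Because several words can share the same count, the loop above reports only the first one it meets. A minimal sketch of a search function that collects every match is shown below; the name words_with_count is hypothetical and not part of the answer, and it assumes the counts are already stored in a std::map<std::string, int>.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper (not from the answer): returns every word whose
// stored count equals `target`, in the map's key order.
std::vector<std::string> words_with_count(const std::map<std::string, int>& counts,
                                          int target)
{
    std::vector<std::string> result;
    for (const auto& entry : counts) {
        if (entry.second == target) {
            result.push_back(entry.first);   // keep going: more than one word may match
        }
    }
    return result;
}
```

For the example map x above, calling words_with_count(x, 3) would give both "I" and "Love". This is still a linear scan per lookup, so it fits best when only a few lookups are needed.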

3

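First build a map from each word to its count, then build a reverse multimap from it. Finally, you can look up which words occur with a given frequency: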

#include <algorithm>
#include <iostream>
#include <iterator>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <utility>

int main()
{
    std::string str("I love him, I love her, he love her");
    std::istringstream ss(str);
    std::istream_iterator<std::string> begin(ss);
    std::istream_iterator<std::string> end;

    std::map<std::string, int> word_count;
    std::for_each(begin, end, [&](const std::string& s)
    {
        ++word_count[s];
    });

    std::multimap<int, std::string> count_words;
    std::for_each(word_count.begin(), word_count.end(),
                  [&](const std::pair<const std::string, int>& p)
    {
        count_words.insert(std::make_pair(p.second, p.first));
    });

    auto its = count_words.equal_range(3);
    std::for_each(its.first, its.second,
                  [](const std::pair<const int, std::string>& p)
    {
        std::cout << p.second << std::endl;
    });
}

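Note that this does not handle punctuation in the input string correctly: for example, 'her,' and 'her' get separate entries in word_count, as does 'him,'. You may need to post-process word_count to strip non-alphabetic characters and then regroup or discard entries. Hyphenated words and other embedded non-alphabetic characters remain a problem. - Steve Townsend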
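One way to approach the punctuation issue is to clean each token before it is counted, rather than post-processing the map afterwards. The sketch below is only an illustration under stated assumptions: the helper names normalize_word and count_words_normalized are hypothetical, and lower-casing the words is an extra choice that the answer does not ask for.

```cpp
#include <cctype>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: keeps only alphabetic characters and lower-cases them
// (the lower-casing is an assumption), so "Her," and "her" both become "her".
std::string normalize_word(const std::string& token)
{
    std::string word;
    for (unsigned char ch : token) {
        if (std::isalpha(ch)) {
            word += static_cast<char>(std::tolower(ch));
        }
    }
    return word;
}

// Counts whitespace-separated words after normalization; tokens that contain
// no letters at all are discarded.
std::map<std::string, int> count_words_normalized(const std::string& text)
{
    std::map<std::string, int> word_count;
    std::istringstream ss(text);
    std::string token;
    while (ss >> token) {
        const std::string word = normalize_word(token);
        if (!word.empty()) {
            ++word_count[word];
        }
    }
    return word_count;
}
```

The resulting map can be fed into the same reverse multimap step as before. Embedded punctuation is simply dropped, so a hyphenated word such as "well-known" is counted as "wellknown"; that limitation is still there.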

2
/******************************************************************
Name  :  Paul Rodgers
Source : HW1.CPP
Compiler :  Visual C++ .NET
Action : Program will read in from standard input and determine the
         frequency of word lengths found in input.  An appropriate
         table is also displayed.  Maximum word length is 15 characters
         words greater than 15 are counted as length 15. 
         Average word length also displayed.

Note   : Words include hyphenated and ones with apostrophes.  Words with
         apostrophes, i.e. Jim's, will count the apostrophe as part of the
         word length. Hyphen is counted if word on same line, else not.

         Also an int array is used to hold the number of words with
         length associated with matching subscript, with subscript 0
         not being used.  So subscript 1 corresponds to word length of 1,
         subscript 2 to word length of 2 and so on.
------------------------------------------------------------------------*/
#include <iostream>
#include <ctype.h>
#include <iomanip>
using namespace std;

int NextWordLength(void);                    // function prototypes
void DisplayFrequencyTable(const int Words[]);

const int WORD_LENGTH = 16;                // global constant for array

int main()
{
  int WordLength;                         // actual length of word 0 to X
  int NumOfWords[WORD_LENGTH] = {0};     // array holds # of lengths of words

  WordLength = NextWordLength();
  while (WordLength)                   // continue to loop until no word, i.e. 0
    {                                 // increment length counter
      (WordLength <= 14) ? (++NumOfWords[WordLength]) : (++NumOfWords[15]);
      WordLength = NextWordLength();
    }

  DisplayFrequencyTable(NumOfWords);
}

/**********************  NextWordLength  ********************************
Action  : Will determine the length of the next word. Hyphenated words and
          words with apostrophes are counted as one word accordingly
Parameters : none
Returns   : the length of word, 0 if none, i.e. end of file
-----------------------------------------------------------------------*/
int NextWordLength(void)
{
  char Ch;
  int EndOfWord = 0,       //tells when we have read in one word
      LengthOfWord = 0;

  Ch = cin.get();                           // get first character
  while (!cin.eof() && !EndOfWord)
   {
     while (isspace(Ch) || ispunct(Ch))      // Skips leading white spaces
        Ch = cin.get();                      // and leading punctuation marks

     if (isalnum(Ch))          // if character is a letter or number
        ++LengthOfWord;        // then increment word length

     Ch = cin.get();           // get next character

     if ((Ch == '-') && (cin.peek() == '\n')) //check for hyphenated word over two lines
       {
         Ch = cin.get();       // don't count hyphen and remove the newline char
         Ch = cin.get();       // get next character then on next line
       }

     if ((Ch == '-') && (isalpha(cin.peek()))) //check for hyphenated word in one line
     {
         ++LengthOfWord;       // count the hyphen as part of word
         Ch = cin.get();       // get next character
     }

     if ((Ch == '\'') && (isalpha(cin.peek()))) // check for apostrophe in word
      {
        ++LengthOfWord;        // count apostrophe in word length
        Ch = cin.get();        // and get next letter
      }

     if (isspace(Ch) || ispunct(Ch) || cin.eof())  // is it end of word
       EndOfWord++;
   }

  return LengthOfWord;
}

/***********************  DisplayFrequencyTable  **************************
Action      :  Will display the frequency of length of words along with the
               average word length
Parameters
  IN        : Pointer to array holding the frequency of the lengths
Returns     : Nothing
Precondition: for loop does not go beyond WORD_LENGTH
------------------------------------------------------------------------*/
void DisplayFrequencyTable(const int Words[])
{
  int TotalWords = 0, TotalLength = 0;

  cout << "\nWord Length      Frequency\n";
  cout << "------------     ----------\n";

  for (int i = 1; i <= WORD_LENGTH-1; i++)
    {
     cout << setw(4) << i << setw(18) << Words[i] << endl;
     TotalLength += (i*Words[i]);
     TotalWords += Words[i];
    }

  cout << "\nAverage word length is ";

  if (TotalLength)
     cout << float(TotalLength)/TotalWords << endl;
  else
    cout << 0 << endl;
}

1
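Hi Paul. Thanks for the code snippet, but are you sure it helps? Explaining what you did alongside the sample code is more useful in the long run; code on its own may just be copied without being understood or even read. In this case, the code doesn't even seem to solve the problem in the question above :) - Michel Feldheim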

-1
#include<iostream>
#include<string>
#include<vector>
#include<cstddef>
#include<cctype>
#include<map>

using std::cout;
using std::cin;
using std::string;
using std::endl;
using std::vector;
using std::map;

int main() {

    cout << "Please enter a string: " << endl;
    string str;
    getline(cin, str, '\n');

    size_t str_len = str.size();
    cout << endl << endl;

    size_t i = 0, j = 0;
    bool pop = false;

    map<string, int> myMap;

    for (size_t k = 0; k + 1 < str_len; k++) {   // k + 1 < str_len avoids size_t underflow on an empty string
        if ((k == 0 || !isalpha(str[k-1])) && isalpha(str[k]))   // start of a word; never reads str[k-1] when k == 0
            i = k;
        if ( isalpha(str[k]) && !(isalpha(str[k+1])) ) {
            j = k;
            pop = true;
        }
        if ( (k == str_len-2) && isalpha(str[k+1]) ) {
            j = k+1;
            pop = true;
        }

        if ( (i <= j) && pop ) {
            string tmp = str.substr(i, j-i+1);
            cout << tmp << '\t';
            myMap[tmp]++;
            pop = false;
        }
    }
    cout << endl << endl;

    map<string, int>::iterator itr, end = myMap.end();
    for (itr = myMap.begin(); itr != end; itr++)
        cout << itr->first << "\t - - - - - \t" << itr->second << endl;

    cout << endl;

    return 0;
}

A block of code with no explanation is not useful. Please take the time to explain how this code answers the OP's question. - Zero Piraeus
