Corpus Cleaner
#include <perplexity_filter.hh>
Public Member Functions
| KenLMFilter () | |
| Construct a KenLMFilter. | |
| double | Score (const wstring sentence) |
| Score a sentence with KenLM. | |
| double | ScoreWithSentencePiece (const wstring sentence) |
| Score a sentence with KenLM after SentencePiece tokenization. | |
| double | Perplexity (const wstring sentence) |
| Compute the perplexity of a sentence with KenLM. | |
| double | PerplexityWithSentencePiece (const wstring sentence) |
| Compute the perplexity of a sentence with KenLM after SentencePiece tokenization. | |
Public Attributes
| sentencepiece::SentencePieceProcessor | processor |
Definition at line 33 of file perplexity_filter.hh.
KenLMFilter::KenLMFilter ()
Construct a KenLMFilter.
The step is...
Example:
Definition at line 27 of file perplexity_filter.cc.
double KenLMFilter::Perplexity (const wstring sentence)
Compute the perplexity of a sentence with KenLM.
The step is...
Example:
wstring sentence = L"吾輩は猫である.名前はまだない.";
cout << Perplexity(sentence) << endl; // 4117.1
| const | wstring sentence: text sentence |
Definition at line 147 of file perplexity_filter.cc.
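The relationship between a sentence score and its perplexity can be sketched as follows. KenLM reports probabilities as base-10 logarithms, so a common way to turn a summed score into perplexity is 10^(-score / N), where N is the number of scored tokens. Whether KenLMFilter::Perplexity uses exactly this normalization is an assumption; the helper name PerplexityFromLog10 is hypothetical and not part of Corpus Cleaner.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper (not part of Corpus Cleaner).
// total_log10_prob: sum of per-token log10 probabilities for one sentence.
// token_count: number of tokens that contributed to the sum.
// Returns 10^(-total_log10_prob / token_count).
double PerplexityFromLog10(double total_log10_prob, std::size_t token_count) {
  assert(token_count > 0);  // perplexity is undefined for an empty sentence
  return std::pow(10.0, -total_log10_prob / static_cast<double>(token_count));
}
```

Under this definition, a sentence of 3 tokens with a total score of -6.0 has perplexity 10^2 = 100, and a score of 0 always gives perplexity 1.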
double KenLMFilter::PerplexityWithSentencePiece (const wstring sentence)
Compute the perplexity of a sentence with KenLM after SentencePiece tokenization.
The step is...
Example:
wstring sentence = L"吾輩は猫である.名前はまだない.";
cout << PerplexityWithSentencePiece(sentence) << endl; // 677.5
| const | wstring sentence: text sentence |
Definition at line 176 of file perplexity_filter.cc.
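A perplexity value like the one above is typically compared against a threshold to decide whether a sentence stays in the corpus. The sketch below shows that pattern with a generic callable standing in for the scorer. The function name KeepLowPerplexity and the threshold parameter are hypothetical, and a suitable threshold depends on the language model and is not given in this reference.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper (not part of Corpus Cleaner).
// Keeps the sentences whose perplexity is at most `threshold`.
// `perplexity` is any callable mapping a sentence to its perplexity.
std::vector<std::wstring> KeepLowPerplexity(
    const std::vector<std::wstring>& sentences,
    const std::function<double(const std::wstring&)>& perplexity,
    double threshold) {
  std::vector<std::wstring> kept;
  for (const std::wstring& sentence : sentences) {
    if (perplexity(sentence) <= threshold) {
      kept.push_back(sentence);
    }
  }
  return kept;
}
```

With a real filter object, the callable could be a lambda that forwards to PerplexityWithSentencePiece.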
double KenLMFilter::Score (const wstring sentence)
Score a sentence with KenLM.
The step is...
Example:
| const | wstring sentence: text sentence |
Definition at line 57 of file perplexity_filter.cc.
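Score takes a wide string, while KenLM's query API works on byte strings, so the input presumably has to be converted to UTF-8 before scoring. How Corpus Cleaner performs that conversion is not shown in this reference. The following is a minimal hand-written encoder under the assumption that each wchar_t holds one full Unicode code point (true for 32-bit wchar_t, as on Linux); the name WideToUtf8 is hypothetical.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not part of Corpus Cleaner).
// Encodes each wide character as UTF-8, assuming one code point per wchar_t.
std::string WideToUtf8(const std::wstring& input) {
  std::string out;
  for (wchar_t wc : input) {
    const std::uint32_t cp = static_cast<std::uint32_t>(wc);
    if (cp < 0x80) {  // 1 byte: ASCII
      out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {  // 2 bytes
      out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
      out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {  // 3 bytes: most Japanese characters
      out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
      out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
      out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {  // 4 bytes: supplementary planes
      out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
      out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
      out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
      out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
  }
  return out;
}
```

On platforms with 16-bit wchar_t, surrogate pairs would need to be combined first, which this sketch does not do.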
double KenLMFilter::ScoreWithSentencePiece (const wstring sentence)
Score a sentence with KenLM after SentencePiece tokenization.
The step is...
Example:
| const | wstring sentence: text sentence |
Definition at line 102 of file perplexity_filter.cc.
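A KenLM model trained on SentencePiece output expects whitespace-separated pieces, so one plausible step between tokenizing and scoring is to join the pieces with single spaces. The pieces themselves would come from sentencepiece::SentencePieceProcessor::Encode, which fills a std::vector<std::string>. Whether ScoreWithSentencePiece does exactly this is an assumption, and the name JoinPieces is hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of Corpus Cleaner).
// Joins tokenizer pieces with single spaces so that a word-level
// language model sees one piece per token.
std::string JoinPieces(const std::vector<std::string>& pieces) {
  std::string joined;
  for (std::size_t i = 0; i < pieces.size(); ++i) {
    if (i > 0) {
      joined.push_back(' ');
    }
    joined += pieces[i];
  }
  return joined;
}
```

The joined string can then be scored token by token, and the token count used for perplexity is the number of pieces rather than the number of characters.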
sentencepiece::SentencePieceProcessor KenLMFilter::processor
Definition at line 36 of file perplexity_filter.hh.