|
| uint64_t | ConutLines (const string &filename) |
| | Get the line count of the file filename.
|
| |
| void | SplitFiles (const vector< string > &output_files, const string &input_file) |
| | Split one file into multiple equal parts based on the number of lines.
|
| |
| void | MergeFiles (const vector< string > &input_files, const string &output_file) |
| | Merge multiple files into one output file.
|
| |
| void | MultiProcessCorpusClean (const string input_folder_path, const string output_folder_path) |
| |
| int | main (void) |
| |
◆ ConutLines()
| uint64_t ConutLines |
( |
const string & | filename | ) |
|
Get the line count of the file filename.
Example: string input_path = "../data/input/"; ConutLines(input_path);
- Parameters
-
| string | filename: file name |
- Returns
- uint64_t: number of lines in the file
Definition at line 16 of file main.cpp.
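The body of ConutLines is not reproduced on this page. As a minimal sketch of what a line counter with this signature might look like, assuming a plain text file read line by line (the name CountLinesSketch and the exception on open failure are assumptions, not taken from main.cpp):

```cpp
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for ConutLines; the real implementation may differ.
std::uint64_t CountLinesSketch(const std::string &filename) {
    std::ifstream in(filename);
    if (!in) {
        // Assumption: report an unreadable file by throwing.
        throw std::runtime_error("cannot open " + filename);
    }
    std::uint64_t count = 0;
    std::string line;
    // std::getline also yields a final line that has no trailing newline.
    while (std::getline(in, line)) {
        ++count;
    }
    return count;
}
```

Counting with std::getline treats a last line without a terminating newline as a line, which may or may not match the behavior of the documented function.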
◆ main()
◆ MergeFiles()
| void MergeFiles |
( |
const vector< string > & | input_files, |
|
|
const string & | output_file ) |
Merge multiple files into one output file.
- Parameters
-
| const | vector<string>& input_files: list of files to merge |
| const | string& output_file: the merged output file |
- Returns
- void: None
Definition at line 80 of file main.cpp.
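The merge step described above can be sketched as follows. This is an illustration under stated assumptions, not the code in main.cpp: the name MergeFilesSketch is hypothetical, inputs are assumed to be text files concatenated in list order, and open failures are assumed to throw.

```cpp
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for MergeFiles; the real implementation may differ.
void MergeFilesSketch(const std::vector<std::string> &input_files,
                      const std::string &output_file) {
    std::ofstream out(output_file);
    if (!out) {
        throw std::runtime_error("cannot open " + output_file);
    }
    std::string line;
    // Append each input, in the order given, to the output file.
    for (const auto &path : input_files) {
        std::ifstream in(path);
        if (!in) {
            throw std::runtime_error("cannot open " + path);
        }
        while (std::getline(in, line)) {
            // Writing '\n' after every line keeps the last line of one
            // input from running into the first line of the next.
            out << line << '\n';
        }
    }
}
```

A call such as MergeFilesSketch({"part_0.txt", "part_1.txt"}, "merged.txt") would write the lines of both parts to merged.txt; the file names here are placeholders.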
◆ MultiProcessCorpusClean()
| void MultiProcessCorpusClean |
( |
const string | input_folder_path, |
|
|
const string | output_folder_path ) |
◆ SplitFiles()
| void SplitFiles |
( |
const vector< string > & | output_files, |
|
|
const string & | input_file ) |
Split one file into multiple equal parts based on the number of lines.
- Parameters
-
| const | vector<string>& output_files: list of files that receive the split parts |
| const | string& input_file: file to split |
- Returns
- void: None
Definition at line 32 of file main.cpp.
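Splitting by line count can be sketched in two passes: count the lines, then hand each output file an equal share. The code below is an assumption-laden illustration rather than the code in main.cpp: the name SplitFilesSketch is hypothetical, the share is rounded up so that any shortfall lands in the last part, and invalid input is assumed to throw.

```cpp
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for SplitFiles; the real implementation may differ.
void SplitFilesSketch(const std::vector<std::string> &output_files,
                      const std::string &input_file) {
    if (output_files.empty()) {
        throw std::invalid_argument("output_files must not be empty");
    }
    std::ifstream in(input_file);
    if (!in) {
        throw std::runtime_error("cannot open " + input_file);
    }
    // First pass: count the lines of the input.
    std::uint64_t total = 0;
    std::string line;
    while (std::getline(in, line)) {
        ++total;
    }
    // Rewind for the second pass; clear() resets the EOF/fail state.
    in.clear();
    in.seekg(0);
    // Lines per part, rounded up (assumption: the last part may be shorter).
    const std::uint64_t parts = output_files.size();
    const std::uint64_t per_part = (total + parts - 1) / parts;
    // Second pass: copy per_part lines into each output in turn.
    for (const auto &path : output_files) {
        std::ofstream out(path);
        if (!out) {
            throw std::runtime_error("cannot open " + path);
        }
        for (std::uint64_t i = 0; i < per_part && std::getline(in, line); ++i) {
            out << line << '\n';
        }
    }
}
```

With a five-line input and two output files, this sketch writes three lines to the first file and two to the second.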