Command line for generating misspelled texts #615

## Detailed description
Since #614, we have a way to generate misspellings for Thai and English text.

One use case for the module is simulating out-of-distribution (OOD) datasets caused by misspelling. This command line would be an interface for practitioners who want to create such datasets.


More precisely, given a text file, the command line reads the file line by line and adds misspellings to each line. The number of misspellings should be configurable, as should the random seed. The user can specify the output path; if none is given, the command line derives it from the input filename by adding a suffix.
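The line-by-line behaviour described above could be sketched as follows. This is only an illustration under stated assumptions: `swap_adjacent` is a hypothetical placeholder noise model (it is not the misspelling module from #614), `misspell_lines` is a hypothetical helper name, and the ratio is assumed to be a fraction of the characters in each line.

```python
import random
from typing import Iterable, Iterator


def swap_adjacent(line: str, n_errors: int, rng: random.Random) -> str:
    """Placeholder noise model (assumption): swap random adjacent character pairs.

    The real command would call the misspelling module from #614 instead.
    """
    chars = list(line)
    if len(chars) < 2:
        return line
    for _ in range(n_errors):
        i = rng.randrange(len(chars) - 1)
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def misspell_lines(lines: Iterable[str], ratio: float, seed: int) -> Iterator[str]:
    """Yield a noisy copy of each input line, reproducibly for a given seed."""
    rng = random.Random(seed)  # one seeded generator shared across the whole file
    for line in lines:
        text = line.rstrip("\n")
        # Assumption: ratio is a fraction of characters, so 0.05 is about 5 per 100.
        n_errors = round(len(text) * ratio)
        yield swap_adjacent(text, n_errors, rng)
```

Using a single seeded `random.Random` instance, rather than the global generator, keeps the output reproducible for a given input file and seed without affecting other code that uses `random`.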


## Context
In my view, being able to simulate OOD situations has implications for a number of functionalities provided by PyThaiNLP, especially segmentation-related tasks.

## Possible implementation
```
thainlp misspell --file ./some/data.txt --seed=1 --misspell-ratio 0.05

# output file: ./some/data[-misspelled-r.05-seed1].txt
```
Remarks:
- `[...]` is the suffix added by the command line;
- `misspell-ratio` could be the number of misspellings per 100 characters.
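
The default output name in the example above could be derived along these lines. This is a rough sketch, not a settled design: `default_output_path` is a hypothetical helper, and formatting the ratio with two decimals and no leading zero is an assumption made only to match the `r.05` shown in the example.

```python
from pathlib import Path


def default_output_path(input_path: str, ratio: float, seed: int) -> Path:
    """Build the default output path by adding a suffix to the input filename."""
    src = Path(input_path)
    # Assumption: two decimals, leading zero dropped, so 0.05 becomes ".05".
    ratio_tag = f"{ratio:.2f}".lstrip("0")
    # Keep the directory and extension; only the stem gets the suffix.
    return src.with_name(f"{src.stem}-misspelled-r{ratio_tag}-seed{seed}{src.suffix}")
```

Encoding the ratio and seed in the filename makes it possible to tell generated datasets apart and to regenerate any of them from the original file.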


## What's next?
Once we have the command line, we could run it on datasets such as BEST2010 or other standard datasets and evaluate how the segmentation algorithms provided by PyThaiNLP behave on the misspelled text.
