Steps to run Chinese Word Count Module

RAJDEEP KAUR edited this page Feb 4, 2018 · 8 revisions
  1. Download the Stanford Segmenter library required for the Chinese word count from: https://nlp.stanford.edu/software/stanford-segmenter-2017-06-09.zip

  2. Unzip it and find the file "ctb.gz" in the data folder "/stanford-segmenter-2017-06-09/data".

  3. In the TACIT word count module, upload the input.txt file and the dictionary file, both saved with UTF-8 encoding.

    Tip: (Windows) Open the file in Notepad and click 'Save As...'. The 'Encoding:' combo box shows the current file encoding. If it is not UTF-8, save the file as UTF-8. (Mac) Use TextEdit for the input.

  4. The first time the word count runs on Chinese text, TACIT prompts for the Chinese dictionary. Select the "ctb.gz" file from step 2. TACIT only needs it once and will not ask for it again.

  5. When the TACIT word count finishes, check the output files.
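
The file preparation in steps 2 and 3 can also be scripted instead of done by hand. The following is a minimal Python sketch and is not part of TACIT. The function names `extract_dictionary`, `is_utf8`, and `save_as_utf8` are hypothetical, and the `gb18030` source encoding in the usage comment is only an assumption about how a non-UTF-8 Chinese text file might have been saved.

```python
import zipfile
from pathlib import Path


def extract_dictionary(zip_path, dest_dir):
    """Unzip the segmenter archive and return the path to ctb.gz (step 2)."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
    # Search the extracted tree rather than hard-coding the folder layout.
    matches = sorted(Path(dest_dir).rglob("ctb.gz"))
    if not matches:
        raise FileNotFoundError("ctb.gz not found in the extracted archive")
    return matches[0]


def is_utf8(path):
    """Return True if the file's bytes decode cleanly as UTF-8 (step 3)."""
    try:
        Path(path).read_bytes().decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def save_as_utf8(path, source_encoding, out_path):
    """Re-save a text file as UTF-8; the caller must know the source encoding."""
    text = Path(path).read_text(encoding=source_encoding)
    Path(out_path).write_text(text, encoding="utf-8")


# Example usage (file names and the gb18030 encoding are assumptions):
# dictionary = extract_dictionary("stanford-segmenter-2017-06-09.zip", "segmenter")
# if not is_utf8("input.txt"):
#     save_as_utf8("input.txt", "gb18030", "input_utf8.txt")
```

The UTF-8 check only confirms that the bytes are valid UTF-8. It cannot detect the original encoding of a file, so the source encoding passed to `save_as_utf8` has to come from whoever created the file.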
