Learning to generate questions from text.
Blog posts about this project:
Link 1: https://software.intel.com/en-us/articles/using-natural-language-processing-for-smart-question-generation
Link 2: http://dynamichub.in/aditya/sqg/
- Sentence Selection: This module selects topically important sentences from the text document.
- Gap Selection: This module uses the Stanford Parser to extract NPs (noun phrases) and ADJPs (adjective phrases) from the important sentences as candidate gaps.
- Question Formation: This module generates the actual questions from the fill-in-the-blank questions. It uses the NLTK parser and grammar rules to do so.
- Question Classification: This module classifies question quality using a pre-trained SVM classifier (note: trained only for blank-type questions).
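The gap selection and question formation steps listed above can be illustrated with a minimal sketch. It is not the project's code: the function name `make_gap_question` is hypothetical, and the candidate gaps are hand-picked here, whereas the project obtains them from the Stanford Parser.

```python
def make_gap_question(sentence, gap_phrase, blank="_____"):
    """Replace the first occurrence of gap_phrase with a blank.

    Returns a (question, answer) pair, or None if the phrase is absent.
    """
    index = sentence.find(gap_phrase)
    if index == -1:
        return None
    question = sentence[:index] + blank + sentence[index + len(gap_phrase):]
    return question, gap_phrase


sentence = "The Stanford Parser extracts noun phrases from sentences."
# Hand-picked candidate gaps (assumption); a parser would normally supply these.
candidates = ["The Stanford Parser", "noun phrases"]

questions = [make_gap_question(sentence, c) for c in candidates]
for item in questions:
    print(item)
```

Each candidate phrase yields one fill-in-the-blank question whose answer is the removed phrase.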
Some details about the project are also described in the procedure.txt file, which is located in the project's home directory.
- Install Python 2.7 on your system.
- Clone the repository and install the dependencies:
    git clone https://github.com/adityasarvaiya/Automatic_Question_Generation.git
    cd Automatic_Question_Generation
    pip install -r requirements.txt
- If you have a problem with the dotenv package, uninstall dotenv and install python-dotenv instead.
- Install NLTK and download the required data:
    pip install nltk
    python
    import nltk
    nltk.download("punkt")
    nltk.download("stopwords")
    nltk.download("averaged_perceptron_tagger")
- Create a folder to host all the Stanford models, e.g.
    mkdir /your-path-to-stanford-models/stanford-models
- Download the Stanford Parser, unzip it, and:
  - Move stanford-parser.jar to the stanford models folder, e.g. /your-path-to-stanford-models/stanford-models/stanford-parser.jar
  - Move stanford-parser-x-x-x-models.jar to the stanford models folder.
  - Unzip stanford-parser-x-x-x-models.jar and move /edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz to stanford-models/
- Download Stanford NER, unzip it, and:
  - Move stanford-ner.jar to the stanford models folder.
  - Move stanford-ner-x-x-x.jar to the stanford models folder (e.g. 3.7.0).
  - Move /classifiers/english.all.3class.distsim.crf.ser.gz to the stanford models folder.
The stanford models folder should look like this:
- stanford-models/
| - stanford-parser.jar
| - stanford-parser-x-x-x-models.jar
| - englishPCFG.ser.gz
| - stanford-ner.jar
| - stanford-ner-x-x-x.jar
| - english.all.3class.distsim.crf.ser.gz
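
A quick way to confirm that the folder matches the listing above is a small check script. This is a sketch under assumptions: the helper name `missing_model_files` is hypothetical, and the `x-x-x` version placeholders are matched with a wildcard because the exact version depends on what you downloaded.

```python
import fnmatch
import os

# Filename patterns taken from the folder listing; "*" stands in for the
# "x-x-x" version placeholder.
EXPECTED_PATTERNS = [
    "stanford-parser.jar",
    "stanford-parser-*-models.jar",
    "englishPCFG.ser.gz",
    "stanford-ner.jar",
    "stanford-ner-*.jar",
    "english.all.3class.distsim.crf.ser.gz",
]


def missing_model_files(folder):
    """Return the expected patterns that match no file in folder."""
    present = os.listdir(folder)
    return [
        pattern
        for pattern in EXPECTED_PATTERNS
        if not fnmatch.filter(present, pattern)
    ]
```

Calling `missing_model_files("/your-path-to-stanford-models/stanford-models")` returns an empty list when every expected file is in place, and otherwise lists the patterns that still need a matching file.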
Create an environment variable file for configuration in the project root with touch .env, then add the following settings:
SENTENCE_RATIO = 0.05 #The threshold of important sentences
STANFORD_JARS=/path-to-your-stanford-models/stanford-models/
STANFORD_PARSER_CLASSPATH=/path-to-your-stanford-models/stanford-models/stanford-parser-x.x.x-models.jar
STANFORD_NER_CLASSPATH=/path-to-your-stanford-models/stanford-models/stanford-ner.jar

| ID | Variable Name | Variable Location | Use |
|---|---|---|---|
| 1 | SENTENCE_RATIO | .env file | Controls the ratio of sentences selected from the given text. Range: [0, 1] |
| 2 | len(entities) > 7 | aqg/utils/gap_selection, line 58 | Eliminates any sentence with more than 7 entities |
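
To make the effect of these two settings concrete, here is a rough sketch of how they might act on the input. The function names are hypothetical, and rounding the sentence count up is an assumption; the project's own code may compute the count differently.

```python
import math

MAX_ENTITIES = 7  # cap from the table: sentences with more entities are dropped


def num_sentences_to_keep(total_sentences, sentence_ratio):
    """Number of sentences kept for a given ratio (assumed rounding: ceil)."""
    if not 0.0 <= sentence_ratio <= 1.0:
        raise ValueError("SENTENCE_RATIO must lie in [0, 1]")
    return int(math.ceil(total_sentences * sentence_ratio))


def passes_entity_cap(entities, max_entities=MAX_ENTITIES):
    """Counterpart of the `len(entities) > 7` check: True if the sentence is kept."""
    return len(entities) <= max_entities
```

Under this sketch, a higher SENTENCE_RATIO keeps more sentences as question sources, while the entity cap filters out sentences that are too dense to turn into a single clear question.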
Project PDF: https://github.com/adityasarvaiya/Automatic_Question_Generation/blob/master/project.pdf