diff --git a/docs/_posts/luca-martial/2022-02-03-w2v_cc_300d_fr.md b/docs/_posts/luca-martial/2022-02-03-w2v_cc_300d_fr.md
new file mode 100644
index 00000000000000..bf72a0bf84a956
--- /dev/null
+++ b/docs/_posts/luca-martial/2022-02-03-w2v_cc_300d_fr.md
@@ -0,0 +1,84 @@
+---
+layout: model
+title: FastText Word Embeddings in French
+author: John Snow Labs
+name: w2v_cc_300d
+date: 2022-02-03
+tags: [fr, open_source]
+task: Embeddings
+language: fr
+edition: Spark NLP 3.4.0
+spark_version: 3.0
+supported: true
+article_header:
+  type: cover
+use_language_switcher: "Python-Scala-Java"
+---
+
+## Description
+
+Word Embeddings lookup annotator that maps tokens to vectors.
+
+## Predicted Entities
+
+
+
+{:.btn-box}
+
+
+[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/w2v_cc_300d_fr_3.4.0_3.0_1643891127135.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
+
+## How to use
+
+
+
+
+{% include programmingLanguageSelectScalaPythonNLU.html %}
+```python
+documentAssembler = DocumentAssembler()\
+    .setInputCol("text")\
+    .setOutputCol("document")
+
+tokenizer = Tokenizer()\
+    .setInputCols("document")\
+    .setOutputCol("token")
+
+embeddings = WordEmbeddingsModel.pretrained("w2v_cc_300d", "fr")\
+    .setInputCols(["document", "token"])\
+    .setOutputCol("embeddings")
+```
+```scala
+val documentAssembler = new DocumentAssembler()
+    .setInputCol("text")
+    .setOutputCol("document")
+
+val tokenizer = new Tokenizer()
+    .setInputCols("document")
+    .setOutputCol("token")
+
+val embeddings = WordEmbeddingsModel.pretrained("w2v_cc_300d", "fr")
+    .setInputCols(Array("document", "token"))
+    .setOutputCol("embeddings")
+```
+
+
+{:.model-param}
+## Model Information
+
+{:.table-model}
+|---|---|
+|Model Name:|w2v_cc_300d|
+|Type:|embeddings|
+|Compatibility:|Spark NLP 3.4.0+|
+|License:|Open Source|
+|Edition:|Official|
+|Input Labels:|[document, token]|
+|Output Labels:|[embeddings]|
+|Language:|fr|
+|Size:|1.3 GB|
+|Case sensitive:|false|
+|Dimension:|300|
+
+## References
+
+[FastText common crawl word embeddings for French](https://fasttext.cc/docs/en/crawl-vectors.html).
\ No newline at end of file
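
Note on the card above: the description says the annotator "maps tokens to vectors", and the table lists `Case sensitive: false` and `Dimension: 300`. The sketch below is a toy, library-free illustration of what such a lookup might do. It is not Spark NLP code: the helper names `build_toy_table` and `lookup`, the vector values, and the choice to return a zero vector for unknown tokens are all assumptions made for illustration.

```python
from typing import Dict, List

DIMENSION = 300  # matches the "Dimension" row of the model card


def build_toy_table(words: List[str]) -> Dict[str, List[float]]:
    """Build a hypothetical lookup table (real vectors come from the pretrained model)."""
    table: Dict[str, List[float]] = {}
    for index, word in enumerate(words, start=1):
        # Made-up values: every component of a word's vector equals its 1-based index.
        table[word.lower()] = [float(index)] * DIMENSION
    return table


def lookup(tokens: List[str], table: Dict[str, List[float]]) -> List[List[float]]:
    """Map each token to a vector, ignoring case."""
    vectors = []
    for token in tokens:
        key = token.lower()  # case-insensitive lookup, mirroring "Case sensitive: false"
        # Assumption: tokens missing from the table get an all-zero vector.
        vectors.append(table.get(key, [0.0] * DIMENSION))
    return vectors


table = build_toy_table(["bonjour", "monde"])
vectors = lookup(["Bonjour", "le", "monde"], table)

print([v[0] for v in vectors])  # [1.0, 0.0, 2.0]
print(len(vectors[0]))          # 300
```

Here `"Bonjour"` resolves to the same entry as `"bonjour"`, while `"le"` is absent from the toy table and falls back to zeros. Each token yields one vector, which matches the `[document, token]` to `[embeddings]` shape of the annotator's input and output labels.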