Binary file added docs/assets/images/ocr/text_detection.png
132 changes: 132 additions & 0 deletions docs/en/ocr_object_detection.md
**Output:**

![image](/assets/images/ocr/signature.png)



## ImageTextDetector

`ImageTextDetector` is a DL model for detecting text on images.
It is based on the CRAFT network architecture.


#### Input Columns

{:.table-model-big}
| Param name | Type | Default | Column Data Description |
| --- | --- | --- | --- |
| inputCol | string | image | image struct ([Image schema](ocr_structures#image-schema)) |

#### Parameters

{:.table-model-big}
| Param name | Type | Default | Description |
| --- | --- | --- | --- |
| scoreThreshold | float | 0.9 | Score threshold for output regions |
| sizeThreshold | int | 5 | Size threshold for detected text regions |
| textThreshold | float | 0.4 | Text confidence threshold |
| linkThreshold | float | 0.4 | Link (affinity) confidence threshold |
| width | int | 0 | Scale width to this value; if 0, use the original width |
| height | int | 0 | Scale height to this value; if 0, use the original height |
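
The two post-processing parameters interact as follows: `scoreThreshold` discards low-confidence regions, and `sizeThreshold` discards regions too small to be real text. A minimal pure-Python sketch of that filtering logic (illustrative only, not the library's implementation; the `Region` class and `filter_regions` helper are hypothetical):

```python
from dataclasses import dataclass

@dataclass
class Region:
    x: int
    y: int
    width: int
    height: int
    score: float

def filter_regions(regions, score_threshold=0.9, size_threshold=5):
    """Keep regions whose confidence and smaller side pass the thresholds."""
    return [
        r for r in regions
        if r.score >= score_threshold
        and min(r.width, r.height) >= size_threshold
    ]

candidates = [
    Region(10, 10, 120, 30, 0.95),   # kept
    Region(40, 80, 100, 25, 0.50),   # dropped: score below threshold
    Region(200, 5, 4, 3, 0.99),      # dropped: smaller side below threshold
]
print(len(filter_regions(candidates)))  # 1
```

Lowering `scoreThreshold` returns more (noisier) regions; raising `sizeThreshold` suppresses specks that OCR would fail on anyway.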

#### Output Columns

{:.table-model-big}
| Param name | Type | Default | Column Data Description |
| --- | --- | --- | --- |
| outputCol | string | text_regions | Array of [Coordinates](ocr_structures#coordinate-schema) |


**Example:**

<div class="tabs-box pt0" markdown="1">

{% include programmingLanguageSelectScalaPython.html %}

```scala
import com.johnsnowlabs.ocr.transformers._
import com.johnsnowlabs.ocr.OcrContext.implicits._

val imagePath = "path to image"

// Read image file as binary file
val df = spark.read
.format("binaryFile")
.load(imagePath)
.asImage("image")

// Define transformer for text detection
val text_detector = ImageTextDetector
  .pretrained("text_detection_v1", "en", "clinical/ocr")
  .setInputCol("image")
  .setOutputCol("text_regions")
  .setSizeThreshold(10)
  .setScoreThreshold(0.9)
  .setLinkThreshold(0.4)
  .setTextThreshold(0.2)
  .setWidth(1512)
  .setHeight(2016)

// Draw detected regions on the image
val draw_regions = new ImageDrawRegions()
  .setInputCol("image")
  .setInputRegionsCol("text_regions")
  .setOutputCol("image_with_regions")

val data = draw_regions.transform(text_detector.transform(df))

data.storeImage("image_with_regions")
```

```python
from pyspark.ml import PipelineModel
from sparkocr.transformers import *

imagePath = "path to image"

# Read image file as binary file
df = spark.read \
    .format("binaryFile") \
    .load(imagePath)

binary_to_image = BinaryToImage() \
.setInputCol("content") \
.setOutputCol("image")

# Define transformer for text detection
text_detector = ImageTextDetector \
.pretrained("text_detection_v1", "en", "clinical/ocr") \
.setInputCol("image") \
.setOutputCol("text_regions") \
.setSizeThreshold(10) \
.setScoreThreshold(0.9) \
.setLinkThreshold(0.4) \
.setTextThreshold(0.2) \
.setWidth(1512) \
.setHeight(2016)

draw_regions = ImageDrawRegions() \
.setInputCol("image") \
.setInputRegionsCol("text_regions") \
.setOutputCol("image_with_regions")


pipeline = PipelineModel(stages=[
binary_to_image,
text_detector,
draw_regions
])

data = pipeline.transform(df)

display_images(data, "image_with_regions")
```

</div>

**Output:**

![image](/assets/images/ocr/text_detection.png)
2 changes: 2 additions & 0 deletions docs/en/ocr_pipeline_components.md
| Param name | Type | Default | Description |
| --- | --- | --- | --- |
| explodeCols | Array[string] | | Columns to explode |
| rotated | boolean | False | Support rotated regions |
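
Geometrically, supporting rotated regions means a region is no longer a plain axis-aligned box: its four corners come from rotating the box about its centre by the region's angle. A small illustrative sketch of that corner computation (pure Python, not the Spark OCR API; `rotated_corners` is a hypothetical helper):

```python
import math

def rotated_corners(cx, cy, w, h, angle_deg):
    """Corner points of a w x h rectangle centred at (cx, cy), rotated by angle_deg."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    corners = []
    # Offsets of the four axis-aligned corners relative to the centre
    for dx, dy in [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]:
        # Standard 2D rotation about the centre
        corners.append((cx + dx * cos_a - dy * sin_a,
                        cy + dx * sin_a + dy * cos_a))
    return corners

# A 90-degree rotation effectively swaps the box's width and height
print(rotated_corners(0, 0, 4, 2, 90))
```

With `rotated` disabled, a transformer can treat every region as axis-aligned; enabling it requires splitting or drawing along these rotated corner points instead.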

#### Output Columns

| --- | --- | --- | --- |
| lineWidth | Int | 4 | Line width for drawing rectangles |
| fontSize | Int | 12 | Font size for rendering labels and scores |
| rotated | boolean | False | Support rotated regions |

#### Output Columns

33 changes: 33 additions & 0 deletions docs/en/ocr_release_notes.md
sidebar:
nav: spark-ocr
---


## 3.10.0

Release date: 10-01-2022


#### Overview

Form recognition using LayoutLMv2 and text detection.


#### New Features

* Added [VisualDocumentNERv2](ocr_visual_document_understanding#visualdocumentnerv2) transformer
* Added DL based [ImageTextDetector](ocr_object_detection#imagetextdetector) transformer
* Support rotated regions in [ImageSplitRegions](ocr_pipeline_components#imagesplitregions)
* Support rotated regions in [ImageDrawRegions](ocr_pipeline_components#imagedrawregions)


#### New Models

* LayoutLMv2 fine-tuned on FUNSD dataset
* Text detection model based on CRAFT architecture


#### New notebooks

* [Text Detection](https://github.com/JohnSnowLabs/spark-ocr-workshop/blob/3100-release-candidate/jupyter/TextDetection/SparkOcrImageTextDetection.ipynb)
* [Visual Document NER v2](https://github.com/JohnSnowLabs/spark-ocr-workshop/blob/3100-release-candidate/jupyter/SparkOCRVisualDocumentNERv2.ipynb)



## 3.9.1

Release date: 02-11-2021
Added preservation of original file formatting
* [Preserve Original Formatting](https://github.com/JohnSnowLabs/spark-ocr-workshop/blob/3.9.1/jupyter/SparkOcrPreserveOriginalFormatting.ipynb)



## 3.9.0

Release date: 20-10-2021
141 changes: 139 additions & 2 deletions docs/en/ocr_visual_document_understanding.md
document_ner = VisualDocumentNer() \
pipeline = PipelineModel(stages=[
binary_to_image,
ocr,
    document_ner,
])

result = pipeline.transform(df)
Output:

```
+-------------------------------------------------------------------------+
| B-COMPANY, [word -> AEON, token -> aeon], []], [entity, 0, 0, B-COMPANY,|
| [word -> CO., token -> co], ... |
+-------------------------------------------------------------------------+
```

## VisualDocumentNERv2

`VisualDocumentNERv2` is a DL model for named entity recognition on documents, an improved version of `VisualDocumentNER`. A pretrained model trained on the FUNSD dataset is available.

#### Input Columns

{:.table-model-big}
| Param name | Type | Default | Column Data Description |
| --- | --- | --- | --- |
| inputCols | Array[String] | | Column names for the tokens of the document and the image |


#### Parameters

{:.table-model-big}
| Param name | Type | Default | Description |
| --- | --- | --- | --- |
| maxSentenceLength | int | 512 | Maximum sentence length. |
| whiteList | Array[String] | | Whitelist of output labels |
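
The effect of `whiteList` is simply to restrict the output to the listed labels. A minimal sketch of that behaviour (illustrative only, not the transformer's internals; `apply_whitelist` is a hypothetical helper, and entities are reduced to `(label, word)` pairs):

```python
def apply_whitelist(entities, white_list=None):
    """Keep only entities whose label is in white_list; None/empty keeps all."""
    if not white_list:
        return list(entities)
    allowed = {label.lower() for label in white_list}
    return [(label, word) for label, word in entities if label.lower() in allowed]

ents = [("b-header", "Institution"), ("i-header", "Name"), ("b-question", "Address")]
print(apply_whitelist(ents, ["B-HEADER", "I-HEADER"]))
```

Leaving `whiteList` unset returns all labels the model predicts.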

#### Output Columns

{:.table-model-big}
| Param name | Type | Default | Column Data Description |
| --- | --- | --- | --- |
| outputCol | string | entities | Name of output column with entities Annotation. |


**Example:**


<div class="tabs-box pt0" markdown="1">

{% include programmingLanguageSelectScalaPython.html %}

```scala
import org.apache.spark.ml.Pipeline
import com.johnsnowlabs.ocr.transformers._
import com.johnsnowlabs.ocr.OcrContext.implicits._

val imagePath = "path to image"

val dataFrame = spark.read.format("binaryFile").load(imagePath)

val bin2imTransformer = new BinaryToImage()
  .setImageType(ImageType.TYPE_3BYTE_BGR)

val ocr = new ImageToHocr()
.setInputCol("image")
.setOutputCol("hocr")
.setIgnoreResolution(false)
.setOcrParams(Array("preserve_interword_spaces=0"))

val tokenizer = new HocrTokenizer()
.setInputCol("hocr")
.setOutputCol("token")

val visualDocumentNER = VisualDocumentNERv2
.pretrained("layoutlmv2_funsd", "en", "clinical/ocr")
.setInputCols(Array("token", "image"))

val pipeline = new Pipeline()
.setStages(Array(
bin2imTransformer,
ocr,
tokenizer,
visualDocumentNER
))

val results = pipeline
  .fit(dataFrame)
  .transform(dataFrame)
  .select("entities")
  .cache()

results.show()
```

```python
from pyspark.ml import PipelineModel
from sparkocr.transformers import *

imagePath = "path to image"

# Read image file as binary file
df = spark.read \
    .format("binaryFile") \
    .load(imagePath)

binToImage = BinaryToImage() \
.setInputCol("content") \
.setOutputCol("image")

ocr = ImageToHocr()\
.setInputCol("image")\
.setOutputCol("hocr")\
.setIgnoreResolution(False)\
.setOcrParams(["preserve_interword_spaces=0"])

tokenizer = HocrTokenizer()\
.setInputCol("hocr")\
.setOutputCol("token")

ner = VisualDocumentNerV2 \
.pretrained("layoutlmv2_funsd", "en", "clinical/ocr")\
.setInputCols(["token", "image"])\
.setOutputCol("entities")

pipeline = PipelineModel(stages=[
binToImage,
ocr,
tokenizer,
ner
])

import pyspark.sql.functions as f

result = pipeline.transform(df)

# Extract the filename from the full path of the input file
path_array = f.split(result["path"], "/")

result.withColumn("filename", path_array.getItem(f.size(path_array) - 1)) \
    .withColumn("exploded_entities", f.explode("entities")) \
    .select("filename", "exploded_entities") \
    .show(truncate=False)
```

</div>

Output sample:

```
+---------+-------------------------------------------------------------------------------------------------------------------------+
|filename |exploded_entities |
+---------+-------------------------------------------------------------------------------------------------------------------------+
|form1.jpg|[entity, 0, 6, i-answer, [x -> 1027, y -> 89, height -> 19, confidence -> 96, word -> Version:, width -> 90], []] |
|form1.jpg|[entity, 25, 35, b-header, [x -> 407, y -> 190, height -> 37, confidence -> 96, word -> Institution, width -> 241], []] |
|form1.jpg|[entity, 37, 40, i-header, [x -> 667, y -> 190, height -> 37, confidence -> 96, word -> Name, width -> 130], []] |
|form1.jpg|[entity, 42, 52, b-question, [x -> 498, y -> 276, height -> 19, confidence -> 96, word -> Institution, width -> 113], []]|
|form1.jpg|[entity, 54, 60, i-question, [x -> 618, y -> 276, height -> 19, confidence -> 96, word -> Address, width -> 89], []] |
+---------+-------------------------------------------------------------------------------------------------------------------------+
```
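
The labels above follow the usual BIO convention (`b-` opens an entity, `i-` continues it), so a common post-processing step is to merge consecutive tokens into multi-word spans. A hedged sketch of that merging (standard BIO logic, not a Spark OCR API; tokens are reduced to `(tag, word)` pairs):

```python
def merge_bio(tokens):
    """Merge (tag, word) pairs with 'b-'/'i-' prefixed tags into labelled spans."""
    spans, current = [], None
    for tag, word in tokens:
        prefix, _, label = tag.partition("-")
        if prefix == "b" or current is None or current[0] != label:
            # A 'b-' tag, or an 'i-' tag with no matching open span, starts a new span
            if current:
                spans.append(current)
            current = (label, [word])
        else:
            # An 'i-' tag continuing the current span
            current[1].append(word)
    if current:
        spans.append(current)
    return [(label, " ".join(words)) for label, words in spans]

rows = [("b-header", "Institution"), ("i-header", "Name"),
        ("b-question", "Institution"), ("i-question", "Address")]
print(merge_bio(rows))  # [('header', 'Institution Name'), ('question', 'Institution Address')]
```

Applied to the sample output, this would reconstruct "Institution Name" as one header entity and "Institution Address" as one question entity.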