Merged
17 changes: 17 additions & 0 deletions CHANGELOG
@@ -1,3 +1,20 @@
=======
6.2.0
=======
---------------------------
New Features & Enhancements
---------------------------

* [SPARKNLP-1288] AutoGGUF close model #14671
* [SPARKNLP-1283] Add remove thinking flag #14672
* [SPARKNLP-1293] Enhancements EntityRuler and DocumentNormalizer #14674
* [SPARKNLP-1299] Add Hierarchical Element Identification to HTMLReader #14675

---------
Bug Fixes
---------
* [SPARKNLP-1300] RobertaEmbeddings: changing token sequence in warmup test #14677

=======
6.1.5
=======
10 changes: 5 additions & 5 deletions README.md
@@ -63,7 +63,7 @@ $ java -version
$ conda create -n sparknlp python=3.7 -y
$ conda activate sparknlp
# spark-nlp by default is based on pyspark 3.x
-$ pip install spark-nlp==6.1.5 pyspark==3.3.1
+$ pip install spark-nlp==6.2.0 pyspark==3.3.1
```

In Python console or Jupyter `Python3` kernel:
@@ -129,7 +129,7 @@ For a quick example of using pipelines and models take a look at our official [d

### Apache Spark Support

-Spark NLP *6.1.5* has been built on top of Apache Spark 3.4 while fully supports Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, 3.4.x, and 3.5.x
+Spark NLP *6.2.0* has been built on top of Apache Spark 3.4 while fully supports Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, 3.4.x, and 3.5.x

| Spark NLP | Apache Spark 3.5.x | Apache Spark 3.4.x | Apache Spark 3.3.x | Apache Spark 3.2.x | Apache Spark 3.1.x | Apache Spark 3.0.x | Apache Spark 2.4.x | Apache Spark 2.3.x |
|-----------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|
@@ -159,7 +159,7 @@ Find out more about 4.x `SparkNLP` versions in our official [documentation](http

### Databricks Support

-Spark NLP 6.1.5 has been tested and is compatible with the following runtimes:
+Spark NLP 6.2.0 has been tested and is compatible with the following runtimes:

| **CPU** | **GPU** |
|--------------------|--------------------|
@@ -177,7 +177,7 @@ We are compatible with older runtimes. For a full list check databricks support

### EMR Support

-Spark NLP 6.1.5 has been tested and is compatible with the following EMR releases:
+Spark NLP 6.2.0 has been tested and is compatible with the following EMR releases:

| **EMR Release** |
|--------------------|
@@ -267,7 +267,7 @@ Please check [these instructions](https://sparknlp.org/docs/en/install#s3-integr
Need more **examples**? Check out our dedicated [Spark NLP Examples](https://github.com/JohnSnowLabs/spark-nlp/tree/master/examples)
repository to showcase all Spark NLP use cases!

-Also, don't forget to check [Spark NLP in Action](https://sparknlp.org/demo) built by Streamlit.
+Also, don't forget to check [Spark NLP in Action](https://sparknlp.org/demos) built by Streamlit.

#### All examples: [spark-nlp/examples](https://github.com/JohnSnowLabs/spark-nlp/tree/master/examples)

2 changes: 1 addition & 1 deletion build.sbt
@@ -6,7 +6,7 @@ name := getPackageName(is_silicon, is_gpu, is_aarch64)

organization := "com.johnsnowlabs.nlp"

-version := "6.1.5"
+version := "6.2.0"

(ThisBuild / scalaVersion) := scalaVer

4 changes: 2 additions & 2 deletions conda/meta.yaml
@@ -1,13 +1,13 @@
{% set name = "spark-nlp" %}
-{% set version = "6.1.5" %}
+{% set version = "6.2.0" %}

package:
name: {{ name|lower }}
version: {{ version }}

source:
url: https://pypi.io/packages/source/{{ name[0] }}/{{ name }}/spark_nlp-{{ version }}.tar.gz
-sha256: 834e5b785d6f1c6deb48195d88d11ae45433bb398dc632a63106e08fbe6f9273
+sha256: 7cbeafc7d01afcda6f7dbb76cfb7fd34893fd2c98c4301d4d692a8962cd69f70

build:
noarch: python
6 changes: 3 additions & 3 deletions docs/Gemfile
@@ -1,19 +1,19 @@
source "https://rubygems.org"

gem "github-pages", "227"
gem "nokogiri", ">= 1.13.9"

gem "nokogiri", ">= 1.18.9"

gem "elasticsearch", "~> 7.10"

gem 'wdm', '~> 0.1.0'

gem "webrick"
gem "webrick", ">= 1.8.2"

gem "jekyll", "~> 3.9"

gem "aws-sdk-s3", "~>1"


group "jekyll-plugins" do
gem "jekyll-incremental", "0.1.0", path: "_plugins/jekyll-incremental"
end
132 changes: 63 additions & 69 deletions docs/Gemfile.lock
@@ -13,78 +13,65 @@ GEM
minitest (~> 5.1)
tzinfo (~> 1.1)
zeitwerk (~> 2.2, >= 2.2.2)
addressable (2.8.1)
public_suffix (>= 2.0.2, < 6.0)
addressable (2.8.7)
public_suffix (>= 2.0.2, < 7.0)
aws-eventstream (1.4.0)
aws-partitions (1.1126.0)
aws-sdk-core (3.226.2)
aws-partitions (1.1172.0)
aws-sdk-core (3.233.0)
aws-eventstream (~> 1, >= 1.3.0)
aws-partitions (~> 1, >= 1.992.0)
aws-sigv4 (~> 1.9)
base64
bigdecimal
jmespath (~> 1, >= 1.6.1)
logger
aws-sdk-kms (1.106.0)
aws-sdk-core (~> 3, >= 3.225.0)
aws-sdk-kms (1.113.0)
aws-sdk-core (~> 3, >= 3.231.0)
aws-sigv4 (~> 1.5)
aws-sdk-s3 (1.192.0)
aws-sdk-core (~> 3, >= 3.225.0)
aws-sdk-s3 (1.199.1)
aws-sdk-core (~> 3, >= 3.231.0)
aws-sdk-kms (~> 1)
aws-sigv4 (~> 1.5)
aws-sigv4 (1.12.1)
aws-eventstream (~> 1, >= 1.0.2)
base64 (0.3.0)
bigdecimal (3.3.1)
coffee-script (2.4.1)
coffee-script-source
execjs
coffee-script-source (1.11.1)
colorator (1.1.0)
commonmarker (0.23.8)
concurrent-ruby (1.2.2)
dnsruby (1.61.9)
simpleidn (~> 0.1)
elasticsearch (7.17.7)
elasticsearch-api (= 7.17.7)
elasticsearch-transport (= 7.17.7)
elasticsearch-api (7.17.7)
commonmarker (0.23.12)
concurrent-ruby (1.3.5)
dnsruby (1.73.0)
base64 (>= 0.2)
logger (~> 1.6)
simpleidn (~> 0.2.1)
elasticsearch (7.17.11)
elasticsearch-api (= 7.17.11)
elasticsearch-transport (= 7.17.11)
elasticsearch-api (7.17.11)
multi_json
elasticsearch-transport (7.17.7)
faraday (~> 1)
elasticsearch-transport (7.17.11)
base64
faraday (>= 1, < 3)
multi_json
em-websocket (0.5.3)
eventmachine (>= 0.12.9)
http_parser.rb (~> 0)
ethon (0.16.0)
ethon (0.17.0)
ffi (>= 1.15.0)
eventmachine (1.2.7)
eventmachine (1.2.7-x64-mingw32)
execjs (2.8.1)
faraday (1.10.3)
faraday-em_http (~> 1.0)
faraday-em_synchrony (~> 1.0)
faraday-excon (~> 1.1)
faraday-httpclient (~> 1.0)
faraday-multipart (~> 1.0)
faraday-net_http (~> 1.0)
faraday-net_http_persistent (~> 1.0)
faraday-patron (~> 1.0)
faraday-rack (~> 1.0)
faraday-retry (~> 1.0)
ruby2_keywords (>= 0.0.4)
faraday-em_http (1.0.0)
faraday-em_synchrony (1.0.0)
faraday-excon (1.1.0)
faraday-httpclient (1.0.1)
faraday-multipart (1.0.4)
multipart-post (~> 2)
faraday-net_http (1.0.1)
faraday-net_http_persistent (1.2.0)
faraday-patron (1.0.0)
faraday-rack (1.0.0)
faraday-retry (1.0.3)
ffi (1.15.5)
ffi (1.15.5-x64-mingw-ucrt)
ffi (1.15.5-x64-mingw32)
execjs (2.10.0)
faraday (2.14.0)
faraday-net_http (>= 2.0, < 3.5)
json
logger
faraday-net_http (3.4.1)
net-http (>= 0.5.0)
ffi (1.17.2)
ffi (1.17.2-arm64-darwin)
forwardable-extended (2.6.0)
gemoji (3.0.1)
github-pages (227)
@@ -253,41 +240,50 @@ GEM
html-pipeline (~> 2.2)
jekyll (>= 3.0, < 5.0)
jmespath (1.6.2)
json (2.15.1)
kramdown (2.3.2)
rexml
kramdown-parser-gfm (1.1.0)
kramdown (~> 2.0)
liquid (4.0.3)
listen (3.8.0)
listen (3.9.0)
rb-fsevent (~> 0.10, >= 0.10.3)
rb-inotify (~> 0.9, >= 0.9.10)
logger (1.7.0)
mercenary (0.3.6)
mini_portile2 (2.8.1)
mini_portile2 (2.8.9)
minima (2.5.1)
jekyll (>= 3.5, < 5.0)
jekyll-feed (~> 0.9)
jekyll-seo-tag (~> 2.1)
minitest (5.18.0)
multi_json (1.15.0)
multipart-post (2.3.0)
nokogiri (1.14.2)
mini_portile2 (~> 2.8.0)
minitest (5.26.0)
multi_json (1.17.0)
net-http (0.6.0)
uri
nokogiri (1.18.10)
mini_portile2 (~> 2.8.2)
racc (~> 1.4)
nokogiri (1.18.10-arm64-darwin)
racc (~> 1.4)
nokogiri (1.18.10-x64-mingw-ucrt)
racc (~> 1.4)
nokogiri (1.18.10-x86_64-darwin)
racc (~> 1.4)
nokogiri (1.18.10-x86_64-linux-gnu)
racc (~> 1.4)
octokit (4.25.1)
faraday (>= 1, < 3)
sawyer (~> 0.9)
pathutil (0.16.2)
forwardable-extended (~> 2.6)
public_suffix (4.0.7)
racc (1.6.2)
racc (1.8.1)
rb-fsevent (0.11.2)
rb-inotify (0.10.1)
rb-inotify (0.11.1)
ffi (~> 1.0)
rexml (3.2.5)
rexml (3.4.4)
rouge (3.26.0)
ruby2_keywords (0.0.5)
rubyzip (2.3.2)
rubyzip (2.4.1)
safe_yaml (1.0.5)
sass (3.7.4)
sass-listen (~> 4.0.0)
@@ -297,24 +293,22 @@ GEM
sawyer (0.9.2)
addressable (>= 2.3.5)
faraday (>= 0.17.3, < 3)
simpleidn (0.2.1)
unf (~> 0.1.4)
simpleidn (0.2.3)
terminal-table (1.8.0)
unicode-display_width (~> 1.1, >= 1.1.1)
thread_safe (0.3.6)
typhoeus (1.4.0)
typhoeus (1.4.1)
ethon (>= 0.9.0)
tzinfo (1.2.11)
thread_safe (~> 0.1)
unf (0.1.4)
unf_ext
unf_ext (0.0.8.2)
unicode-display_width (1.8.0)
uri (1.0.4)
wdm (0.1.1)
webrick (1.8.1)
zeitwerk (2.6.7)
webrick (1.9.1)
zeitwerk (2.6.18)

PLATFORMS
arm64-darwin-24
x64-mingw-ucrt
x64-mingw32
x86_64-darwin-21
@@ -327,9 +321,9 @@ DEPENDENCIES
github-pages (= 227)
jekyll (~> 3.9)
jekyll-incremental (= 0.1.0)!
nokogiri (>= 1.13.9)
nokogiri (>= 1.18.9)
wdm (~> 0.1.0)
webrick
webrick (>= 1.8.2)

BUNDLED WITH
2.3.24
2.3.26
5 changes: 3 additions & 2 deletions docs/_config.yml
@@ -24,7 +24,7 @@ baseurl : # does not include hostname
title : Spark NLP
description: > # this means to ignore newlines until "Language & timezone"
High Performance NLP with Apache Spark
-sparknlp_version: 6.1.5 # Version to be substituted in the documentation
+sparknlp_version: 6.2.0 # Version to be substituted in the documentation


## => Language and Timezone
@@ -104,7 +104,8 @@ paginate_path: /page:num # don't change this unless for special need

## => Sources
##############################
-sources: bootcdn # bootcdn (default), unpkg
+# sources: bootcdn # bootcdn (default), unpkg
+sources: unpkg # was bootcdn


## => Sharing
2 changes: 1 addition & 1 deletion docs/_config_local.yml
@@ -27,7 +27,7 @@ baseurl : # does not include hostname
title : Spark NLP
description: > # this means to ignore newlines until "Language & timezone"
High Performance NLP with Apache Spark
-sparknlp_version: 6.1.5 # Version to be substituted in the documentation
+sparknlp_version: 6.2.0 # Version to be substituted in the documentation


## => Language and Timezone