Merged
11 changes: 11 additions & 0 deletions CHANGELOG
Original file line number Diff line number Diff line change
@@ -1,3 +1,14 @@
=======
6.0.4
=======
----------------
New Features & Enhancements
----------------
* Introducing MiniLMEmbeddings (SPARKNLP-282)
* Introducing DataFrameOptimizer (SPARKNLP-1086)
* Added PDF Reader features (SPARKNLP-1161)


=======
6.0.3
=======
8 changes: 4 additions & 4 deletions README.md
@@ -63,7 +63,7 @@ $ java -version
$ conda create -n sparknlp python=3.7 -y
$ conda activate sparknlp
# spark-nlp by default is based on pyspark 3.x
$ pip install spark-nlp==6.0.3 pyspark==3.3.1
$ pip install spark-nlp==6.0.4 pyspark==3.3.1
```

In Python console or Jupyter `Python3` kernel:
@@ -129,7 +129,7 @@ For a quick example of using pipelines and models take a look at our official [d

### Apache Spark Support

Spark NLP *6.0.3* has been built on top of Apache Spark 3.4 while fully supports Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, 3.4.x, and 3.5.x
Spark NLP *6.0.4* has been built on top of Apache Spark 3.4 while fully supports Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, 3.4.x, and 3.5.x

| Spark NLP | Apache Spark 3.5.x | Apache Spark 3.4.x | Apache Spark 3.3.x | Apache Spark 3.2.x | Apache Spark 3.1.x | Apache Spark 3.0.x | Apache Spark 2.4.x | Apache Spark 2.3.x |
|-----------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|
@@ -159,7 +159,7 @@ Find out more about 4.x `SparkNLP` versions in our official [documentation](http

### Databricks Support

Spark NLP 6.0.3 has been tested and is compatible with the following runtimes:
Spark NLP 6.0.4 has been tested and is compatible with the following runtimes:

| **CPU** | **GPU** |
|--------------------|--------------------|
@@ -176,7 +176,7 @@ We are compatible with older runtimes. For a full list check databricks support

### EMR Support

Spark NLP 6.0.3 has been tested and is compatible with the following EMR releases:
Spark NLP 6.0.4 has been tested and is compatible with the following EMR releases:

| **EMR Release** |
|--------------------|
104 changes: 6 additions & 98 deletions build.sbt
@@ -1,12 +1,12 @@
import sbtassembly.MergeStrategy
import Dependencies.*
import M2Resolvers.m2Resolvers
import Dependencies._
import sbtassembly.MergeStrategy

name := getPackageName(is_silicon, is_gpu, is_aarch64)

organization := "com.johnsnowlabs.nlp"

version := "6.0.3"
version := "6.0.4"

(ThisBuild / scalaVersion) := scalaVer

@@ -34,100 +34,8 @@ Compile / doc / target := baseDirectory.value / "docs/api"
coverageExcludedPackages := ".*nlp.embeddings.*;.*ml.tensorflow.*;.*nlp.annotators.classifier.dl.*;" +
".*nlp.annotators.seq2seq.*;.*ml.*"

licenses += "Apache-2.0" -> url("https://opensource.org/licenses/Apache-2.0")

(ThisBuild / resolvers) := m2Resolvers

credentials += Credentials(Path.userHome / ".ivy2" / ".sbtcredentials")

sonatypeProfileName := "com.johnsnowlabs.nlp"

publishTo := sonatypePublishToBundle.value

sonatypeRepository := "https://s01.oss.sonatype.org/service/local"

sonatypeCredentialHost := "s01.oss.sonatype.org"

publishTo := {
val nexus = "https://s01.oss.sonatype.org/"
if (isSnapshot.value) Some("snapshots" at nexus + "content/repositories/snapshots")
else Some("releases" at nexus + "service/local/staging/deploy/maven2")
}

homepage := Some(url("https://sparknlp.org"))

scmInfo := Some(
ScmInfo(
url("https://github.com/JohnSnowLabs/spark-nlp"),
"scm:[email protected]:JohnSnowLabs/spark-nlp.git"))

(ThisBuild / developers) := List(
Developer(
id = "saifjsl",
name = "Saif Addin",
email = "[email protected]",
url = url("https://github.com/saifjsl")),
Developer(
id = "maziyarpanahi",
name = "Maziyar Panahi",
email = "[email protected]",
url = url("https://github.com/maziyarpanahi")),
Developer(
id = "albertoandreottiATgmail",
name = "Alberto Andreotti",
email = "[email protected]",
url = url("https://github.com/albertoandreottiATgmail")),
Developer(
id = "danilojsl",
name = "Danilo Burbano",
email = "[email protected]",
url = url("https://github.com/danilojsl")),
Developer(
id = "rohit13k",
name = "Rohit Kumar",
email = "[email protected]",
url = url("https://github.com/rohit13k")),
Developer(
id = "aleksei-ai",
name = "Aleksei Alekseev",
email = "[email protected]",
url = url("https://github.com/aleksei-ai")),
Developer(
id = "showy",
name = "Eduardo Muñoz",
email = "[email protected]",
url = url("https://github.com/showy")),
Developer(
id = "C-K-Loan",
name = "Christian Kasim Loan",
email = "[email protected]",
url = url("https://github.com/C-K-Loan")),
Developer(
id = "wolliq",
name = "Stefano Lori",
email = "[email protected]",
url = url("https://github.com/wolliq")),
Developer(
id = "vankov",
name = "Ivan Vankov",
email = "[email protected]",
url = url("https://github.com/vankov")),
Developer(
id = "alinapetukhova",
name = "Alina Petukhova",
email = "[email protected]",
url = url("https://github.com/alinapetukhova")),
Developer(
id = "hatrungduc",
name = "Devin Ha",
email = "[email protected]",
url = url("https://github.com/hatrungduc")),
Developer(
id = "ahmedlone127",
name = "Khawja Ahmed Lone",
email = "[email protected]",
url = url("https://github.com/ahmedlone127")))

lazy val analyticsDependencies = Seq(
"org.apache.spark" %% "spark-core" % sparkVer % Provided,
"org.apache.spark" %% "spark-mllib" % sparkVer % Provided)
@@ -164,8 +72,7 @@ lazy val utilDependencies = Seq(
exclude ("org.apache.logging.log4j", "log4j-api"),
scratchpad
exclude ("org.apache.logging.log4j", "log4j-api"),
pdfBox
)
pdfBox)

lazy val typedDependencyParserDependencies = Seq(junit)

@@ -238,7 +145,8 @@ lazy val root = (project in file("."))

(assembly / assemblyMergeStrategy) := {
case PathList("META-INF", "versions", "9", "module-info.class") => MergeStrategy.discard
case PathList("module-info.class") => MergeStrategy.discard // Discard any module-info.class globally
case PathList("module-info.class") =>
MergeStrategy.discard // Discard any module-info.class globally
case PathList("apache.commons.lang3", _ @_*) => MergeStrategy.discard
case PathList("org.apache.hadoop", _ @_*) => MergeStrategy.first
case PathList("com.amazonaws", _ @_*) => MergeStrategy.last
4 changes: 2 additions & 2 deletions conda/meta.yaml
@@ -1,13 +1,13 @@
{% set name = "spark-nlp" %}
{% set version = "6.0.3" %}
{% set version = "6.0.4" %}

package:
name: {{ name|lower }}
version: {{ version }}

source:
url: https://pypi.io/packages/source/{{ name[0] }}/{{ name }}/spark_nlp-{{ version }}.tar.gz
sha256: ff09f27c512401cff1ec3af572069b2e2af35b87a0f6737c5340538bac10faf7
sha256: 29daf034686ae428eaf83dd371ad095b42164dff40148039002cd3afa66389ee

build:
noarch: python
8 changes: 4 additions & 4 deletions docs/api/com/index.html
@@ -3,9 +3,9 @@
<head>
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no" />
<title>Spark NLP 6.0.3 ScalaDoc - com</title>
<meta name="description" content="Spark NLP 6.0.3 ScalaDoc - com" />
<meta name="keywords" content="Spark NLP 6.0.3 ScalaDoc com" />
<title>Spark NLP 6.0.4 ScalaDoc - com</title>
<meta name="description" content="Spark NLP 6.0.4 ScalaDoc - com" />
<meta name="keywords" content="Spark NLP 6.0.4 ScalaDoc com" />
<meta http-equiv="content-type" content="text/html; charset=UTF-8" />


@@ -28,7 +28,7 @@
</head>
<body>
<div id="search">
<span id="doc-title">Spark NLP 6.0.3 ScalaDoc<span id="doc-version"></span></span>
<span id="doc-title">Spark NLP 6.0.4 ScalaDoc<span id="doc-version"></span></span>
<span class="close-results"><span class="left">&lt;</span> Back</span>
<div id="textfilter">
<span class="input">
8 changes: 4 additions & 4 deletions docs/api/com/johnsnowlabs/client/CloudClient.html
@@ -3,9 +3,9 @@
<head>
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no" />
<title>Spark NLP 6.0.3 ScalaDoc - com.johnsnowlabs.client.CloudClient</title>
<meta name="description" content="Spark NLP 6.0.3 ScalaDoc - com.johnsnowlabs.client.CloudClient" />
<meta name="keywords" content="Spark NLP 6.0.3 ScalaDoc com.johnsnowlabs.client.CloudClient" />
<title>Spark NLP 6.0.4 ScalaDoc - com.johnsnowlabs.client.CloudClient</title>
<meta name="description" content="Spark NLP 6.0.4 ScalaDoc - com.johnsnowlabs.client.CloudClient" />
<meta name="keywords" content="Spark NLP 6.0.4 ScalaDoc com.johnsnowlabs.client.CloudClient" />
<meta http-equiv="content-type" content="text/html; charset=UTF-8" />


@@ -28,7 +28,7 @@
</head>
<body>
<div id="search">
<span id="doc-title">Spark NLP 6.0.3 ScalaDoc<span id="doc-version"></span></span>
<span id="doc-title">Spark NLP 6.0.4 ScalaDoc<span id="doc-version"></span></span>
<span class="close-results"><span class="left">&lt;</span> Back</span>
<div id="textfilter">
<span class="input">
8 changes: 4 additions & 4 deletions docs/api/com/johnsnowlabs/client/CloudManager.html
@@ -3,9 +3,9 @@
<head>
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no" />
<title>Spark NLP 6.0.3 ScalaDoc - com.johnsnowlabs.client.CloudManager</title>
<meta name="description" content="Spark NLP 6.0.3 ScalaDoc - com.johnsnowlabs.client.CloudManager" />
<meta name="keywords" content="Spark NLP 6.0.3 ScalaDoc com.johnsnowlabs.client.CloudManager" />
<title>Spark NLP 6.0.4 ScalaDoc - com.johnsnowlabs.client.CloudManager</title>
<meta name="description" content="Spark NLP 6.0.4 ScalaDoc - com.johnsnowlabs.client.CloudManager" />
<meta name="keywords" content="Spark NLP 6.0.4 ScalaDoc com.johnsnowlabs.client.CloudManager" />
<meta http-equiv="content-type" content="text/html; charset=UTF-8" />


@@ -28,7 +28,7 @@
</head>
<body>
<div id="search">
<span id="doc-title">Spark NLP 6.0.3 ScalaDoc<span id="doc-version"></span></span>
<span id="doc-title">Spark NLP 6.0.4 ScalaDoc<span id="doc-version"></span></span>
<span class="close-results"><span class="left">&lt;</span> Back</span>
<div id="textfilter">
<span class="input">
8 changes: 4 additions & 4 deletions docs/api/com/johnsnowlabs/client/CloudResources$.html
@@ -3,9 +3,9 @@
<head>
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no" />
<title>Spark NLP 6.0.3 ScalaDoc - com.johnsnowlabs.client.CloudResources</title>
<meta name="description" content="Spark NLP 6.0.3 ScalaDoc - com.johnsnowlabs.client.CloudResources" />
<meta name="keywords" content="Spark NLP 6.0.3 ScalaDoc com.johnsnowlabs.client.CloudResources" />
<title>Spark NLP 6.0.4 ScalaDoc - com.johnsnowlabs.client.CloudResources</title>
<meta name="description" content="Spark NLP 6.0.4 ScalaDoc - com.johnsnowlabs.client.CloudResources" />
<meta name="keywords" content="Spark NLP 6.0.4 ScalaDoc com.johnsnowlabs.client.CloudResources" />
<meta http-equiv="content-type" content="text/html; charset=UTF-8" />


@@ -28,7 +28,7 @@
</head>
<body>
<div id="search">
<span id="doc-title">Spark NLP 6.0.3 ScalaDoc<span id="doc-version"></span></span>
<span id="doc-title">Spark NLP 6.0.4 ScalaDoc<span id="doc-version"></span></span>
<span class="close-results"><span class="left">&lt;</span> Back</span>
<div id="textfilter">
<span class="input">
8 changes: 4 additions & 4 deletions docs/api/com/johnsnowlabs/client/CloudStorage.html
@@ -3,9 +3,9 @@
<head>
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no" />
<title>Spark NLP 6.0.3 ScalaDoc - com.johnsnowlabs.client.CloudStorage</title>
<meta name="description" content="Spark NLP 6.0.3 ScalaDoc - com.johnsnowlabs.client.CloudStorage" />
<meta name="keywords" content="Spark NLP 6.0.3 ScalaDoc com.johnsnowlabs.client.CloudStorage" />
<title>Spark NLP 6.0.4 ScalaDoc - com.johnsnowlabs.client.CloudStorage</title>
<meta name="description" content="Spark NLP 6.0.4 ScalaDoc - com.johnsnowlabs.client.CloudStorage" />
<meta name="keywords" content="Spark NLP 6.0.4 ScalaDoc com.johnsnowlabs.client.CloudStorage" />
<meta http-equiv="content-type" content="text/html; charset=UTF-8" />


@@ -28,7 +28,7 @@
</head>
<body>
<div id="search">
<span id="doc-title">Spark NLP 6.0.3 ScalaDoc<span id="doc-version"></span></span>
<span id="doc-title">Spark NLP 6.0.4 ScalaDoc<span id="doc-version"></span></span>
<span class="close-results"><span class="left">&lt;</span> Back</span>
<div id="textfilter">
<span class="input">
Original file line number Diff line number Diff line change
@@ -3,9 +3,9 @@
<head>
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no" />
<title>Spark NLP 6.0.3 ScalaDoc - com.johnsnowlabs.client.aws.AWSAnonymousCredentials</title>
<meta name="description" content="Spark NLP 6.0.3 ScalaDoc - com.johnsnowlabs.client.aws.AWSAnonymousCredentials" />
<meta name="keywords" content="Spark NLP 6.0.3 ScalaDoc com.johnsnowlabs.client.aws.AWSAnonymousCredentials" />
<title>Spark NLP 6.0.4 ScalaDoc - com.johnsnowlabs.client.aws.AWSAnonymousCredentials</title>
<meta name="description" content="Spark NLP 6.0.4 ScalaDoc - com.johnsnowlabs.client.aws.AWSAnonymousCredentials" />
<meta name="keywords" content="Spark NLP 6.0.4 ScalaDoc com.johnsnowlabs.client.aws.AWSAnonymousCredentials" />
<meta http-equiv="content-type" content="text/html; charset=UTF-8" />


@@ -28,7 +28,7 @@
</head>
<body>
<div id="search">
<span id="doc-title">Spark NLP 6.0.3 ScalaDoc<span id="doc-version"></span></span>
<span id="doc-title">Spark NLP 6.0.4 ScalaDoc<span id="doc-version"></span></span>
<span class="close-results"><span class="left">&lt;</span> Back</span>
<div id="textfilter">
<span class="input">
8 changes: 4 additions & 4 deletions docs/api/com/johnsnowlabs/client/aws/AWSBasicCredentials.html
@@ -3,9 +3,9 @@
<head>
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no" />
<title>Spark NLP 6.0.3 ScalaDoc - com.johnsnowlabs.client.aws.AWSBasicCredentials</title>
<meta name="description" content="Spark NLP 6.0.3 ScalaDoc - com.johnsnowlabs.client.aws.AWSBasicCredentials" />
<meta name="keywords" content="Spark NLP 6.0.3 ScalaDoc com.johnsnowlabs.client.aws.AWSBasicCredentials" />
<title>Spark NLP 6.0.4 ScalaDoc - com.johnsnowlabs.client.aws.AWSBasicCredentials</title>
<meta name="description" content="Spark NLP 6.0.4 ScalaDoc - com.johnsnowlabs.client.aws.AWSBasicCredentials" />
<meta name="keywords" content="Spark NLP 6.0.4 ScalaDoc com.johnsnowlabs.client.aws.AWSBasicCredentials" />
<meta http-equiv="content-type" content="text/html; charset=UTF-8" />


@@ -28,7 +28,7 @@
</head>
<body>
<div id="search">
<span id="doc-title">Spark NLP 6.0.3 ScalaDoc<span id="doc-version"></span></span>
<span id="doc-title">Spark NLP 6.0.4 ScalaDoc<span id="doc-version"></span></span>
<span class="close-results"><span class="left">&lt;</span> Back</span>
<div id="textfilter">
<span class="input">
8 changes: 4 additions & 4 deletions docs/api/com/johnsnowlabs/client/aws/AWSClient.html
@@ -3,9 +3,9 @@
<head>
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no" />
<title>Spark NLP 6.0.3 ScalaDoc - com.johnsnowlabs.client.aws.AWSClient</title>
<meta name="description" content="Spark NLP 6.0.3 ScalaDoc - com.johnsnowlabs.client.aws.AWSClient" />
<meta name="keywords" content="Spark NLP 6.0.3 ScalaDoc com.johnsnowlabs.client.aws.AWSClient" />
<title>Spark NLP 6.0.4 ScalaDoc - com.johnsnowlabs.client.aws.AWSClient</title>
<meta name="description" content="Spark NLP 6.0.4 ScalaDoc - com.johnsnowlabs.client.aws.AWSClient" />
<meta name="keywords" content="Spark NLP 6.0.4 ScalaDoc com.johnsnowlabs.client.aws.AWSClient" />
<meta http-equiv="content-type" content="text/html; charset=UTF-8" />


@@ -28,7 +28,7 @@
</head>
<body>
<div id="search">
<span id="doc-title">Spark NLP 6.0.3 ScalaDoc<span id="doc-version"></span></span>
<span id="doc-title">Spark NLP 6.0.4 ScalaDoc<span id="doc-version"></span></span>
<span class="close-results"><span class="left">&lt;</span> Back</span>
<div id="textfilter">
<span class="input">