Conversation

@ambauma commented Oct 18, 2017

What changes were proposed in this pull request?

This is the fix for the master branch applied to the 1.6 branch. My (unnamed) company will be using Spark 1.6 probably for another year. We have been blocked from having Spark 1.6 on our workstations until CVE-2017-7678 is patched, which SPARK-20393 does. I realize there will not be an official Spark 1.6.4 release, but it still seems wise to keep the code there patched for those who are stuck on that version. Otherwise I imagine several forks duplicating 1.6 compliance and security fixes.

How was this patch tested?

The patch came with unit tests. The test build passed. Manual testing on one of the affected screens showed the newline character removed. The screen display was the same regardless (HTML ignores newline characters).
[Screenshot: screenshot from 2017-10-18 14-18-17]

The patch itself is from previous pull requests associated with SPARK-20393. My original "work" was deciding what to apply to branch 1.6, and I license the work to the project under the project's open source license.
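
For context on what the patch does: the upstream fix (apache/spark#17686) adds a `stripXSS` helper to the web UI utilities that strips newline and quote sequences from request parameters and HTML-escapes what remains. Below is a minimal Python sketch of the same idea, for illustration only; the real helper is Scala, and the exact character set here is an approximation:

```python
import html
import re

# Approximation of the sanitizer idea from the upstream patch: drop raw and
# URL-encoded newlines plus single quotes, then HTML-escape the remainder.
_NEWLINE_AND_QUOTE = re.compile(r"(?i)(\r\n|\n|\r|%0D%0A|%0A|%0D|'|%27)")

def strip_xss(request_parameter):
    if request_parameter is None:
        return None
    return html.escape(_NEWLINE_AND_QUOTE.sub("", request_parameter))

# A crafted parameter loses its newline, and its script payload is neutralized:
print(strip_xss("id=1\n<script>alert('xss')</script>"))
# id=1&lt;script&gt;alert(xss)&lt;/script&gt;
```

This is why the screen renders identically before and after the patch: the stripped newline was invisible to HTML anyway, but it can no longer be used to break out of attribute values or split responses.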

@dongjoon-hyun (Member) left a comment

Hi, @ambauma.
Thank you for contributing.

  • SPARK-20393 should be backported to branch-2.0 first.
  • PR titles should be the same as the original one, except for the version numbers:
[SPARK-20393][WEBU UI][2.0] Strengthen Spark to prevent XSS vulnerabilities
[SPARK-20393][WEBU UI][1.6] Strengthen Spark to prevent XSS vulnerabilities

@ambauma (Author) commented Oct 18, 2017

Understood. Working on porting to 2.0...

@ambauma changed the title from "[SPARK-20393] [Core] Existing patch applied to 1.6 branch." to "[SPARK-20393][WEBU UI][1.6] Strengthen Spark to prevent XSS vulnerabilities" Oct 18, 2017
@srowen (Member) commented Oct 19, 2017

I don't think there are any more 1.x releases coming, and I doubt there will be more 2.0.x releases. Do you really need this in Spark, or is it something you can apply to your own release branch?

@ambauma (Author) commented Oct 19, 2017

I have a release in my fork for my immediate needs. However, Spark 1.6 is still included in Hortonworks and is the default in Cloudera. This patch addresses CVE-2017-7678. Some companies in strict regulatory environments may fail audits and be forced to remove Spark 1.6 if it is not patched. Rather than keeping security patches in forks, I think it makes sense to merge them back into the official branches that are still in active use. That way, if I get hit by a bus and CVE-2018-XXXX comes out, CVE-2017-7678 will already be covered and the work will not need to be duplicated.

@srowen (Member) commented Oct 19, 2017

(Spark 1.x is legacy in Cloudera, but it has its own 1.x branch anyway.)
I think it's not a big deal to backport if it goes into later branches first, sure. But I doubt there is another 1.x release here.

@ambauma (Author) commented Oct 19, 2017

I just posted the 2.0 pull request. #19538

@SparkQA commented Oct 19, 2017

Test build #3955 has finished for PR 19528 at commit ffe3e98.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@ambauma (Author) commented Oct 20, 2017

Working on duplicating PySpark failures...

@ambauma (Author) commented Oct 20, 2017

I was able to duplicate it. My working theory is that this is related to numpy 1.12.1. Here is my conda env:

ca-certificates 2017.08.26 h1d4fec5_0
certifi 2016.2.28 py34_0
intel-openmp 2018.0.0 h15fc484_7
libedit 3.1 heed3624_0
libffi 3.2.1 h4deb6c0_3
libgcc-ng 7.2.0 h7cc24e2_2
libgfortran 1.0 0
libstdcxx-ng 7.2.0 h7a57d05_2
mkl 2017.0.3 0
ncurses 6.0 h06874d7_1
numpy 1.12.1 py34_0
openblas 0.2.19 0
openssl 1.0.2l h077ae2c_5
pip 9.0.1 py34_1
python 3.4.5 0
readline 6.2 2
setuptools 27.2.0 py34_0
sqlite 3.13.0 0
tk 8.5.18 0
wheel 0.29.0 py34_0
xz 5.2.3 h2bcbf08_1
zlib 1.2.11 hfbfcf68_1

Commit message (title truncated): "…n LogisticRegressionModel"

## What changes were proposed in this pull request?

Fixed TypeError with Python 3 and numpy 1.12.1. NumPy's `reshape` no longer accepts floats as arguments as of 1.12. Also, Python 3 uses float division for `/`; we should use `//` to ensure that `_dataWithBiasSize` doesn't get set to a float.

## How was this patch tested?

Existing tests were run using Python 3 and numpy 1.12.

Author: Bago Amirbekian <[email protected]>

Closes apache#18081 from MrBago/BF-py3floatbug.
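
To make the failure mode concrete: under Python 3, `/` is true division and always yields a float, and NumPy 1.12 stopped accepting float dimensions in `reshape`, so the old line raises a TypeError. A minimal sketch of the breakage and the fix (version behavior as described in the commit message above):

```python
import numpy as np

coeff = np.zeros(12)
num_classes = 4

size = coeff.size / (num_classes - 1)    # Python 3 true division -> 4.0, a float
try:
    coeff.reshape(size, 1)               # numpy >= 1.12 rejects float dimensions
except TypeError as e:
    print(e)                             # 'float' object cannot be interpreted as an integer

size = coeff.size // (num_classes - 1)   # floor division keeps it an int -> 4
print(coeff.reshape(size, 1).shape)      # (4, 1)
```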
@ambauma (Author) commented Oct 20, 2017

I believe this is fixed. It's hard to say for sure without knowing the precise Python and numpy versions the build is using.


package org.apache.spark.ui.jobs

import javax.servlet.http.HttpServletRequest
Review comment (Member):

Hm, I'm not sure if this back-port is correct. This file's change doesn't look like it does anything and I don't see this change in the original: https://github.com/apache/spark/pull/17686/files

Review comment (Author):

Will look into this...

Review comment (Author):

Agreed, will remove.

@@ -0,0 +1,180 @@
/*
Review comment (Member):

Likewise this isn't part of the backport is it? https://github.com/apache/spark/pull/17686/files

Review comment (Author):

I'll look into this as well...

Review comment (Author):

I'm not sure what I did to make this whole file look new, but I've copied the 1.6 current and reapplied stripXSS locally. Waiting for my build to pass to commit again.

     self._weightsMatrix = None
 else:
-    self._dataWithBiasSize = self._coeff.size / (self._numClasses - 1)
+    self._dataWithBiasSize = self._coeff.size // (self._numClasses - 1)
Review comment (Member):

Nor this?

Review comment (Author):

I had to apply this to get past a Python unit test failure. My assumption is that the NewSparkPullRequestBuilder is on a different version of numpy than the one in use when the Spark 1.6 branch was last built. The current Python unit test failure looks like it has to do with a newer version of SciPy.

Review comment (Author):

This is already fixed in the 2.0 branch, by the way; it was just never applied to 1.6 (SPARK-20862).

@SparkQA commented Oct 20, 2017

Test build #3958 has finished for PR 19528 at commit cb1609b.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


import scala.collection.mutable.{HashMap, ListBuffer}
import scala.xml._
import scala.collection.JavaConverters._
Review comment (Member):

is this needed?

@felixcheung (Member) commented:

Jenkins, retest this please

@SparkQA commented Oct 23, 2017

Test build #82990 has finished for PR 19528 at commit 76ad8c5.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@ambauma (Author) commented Oct 27, 2017

I'm unable to duplicate the PySpark failures locally. I assume I need a specific version of SciPy to duplicate the error. Is there a way I could get the versions the build server is running, for both Python 2 and Python 3.4? Something like:
`sorted(["%s==%s" % (i.key, i.version) for i in pip.get_installed_distributions()])`
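
(For what it's worth, a snippet along those lines that runs on both Python 2 and 3, using the `pkg_resources` module that ships with setuptools; this is just a sketch of the kind of listing that would help, not necessarily how the build workers are set up:)

```python
import pkg_resources

# Roughly equivalent to `pip freeze`: every installed distribution as name==version.
for dist in sorted("%s==%s" % (d.project_name, d.version)
                   for d in pkg_resources.working_set):
    print(dist)
```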

@felixcheung (Member) commented:

@shaneknapp - could you help check which version of SciPy Jenkins is running? Thanks!

@shaneknapp (Contributor) commented:

python2.7: 0.17.0
python3: 0.18.1

@shaneknapp (Contributor) commented:

ok to test

@felixcheung (Member) commented:

Jenkins test this please

@SparkQA commented Jan 20, 2018

Test build #86410 has started for PR 19528 at commit 76ad8c5.

@jiangxb1987 (Contributor) commented:

retest this please

@SparkQA commented Apr 4, 2018

Test build #4149 has finished for PR 19528 at commit 76ad8c5.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jiangxb1987 (Contributor) commented:

retest this please

@felixcheung (Member) commented:

Jenkins test this please

@SparkQA commented Jun 12, 2018

Test build #91690 has started for PR 19528 at commit 76ad8c5.

@HyukjinKwon (Member) commented:

retest this please

@SparkQA commented Jul 16, 2018

Test build #93074 has started for PR 19528 at commit 76ad8c5.

@HyukjinKwon (Member) commented:

retest this please

@SparkQA commented Jul 16, 2018

Test build #93094 has finished for PR 19528 at commit 76ad8c5.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins commented:

Can one of the admins verify this patch?

@srowen mentioned this pull request Aug 20, 2018
@asfgit closed this in b8788b3 Aug 21, 2018