Merged
Changes from all commits (482 commits)
254877c
[SPARK-20164][SQL] AnalysisException not tolerant of null query plan.
kunalkhamar Mar 31, 2017
c4c03ee
[SPARK-20084][CORE] Remove internal.metrics.updatedBlockStatuses from…
rdblue Mar 31, 2017
b2349e6
[SPARK-20160][SQL] Move ParquetConversions and OrcConversions Out Of …
gatorsmile Mar 31, 2017
567a50a
[SPARK-20165][SS] Resolve state encoder's deserializer in driver in F…
tdas Mar 31, 2017
cf5963c
[SPARK-20177] Document about compression way has some little detail ch…
Apr 1, 2017
89d6822
[SPARK-19148][SQL][FOLLOW-UP] do not expose the external table concep…
gatorsmile Apr 1, 2017
2287f3d
[SPARK-20186][SQL] BroadcastHint should use child's stats
Apr 1, 2017
d40cbb8
[SPARK-20143][SQL] DataType.fromJson should throw an exception with b…
HyukjinKwon Apr 2, 2017
76de2d1
[SPARK-20123][BUILD] SPARK_HOME variable might have spaces in it(e.g.…
Apr 2, 2017
657cb95
[SPARK-20173][SQL][HIVE-THRIFTSERVER] Throw NullPointerException when…
Apr 2, 2017
93dbfe7
[SPARK-20159][SPARKR][SQL] Support all catalog API in R
felixcheung Apr 2, 2017
2a903a1
[SPARK-19985][ML] Fixed copy method for some ML Models
BryanCutler Apr 3, 2017
cff11fd
[SPARK-20166][SQL] Use XXX for ISO 8601 timezone instead of ZZ (FastD…
HyukjinKwon Apr 3, 2017
364b0db
[MINOR][DOCS] Replace non-breaking space to normal spaces that breaks…
HyukjinKwon Apr 3, 2017
fb5869f
[SPARK-9002][CORE] KryoSerializer initialization does not include 'Ar…
Apr 3, 2017
4d28e84
[SPARK-19969][ML] Imputer doc and example
YY-OnCall Apr 3, 2017
4fa1a43
[SPARK-19641][SQL] JSON schema inference in DROPMALFORMED mode produc…
HyukjinKwon Apr 3, 2017
703c42c
[SPARK-20194] Add support for partition pruning to in-memory catalog
adrian-ionescu Apr 3, 2017
58c9e6e
[SPARK-20145] Fix range case insensitive bug in SQL
samelamin Apr 4, 2017
e7877fd
[SPARK-19408][SQL] filter estimation on two columns of same table
ron8hu Apr 4, 2017
3bfb639
[SPARK-10364][SQL] Support Parquet logical type TIMESTAMP_MILLIS
dilipbiswal Apr 4, 2017
51d3c85
[SPARK-20067][SQL] Unify and Clean Up Desc Commands Using Catalog Int…
gatorsmile Apr 4, 2017
b34f766
[SPARK-19825][R][ML] spark.ml R API for FPGrowth
zero323 Apr 4, 2017
c95fbea
[SPARK-20190][APP-ID] applications//jobs' in rest api,status should b…
Apr 4, 2017
26e7bca
[SPARK-20198][SQL] Remove the inconsistency in table/function name co…
gatorsmile Apr 4, 2017
11238d4
[SPARK-18278][SCHEDULER] Documentation to point to Kubernetes cluster…
foxish Apr 4, 2017
0736980
[SPARK-20191][YARN] Crate wrapper for RackResolver so tests can overr…
Apr 4, 2017
0e2ee82
[MINOR][R] Reorder `Collate` fields in DESCRIPTION file
HyukjinKwon Apr 4, 2017
402bf2a
[SPARK-20204][SQL] remove SimpleCatalystConf and CatalystConf type alias
cloud-fan Apr 4, 2017
295747e
[SPARK-19716][SQL] support by-name resolution for struct type element…
cloud-fan Apr 4, 2017
a59759e
[SPARK-20183][ML] Added outlierRatio arg to MLTestingUtils.testOutlie…
Apr 5, 2017
b28bbff
[SPARK-20003][ML] FPGrowthModel setMinConfidence should affect rules …
YY-OnCall Apr 5, 2017
c1b8b66
[SPARKR][DOC] update doc for fpgrowth
felixcheung Apr 5, 2017
b6e7103
Small doc fix for ReuseSubquery.
rxin Apr 5, 2017
dad499f
[SPARK-20209][SS] Execute next trigger immediately if previous batch …
tdas Apr 5, 2017
6f09dc7
[SPARK-20042][WEB UI] Fix log page buttons for reverse proxy mode
okoethibm Apr 5, 2017
71c3c48
[SPARK-19807][WEB UI] Add reason for cancellation when a stage is kil…
Apr 5, 2017
a2d8d76
[SPARK-20223][SQL] Fix typo in tpcds q77.sql
Apr 5, 2017
e277399
[SPARK-19454][PYTHON][SQL] DataFrame.replace improvements
zero323 Apr 5, 2017
9543fc0
[SPARK-20224][SS] Updated docs for streaming dropDuplicates and mapGr…
tdas Apr 5, 2017
9d68c67
[SPARK-20204][SQL][FOLLOWUP] SQLConf should react to change in defaul…
dilipbiswal Apr 6, 2017
1220605
[SPARK-20214][ML] Make sure converted csc matrix has sorted indices
viirya Apr 6, 2017
4000f12
[SPARK-20231][SQL] Refactor star schema code for the subsequent star …
ioana-delaney Apr 6, 2017
5142e5d
[SPARK-20217][CORE] Executor should not fail stage if killed task thr…
ericl Apr 6, 2017
e156b5d
[SPARK-19953][ML] Random Forest Models use parent UID when being fit
BryanCutler Apr 6, 2017
c8fc1f3
[SPARK-20085][MESOS] Configurable mesos labels for executors
Apr 6, 2017
d009fb3
[SPARK-20064][PYSPARK] Bump the PySpark verison number to 2.2
rubenljanssen Apr 6, 2017
bccc330
[SPARK-20196][PYTHON][SQL] update doc for catalog functions for all l…
felixcheung Apr 6, 2017
5a693b4
[SPARK-20195][SPARKR][SQL] add createTable catalog API and deprecate …
felixcheung Apr 6, 2017
a449162
[SPARK-17019][CORE] Expose on-heap and off-heap memory usage in vario…
jerryshao Apr 6, 2017
8129d59
[MINOR][DOCS] Fix typo in Hive Examples
Apr 6, 2017
626b4ca
[SPARK-19495][SQL] Make SQLConf slightly more extensible - addendum
rxin Apr 7, 2017
ad3cc13
[SPARK-20245][SQL][MINOR] pass output to LogicalRelation directly
cloud-fan Apr 7, 2017
1a52a62
[SPARK-20076][ML][PYSPARK] Add Python interface for ml.stats.Correlation
viirya Apr 7, 2017
9e0893b
[SPARK-20218][DOC][APP-ID] applications//stages' in REST API,add desc…
Apr 7, 2017
870b9d9
[SPARK-20026][DOC][SPARKR] Add Tweedie example for SparkR in programm…
actuaryzhang Apr 7, 2017
8feb799
[SPARK-20197][SPARKR] CRAN check fail with package installation
felixcheung Apr 7, 2017
1ad73f0
[SPARK-20258][DOC][SPARKR] Fix SparkR logistic regression example in …
actuaryzhang Apr 7, 2017
589f3ed
[SPARK-20255] Move listLeafFiles() to InMemoryFileIndex
adrian-ionescu Apr 7, 2017
7577e9c
[SPARK-20246][SQL] should not push predicate down through aggregate w…
cloud-fan Apr 8, 2017
e1afc4d
[SPARK-20262][SQL] AssertNotNull should throw NullPointerException
rxin Apr 8, 2017
34fc48f
[MINOR] Issue: Change "slice" vs "partition" in exception messages (a…
asmith26 Apr 9, 2017
1f0de3c
[SPARK-19991][CORE][YARN] FileSegmentManagedBuffer performance improv…
srowen Apr 9, 2017
261eaf5
[SPARK-20260][MLLIB] String interpolation required for error message
Apr 9, 2017
7a63f5e
[SPARK-20253][SQL] Remove unnecessary nullchecks of a return value fr…
kiszk Apr 10, 2017
7bfa05e
[SPARK-20264][SQL] asm should be non-test dependency in sql/core
rxin Apr 10, 2017
1a0bc41
[SPARK-20270][SQL] na.fill should not change the values in long or in…
Apr 10, 2017
3d7f201
[SPARK-20229][SQL] add semanticHash to QueryPlan
cloud-fan Apr 10, 2017
4f7d49b
[SPARK-20243][TESTS] DebugFilesystem.assertNoOpenStreams thread race
bogdanrdc Apr 10, 2017
5acaf8c
[SPARK-19518][SQL] IGNORE NULLS in first / last in SQL
HyukjinKwon Apr 10, 2017
fd711ea
[SPARK-20273][SQL] Disallow Non-deterministic Filter push-down into J…
gatorsmile Apr 10, 2017
a26e3ed
[SPARK-20156][CORE][SQL][STREAMING][MLLIB] Java String toLowerCase "T…
srowen Apr 10, 2017
f6dd8e0
[SPARK-20280][CORE] FileStatusCache Weigher integer overflow
bogdanrdc Apr 10, 2017
f9a50ba
[SPARK-20285][TESTS] Increase the pyspark streaming test timeout to 3…
zsxwing Apr 10, 2017
a35b9d9
[SPARK-20282][SS][TESTS] Write the commit log first to fix a race con…
zsxwing Apr 10, 2017
379b0b0
[SPARK-20283][SQL] Add preOptimizationBatches
rxin Apr 10, 2017
734dfbf
[SPARK-17564][TESTS] Fix flaky RequestTimeoutIntegrationSuite.further…
zsxwing Apr 11, 2017
0d2b796
[SPARK-20097][ML] Fix visibility discrepancy with numInstances and de…
BenFradet Apr 11, 2017
d11ef3d
Document Master URL format in high availability set up
MirrorZ Apr 11, 2017
c870698
[SPARK-20274][SQL] support compatible array element type in encoder
cloud-fan Apr 11, 2017
cd91f96
[SPARK-20175][SQL] Exists should not be evaluated in Join operator
viirya Apr 11, 2017
123b4fb
[SPARK-20289][SQL] Use StaticInvoke to box primitive types
rxin Apr 11, 2017
6297697
[SPARK-19505][PYTHON] AttributeError on Exception.message in Python3
Apr 11, 2017
cde9e32
[MINOR][DOCS] Update supported versions for Hive Metastore
dongjoon-hyun Apr 12, 2017
8ad63ee
[SPARK-20291][SQL] NaNvl(FloatType, NullType) should not be cast to N…
Apr 12, 2017
b14bfc3
[SPARK-19993][SQL] Caching logical plans containing subquery expressi…
dilipbiswal Apr 12, 2017
b938438
[MINOR][DOCS] Fix spacings in Structured Streaming Programming Guide
dongjinleekr Apr 12, 2017
bca4259
[MINOR][DOCS] JSON APIs related documentation fixes
HyukjinKwon Apr 12, 2017
044f7ec
[SPARK-20298][SPARKR][MINOR] fixed spelling mistake "charactor"
bdwyer2 Apr 12, 2017
ffc57b0
[SPARK-20302][SQL] Short circuit cast when from and to types are stru…
rxin Apr 12, 2017
2e1fd46
[SPARK-20296][TRIVIAL][DOCS] Count distinct error message for streaming
jtoka Apr 12, 2017
ceaf77a
[SPARK-18692][BUILD][DOCS] Test Java 8 unidoc build on Jenkins
HyukjinKwon Apr 12, 2017
504e62e
[SPARK-20303][SQL] Rename createTempFunction to registerFunction
gatorsmile Apr 12, 2017
5408553
[SPARK-20304][SQL] AssertNotNull should not include path in string re…
rxin Apr 12, 2017
99a9473
[SPARK-19570][PYSPARK] Allow to disable hive in pyspark shell
zjffdu Apr 12, 2017
924c424
[SPARK-20301][FLAKY-TEST] Fix Hadoop Shell.runCommand flakiness in St…
brkyvz Apr 12, 2017
a7b430b
[SPARK-15354][FLAKY-TEST] TopologyAwareBlockReplicationPolicyBehavior…
cloud-fan Apr 13, 2017
c5f1cc3
[SPARK-20131][CORE] Don't use `this` lock in StandaloneSchedulerBacke…
zsxwing Apr 13, 2017
ec68d8f
[SPARK-20189][DSTREAM] Fix spark kinesis testcases to remove deprecat…
yashs360 Apr 13, 2017
095d1cb
[SPARK-20265][MLLIB] Improve Prefix'span pre-processing efficiency
Syrux Apr 13, 2017
a4293c2
[SPARK-20284][CORE] Make {Des,S}erializationStream extend Closeable
Apr 13, 2017
fbe4216
[SPARK-20233][SQL] Apply star-join filter heuristics to dynamic progr…
ioana-delaney Apr 13, 2017
8ddf0d2
[SPARK-20232][PYTHON] Improve combineByKey docs
Apr 13, 2017
7536e28
[SPARK-20038][SQL] FileFormatWriter.ExecuteWriteTask.releaseResources…
steveloughran Apr 13, 2017
fb036c4
[SPARK-20318][SQL] Use Catalyst type for min/max in ColumnStat for ea…
Apr 14, 2017
98b41ec
[SPARK-20316][SQL] Val and Var should strictly follow the Scala syntax
Apr 15, 2017
35e5ae4
[SPARK-19716][SQL][FOLLOW-UP] UnresolvedMapObjects should always be s…
cloud-fan Apr 16, 2017
e090f3c
[SPARK-20335][SQL] Children expressions of Hive UDF impacts the deter…
gatorsmile Apr 16, 2017
a888fed
[SPARK-19740][MESOS] Add support in Spark to pass arbitrary parameter…
Apr 16, 2017
ad935f5
[SPARK-20343][BUILD] Add avro dependency in core POM to resolve build…
HyukjinKwon Apr 16, 2017
86d251c
[SPARK-20278][R] Disable 'multiple_dots_linter' lint rule that is aga…
HyukjinKwon Apr 16, 2017
24f09b3
[SPARK-19828][R][FOLLOWUP] Rename asJsonArray to as.json.array in fro…
HyukjinKwon Apr 17, 2017
01ff035
[SPARK-20349][SQL] ListFunctions returns duplicate functions after us…
gatorsmile Apr 17, 2017
e5fee3e
[SPARK-17647][SQL] Fix backslash escaping in 'LIKE' patterns.
jodersky Apr 17, 2017
0075562
Typo fix: distitrbuted -> distributed
ash211 Apr 18, 2017
33ea908
[TEST][MINOR] Replace repartitionBy with distribute in CollapseRepart…
jaceklaskowski Apr 18, 2017
b0a1e93
[SPARK-17647][SQL][FOLLOWUP][MINOR] fix typo
felixcheung Apr 18, 2017
07fd94e
[SPARK-20344][SCHEDULER] Duplicate call in FairSchedulableBuilder.add…
snazy Apr 18, 2017
d4f10cb
[SPARK-20343][BUILD] Force Avro 1.7.7 in sbt build to resolve build f…
HyukjinKwon Apr 18, 2017
321b4f0
[SPARK-20366][SQL] Fix recursive join reordering: inside joins are no…
Apr 18, 2017
1f81dda
[SPARK-20354][CORE][REST-API] When I request access to the 'http: //i…
Apr 18, 2017
f654b39
[SPARK-20360][PYTHON] reprs for interpreters
rgbkrk Apr 18, 2017
74aa0df
[SPARK-20377][SS] Fix JavaStructuredSessionization example
tdas Apr 18, 2017
e468a96
[SPARK-20254][SQL] Remove unnecessary data conversion for Dataset wit…
kiszk Apr 19, 2017
702d85a
[SPARK-20208][R][DOCS] Document R fpGrowth support
zero323 Apr 19, 2017
608bf30
[SPARK-20359][SQL] Avoid unnecessary execution in EliminateOuterJoin …
koertkuipers Apr 19, 2017
773754b
[SPARK-20356][SQL] Pruned InMemoryTableScanExec should have correct o…
viirya Apr 19, 2017
3537876
[SPARK-20343][BUILD] Avoid Unidoc build only if Hadoop 2.6 is explici…
HyukjinKwon Apr 19, 2017
71a8e9d
[SPARK-20036][DOC] Note incompatible dependencies on org.apache.kafka…
koeninger Apr 19, 2017
4fea784
[SPARK-20397][SPARKR][SS] Fix flaky test: test_streaming.R.Terminated…
zsxwing Apr 19, 2017
63824b2
[SPARK-20350] Add optimization rules to apply Complementation Laws.
ptkool Apr 20, 2017
39e303a
[MINOR][SS] Fix a missing space in UnsupportedOperationChecker error …
zsxwing Apr 20, 2017
dd6d55d
[SPARK-20398][SQL] range() operator should include cancellation reaso…
ericl Apr 20, 2017
bdc6056
Fixed typos in docs
Apr 20, 2017
46c5749
[SPARK-20375][R] R wrappers for array and map
zero323 Apr 20, 2017
55bea56
[SPARK-20156][SQL][FOLLOW-UP] Java String toLowerCase "Turkish locale…
gatorsmile Apr 20, 2017
c6f62c5
[SPARK-20405][SQL] Dataset.withNewExecutionId should be private
rxin Apr 20, 2017
b91873d
[SPARK-20409][SQL] fail early if aggregate function in GROUP BY
cloud-fan Apr 20, 2017
c5a31d1
[SPARK-20407][TESTS] ParquetQuerySuite 'Enabling/disabling ignoreCorr…
bogdanrdc Apr 20, 2017
b2ebadf
[SPARK-20358][CORE] Executors failing stage on interrupted exception …
ericl Apr 20, 2017
d95e4d9
[SPARK-20334][SQL] Return a better error message when correlated pred…
dilipbiswal Apr 20, 2017
0332063
[SPARK-20410][SQL] Make sparkConf a def in SharedSQLContext
hvanhovell Apr 20, 2017
592f5c8
[SPARK-20172][CORE] Add file permission check when listing files in F…
jerryshao Apr 20, 2017
0368eb9
[SPARK-20367] Properly unescape column names of partitioning columns …
juliuszsompolski Apr 21, 2017
760c8d0
[SPARK-20329][SQL] Make timezone aware expression without timezone un…
hvanhovell Apr 21, 2017
48d760d
[SPARK-20281][SQL] Print the identical Range parameters of SparkConte…
maropu Apr 21, 2017
e2b3d23
[SPARK-20420][SQL] Add events to the external catalog
hvanhovell Apr 21, 2017
3476799
Small rewording about history server use case
dud225 Apr 21, 2017
c9e6035
[SPARK-20412] Throw ParseException from visitNonOptionalPartitionSpec…
juliuszsompolski Apr 21, 2017
a750a59
[SPARK-20341][SQL] Support BigInt's value that does not fit in long v…
kiszk Apr 21, 2017
eb00378
[SPARK-20423][ML] fix MLOR coeffs centering when reg == 0
WeichenXu123 Apr 21, 2017
fd648bf
[SPARK-20371][R] Add wrappers for collect_list and collect_set
zero323 Apr 21, 2017
ad29040
[SPARK-20401][DOC] In the spark official configuration document, the …
Apr 21, 2017
05a4514
[SPARK-20386][SPARK CORE] modify the log info if the block exists on …
eatoncys Apr 22, 2017
b3c572a
[SPARK-20430][SQL] Initialise RangeExec parameters in a driver side
maropu Apr 22, 2017
8765bc1
[SPARK-20132][DOCS] Add documentation for column string functions
map222 Apr 23, 2017
2eaf4f3
[SPARK-20385][WEB-UI] Submitted Time' field, the date format needs to…
Apr 23, 2017
e9f9715
[BUILD] Close stale PRs
maropu Apr 24, 2017
776a2c0
[SPARK-20439][SQL] Fix Catalog API listTables and getTable when faile…
gatorsmile Apr 24, 2017
90264ac
[SPARK-18901][ML] Require in LR LogisticAggregator is redundant
wangmiao1981 Apr 24, 2017
8a272dd
[SPARK-20438][R] SparkR wrappers for split and repeat
zero323 Apr 24, 2017
5280d93
[SPARK-20239][CORE] Improve HistoryServer's ACL mechanism
jerryshao Apr 25, 2017
f44c8a8
[SPARK-20453] Bump master branch version to 2.3.0-SNAPSHOT
JoshRosen Apr 25, 2017
31345fd
[SPARK-20451] Filter out nested mapType datatypes from sort order in …
sameeragarwal Apr 25, 2017
c8f1219
[SPARK-20455][DOCS] Fix Broken Docker IT Docs
original-brownbear Apr 25, 2017
0bc7a90
[SPARK-20404][CORE] Using Option(name) instead of Some(name)
szhem Apr 25, 2017
387565c
[SPARK-18901][FOLLOWUP][ML] Require in LR LogisticAggregator is redun…
wangmiao1981 Apr 25, 2017
67eef47
[SPARK-20449][ML] Upgrade breeze version to 0.13.1
yanboliang Apr 25, 2017
0a7f5f2
[SPARK-5484][GRAPHX] Periodically do checkpoint in Pregel
Apr 25, 2017
caf3920
[SPARK-18127] Add hooks and extension points to Spark
sameeragarwal Apr 26, 2017
57e1da3
[SPARK-16548][SQL] Inconsistent error handling in JSON parsing SQL fu…
Apr 26, 2017
df58a95
[SPARK-20437][R] R wrappers for rollup and cube
zero323 Apr 26, 2017
7a36525
[SPARK-20400][DOCS] Remove References to 3rd Party Vendor Tools
Apr 26, 2017
7fecf51
[SPARK-19812] YARN shuffle service fails to relocate recovery DB acro…
tgravescs Apr 26, 2017
dbb06c6
[MINOR][ML] Fix some PySpark & SparkR flaky tests
yanboliang Apr 26, 2017
66dd5b8
[SPARK-20391][CORE] Rename memory related fields in ExecutorSummay
jerryshao Apr 26, 2017
99c6cf9
[SPARK-20473] Enabling missing types in ColumnVector.Array
michal-databricks Apr 26, 2017
a277ae8
[SPARK-20474] Fixing OnHeapColumnVector reallocation
michal-databricks Apr 26, 2017
2ba1eba
[SPARK-12868][SQL] Allow adding jars from hdfs
weiqingy Apr 26, 2017
66636ef
[SPARK-20435][CORE] More thorough redaction of sensitive information
markgrover Apr 27, 2017
b4724db
[SPARK-20425][SQL] Support a vertical display mode for Dataset.show
maropu Apr 27, 2017
b58cf77
[DOCS][MINOR] Add missing since to SparkR repeat_string note.
zero323 Apr 27, 2017
ba76662
[SPARK-20208][DOCS][FOLLOW-UP] Add FP-Growth to SparkR programming guide
zero323 Apr 27, 2017
7633933
[SPARK-20483] Mesos Coarse mode may starve other Mesos frameworks
dgshep Apr 27, 2017
561e9cc
[SPARK-20421][CORE] Mark internal listeners as deprecated.
Apr 27, 2017
85c6ce6
[SPARK-20426] Lazy initialization of FileSegmentManagedBuffer for shu…
Apr 27, 2017
26ac2ce
[SPARK-20482][SQL] Resolving Casts is too strict on having time zone set
rednaxelafx Apr 27, 2017
a4aa466
[SPARK-20487][SQL] `HiveTableScan` node is quite verbose in explained…
tejasapatil Apr 27, 2017
039e32c
[SPARK-20483][MINOR] Test for Mesos Coarse mode may starve other Meso…
dgshep Apr 27, 2017
606432a
[SPARK-20047][ML] Constrained Logistic Regression
yanboliang Apr 27, 2017
01c999e
[SPARK-20461][CORE][SS] Use UninterruptibleThread for Executor and fi…
zsxwing Apr 27, 2017
823baca
[SPARK-20452][SS][KAFKA] Fix a potential ConcurrentModificationExcept…
zsxwing Apr 27, 2017
b90bf52
[SPARK-12837][CORE] Do not send the name of internal accumulator to e…
cloud-fan Apr 28, 2017
7fe8249
[SPARKR][DOC] Document LinearSVC in R programming guide
wangmiao1981 Apr 28, 2017
e3c8160
[SPARK-20476][SQL] Block users to create a table that use commas in t…
gatorsmile Apr 28, 2017
59e3a56
[SPARK-14471][SQL] Aliases in SELECT could be used in GROUP BY
maropu Apr 28, 2017
8c911ad
[SPARK-20465][CORE] Throws a proper exception when any temp directory…
HyukjinKwon Apr 28, 2017
733b81b
[SPARK-20496][SS] Bug in KafkaWriter Looks at Unanalyzed Plans
Apr 28, 2017
5d71f3d
[SPARK-20514][CORE] Upgrade Jetty to 9.3.11.v20160721
markgrover Apr 28, 2017
ebff519
[SPARK-20471] Remove AggregateBenchmark testsuite warning: Two level …
heary-cao Apr 28, 2017
77bcd77
[SPARK-19525][CORE] Add RDD checkpoint compression support
Apr 28, 2017
814a61a
[SPARK-20487][SQL] Display `serde` for `HiveTableScan` node in explai…
tejasapatil Apr 29, 2017
b28c3bc
[SPARK-20477][SPARKR][DOC] Document R bisecting k-means in R programm…
wangmiao1981 Apr 29, 2017
add9d1b
[SPARK-19791][ML] Add doc and example for fpgrowth
YY-OnCall Apr 29, 2017
ee694cd
[SPARK-20533][SPARKR] SparkR Wrappers Model should be private and val…
wangmiao1981 Apr 29, 2017
70f1bcd
[SPARK-20493][R] De-duplicate parse logics for DDL-like type strings …
HyukjinKwon Apr 29, 2017
d228cd0
[SPARK-20442][PYTHON][DOCS] Fill up documentations for functions in C…
HyukjinKwon Apr 29, 2017
4d99b95
[SPARK-20521][DOC][CORE] The default of 'spark.worker.cleanup.appData…
Apr 30, 2017
1ee494d
[SPARK-20492][SQL] Do not print empty parentheses for invalid primiti…
HyukjinKwon Apr 30, 2017
ae3df4e
[SPARK-20535][SPARKR] R wrappers for explode_outer and posexplode_outer
zero323 Apr 30, 2017
6613046
[MINOR][DOCS][PYTHON] Adding missing boolean type for replacement val…
May 1, 2017
80e9cf1
[SPARK-20490][SPARKR] Add R wrappers for eqNullSafe and ! / not
zero323 May 1, 2017
a355b66
[SPARK-20541][SPARKR][SS] support awaitTermination without timeout
felixcheung May 1, 2017
f0169a1
[SPARK-20290][MINOR][PYTHON][SQL] Add PySpark wrapper for eqNullSafe
zero323 May 1, 2017
6b44c4d
[SPARK-20534][SQL] Make outer generate exec return empty rows
hvanhovell May 1, 2017
ab30590
[SPARK-20517][UI] Fix broken history UI download link
jerryshao May 1, 2017
2a45102
Merge branch 'master' into rk/merge-upstream
May 1, 2017
6fc6cf8
[SPARK-20464][SS] Add a job group and description for streaming queri…
kunalkhamar May 1, 2017
2b2dd08
[SPARK-20540][CORE] Fix unstable executor requests.
rdblue May 1, 2017
af726cd
[SPARK-20459][SQL] JdbcUtils throws IllegalStateException: Cause alre…
srowen May 2, 2017
259860d
[SPARK-20463] Add support for IS [NOT] DISTINCT FROM.
ptkool May 2, 2017
943a684
[SPARK-20548] Disable ReplSuite.newProductSeqEncoder with REPL define…
sameeragarwal May 2, 2017
d20a976
[SPARK-20192][SPARKR][DOC] SparkR migration guide to 2.2.0
felixcheung May 2, 2017
90d77e9
[SPARK-20532][SPARKR] Implement grouping and grouping_id
zero323 May 2, 2017
afb21bf
[SPARK-20537][CORE] Fixing OffHeapColumnVector reallocation
kiszk May 2, 2017
86174ea
[SPARK-20549] java.io.CharConversionException: Invalid UTF-32' in Jso…
brkyvz May 2, 2017
e300a5a
[SPARK-20300][ML][PYSPARK] Python API for ALSModel.recommendForAllUse…
May 2, 2017
b1e639a
[SPARK-19235][SQL][TEST][FOLLOW-UP] Enable Test Cases in DDLSuite wit…
gatorsmile May 2, 2017
13f47dc
[SPARK-20490][SPARKR][DOC] add family tag for not function
felixcheung May 2, 2017
ef3df91
[SPARK-20421][CORE] Add a missing deprecation tag.
May 2, 2017
b946f31
[SPARK-20558][CORE] clear InheritableThreadLocal variables in SparkCo…
cloud-fan May 3, 2017
6235132
[SPARK-20567] Lazily bind in GenerateExec
marmbrus May 3, 2017
db2fb84
[SPARK-6227][MLLIB][PYSPARK] Implement PySpark wrappers for SVD and P…
MechCoder May 3, 2017
16fab6b
[SPARK-20523][BUILD] Clean up build warnings for 2.2.0 release
srowen May 3, 2017
7f96f2d
[SPARK-16957][MLLIB] Use midpoints for split values.
facaiy May 3, 2017
27f543b
[SPARK-20441][SPARK-20432][SS] Within the same streaming query, one S…
lw-lin May 3, 2017
527fc5d
[SPARK-20576][SQL] Support generic hint function in Dataset/DataFrame
rxin May 3, 2017
6b9e49d
[SPARK-19965][SS] DataFrame batch reader may fail to infer partitions…
lw-lin May 3, 2017
13eb37c
[MINOR][SQL] Fix the test title from =!= to <=>, remove a duplicated …
HyukjinKwon May 3, 2017
02bbe73
[SPARK-20584][PYSPARK][SQL] Python generic hint support
zero323 May 4, 2017
fc472bd
[SPARK-20543][SPARKR] skip tests when running on CRAN
felixcheung May 4, 2017
b8302cc
[SPARK-20015][SPARKR][SS][DOC][EXAMPLE] Document R Structured Streami…
felixcheung May 4, 2017
9c36aa2
[SPARK-20585][SPARKR] R generic hint support
zero323 May 4, 2017
f21897f
[SPARK-20544][SPARKR] R wrapper for input_file_name
zero323 May 4, 2017
57b6470
[SPARK-20571][SPARKR][SS] Flaky Structured Streaming tests
felixcheung May 4, 2017
c5dceb8
[SPARK-20047][FOLLOWUP][ML] Constrained Logistic Regression follow up
yanboliang May 4, 2017
bfc8c79
[SPARK-20566][SQL] ColumnVector should support `appendFloats` for array
dongjoon-hyun May 4, 2017
5386ff3
Compilation fixups
May 2, 2017
b95862d
Merge branch 'master' into rk/merge-upstream
May 5, 2017
f53fdff
add options
May 5, 2017
1 change: 1 addition & 0 deletions LICENSE
@@ -297,3 +297,4 @@ The text of each license is also included at licenses/LICENSE-[project].txt.
(MIT License) RowsGroup (http://datatables.net/license/mit)
(MIT License) jsonFormatter (http://www.jqueryscript.net/other/jQuery-Plugin-For-Pretty-JSON-Formatting-jsonFormatter.html)
(MIT License) modernizr (https://github.com/Modernizr/Modernizr/blob/master/LICENSE)
(MIT License) machinist (https://github.com/typelevel/machinist)
20 changes: 10 additions & 10 deletions R/check-cran.sh
@@ -20,18 +20,18 @@
set -o pipefail
set -e

FWDIR="$(cd `dirname "${BASH_SOURCE[0]}"`; pwd)"
pushd $FWDIR > /dev/null
FWDIR="$(cd "`dirname "${BASH_SOURCE[0]}"`"; pwd)"
pushd "$FWDIR" > /dev/null

. $FWDIR/find-r.sh
. "$FWDIR/find-r.sh"

# Install the package (this is required for code in vignettes to run when building it later)
# Build the latest docs, but not vignettes, which is built with the package next
. $FWDIR/install-dev.sh
. "$FWDIR/install-dev.sh"

# Build source package with vignettes
SPARK_HOME="$(cd "${FWDIR}"/..; pwd)"
. "${SPARK_HOME}"/bin/load-spark-env.sh
. "${SPARK_HOME}/bin/load-spark-env.sh"
if [ -f "${SPARK_HOME}/RELEASE" ]; then
SPARK_JARS_DIR="${SPARK_HOME}/jars"
else
@@ -40,16 +40,16 @@ fi

if [ -d "$SPARK_JARS_DIR" ]; then
# Build a zip file containing the source package with vignettes
SPARK_HOME="${SPARK_HOME}" "$R_SCRIPT_PATH/"R CMD build $FWDIR/pkg
SPARK_HOME="${SPARK_HOME}" "$R_SCRIPT_PATH/R" CMD build "$FWDIR/pkg"

find pkg/vignettes/. -not -name '.' -not -name '*.Rmd' -not -name '*.md' -not -name '*.pdf' -not -name '*.html' -delete
else
echo "Error Spark JARs not found in $SPARK_HOME"
echo "Error Spark JARs not found in '$SPARK_HOME'"
exit 1
fi

# Run check as-cran.
VERSION=`grep Version $FWDIR/pkg/DESCRIPTION | awk '{print $NF}'`
VERSION=`grep Version "$FWDIR/pkg/DESCRIPTION" | awk '{print $NF}'`

CRAN_CHECK_OPTIONS="--as-cran"

@@ -67,10 +67,10 @@ echo "Running CRAN check with $CRAN_CHECK_OPTIONS options"

if [ -n "$NO_TESTS" ] && [ -n "$NO_MANUAL" ]
then
"$R_SCRIPT_PATH/"R CMD check $CRAN_CHECK_OPTIONS SparkR_"$VERSION".tar.gz
"$R_SCRIPT_PATH/R" CMD check $CRAN_CHECK_OPTIONS "SparkR_$VERSION.tar.gz"
else
# This will run tests and/or build vignettes, and require SPARK_HOME
SPARK_HOME="${SPARK_HOME}" "$R_SCRIPT_PATH/"R CMD check $CRAN_CHECK_OPTIONS SparkR_"$VERSION".tar.gz
SPARK_HOME="${SPARK_HOME}" "$R_SCRIPT_PATH/R" CMD check $CRAN_CHECK_OPTIONS "SparkR_$VERSION.tar.gz"
fi

popd > /dev/null
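
Note: the quoting changes in check-cran.sh above (and in the other R/ build scripts that follow) line up with SPARK-20123 in the commit list: SPARK_HOME, and therefore FWDIR, may contain spaces, so every expansion is wrapped in double quotes. A minimal sketch of the failure mode, assuming a hypothetical directory name with a space (the path below is illustrative only, not part of the change):

#!/usr/bin/env bash
# Hypothetical directory whose name contains a space.
FWDIR="/tmp/spark home/R"
mkdir -p "$FWDIR"

# Unquoted: $FWDIR would be word-split into "/tmp/spark" and "home/R",
# so pushd either errors out or lands in the wrong directory.
# pushd $FWDIR > /dev/null

# Quoted: the path is passed as a single argument and stays intact.
pushd "$FWDIR" > /dev/null
echo "Now in: $(pwd)"
popd > /dev/null
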
10 changes: 5 additions & 5 deletions R/create-docs.sh
@@ -33,23 +33,23 @@ export FWDIR="$(cd "`dirname "${BASH_SOURCE[0]}"`"; pwd)"
export SPARK_HOME="$(cd "`dirname "${BASH_SOURCE[0]}"`"/..; pwd)"

# Required for setting SPARK_SCALA_VERSION
. "${SPARK_HOME}"/bin/load-spark-env.sh
. "${SPARK_HOME}/bin/load-spark-env.sh"

echo "Using Scala $SPARK_SCALA_VERSION"

pushd $FWDIR > /dev/null
. $FWDIR/find-r.sh
pushd "$FWDIR" > /dev/null
. "$FWDIR/find-r.sh"

# Install the package (this will also generate the Rd files)
. $FWDIR/install-dev.sh
. "$FWDIR/install-dev.sh"

# Now create HTML files

# knit_rd puts html in current working directory
mkdir -p pkg/html
pushd pkg/html

"$R_SCRIPT_PATH/"Rscript -e 'libDir <- "../../lib"; library(SparkR, lib.loc=libDir); library(knitr); knit_rd("SparkR", links = tools::findHTMLlinks(paste(libDir, "SparkR", sep="/")))'
"$R_SCRIPT_PATH/Rscript" -e 'libDir <- "../../lib"; library(SparkR, lib.loc=libDir); library(knitr); knit_rd("SparkR", links = tools::findHTMLlinks(paste(libDir, "SparkR", sep="/")))'

popd

8 changes: 4 additions & 4 deletions R/create-rd.sh
@@ -29,9 +29,9 @@
set -o pipefail
set -e

FWDIR="$(cd `dirname "${BASH_SOURCE[0]}"`; pwd)"
pushd $FWDIR > /dev/null
. $FWDIR/find-r.sh
FWDIR="$(cd "`dirname "${BASH_SOURCE[0]}"`"; pwd)"
pushd "$FWDIR" > /dev/null
. "$FWDIR/find-r.sh"

# Generate Rd files if devtools is installed
"$R_SCRIPT_PATH/"Rscript -e ' if("devtools" %in% rownames(installed.packages())) { library(devtools); devtools::document(pkg="./pkg", roclets=c("rd")) }'
"$R_SCRIPT_PATH/Rscript" -e ' if("devtools" %in% rownames(installed.packages())) { library(devtools); devtools::document(pkg="./pkg", roclets=c("rd")) }'
14 changes: 7 additions & 7 deletions R/install-dev.sh
@@ -29,21 +29,21 @@
set -o pipefail
set -e

FWDIR="$(cd `dirname "${BASH_SOURCE[0]}"`; pwd)"
FWDIR="$(cd "`dirname "${BASH_SOURCE[0]}"`"; pwd)"
LIB_DIR="$FWDIR/lib"

mkdir -p $LIB_DIR
mkdir -p "$LIB_DIR"

pushd $FWDIR > /dev/null
. $FWDIR/find-r.sh
pushd "$FWDIR" > /dev/null
. "$FWDIR/find-r.sh"

. $FWDIR/create-rd.sh
. "$FWDIR/create-rd.sh"

# Install SparkR to $LIB_DIR
"$R_SCRIPT_PATH/"R CMD INSTALL --library=$LIB_DIR $FWDIR/pkg/
"$R_SCRIPT_PATH/R" CMD INSTALL --library="$LIB_DIR" "$FWDIR/pkg/"

# Zip the SparkR package so that it can be distributed to worker nodes on YARN
cd $LIB_DIR
cd "$LIB_DIR"
jar cfM "$LIB_DIR/sparkr.zip" SparkR

popd > /dev/null
20 changes: 10 additions & 10 deletions R/install-source-package.sh
@@ -29,28 +29,28 @@
set -o pipefail
set -e

FWDIR="$(cd `dirname "${BASH_SOURCE[0]}"`; pwd)"
pushd $FWDIR > /dev/null
. $FWDIR/find-r.sh
FWDIR="$(cd "`dirname "${BASH_SOURCE[0]}"`"; pwd)"
pushd "$FWDIR" > /dev/null
. "$FWDIR/find-r.sh"

if [ -z "$VERSION" ]; then
VERSION=`grep Version $FWDIR/pkg/DESCRIPTION | awk '{print $NF}'`
VERSION=`grep Version "$FWDIR/pkg/DESCRIPTION" | awk '{print $NF}'`
fi

if [ ! -f "$FWDIR"/SparkR_"$VERSION".tar.gz ]; then
echo -e "R source package file $FWDIR/SparkR_$VERSION.tar.gz is not found."
if [ ! -f "$FWDIR/SparkR_$VERSION.tar.gz" ]; then
echo -e "R source package file '$FWDIR/SparkR_$VERSION.tar.gz' is not found."
echo -e "Please build R source package with check-cran.sh"
exit -1;
fi

echo "Removing lib path and installing from source package"
LIB_DIR="$FWDIR/lib"
rm -rf $LIB_DIR
mkdir -p $LIB_DIR
"$R_SCRIPT_PATH/"R CMD INSTALL SparkR_"$VERSION".tar.gz --library=$LIB_DIR
rm -rf "$LIB_DIR"
mkdir -p "$LIB_DIR"
"$R_SCRIPT_PATH/R" CMD INSTALL "SparkR_$VERSION.tar.gz" --library="$LIB_DIR"

# Zip the SparkR package so that it can be distributed to worker nodes on YARN
pushd $LIB_DIR > /dev/null
pushd "$LIB_DIR" > /dev/null
jar cfM "$LIB_DIR/sparkr.zip" SparkR
popd > /dev/null

2 changes: 1 addition & 1 deletion R/pkg/.lintr
@@ -1,2 +1,2 @@
linters: with_defaults(line_length_linter(100), camel_case_linter = NULL, open_curly_linter(allow_single_line = TRUE), closed_curly_linter(allow_single_line = TRUE))
linters: with_defaults(line_length_linter(100), multiple_dots_linter = NULL, camel_case_linter = NULL, open_curly_linter(allow_single_line = TRUE), closed_curly_linter(allow_single_line = TRUE))
exclusions: list("inst/profile/general.R" = 1, "inst/profile/shell.R")
3 changes: 3 additions & 0 deletions R/pkg/DESCRIPTION
@@ -35,6 +35,7 @@ Collate:
'WindowSpec.R'
'backend.R'
'broadcast.R'
'catalog.R'
'client.R'
'context.R'
'deserialize.R'
@@ -43,6 +44,7 @@ Collate:
'jvm.R'
'mllib_classification.R'
'mllib_clustering.R'
'mllib_fpm.R'
'mllib_recommendation.R'
'mllib_regression.R'
'mllib_stat.R'
@@ -51,6 +53,7 @@ Collate:
'serialize.R'
'sparkR.R'
'stats.R'
'streaming.R'
'types.R'
'utils.R'
'window.R'
48 changes: 46 additions & 2 deletions R/pkg/NAMESPACE
@@ -66,7 +66,10 @@ exportMethods("glm",
"spark.randomForest",
"spark.gbt",
"spark.bisectingKmeans",
"spark.svmLinear")
"spark.svmLinear",
"spark.fpGrowth",
"spark.freqItemsets",
"spark.associationRules")

# Job group lifecycle management methods
export("setJobGroup",
@@ -82,6 +85,7 @@ exportMethods("arrange",
"as.data.frame",
"attach",
"cache",
"checkpoint",
"coalesce",
"collect",
"colnames",
@@ -97,6 +101,7 @@ exportMethods("arrange",
"createOrReplaceTempView",
"crossJoin",
"crosstab",
"cube",
"dapply",
"dapplyCollect",
"describe",
@@ -118,9 +123,11 @@ exportMethods("arrange",
"group_by",
"groupBy",
"head",
"hint",
"insertInto",
"intersect",
"isLocal",
"isStreaming",
"join",
"limit",
"merge",
@@ -138,6 +145,7 @@ exportMethods("arrange",
"registerTempTable",
"rename",
"repartition",
"rollup",
"sample",
"sample_frac",
"sampleBy",
@@ -169,12 +177,14 @@ exportMethods("arrange",
"write.json",
"write.orc",
"write.parquet",
"write.stream",
"write.text",
"write.ml")

exportClasses("Column")

exportMethods("%in%",
exportMethods("%<=>%",
"%in%",
"abs",
"acos",
"add_months",
@@ -197,6 +207,8 @@ exportMethods("%in%",
"cbrt",
"ceil",
"ceiling",
"collect_list",
"collect_set",
"column",
"concat",
"concat_ws",
@@ -207,6 +219,8 @@ exportMethods("%in%",
"count",
"countDistinct",
"crc32",
"create_array",
"create_map",
"hash",
"cume_dist",
"date_add",
@@ -222,6 +236,7 @@ exportMethods("%in%",
"endsWith",
"exp",
"explode",
"explode_outer",
"expm1",
"expr",
"factorial",
@@ -235,12 +250,15 @@ exportMethods("%in%",
"getField",
"getItem",
"greatest",
"grouping_bit",
"grouping_id",
"hex",
"histogram",
"hour",
"hypot",
"ifelse",
"initcap",
"input_file_name",
"instr",
"isNaN",
"isNotNull",
@@ -278,18 +296,21 @@ exportMethods("%in%",
"nanvl",
"negate",
"next_day",
"not",
"ntile",
"otherwise",
"over",
"percent_rank",
"pmod",
"posexplode",
"posexplode_outer",
"quarter",
"rand",
"randn",
"rank",
"regexp_extract",
"regexp_replace",
"repeat_string",
"reverse",
"rint",
"rlike",
@@ -313,6 +334,7 @@ exportMethods("%in%",
"sort_array",
"soundex",
"spark_partition_id",
"split_string",
"stddev",
"stddev_pop",
"stddev_samp",
@@ -355,17 +377,29 @@ export("as.DataFrame",
"clearCache",
"createDataFrame",
"createExternalTable",
"createTable",
"currentDatabase",
"dropTempTable",
"dropTempView",
"jsonFile",
"listColumns",
"listDatabases",
"listFunctions",
"listTables",
"loadDF",
"parquetFile",
"read.df",
"read.jdbc",
"read.json",
"read.orc",
"read.parquet",
"read.stream",
"read.text",
"recoverPartitions",
"refreshByPath",
"refreshTable",
"setCheckpointDir",
"setCurrentDatabase",
"spark.lapply",
"spark.addFile",
"spark.getSparkFilesRootDirectory",
@@ -402,6 +436,16 @@ export("partitionBy",
export("windowPartitionBy",
"windowOrderBy")

exportClasses("StreamingQuery")

export("awaitTermination",
"isActive",
"lastProgress",
"queryName",
"status",
"stopQuery")


S3method(print, jobj)
S3method(print, structField)
S3method(print, structType)