

@DieBauer DieBauer commented Apr 8, 2017

  • General

    • The pull request references the related JIRA issue ("[FLINK-XXX] Jira title text")
    • The pull request addresses only one issue
    • Each commit in the PR has a meaningful commit message (including the JIRA id)
  • Documentation

    • Documentation has been added for new functionality
    • Old documentation affected by the pull request has been updated
    • JavaDoc for public methods has been added
  • Tests & Build

    • Functionality added by the pull request is covered by tests
    • mvn clean verify has been executed successfully locally or a Travis build has passed

This is an initial approach to making Flink ready for Scala 2.12.

I've introduced profiles to switch between 2.12, 2.11 and 2.10. All three profiles now compile.

Build with mvn clean install -D$version, where $version is scala-2.12, scala-2.11 or scala-2.10.
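
For reference, a minimal sketch of what such a profile can look like in the root pom (the exact property names and versions in this branch may differ):

    <profile>
      <id>scala-2.12</id>
      <activation>
        <property>
          <!-- activated by passing -Dscala-2.12 to Maven -->
          <name>scala-2.12</name>
        </property>
      </activation>
      <properties>
        <scala.version>2.12.1</scala.version>
        <scala.binary.version>2.12</scala.binary.version>
      </properties>
    </profile>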

To overcome the flakka artifacts (akka 2.3-custom), which aren't published for Scala 2.12, I've replaced them with the latest com.typesafe.akka artifacts when using the 2.12 profile.

Travis CI profiles are added and I've changed the initial release script to accommodate 2.12, but this is by no means finished.

I encountered a lot of compilation errors because types could not be inferred, so I've added explicit types to the problematic expressions.
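
To illustrate the kind of change involved (a made-up example, not a diff from this PR): 2.12's new SAM conversion can leave a lambda's parameter type undetermined when a method is overloaded, and annotating the parameter resolves it.

    // A SAM trait, analogous to Flink's Java MapFunction.
    trait Mapper[I, O] { def map(in: I): O }

    class Data[T](val xs: Seq[T]) {
      def map[R](f: T => R): Data[R] = new Data(xs.map(f))
      def map[R](m: Mapper[T, R]): Data[R] = new Data(xs.map(x => m.map(x)))
    }

    object TypeAnnotationDemo extends App {
      val d = new Data(Seq(("a", 1), ("b", 2)))
      // Under 2.12 the lambda is applicable to both overloads, so
      // `d.map(t => t._2)` can fail with "missing parameter type" where
      // 2.11 compiled it fine. An explicit type fixes the ambiguity:
      val sums = d.map((t: (String, Int)) => t._2)
      println(sums.xs) // List(1, 2)
    }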

The Kafka 0.10 dependency is bumped to 0.10.1.1, since that's the first version released for 2.12.
There is some trickery in the connector parent pom because only kafka-0.10 is released for 2.12; kafka-0.9 and kafka-0.8 aren't compiled for 2.12 (see the sketch below). I have to look into that a little more.
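
One way to express that trickery is to include the 0.8/0.9 connector modules only when the 2.12 property is absent. A sketch of the idea (not necessarily what this branch does):

    <profile>
      <id>kafka-connectors-pre-scala-2.12</id>
      <activation>
        <property>
          <!-- active only when -Dscala-2.12 is NOT set -->
          <name>!scala-2.12</name>
        </property>
      </activation>
      <modules>
        <module>flink-connector-kafka-0.8</module>
        <module>flink-connector-kafka-0.9</module>
      </modules>
    </profile>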

More updated dependencies:

javassist was bumped because of "java.lang.IllegalStateException: Failed to transform class with name scala.concurrent.duration.Duration. Reason: javassist.bytecode.InterfaceMethodrefInfo cannot be cast to javassist.bytecode.MethodrefInfo", which led me to http://stackoverflow.com/questions/31189086/powermock-and-java-8-issue-interfacemethodrefinfo-cannot-be-cast-to-methodrefin#37217871

twitter-chill was bumped to 0.7.7, which is cross-published for Scala 2.12.

grizzled-slf4j was bumped to 1.3.0 for Scala 2.12.

scalatest was bumped to 3.0.1 for Scala 2.12.

Right now I'm trying to make the Travis build succeed.

I will squash all commits once this succeeds.

Any other suggestions are welcome!

@greghogan
Contributor

@DieBauer thanks for taking this on! I haven't been using Flink with Scala but I think this will be important to have for the May release.

The required changes for type inference are interesting. I'm puzzled why this would regress. Also, if developers are writing against 2.10, these issues will not manifest until integration tests are run (the same problem you are experiencing).

One other thought: since Scala 2.12 requires Java 8, is it still necessary to specify jdk8 when executing the scala-2.12 profile?

Flink Forward starts Monday so developer activity will be low this week. @StephanEwen thoughts when you have the chance?

@DieBauer
Author

DieBauer commented Apr 9, 2017

I'm running into an issue with asm in, for example, the flink-scala module when compiling with 2.12.

java.io.IOException: Class not found
	at org.objectweb.asm.ClassReader.a(Unknown Source)
	at org.objectweb.asm.ClassReader.<init>(Unknown Source)
	at org.apache.flink.api.scala.ClosureCleaner$.getClassReader(ClosureCleaner.scala:44)
	at org.apache.flink.api.scala.ClosureCleaner$.getInnerClasses(ClosureCleaner.scala:92)
	at org.apache.flink.api.scala.ClosureCleaner$.clean(ClosureCleaner.scala:115)
	at org.apache.flink.api.scala.DataSet.clean(DataSet.scala:125)
	at org.apache.flink.api.scala.DataSet$$anon$12.<init>(DataSet.scala:910)

Since the ClosureCleaner was initially copied from Spark, I've looked there and found an issue (apache/spark#9512) regarding asm5 and Java 8.
However, Flink is already using asm5 in the ClosureCleaner. Spark's dependency is

<groupId>org.apache.xbean</groupId>
<artifactId>xbean-asm5-shaded</artifactId>

and ours is org.ow2.asm:asm. The shade plugin in the parent pom relocates the asm dependencies, but I'm not sure how that all works out.
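
For context, a shade-plugin relocation of asm looks roughly like this (pattern names illustrative, not copied from the Flink pom):

    <relocation>
      <pattern>org.objectweb.asm</pattern>
      <shadedPattern>org.apache.flink.shaded.org.objectweb.asm</shadedPattern>
    </relocation>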

So for now, I'm a bit puzzled why we get this error.

@greghogan you're right, the jdk8 profile only enables the module with the Java 8 examples. But since those are also compiled in the scala-2.11 case, I thought we'd want to have them? We can drop it, of course.

@DieBauer
Author

OK, I started looking into the issue a bit more.
It seems to be caused by the new lambda generation in Scala 2.12, not by the asm library.

From the Scala 2.12.0-M3 release notes (https://github.com/scala/scala/releases/tag/v2.12.0-M3):

Java 8 style closure classes

Scala 2.12 emits closures in the same style as Java 8.

For each lambda the compiler generates a method containing the lambda body.
At runtime, this method is passed as an argument to the LambdaMetaFactory provided by the JDK, which creates a closure object.

Compared to Scala 2.11, the new scheme has the advantage that the compiler does not generate an anonymous class for each lambda anymore.
This leads to significantly smaller JAR files.

Our ClosureCleaner uses the class name for instantiating the ClassReader, which is used later on.

However, since Scala 2.12 doesn't generate anonymous classes, the class file isn't found (the stream is null), so we get the "Class not found" exception, which makes sense now.

We have to look into how to handle this new generation of 'lambdas'.

A small technical example: the test class that threw the exception is AcceptPFTestBase, at the line containing protected val groupedTuples = tuples.groupBy(_._1).
Since tuples is a DataSet, the function we have to check is _._1 (an anonymous function).

When compiling/executing with Scala 2.11,
we get class org.apache.flink.api.scala.extensions.base.AcceptPFTestBase$$anonfun$1 as cls in the ClosureCleaner.getClassReader method.
This is indeed a class file generated by the Scala compiler and can be resolved by cls.getResourceAsStream(className).

However, when using Scala 2.12,
we get class org.apache.flink.api.scala.extensions.base.AcceptPFTestBase$$Lambda$11/1489743810, for which no class file exists, so it cannot be resolved by cls.getResourceAsStream(className).
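
The difference is easy to reproduce outside of Flink. A standalone snippet mimicking what ClosureCleaner.getClassReader does (illustrative code, not from the PR):

    object LambdaClassFileDemo extends App {
      // Resolve a class file the way the ClosureCleaner does: strip the
      // package prefix and look the .class file up as a resource.
      def classFileStream(cls: Class[_]) = {
        val fileName = cls.getName.replaceFirst("^.*\\.", "") + ".class"
        cls.getResourceAsStream(fileName)
      }

      val f: Int => Int = _ + 1
      // On 2.11 this prints a non-null stream: the compiler emitted a
      // LambdaClassFileDemo$$anonfun$1.class file. On 2.12 the runtime class
      // is LambdaClassFileDemo$$Lambda$N/..., no such class file exists on
      // the classpath, and this prints null.
      println(classFileStream(f.getClass))
    }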

In conclusion, as I see it, the current ClosureCleaner doesn't suffice for the new Scala 2.12 lambda style.
There is also a Spark issue (https://issues.apache.org/jira/browse/SPARK-14540) regarding closures in Scala 2.12/Java 8.

Any thoughts how to proceed?

DieBauer added 27 commits April 16, 2017 21:09. Excerpts from the commit messages:

[ERROR] /Users/jens/Development/flink/flink-tests/src/test/java/org/apache/flink/test/javaApiOperators/GroupCombineITCase.java:54: error: not found: type TestExecutionMode
[INFO]  public GroupCombineITCase(TestExecutionMode mode) {

maybe related to scala/bug#10207 ?

java.lang.IllegalStateException: Failed to transform class with name scala.concurrent.duration.Duration. Reason: javassist.bytecode.InterfaceMethodrefInfo cannot be cast to javassist.bytecode.MethodrefInfo

http://stackoverflow.com/questions/31189086/powermock-and-java-8-issue-interfacemethodrefinfo-cannot-be-cast-to-methodrefin#37217871

@DieBauer DieBauer force-pushed the feature/scala-2.12 branch from 88ea483 to ec4fb48 on April 16, 2017 at 19:12
@StephanEwen
Contributor

One thing you can try is to run TypeExtractionUtils.checkAndExtractLambda to see if it is a generated serializable lambda.
In the case of a lambda, you could switch to a different code path (possibly not cleaning anything in the first version).

@twalthr may have some thoughts on that as well...
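
A rough sketch of that code path (my own paraphrase, assuming a checkAndExtractLambda-style utility essentially probes the synthetic writeReplace method that serializable lambdas carry; the actual Flink signature may differ):

    import java.lang.invoke.SerializedLambda

    object LambdaAwareCleaner {
      // Returns the SerializedLambda if func is a Java-8-style serializable
      // lambda, None for ordinary (anonymous) classes.
      def asSerializedLambda(func: AnyRef): Option[SerializedLambda] =
        try {
          val m = func.getClass.getDeclaredMethod("writeReplace")
          m.setAccessible(true)
          m.invoke(func) match {
            case sl: SerializedLambda => Some(sl)
            case _                    => None
          }
        } catch {
          case _: NoSuchMethodException => None
        }

      def clean(func: AnyRef): Unit =
        asSerializedLambda(func) match {
          case Some(_) => () // lambda: skip cleaning in a first version
          case None    => () // existing ASM-based path (omitted here)
        }
    }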

@StephanEwen
Contributor

We cannot add any other dependencies to the pom files. Adding "akka" back will create a conflict with the "flakka" artifacts.

What we can do is either of the following two options:

  • Release "flakka" for Scala 2.12 and then we need to change nothing in Flink. The flakka code is at https://github.com/mxm/flakka - we can do the release; you could help us by checking out what needs to be done to use flakka with Scala 2.12 (if it is at all possible)

  • See if we can pull out the dependency as a property and use "flakka" in the Scala 2.10 and 2.11 case and vanilla akka 2.4 in the Java 8 / Scala 2.12 case (a sketch of this follows below). That would be a lot of Maven hacking, though - if possible, I would prefer the first variant (less complexity in Flink).
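
For illustration, the property-based variant could look something like this in the root pom (the groupIds, artifact prefixes and versions here are assumptions for the sketch):

    <!-- defaults: the custom flakka build for Scala 2.10 / 2.11 -->
    <properties>
      <akka.group>com.data-artisans</akka.group>
      <akka.prefix>flakka</akka.prefix>
      <akka.version>2.3-custom</akka.version>
    </properties>

    <!-- overridden when building for Scala 2.12 -->
    <profile>
      <id>scala-2.12</id>
      <properties>
        <akka.group>com.typesafe.akka</akka.group>
        <akka.prefix>akka</akka.prefix>
        <akka.version>2.4.17</akka.version>
      </properties>
    </profile>

    <!-- dependencies then reference only the properties -->
    <dependency>
      <groupId>${akka.group}</groupId>
      <artifactId>${akka.prefix}-actor_${scala.binary.version}</artifactId>
      <version>${akka.version}</version>
    </dependency>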

We also cannot add more Travis build profiles (builds take too long already). We need to keep that number as it is and simply select one of the existing profiles to use Scala 2.12 rather than, for example, 2.10.

@joan38

joan38 commented May 9, 2017

Going in the direction of dropping flakka would be a big win for Flink users like us who deal every day with SBT/Maven hacking to be able to use vanilla Akka with Flink. Such a pain...

@StephanEwen
Contributor

@Joan - I actually agree with you. We needed to use "flakka" to be able to support Java 7 and bind to a wildcard address (across interfaces).

It would be great to be able to do that differently and not have a custom akka build (at least for Java 8 / Scala 2.11 / 2.12).

@joan38

joan38 commented May 10, 2017

Java 7 Reaches End of Life. Oracle ceased public availability of security fixes and upgrades for Java 7 as of April 2015

We are in 2017.
@StephanEwen but I understand if there are Flink users still attached to Java 7 👍

@StephanEwen
Contributor

About 10% of Flink users are on Java 7 (we did a poll recently).
Big clusters change slowly...

@StephanEwen
Contributor

@joan38 I have some WIP for a flag that allows you to use vanilla akka when running on Java 8 with Scala 2.11.

Here is the branch: https://github.com/StephanEwen/incubator-flink/commits/vanilla_akka

You can try to build it via: mvn clean package -Dscala-2.11 -Pjdk8,vanilla-akka

@joan38

joan38 commented May 10, 2017

@StephanEwen Nice! That looks promising.

@frankbohman

Where can I watch the status page that tells us when we can get off of the old 2.11?

@fhueske
Contributor

fhueske commented May 20, 2017

@frankbohman just watch the JIRA issue: https://issues.apache.org/jira/browse/FLINK-5005

@joan38

joan38 commented Jul 14, 2017

Any news on this?
2.13.0-M1 is out https://github.com/scala/scala/releases/tag/v2.13.0-M1
I'm wondering if we will still be on 2.11 when 2.13.0 is out :P

@greghogan
Contributor

@joan38 there has been a discussion on the mailing list about dropping Java 7 support (no one has objected), which will make it simpler to support Scala 2.12 in the upcoming release.

@joan38

joan38 commented Jul 14, 2017

@greghogan That's pretty good news.
Thanks for your answer.

@ariskk

ariskk commented Jul 15, 2017

We are really looking forward to this 👍

@joan38

joan38 commented Sep 7, 2017

Is there any news on this?

@joan38

joan38 commented Oct 2, 2017

So I guess this PR is abandoned?

@aljoscha
Contributor

aljoscha commented Oct 5, 2017

@DieBauer Do you still want to work on this? I also started trying to make Flink ready for 2.12 before I noticed this older branch. I'd be very happy to stop, though, if you're interested in bringing this to an end. It should be easier now that we dropped Java 7 support and also agreed to drop Scala 2.10 support.

@DieBauer
Author

DieBauer commented Oct 9, 2017

Hi, I'm sorry for the late reaction. I haven't found the time to work on this anymore (priorities shifted as well...).

Therefore this pull request is stale (it could still be used as a reference).

I think the main challenge is in serialising the Java 8 lambdas. And dropping support for Scala 2.10 and Java 7 certainly helps in taming the pom.xml profiles.

I will close this pull request so as not to keep hopes up.
