Add k8s helm charts that run HDFS daemons in Kubernetes #2
/cc @kow3ns
cc/ @prb also for thoughts on this.
charts/hdfs-k8s/README.md
Outdated
    daemon.

    ```
    $ kubectl label nodes YOUR-HOST hdfs-namenode-selector=hdfs-namenode-0
If we use a PV/PVC for the name-node, we could probably skip this step. Even if we don't do that here and continue to use hostpath, can we add a comment here to clarify why we're doing this?
Yes, that's my thinking too. I'll add a comment.
charts/hdfs-k8s/README.md
Outdated
    2. Find the IP of your `kube-dns` name server that resolves pod and service
       host names in your k8s cluster. Default is 10.96.0.10. It will be supplied
       below as the `clusterDnsIP` parameter. Try this command and find the IP
Do we need this configuration at all for `clusterDnsIP`? If we do, we should be using the IP of the service fronting kube-dns. You can get that IP address through `kubectl get svc --all-namespaces | grep dns`.
The individual kube-dns pods can get evicted and change their IP addresses.
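As a rough sketch of the lookup suggested above, the helper below reads `kubectl get svc --all-namespaces` output on stdin and prints the `CLUSTER-IP` of the service named `kube-dns`. The function name `dns_service_ip` is hypothetical, and the service name `kube-dns` is an assumption that may differ on clusters running another DNS add-on.

```shell
# Hypothetical helper: print the CLUSTER-IP of the service named "kube-dns".
# Reads `kubectl get svc --all-namespaces` output on stdin and locates the
# NAME and CLUSTER-IP columns from the header row, so it does not depend on
# the exact column order.
dns_service_ip() {
  awk '
    NR == 1 {
      for (i = 1; i <= NF; i++) {
        if ($i == "NAME") name_col = i
        if ($i == "CLUSTER-IP") ip_col = i
      }
      next
    }
    name_col && ip_col && $name_col == "kube-dns" { print $ip_col }
  '
}

# Usage against a live cluster (not run here):
#   kubectl get svc --all-namespaces | dns_service_ip
```

Because this reads the service's cluster IP and not a pod IP, the value stays stable when individual kube-dns pods are evicted and rescheduled.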
This is needed to fill out `/etc/resolv.conf` for `datanodes`. I meant the service IP, not the pod IP of kube-dns. I think the command line example below is equivalent to what you're suggesting. I'll clarify.
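For illustration, a `resolv.conf` pointing at the cluster DNS service might be rendered as below. This is a sketch, not the chart's actual config map: the function name `render_resolv_conf`, the `default` namespace, the `cluster.local` domain, and the `ndots:5` option are assumptions based on common Kubernetes defaults.

```shell
# Hypothetical helper: print a resolv.conf that points at the cluster DNS
# service IP. The search domains and ndots value are assumed defaults, not
# values taken from the chart.
render_resolv_conf() {
  dns_ip=$1
  namespace=${2:-default}
  printf 'nameserver %s\n' "$dns_ip"
  printf 'search %s.svc.cluster.local svc.cluster.local cluster.local\n' "$namespace"
  printf 'options ndots:5\n'
}

# Usage: render_resolv_conf 10.96.0.10
```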
    labels:
      name: hdfs-datanode
    annotations:
      scheduler.alpha.kubernetes.io/tolerations: |
Do we want it to schedule on the master node? It's not typical to run any user pods on the master.
Honestly, I don't know what this annotation does. I don't want the master node either. Maybe dropping this annotation is the solution?
Hmm. So it seems this annotation specifically allows the master node to run a member daemon. I dropped it because that's not what we wanted.
But a daemon is still scheduled on the master node. I think I read about this behavior as a bug. Anyway, not having this annotation is better. So I'll update the patch.
    - name: datanode
      image: uhopper/hadoop-datanode:2.7.2
      env:
        # This works only with /etc/resolv.conf mounted from the config map.
Is this issue also due to the Docker version (1.12+) you're running?
I could be wrong, but I don't think it's the Docker version. We were using a much older version, 1.10.x, when I found this issue. Anyway, I'm looking forward to trying out Kubernetes 1.6 and getting rid of this part.
Thanks for the PR @kimoonkim. I've left a few comments that we can discuss.
Thanks for the review @foxish. Added some suggested comments in the new diff and answered some questions in-line. PTAL.
Please note that I committed 6373749 in between, which separates the files into two charts, one for the namenode and the other for datanodes. I was fighting two subtle but important bugs, and this commit fixes them:
- `datanodes` get stuck at startup if the statefulset DNS of `namenode` is not set yet. In practice, this means you want to start `namenode` first, and start `datanode` only afterward. Having two charts makes this possible.
- The overlay network `weave` that we are using gets in the way if `namenode` is not using `hostNetwork`. It makes connections from `datanode` to `namenode` go through its virtual NICs. This leads `namenode` to believe `datanode` IPs come from those virtual NICs. The fix is switching `namenode` to `hostNetwork` as well.
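The start-order constraint in the first bullet can be approximated with a small retry loop that waits for a check to pass before the datanode chart is installed. This is a sketch only: `wait_until` is a hypothetical helper, and the host name in the usage comment is a placeholder for whatever statefulset DNS name the namenode gets in a given cluster.

```shell
# Hypothetical helper: retry a command until it succeeds.
#   $1 = maximum number of attempts
#   $2 = seconds to sleep between attempts
#   remaining arguments = the command to run
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
wait_until() {
  retries=$1
  delay=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$retries" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 1
}

# Usage (placeholder host name, not run here):
#   wait_until 30 2 getent hosts NAMENODE-STATEFULSET-HOSTNAME \
#     && echo "namenode DNS is ready; the datanode chart can be installed"
```

Gating the second chart on a successful lookup keeps the ordering explicit without relying on how long the namenode takes to start.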
There are no other changes in 6373749.
The good news is that I was able to run a Spark DFSReadWriteTest job successfully against this HDFS after fixing those bugs.
@foxish Addressed all your comments. Maybe ready for another look before merge?
Thanks for addressing comments @kimoonkim. One last item I want to address is the DNS server IP address being supplied. Instead of targeting an individual kube-dns pod, we can use the kube-dns service which has a cluster-ip of
I see. On your deployment, you're ending up with a serviceIP of
That suggestion makes sense. Addressed in the latest diff. Thanks!
Thanks! This looks like a good beginning. Merging.
Great. Thanks for the review, @foxish! |
Adapt HDFS Charts to KEOS Kerberos
The prototype referred to by #1. See README.md for usage.
Cc @foxish @ssuchter @ash211