
Commit 6cdfa7a

Author: Chris Cho
Kafka Connector Guide (Feature Branch) (#549)
* Kafka Connector Guide
Parent: e40d3b8

15 files changed: +2195, -3 lines

.gitignore

Lines changed: 1 addition & 0 deletions

@@ -1,3 +1,4 @@
+*.swp
 build/
 *pyc
 source/includes/table-*rst

conf.py

Lines changed: 3 additions & 1 deletion

@@ -68,10 +68,12 @@
     'mms-docs': ('https://docs.cloud.mongodb.com%s', ''),
     'mms-home': ('https://cloud.mongodb.com%s', ''),
     'guides': ('https://docs.mongodb.com/guides%s', ''),
-    'java-docs-latest': ('https://mongodb.github.io/mongo-java-driver/3.11/%s', ''),
+    'java-docs-latest': ('http://mongodb.github.io/mongo-java-driver/3.11/%s', ''),
+    'kafka-21-javadoc': ('https://kafka.apache.org/21/javadoc/org/apache/kafka%s', ''),
     'csharp-docs-latest': ('http://mongodb.github.io/mongo-csharp-driver/2.9%s', ''),
     'aws-docs': ('https://docs.aws.amazon.com/%s', ''),
     'wikipedia': ('https://en.wikipedia.org/wiki/%s', ''),
+    'community-support': ('https://www.mongodb.com/community-support-resources%s', ''),
 }

 intersphinx_mapping = {}
Lines changed: 68 additions & 0 deletions

.. _kafka-connect-migration:

==========================
Migrate from Kafka Connect
==========================

.. default-domain:: mongodb

Follow the steps in this guide to migrate your Kafka deployments from the
community `Kafka Connect MongoDB
<https://github.com/hpgrahsl/kafka-connect-mongodb>`_ connector to the
`official MongoDB Kafka connector <https://github.com/mongodb/mongo-kafka>`_.

Update Configuration Settings
-----------------------------

- Replace any property values that refer to ``at.grahsl.kafka.connect.mongodb``
  with ``com.mongodb.kafka.connect``, as shown in the example after this
  list.

- Replace ``MongoDbSinkConnector`` with ``MongoSinkConnector`` as the
  value of the ``connector.class`` key.

- Remove the ``mongodb.`` prefix from all configuration property key
  names.

- Remove the ``document.id.strategies`` key if it exists. If the value of
  this field contained references to any custom strategies, move them to the
  ``document.id.strategy`` field and read the :ref:`custom-class-changes`
  section for additional required changes to your classes.

- Replace any keys that specify per-topic and per-collection overrides and
  contain the ``mongodb.collection`` prefix with the equivalent key in
  `Topic-Specific Configuration Settings
  <https://github.com/mongodb/mongo-kafka/blob/master/docs/sink.md#topic-specific-configuration-settings>`_.
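
For example, a sink connector configuration that used the community
connector class and ``mongodb.``-prefixed keys might be re-registered
against the Kafka Connect REST API as in the following sketch. The
``localhost:8083`` endpoint, the connector name, and the topic, database,
and collection values are illustrative assumptions, not values required by
the connector:

.. code-block:: shell

   # Hypothetical migrated registration. The old configuration used
   # "connector.class": "at.grahsl.kafka.connect.mongodb.MongoDbSinkConnector"
   # and prefixed keys such as "mongodb.connection.uri" and "mongodb.collection".
   curl -X POST -H "Content-Type: application/json" \
     http://localhost:8083/connectors \
     --data '{
       "name": "mongo-sink",
       "config": {
         "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
         "topics": "pageviews",
         "connection.uri": "mongodb://localhost:27017",
         "database": "test",
         "collection": "pageviews"
       }
     }'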

.. _custom-class-changes:

Update Custom Classes
---------------------

If you added any classes or custom logic to your Kafka Connect connector,
migrate them to the new MongoDB Kafka connector JAR file and make the
following changes to them:

- Update imports that refer to ``at.grahsl.kafka.connect.mongodb`` to
  ``com.mongodb.kafka.connect``.

- Replace references to the ``MongoDbSinkConnector`` class with
  ``MongoSinkConnector``.

- Update custom sink strategy classes to implement the
  ``com.mongodb.kafka.connect.sink.processor.id.strategy.IdStrategy``
  interface.

- Update references to the ``MongoDbSinkConnectorConfig`` class, which
  has been split into the `sink.MongoSinkConfig
  <https://github.com/mongodb/mongo-kafka/blob/master/src/main/java/com/mongodb/kafka/connect/sink/MongoSinkConfig.java>`_
  and `sink.MongoSinkTopicConfig
  <https://github.com/mongodb/mongo-kafka/blob/master/src/main/java/com/mongodb/kafka/connect/sink/MongoSinkTopicConfig.java>`_
  classes.

Update PostProcessor Subclasses
-------------------------------

- Update any concrete methods that override methods of the Kafka Connect
  ``PostProcessor`` class to match the new method signatures of the
  MongoDB Kafka Connector `PostProcessor
  <https://github.com/mongodb/mongo-kafka/blob/master/src/main/java/com/mongodb/kafka/connect/sink/processor/PostProcessor.java>`_
  class.
Lines changed: 213 additions & 0 deletions

.. _kafka-docker-example:

======================================
MongoDB Kafka Connector Docker Example
======================================

.. default-domain:: mongodb

.. contents:: On this page
   :local:
   :backlinks: none
   :depth: 1
   :class: singlecol

This guide provides an end-to-end setup of MongoDB and Kafka Connect to
demonstrate the functionality of the MongoDB Kafka Source and Sink
Connectors.

In this example, we create the following Kafka Connectors:

.. list-table::
   :header-rows: 1

   * - Connector
     - Data Source
     - Destination

   * - Confluent Connector:
       `Datagen <https://github.com/confluentinc/kafka-connect-datagen>`_
     - `Avro random generator
       <https://github.com/confluentinc/avro-random-generator>`_
     - Kafka topic: ``pageviews``

   * - Sink Connector: **mongo-sink**
     - Kafka topic: ``pageviews``
     - MongoDB collection: ``test.pageviews``

   * - Source Connector: **mongo-source**
     - MongoDB collection: ``test.pageviews``
     - Kafka topic: ``mongo.test.pageviews``

* The **Datagen Connector** creates random data using the
  **Avro random generator** and publishes it to the ``pageviews`` Kafka
  topic.

* The **mongo-sink** connector reads data from the ``pageviews`` topic and
  writes it to MongoDB in the ``test.pageviews`` collection.

* The **mongo-source** connector produces change events for the
  ``test.pageviews`` collection and publishes them to the
  ``mongo.test.pageviews`` topic.

Requirements
------------

Linux/Unix-based OS
~~~~~~~~~~~~~~~~~~~

* `Docker <https://docs.docker.com/install/#supported-platforms>`_ 18.09 or later
* `Docker Compose <https://docs.docker.com/compose/install/>`_ 1.24 or later

macOS
~~~~~

* `Docker Desktop Community Edition (Mac)
  <https://docs.docker.com/docker-for-mac/install/>`_ 2.1.0.1 or later

Windows
~~~~~~~

* `Docker Desktop Community Edition (Windows)
  <https://docs.docker.com/docker-for-windows/install/>`_ 2.1.0.1 or later

How to Run the Example
----------------------

Clone the `mongo-kafka <https://github.com/mongodb/mongo-kafka>`_ repository
from GitHub:

.. code-block:: shell

   git clone https://github.com/mongodb/mongo-kafka.git

Change to the ``docker`` directory:

.. code-block:: shell

   cd mongo-kafka/docker/

Run the shell script, **run.sh**:

.. code-block:: shell

   ./run.sh

The shell script executes the following sequence of commands:

#. Run the ``docker-compose up`` command.

   The ``docker-compose`` command installs and starts the following
   applications in new Docker containers:

   * Zookeeper
   * Kafka
   * Confluent Schema Registry
   * Confluent Kafka Connect
   * Confluent Control Center
   * Confluent KSQL Server
   * Kafka REST Proxy
   * Kafka Topics UI
   * MongoDB replica set (three nodes: **mongo1**, **mongo2**, and
     **mongo3**)

#. Wait for MongoDB, Kafka, and Kafka Connect to become ready.
#. Register the Confluent Datagen Connector.
#. Register the MongoDB Kafka Sink Connector.
#. Register the MongoDB Kafka Source Connector (see the sketch of a
   registration request after this list).
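
Each registration step posts a JSON connector configuration to the Kafka
Connect REST API. As a rough sketch, the **mongo-source** registration
might resemble the following; the ``localhost:8083`` endpoint and the
configuration values shown are assumptions for illustration, not the
literal contents of **run.sh**:

.. code-block:: shell

   # Hypothetical sketch of a connector registration; run.sh issues
   # similar requests against the Kafka Connect REST API.
   curl -X POST -H "Content-Type: application/json" \
     http://localhost:8083/connectors \
     --data '{
       "name": "mongo-source",
       "config": {
         "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
         "connection.uri": "mongodb://mongo1:27017",
         "database": "test",
         "collection": "pageviews",
         "topic.prefix": "mongo"
       }
     }'

With a ``topic.prefix`` of ``mongo``, change events for ``test.pageviews``
are published to the ``mongo.test.pageviews`` topic, matching the table at
the top of this page.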

.. note::

   You may need to increase the RAM resource limits for Docker if the
   script fails. Use the :ref:`docker-compose stop <docker-compose-stop>`
   command to stop any running instances of Docker if the script did not
   complete successfully.

Once the services have been started by the shell script, the Datagen
Connector publishes new events to Kafka at short intervals, which triggers
the following cycle:

#. The Datagen Connector publishes new events to Kafka.
#. The Sink Connector writes the events into MongoDB.
#. The Source Connector writes the change stream messages back into Kafka.

To view the Kafka topics, open the Confluent Control Center at
http://localhost:9021/ and navigate to the cluster topics.
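
If you prefer the command line, you can also list the topics with the
Kafka CLI tools inside the broker container. The service name ``broker``
and the listener port ``9092`` below are assumptions; adjust them to match
the service names in the compose file:

.. code-block:: shell

   # Assumes the Kafka service is named "broker" in docker-compose.yml
   # and exposes a listener on port 9092.
   docker-compose exec broker kafka-topics --list --bootstrap-server localhost:9092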

* The ``pageviews`` topic should contain documents added by the Datagen
  Connector that resemble the following:

  .. code-block:: json

     {
       "viewtime": {
         "$numberLong": "81"
       },
       "pageid": "Page_1",
       "userid": "User_8"
     }

* The ``mongo.test.pageviews`` topic should contain change events that
  resemble the following:

  .. code-block:: json

     {
       "_id": {
         "_data": "<resumeToken>"
       },
       "operationType": "insert",
       "clusterTime": {
         "$timestamp": {
           "t": 1563461814,
           "i": 4
         }
       },
       "fullDocument": {
         "_id": {
           "$oid": "5d3088b6bafa7829964150f3"
         },
         "viewtime": {
           "$numberLong": "81"
         },
         "pageid": "Page_1",
         "userid": "User_8"
       },
       "ns": {
         "db": "test",
         "coll": "pageviews"
       },
       "documentKey": {
         "_id": {
           "$oid": "5d3088b6bafa7829964150f3"
         }
       }
     }

Next, explore the collection data in the MongoDB replica set:

* In your local shell, navigate to the ``docker`` directory from which you
  ran the ``docker-compose`` commands and connect to the ``mongo1`` MongoDB
  instance using the following command:

  .. code-block:: shell

     docker-compose exec mongo1 /usr/bin/mongo

* If you insert or update a document in the ``test.pageviews`` collection,
  the Source Connector publishes a change event document to the
  ``mongo.test.pageviews`` Kafka topic, as shown in the sketch after this
  list.
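
For example, from the ``mongo`` shell opened above, you can insert a
document by hand; the field values below are made up for illustration:

.. code-block:: shell

   # Run inside the mongo shell connected to mongo1. The inserted values
   # are hypothetical and serve only to trigger a change event.
   use test
   db.pageviews.insertOne({ "pageid": "Page_99", "userid": "User_42", "viewtime": NumberLong(42) })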

.. _docker-compose-stop:

To stop the Docker containers and all the processes running on them, use
Ctrl-C in the shell running the script, or the following command:

.. code-block:: shell

   docker-compose stop

To remove the Docker containers and images completely, use the following
command:

.. code-block:: shell

   docker-compose down
Lines changed: 87 additions & 0 deletions

.. _kafka-installation:

===============================
Install MongoDB Kafka Connector
===============================

.. default-domain:: mongodb

.. contents:: On this page
   :local:
   :backlinks: none
   :depth: 2
   :class: singlecol

Overview
--------

The MongoDB Kafka Connector is available for both `Confluent Kafka
<https://www.confluent.io/product/confluent-platform/>`_ and `Apache Kafka
<https://kafka.apache.org/>`_ deployments.

Use the :ref:`Confluent Kafka installation instructions
<kafka-connector-install-confluent>` for a Confluent Kafka deployment or
the :ref:`Apache Kafka installation instructions
<kafka-connector-install-apache>` for an Apache Kafka deployment.

.. _kafka-connector-install-confluent:

Install the Connector for Confluent Kafka
-----------------------------------------

Install Using the Confluent Hub Client
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1. Install the `Confluent Hub Client
   <https://docs.confluent.io/current/connect/managing/confluent-hub/client.html>`_
   if necessary.

2. Install the `MongoDB Connector for Apache Kafka
   <https://www.confluent.io/hub/mongodb/kafka-connect-mongodb>`_
   component using the Confluent Hub Client.
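
A sketch of the resulting installation command follows; the ``latest``
version tag is an assumption, and you can pin a specific release instead:

.. code-block:: shell

   # Install the connector from Confluent Hub; "latest" can be replaced
   # with a pinned version.
   confluent-hub install mongodb/kafka-connect-mongodb:latest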

Install Manually
~~~~~~~~~~~~~~~~

1. Follow the directions on the Confluent page for `Manually Installing
   Community Connectors
   <https://docs.confluent.io/current/connect/managing/community.html#manually-installing-community-connectors/>`_.

2. Use the GitHub URL and uber JAR locations in the :ref:`installation
   reference table <kafka-connector-installation-reference>` when prompted
   in the Confluent manual installation instructions.

.. _kafka-connector-install-apache:

Install the Connector for Apache Kafka
--------------------------------------

1. Locate and download the uber JAR, which is suffixed with ``all``, to
   obtain all the dependencies required for the connector. Refer to the
   :ref:`installation reference table
   <kafka-connector-installation-reference>` for the uber JAR location.

2. Copy the uber JAR into the Kafka plugins directory, for example
   ``/usr/local/share/kafka/plugins/``, as in the sketch after this step.
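
A sketch of the copy step follows. The JAR file name is a placeholder for
the version you downloaded, and ``plugin.path`` is the standard Kafka
Connect worker setting, not something specific to this connector:

.. code-block:: shell

   # Copy the downloaded uber JAR (file name is a placeholder) into the
   # plugin directory on each Connect worker.
   cp mongo-kafka-connect-<version>-all.jar /usr/local/share/kafka/plugins/

   # Ensure the worker configuration points at the plugin directory, for
   # example in config/connect-standalone.properties:
   #   plugin.path=/usr/local/share/kafka/plugins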

.. note::

   If you are running distributed worker processes, you must repeat this
   process for each server or VM.

.. _kafka-connector-installation-reference:

Installation Reference Table
----------------------------

.. list-table::
   :stub-columns: 1

   * - Connector GitHub repository
     - `mongodb/mongo-kafka <https://github.com/mongodb/mongo-kafka>`_

   * - Uber JAR (Maven Central)
     - `mongo-kafka-connect <https://search.maven.org/search?q=g:org.mongodb.kafka%20AND%20a:mongo-kafka-connect>`_

   * - Uber JAR (Sonatype OSS)
     - `mongodb-kafka
       <https://oss.sonatype.org/#nexus-search;quick~org.mongodb.kafka>`_
