==========
Map-Reduce
==========

.. default-domain:: mongodb

Map-reduce operations can handle complex aggregation
tasks. [#simple-aggregation-use-framework]_ To perform map-reduce operations,
MongoDB provides the :dbcommand:`mapReduce` command and, in the
:program:`mongo` shell, the wrapper :method:`db.collection.mapReduce()`
method.

This overview will cover:

- :ref:`map-reduce-method`

- :ref:`map-reduce-examples`

- :ref:`map-reduce-incremental`

- :ref:`map-reduce-sharded-cluster`

- :ref:`map-reduce-additional-references`

.. _map-reduce-method:

mapReduce()
-----------

.. include:: /reference/method/db.collection.mapReduce.txt
   :start-after: mongodb
   :end-before: mapReduce-syntax-end

.. _map-reduce-examples:

Map-Reduce Examples
-------------------

.. include:: /includes/examples-map-reduce.rst
   :start-after: map-reduce-examples-begin
   :end-before: map-reduce-sum-price-wrapper-end

.. include:: /includes/examples-map-reduce.rst
   :start-after: map-reduce-sum-price-cmd-end
   :end-before: map-reduce-item-counts-avg-wrapper-end

.. _map-reduce-incremental:

Incremental Map-Reduce
----------------------

If the map-reduce dataset is constantly growing, you may want to
perform an incremental map-reduce rather than running the map-reduce
operation over the entire dataset each time.

To perform incremental map-reduce:

#. Run a map-reduce job over the current collection and output the
   result to a separate collection.

#. When you have more data to process, run subsequent map-reduce jobs
   with:

   - the ``<query>`` parameter that specifies conditions that match
     *only* the new documents.

   - the ``<out>`` parameter that specifies the ``reduce`` action to
     merge the new results into the existing output collection.

Consider the following example where you schedule a map-reduce
operation on a ``sessions`` collection to run at the end of each day.

**Data Setup**

The ``sessions`` collection contains documents that log users' sessions
each day and can be simulated as follows:

.. code-block:: javascript

   db.sessions.save( { userid: "a", ts: ISODate('2011-11-03 14:17:00'), length: 95 } );
   db.sessions.save( { userid: "b", ts: ISODate('2011-11-03 14:23:00'), length: 110 } );
   db.sessions.save( { userid: "c", ts: ISODate('2011-11-03 15:02:00'), length: 120 } );
   db.sessions.save( { userid: "d", ts: ISODate('2011-11-03 16:45:00'), length: 45 } );

   db.sessions.save( { userid: "a", ts: ISODate('2011-11-04 11:05:00'), length: 105 } );
   db.sessions.save( { userid: "b", ts: ISODate('2011-11-04 13:14:00'), length: 120 } );
   db.sessions.save( { userid: "c", ts: ISODate('2011-11-04 17:00:00'), length: 130 } );
   db.sessions.save( { userid: "d", ts: ISODate('2011-11-04 15:37:00'), length: 65 } );

**Initial Map-Reduce of Current Collection**

#. Define the ``<map>`` function that maps the ``userid`` to an
   object that contains the fields ``userid``, ``total_time``, ``count``,
   and ``avg_time``:

   .. code-block:: javascript

      var mapFunction = function() {
          var key = this.userid;
          var value = {
              userid: this.userid,
              total_time: this.length,
              count: 1,
              avg_time: 0
          };

          emit( key, value );
      };

#. Define the corresponding ``<reduce>`` function with two arguments
   ``key`` and ``values`` to calculate the total time and the count.
   The ``key`` corresponds to the ``userid``, and ``values`` is an
   array whose elements correspond to the individual objects mapped to
   the ``userid`` in the ``mapFunction``:

   .. code-block:: javascript

      var reduceFunction = function(key, values) {

          var reducedObject = {
              userid: key,
              total_time: 0,
              count: 0,
              avg_time: 0
          };

          values.forEach( function(value) {
              reducedObject.total_time += value.total_time;
              reducedObject.count += value.count;
          } );

          return reducedObject;
      };

#. Define the ``<finalize>`` function with two arguments ``key`` and
   ``reducedValue``. The function computes the ``avg_time`` field of
   the ``reducedValue`` document and returns the modified document:

   .. code-block:: javascript

      var finalizeFunction = function(key, reducedValue) {

          if (reducedValue.count > 0)
              reducedValue.avg_time = reducedValue.total_time / reducedValue.count;

          return reducedValue;
      };

#. Perform map-reduce on the ``sessions`` collection using the
   ``mapFunction``, the ``reduceFunction``, and the
   ``finalizeFunction`` functions, and output the results to the
   collection ``session_stat``. Because the ``<out>`` parameter
   specifies the ``reduce`` action, if the ``session_stat`` collection
   already exists, the operation merges the new results with the
   existing contents:

   .. code-block:: javascript

      db.runCommand(
          {
            mapreduce: "sessions",
            map: mapFunction,
            reduce: reduceFunction,
            out: { reduce: "session_stat" },
            finalize: finalizeFunction
          }
      );
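
Outside the shell, the interaction of the three functions can be
sketched as a small in-memory simulation in plain JavaScript. The
``mapReduce`` helper and the inline ``sessions`` array below are
illustrative assumptions, not part of MongoDB's API:

```javascript
// In-memory sketch of the map/reduce/finalize pipeline (an illustration
// only; the server runs these functions itself).
var emitted;
function emit(key, value) {        // collects values per key, like the shell's emit()
    (emitted[key] = emitted[key] || []).push(value);
}

var mapFunction = function() {
    emit(this.userid, { userid: this.userid, total_time: this.length,
                        count: 1, avg_time: 0 });
};

var reduceFunction = function(key, values) {
    var reducedObject = { userid: key, total_time: 0, count: 0, avg_time: 0 };
    values.forEach(function(value) {
        reducedObject.total_time += value.total_time;
        reducedObject.count += value.count;
    });
    return reducedObject;
};

var finalizeFunction = function(key, reducedValue) {
    if (reducedValue.count > 0)
        reducedValue.avg_time = reducedValue.total_time / reducedValue.count;
    return reducedValue;
};

function mapReduce(docs) {         // hypothetical helper, not a MongoDB call
    emitted = {};
    docs.forEach(function(doc) { mapFunction.call(doc); });  // map: `this` is the doc
    var out = {};
    Object.keys(emitted).forEach(function(key) {
        out[key] = finalizeFunction(key, reduceFunction(key, emitted[key]));
    });
    return out;
}

// The eight sample session documents from the data setup above:
var sessions = [
    { userid: "a", length: 95 },  { userid: "b", length: 110 },
    { userid: "c", length: 120 }, { userid: "d", length: 45 },
    { userid: "a", length: 105 }, { userid: "b", length: 120 },
    { userid: "c", length: 130 }, { userid: "d", length: 65 }
];

var stats = mapReduce(sessions);
console.log(stats.a);  // { userid: 'a', total_time: 200, count: 2, avg_time: 100 }
```

Note one simplification: this sketch always calls the reduce function,
whereas the server may skip it for a key with a single emitted value.
That is one reason a real reduce function must return a value with the
same shape as the values it receives.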

**Subsequent Incremental Map-Reduce**

Assume that the next day, the ``sessions`` collection grows by the
following documents:

.. code-block:: javascript

   db.sessions.save( { userid: "a", ts: ISODate('2011-11-05 14:17:00'), length: 100 } );
   db.sessions.save( { userid: "b", ts: ISODate('2011-11-05 14:23:00'), length: 115 } );
   db.sessions.save( { userid: "c", ts: ISODate('2011-11-05 15:02:00'), length: 125 } );
   db.sessions.save( { userid: "d", ts: ISODate('2011-11-05 16:45:00'), length: 55 } );

5. At the end of the day, perform an incremental map-reduce on the
   ``sessions`` collection, but use the ``query`` field to select only
   the new documents. Output the results to the collection
   ``session_stat``, where the ``reduce`` action merges them with the
   existing contents:

   .. code-block:: javascript

      db.runCommand( {
          mapreduce: "sessions",
          map: mapFunction,
          reduce: reduceFunction,
          query: { ts: { $gt: ISODate('2011-11-05 00:00:00') } },
          out: { reduce: "session_stat" },
          finalize: finalizeFunction
      } );
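
The incremental approach is correct only because the reduce function
can consume its own prior output. The following plain-JavaScript check
(with illustrative numbers for user ``a``, not shell code) confirms
that folding the new day's result into the stored value matches a full
recompute:

```javascript
// The reduce function from the example, repeated so the sketch is
// self-contained:
var reduceFunction = function(key, values) {
    var r = { userid: key, total_time: 0, count: 0, avg_time: 0 };
    values.forEach(function(v) {
        r.total_time += v.total_time;
        r.count += v.count;
    });
    return r;
};

// Mapped values for user "a": two sessions on earlier days, one new.
var earlierDays = [ { total_time: 95, count: 1 }, { total_time: 105, count: 1 } ];
var newDay      = [ { total_time: 100, count: 1 } ];

// Full recompute over every document:
var full = reduceFunction("a", earlierDays.concat(newDay));

// Incremental: the value already stored in session_stat, then the
// per-key merge that out: { reduce: "session_stat" } performs:
var stored      = reduceFunction("a", earlierDays);
var incremental = reduceFunction("a", [ stored, reduceFunction("a", newDay) ]);

console.log(full.total_time, incremental.total_time);  // 300 300
console.log(full.count, incremental.count);            // 3 3
```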

.. _map-reduce-temporary-collection:

Temporary Collection
--------------------

The map-reduce operation uses a temporary collection during processing.
At completion, the temporary collection is atomically renamed to the
permanent name. Thus, you can perform a map-reduce operation
periodically with the same target collection name without ever exposing
a temporary state of incomplete data. This is very useful when
generating statistical output collections on a regular basis.

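The same build-then-swap idea can be sketched in plain JavaScript (a
pattern illustration under assumed names, not the server's
implementation): results are computed into a private object and
published with a single assignment, so readers never observe a
half-written result:

```javascript
// Hypothetical sketch of the build-then-swap pattern behind the
// temporary collection: compute everything into `temp`, then make it
// visible in one step (the equivalent of the atomic rename).
var published = { a: { avg_time: 100 } };   // last run's complete output

function recompute(newStats) {
    var temp = {};                          // the "temporary collection"
    Object.keys(newStats).forEach(function(key) {
        temp[key] = newStats[key];          // ... full rebuild happens here ...
    });
    published = temp;                       // single-step swap: the "rename"
}

recompute({ a: { avg_time: 102.5 }, b: { avg_time: 115 } });
console.log(published.b.avg_time);          // 115
```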
.. _map-reduce-sharded-cluster:

Sharded Cluster
---------------

Sharded Input
~~~~~~~~~~~~~

If the input collection is sharded, :program:`mongos` will
automatically dispatch the map-reduce job to each shard to be executed
in parallel. There is no special option required. :program:`mongos`
will wait for jobs on all shards to finish.

Sharded Output
~~~~~~~~~~~~~~

By default the output collection will not be sharded. The process is:

- :program:`mongos` dispatches a map-reduce finish job to the shard
  that will store the target collection.

- The target shard will pull results from all other shards, run a final
  reduce/finalize, and write to the output.

- If using the ``sharded`` option in the ``<out>`` parameter, the
  output will be sharded using ``_id`` as the shard key.

.. versionchanged:: 2.2

- If the output collection does not exist, the collection is created
  and sharded on the ``_id`` field. Even if empty, its initial chunks
  are created based on the result of the first step of the map-reduce
  operation.

- :program:`mongos` dispatches, in parallel, a map-reduce finish job
  to every shard that owns a chunk.

- Each shard will pull the results it owns from all other shards, run a
  final reduce/finalize, and write to the output collection.

.. note::

   - During additional map-reduce jobs, chunk splitting will be done as
     needed.

   - Balancing of chunks for the output collection is automatically
     prevented during post-processing to avoid concurrency issues.

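The finish phase above can be simulated in a few lines of plain
JavaScript (illustrative partial results and key ownership, not
:program:`mongos` internals): each output shard pulls the partial
values for the keys it owns from every shard and runs one final reduce:

```javascript
// Simulation of the sharded finish phase: per-key partial results exist
// on each input shard; each output shard final-reduces the keys it owns.
var reduceFunction = function(key, values) {
    var r = { total_time: 0, count: 0 };
    values.forEach(function(v) {
        r.total_time += v.total_time;
        r.count += v.count;
    });
    return r;
};

// Partial results produced independently on two input shards:
var shardPartials = [
    { a: { total_time: 200, count: 2 }, b: { total_time: 110, count: 1 } },
    { a: { total_time: 100, count: 1 }, b: { total_time: 120, count: 1 } }
];

// Assume the output chunks assign key "a" to one shard and "b" to another:
function finishOnShard(ownedKeys) {
    var out = {};
    ownedKeys.forEach(function(key) {
        var pulled = shardPartials                    // pull from every shard ...
            .filter(function(p) { return key in p; })
            .map(function(p) { return p[key]; });
        out[key] = reduceFunction(key, pulled);       // ... then final reduce
    });
    return out;
}

var shard0 = finishOnShard([ "a" ]);   // runs in parallel with ...
var shard1 = finishOnShard([ "b" ]);   // ... the other owning shard

console.log(shard0.a.total_time, shard1.b.count);  // 300 2
```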
Prior to version 2.1:

- :program:`mongos` retrieves the results from each shard, performs a
  merge sort to order the results, and runs a reduce/finalize as
  needed. :program:`mongos` then writes the result to the output
  collection in sharded mode.

- Only a small amount of memory is required even for large datasets.

- Shard chunks do not get automatically split and migrated during
  insertion. Manual intervention is required until the chunks are
  granular and balanced.

.. warning::

   Sharded output for map-reduce was overhauled in version 2.2. Its
   use in earlier versions is not recommended.

.. _map-reduce-additional-references:

Additional References
---------------------

.. seealso::

   - :wiki:`Map-Reduce Concurrency
     <How+does+concurrency+work#Howdoesconcurrencywork-MapReduce>`

   - `MapReduce, Geospatial Indexes, and Other Cool Features <http://www.slideshare.net/mongosf/mapreduce-geospatial-indexing-and-other-cool-features-kristina-chodorow>`_ - Kristina Chodorow at MongoSF (April 2010)

   - :wiki:`Troubleshooting MapReduce`

.. [#simple-aggregation-use-framework] For many simple aggregation
   tasks, see the :doc:`aggregation framework
   </applications/aggregation>`.