  About |service| |data-lake|
  ---------------------------

- MongoDB |service| |data-lake| allows you to natively query and analyze
- data across |aws| |s3| and MongoDB |service|. You can query your richly
- structured data stored in |json|, |bson|, CSV, TSV, Avro, ORC, and
- Parquet formats using the |mongo| shell, :dl:`MongoDB Compass
- <compass>`, or any
- :driver:`MongoDB driver </>` without data movement or transformation.
+ MongoDB |service| |data-lake| allows you to natively query, transform,
+ and move data across |aws| |s3| and MongoDB |service| clusters. You can
+ query your richly structured data stored in |json|, |bson|, CSV, TSV,
+ Avro, ORC, and Parquet formats using the |mongo| shell, :dl:`MongoDB
+ Compass <compass>`, or any :driver:`MongoDB driver </>`.
+
+ Sample Uses
+ -----------
+
+ You can use {+adl+} to:
+
+ - Convert richly structured MongoDB data into columnar Parquet or CSV
+   files.
+ - `Query across multiple Atlas clusters <https://developer.mongodb.com/how-to/query-multiple-databases-with-atlas-data-lake/?tck=featlearn>`__
+   to get a holistic view of your data.
+ - Materialize aggregations from MongoDB or |s3| data.
+ - Automatically import data from your |s3| bucket into an |service|
+   cluster.

  |data-lake| Access
  ------------------

  When you create a |data-lake|, you grant |service| either read only or
- read and write access to |s3| buckets in your |aws| account. To access your
- |service| clusters, |service| uses your existing :manual:`Role Based Access
- Controls</core/authorization>`. You can view and edit the generated data
- storage :ref:`configuration <datalake-configuration-file>` that maps data from
- your |s3| buckets and |service| clusters to virtual databases and collections.
+ read and write access to |s3| buckets in your |aws| account. To access
+ your |service| clusters, |service| uses your existing :manual:`Role
+ Based Access Controls</core/authorization>`. You can view and edit the
+ generated data storage :ref:`configuration
+ <datalake-configuration-file>` that maps data from your |s3| buckets
+ and |service| clusters to virtual databases and collections.

- A database user must have one of the following roles to query an |service|
- |data-lake|:
+ A database user must have one of the following roles to query an
+ |service| |data-lake|:

  - :atlas:`readWriteAnyDatabase </security-add-mongodb-users/#readWriteAnyDatabase>`

@@ -52,7 +65,8 @@ A database user must have one of the following roles to query an |service|
  Prerequisites
  -------------

- Verify that you meet the following prerequisites before you create a |data-lake|:
+ Verify that you meet the following prerequisites before you create a
+ |data-lake|:

  * One or more |aws| |s3| buckets in the same |aws| account.

@@ -92,9 +106,9 @@ Total Data Processed
  ~~~~~~~~~~~~~~~~~~~~

  |service| charges for the total number of bytes that |data-lake|
- processes from your |aws| S3 buckets, rounded up to the nearest megabyte.
- |service| charges **$5.00 per TB** of processed data, with a minimum of
- **10 MB** or **$0.00005 per query**.
+ processes from your |aws| S3 buckets, rounded up to the nearest
+ megabyte. |service| charges **$5.00 per TB** of processed data, with a
+ minimum of **10 MB** or **$0.00005 per query**.

  You can use partitioning strategies and compression in |aws| |s3| to
  reduce the amount of data processed.
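
As a rough illustration of the pricing rule described in this hunk, the following Python sketch estimates the charge for a single query. It assumes decimal units (1 MB = 1,000,000 bytes and 1 TB = 1,000,000 MB). That reading is consistent with the stated minimum, since 10 MB at $5.00 per TB comes to $0.00005, but the page itself does not confirm it. The function and constant names are hypothetical and are not part of any MongoDB API.

```python
# Sketch of the per-query pricing rule; all names here are illustrative.
BYTES_PER_MB = 1_000_000   # assumption: decimal megabytes
MB_PER_TB = 1_000_000      # assumption: decimal terabytes
PRICE_PER_TB_USD = 5.00    # from the text: $5.00 per TB processed
MIN_BILLED_MB = 10         # from the text: 10 MB minimum per query


def billed_megabytes(bytes_processed: int) -> int:
    """Round processed bytes up to whole megabytes, then apply the minimum."""
    if bytes_processed < 0:
        raise ValueError("bytes_processed must be non-negative")
    # Integer ceiling division avoids floating-point rounding surprises.
    rounded_mb = (bytes_processed + BYTES_PER_MB - 1) // BYTES_PER_MB
    return max(rounded_mb, MIN_BILLED_MB)


def estimate_query_cost_usd(bytes_processed: int) -> float:
    """Estimate the charge in US dollars for one query."""
    return billed_megabytes(bytes_processed) * PRICE_PER_TB_USD / MB_PER_TB


if __name__ == "__main__":
    for size in (0, 3_500_000, 10_000_001, 10**12):
        print(size, billed_megabytes(size), estimate_query_cost_usd(size))
```

Under these assumptions, a query that scans less than 10 MB is billed as 10 MB, so partitioning and compression reduce cost only for queries that would otherwise process more than that minimum.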