Commit aea79b9

[FLINK-22119][hive][doc] Update document for hive dialect

This closes #15630

Parent: a4dcd91

File tree

4 files changed: +192 -30 lines


docs/content.zh/docs/connectors/table/hive/hive_dialect.md

Lines changed: 70 additions & 11 deletions
@@ -335,26 +335,85 @@ CREATE FUNCTION function_name AS class_name;
 DROP FUNCTION [IF EXISTS] function_name;
 ```
 
-## DML
+## DML & DQL _`Beta`_
 
-### INSERT
+Hive dialect supports the commonly-used Hive [DML](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML)
+and [DQL](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select). The following lists some of the syntax supported by the Hive dialect.
 
-```sql
-INSERT (INTO|OVERWRITE) [TABLE] table_name [PARTITION partition_spec] SELECT ...;
-```
+- [SORT/CLUSTER/DISTRIBUTE BY](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SortBy)
+- [Group By](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+GroupBy)
+- [Join](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Joins)
+- [Union](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Union)
+- [LATERAL VIEW](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView)
+- [Window Functions](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics)
+- [SubQueries](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SubQueries)
+- [CTE](https://cwiki.apache.org/confluence/display/Hive/Common+Table+Expression)
+- [INSERT INTO dest schema](https://issues.apache.org/jira/browse/HIVE-9481)
+- [Implicit type conversions](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-AllowedImplicitConversions)
+
+In order to have better syntax and semantic compatibility, it is highly recommended to use [HiveModule]({{< ref "docs/connectors/table/hive/hive_functions" >}}#use-hive-built-in-functions-via-hivemodule)
+and place it first in the module list, so that Hive built-in functions are picked up first during function resolution.
 
-If `partition_spec` is specified, it can be a full or partial spec. If it is a partial spec, the dynamic partition column names can be omitted.
+Hive dialect no longer supports [Flink SQL queries]({{< ref "docs/dev/table/sql/queries" >}}). Please switch to the `default` dialect if you'd like to write in Flink syntax.
+
+The following is an example of using the Hive dialect.
+
+```bash
+Flink SQL> create catalog myhive with ('type' = 'hive', 'hive-conf-dir' = '/opt/hive-conf');
+[INFO] Execute statement succeed.
 
-## DQL
+Flink SQL> use catalog myhive;
+[INFO] Execute statement succeed.
 
-At the moment, the Hive dialect supports the same syntax as Flink SQL for DQL statements. Refer to [Flink SQL queries]({{< ref "docs/dev/table/sql/queries" >}}) for more details, and it is recommended to switch to the `default` dialect to execute DQL statements.
+Flink SQL> load module hive;
+[INFO] Execute statement succeed.
+
+Flink SQL> use modules hive,core;
+[INFO] Execute statement succeed.
+
+Flink SQL> set table.sql-dialect=hive;
+[INFO] Session property has been set.
+
+Flink SQL> select explode(array(1,2,3)); -- call hive udtf
++-----+
+| col |
++-----+
+| 1 |
+| 2 |
+| 3 |
++-----+
+3 rows in set
+
+Flink SQL> create table tbl (key int,value string);
+[INFO] Execute statement succeed.
+
+Flink SQL> insert overwrite table tbl values (5,'e'),(1,'a'),(1,'a'),(3,'c'),(2,'b'),(3,'c'),(3,'c'),(4,'d');
+[INFO] Submitting SQL update statement to the cluster...
+[INFO] SQL update statement has been successfully submitted to the cluster:
+
+Flink SQL> select * from tbl cluster by key; -- run cluster by
+2021-04-22 16:13:57,005 INFO  org.apache.hadoop.mapred.FileInputFormat [] - Total input paths to process : 1
++-----+-------+
+| key | value |
++-----+-------+
+| 1 | a |
+| 1 | a |
+| 5 | e |
+| 2 | b |
+| 3 | c |
+| 3 | c |
+| 3 | c |
+| 4 | d |
++-----+-------+
+8 rows in set
+```
 
 ## Notice
 
 The following are some precautions for using the Hive dialect.
 
-- Hive dialect should only be used to manipulate Hive tables, not generic tables, and should be used together with a [HiveCatalog]({{< ref "docs/connectors/table/hive/hive_catalog" >}}).
+- Hive dialect should only be used to process Hive meta objects, and requires the current catalog to be a [HiveCatalog]({{< ref "docs/connectors/table/hive/hive_catalog" >}}).
+- Hive dialect only supports 2-part identifiers such as `db.table`; identifiers that carry a catalog name are not supported.
 - While all Hive versions support the same syntax, whether a specific feature is available still depends on the [Hive version]({{< ref "docs/connectors/table/hive/overview" >}}#支持的hive版本) you use. For example, updating database location is only supported in Hive-2.4.0 or later.
-- Hive and Calcite have different sets of reserved keywords. For example, `default` is a reserved keyword in Calcite and a non-reserved keyword in Hive. Even with the Hive dialect, you have to quote such keywords with backticks ( ` ) in order to use them as identifiers.
-- Due to expanded query incompatibility, views created in Flink cannot be queried in Hive.
+- Use [HiveModule]({{< ref "docs/connectors/table/hive/hive_functions" >}}#use-hive-built-in-functions-via-hivemodule) when executing DML and DQL.
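The 2-part-identifier restriction introduced above can be illustrated with a short sketch; the catalog, database, and table names here are hypothetical:

```sql
-- Make the HiveCatalog the current catalog first; the Hive dialect
-- cannot reference a catalog by name inside an identifier.
use catalog myhive;
set table.sql-dialect=hive;

select * from mydb.tbl;            -- OK: db.table is a 2-part identifier
-- select * from myhive.mydb.tbl;  -- fails: 3-part identifiers are not supported
```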

docs/content.zh/docs/connectors/table/hive/overview.md

Lines changed: 24 additions & 0 deletions
@@ -127,6 +127,9 @@ export HADOOP_CLASSPATH=`hadoop classpath`
 // Hive dependencies
 hive-exec-2.3.4.jar
 
+// add antlr-runtime if you need to use hive dialect
+antlr-runtime-3.5.2.jar
+
 ```
 {{< /tab >}}
 {{< tab "Hive 1.0.0" >}}
@@ -146,6 +149,9 @@ export HADOOP_CLASSPATH=`hadoop classpath`
 orc-core-1.4.3-nohive.jar
 aircompressor-0.8.jar // transitive dependency of orc-core
 
+// add antlr-runtime if you need to use hive dialect
+antlr-runtime-3.5.2.jar
+
 ```
 {{< /tab >}}
 {{< tab "Hive 1.1.0" >}}
@@ -165,6 +171,9 @@ export HADOOP_CLASSPATH=`hadoop classpath`
 orc-core-1.4.3-nohive.jar
 aircompressor-0.8.jar // transitive dependency of orc-core
 
+// add antlr-runtime if you need to use hive dialect
+antlr-runtime-3.5.2.jar
+
 ```
 {{< /tab >}}
 {{< tab "Hive 1.2.1" >}}
@@ -184,6 +193,9 @@ export HADOOP_CLASSPATH=`hadoop classpath`
 orc-core-1.4.3-nohive.jar
 aircompressor-0.8.jar // transitive dependency of orc-core
 
+// add antlr-runtime if you need to use hive dialect
+antlr-runtime-3.5.2.jar
+
 ```
 {{< /tab >}}
 {{< tab "Hive 2.0.0" >}}
@@ -197,6 +209,9 @@ export HADOOP_CLASSPATH=`hadoop classpath`
 // Hive dependencies
 hive-exec-2.0.0.jar
 
+// add antlr-runtime if you need to use hive dialect
+antlr-runtime-3.5.2.jar
+
 ```
 {{< /tab >}}
 {{< tab "Hive 2.1.0" >}}
@@ -210,6 +225,9 @@ export HADOOP_CLASSPATH=`hadoop classpath`
 // Hive dependencies
 hive-exec-2.1.0.jar
 
+// add antlr-runtime if you need to use hive dialect
+antlr-runtime-3.5.2.jar
+
 ```
 {{< /tab >}}
 {{< tab "Hive 2.2.0" >}}
@@ -227,6 +245,9 @@ export HADOOP_CLASSPATH=`hadoop classpath`
 orc-core-1.4.3.jar
 aircompressor-0.8.jar // transitive dependency of orc-core
 
+// add antlr-runtime if you need to use hive dialect
+antlr-runtime-3.5.2.jar
+
 ```
 {{< /tab >}}
 {{< tab "Hive 3.1.0" >}}
@@ -241,6 +262,9 @@ export HADOOP_CLASSPATH=`hadoop classpath`
 hive-exec-3.1.0.jar
 libfb303-0.9.3.jar // libfb303 is not packed into hive-exec in some versions, need to add it separately
 
+// add antlr-runtime if you need to use hive dialect
+antlr-runtime-3.5.2.jar
+
 ```
 {{< /tab >}}
 {{< /tabs >}}

docs/content/docs/connectors/table/hive/hive_dialect.md

Lines changed: 74 additions & 19 deletions
@@ -300,8 +300,6 @@ CREATE VIEW [IF NOT EXISTS] view_name [(column_name, ...) ]
 
 #### Alter
 
-**NOTE**: Altering view only works in Table API, but not supported via SQL client.
-
 ##### Rename
 
 ```sql
@@ -346,33 +344,90 @@ CREATE FUNCTION function_name AS class_name;
 DROP FUNCTION [IF EXISTS] function_name;
 ```
 
-## DML
+## DML & DQL _`Beta`_
 
-### INSERT
+Hive dialect supports a commonly-used subset of Hive's [DML](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML)
+and [DQL](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select). The following lists some examples of
+HiveQL supported by the Hive dialect.
 
-```sql
-INSERT (INTO|OVERWRITE) [TABLE] table_name [PARTITION partition_spec] SELECT ...;
-```
+- [SORT/CLUSTER/DISTRIBUTE BY](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SortBy)
+- [Group By](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+GroupBy)
+- [Join](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Joins)
+- [Union](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Union)
+- [LATERAL VIEW](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView)
+- [Window Functions](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics)
+- [SubQueries](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SubQueries)
+- [CTE](https://cwiki.apache.org/confluence/display/Hive/Common+Table+Expression)
+- [INSERT INTO dest schema](https://issues.apache.org/jira/browse/HIVE-9481)
+- [Implicit type conversions](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-AllowedImplicitConversions)
+
+In order to have better syntax and semantic compatibility, it's highly recommended to use [HiveModule]({{< ref "docs/connectors/table/hive/hive_functions" >}}#use-hive-built-in-functions-via-hivemodule)
+and place it first in the module list, so that Hive built-in functions can be picked up during function resolution.
+
+Hive dialect no longer supports [Flink SQL queries]({{< ref "docs/dev/table/sql/queries" >}}). Please switch to the `default`
+dialect if you'd like to write in Flink syntax.
+
+The following is an example of using the Hive dialect to run some queries.
 
-The `partition_spec`, if present, can be either a full spec or partial spec. If the `partition_spec` is a partial
-spec, the dynamic partition column names can be omitted.
+```bash
+Flink SQL> create catalog myhive with ('type' = 'hive', 'hive-conf-dir' = '/opt/hive-conf');
+[INFO] Execute statement succeed.
+
+Flink SQL> use catalog myhive;
+[INFO] Execute statement succeed.
+
+Flink SQL> load module hive;
+[INFO] Execute statement succeed.
 
-## DQL
+Flink SQL> use modules hive,core;
+[INFO] Execute statement succeed.
 
-At the moment, Hive dialect supports the same syntax as Flink SQL for DQLs. Refer to
-[Flink SQL queries]({{< ref "docs/dev/table/sql/queries" >}}) for more details. And it's recommended to switch to
-`default` dialect to execute DQLs.
+Flink SQL> set table.sql-dialect=hive;
+[INFO] Session property has been set.
+
+Flink SQL> select explode(array(1,2,3)); -- call hive udtf
++-----+
+| col |
++-----+
+| 1 |
+| 2 |
+| 3 |
++-----+
+3 rows in set
+
+Flink SQL> create table tbl (key int,value string);
+[INFO] Execute statement succeed.
+
+Flink SQL> insert overwrite table tbl values (5,'e'),(1,'a'),(1,'a'),(3,'c'),(2,'b'),(3,'c'),(3,'c'),(4,'d');
+[INFO] Submitting SQL update statement to the cluster...
+[INFO] SQL update statement has been successfully submitted to the cluster:
+
+Flink SQL> select * from tbl cluster by key; -- run cluster by
+2021-04-22 16:13:57,005 INFO  org.apache.hadoop.mapred.FileInputFormat [] - Total input paths to process : 1
++-----+-------+
+| key | value |
++-----+-------+
+| 1 | a |
+| 1 | a |
+| 5 | e |
+| 2 | b |
+| 3 | c |
+| 3 | c |
+| 3 | c |
+| 4 | d |
++-----+-------+
+8 rows in set
+```
 
 ## Notice
 
 The following are some precautions for using the Hive dialect.
 
-- Hive dialect should only be used to manipulate Hive tables, not generic tables. And Hive dialect should be used together
-with a [HiveCatalog]({{< ref "docs/connectors/table/hive/hive_catalog" >}}).
+- Hive dialect should only be used to process Hive meta objects, and requires the current catalog to be a
+[HiveCatalog]({{< ref "docs/connectors/table/hive/hive_catalog" >}}).
+- Hive dialect only supports 2-part identifiers, so you can't specify a catalog name in an identifier.
 - While all Hive versions support the same syntax, whether a specific feature is available still depends on the
[Hive version]({{< ref "docs/connectors/table/hive/overview" >}}#supported-hive-versions) you use. For example, updating database
location is only supported in Hive-2.4.0 or later.
-- Hive and Calcite have different sets of reserved keywords. For example, `default` is a reserved keyword in Calcite and
-a non-reserved keyword in Hive. Even with Hive dialect, you have to quote such keywords with backtick ( ` ) in order to
-use them as identifiers.
-- Due to expanded query incompatibility, views created in Flink cannot be queried in Hive.
+- Use [HiveModule]({{< ref "docs/connectors/table/hive/hive_functions" >}}#use-hive-built-in-functions-via-hivemodule)
+to run DML and DQL.
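Since the Hive dialect no longer accepts Flink SQL syntax, a session that needs both typically toggles `table.sql-dialect` between statement groups. A minimal sketch (the table name is hypothetical; `collect_list` is a Hive built-in resolved via HiveModule):

```sql
-- HiveQL statements run under the hive dialect.
set table.sql-dialect=hive;
select key, collect_list(value) from tbl group by key;

-- Switch back to the default dialect for queries written in Flink SQL syntax.
set table.sql-dialect=default;
select key, count(*) as cnt from tbl group by key;
```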

docs/content/docs/connectors/table/hive/overview.md

Lines changed: 24 additions & 0 deletions
@@ -131,6 +131,9 @@ Please find the required dependencies for different Hive major versions below.
 // Hive dependencies
 hive-exec-2.3.4.jar
 
+// add antlr-runtime if you need to use hive dialect
+antlr-runtime-3.5.2.jar
+
 ```
 {{< /tab >}}
 {{< tab "Hive 1.0.0" >}}
@@ -150,6 +153,9 @@ Please find the required dependencies for different Hive major versions below.
 orc-core-1.4.3-nohive.jar
 aircompressor-0.8.jar // transitive dependency of orc-core
 
+// add antlr-runtime if you need to use hive dialect
+antlr-runtime-3.5.2.jar
+
 ```
 {{< /tab >}}
 {{< tab "Hive 1.1.0" >}}
@@ -169,6 +175,9 @@ Please find the required dependencies for different Hive major versions below.
 orc-core-1.4.3-nohive.jar
 aircompressor-0.8.jar // transitive dependency of orc-core
 
+// add antlr-runtime if you need to use hive dialect
+antlr-runtime-3.5.2.jar
+
 ```
 {{< /tab >}}
 {{< tab "Hive 1.2.1" >}}
@@ -188,6 +197,9 @@ Please find the required dependencies for different Hive major versions below.
 orc-core-1.4.3-nohive.jar
 aircompressor-0.8.jar // transitive dependency of orc-core
 
+// add antlr-runtime if you need to use hive dialect
+antlr-runtime-3.5.2.jar
+
 ```
 {{< /tab >}}
 {{< tab "Hive 2.0.0" >}}
@@ -201,6 +213,9 @@ Please find the required dependencies for different Hive major versions below.
 // Hive dependencies
 hive-exec-2.0.0.jar
 
+// add antlr-runtime if you need to use hive dialect
+antlr-runtime-3.5.2.jar
+
 ```
 {{< /tab >}}
 {{< tab "Hive 2.1.0" >}}
@@ -214,6 +229,9 @@ Please find the required dependencies for different Hive major versions below.
 // Hive dependencies
 hive-exec-2.1.0.jar
 
+// add antlr-runtime if you need to use hive dialect
+antlr-runtime-3.5.2.jar
+
 ```
 {{< /tab >}}
 {{< tab "Hive 2.2.0" >}}
@@ -231,6 +249,9 @@ Please find the required dependencies for different Hive major versions below.
 orc-core-1.4.3.jar
 aircompressor-0.8.jar // transitive dependency of orc-core
 
+// add antlr-runtime if you need to use hive dialect
+antlr-runtime-3.5.2.jar
+
 ```
 {{< /tab >}}
 {{< tab "Hive 3.1.0" >}}
@@ -245,6 +266,9 @@ Please find the required dependencies for different Hive major versions below.
 hive-exec-3.1.0.jar
 libfb303-0.9.3.jar // libfb303 is not packed into hive-exec in some versions, need to add it separately
 
+// add antlr-runtime if you need to use hive dialect
+antlr-runtime-3.5.2.jar
+
 ```
 {{< /tab >}}
 {{< /tabs >}}
