Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
206 changes: 13 additions & 193 deletions docs/content.zh/docs/connectors/models/openai.md
Original file line number Diff line number Diff line change
Expand Up @@ -82,205 +82,15 @@ FROM ML_PREDICT(

### 公共选项

<table class="table table-bordered">
<thead>
<tr>
<th class="text-left" style="width: 25%">参数</th>
<th class="text-center" style="width: 10%">是否必选</th>
<th class="text-center" style="width: 10%">默认值</th>
<th class="text-center" style="width: 10%">数据类型</th>
<th class="text-center" style="width: 45%">描述</th>
</tr>
</thead>
<tbody>
<tr>
<td>
<h5>provider</h5>
</td>
<td>必填</td>
<td style="word-wrap: break-word;">(none)</td>
<td>String</td>
<td>指定使用的模型提供方,必须为 'openai'。</td>
</tr>
<tr>
<td>
<h5>endpoint</h5>
</td>
<td>必填</td>
<td style="word-wrap: break-word;">(none)</td>
<td>String</td>
<td>OpenAI API端点的完整URL,例如:<code>https://api.openai.com/v1/chat/completions</code> 或
<code>https://api.openai.com/v1/embeddings</code>。</td>
</tr>
<tr>
<td>
<h5>api-key</h5>
</td>
<td>必填</td>
<td style="word-wrap: break-word;">(none)</td>
<td>String</td>
<td>用于认证的OpenAI API密钥。</td>
</tr>
<tr>
<td>
<h5>model</h5>
</td>
<td>必填</td>
<td style="word-wrap: break-word;">(none)</td>
<td>String</td>
<td>模型名称,例如:<code>gpt-3.5-turbo</code>, <code>text-embedding-ada-002</code>。</td>
</tr>
<tr>
<td>
<h5>max-context-size</h5>
</td>
<td>可选</td>
<td style="word-wrap: break-word;">(none)</td>
<td>Integer</td>
<td>单个请求的最大上下文长度,单位为Token数量。当长度超过该值时,将使用context-overflow-action指定的溢出行为。</td>
</tr>
<tr>
<td>
<h5>context-overflow-action</h5>
</td>
<td>可选</td>
<td style="word-wrap: break-word;">(none)</td>
<td>String</td>
<td>处理上下文溢出的操作。支持的操作:
<ul>
<li><code>truncated-tail</code>(默认): 从上下文尾部截断超出的token。</li>
<li><code>truncated-tail-log</code>: 从上下文尾部截断超出的token。记录截断日志。</li>
<li><code>truncated-head</code>: 从上下文头部截断超出的token。</li>
<li><code>truncated-head-log</code>: 从上下文头部截断超出的token。记录截断日志。</li>
<li><code>skipped</code>: 跳过输入行。</li>
<li><code>skipped-log</code>: 跳过输入行。记录跳过日志。</li>
</ul>
</td>
</tr>
</tbody>
</table>
{{< generated/model_openai_common_section >}}

### Chat Completions

<table class="table table-bordered">
<thead>
<tr>
<th class="text-left" style="width: 25%">参数</th>
<th class="text-center" style="width: 10%">是否必选</th>
<th class="text-center" style="width: 10%">默认值</th>
<th class="text-center" style="width: 10%">数据类型</th>
<th class="text-center" style="width: 45%">描述</th>
</tr>
</thead>
<tbody>
<tr>
<td>
<h5>system-prompt</h5>
</td>
<td>可选</td>
<td style="word-wrap: break-word;">"You are a helpful assistant."</td>
<td>String</td>
<td>用于聊天任务的系统提示信息。</td>
</tr>
<tr>
<td>
<h5>temperature</h5>
</td>
<td>可选</td>
<td style="word-wrap: break-word;">null</td>
<td>Double</td>
<td>控制输出的随机性,取值范围<code>[0.0, 1.0]</code>。参考<a href="https://platform.openai.com/docs/api-reference/chat/create#chat-create-temperature">temperature</a></td>
</tr>
<tr>
<td>
<h5>top-p</h5>
</td>
<td>可选</td>
<td style="word-wrap: break-word;">null</td>
<td>Double</td>
<td>用于替代temperature的概率阈值。参考<a href="https://platform.openai.com/docs/api-reference/chat/create#chat-create-top_p">top_p</a></td>
</tr>
<tr>
<td>
<h5>stop</h5>
</td>
<td>可选</td>
<td style="word-wrap: break-word;">null</td>
<td>String</td>
<td>停止序列,逗号分隔的列表。参考<a href="https://platform.openai.com/docs/api-reference/chat/create#chat-create-stop">stop</a></td>
</tr>
<tr>
<td>
<h5>max-tokens</h5>
</td>
<td>可选</td>
<td style="word-wrap: break-word;">null</td>
<td>Long</td>
<td>生成的最大token数。参考<a href="https://platform.openai.com/docs/api-reference/chat/create#chat-create-max_tokens">max tokens</a></td>
</tr>
<tr>
<td>
<h5>presence-penalty</h5>
</td>
<td>可选</td>
<td style="word-wrap: break-word;">(none)</td>
<td>Double</td>
<td>数值范围为-2.0到2.0之间。正值会根据新token是否出现在当前文本中对其进行惩罚,从而增加模型讨论新话题的可能性。</td>
</tr>
<tr>
<td>
<h5>n</h5>
</td>
<td>可选</td>
<td style="word-wrap: break-word;">(none)</td>
<td>Long</td>
<td>为每个输入消息生成的聊天完成选项数量。请注意,您将根据所有选项生成的token数量进行收费。为最小化成本,需将n保持为1。</td>
</tr>
<tr>
<td>
<h5>seed</h5>
</td>
<td>可选</td>
<td style="word-wrap: break-word;">(none)</td>
<td>Long</td>
<td>如果指定,模型平台将尽最大努力进行确定性采样,使得使用相同种子和参数的重复请求应返回相同的结果。但不保证结果一定是确定的。</td>
</tr>
<tr>
<td>
<h5>response-format</h5>
</td>
<td>可选</td>
<td style="word-wrap: break-word;">(none)</td>
<td>Enum</td>
<td>响应的格式,例如 'text' 或 'json_object'。</td>
</tr>
</tbody>
</table>
{{< generated/model_openai_chat_section >}}

### Embeddings

<table class="table table-bordered">
<thead>
<tr>
<th class="text-left" style="width: 25%">参数</th>
<th class="text-center" style="width: 10%">是否必选</th>
<th class="text-center" style="width: 10%">默认值</th>
<th class="text-center" style="width: 10%">数据类型</th>
<th class="text-center" style="width: 45%">描述</th>
</tr>
</thead>
<tbody>
<tr>
<td>
<h5>dimension</h5>
</td>
<td>可选</td>
<td style="word-wrap: break-word;">null</td>
<td>Long</td>
<td>embedding向量的维度。参考<a href="https://platform.openai.com/docs/api-reference/embeddings/create#embeddings-create-dimensions">dimensions</a></td>
</tr>
</tbody>
</table>
{{< generated/model_openai_embedding_section >}}

## Schema要求

Expand All @@ -305,3 +115,13 @@ FROM ML_PREDICT(
</tr>
</tbody>
</table>

### 可用元数据

当配置 `error-handling-strategy` 为 `ignore` 时,您可以选择额外指定以下元数据列,将故障信息展示到您的输出流中。

* error-string(STRING):与错误相关的消息
* http-status-code(INT):HTTP状态码
* http-headers-map(MAP<STRING, ARRAY<STRING>>):响应返回的头部信息

如果您在Output Schema中定义了这些元数据列,但调用未失败,则这些列将填充为null值。
Loading