LLM_SIMILARITY

描述

判断两个文本之间的语义相似性

语法

LLM_LLM_SIMILARITY([<resource_name>], <text_1>, <text_2>)

参数

参数	说明
`<resource_name>`	指定的资源名称
`<text_1>`	文本
`<text_2>`	文本

返回值

返回一个 0 - 10 之间的浮点数。0 表示无相似性，10 表示强相似性。

当输入有值为 NULL 时返回 NULL

结果为大模型生成，所以返回内容并不固定

示例

假设我有如下表，代表某家快递公司收到的评论：

CREATE TABLE user_comments (
    id      INT,
    comment VARCHAR(500)
) DUPLICATE KEY(id)
DISTRIBUTED BY HASH(id) BUCKETS 10
PROPERTIES (
    "replication_num" = "1"
);

当我想按顾客语气情绪对评论进行排行时可以：

SELECT comment,
    LLM_SIMILARITY('resource_name', 'I am extremely dissatisfied with their service.', comment) AS score
FROM user_comments ORDER BY score DESC LIMIT 5;

查询结果大致如下：

+-------------------------------------------------+-------+
| comment                                         | score |
+-------------------------------------------------+-------+
| It arrived broken and I am really disappointed. |   7.5 |
| Delivery was very slow and frustrating.         |   6.5 |
| Not bad, but the packaging could be better.     |   3.5 |
| It is fine, nothing special to mention.         |     3 |
| Absolutely fantastic, highly recommend it.      |     1 |
+-------------------------------------------------+-------+

描述​

语法​

参数​

返回值​

示例​

描述

语法

参数

返回值

示例