LLM_SIMILARITY
Description
Determines the semantic similarity between two texts.
Syntax
LLM_SIMILARITY([<resource_name>], <text_1>, <text_2>)
Parameters
Parameter | Description |
---|---|
<resource_name> | Name of the resource to use. Optional, as the brackets in the syntax indicate. |
<text_1> | The first text to compare |
<text_2> | The second text to compare |
Return Value
Returns a floating-point number between 0 and 10, where 0 means no similarity and 10 means strong similarity.
If any input is NULL, returns NULL.
The result is generated by the large language model, so the same inputs may produce different scores across calls.
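The NULL rule above can be checked with a one-line query. This is a minimal sketch: 'resource_name' is a placeholder for a resource that already exists in your cluster, and the sample sentence is made up for illustration.

```sql
-- 'resource_name' is a placeholder; replace it with your own resource.
-- The second text is NULL, so per the rule above the score should be NULL.
SELECT LLM_SIMILARITY('resource_name', 'The parcel arrived on time.', NULL) AS score;
```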
Example
Suppose you have the following table representing comments received by a courier company:
CREATE TABLE user_comments (
    id INT,
    comment VARCHAR(500)
) DUPLICATE KEY(id)
DISTRIBUTED BY HASH(id) BUCKETS 10
PROPERTIES (
    "replication_num" = "1"
);
To rank comments by how closely they match a statement of strong dissatisfaction, you can compare each comment against a reference sentence:
SELECT comment,
LLM_SIMILARITY('resource_name', 'I am extremely dissatisfied with their service.', comment) AS score
FROM user_comments ORDER BY score DESC LIMIT 5;
The query result may look like:
+-------------------------------------------------+-------+
| comment                                         | score |
+-------------------------------------------------+-------+
| It arrived broken and I am really disappointed. |   7.5 |
| Delivery was very slow and frustrating.         |   6.5 |
| Not bad, but the packaging could be better.     |   3.5 |
| It is fine, nothing special to mention.         |     3 |
| Absolutely fantastic, highly recommend it.      |     1 |
+-------------------------------------------------+-------+
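Because the score comes from a large language model and may differ between calls, one option is to compute it once and store it, so that later queries see stable values. The following is a sketch under assumptions, not a prescribed workflow: the table user_comment_scores, its DOUBLE score column, and the threshold of 6 are hypothetical choices made for illustration, and 'resource_name' is still a placeholder.

```sql
-- Hypothetical table that stores one score per comment.
CREATE TABLE user_comment_scores (
    id INT,
    score DOUBLE
) DUPLICATE KEY(id)
DISTRIBUTED BY HASH(id) BUCKETS 10
PROPERTIES (
    "replication_num" = "1"
);

-- Score each comment once against the reference sentence and persist the result.
INSERT INTO user_comment_scores
SELECT id,
       LLM_SIMILARITY('resource_name', 'I am extremely dissatisfied with their service.', comment)
FROM user_comments;

-- Later queries read the stored scores instead of calling the model again.
-- The threshold 6 is an arbitrary cut-off chosen for this example.
SELECT c.comment, s.score
FROM user_comments c
JOIN user_comment_scores s ON c.id = s.id
WHERE s.score >= 6
ORDER BY s.score DESC;
```

Storing the scores also avoids repeated model calls when the same ranking is read many times.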