跳到主要内容

转换为/从 JSON 类型

Doris 中的 JSON 类型采用二进制编码存储,而不是文本存储,提供更高效的处理和存储方式。JSON 类型与 Doris 内部类型存在一一对应的关系。

转换为 JSON

FROM String

当将字符串转换为 JSON 时,字符串内容必须符合 RFC7159 定义的有效 JSON 语法。解析器会验证字符串并将其转换为相应的 JSON 二进制格式。

字符串解析规则

  • 如果字符串包含有效的 JSON 结构(对象、数组、数字、布尔值或 null),将解析为对应的 JSON 类型:

    mysql> SELECT CAST('[1,2,3,4]' AS JSON); -- 输出:[1,2,3,4](解析为 JSON 数组)
    +---------------------------+
    | CAST('[1,2,3,4]' AS JSON) |
    +---------------------------+
    | [1,2,3,4] |
    +---------------------------+
  • 要创建 JSON 字符串值(将字符串本身视为 JSON 字符串值而不是解析它),请使用 TO_JSON 函数:

    mysql> SELECT TO_JSON('[1,2,3,4]'); -- 输出:"[1,2,3,4]"(带引号的 JSON 字符串)
    +----------------------+
    | TO_JSON('[1,2,3,4]') |
    +----------------------+
    | "[1,2,3,4]" |
    +----------------------+

数字解析规则

从 JSON 字符串解析数值时:

  • 如果数字包含小数点,将转换为 JSON Double 类型:

    mysql> SELECT JSON_TYPE(CAST('{"key":123.45}' AS JSON), '$.key');
    +------------------------------------------------------+
    | JSON_TYPE(CAST('{"key":123.45}' AS JSON), '$.key') |
    +------------------------------------------------------+
    | double |
    +------------------------------------------------------+
  • 如果数字是整数形式,将存储为最小兼容整数类型:

    mysql> SELECT JSON_TYPE(CAST('{"key":123456789}' AS JSON), '$.key');
    +---------------------------------------------------------+
    | JSON_TYPE(CAST('{"key":123456789}' AS JSON), '$.key') |
    +---------------------------------------------------------+
    | int |
    +---------------------------------------------------------+
    mysql> SELECT JSON_TYPE(CAST('{"key":1234567891234}' AS JSON), '$.key');
    +-------------------------------------------------------------+
    | JSON_TYPE(CAST('{"key":1234567891234}' AS JSON), '$.key') |
    +-------------------------------------------------------------+
    | bigint |
    +-------------------------------------------------------------+
  • 特别地,如果整数超出 Int128 范围,会使用 double 类型存储,这时会丢失精度:

    mysql> SELECT JSON_TYPE(CAST('{"key":12345678901234567890123456789012345678901234567890}' AS JSON), '$.key');
    +--------------------------------------------------------------------------------------------------+
    | JSON_TYPE(CAST('{"key":12345678901234567890123456789012345678901234567890}' AS JSON), '$.key') |
    +--------------------------------------------------------------------------------------------------+
    | double |
    +--------------------------------------------------------------------------------------------------+

错误处理

将字符串解析为 JSON 时:

  • 在严格模式(默认)下,无效的 JSON 语法会导致错误
  • 在非严格模式下,无效的 JSON 语法会返回 NULL
mysql> SET enable_strict_cast = false;
mysql> SELECT CAST('{"invalid JSON' AS JSON);
+-----------------------------+
| CAST('{"invalid JSON' AS JSON) |
+-----------------------------+
| NULL |
+-----------------------------+

mysql> SET enable_strict_cast = true;
mysql> SELECT CAST('{"invalid JSON' AS JSON);
ERROR 1105 (HY000): errCode = 2, detailMessage = (127.0.0.1)[INVALID_ARGUMENT]Failed to parse json string: {"invalid JSON, ...

FROM 其他 Doris 类型

以下 Doris 类型可以直接转换为 JSON 而不丢失精度:

Doris 类型JSON 类型
BOOLEANBool
TINYINTInt8
SMALLINTInt16
INTInt32
BIGINTInt64
LARGEINTInt128
FLOATFloat
DOUBLEDouble
DECIMALDecimal
STRINGString
ARRAYArray
STRUCTObject

示例

-- 整数数组转 JSON
mysql> SELECT CAST(ARRAY(123,456,789) AS JSON);
+----------------------------------+
| CAST(ARRAY(123,456,789) AS JSON) |
+----------------------------------+
| [123,456,789] |
+----------------------------------+

-- Decimal 数组转 JSON(保留原始精度)
mysql> SELECT CAST(ARRAY(12345678.12345678,0.00000001,12.000000000000000001) AS JSON);
+--------------------------------------------------------------------------+
| CAST(ARRAY(12345678.12345678,0.00000001,12.000000000000000001) AS JSON) |
+--------------------------------------------------------------------------+
| [12345678.123456780000000000,0.000000010000000000,12.000000000000000001] |
+--------------------------------------------------------------------------+

不直接支持的类型

上表中未列出的类型不能直接转换为 JSON:

mysql> SELECT CAST(MAKEDATE(2021, 1) AS JSON);
ERROR 1105 (HY000): CAST AS JSONB can only be performed between JSONB, String, Number, Boolean, Array, Struct types. Got Date to JSONB

解决方案:先转换为兼容类型,再转为 JSON:

mysql> SELECT CAST(CAST(MAKEDATE(2021, 1) AS BIGINT) AS JSON);
+---------------------------------------------------+
| CAST(CAST(MAKEDATE(2021, 1) AS BIGINT) AS JSON) |
+---------------------------------------------------+
| 20210101 |
+---------------------------------------------------+

从 JSON 转换

行为变更

在 4.0 版本之前,Doris 对 JSON CAST 的行为比较宽松,不会处理溢出行为。

从 4.0 版本开始,在 JSON CAST 中出现溢出行为,在严格模式下报错,非严格模式下返回 NULL。

TO Boolean

JSON Bool、Number 和 String 类型可以转换为 BOOLEAN:

-- 从 JSON Bool 转换
mysql> SELECT CAST(CAST('true' AS JSON) AS BOOLEAN);
+---------------------------------------+
| CAST(CAST('true' AS JSON) AS BOOLEAN) |
+---------------------------------------+
| 1 |
+---------------------------------------+

-- 从 JSON Number 转换
mysql> SELECT CAST(CAST('123' AS JSON) AS BOOLEAN);
+--------------------------------------+
| CAST(CAST('123' AS JSON) AS BOOLEAN) |
+--------------------------------------+
| 1 |
+--------------------------------------+

-- 从 JSON String 转换(必须包含有效的布尔值表示)
mysql> SELECT CAST(TO_JSON('true') AS BOOLEAN);
+----------------------------------+
| CAST(TO_JSON('true') AS BOOLEAN) |
+----------------------------------+
| 1 |
+----------------------------------+

TO 数值类型

JSON Bool、Number 和 String 类型可以转换为数值类型(TINYINT、SMALLINT、INT、BIGINT、LARGEINT、FLOAT、DOUBLE、DECIMAL):

-- 从 JSON Number 转换为 INT
mysql> SELECT CAST(CAST('123' AS JSON) AS INT);
+----------------------------------+
| CAST(CAST('123' AS JSON) AS INT) |
+----------------------------------+
| 123 |
+----------------------------------+

-- 从 JSON Bool 转换为数值类型
mysql> SELECT CAST(CAST('true' AS JSON) AS INT), CAST(CAST('false' AS JSON) AS DOUBLE);
+-----------------------------------+--------------------------------------+
| CAST(CAST('true' AS JSON) AS INT) | CAST(CAST('false' AS JSON) AS DOUBLE) |
+-----------------------------------+--------------------------------------+
| 1 | 0 |
+-----------------------------------+--------------------------------------+

当转换为较小类型时,适用数值溢出规则:

-- 在严格模式下,溢出会导致错误
mysql> SET enable_strict_cast = true;
mysql> SELECT CAST(TO_JSON(12312312312312311) AS INT);
ERROR 1105 (HY000): Cannot cast from jsonb value type 12312312312312311 to doris type INT

-- 在非严格模式下,溢出返回 NULL
mysql> SET enable_strict_cast = false;
mysql> SELECT CAST(TO_JSON(12312312312312311) AS INT);
+-----------------------------------------+
| CAST(TO_JSON(12312312312312311) AS INT) |
+-----------------------------------------+
| NULL |
+-----------------------------------------+

TO String

任何 JSON 类型都可以转换为 STRING,生成 JSON 文本表示:

mysql> SELECT CAST(CAST('{"key1":"value1","key2":123}' AS JSON) AS STRING);
+----------------------------------------------------------+
| CAST(CAST('{"key1":"value1","key2":123}' AS JSON) AS STRING) |
+----------------------------------------------------------+
| {"key1":"value1","key2":123} |
+----------------------------------------------------------+

mysql> SELECT CAST(CAST('true' AS JSON) AS STRING);
+--------------------------------------+
| CAST(CAST('true' AS JSON) AS STRING) |
+--------------------------------------+
| true |
+--------------------------------------+

TO Array

JSON Array , String 类型可以转换为 Doris ARRAY 类型:

mysql> SELECT CAST(TO_JSON(ARRAY(1,2,3)) AS ARRAY<INT>);
+-------------------------------------------+
| CAST(TO_JSON(ARRAY(1,2,3)) AS ARRAY<INT>) |
+-------------------------------------------+
| [1, 2, 3] |
+-------------------------------------------+

-- 数组元素内的类型转换
mysql> SELECT CAST(TO_JSON(ARRAY(1.2,2.3,3.4)) AS ARRAY<INT>);
+-------------------------------------------------+
| CAST(TO_JSON(ARRAY(1.2,2.3,3.4)) AS ARRAY<INT>) |
+-------------------------------------------------+
| [1, 2, 3] |
+-------------------------------------------------+

-- 把字符串转换成数组
mysql> SELECT CAST(TO_JSON("['123','456']") AS ARRAY<INT>);
+----------------------------------------------+
| CAST(TO_JSON("['123','456']") AS ARRAY<INT>) |
+----------------------------------------------+
| [123, 456] |
+----------------------------------------------+

数组中的元素按标准转换规则单独转换:

-- 在非严格模式下,无效元素变为 NULL
mysql> SET enable_strict_cast = false;
mysql> SELECT CAST(TO_JSON(ARRAY(10,20,200)) AS ARRAY<TINYINT>);
+---------------------------------------------------+
| CAST(TO_JSON(ARRAY(10,20,200)) AS ARRAY<TINYINT>) |
+---------------------------------------------------+
| [10, 20, null] |
+---------------------------------------------------+

-- 在严格模式下,无效元素导致错误
mysql> SET enable_strict_cast = true;
mysql> SELECT CAST(TO_JSON(ARRAY(10,20,200)) AS ARRAY<TINYINT>);
ERROR 1105 (HY000): Cannot cast from jsonb value type 200 to doris type TINYINT

TO Struct

JSON Object,String 类型可以转换为 Doris STRUCT 类型:

mysql> SELECT CAST(CAST('{"key1":123,"key2":"456"}' AS JSON) AS STRUCT<key1:INT,key2:STRING>);
+------------------------------------------------------------------------------+
| CAST(CAST('{"key1":123,"key2":"456"}' AS JSON) AS STRUCT<key1:INT,key2:STRING>) |
+------------------------------------------------------------------------------+
| {"key1":123, "key2":"456"} |
+------------------------------------------------------------------------------+


mysql> SELECT CAST(TO_JSON('{"key1":123,"key2":"456"}') AS STRUCT<key1:INT,key2:STRING>);
+----------------------------------------------------------------------------+
| CAST(TO_JSON('{"key1":123,"key2":"456"}') AS STRUCT<key1:INT,key2:STRING>) |
+----------------------------------------------------------------------------+
| {"key1":123, "key2":"456"} |
+----------------------------------------------------------------------------+

结构中的字段根据指定的类型单独转换:

mysql> SELECT CAST(CAST('{"key1":[123.45,678.90],"key2":[12312313]}' AS JSON) AS STRUCT<key1:ARRAY<DOUBLE>,key2:ARRAY<BIGINT>>);
+--------------------------------------------------------------------------------------------------------------------------+
| CAST(CAST('{"key1":[123.45,678.90],"key2":[12312313]}' AS JSON) AS STRUCT<key1:ARRAY<DOUBLE>,key2:ARRAY<BIGINT>>) |
+--------------------------------------------------------------------------------------------------------------------------+
| {"key1":[123.45, 678.9], "key2":[12312313]} |
+--------------------------------------------------------------------------------------------------------------------------+

JSON 和结构定义之间的字段计数和名称必须匹配:

-- 在非严格模式下,字段不匹配返回 NULL
mysql> SET enable_strict_cast = false;
mysql> SELECT CAST(CAST('{"key1":123,"key2":456}' AS JSON) AS STRUCT<key1:INT>);
+-------------------------------------------------------------------------+
| CAST(CAST('{"key1":123,"key2":456}' AS JSON) AS STRUCT<key1:INT>) |
+-------------------------------------------------------------------------+
| NULL |
+-------------------------------------------------------------------------+

-- 在严格模式下,字段不匹配导致错误
mysql> SET enable_strict_cast = true;
mysql> SELECT CAST(CAST('{"key1":123,"key2":456}' AS JSON) AS STRUCT<key1:INT>);
ERROR 1105 (HY000): jsonb_value field size 2 is not equal to struct size 1

JSON Null 处理

JSON null 与 SQL NULL 不同:

  • 当 JSON 字段包含 null 值时,转换为任何 Doris 类型都会产生 SQL NULL:
mysql> SELECT CAST(CAST('null' AS JSON) AS INT);
+----------------------------------+
| CAST(CAST('null' AS JSON) AS INT) |
+----------------------------------+
| NULL |
+----------------------------------+

类型转换总结表

JSON 类型可转换为
BoolBOOLEAN, TINYINT, SMALLINT, INT, BIGINT, LARGEINT, DOUBLE, FLOAT, DECIMAL, STRING
Null(始终转换为 SQL NULL)
NumberBOOLEAN, TINYINT, SMALLINT, INT, BIGINT, LARGEINT, DOUBLE, FLOAT, DECIMAL, STRING
StringBOOLEAN, TINYINT, SMALLINT, INT, BIGINT, LARGEINT, DOUBLE, FLOAT, DECIMAL, STRING, ARRAY, STRUCT
ArraySTRING, ARRAY
ObjectSTRING, STRUCT

keywords

JSON, JSONB, CAST, 类型转换,to_json