JSON
JSON Introduction
JSON (JavaScript Object Notation) is an open standard file format and data interchange format that uses human-readable text to store and transmit data. According to the official specification RFC7159, JSON supports the following basic types:
- Bool
- Null
- Number
- String
- Array
- Object
The JSON data type stores JSON data efficiently in a binary format and allows access to its internal fields through JSON functions.
By default, it supports up to 1048576 bytes (1MB), and can be increased up to 2147483643 bytes (2GB). This can be adjusted via the string_type_length_soft_limit_bytes
configuration.
Compared to storing JSON strings in a regular STRING type, the JSON type has two main advantages:
- JSON format validation during data insertion.
- More efficient binary storage format, enabling faster access to JSON internal fields using functions like
json_extract
, compared toget_json_xx
functions.
In version 1.2.x, the JSON type was named JSONB. To maintain compatibility with MySQL, it was renamed to JSON starting from version 2.0.0. Older tables can still use the previous name.
JSON Number Precision Issues
When using JSON, special attention is needed regarding number precision:
- In most systems, the Number type is implemented based on IEEE 754-2008 binary 64-bit (double-precision) floating-point numbers (e.g., double type in C++)
- Since the JSON specification doesn't strictly define the underlying type for Number, and JSON data is exchanged between different systems as text, precision loss may occur
For a JSON string like {"abc": 18446744073709551616}
:
-- Conversion result in MySQL
cast('{"abc": 18446744073709551616}' as json)
-- Result: {"abc": 1.8446744073709552e19}
// Conversion result in JavaScript
console.log(JSON.parse('{"abc": 18446744073709551616}'));
// Result: {abc: 18446744073709552000}
To ensure numeric precision is preserved when exchanging data between systems, large numbers should be stored as strings, e.g., {"abc": "18446744073709551616"}
.
JSON Type in Doris
Doris supports data types that conform to the JSON standard specification and uses an efficient JSONB (JSON Binary) format for binary encoding storage.
Supported Types
Doris JSONB supports all standard JSON types. The main difference is that Doris provides more fine-grained extensions for the Number type to more accurately map to Doris's internal types.
JSON Type | Subtype | Corresponding Doris Type |
---|---|---|
Bool | - | BOOLEAN |
Null | - | (No direct equivalent, represents JSON null value) |
Number | Int8 | TINYINT |
Int16 | SMALLINT | |
Int32 | INT | |
Int64 | BIGINT | |
Int128 | LARGEINT | |
Double | DOUBLE | |
Float | FLOAT | |
Decimal | DECIMAL | |
String | - | STRING |
Array | - | ARRAY |
Object | - | STRUCT |
Important Notes:
- Meaning of Null:
- Null in JSON is a valid value representing "empty value". This is different from SQL's NULL, which represents "unknown" or "missing".
- CAST('null' AS JSON) results in a JSONB column containing a JSON null value, which itself is not NULL at the SQL level.
- CAST('null' AS JSON) IS NULL returns false (0), because the column contains a known JSON null value, which is not a SQL NULL.
Operations and Limitations
- Comparison and Arithmetic:
- JSONB columns cannot be directly compared with other data types (including other JSONB columns) or used in arithmetic operations.
- Solution: Use JSON_EXTRACT function to extract scalar values (like INT, DOUBLE, STRING, BOOLEAN) from JSONB, then convert them to the corresponding native Doris types for comparison or calculation.
- Sorting and Grouping:
- JSONB columns do not support ORDER BY and GROUP BY operations.
- Implicit Conversion:
- Input Only: When inputting data into a JSONB column, STRING type can be implicitly converted to JSONB (provided the string content is valid JSON text). Other Doris types cannot be implicitly converted to JSONB.
Syntax
Definition:
json_column_name JSON
Insertion:
- Using
INSERT INTO VALUES
with the format as a string surrounded by quotes. For example:
INSERT INTO table_name(id, json_column_name) VALUES (1, '{"k1": "100"}')
- For STREAM LOAD, the format for the corresponding column is a string without additional quotes. For example:
12 {"k1":"v31", "k2": 300}
13 []
14 [123, 456]
- When the special character with
'\'
such as'\r'
,'\t'
appears in JSON, you need to use the replace function to replace"\"
with"\\"
, for example, you need replace"\n"
to"\\n"
Query:
- Directly select the entire JSON column:
SELECT json_column_name FROM table_name;
- Extract specific fields or other information from JSON using JSON functions. For example:
SELECT json_extract(json_column_name, '$.k1') FROM table_name;
- The JSON type can be cast to and from integers, strings, BOOLEAN, ARRAY, and MAP. For example:
SELECT CAST('{"k1": "100"}' AS JSON);
SELECT CAST(json_column_name AS STRING) FROM table_name;
SELECT CAST(json_extract(json_column_name, '$.k1') AS INT) FROM table_name;
The JSON type currently cannot be used for GROUP BY
, ORDER BY
, or comparison operations.
JSON Input
Convert a string that conforms to JSON syntax to JSONB using CAST.
-- Simple scalar/basic values (numeric types, bool, null, string)
mysql> SELECT cast('5' as json);
+-------------------+
| cast('5' as json) |
+-------------------+
| 5 |
+-------------------+
-- Arrays with zero or more elements (elements don't need to be the same type)
mysql> SELECT cast('[1, 2, "foo", null]' as json);
+-------------------------------------+
| cast('[1, 2, "foo", null]' as json) |
+-------------------------------------+
| [1,2,"foo",null] |
+-------------------------------------+
-- Objects containing key-value pairs
-- Note that object keys must always be quoted strings
mysql> SELECT cast('{"bar": "baz", "balance": 7.77, "active": false}' as json);
+------------------------------------------------------------------+
| cast('{"bar": "baz", "balance": 7.77, "active": false}' as json) |
+------------------------------------------------------------------+
| {"bar":"baz","balance":7.77,"active":false} |
+------------------------------------------------------------------+
-- Arrays and objects can be nested arbitrarily
mysql> SELECT cast('{"foo": [true, "bar"], "tags": {"a": 1, "b": null}}' as json);
+---------------------------------------------------------------------+
| cast('{"foo": [true, "bar"], "tags": {"a": 1, "b": null}}' as json) |
+---------------------------------------------------------------------+
| {"foo":[true,"bar"],"tags":{"a":1,"b":null}} |
+---------------------------------------------------------------------+
Doris's JSONB doesn't preserve semantically irrelevant details like whitespace.
mysql> -- The input text and JSON output may not look the same
mysql> SELECT cast('[1, 2]' as json);
+----------------------------------------+
| cast('[1, 2]' as json) |
+----------------------------------------+
| [1,2] |
+----------------------------------------+
Key Differences and Notes:
- CAST(string AS JSON): Used to parse strings that conform to JSON syntax.
- CAST(string AS JSON): For Number types, it will only parse Int8, Int16, Int32, Int64, Int128, and Double types, not Decimal type.
- Unlike most other JSON implementations, Doris's JSONB type supports up to Int128 precision. Numbers exceeding Int128 precision may overflow.
- If the input number string is 12.34, it will be parsed as a Double; if there's no decimal point, it will be parsed as an integer (if the size exceeds Int128 range, it will be converted to Double but with precision loss)
Using to_json to Convert Doris Internal Types to JSONB Type
mysql> SELECT to_json(1) , to_json(3.14) , to_json("12345");
+------------+---------------+------------------+
| to_json(1) | to_json(3.14) | to_json("12345") |
+------------+---------------+------------------+
| 1 | 3.14 | "12345" |
+------------+---------------+------------------+
mysql> SELECT to_json(array(array(1,2,3),array(4,5,6)));
+-------------------------------------------+
| to_json(array(array(1,2,3),array(4,5,6))) |
+-------------------------------------------+
| [[1,2,3],[4,5,6]] |
+-------------------------------------------+
mysql> SELECT json_extract(to_json(array(array(1,2,3),array(4,5,6))), '$.[1].[2]');
+----------------------------------------------------------------------+
| json_extract(to_json(array(array(1,2,3),array(4,5,6))), '$.[1].[2]') |
+----------------------------------------------------------------------+
| 6 |
+----------------------------------------------------------------------+
mysql> SELECT to_json(struct(123,array(4,5,6),"789"));
+------------------------------------------+
| to_json(struct(123,array(4,5,6),"789")) |
+------------------------------------------+
| {"col1":123,"col2":[4,5,6],"col3":"789"} |
+------------------------------------------+
mysql> SELECT json_extract(to_json(struct(123,array(4,5,6),"789")),"$.col2");
+----------------------------------------------------------------+
| json_extract(to_json(struct(123,array(4,5,6),"789")),"$.col2") |
+----------------------------------------------------------------+
| [4,5,6] |
+----------------------------------------------------------------+
to_json only supports converting Doris types that map to JSONB types. For example, DECIMAL can be used with to_json. However, DATE cannot; it needs to be converted to STRING first, then use to_json.
JSONB Output
When converting to plain text for interaction with other systems, Doris's JSONB type ensures generation of valid JSON text:
- Null values:
- Output as null (without quotes)
- Boolean values:
- true → output true
- false → output false
- Numeric types:
- All numeric values output directly
- Example: 5 → output 5, 3.14 → output 3.14
- Strings:
- Output in double quotes: "
" - Special characters are escaped:
- " → "
- \ → \
- / → /
- Backspace → \b
- Form feed → \f
- Newline → \n
- Carriage return → \r
- Tab → \t
- Other control characters (ASCII < 32) convert to Unicode escape sequences: \uXXXX
- Output in double quotes: "
- Objects:
- Format: {
} - Key-value pair format: "
": - Multiple key-value pairs separated by commas
- Format: {
- Arrays:
- Format: [
] - Multiple elements separated by commas
- Format: [
- Nested structure handling:
- Objects and arrays support unlimited nesting levels
- Each nesting level processed recursively using the same rules
Number Precision Issues
When converting Doris internal types to JSONB using to_json, no precision loss occurs. When using Doris internal JSON functions, if the return value is also a JSONB type, no precision loss occurs. However, converting Doris JSONB to plain text and then back to JSONB can cause precision loss.
Example: Doris JSON type object
Object{
"a": (Decimal 18446744073709551616.123)
}
Converted to plain text:
{"a": 18446744073709551616.123}
When plain text is converted back to Doris JSON type:
Object{
"a": (Double 18446744073709552000) // precision loss
}
Configuration and Limitations
- JSON supports 1,048,576 bytes (1 MB) by default
- Size limit can be adjusted via the BE configuration parameter string_type_length_soft_limit_bytes
- Maximum adjustment up to 2,147,483,643 bytes (approximately 2 GB)
- In Doris JSON type Objects, key length cannot exceed 255 bytes
Usage Example
A tutorial for JSON datatype including create table, load data and query.
create database and table
CREATE DATABASE testdb;
USE testdb;
CREATE TABLE test_json (
id INT,
j JSON
)
DUPLICATE KEY(id)
DISTRIBUTED BY HASH(id) BUCKETS 10
PROPERTIES("replication_num" = "1");
Load data
stream load test_json.csv test data
- there are 2 columns, the 1st column is id and the 2nd column is json string
- there are 25 rows, the first 18 rows are valid json and the last 7 rows are invalid
1 \N
2 null
3 true
4 false
5 100
6 10000
7 1000000000
8 1152921504606846976
9 6.18
10 "abcd"
11 {}
12 {"k1":"v31", "k2": 300}
13 []
14 [123, 456]
15 ["abc", "def"]
16 [null, true, false, 100, 6.18, "abc"]
17 [{"k1":"v41", "k2": 400}, 1, "a", 3.14]
18 {"k1":"v31", "k2": 300, "a1": [{"k1":"v41", "k2": 400}, 1, "a", 3.14]}
19 ''
20 'abc'
21 abc
22 100x
23 6.a8
24 {x
25 [123, abc]
- due to the 28% of rows is invalid, stream load with default configuration will fail with error message "too many filtered rows"
curl --location-trusted -u root: -T test_json.csv http://127.0.0.1:8840/api/testdb/test_json/_stream_load
{
"TxnId": 12019,
"Label": "744d9821-9c9f-43dc-bf3b-7ab048f14e32",
"TwoPhaseCommit": "false",
"Status": "Fail",
"Message": "too many filtered rows",
"NumberTotalRows": 25,
"NumberLoadedRows": 18,
"NumberFilteredRows": 7,
"NumberUnselectedRows": 0,
"LoadBytes": 380,
"LoadTimeMs": 48,
"BeginTxnTimeMs": 0,
"StreamLoadPutTimeMs": 1,
"ReadDataTimeMs": 0,
"WriteDataTimeMs": 45,
"CommitAndPublishTimeMs": 0,
"ErrorURL": "http://172.21.0.5:8840/api/_load_error_log?file=__shard_2/error_log_insert_stmt_95435c4bf5f156df-426735082a9296af_95435c4bf5f156df_426735082a9296af"
}
- stream load will success after set header configuration 'max_filter_ratio: 0.3'
curl --location-trusted -u root: -H 'max_filter_ratio: 0.3' -T test_json.csv http://127.0.0.1:8840/api/testdb/test_json/_stream_load
{
"TxnId": 12017,
"Label": "f37a50c1-43e9-4f4e-a159-a3db6abe2579",
"TwoPhaseCommit": "false",
"Status": "Success",
"Message": "OK",
"NumberTotalRows": 25,
"NumberLoadedRows": 18,
"NumberFilteredRows": 7,
"NumberUnselectedRows": 0,
"LoadBytes": 380,
"LoadTimeMs": 68,
"BeginTxnTimeMs": 0,
"StreamLoadPutTimeMs": 2,
"ReadDataTimeMs": 0,
"WriteDataTimeMs": 45,
"CommitAndPublishTimeMs": 19,
"ErrorURL": "http://172.21.0.5:8840/api/_load_error_log?file=__shard_0/error_log_insert_stmt_a1463f98a7b15caf-c79399b920f5bfa3_a1463f98a7b15caf_c79399b920f5bfa3"
}
- use SELECT to view the data loaded by stream load. The column with JSON type will be displayed as plain JSON string.
mysql> SELECT * FROM test_json ORDER BY id;
+------+---------------------------------------------------------------+
| id | j |
+------+---------------------------------------------------------------+
| 1 | NULL |
| 2 | null |
| 3 | true |
| 4 | false |
| 5 | 100 |
| 6 | 10000 |
| 7 | 1000000000 |
| 8 | 1152921504606846976 |
| 9 | 6.18 |
| 10 | "abcd" |
| 11 | {} |
| 12 | {"k1":"v31","k2":300} |
| 13 | [] |
| 14 | [123,456] |
| 15 | ["abc","def"] |
| 16 | [null,true,false,100,6.18,"abc"] |
| 17 | [{"k1":"v41","k2":400},1,"a",3.14] |
| 18 | {"k1":"v31","k2":300,"a1":[{"k1":"v41","k2":400},1,"a",3.14]} |
+------+---------------------------------------------------------------+
18 rows in set (0.03 sec)
write data using insert into
- total rows increae from 18 to 19 after insert 1 row
mysql> INSERT INTO test_json VALUES(26, '{"k1":"v1", "k2": 200}');
Query OK, 1 row affected (0.09 sec)
{'label':'insert_4ece6769d1b42fd_ac9f25b3b8f3dc02', 'status':'VISIBLE', 'txnId':'12016'}
mysql> SELECT * FROM test_json ORDER BY id;
+------+---------------------------------------------------------------+
| id | j |
+------+---------------------------------------------------------------+
| 1 | NULL |
| 2 | null |
| 3 | true |
| 4 | false |
| 5 | 100 |
| 6 | 10000 |
| 7 | 1000000000 |
| 8 | 1152921504606846976 |
| 9 | 6.18 |
| 10 | "abcd" |
| 11 | {} |
| 12 | {"k1":"v31","k2":300} |
| 13 | [] |
| 14 | [123,456] |
| 15 | ["abc","def"] |
| 16 | [null,true,false,100,6.18,"abc"] |
| 17 | [{"k1":"v41","k2":400},1,"a",3.14] |
| 18 | {"k1":"v31","k2":300,"a1":[{"k1":"v41","k2":400},1,"a",3.14]} |
| 26 | {"k1":"v1","k2":200} |
+------+---------------------------------------------------------------+
19 rows in set (0.03 sec)
Query
extract some filed from json by json_extract functions
- extract the whole json, '$' stands for root in json path
+------+---------------------------------------------------------------+---------------------------------------------------------------+
| id | j | json_extract(`j`, '$') |
+------+---------------------------------------------------------------+---------------------------------------------------------------+
| 1 | NULL | NULL |
| 2 | null | null |
| 3 | true | true |
| 4 | false | false |
| 5 | 100 | 100 |
| 6 | 10000 | 10000 |
| 7 | 1000000000 | 1000000000 |
| 8 | 1152921504606846976 | 1152921504606846976 |
| 9 | 6.18 | 6.18 |
| 10 | "abcd" | "abcd" |
| 11 | {} | {} |
| 12 | {"k1":"v31","k2":300} | {"k1":"v31","k2":300} |
| 13 | [] | [] |
| 14 | [123,456] | [123,456] |
| 15 | ["abc","def"] | ["abc","def"] |
| 16 | [null,true,false,100,6.18,"abc"] | [null,true,false,100,6.18,"abc"] |
| 17 | [{"k1":"v41","k2":400},1,"a",3.14] | [{"k1":"v41","k2":400},1,"a",3.14] |
| 18 | {"k1":"v31","k2":300,"a1":[{"k1":"v41","k2":400},1,"a",3.14]} | {"k1":"v31","k2":300,"a1":[{"k1":"v41","k2":400},1,"a",3.14]} |
| 26 | {"k1":"v1","k2":200} | {"k1":"v1","k2":200} |
+------+---------------------------------------------------------------+---------------------------------------------------------------+
19 rows in set (0.03 sec)
- extract k1 field, return NULL if it does not exist
mysql> SELECT id, j, json_extract(j, '$.k1') FROM test_json ORDER BY id;
+------+---------------------------------------------------------------+----------------------------+
| id | j | json_extract(`j`, '$.k1') |
+------+---------------------------------------------------------------+----------------------------+
| 1 | NULL | NULL |
| 2 | null | NULL |
| 3 | true | NULL |
| 4 | false | NULL |
| 5 | 100 | NULL |
| 6 | 10000 | NULL |
| 7 | 1000000000 | NULL |
| 8 | 1152921504606846976 | NULL |
| 9 | 6.18 | NULL |
| 10 | "abcd" | NULL |
| 11 | {} | NULL |
| 12 | {"k1":"v31","k2":300} | "v31" |
| 13 | [] | NULL |
| 14 | [123,456] | NULL |
| 15 | ["abc","def"] | NULL |
| 16 | [null,true,false,100,6.18,"abc"] | NULL |
| 17 | [{"k1":"v41","k2":400},1,"a",3.14] | NULL |
| 18 | {"k1":"v31","k2":300,"a1":[{"k1":"v41","k2":400},1,"a",3.14]} | "v31" |
| 26 | {"k1":"v1","k2":200} | "v1" |
+------+---------------------------------------------------------------+----------------------------+
19 rows in set (0.03 sec)
- extract element 0 of the top level array
mysql> SELECT id, j, json_extract(j, '$[0]') FROM test_json ORDER BY id;
+------+---------------------------------------------------------------+----------------------------+
| id | j | json_extract(`j`, '$[0]') |
+------+---------------------------------------------------------------+----------------------------+
| 1 | NULL | NULL |
| 2 | null | NULL |
| 3 | true | NULL |
| 4 | false | NULL |
| 5 | 100 | NULL |
| 6 | 10000 | NULL |
| 7 | 1000000000 | NULL |
| 8 | 1152921504606846976 | NULL |
| 9 | 6.18 | NULL |
| 10 | "abcd" | NULL |
| 11 | {} | NULL |
| 12 | {"k1":"v31","k2":300} | NULL |
| 13 | [] | NULL |
| 14 | [123,456] | 123 |
| 15 | ["abc","def"] | "abc" |
| 16 | [null,true,false,100,6.18,"abc"] | null |
| 17 | [{"k1":"v41","k2":400},1,"a",3.14] | {"k1":"v41","k2":400} |
| 18 | {"k1":"v31","k2":300,"a1":[{"k1":"v41","k2":400},1,"a",3.14]} | NULL |
| 26 | {"k1":"v1","k2":200} | NULL |
+------+---------------------------------------------------------------+----------------------------+
19 rows in set (0.03 sec)
- extract a whole json array of name a1
mysql> SELECT id, j, json_extract(j, '$.a1') FROM test_json ORDER BY id;
+------+---------------------------------------------------------------+------------------------------------+
| id | j | json_extract(`j`, '$.a1') |
+------+---------------------------------------------------------------+------------------------------------+
| 1 | NULL | NULL |
| 2 | null | NULL |
| 3 | true | NULL |
| 4 | false | NULL |
| 5 | 100 | NULL |
| 6 | 10000 | NULL |
| 7 | 1000000000 | NULL |
| 8 | 1152921504606846976 | NULL |
| 9 | 6.18 | NULL |
| 10 | "abcd" | NULL |
| 11 | {} | NULL |
| 12 | {"k1":"v31","k2":300} | NULL |
| 13 | [] | NULL |
| 14 | [123,456] | NULL |
| 15 | ["abc","def"] | NULL |
| 16 | [null,true,false,100,6.18,"abc"] | NULL |
| 17 | [{"k1":"v41","k2":400},1,"a",3.14] | NULL |
| 18 | {"k1":"v31","k2":300,"a1":[{"k1":"v41","k2":400},1,"a",3.14]} | [{"k1":"v41","k2":400},1,"a",3.14] |
| 26 | {"k1":"v1","k2":200} | NULL |
+------+---------------------------------------------------------------+------------------------------------+
19 rows in set (0.02 sec)
- extract nested field from an object in an array
mysql> SELECT id, j, json_extract(j, '$.a1[0]'), json_extract(j, '$.a1[0].k1') FROM test_json ORDER BY id;
+------+---------------------------------------------------------------+-------------------------------+----------------------------------+
| id | j | json_extract(`j`, '$.a1[0]') | json_extract(`j`, '$.a1[0].k1') |
+------+---------------------------------------------------------------+-------------------------------+----------------------------------+
| 1 | NULL | NULL | NULL |
| 2 | null | NULL | NULL |
| 3 | true | NULL | NULL |
| 4 | false | NULL | NULL |
| 5 | 100 | NULL | NULL |
| 6 | 10000 | NULL | NULL |
| 7 | 1000000000 | NULL | NULL |
| 8 | 1152921504606846976 | NULL | NULL |
| 9 | 6.18 | NULL | NULL |
| 10 | "abcd" | NULL | NULL |
| 11 | {} | NULL | NULL |
| 12 | {"k1":"v31","k2":300} | NULL | NULL |
| 13 | [] | NULL | NULL |
| 14 | [123,456] | NULL | NULL |
| 15 | ["abc","def"] | NULL | NULL |
| 16 | [null,true,false,100,6.18,"abc"] | NULL | NULL |
| 17 | [{"k1":"v41","k2":400},1,"a",3.14] | NULL | NULL |
| 18 | {"k1":"v31","k2":300,"a1":[{"k1":"v41","k2":400},1,"a",3.14]} | {"k1":"v41","k2":400} | "v41" |
| 26 | {"k1":"v1","k2":200} | NULL | NULL |
+------+---------------------------------------------------------------+-------------------------------+----------------------------------+
19 rows in set (0.02 sec)
- extract field with specific datatype
- json_extract_string will extract field with string type, convert to string if the field is not string
mysql> SELECT id, j, json_extract_string(j, '$') FROM test_json ORDER BY id;
+------+---------------------------------------------------------------+---------------------------------------------------------------+
| id | j | json_extract_string(`j`, '$') |
+------+---------------------------------------------------------------+---------------------------------------------------------------+
| 1 | NULL | NULL |
| 2 | null | null |
| 3 | true | true |
| 4 | false | false |
| 5 | 100 | 100 |
| 6 | 10000 | 10000 |
| 7 | 1000000000 | 1000000000 |
| 8 | 1152921504606846976 | 1152921504606846976 |
| 9 | 6.18 | 6.18 |
| 10 | "abcd" | abcd |
| 11 | {} | {} |
| 12 | {"k1":"v31","k2":300} | {"k1":"v31","k2":300} |
| 13 | [] | [] |
| 14 | [123,456] | [123,456] |
| 15 | ["abc","def"] | ["abc","def"] |
| 16 | [null,true,false,100,6.18,"abc"] | [null,true,false,100,6.18,"abc"] |
| 17 | [{"k1":"v41","k2":400},1,"a",3.14] | [{"k1":"v41","k2":400},1,"a",3.14] |
| 18 | {"k1":"v31","k2":300,"a1":[{"k1":"v41","k2":400},1,"a",3.14]} | {"k1":"v31","k2":300,"a1":[{"k1":"v41","k2":400},1,"a",3.14]} |
| 26 | {"k1":"v1","k2":200} | {"k1":"v1","k2":200} |
+------+---------------------------------------------------------------+---------------------------------------------------------------+
19 rows in set (0.02 sec)
mysql> SELECT id, j, json_extract_string(j, '$.k1') FROM test_json ORDER BY id;
+------+---------------------------------------------------------------+-----------------------------------+
| id | j | json_extract_string(`j`, '$.k1') |
+------+---------------------------------------------------------------+-----------------------------------+
| 1 | NULL | NULL |
| 2 | null | NULL |
| 3 | true | NULL |
| 4 | false | NULL |
| 5 | 100 | NULL |
| 6 | 10000 | NULL |
| 7 | 1000000000 | NULL |
| 8 | 1152921504606846976 | NULL |
| 9 | 6.18 | NULL |
| 10 | "abcd" | NULL |
| 11 | {} | NULL |
| 12 | {"k1":"v31","k2":300} | v31 |
| 13 | [] | NULL |
| 14 | [123,456] | NULL |
| 15 | ["abc","def"] | NULL |
| 16 | [null,true,false,100,6.18,"abc"] | NULL |
| 17 | [{"k1":"v41","k2":400},1,"a",3.14] | NULL |
| 18 | {"k1":"v31","k2":300,"a1":[{"k1":"v41","k2":400},1,"a",3.14]} | v31 |
| 26 | {"k1":"v1","k2":200} | v1 |
+------+---------------------------------------------------------------+-----------------------------------+
19 rows in set (0.03 sec)
- json_extract_int will extract field with int type, return NULL if the field is not int
mysql> SELECT id, j, json_extract_int(j, '$') FROM test_json ORDER BY id;
+------+---------------------------------------------------------------+-----------------------------+
| id | j | json_extract_int(`j`, '$') |
+------+---------------------------------------------------------------+-----------------------------+
| 1 | NULL | NULL |
| 2 | null | NULL |
| 3 | true | NULL |
| 4 | false | NULL |
| 5 | 100 | 100 |
| 6 | 10000 | 10000 |
| 7 | 1000000000 | 1000000000 |
| 8 | 1152921504606846976 | NULL |
| 9 | 6.18 | NULL |
| 10 | "abcd" | NULL |
| 11 | {} | NULL |
| 12 | {"k1":"v31","k2":300} | NULL |
| 13 | [] | NULL |
| 14 | [123,456] | NULL |
| 15 | ["abc","def"] | NULL |
| 16 | [null,true,false,100,6.18,"abc"] | NULL |
| 17 | [{"k1":"v41","k2":400},1,"a",3.14] | NULL |
| 18 | {"k1":"v31","k2":300,"a1":[{"k1":"v41","k2":400},1,"a",3.14]} | NULL |
| 26 | {"k1":"v1","k2":200} | NULL |
+------+---------------------------------------------------------------+-----------------------------+
19 rows in set (0.02 sec)
mysql> SELECT id, j, json_extract_int(j, '$.k2') FROM test_json ORDER BY id;
+------+---------------------------------------------------------------+--------------------------------+
| id | j | json_extract_int(`j`, '$.k2') |
+------+---------------------------------------------------------------+--------------------------------+
| 1 | NULL | NULL |
| 2 | null | NULL |
| 3 | true | NULL |
| 4 | false | NULL |
| 5 | 100 | NULL |
| 6 | 10000 | NULL |
| 7 | 1000000000 | NULL |
| 8 | 1152921504606846976 | NULL |
| 9 | 6.18 | NULL |
| 10 | "abcd" | NULL |
| 11 | {} | NULL |
| 12 | {"k1":"v31","k2":300} | 300 |
| 13 | [] | NULL |
| 14 | [123,456] | NULL |
| 15 | ["abc","def"] | NULL |
| 16 | [null,true,false,100,6.18,"abc"] | NULL |
| 17 | [{"k1":"v41","k2":400},1,"a",3.14] | NULL |
| 18 | {"k1":"v31","k2":300,"a1":[{"k1":"v41","k2":400},1,"a",3.14]} | 300 |
| 26 | {"k1":"v1","k2":200} | 200 |
+------+---------------------------------------------------------------+--------------------------------+
19 rows in set (0.03 sec)
- json_extract_bigint will extract field with bigint type, return NULL if the field is not bigint
mysql> SELECT id, j, json_extract_bigint(j, '$') FROM test_json ORDER BY id;
+------+---------------------------------------------------------------+--------------------------------+
| id | j | json_extract_bigint(`j`, '$') |
+------+---------------------------------------------------------------+--------------------------------+
| 1 | NULL | NULL |
| 2 | null | NULL |
| 3 | true | NULL |
| 4 | false | NULL |
| 5 | 100 | 100 |
| 6 | 10000 | 10000 |
| 7 | 1000000000 | 1000000000 |
| 8 | 1152921504606846976 | 1152921504606846976 |
| 9 | 6.18 | NULL |
| 10 | "abcd" | NULL |
| 11 | {} | NULL |
| 12 | {"k1":"v31","k2":300} | NULL |
| 13 | [] | NULL |
| 14 | [123,456] | NULL |
| 15 | ["abc","def"] | NULL |
| 16 | [null,true,false,100,6.18,"abc"] | NULL |
| 17 | [{"k1":"v41","k2":400},1,"a",3.14] | NULL |
| 18 | {"k1":"v31","k2":300,"a1":[{"k1":"v41","k2":400},1,"a",3.14]} | NULL |
| 26 | {"k1":"v1","k2":200} | NULL |
+------+---------------------------------------------------------------+--------------------------------+
19 rows in set (0.03 sec)
mysql> SELECT id, j, json_extract_bigint(j, '$.k2') FROM test_json ORDER BY id;
+------+---------------------------------------------------------------+-----------------------------------+
| id | j | json_extract_bigint(`j`, '$.k2') |
+------+---------------------------------------------------------------+-----------------------------------+
| 1 | NULL | NULL |
| 2 | null | NULL |
| 3 | true | NULL |
| 4 | false | NULL |
| 5 | 100 | NULL |
| 6 | 10000 | NULL |
| 7 | 1000000000 | NULL |
| 8 | 1152921504606846976 | NULL |
| 9 | 6.18 | NULL |
| 10 | "abcd" | NULL |
| 11 | {} | NULL |
| 12 | {"k1":"v31","k2":300} | 300 |
| 13 | [] | NULL |
| 14 | [123,456] | NULL |
| 15 | ["abc","def"] | NULL |
| 16 | [null,true,false,100,6.18,"abc"] | NULL |
| 17 | [{"k1":"v41","k2":400},1,"a",3.14] | NULL |
| 18 | {"k1":"v31","k2":300,"a1":[{"k1":"v41","k2":400},1,"a",3.14]} | 300 |
| 26 | {"k1":"v1","k2":200} | 200 |
+------+---------------------------------------------------------------+-----------------------------------+
19 rows in set (0.02 sec)
- json_extract_double will extract field with double type, return NULL if the field is not double
mysql> SELECT id, j, json_extract_double(j, '$') FROM test_json ORDER BY id;
+------+---------------------------------------------------------------+--------------------------------+
| id | j | json_extract_double(`j`, '$') |
+------+---------------------------------------------------------------+--------------------------------+
| 1 | NULL | NULL |
| 2 | null | NULL |
| 3 | true | NULL |
| 4 | false | NULL |
| 5 | 100 | 100 |
| 6 | 10000 | 10000 |
| 7 | 1000000000 | 1000000000 |
| 8 | 1152921504606846976 | 1.152921504606847e+18 |
| 9 | 6.18 | 6.18 |
| 10 | "abcd" | NULL |
| 11 | {} | NULL |
| 12 | {"k1":"v31","k2":300} | NULL |
| 13 | [] | NULL |
| 14 | [123,456] | NULL |
| 15 | ["abc","def"] | NULL |
| 16 | [null,true,false,100,6.18,"abc"] | NULL |
| 17 | [{"k1":"v41","k2":400},1,"a",3.14] | NULL |
| 18 | {"k1":"v31","k2":300,"a1":[{"k1":"v41","k2":400},1,"a",3.14]} | NULL |
| 26 | {"k1":"v1","k2":200} | NULL |
+------+---------------------------------------------------------------+--------------------------------+
19 rows in set (0.02 sec)
mysql> SELECT id, j, json_extract_double(j, '$.k2') FROM test_json ORDER BY id;
+------+---------------------------------------------------------------+-----------------------------------+
| id | j | json_extract_double(`j`, '$.k2') |
+------+---------------------------------------------------------------+-----------------------------------+
| 1 | NULL | NULL |
| 2 | null | NULL |
| 3 | true | NULL |
| 4 | false | NULL |
| 5 | 100 | NULL |
| 6 | 10000 | NULL |
| 7 | 1000000000 | NULL |
| 8 | 1152921504606846976 | NULL |
| 9 | 6.18 | NULL |
| 10 | "abcd" | NULL |
| 11 | {} | NULL |
| 12 | {"k1":"v31","k2":300} | 300 |
| 13 | [] | NULL |
| 14 | [123,456] | NULL |
| 15 | ["abc","def"] | NULL |
| 16 | [null,true,false,100,6.18,"abc"] | NULL |
| 17 | [{"k1":"v41","k2":400},1,"a",3.14] | NULL |
| 18 | {"k1":"v31","k2":300,"a1":[{"k1":"v41","k2":400},1,"a",3.14]} | 300 |
| 26 | {"k1":"v1","k2":200} | 200 |
+------+---------------------------------------------------------------+-----------------------------------+
19 rows in set (0.03 sec)
- json_extract_bool will extract field with boolean type, return NULL if the field is not boolean
mysql> SELECT id, j, json_extract_bool(j, '$') FROM test_json ORDER BY id;
+------+---------------------------------------------------------------+------------------------------+
| id | j | json_extract_bool(`j`, '$') |
+------+---------------------------------------------------------------+------------------------------+
| 1 | NULL | NULL |
| 2 | null | NULL |
| 3 | true | 1 |
| 4 | false | 0 |
| 5 | 100 | NULL |
| 6 | 10000 | NULL |
| 7 | 1000000000 | NULL |
| 8 | 1152921504606846976 | NULL |
| 9 | 6.18 | NULL |
| 10 | "abcd" | NULL |
| 11 | {} | NULL |
| 12 | {"k1":"v31","k2":300} | NULL |
| 13 | [] | NULL |
| 14 | [123,456] | NULL |
| 15 | ["abc","def"] | NULL |
| 16 | [null,true,false,100,6.18,"abc"] | NULL |
| 17 | [{"k1":"v41","k2":400},1,"a",3.14] | NULL |
| 18 | {"k1":"v31","k2":300,"a1":[{"k1":"v41","k2":400},1,"a",3.14]} | NULL |
| 26 | {"k1":"v1","k2":200} | NULL |
+------+---------------------------------------------------------------+------------------------------+
19 rows in set (0.01 sec)
mysql> SELECT id, j, json_extract_bool(j, '$[1]') FROM test_json ORDER BY id;
+------+---------------------------------------------------------------+---------------------------------+
| id | j | json_extract_bool(`j`, '$[1]') |
+------+---------------------------------------------------------------+---------------------------------+
| 1 | NULL | NULL |
| 2 | null | NULL |
| 3 | true | NULL |
| 4 | false | NULL |
| 5 | 100 | NULL |
| 6 | 10000 | NULL |
| 7 | 1000000000 | NULL |
| 8 | 1152921504606846976 | NULL |
| 9 | 6.18 | NULL |
| 10 | "abcd" | NULL |
| 11 | {} | NULL |
| 12 | {"k1":"v31","k2":300} | NULL |
| 13 | [] | NULL |
| 14 | [123,456] | NULL |
| 15 | ["abc","def"] | NULL |
| 16 | [null,true,false,100,6.18,"abc"] | 1 |
| 17 | [{"k1":"v41","k2":400},1,"a",3.14] | NULL |
| 18 | {"k1":"v31","k2":300,"a1":[{"k1":"v41","k2":400},1,"a",3.14]} | NULL |
| 26 | {"k1":"v1","k2":200} | NULL |
+------+---------------------------------------------------------------+---------------------------------+
19 rows in set (0.01 sec)
- json_extract_isnull will extract field with json null type, return 1 if the field is json null , else 0
- json null is different from SQL NULL. SQL NULL stands for no value for a field, but json null stands for an field with special value null.
mysql> SELECT id, j, json_extract_isnull(j, '$') FROM test_json ORDER BY id;
+------+---------------------------------------------------------------+--------------------------------+
| id | j | json_extract_isnull(`j`, '$') |
+------+---------------------------------------------------------------+--------------------------------+
| 1 | NULL | NULL |
| 2 | null | 1 |
| 3 | true | 0 |
| 4 | false | 0 |
| 5 | 100 | 0 |
| 6 | 10000 | 0 |
| 7 | 1000000000 | 0 |
| 8 | 1152921504606846976 | 0 |
| 9 | 6.18 | 0 |
| 10 | "abcd" | 0 |
| 11 | {} | 0 |
| 12 | {"k1":"v31","k2":300} | 0 |
| 13 | [] | 0 |
| 14 | [123,456] | 0 |
| 15 | ["abc","def"] | 0 |
| 16 | [null,true,false,100,6.18,"abc"] | 0 |
| 17 | [{"k1":"v41","k2":400},1,"a",3.14] | 0 |
| 18 | {"k1":"v31","k2":300,"a1":[{"k1":"v41","k2":400},1,"a",3.14]} | 0 |
| 26 | {"k1":"v1","k2":200} | 0 |
+------+---------------------------------------------------------------+--------------------------------+
19 rows in set (0.03 sec)
check if a field is existed in json by json_exists_path
mysql> SELECT id, j, json_exists_path(j, '$') FROM test_json ORDER BY id;
+------+---------------------------------------------------------------+-----------------------------+
| id | j | json_exists_path(`j`, '$') |
+------+---------------------------------------------------------------+-----------------------------+
| 1 | NULL | NULL |
| 2 | null | 1 |
| 3 | true | 1 |
| 4 | false | 1 |
| 5 | 100 | 1 |
| 6 | 10000 | 1 |
| 7 | 1000000000 | 1 |
| 8 | 1152921504606846976 | 1 |
| 9 | 6.18 | 1 |
| 10 | "abcd" | 1 |
| 11 | {} | 1 |
| 12 | {"k1":"v31","k2":300} | 1 |
| 13 | [] | 1 |
| 14 | [123,456] | 1 |
| 15 | ["abc","def"] | 1 |
| 16 | [null,true,false,100,6.18,"abc"] | 1 |
| 17 | [{"k1":"v41","k2":400},1,"a",3.14] | 1 |
| 18 | {"k1":"v31","k2":300,"a1":[{"k1":"v41","k2":400},1,"a",3.14]} | 1 |
| 26 | {"k1":"v1","k2":200} | 1 |
+------+---------------------------------------------------------------+-----------------------------+
19 rows in set (0.02 sec)
mysql> SELECT id, j, json_exists_path(j, '$.k1') FROM test_json ORDER BY id;
+------+---------------------------------------------------------------+--------------------------------+
| id | j | json_exists_path(`j`, '$.k1') |
+------+---------------------------------------------------------------+--------------------------------+
| 1 | NULL | NULL |
| 2 | null | 0 |
| 3 | true | 0 |
| 4 | false | 0 |
| 5 | 100 | 0 |
| 6 | 10000 | 0 |
| 7 | 1000000000 | 0 |
| 8 | 1152921504606846976 | 0 |
| 9 | 6.18 | 0 |
| 10 | "abcd" | 0 |
| 11 | {} | 0 |
| 12 | {"k1":"v31","k2":300} | 1 |
| 13 | [] | 0 |
| 14 | [123,456] | 0 |
| 15 | ["abc","def"] | 0 |
| 16 | [null,true,false,100,6.18,"abc"] | 0 |
| 17 | [{"k1":"v41","k2":400},1,"a",3.14] | 0 |
| 18 | {"k1":"v31","k2":300,"a1":[{"k1":"v41","k2":400},1,"a",3.14]} | 1 |
| 26 | {"k1":"v1","k2":200} | 1 |
+------+---------------------------------------------------------------+--------------------------------+
19 rows in set (0.03 sec)
mysql> SELECT id, j, json_exists_path(j, '$[2]') FROM test_json ORDER BY id;
+------+---------------------------------------------------------------+--------------------------------+
| id | j | json_exists_path(`j`, '$[2]') |
+------+---------------------------------------------------------------+--------------------------------+
| 1 | NULL | NULL |
| 2 | null | 0 |
| 3 | true | 0 |
| 4 | false | 0 |
| 5 | 100 | 0 |
| 6 | 10000 | 0 |
| 7 | 1000000000 | 0 |
| 8 | 1152921504606846976 | 0 |
| 9 | 6.18 | 0 |
| 10 | "abcd" | 0 |
| 11 | {} | 0 |
| 12 | {"k1":"v31","k2":300} | 0 |
| 13 | [] | 0 |
| 14 | [123,456] | 0 |
| 15 | ["abc","def"] | 0 |
| 16 | [null,true,false,100,6.18,"abc"] | 1 |
| 17 | [{"k1":"v41","k2":400},1,"a",3.14] | 1 |
| 18 | {"k1":"v31","k2":300,"a1":[{"k1":"v41","k2":400},1,"a",3.14]} | 0 |
| 26 | {"k1":"v1","k2":200} | 0 |
+------+---------------------------------------------------------------+--------------------------------+
19 rows in set (0.02 sec)
get the datatype of a field in json by json_type
- return the data type of the field specified by json path, NULL if not existed.
mysql> SELECT id, j, json_type(j, '$') FROM test_json ORDER BY id;
+------+---------------------------------------------------------------+----------------------+
| id | j | json_type(`j`, '$') |
+------+---------------------------------------------------------------+----------------------+
| 1 | NULL | NULL |
| 2 | null | null |
| 3 | true | bool |
| 4 | false | bool |
| 5 | 100 | int |
| 6 | 10000 | int |
| 7 | 1000000000 | int |
| 8 | 1152921504606846976 | bigint |
| 9 | 6.18 | double |
| 10 | "abcd" | string |
| 11 | {} | object |
| 12 | {"k1":"v31","k2":300} | object |
| 13 | [] | array |
| 14 | [123,456] | array |
| 15 | ["abc","def"] | array |
| 16 | [null,true,false,100,6.18,"abc"] | array |
| 17 | [{"k1":"v41","k2":400},1,"a",3.14] | array |
| 18 | {"k1":"v31","k2":300,"a1":[{"k1":"v41","k2":400},1,"a",3.14]} | object |
| 26 | {"k1":"v1","k2":200} | object |
+------+---------------------------------------------------------------+----------------------+
19 rows in set (0.02 sec)
mysql> select id, j, json_type(j, '$.k1') from test_json order by id;
+------+---------------------------------------------------------------+-------------------------+
| id | j | json_type(`j`, '$.k1') |
+------+---------------------------------------------------------------+-------------------------+
| 1 | NULL | NULL |
| 2 | null | NULL |
| 3 | true | NULL |
| 4 | false | NULL |
| 5 | 100 | NULL |
| 6 | 10000 | NULL |
| 7 | 1000000000 | NULL |
| 8 | 1152921504606846976 | NULL |
| 9 | 6.18 | NULL |
| 10 | "abcd" | NULL |
| 11 | {} | NULL |
| 12 | {"k1":"v31","k2":300} | string |
| 13 | [] | NULL |
| 14 | [123,456] | NULL |
| 15 | ["abc","def"] | NULL |
| 16 | [null,true,false,100,6.18,"abc"] | NULL |
| 17 | [{"k1":"v41","k2":400},1,"a",3.14] | NULL |
| 18 | {"k1":"v31","k2":300,"a1":[{"k1":"v41","k2":400},1,"a",3.14]} | string |
| 26 | {"k1":"v1","k2":200} | string |
+------+---------------------------------------------------------------+-------------------------+
19 rows in set (0.03 sec)
FAQ
- Is there a difference between null in JSON and NULL in SQL (i.e., IS NULL)?
Yes, there is a difference. In JSON, null (e.g., {"key1": null}) means that the key key1 exists and its value is explicitly null. This is a special type that gets encoded into JSON binary.
In contrast, SQL NULL (when using IS NULL) can indicate that the key doesn’t exist at all in the JSON object.
For example:
mysql> SELECT JSON_EXTRACT_STRING('{"key1" : null}', "$.key1") IS NULL;
+----------------------------------------------------------+
| JSON_EXTRACT_STRING('{"key1" : null}', "$.key1") IS NULL |
+----------------------------------------------------------+
| 0 |
+----------------------------------------------------------+
1 row in set (0.00 sec)
mysql> SELECT JSON_EXTRACT_STRING('{"key1" : null}', "$.key_not_exist") IS NULL;
+-------------------------------------------------------------------+
| JSON_EXTRACT_STRING('{"key1" : null}', "$.key_not_exist") IS NULL |
+-------------------------------------------------------------------+
| 1 |
+-------------------------------------------------------------------+
1 row in set (0.01 sec)
- What’s the difference between
GET_JSON_XXX
andJSON_EXTRACT_XXX
functions, and how should I choose between them?
The GET_JSON_XXX
functions are designed for use on string types — they extract values directly from raw JSON strings. On the other hand, JSON_EXTRACT_XXX
functions are implemented specifically for the JSON data type and are optimized for it.
keywords
JSONB, JSON, json_parse, json_parse_error_to_null, json_parse_error_to_value, json_extract, json_extract_isnull, json_extract_bool, json_extract_int, json_extract_bigint, json_extract_double, json_extract_string, json_exists_path, json_type