# Data Lakehouse FAQ

## Certificate Issues
- When querying, the error `curl 77: Problem with the SSL CA cert.` occurs. This indicates that the local system CA certificate is too old and needs to be updated.

  - Download the latest CA certificate from https://curl.se/docs/caextract.html.
  - Place the downloaded `cacert-xxx.pem` into the `/etc/ssl/certs/` directory, for example: `sudo cp cacert-xxx.pem /etc/ssl/certs/ca-certificates.crt`.
- When querying, an error occurs:

  ```text
  ERROR 1105 (HY000): errCode = 2, detailMessage = (x.x.x.x)[CANCELLED][INTERNAL_ERROR]error setting certificate verify locations: CAfile: /etc/ssl/certs/ca-certificates.crt CApath: none
  ```

  Run the following commands to install and link the CA certificates:

  ```shell
  yum install -y ca-certificates
  ln -s /etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt /etc/ssl/certs/ca-certificates.crt
  ```
## Kerberos
- When connecting to a Hive Metastore authenticated with Kerberos, the error `GSS initiate failed` is encountered.

  This is usually due to incorrect Kerberos authentication information. You can troubleshoot by following these steps:

  1. In versions prior to 1.2.1, the libhdfs3 library that Doris depends on did not enable gsasl. Please update to version 1.2.2 or later.
  2. Ensure that the correct keytab and principal are set for each component, and verify that the keytab file exists on all FE and BE nodes.
     - `hadoop.kerberos.keytab`/`hadoop.kerberos.principal`: used for Hadoop HDFS access; fill in the values corresponding to HDFS.
     - `hive.metastore.kerberos.principal`: used for the Hive Metastore.
  3. Try replacing the IP in the principal with a domain name (do not use the default `_HOST` placeholder).
  4. Ensure that the `/etc/krb5.conf` file exists on all FE and BE nodes.
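  The keytab and `krb5.conf` existence checks above can be scripted and run on every node; the keytab path below is a placeholder for whatever your Catalog actually configures:

  ```shell
  # Existence check for the Kerberos files Doris needs on every FE/BE node.
  # /path/to/your.keytab is a placeholder for the keytab configured in the Catalog.
  for f in /etc/krb5.conf /path/to/your.keytab; do
      if [ -f "$f" ]; then
          echo "OK: $f"
      else
          echo "MISSING: $f"
      fi
  done
  ```

  On nodes where the keytab exists, you can additionally confirm the principals it contains with `klist -kt /path/to/your.keytab`.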
- When connecting to a Hive database through the Hive Catalog, an error occurs: `RemoteException: SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]`

  If `show databases` and `show tables` work fine but the error occurs during the query, follow these two steps:

  1. Place `core-site.xml` and `hdfs-site.xml` in the `fe/conf` and `be/conf` directories.
  2. Execute `kinit` on the BE node, restart the BE, and then run the query again.
- When encountering the error `GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos Ticket)` while querying a table configured with Kerberos, restarting the FE and BE nodes usually resolves the issue.

  - Before restarting the nodes, add `-Djavax.security.auth.useSubjectCredsOnly=false` to the `JAVA_OPTS` parameter in `"${DORIS_HOME}/be/conf/be.conf"` so that JAAS credentials are obtained through the underlying mechanism rather than the application.
  - Refer to JAAS Troubleshooting for solutions to common JAAS errors.
- To resolve the error `Unable to obtain password from user` when configuring Kerberos in the Catalog:

  - Ensure the principal used is listed in the keytab by checking with `klist -kt your.keytab`.
  - Verify that the Catalog configuration is not missing settings such as `yarn.resourcemanager.principal`.
  - If the above checks pass, the JDK installed by the system's package manager may not support certain encryption algorithms. Consider installing a JDK manually and setting the `JAVA_HOME` environment variable.
  - Kerberos typically uses AES-256 for encryption. For Oracle JDK, JCE must be installed. Some distributions of OpenJDK ship with unlimited-strength JCE by default, so no separate installation is needed.
  - JCE versions correspond to JDK versions; download the JCE zip package appropriate for your JDK version and extract it to the `$JAVA_HOME/jre/lib/security` directory.
- When encountering the error `java.security.InvalidKeyException: Illegal key size` while accessing HDFS with KMS, upgrade the JDK to Java 8u162 or later, or install the corresponding JCE Unlimited Strength Jurisdiction Policy Files.
- If configuring Kerberos in the Catalog results in the error `SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]`, place the `core-site.xml` file in the `"${DORIS_HOME}/be/conf"` directory.

  If accessing HDFS results in the error `No common protection layer between client and server`, ensure that the `hadoop.rpc.protection` property is consistent between the client and the server.

  ```xml
  <?xml version="1.0" encoding="UTF-8"?>
  <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
  <configuration>
      <property>
          <name>hadoop.security.authentication</name>
          <value>kerberos</value>
      </property>
  </configuration>
  ```
- When using Broker Load with Kerberos configured and encountering the error `Cannot locate default realm.`:

  Add the configuration item `-Djava.security.krb5.conf=/your-path` to `JAVA_OPTS` in the Broker's `start_broker.sh` script.
- When using Kerberos configuration in the Catalog, the `hadoop.username` property cannot be used at the same time.
- Accessing Kerberos with JDK 17

  When running Doris with JDK 17 and accessing Kerberos services, you may encounter issues caused by deprecated encryption algorithms. Add the `allow_weak_crypto=true` property to `krb5.conf`, or upgrade the encryption algorithms used by Kerberos.

  For more details, refer to: https://seanjmullan.org/blog/2021/09/14/jdk17#kerberos
## JDBC Catalog
- Error connecting to SQL Server via JDBC Catalog: `unable to find valid certification path to requested target`

  Add the `trustServerCertificate=true` option to the `jdbc_url`.
- Connecting to a MySQL database via JDBC Catalog results in garbled Chinese characters or incorrect Chinese query conditions

  Add `useUnicode=true&characterEncoding=utf-8` to the `jdbc_url`.

  Note: Starting from version 1.2.3, these parameters are added automatically when connecting to a MySQL database via JDBC Catalog.
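  For reference, a `jdbc_url` carrying these parameters looks like the following (host, port, and database name are placeholders):

  ```text
  'jdbc_url' = 'jdbc:mysql://127.0.0.1:3306/demo?useUnicode=true&characterEncoding=utf-8'
  ```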
- Error connecting to a MySQL database via JDBC Catalog: `Establishing SSL connection without server's identity verification is not recommended`

  Add `useSSL=true` to the `jdbc_url`.
- When synchronizing MySQL data to Doris using JDBC Catalog, date data synchronization errors occur.

  Verify that the MySQL version matches the MySQL driver package; for example, MySQL 8 and above require the driver `com.mysql.cj.jdbc.Driver`.
- When a single field is too large, a Java OOM occurs on the BE side during a query.

  When the JDBC Scanner reads data through JDBC, the session variable `batch_size` determines the number of rows processed in the JVM per batch. If a single field is too large, `field_size * batch_size` (an approximate value; JVM static memory and data-copy overhead also count) may exceed the JVM memory limit, resulting in an OOM.

  Solutions:

  - Reduce the `batch_size` value by executing `set batch_size = 512;`. The default value is 4064.
  - Increase the BE JVM memory by modifying the `-Xmx` parameter in `JAVA_OPTS`, for example: `-Xmx8g`.
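  As a rough illustration of the `field_size * batch_size` estimate (the 10 MB field size below is a made-up example, not a Doris limit):

  ```shell
  field_size=$((10 * 1024 * 1024))   # assume a 10 MB field, illustrative only
  default_batch=4064                 # Doris default batch_size
  small_batch=512                    # the reduced value from the solution above

  # Approximate per-batch memory demand in GB (ignoring JVM static memory and copy overhead)
  echo "$(( field_size * default_batch / 1024 / 1024 / 1024 )) GB with batch_size=4064"
  echo "$(( field_size * small_batch  / 1024 / 1024 / 1024 )) GB with batch_size=512"
  ```

  With these numbers the default batch needs on the order of 39 GB, while `batch_size = 512` brings it down to about 5 GB, which shows why lowering `batch_size` is usually the first fix to try.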
## Hive Catalog
- Accessing an Iceberg or Hive table through the Hive Catalog reports the error `failed to get schema` or `Storage schema reading not supported`

  You can try the following methods:

  1. Put the `iceberg` runtime-related jar package in the `lib/` directory of Hive.
  2. Configure `metastore.storage.schema.reader.impl=org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader` in `hive-site.xml`. After the configuration is complete, restart the Hive Metastore.
  3. Add `"get_schema_from_table" = "true"` to the Catalog properties. This parameter is supported since versions 2.1.10 and 3.0.6.
- Error connecting to Hive Catalog: `Caused by: java.lang.NullPointerException`

  If fe.log contains the following stack trace:

  ```text
  Caused by: java.lang.NullPointerException
          at org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook.getFilteredObjects(AuthorizationMetaStoreFilterHook.java:78) ~[hive-exec-3.1.3-core.jar:3.1.3]
          at org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook.filterDatabases(AuthorizationMetaStoreFilterHook.java:55) ~[hive-exec-3.1.3-core.jar:3.1.3]
          at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getAllDatabases(HiveMetaStoreClient.java:1548) ~[doris-fe.jar:3.1.3]
          at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getAllDatabases(HiveMetaStoreClient.java:1542) ~[doris-fe.jar:3.1.3]
          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_181]
  ```

  Try adding `"metastore.filter.hook" = "org.apache.hadoop.hive.metastore.DefaultMetaStoreFilterHookImpl"` to the `CREATE CATALOG` statement to resolve it.
- After creating a Hive Catalog, `show tables` works fine but querying results in `java.net.UnknownHostException: xxxxx`

  Add the following to the Catalog's PROPERTIES:

  ```text
  'fs.defaultFS' = 'hdfs://<your_nameservice_or_actually_HDFS_IP_and_port>'
  ```
- Tables in ORC format in Hive 1.x may have system column names such as `_col0`, `_col1`, `_col2` in the underlying ORC file schema. In this case, set `hive.version` to 1.x.x in the Catalog configuration so that these columns map to the column names in the Hive table.

  ```sql
  CREATE CATALOG hive PROPERTIES (
      'hive.version' = '1.x.x'
  );
  ```
- When querying table data using a Catalog, if Hive Metastore errors such as `Invalid method name` are encountered, set the `hive.version` parameter.
- When querying a table in ORC format, if the FE reports `Could not obtain block` or `Caused by: java.lang.NoSuchFieldError: types`, it may be because the FE by default accesses HDFS to retrieve file information and perform file splitting, and in some cases the FE cannot access HDFS. This can be resolved by adding the parameter `"hive.exec.orc.split.strategy" = "BI"`. Other options are HYBRID (the default) and ETL.
- In Hive, you can find the partition field values of a Hudi table, but in Doris, you cannot.

  Doris and Hive currently query Hudi in different ways. In Doris, you need to add the partition fields to the avsc file structure of the Hudi table. If they are not added, Doris queries with partition_val empty (even if `hoodie.datasource.hive_sync.partition_fields=partition_val` is set).

  ```json
  {
      "type": "record",
      "name": "record",
      "fields": [{
          "name": "partition_val",
          "type": [
              "null",
              "string"
          ],
          "doc": "Preset partition field, empty string when not partitioned",
          "default": null
      },
      {
          "name": "name",
          "type": "string",
          "doc": "Name"
      },
      {
          "name": "create_time",
          "type": "string",
          "doc": "Creation time"
      }]
  }
  ```
- When querying a Hive external table, if you encounter the error `java.lang.ClassNotFoundException: Class com.hadoop.compression.lzo.LzoCodec not found`, search for `hadoop-lzo-*.jar` in the Hadoop environment, place it in the `"${DORIS_HOME}/fe/lib/"` directory, and restart the FE.

  Starting from version 2.0.2, you can place this file in the `custom_lib/` directory of the FE (create it manually if it does not exist) to prevent it from being lost when the `lib` directory is replaced during a cluster upgrade.
- When creating a Hive table with the serde `org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe` and encountering the error `storage schema reading not supported` when accessing the table, add the following configuration to the `hive-site.xml` file and restart the HMS service:

  ```xml
  <property>
      <name>metastore.storage.schema.reader.impl</name>
      <value>org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader</value>
  </property>
  ```
- Error: `java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty`

  The complete error message in the FE log is as follows:

  ```text
  org.apache.doris.common.UserException: errCode = 2, detailMessage = S3 list path failed. path=s3://bucket/part-*,msg=errors while get file status listStatus on s3://bucket: com.amazonaws.SdkClientException: Unable to execute HTTP request: Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty: Unable to execute HTTP request: Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty
  org.apache.doris.common.UserException: errCode = 2, detailMessage = S3 list path exception. path=s3://bucket/part-*, err: errCode = 2, detailMessage = S3 list path failed. path=s3://bucket/part-*,msg=errors while get file status listStatus on s3://bucket: com.amazonaws.SdkClientException: Unable to execute HTTP request: Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty: Unable to execute HTTP request: Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty
  org.apache.hadoop.fs.s3a.AWSClientIOException: listStatus on s3://bucket: com.amazonaws.SdkClientException: Unable to execute HTTP request: Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty: Unable to execute HTTP request: Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty
  Caused by: com.amazonaws.SdkClientException: Unable to execute HTTP request: Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty
  Caused by: javax.net.ssl.SSLException: Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty
  Caused by: java.lang.RuntimeException: Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty
  Caused by: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty
  ```

  Try updating the CA certificates on the FE node using `update-ca-trust` (CentOS/RockyLinux), and then restart the FE process.
- BE error: `java.lang.InternalError`

  If you see an error similar to the following in `be.INFO`:

  ```text
  W20240506 15:19:57.553396 266457 jni-util.cpp:259] java.lang.InternalError
          at org.apache.hadoop.io.compress.zlib.ZlibDecompressor.init(Native Method)
          at org.apache.hadoop.io.compress.zlib.ZlibDecompressor.<init>(ZlibDecompressor.java:114)
          at org.apache.hadoop.io.compress.GzipCodec$GzipZlibDecompressor.<init>(GzipCodec.java:229)
          at org.apache.hadoop.io.compress.GzipCodec.createDecompressor(GzipCodec.java:188)
          at org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:183)
          at org.apache.parquet.hadoop.CodecFactory$HeapBytesDecompressor.<init>(CodecFactory.java:99)
          at org.apache.parquet.hadoop.CodecFactory.createDecompressor(CodecFactory.java:223)
          at org.apache.parquet.hadoop.CodecFactory.getDecompressor(CodecFactory.java:212)
          at org.apache.parquet.hadoop.CodecFactory.getDecompressor(CodecFactory.java:43)
  ```

  This is because the Doris built-in `libz.a` conflicts with the system's `libz.so`. To resolve it, first execute `export LD_LIBRARY_PATH=/path/to/be/lib:$LD_LIBRARY_PATH`, and then restart the BE process.
- When inserting data into Hive, the error `HiveAccessControlException Permission denied: user [user_a] does not have [UPDATE] privilege on [database/table]` occurs.

  After data is inserted, the corresponding statistics need to be updated, and this update operation requires the ALTER privilege. Therefore, grant the ALTER privilege to this user in Ranger.
- When querying ORC files, an error like `Orc row reader nextBatch failed. reason = Can't open /usr/share/zoneinfo/+08:00` occurs.

  First check the `time_zone` setting of the current session. It is recommended to use a region-based timezone name such as `Asia/Shanghai`.

  If the session timezone is already set to `Asia/Shanghai` but the query still fails, the ORC file was generated with the timezone `+08:00`, and that name is looked up when parsing the ORC footer during query execution. In this case, you can create a symbolic link under the `/usr/share/zoneinfo/` directory that points `+08:00` to an equivalent timezone.
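  The symlink fix can be rehearsed in a scratch directory before touching the real `/usr/share/zoneinfo` (which requires root); the stub file below stands in for the real `Asia/Shanghai` zone file:

  ```shell
  ZONEDIR=$(mktemp -d)                      # stand-in for /usr/share/zoneinfo
  printf 'TZif' > "$ZONEDIR/Asia_Shanghai"  # stub for the real Asia/Shanghai zone file
  ln -s "$ZONEDIR/Asia_Shanghai" "$ZONEDIR/+08:00"
  readlink "$ZONEDIR/+08:00"                # prints the link target
  ```

  On an actual BE node, the equivalent command would be, for example, `ln -s /usr/share/zoneinfo/Asia/Shanghai '/usr/share/zoneinfo/+08:00'`, run as root on every node that executes the query.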
- When querying a Hive table that uses a JSON SerDe (e.g., `org.openx.data.jsonserde.JsonSerDe`), the error `failed to get schema` or `Storage schema reading not supported` occurs

  When a Hive table is stored in JSON format (ROW FORMAT SERDE is `org.openx.data.jsonserde.JsonSerDe`), the Hive Metastore may not be able to read the table's schema through the default method, causing the following error when querying from Doris:

  ```text
  errCode = 2, detailMessage = failed to get schema for table xxx in db xxx.
  reason: org.apache.hadoop.hive.metastore.api.MetaException:
  java.lang.UnsupportedOperationException: Storage schema reading not supported
  ```

  This can be resolved by adding `"get_schema_from_table" = "true"` to the Catalog properties. This parameter instructs Doris to retrieve the schema directly from the Hive table metadata instead of relying on the underlying storage's schema reader.

  ```sql
  CREATE CATALOG hive PROPERTIES (
      'type' = 'hms',
      'hive.metastore.uris' = 'thrift://x.x.x.x:9083',
      'get_schema_from_table' = 'true'
  );
  ```

  This parameter is supported since versions 2.1.10 and 3.0.6.
- When querying Hive Catalog tables, query planning is extremely slow, the `nereids cost too much time` error occurs, and each HMS access takes a consistently long time (e.g., around 10 seconds).

  Root cause analysis:

  This issue is usually not caused by slow execution of the HMS RPC itself. The most common root cause is incorrect DNS configuration on the Doris FE node. Hostname resolution is triggered during the initialization phase of the Hive Metastore client. If the configured DNS server is unreachable or unresponsive, every new HMS client connection incurs a DNS resolution timeout (typically 10 seconds), which severely slows down metadata fetching.

  Typical symptoms:

  - Normal network connectivity: the HMS port is reachable, but metadata access in Doris remains extremely slow.
  - Consistent delay: the delay consistently hits a fixed timeout threshold (e.g., 10 seconds).
  - Workarounds fail: simply increasing the HMS client timeout in the Catalog properties only masks the error but does not eliminate the fixed 10-second delay on each connection.

  Troubleshooting steps:

  Run the following commands on the Doris FE node to verify DNS and hostname resolution:

  ```shell
  # Check current DNS server configuration
  cat /etc/resolv.conf

  # Test if the DNS server is reachable and measure resolution latency
  ping <nameserver_ip>
  dig @<nameserver_ip> example.com
  dig @<nameserver_ip> -x <hms_ip>
  ```

  Solutions (choose one):

  - Fix the DNS configuration (recommended): correct the `nameserver` entries in `/etc/resolv.conf` on the Doris FE node to ensure the DNS service is reachable and responds quickly. If DNS is not required in your local network environment, consider commenting out the invalid nameservers.
  - Configure a static hosts mapping: add the IP-to-hostname mapping of the HMS nodes to `/etc/hosts` on the FE node.
  - Standardize Catalog properties: when creating the Catalog, it is highly recommended to use a resolvable hostname instead of a bare IP address for the `hive.metastore.uris` property.
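  For the static hosts mapping, the entry is an ordinary `/etc/hosts` line; the addresses and hostnames below are placeholders:

  ```text
  # /etc/hosts on the Doris FE node (placeholder values)
  10.0.0.21   hms-node-1.example.com
  10.0.0.22   hms-node-2.example.com
  ```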
- Queries on Hive Catalog tables occasionally hang for a very long time or report the optimizer timeout error `nereids cost too much time`, but subsequent queries work fine immediately afterward.

  Problem description:

  This usually happens after the Catalog has been idle for a while. When an HMS RPC is initiated and a stale connection from the pool is reused, the request hangs for the duration of the socket timeout (default 10s). Due to the Hive client's internal retry mechanism, multiple retries can accumulate into waits of 20-30 seconds. This makes the query planning phase extremely slow, often triggering the Doris FE optimizer timeout error `nereids cost too much time`. Once the connection is purged and rebuilt, performance returns to normal.

  Root cause analysis:

  Doris maintains a client pool for each HMS Catalog to reuse connections. In complex network environments (e.g., across VPCs, through firewalls, or NAT gateways), idle TCP connections are often "silently" reclaimed by network devices after an idle timeout. Since these devices typically do not send FIN/RST packets to notify the endpoints, Doris still believes the connection is valid. Reusing such a "zombie connection" requires waiting for a full socket timeout before the failure is detected and a retry is triggered.

  Troubleshooting steps:

  - Verify whether there are firewalls, NAT gateways, or load balancers between the Doris FE and HMS.
  - Use the Pulse (hms-tools) diagnostic tool. If the tool shows fast network connectivity but stable delays in multiples of 10s when executing RPCs after a long idle period, it confirms that idle connections are being silently reclaimed.

  Solution:

  Configure the connection lifetime in the Catalog properties to be slightly shorter than the network device's idle timeout. We recommend using Hive's native socket lifetime property:

  ```sql
  CREATE CATALOG hive_catalog PROPERTIES (
      "type" = "hms",
      "hive.metastore.uris" = "thrift://<hms_host>:<port>",
      -- Set a value shorter than your network's idle timeout (e.g., 300s)
      "hive.metastore.client.socket.lifetime" = "300s"
  );
  ```

  When this is set, the HMS client checks the connection age before sending an RPC. If the age exceeds the lifetime, it proactively reconnects, avoiding the long hangs and optimizer timeouts caused by stale connections.
## HDFS
- When accessing HDFS 3.x, if you encounter the error `java.lang.VerifyError: xxx`: in versions prior to 1.2.1, Doris depends on Hadoop 2.8. Update the Hadoop dependency to 2.10.2, or upgrade Doris to 1.2.2 or later.
- Using Hedged Read to optimize slow HDFS reads.

  In some cases, high load on HDFS may cause reads of a replica on a particular DataNode to take a long time, slowing down the overall query. The HDFS client provides the Hedged Read feature: if a read request does not return within a certain threshold, another read thread is started to read the same data, and whichever result returns first is used.

  Note: this feature may increase the load on the HDFS cluster, so use it judiciously.

  You can enable it as follows:

  ```sql
  CREATE CATALOG regression PROPERTIES (
      'type' = 'hms',
      'hive.metastore.uris' = 'thrift://172.21.16.47:7004',
      'dfs.client.hedged.read.threadpool.size' = '128',
      'dfs.client.hedged.read.threshold.millis' = '500'
  );
  ```

  - `dfs.client.hedged.read.threadpool.size` is the number of threads used for Hedged Read, shared by one HDFS client. Typically, for one HDFS cluster, the BE nodes share one HDFS client.
  - `dfs.client.hedged.read.threshold.millis` is the read threshold in milliseconds. When a read request exceeds this threshold without returning, a Hedged Read is triggered.

  When enabled, you can see the related counters in the query profile:

  - `TotalHedgedRead`: the number of times a Hedged Read was initiated.
  - `HedgedReadWins`: the number of successful Hedged Reads (the hedged request returned faster than the original request).

  Note that these values are cumulative for a single HDFS client, not for a single query; the same HDFS client can be reused by multiple queries.
- `Couldn't create proxy provider class org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider`

  In the startup scripts of the FE and BE, the environment variable `HADOOP_CONF_DIR` is added to the CLASSPATH. If `HADOOP_CONF_DIR` is set incorrectly, for example pointing to a non-existent or wrong path, the wrong `xxx-site.xml` files may be loaded and incorrect information read.

  Check whether `HADOOP_CONF_DIR` is configured correctly, or remove this environment variable.
- `BlockMissingException: Could not obtain block: BP-XXXXXXXXX No live nodes contain current block`

  Possible solutions include:

  - Use `hdfs fsck file -files -blocks -locations` to check whether the file is healthy.
  - Check connectivity with the DataNodes using `telnet`.

    The error log may contain the following:

    `No live nodes contain current block Block locations: DatanodeInfoWithStorage[10.70.150.122:50010,DS-7bba8ffc-651c-4617-90e1-6f45f9a5f896,DISK]`

    In this case, first check the connectivity between the Doris cluster and `10.70.150.122:50010`.

    In addition, some HDFS clusters use a dual-network setup with internal and external IPs, in which case domain names are required for communication, and the following must be added to the Catalog properties: `"dfs.client.use.datanode.hostname" = "true"`. Also check whether this parameter is true in the `hdfs-site.xml` files placed under `fe/conf` and `be/conf`.
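    The parameter check above corresponds to the following property entry in the `hdfs-site.xml` under `fe/conf` and `be/conf`:

    ```xml
    <property>
        <name>dfs.client.use.datanode.hostname</name>
        <value>true</value>
    </property>
    ```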
  - Check the DataNode logs.

    If you encounter the following error:

    `org.apache.hadoop.hdfs.server.datanode.DataNode: Failed to read expected SASL data transfer protection handshake from client at /XXX.XXX.XXX.XXX:XXXXX. Perhaps the client is running an older version of Hadoop which does not support SASL data transfer protection`

    it means that HDFS has enabled encrypted transmission but the client has not, causing the error. Use either of the following solutions:

    - Copy `hdfs-site.xml` and `core-site.xml` to `fe/conf` and `be/conf`. (Recommended)
    - Find the `dfs.data.transfer.protection` configuration in `hdfs-site.xml` and set this parameter in the Catalog.
- When querying a Hive Catalog table, an error occurs: `RPC response has a length of xxx exceeds maximum data length`

  For example: `RPC response has a length of 1213486160 exceeds maximum data length`

  The value `1213486160` in hexadecimal is `0x48545450`, which corresponds to the ASCII string `"HTTP"`. This indicates that the Doris FE attempted to connect to an HDFS NameNode RPC port but received an HTTP response instead.

  The root cause is that the HDFS NameNode port configured in the Catalog or in `hdfs-site.xml` is incorrect: an HTTP port was used where an RPC port is required. An HDFS NameNode typically exposes two types of ports:

  - RPC port (default: `8020` or `9000`): used for HDFS client communication (this is the correct port for Doris).
  - HTTP port (default: `9870` or `50070`): used for the NameNode Web UI.

  Check the HDFS NameNode port configuration in the Catalog properties or in `hdfs-site.xml` under `fe/conf` and `be/conf`, and ensure it is set to the RPC port (`dfs.namenode.rpc-address`), not the HTTP port (`dfs.namenode.http-address`).
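  You can reproduce the decoding trick for any reported length with a couple of shell commands:

  ```shell
  # The bogus "length" is really the first four bytes of an HTTP response.
  n=1213486160
  hex=$(printf '%x' "$n")            # hexadecimal form of the reported length
  echo "$hex"                        # 48545450
  # Interpret the four hex byte pairs as ASCII (GNU coreutils printf understands \xHH):
  fmt=$(echo "$hex" | sed 's/../\\x&/g')
  env printf "$fmt\n"                # HTTP
  ```

  If the decoded bytes spell out `HTTP` (or any readable text), the port you configured is serving something other than the Hadoop RPC protocol.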
## DLF Catalog
- When using the DLF Catalog, if `Invalid address` occurs while the BE reads JindoFS data, add the domain-name-to-IP mappings that appear in the logs to `/etc/hosts`.
- If there is no permission to read data, use the `hadoop.username` property to specify a user with permission.
- The metadata in the DLF Catalog should be consistent with DLF. When metadata is managed with DLF, newly imported Hive partitions may not be synchronized by DLF, leading to inconsistencies between DLF and Hive metadata. To address this, ensure that the Hive metadata is fully synchronized by DLF.
## Other Issues
- Query results contain garbled characters after mapping a Binary type to Doris

  Doris does not natively support the Binary type, so Binary types from various data lakes or databases are usually mapped to Doris's String type. The String type can only display printable characters. If you need to inspect the content of Binary data, use the `TO_BASE64()` function to convert it to Base64 encoding before further processing.
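  As a sketch (the catalog, table, and column names below are hypothetical):

  ```sql
  -- Inspect the contents of a binary-derived String column as Base64
  SELECT id, TO_BASE64(binary_col) AS binary_b64
  FROM hive_catalog.db1.tbl1
  LIMIT 10;
  ```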
- Analyzing Parquet files

  Parquet files generated by different systems can differ in format details, such as the number of RowGroups and index values, so it is sometimes necessary to inspect a Parquet file's metadata for troubleshooting or performance analysis. The following tool helps analyze Parquet files more conveniently:

  1. Download and unzip Apache Parquet CLI 1.14.0.
  2. Download the Parquet file to be analyzed to your local machine, for example to the path `/path/to/file.parquet`.
  3. Analyze the metadata of the Parquet file with the following command:

     ```shell
     ./parquet-tools meta /path/to/file.parquet
     ```

  4. For more functionality, refer to the Apache Parquet CLI documentation.
## Diagnostic Tools

### Pulse
Pulse is a lightweight connectivity testing toolkit designed to diagnose infrastructure dependencies in data lake environments. It includes several specialized tools to help users quickly pinpoint environment-related issues in external table access.
Pulse consists of the following key toolsets:
- HMS Diagnostic Tool (`hms-tools`):
  - Designed specifically for troubleshooting Hive Metastore (HMS) issues.
  - Supports health checks, ping tests, object metadata retrieval, and configuration diagnostics.
  - Performance benchmarking: a `bench` mode measures the response distribution and latency of HMS, helping determine whether the bottleneck is at the metadata layer.
- Kerberos Diagnostic Tool (`kerberos-tools`):
  - Used to validate `krb5.conf` configurations in environments with Kerberos authentication.
  - Supports testing KDC reachability, inspecting keytab files, and performing login tests to ensure the security layer is not blocking the connection.
- Object Storage Diagnostic Tools (`s3-tools`, `gcs-tools`, `azure-blob-cpp`):
  - Diagnostic tools for the major cloud storage services (AWS S3, Google GCS, Azure Blob Storage).
  - Used for troubleshooting common external table access issues such as "Access Denied" or "Bucket Not Found".
  - Support validating credential sources and STS identities, and performing bucket-level operation tests.
Example commands (HMS):

```shell
# Test basic HMS connectivity and latency details using hms-tools
java -jar hms-tools.jar ping --uris thrift://<hms_host>:<port> --count 3 --verbose

# Benchmark actual metadata RPC response distribution using hms-tools
java -jar hms-tools.jar bench --uris thrift://<hms_host>:<port> --rpc get_all_databases --iterations 10
```
When metadata access is slow or external table connectivity fails, it is recommended to use the corresponding Pulse tool based on the issue type (e.g., authentication failure, slow metadata, or storage reachability) for investigation. If the connect phase is extremely fast but there are significant and consistent delays during the overall initialization, please refer to the FAQ above to check the DNS and hostname resolution settings on the FE node.