How can Trino query Hive data through the Alluxio cache when the Hive table location is an HDFS path? #25358

@houhang1005

Hi, I need a little help. I am using Trino 460 to query Hive data, and every Hive table's location is an HDFS path. I would like to use caching to reduce query time, so I changed etc/catalog/hive.properties on all Trino nodes as follows:

connector.name=hive
hive.metastore.uri=thrift://xxx:9083
fs.alluxio.enabled=true
fs.hadoop.enabled=true
hive.config.resources=/data/hh/trino460/trino-server-460/etc/core-site.xml,/data/hh/trino460/trino-server-460/etc/hdfs-site.xml
fs.cache.enabled=true

fs.cache.max-disk-usage-percentages=80
fs.cache.ttl=2d
fs.cache.preferred-hosts-count=2
fs.cache.page-size=15MB
fs.cache.directories=/default_tests_files/stan/trino_qry_hive/cache2

I also put core-site.xml, hdfs-site.xml, hive-site.xml, and alluxio-site.properties in etc/catalog/. Besides that, there is a copy of alluxio-site.properties in /opt/alluxio/conf/.

But when I use Trino to query a Hive table, I get the result, yet no cache files appear in the Alluxio directory (/default_tests_files/stan/trino_qry_hive/cache2). Instead, the cache is generated on the local Linux file system, under the path of the same name, /default_tests_files/stan/trino_qry_hive/cache2.
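
For comparison, here is a minimal sketch of a catalog file that relies only on the built-in file system cache. It reuses the property names and values from the configuration above and introduces no new ones. It rests on two assumptions that are not confirmed in this thread: that fs.cache.directories names a local directory on each Trino node, which would match the behaviour observed, and that fs.alluxio.enabled is only needed when table locations use alluxio:// paths.

```properties
# Sketch only, not a verified configuration for Trino 460.
connector.name=hive
hive.metastore.uri=thrift://xxx:9083

# Table locations are hdfs:// paths, so only HDFS access is enabled here.
# Assumption: fs.alluxio.enabled is left out because no table uses an alluxio:// location.
fs.hadoop.enabled=true
hive.config.resources=/data/hh/trino460/trino-server-460/etc/core-site.xml,/data/hh/trino460/trino-server-460/etc/hdfs-site.xml

# File system cache settings, copied from the configuration above.
fs.cache.enabled=true
# Assumption: this is a directory on the local disk of each Trino node,
# not a path inside an Alluxio cluster.
fs.cache.directories=/default_tests_files/stan/trino_qry_hive/cache2
fs.cache.max-disk-usage-percentages=80
fs.cache.ttl=2d
```

If these assumptions hold, cache files showing up under the local path would be the expected result rather than a misconfiguration.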

After reading io.trino.filesystem.alluxio.AlluxioFileSystemCache, my guess is that the Alluxio cache only works when the Hive data is also located at an Alluxio path. Can someone tell me the right setup for my situation? Thanks.
