Hive working with GCP storage emulator #215

Open
@TirayrK

Hi Team,

Could you please describe the core-site.xml that Hive needs in order to reach the emulator?

I am using the config below, which throws an access denied error when I run the create_table command.

```xml
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>fs.AbstractFileSystem.gs.impl</name>
        <value>com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS</value>
    </property>
    <property>
        <name>fs.gs.impl</name>
        <value>com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem</value>
    </property>
    <property>
        <name>fs.gs.project.id</name>
        <value>test-project</value>
    </property>
    <property>
        <name>spark.hadoop.google.cloud.auth.service.account.enable</name>
        <value>true</value>
    </property>
    <property>
        <name>spark.hadoop.google.cloud.auth.service.account.json.keyfile</name>
        <value>/run/secrets/gcloudservicekey</value>
    </property>
</configuration>
```
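
One thing that may be worth checking in the config above: the `spark.hadoop.` prefix is a Spark convention. Spark strips it when copying its own properties into the Hadoop configuration, so a process that reads core-site.xml directly (such as the Hive metastore) would look the keys up under their unprefixed names. The sketch below only lists the prefixed keys and what they would become without the prefix. Whether the unprefixed names are the ones the installed GCS connector version expects, and whether this is related to the access denied error, are assumptions that have not been verified. The helper name `find_spark_prefixed_keys` and the `SAMPLE` string are hypothetical.

```python
import xml.etree.ElementTree as ET

# Prefix that Spark strips when it forwards properties to Hadoop.
SPARK_PREFIX = "spark.hadoop."


def find_spark_prefixed_keys(core_site_xml):
    """Map each spark.hadoop.* property name to its unprefixed form."""
    root = ET.fromstring(core_site_xml)
    renames = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name.startswith(SPARK_PREFIX):
            renames[name] = name[len(SPARK_PREFIX):]
    return renames


# Hypothetical sample mirroring the two auth properties from the config above.
SAMPLE = """<configuration>
    <property>
        <name>fs.gs.project.id</name>
        <value>test-project</value>
    </property>
    <property>
        <name>spark.hadoop.google.cloud.auth.service.account.enable</name>
        <value>true</value>
    </property>
    <property>
        <name>spark.hadoop.google.cloud.auth.service.account.json.keyfile</name>
        <value>/run/secrets/gcloudservicekey</value>
    </property>
</configuration>"""

if __name__ == "__main__":
    for old, new in find_spark_prefixed_keys(SAMPLE).items():
        print(f"{old} -> {new}")
```

This only reports candidate renames; it does not modify the file and does not address how the connector is pointed at the emulator endpoint.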

ERROR:

`hmsclient.hmsclient.genthrift.hive_metastore.ttypes.MetaException: MetaException(message='Got exception: java.io.IOException Error accessing gs://bec1c26d1ce7c49b19773cfb768ef690c/test_user/foo')
    raise ThriftHiveError(f"error creating table {table_object}") from ex
apollo.core.utils.hive.ThriftHiveError: error creating table Table(tableName='test_user__foo', dbName='testdb_98edf558bfa54e759b83c9926fb5719b', owner=None, createTime=None, lastAccessTime=None, retention=None, sd=StorageDescriptor(cols=[FieldSchema(name='count', type='int', comment='')], location='gs://bec1c26d1ce7c49b19773cfb768ef690c/test_user/foo', inputFormat='org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat', outputFormat='org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat', compressed=None, numBuckets=None, serdeInfo=SerDeInfo(name=None, serializationLib='org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe', parameters=None), bucketCols=None, sortCols=None, parameters=None, skewedInfo=None, storedAsSubDirectories=None), partitionKeys=[], parameters={'EXTERNAL': 'TRUE'}, viewOriginalText=None, viewExpandedText=None, tableType='EXTERNAL_TABLE', privileges=None, temporary=False, rewriteEnabled=None)
`

ENV:

> Docker environment
> Local Hive setup
> Python 3.6
> gcp-storage-emulator==2021.12.14
> google-auth==2.14.1
> POST requests to the server work, but GET requests fail with an access denied error
