fix: Remove incorrect @VisibleForTesting annotation & apply Spotless formatting #6571

Open
wants to merge 5 commits into
base: main
1 change: 1 addition & 0 deletions README.md
@@ -35,6 +35,7 @@ Apache Gravitino is a high-performance, geo-distributed, and federated metadata
![Gravitino Architecture](docs/assets/gravitino-architecture.png)

Gravitino aims to provide several key features:

* Unified Metadata Management: Gravitino provides a unified model and API to manage different types of metadata, including relational (e.g., Hive, MySQL) and file-based (e.g., HDFS, S3) metadata sources.
* End-to-End Data Governance: Gravitino offers a unified governance layer for managing metadata with features like access control, auditing, and discovery.
* Direct Metadata Management: Gravitino connects directly to metadata sources via connectors, ensuring changes are instantly reflected between Gravitino and the underlying systems.
35 changes: 16 additions & 19 deletions docs/getting-started.md
@@ -8,19 +8,19 @@ There are several options for getting started with Apache Gravitino. Installing

If you want to download and install Gravitino:

- on AWS, see [Getting started on Amazon Web Services](#getting-started-on-amazon-web-services)
- Google Cloud Platform, see [Getting started on Google Cloud Platform](#getting-started-on-google-cloud-platform)
- locally, see [Getting started locally](#getting-started-locally)
- on AWS, see [Getting started on Amazon Web Services](#getting-started-on-amazon-web-services)
- Google Cloud Platform, see [Getting started on Google Cloud Platform](#getting-started-on-google-cloud-platform)
- locally, see [Getting started locally](#getting-started-locally)

If you have your own Apache Gravitino setup and want to use Apache Hive:
If you have your own Apache Gravitino setup and want to use Apache Hive:

- on AWS or Google Cloud Platform, see [Installing Apache Hive on AWS or Google Cloud Platform](#installing-apache-hive-on-aws-or-google-cloud-platform)
- locally, see [Installing Apache Hive locally](#installing-apache-hive-locally)
- on AWS or Google Cloud Platform, see [Installing Apache Hive on AWS or Google Cloud Platform](#installing-apache-hive-on-aws-or-google-cloud-platform)
- locally, see [Installing Apache Hive locally](#installing-apache-hive-locally)

If you prefer to get started quickly and use Docker for Gravitino, Apache Hive, Trino, and others:

- on AWS or Google Cloud Platform, see [Installing Gravitino playground on AWS or Google Cloud Platform](#installing-apache-gravitino-playground-on-aws-or-google-cloud-platform)
- locally, see [Installing Gravitino playground locally](#installing-apache-gravitino-playground-locally)
- on AWS or Google Cloud Platform, see [Installing Gravitino playground on AWS or Google Cloud Platform](#installing-apache-gravitino-playground-on-aws-or-google-cloud-platform)
- locally, see [Installing Gravitino playground locally](#installing-apache-gravitino-playground-locally)

If you are using AWS and want to access the instance remotely, be sure to read [Accessing Gravitino on AWS externally](#accessing-apache-gravitino-on-aws-externally)

@@ -56,7 +56,6 @@ If you are using AWS and want to access the instance remotely, be sure to read [
10. **Next steps**
- Concluding thoughts and suggested next steps for users who have completed the setup.


## Getting started on Amazon Web Services

To begin using Gravitino on AWS, follow these steps:
@@ -214,7 +213,7 @@ Gravitino provides a bundle of Docker images to launch a Gravitino playground, w
includes Apache Hive, Apache Hadoop, Trino, MySQL, PostgreSQL, and Gravitino. You can use
Docker Compose to start them all.

Installing Docker and Docker Compose is a requirement for using the playground.
Installing Docker and Docker Compose is a requirement for using the playground.

```shell
sudo apt install docker docker-compose
@@ -317,23 +316,21 @@ After completing these steps, you should be able to access the Gravitino REST in

1. **Explore documentation:**
- Delve deeper into the Gravitino documentation for advanced features and configuration options.
- Check out https://gravitino.apache.org/docs/latest
- Check out <https://gravitino.apache.org/docs/latest>

2. **Community engagement:**
- Join the Gravitino community forums to connect with other users, share experiences, and seek assistance if needed.
- Check out our GitHub repository: https://github.com/apache/gravitino
- Check out our Slack channel in ASF Slack: https://the-asf.slack.com
- Check out our GitHub repository: <https://github.com/apache/gravitino>
- Check out our Slack channel in ASF Slack: <https://the-asf.slack.com>

3. **Read our blogs:**
- Check out: https://gravitino.apache.org/blog
- Check out: <https://gravitino.apache.org/blog>

4. **Continuous updates:**
- Stay informed about Gravitino updates and new releases to benefit from the latest features, optimizations, and security
- Stay informed about Gravitino updates and new releases to benefit from the latest features, optimizations, and security
enhancements.
- Check out our Website: https://gravitino.apache.org
- Check out our Website: <https://gravitino.apache.org>


This document is just the beginning. You're welcome to customize your Gravitino setup based on your requirements and to explore the vast possibilities this powerful tool offers. If you encounter any issues or have questions, you can always connect with the Gravitino community for assistance.

<img src="https://analytics.apache.org/matomo.php?idsite=62&rec=1&bots=1&action_name=GettingStarted" alt="" />

20 changes: 9 additions & 11 deletions docs/gravitino-server-config.md
@@ -64,12 +64,11 @@ The following table lists the storage configuration items:
| `gravitino.entity.store.relational.jdbcPassword` | The password that the `JDBCBackend` needs to use when connecting the database. It is required for `MySQL`. | `gravitino` | Yes if the jdbc connection url is not `jdbc:h2` | 0.5.0 |
| `gravitino.entity.store.relational.storagePath` | The storage path for embedded JDBC storage implementation. It supports both absolute and relative path, if the value is a relative path, the final path is `${GRAVITINO_HOME}/${PATH_YOU_HAVA_SET}`, default value is `${GRAVITINO_HOME}/data/jdbc` | `${GRAVITINO_HOME}/data/jdbc` | No | 0.6.0-incubating |


:::caution
We strongly recommend that you change the default value of `gravitino.entity.store.relational.storagePath`, as it's under the deployment directory and future version upgrades may remove it.
:::
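
The relative-path rule for `gravitino.entity.store.relational.storagePath` described in the table above can be illustrated with a small shell function. This is only a sketch of the documented behavior, not Gravitino's own resolution code; the function name `resolve_storage_path` and the sample directories are hypothetical.

```shell
# Illustrative default; in a real deployment GRAVITINO_HOME is already set.
GRAVITINO_HOME="${GRAVITINO_HOME:-/opt/gravitino}"

# Hypothetical helper mirroring the documented rule:
# absolute paths are used as-is, relative paths are placed under GRAVITINO_HOME.
resolve_storage_path() {
  case "$1" in
    /*) printf '%s\n' "$1" ;;
    *)  printf '%s\n' "${GRAVITINO_HOME}/$1" ;;
  esac
}

# A relative value ends up inside the deployment directory.
resolve_storage_path "data/jdbc"
# An absolute value outside the deployment directory is left unchanged.
resolve_storage_path "/var/lib/gravitino/jdbc"
```

An absolute path outside `${GRAVITINO_HOME}` is the kind of value the caution above recommends, since it is not affected by changes to the deployment directory.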

#### Create JDBC backend schema and table
#### Create JDBC backend schema and table

For H2 database, All tables needed by Gravitino are created automatically when the Gravitino server starts up. For MySQL, you should firstly initialize the database tables yourself by executing the ddl scripts in the `${GRAVITINO_HOME}/scripts/mysql/` directory.

@@ -94,7 +93,7 @@ Gravitino server uses tree lock to ensure the consistency of the data. The tree

| Configuration item | Description | Default value | Since Version |
|-------------------------------|--------------------------------------------------------------------------------------------------------------------------------|---------------|---------------|
| `gravitino.auxService.names ` | The auxiliary service name of the Gravitino Iceberg REST server. Use **`iceberg-rest`** for the Gravitino Iceberg REST server. | (none) | 0.2.0 |
| `gravitino.auxService.names` | The auxiliary service name of the Gravitino Iceberg REST server. Use **`iceberg-rest`** for the Gravitino Iceberg REST server. | (none) | 0.2.0 |

Refer to [Iceberg REST catalog service](iceberg-rest-service.md) for configuration details.

@@ -107,8 +106,8 @@ To leverage the event listener, you must implement the `EventListenerPlugin` int
| Property name | Description | Default value | Required | Since Version |
|----------------------------------------|--------------------------------------------------------------------------------------------------------|---------------|----------|---------------|
| `gravitino.eventListener.names` | The name of the event listener, For multiple listeners, separate names with a comma, like "audit,sync" | (none) | Yes | 0.5.0 |
| `gravitino.eventListener.{name}.class` | The class name of the event listener, replace `{name}` with the actual listener name. | (none) | Yes | 0.5.0 |
| `gravitino.eventListener.{name}.{key}` | Custom properties that will be passed to the event listener plugin. | (none) | Yes | 0.5.0 |
| `gravitino.eventListener.{name}.class` | The class name of the event listener, replace `{name}` with the actual listener name. | (none) | Yes | 0.5.0 |
| `gravitino.eventListener.{name}.{key}` | Custom properties that will be passed to the event listener plugin. | (none) | Yes | 0.5.0 |
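
As a rough sketch, the three properties in the table above might combine as shown below. The listener names `audit` and `sync` come from the table's own example; the class names and the `logPath` key are hypothetical placeholders, not part of Gravitino. The snippet writes the fragment to a scratch file so it can be reviewed before being merged into the server configuration by hand.

```shell
# Scratch file for the sample fragment (not the real server configuration).
SNIPPET_FILE="$(mktemp)"

# The quoted heredoc delimiter keeps the text literal (no variable expansion).
cat > "$SNIPPET_FILE" <<'EOF'
# Comma-separated listener names, as in the table's "audit,sync" example.
gravitino.eventListener.names = audit,sync
# Hypothetical classes; each must implement EventListenerPlugin.
gravitino.eventListener.audit.class = com.example.listener.AuditListener
gravitino.eventListener.sync.class = com.example.listener.SyncListener
# Hypothetical custom property ({key} = logPath) passed to the "audit" listener.
gravitino.eventListener.audit.logPath = /tmp/gravitino-audit-events.log
EOF

# Show only the effective entries (comment lines filtered out).
grep -v '^#' "$SNIPPET_FILE"
```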

#### Event

@@ -149,7 +148,7 @@ The plugin provides several operational modes for how to process event, supporti
- **SYNC**: Events are processed synchronously, immediately following the associated operation. This mode ensures events are processed before the operation's result is returned to the client, but it may delay the main process if event processing takes too long.

- **ASYNC_SHARED**: This mode employs a shared queue and dispatcher for asynchronous event processing. It prevents the main process from being blocked, though there's a risk events might be dropped if not promptly consumed. Sharing a dispatcher can lead to poor isolation in case of slow listeners.

- **ASYNC_ISOLATED**: Events are processed asynchronously, with each listener having its own dedicated queue and dispatcher thread. This approach offers better isolation but at the expense of multiple queues and dispatchers.

When processing pre-event, you could throw a `ForbiddenException` to skip the following executions. For more details, please refer to the definition of the plugin.
@@ -163,8 +162,8 @@ Gravitino provides a default implement to log basic audit information to a file,
| Property name | Description | Default value | Required | Since Version |
|---------------------------------------|----------------------------------------|---------------------------------------------|----------|----------------------------|
| `gravitino.audit.enabled` | The audit log enable flag. | false | NO | 0.7.0-incubating |
| `gravitino.audit.writer.className` | The class name of audit log writer. | org.apache.gravitino.audit.FileAuditWriter | NO | 0.7.0-incubating |
| `gravitino.audit.formatter.className` | The class name of audit log formatter. | org.apache.gravitino.audit.SimpleFormatter | NO | 0.7.0-incubating |
| `gravitino.audit.writer.className` | The class name of audit log writer. | org.apache.gravitino.audit.FileAuditWriter | NO | 0.7.0-incubating |
| `gravitino.audit.formatter.className` | The class name of audit log formatter. | org.apache.gravitino.audit.SimpleFormatter | NO | 0.7.0-incubating |
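
A minimal sketch of enabling the audit log with the values from the table above. Only `gravitino.audit.enabled` differs from its default; the two class names are the documented defaults, written out for clarity. Writing the fragment to a scratch file is an illustration choice, not a required workflow.

```shell
# Scratch file holding the sample audit settings.
AUDIT_SNIPPET="$(mktemp)"

cat > "$AUDIT_SNIPPET" <<'EOF'
gravitino.audit.enabled = true
gravitino.audit.writer.className = org.apache.gravitino.audit.FileAuditWriter
gravitino.audit.formatter.className = org.apache.gravitino.audit.SimpleFormatter
EOF

# Print the fragment for review.
cat "$AUDIT_SNIPPET"
```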

#### Audit log formatter

@@ -221,7 +220,6 @@ Below is a list of catalog properties that will be used by all Gravitino catalog
| `cloud.name` | The property to specify the cloud that the catalog is running on. The valid values are `aws`, `azure`, `gcp`, `on_premise` and `other`. | (none) | No | 0.6.0-incubating |
| `cloud.region-code` | The property to specify the region code of the cloud that the catalog is running on. | (none) | No | 0.6.0-incubating |


The following table lists the catalog specific properties and their default paths:

| catalog provider | catalog properties | catalog properties configuration file path |
@@ -255,5 +253,5 @@ Currently, due to the absence of a comprehensive user permission system, Graviti
Apache Hadoop access. Ensure that the user starting the Gravitino server has Hadoop (HDFS, YARN, etc.) access
permissions; otherwise, you may encounter a `Permission denied` error. There are two ways to resolve this error:

* Grant Gravitino startup user permissions in Hadoop
* Specify the authorized Hadoop username in the environment variables `HADOOP_USER_NAME` before starting the Gravitino server.
- Grant Gravitino startup user permissions in Hadoop
- Specify the authorized Hadoop username in the environment variables `HADOOP_USER_NAME` before starting the Gravitino server.
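
The second option can be sketched as follows. The username `hadoop_admin` is a hypothetical placeholder; substitute an account that actually has the required Hadoop permissions. The variable has to be exported in the same shell session that later starts the Gravitino server, so that the server process inherits it.

```shell
# Hypothetical username; replace it with an account authorized in Hadoop.
export HADOOP_USER_NAME="hadoop_admin"

# Exported variables are inherited by child processes; verify that before starting the server.
sh -c 'echo "Hadoop user seen by child processes: ${HADOOP_USER_NAME}"'
```

Start the Gravitino server from this same shell session afterwards.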
10 changes: 5 additions & 5 deletions docs/hadoop-catalog-index.md
@@ -18,9 +18,9 @@ Gravitino Hadoop catalog index includes the following chapters:

Apart from the above, you can also refer to the following topics to manage and access cloud storage like S3, GCS, ADLS, and OSS:

- [Using Hadoop catalog to manage S3](./hadoop-catalog-with-s3.md).
- [Using Hadoop catalog to manage GCS](./hadoop-catalog-with-gcs.md).
- [Using Hadoop catalog to manage ADLS](./hadoop-catalog-with-adls.md).
- [Using Hadoop catalog to manage OSS](./hadoop-catalog-with-oss.md).
- [Using Hadoop catalog to manage S3](./hadoop-catalog-with-s3.md).
- [Using Hadoop catalog to manage GCS](./hadoop-catalog-with-gcs.md).
- [Using Hadoop catalog to manage ADLS](./hadoop-catalog-with-adls.md).
- [Using Hadoop catalog to manage OSS](./hadoop-catalog-with-oss.md).

More storage options will be added soon. Stay tuned!
More storage options will be added soon. Stay tuned!