Change artifact name to incubator-xtable and Add deploy plugin to mvn
vinishjail97 committed Aug 2, 2024
1 parent 072a1f8 commit 5baf117
Showing 18 changed files with 171 additions and 45 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -110,7 +110,7 @@ catalogOptions: # all other options are passed through in a map
key1: value1
key2: value2
```
5. run with `java -jar xtable-utilities/target/xtable-utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig my_config.yaml [--hadoopConfig hdfs-site.xml] [--convertersConfig converters.yaml] [--icebergCatalogConfig catalog.yaml]`
5. run with `java -jar incubator-xtable-utilities/target/xtable-utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig my_config.yaml [--hadoopConfig hdfs-site.xml] [--convertersConfig converters.yaml] [--icebergCatalogConfig catalog.yaml]`
The bundled jar includes hadoop dependencies for AWS, Azure, and GCP. Sample hadoop configurations for configuring the converters
can be found in the [xtable-hadoop-defaults.xml](https://github.com/apache/incubator-xtable/blob/main/utilities/src/main/resources/xtable-hadoop-defaults.xml) file.
The custom hadoop configurations can be passed in with the `--hadoopConfig [custom-hadoop-config-file]` option.
6 changes: 3 additions & 3 deletions demo/notebook/demo.ipynb
@@ -27,9 +27,9 @@
"import $ivy.`org.apache.hudi:hudi-spark3.2-bundle_2.12:0.14.0`\n",
"import $ivy.`org.apache.hudi:hudi-java-client:0.14.0`\n",
"import $ivy.`io.delta:delta-core_2.12:2.0.2`\n",
"import $cp.`/home/jars/xtable-core-0.1.0-SNAPSHOT.jar`\n",
"import $cp.`/home/jars/xtable-api-0.1.0-SNAPSHOT.jar`\n",
"import $cp.`/home/jars/xtable-hudi-support-utils-0.1.0-SNAPSHOT.jar`\n",
"import $cp.`/home/jars/incubator-xtable-core-0.1.0-SNAPSHOT.jar`\n",
"import $cp.`/home/jars/incubator-xtable-api-0.1.0-SNAPSHOT.jar`\n",
"import $cp.`/home/jars/incubator-xtable-hudi-support-utils-0.1.0-SNAPSHOT.jar`\n",
"import $ivy.`org.apache.iceberg:iceberg-hive-runtime:1.3.1`\n",
"import $ivy.`io.trino:trino-jdbc:431`\n",
"import java.util._\n",
6 changes: 3 additions & 3 deletions demo/start_demo.sh
@@ -23,9 +23,9 @@ cd $XTABLE_HOME

mvn install -am -pl xtable-core -DskipTests -T 2
mkdir -p demo/jars
cp xtable-hudi-support/xtable-hudi-support-utils/target/xtable-hudi-support-utils-0.1.0-SNAPSHOT.jar demo/jars
cp xtable-api/target/xtable-api-0.1.0-SNAPSHOT.jar demo/jars
cp xtable-core/target/xtable-core-0.1.0-SNAPSHOT.jar demo/jars
cp xtable-hudi-support/xtable-hudi-support-utils/target/incubator-xtable-hudi-support-utils-0.1.0-SNAPSHOT.jar demo/jars
cp xtable-api/target/incubator-xtable-api-0.1.0-SNAPSHOT.jar demo/jars
cp xtable-core/target/incubator-xtable-core-0.1.0-SNAPSHOT.jar demo/jars

cd demo
docker-compose up
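
Review note on the hunk above: the module directories keep their old names, and only the jar file names gain the `incubator-` prefix, because Maven names a jar `<artifactId>-<version>.jar` by default. A minimal sketch of deriving the copy paths instead of hardcoding them follows. `PREFIX`, `VERSION`, `jar_name`, and `jar_path` are hypothetical names for illustration and are not part of `start_demo.sh`.

```shell
#!/bin/sh
# Sketch only. PREFIX and VERSION mirror the values visible in this diff;
# jar_name and jar_path are hypothetical helpers, not part of start_demo.sh.
PREFIX="incubator-"
VERSION="0.1.0-SNAPSHOT"

# Print the default Maven jar file name (<artifactId>-<version>.jar)
# for a module whose artifactId gained the new prefix.
jar_name() {
  printf '%s%s-%s.jar\n' "$PREFIX" "$1" "$VERSION"
}

# Print the path start_demo.sh copies from.
# $1 = module directory (unchanged by this commit), $2 = module name.
jar_path() {
  printf '%s/target/%s\n' "$1" "$(jar_name "$2")"
}

jar_path xtable-core xtable-core
jar_path xtable-hudi-support/xtable-hudi-support-utils xtable-hudi-support-utils
```

Deriving the names this way would keep the script working if the prefix or version changes again, at the cost of one more level of indirection.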
134 changes: 130 additions & 4 deletions pom.xml
@@ -21,7 +21,7 @@
<modelVersion>4.0.0</modelVersion>

<groupId>org.apache.xtable</groupId>
<artifactId>xtable</artifactId>
<artifactId>incubator-xtable</artifactId>
<name>xtable</name>

<parent>
@@ -46,8 +46,14 @@
<log4j.version>2.22.0</log4j.version>
<junit.version>5.9.0</junit.version>
<lombok.version>1.18.30</lombok.version>
<lombok-maven-plugin.version>1.18.20.0</lombok-maven-plugin.version>
<hadoop.version>3.3.6</hadoop.version>
<hudi.version>0.14.0</hudi.version>
<maven-source-plugin.version>3.3.1</maven-source-plugin.version>
<maven-javadoc-plugin.version>3.8.0</maven-javadoc-plugin.version>
<maven-gpg-plugin.version>3.2.4</maven-gpg-plugin.version>
<maven-deploy-plugin.version>3.1.1</maven-deploy-plugin.version>
<maven-release-plugin.version>2.5.3</maven-release-plugin.version>
<parquet.version>1.12.2</parquet.version>
<scala.version>2.12.15</scala.version>
<scala.version.prefix>2.12</scala.version.prefix>
@@ -62,6 +68,8 @@
<delta.standalone.version>0.5.0</delta.standalone.version>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<target.dir.pattern>**/target/**</target.dir.pattern>
<delombok.output.dir>${project.build.directory}/delombok</delombok.output.dir>

<!-- Test properties -->
<skipTests>false</skipTests>
<skipUTs>${skipTests}</skipUTs>
@@ -80,17 +88,17 @@
<dependencies>
<dependency>
<groupId>org.apache.xtable</groupId>
<artifactId>xtable-api</artifactId>
<artifactId>incubator-xtable-api</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>org.apache.xtable</groupId>
<artifactId>xtable-core</artifactId>
<artifactId>incubator-xtable-core</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>org.apache.xtable</groupId>
<artifactId>xtable-hudi-support-utils</artifactId>
<artifactId>incubator-xtable-hudi-support-utils</artifactId>
<version>${project.version}</version>
</dependency>

@@ -539,6 +547,25 @@
</plugins>
</pluginManagement>
<plugins>
<plugin>
<groupId>org.projectlombok</groupId>
<artifactId>lombok-maven-plugin</artifactId>
<version>${lombok-maven-plugin.version}</version>
<configuration>
<sourceDirectory>${project.basedir}/src/main/java</sourceDirectory>
<addOutputDirectory>false</addOutputDirectory>
<outputDirectory>${delombok.output.dir}</outputDirectory>
<encoding>UTF-8</encoding>
</configuration>
<executions>
<execution>
<phase>generate-sources</phase>
<goals>
<goal>delombok</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-enforcer-plugin</artifactId>
@@ -597,6 +624,31 @@
<argLine>-Xmx1024m</argLine>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-release-plugin</artifactId>
<version>${maven-release-plugin.version}</version>
<configuration>
<autoVersionSubmodules>true</autoVersionSubmodules>
<useReleaseProfile>false</useReleaseProfile>
<releaseProfiles>release</releaseProfiles>
<goals>deploy</goals>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-deploy-plugin</artifactId>
<version>${maven-deploy-plugin.version}</version>
<executions>
<execution>
<id>default-deploy</id>
<phase>deploy</phase>
<goals>
<goal>deploy</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.rat</groupId>
<artifactId>apache-rat-plugin</artifactId>
@@ -770,4 +822,78 @@
</plugins>
</build>

<repositories>
<repository>
<id>Maven Central</id>
<name>Maven Repository</name>
<url>https://repo.maven.apache.org/maven2</url>
<releases>
<enabled>true</enabled>
</releases>
<snapshots>
<enabled>false</enabled>
</snapshots>
</repository>
</repositories>

<profiles>
<profile>
<id>release</id>
<activation>
<property>
<name>deployArtifacts</name>
<value>true</value>
</property>
</activation>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-source-plugin</artifactId>
<version>${maven-source-plugin.version}</version>
<executions>
<execution>
<id>attach-sources</id>
<goals>
<goal>jar-no-fork</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-javadoc-plugin</artifactId>
<version>${maven-javadoc-plugin.version}</version>
<executions>
<execution>
<id>attach-javadocs</id>
<goals>
<goal>jar</goal>
</goals>
</execution>
</executions>
<configuration>
<doclint>none</doclint>
<source>1.8</source>
<sourcepath>${delombok.output.dir}</sourcepath>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-gpg-plugin</artifactId>
<version>${maven-gpg-plugin.version}</version>
<executions>
<execution>
<id>sign-artifacts</id>
<phase>verify</phase>
<goals>
<goal>sign</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
</profile>
</profiles>
</project>
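
Review note on the `release` profile above: it is activated by the `deployArtifacts=true` property, so a deploy that also attaches the sources and javadoc jars and signs the artifacts can presumably be triggered as sketched below. The `clean deploy` goals and the `release_command` helper are assumptions for illustration and do not come from the commit. A configured GPG key and a target repository defined elsewhere (for example in the parent POM, which this diff does not show) would also be needed.

```shell
#!/bin/sh
# Sketch only: release_command is a hypothetical helper that prints the
# Maven invocation instead of running it, so it is safe to try anywhere.
release_command() {
  # -DdeployArtifacts=true matches the <activation> property of the
  # "release" profile in the diff above.
  printf 'mvn clean deploy -DdeployArtifacts=true\n'
}

release_command
# To actually run it (requires Maven, a GPG key, and repository credentials):
#   eval "$(release_command)"
```

Because the delombok execution is declared outside the profile, it would run on every build, while the javadoc jar that reads `${delombok.output.dir}` is only produced when the profile is active.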
4 changes: 2 additions & 2 deletions website/docs/biglake-metastore.md
@@ -25,7 +25,7 @@ This document walks through the steps to register an Apache XTable™ (Incubatin
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service_account_key.json
```
5. Clone the Apache XTable™ (Incubating) [repository](https://github.com/apache/incubator-xtable) and create the
`xtable-utilities-0.1.0-SNAPSHOT-bundled.jar` by following the steps on the [Installation page](/docs/setup)
`incubator-xtable-utilities-0.1.0-SNAPSHOT-bundled.jar` by following the steps on the [Installation page](/docs/setup)
6. Download the [BigLake Iceberg JAR](gs://spark-lib/biglake/biglake-catalog-iceberg1.2.0-0.1.0-with-dependencies.jar) locally.
Apache XTable™ (Incubating) requires the JAR to be present in the classpath.

@@ -117,7 +117,7 @@ catalogOptions:
From your terminal under the cloned Apache XTable™ (Incubating) directory, run the sync process using the below command.

```shell md title="shell"
java -cp xtable-utilities/target/xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:/path/to/downloaded/biglake-catalog-iceberg1.2.0-0.1.0-with-dependencies.jar org.apache.xtable.utilities.RunSync --datasetConfig my_config.yaml --icebergCatalogConfig catalog.yaml
java -cp xtable-utilities/target/incubator-xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:/path/to/downloaded/biglake-catalog-iceberg1.2.0-0.1.0-with-dependencies.jar org.apache.xtable.utilities.RunSync --datasetConfig my_config.yaml --icebergCatalogConfig catalog.yaml
```

:::tip Note:
4 changes: 2 additions & 2 deletions website/docs/bigquery.md
@@ -35,9 +35,9 @@ If you are not planning on using Iceberg, then you do not need to add these to y
:::

#### Steps to add additional configurations to the Hudi writers:
1. Add the extensions jar (`hudi-extensions-0.1.0-SNAPSHOT-bundled.jar`) to your class path
1. Add the extensions jar (`incubator-xtable-hudi-extensions-0.1.0-SNAPSHOT-bundled.jar`) to your class path
For example, if you're using the Hudi [quick-start guide](https://hudi.apache.org/docs/quick-start-guide#spark-shellsql)
for spark you can just add `--jars hudi-extensions-0.1.0-SNAPSHOT-bundled.jar` to the end of the command.
for spark you can just add `--jars incubator-xtable-hudi-extensions-0.1.0-SNAPSHOT-bundled.jar` to the end of the command.
2. Set the following configurations in your writer options:
```shell md title="shell"
hoodie.avro.write.support.class: org.apache.xtable.hudi.extensions.HoodieAvroWriteSupportWithFieldIds
2 changes: 1 addition & 1 deletion website/docs/fabric.md
@@ -98,7 +98,7 @@ An example hadoop configuration for authenticating to ADLS storage account is as
```

```shell md title="shell"
java -jar xtable-utilities/target/xtable-utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig my_config.yaml --hadoopConfig hadoop.xml
java -jar xtable-utilities/target/incubator-xtable-utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig my_config.yaml --hadoopConfig hadoop.xml
```

Running the above command will translate the table `people` in Iceberg or Hudi format to Delta Lake format. To validate
4 changes: 2 additions & 2 deletions website/docs/glue-catalog.md
@@ -19,7 +19,7 @@ This document walks through the steps to register an Apache XTable™ (Incubatin
also set up access credentials by following the steps
[here](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-quickstart.html)
3. Clone the Apache XTable™ (Incubating) [repository](https://github.com/apache/incubator-xtable) and create the
`xtable-utilities-0.1.0-SNAPSHOT-bundled.jar` by following the steps on the [Installation page](/docs/setup)
`incubator-xtable-utilities-0.1.0-SNAPSHOT-bundled.jar` by following the steps on the [Installation page](/docs/setup)

## Steps
### Running sync
@@ -84,7 +84,7 @@ Replace with appropriate values for `sourceFormat`, `tableBasePath` and `tableNa
From your terminal under the cloned xtable directory, run the sync process using the below command.

```shell md title="shell"
java -jar xtable-utilities/target/xtable-utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig my_config.yaml
java -jar xtable-utilities/target/incubator-xtable-utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig my_config.yaml
```

:::tip Note:
4 changes: 2 additions & 2 deletions website/docs/hms.md
@@ -17,7 +17,7 @@ This document walks through the steps to register an Apache XTable™ (Incubatin
or a distributed system like Amazon EMR, Google Cloud's Dataproc, Azure HDInsight etc.
This is a required step to register the table in HMS using a Spark client.
3. Clone the XTable™ (Incubating) [repository](https://github.com/apache/incubator-xtable) and create the
`xtable-utilities-0.1.0-SNAPSHOT-bundled.jar` by following the steps on the [Installation page](/docs/setup)
`incubator-xtable-utilities-0.1.0-SNAPSHOT-bundled.jar` by following the steps on the [Installation page](/docs/setup)
4. This guide also assumes that you have configured the Hive Metastore locally or on EMR/Dataproc/HDInsight
and is already running.

@@ -88,7 +88,7 @@ datasets:

From your terminal under the cloned Apache XTable™ (Incubating) directory, run the sync process using the below command.
```shell md title="shell"
java -jar xtable-utilities/target/xtable-utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig my_config.yaml
java -jar xtable-utilities/target/incubator-xtable-utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig my_config.yaml
```

:::tip Note:
4 changes: 2 additions & 2 deletions website/docs/how-to.md
@@ -24,7 +24,7 @@ history to enable proper point in time queries.
1. A compute instance where you can run Apache Spark. This can be your local machine, docker,
or a distributed service like Amazon EMR, Google Cloud's Dataproc, Azure HDInsight etc
2. Clone the Apache XTable™ (Incubating) [repository](https://github.com/apache/incubator-xtable) and create the
`xtable-utilities-0.1.0-SNAPSHOT-bundled.jar` by following the steps on the [Installation page](/docs/setup)
`incubator-xtable-utilities-0.1.0-SNAPSHOT-bundled.jar` by following the steps on the [Installation page](/docs/setup)
3. Optional: Setup access to write to and/or read from distributed storage services like:
* Amazon S3 by following the steps
[here](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) to install AWSCLIv2
@@ -351,7 +351,7 @@ Authentication for GCP requires service account credentials to be exported. i.e.
In your terminal under the cloned Apache XTable™ (Incubating) directory, run the below command.

```shell md title="shell"
java -jar xtable-utilities/target/xtable-utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig my_config.yaml
java -jar xtable-utilities/target/incubator-xtable-utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig my_config.yaml
```

**Optional:**
4 changes: 2 additions & 2 deletions website/docs/unity-catalog.md
@@ -17,7 +17,7 @@ This document walks through the steps to register an Apache XTable™ (Incubatin
3. Create a Unity Catalog metastore in Databricks as outlined [here](https://docs.gcp.databricks.com/data-governance/unity-catalog/create-metastore.html#create-a-unity-catalog-metastore).
4. Create an external location in Databricks as outlined [here](https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-create-location.html).
5. Clone the Apache XTable™ (Incubating) [repository](https://github.com/apache/incubator-xtable) and create the
`xtable-utilities-0.1.0-SNAPSHOT-bundled.jar` by following the steps on the [Installation page](/docs/setup)
`incubator-xtable-utilities-0.1.0-SNAPSHOT-bundled.jar` by following the steps on the [Installation page](/docs/setup)

## Pre-requisites (for open-source Unity Catalog)
1. Source table(s) (Hudi/Iceberg) already written to external storage locations like S3/GCS/ADLS or local.
@@ -48,7 +48,7 @@ datasets:
From your terminal under the cloned Apache XTable™ (Incubating) directory, run the sync process using the below command.

```shell md title="shell"
java -jar xtable-utilities/target/xtable-utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig my_config.yaml
java -jar xtable-utilities/target/incubator-xtable-utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig my_config.yaml
```

:::tip Note:
4 changes: 2 additions & 2 deletions xtable-api/pom.xml
@@ -19,12 +19,12 @@
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<artifactId>xtable-api</artifactId>
<artifactId>incubator-xtable-api</artifactId>
<name>xtable-api</name>

<parent>
<groupId>org.apache.xtable</groupId>
<artifactId>xtable</artifactId>
<artifactId>incubator-xtable</artifactId>
<version>0.1.0-SNAPSHOT</version>
</parent>

8 changes: 4 additions & 4 deletions xtable-core/pom.xml
@@ -19,23 +19,23 @@
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<artifactId>xtable-core</artifactId>
<artifactId>incubator-xtable-core</artifactId>
<name>xtable-core</name>

<parent>
<groupId>org.apache.xtable</groupId>
<artifactId>xtable</artifactId>
<artifactId>incubator-xtable</artifactId>
<version>0.1.0-SNAPSHOT</version>
</parent>

<dependencies>
<dependency>
<groupId>org.apache.xtable</groupId>
<artifactId>xtable-api</artifactId>
<artifactId>incubator-xtable-api</artifactId>
</dependency>
<dependency>
<groupId>org.apache.xtable</groupId>
<artifactId>xtable-hudi-support-utils</artifactId>
<artifactId>incubator-xtable-hudi-support-utils</artifactId>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>