Tutorial: Checking Minimum Required Version of Java Jar in PySpark Application

Disclaimer: This tutorial demonstrates several ways to check the minimum required version of a Java jar from a PySpark application. The code examples are for illustration only and may not be robust enough for production use.

Method 1: Direct Package Version Check on Python Side

In this method, the Python script uses py4j to call into the Java jar, retrieves the library version, and performs the comparison in Python.

  • Step 1: Create a Java Class (VersionGetter1.java)

Create a Java class named VersionGetter1.java with the following code:

public class VersionGetter1 {
    public static String getVersion() {
        return "1.3.5";
    }
}
  • Step 2: Compile and Generate a .jar File

Compile the Java class and generate a .jar file using the following command:

javac VersionGetter1.java && jar cvf versiongetter1.jar VersionGetter1.class
  • Step 3: Write the Python Script (tester1.py)

Write a Python script named tester1.py to perform version checking on the Python side:

# Perform Version Checking on the Python side
from pyspark.sql import SparkSession
from py4j.java_gateway import JavaGateway

def get_java_library_version():
    gateway = JavaGateway.launch_gateway(classpath="versiongetter1.jar")
    version = gateway.jvm.VersionGetter1.getVersion()
    return version

min_compatible_version = "5.0.0"
spark = SparkSession.builder.appName("Haizly Tutorial").getOrCreate()
java_library_version = get_java_library_version()
print("From Python: Java library version:", java_library_version)

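# Check compatibility by comparing numeric components; a plain string
# comparison would wrongly rank "10.0.0" below "5.0.0"
def parse_version(version):
    return tuple(int(part) for part in version.split("."))

if parse_version(java_library_version) < parse_version(min_compatible_version):
    raise Exception(
        f"Java library version {java_library_version} is not compatible. Minimum version required: {min_compatible_version}"
    )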
  • Step 4: Run the Python Script

Execute the Python script tester1.py:

python tester1.py
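
Comparing version strings directly with < is a lexicographic comparison, so it can rank versions incorrectly once a component reaches two digits. The sketch below shows the pitfall and a component-wise alternative. It assumes purely numeric versions of the form x.y.z, and the helper names version_tuple and is_compatible are hypothetical:

```python
def version_tuple(version):
    """Turn "x.y.z" into a tuple of ints, e.g. "1.3.5" -> (1, 3, 5)."""
    return tuple(int(part) for part in version.split("."))


def is_compatible(actual, minimum):
    """Return True if `actual` is at least `minimum`, comparing numerically."""
    return version_tuple(actual) >= version_tuple(minimum)


# String comparison goes character by character, so "10.0.0" sorts before "5.0.0".
print("10.0.0" < "5.0.0")                 # True
print(is_compatible("10.0.0", "5.0.0"))   # True
print(is_compatible("1.3.5", "5.0.0"))    # False
```

Tuples of integers compare element by element, which matches the major, minor, patch ordering that the check needs.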

Method 2: Direct Package Version Check on Java Side

In this method, the Python script still calls the Java jar through py4j, but it passes the minimum version to Java, which performs the check itself and throws an exception if the library is too old.

  • Step 1: Create a Java Class (VersionGetter2.java)

Create a Java class named VersionGetter2.java with the following code:

// Perform Version Checking on the Java side
public class VersionGetter2 {
    private static String version = "1.3.5";

    public static String getVersion(String minCompatibleVersion) {

        String libraryVersion = getVersion();
        if (compareVersions(libraryVersion, minCompatibleVersion) < 0) {
            throw new RuntimeException("From Java: Java library version " + libraryVersion + " is not compatible. Minimum version required: " + minCompatibleVersion);
        }
        return libraryVersion;
    }

    private static int compareVersions(String version1, String version2) {
        // Assumes version strings have the form "x.y.z"; compares major, minor, and patch in turn
        String[] v1 = version1.split("\\.");
        String[] v2 = version2.split("\\.");
        int result = Integer.compare(Integer.parseInt(v1[0]), Integer.parseInt(v2[0]));
        if (result != 0) return result;
        result = Integer.compare(Integer.parseInt(v1[1]), Integer.parseInt(v2[1]));
        if (result != 0) return result;
        return Integer.compare(Integer.parseInt(v1[2]), Integer.parseInt(v2[2]));
    }

    private static String getVersion() { return version; }
}
  • Step 2: Compile and Generate a .jar File

Compile the Java class and generate a .jar file using the following command:

javac VersionGetter2.java && jar cvf versiongetter2.jar VersionGetter2.class
  • Step 3: Write the Python Script (tester2.py)

Write a Python script named tester2.py to perform version checking on the Java side:

# Perform Version Checking on the Java side
from pyspark.sql import SparkSession
from py4j.java_gateway import JavaGateway

def check_java_library_version(min_compatible_version):
    gateway = JavaGateway.launch_gateway(classpath="versiongetter2.jar")
    version = gateway.jvm.VersionGetter2.getVersion(min_compatible_version)
    return version

min_compatible_version = "5.0.0"
spark = SparkSession.builder.appName("Haizly app").getOrCreate()

java_library_version = check_java_library_version(min_compatible_version)
print("From Python: Java library version:", java_library_version)
  • Step 4: Run the Python Script

Execute the Python script tester2.py:

python tester2.py

Method 3 – Best Solution: Check the Version Directly within the Java Application

In this method, the Java code reads the manifest of its own jar to determine its version and performs the version check internally.
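
Because a jar is a zip archive, the Implementation-Version attribute can also be inspected from Python without starting a JVM, which may be handy for confirming that a jar was built with the expected manifest. The following is a minimal sketch using only the standard library. The function name read_implementation_version is hypothetical, and the parser handles only the main attributes section of the manifest:

```python
import zipfile


def read_implementation_version(jar_path):
    """Return the Implementation-Version main attribute of a jar, or None."""
    with zipfile.ZipFile(jar_path) as jar:
        try:
            raw = jar.read("META-INF/MANIFEST.MF").decode("utf-8")
        except KeyError:
            return None  # the archive contains no manifest

    attributes = {}
    last_key = None
    for line in raw.splitlines():
        if not line:
            break  # a blank line ends the main attributes section
        if line.startswith(" ") and last_key is not None:
            # A leading space marks the continuation of the previous value.
            attributes[last_key] += line[1:]
            continue
        key, _, value = line.partition(":")
        last_key = key.strip()
        attributes[last_key] = value.strip()
    return attributes.get("Implementation-Version")
```

Calling read_implementation_version("versiongetter3.jar") would return the version string if the jar's manifest declares one, and None otherwise.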

  • Step 1: Create a Java Class (VersionGetter3.java)

Create a Java class named VersionGetter3.java with the following code:

import java.net.URLClassLoader;
import java.net.URL;
import java.io.IOException;
import java.util.jar.Manifest;
import java.util.jar.Attributes;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class VersionGetter3 {

  public static void checkJavaVersionCompatibility(String javaFrameworkVersion, String inputMinimumCompatibleVersion) {
    String[] javaVersionParts = javaFrameworkVersion.split("-")[0].split("\\.");
    String[] minimumVersionParts = inputMinimumCompatibleVersion.split("\\.");
    int[] javaVersionIntParts = new int[javaVersionParts.length];
    int[] minimumVersionIntParts = new int[minimumVersionParts.length];

    for (int i = 0; i < javaVersionParts.length; i++) {
      javaVersionIntParts[i] = Integer.parseInt(javaVersionParts[i]);
    }

    for (int i = 0; i < minimumVersionParts.length; i++) {
      minimumVersionIntParts[i] = Integer.parseInt(minimumVersionParts[i]);
    }

    // Compare only the components that both versions have, to avoid indexing past the shorter array
    int sharedLength = Math.min(javaVersionIntParts.length, minimumVersionIntParts.length);
    for (int i = 0; i < sharedLength; i++) {
      if (javaVersionIntParts[i] < minimumVersionIntParts[i]) {
        throw new RuntimeException("Minimum Compatible Version (" + inputMinimumCompatibleVersion + ") is greater than the current Java Framework version (" + javaFrameworkVersion + ")");
      } else if (javaVersionIntParts[i] > minimumVersionIntParts[i]) {
        System.out.println("The Minimum Compatible Version & the Current Java Framework are Compatible");
        return;
      }
    }

    // All shared components are equal: the framework version is too old only if it has fewer components than the minimum
    if (javaVersionIntParts.length < minimumVersionIntParts.length) {
      throw new RuntimeException("Minimum Compatible Version (" + inputMinimumCompatibleVersion + ") is greater than the current Java Framework version (" + javaFrameworkVersion + ")");
    } else {
      System.out.println("The Minimum Compatible Version & the Current Java Framework are Compatible");
    }
  }

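  private static String getJavaFrameworkVersionFromManifest() {
    String version = null;
    try {
      // Locate the jar this class was loaded from, so that the manifest of another
      // jar on the classpath is not read by mistake. This also avoids casting the
      // class loader to URLClassLoader, which fails on Java 9 and later.
      URL jarLocation = VersionGetter3.class.getProtectionDomain().getCodeSource().getLocation();
      URL url = new URL("jar:" + jarLocation + "!/META-INF/MANIFEST.MF");
      Manifest manifest = new Manifest(url.openStream());
      Attributes attributes = manifest.getMainAttributes();
      version = attributes.getValue("Implementation-Version");
    } catch (IOException e) {
      throw new RuntimeException("Could Not Get the Current Java Framework Version.\n", e);
    }
    if (version == null) {
      throw new RuntimeException("The jar manifest has no Implementation-Version attribute.\n");
    }
    return version;
  }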

  public static void validateInputMinimumCompatibleVersionFormat(String inputMinimumCompatibleVersion) throws IllegalArgumentException {
    Pattern pattern = Pattern.compile("^\\d+\\.\\d+\\.\\d+$");
    Matcher matcher = pattern.matcher(inputMinimumCompatibleVersion);
    if (!matcher.matches()) {
      throw new IllegalArgumentException(inputMinimumCompatibleVersion + " is not a valid version number");
    } else {
      System.out.println("The input Minimum compatible version format is valid: " + inputMinimumCompatibleVersion + "\n");
    }
  }

  public static void checkVersionFromManifest(String inputMinimumCompatibleVersion) {
    validateInputMinimumCompatibleVersionFormat(inputMinimumCompatibleVersion);
    String javaFrameworkVersion = getJavaFrameworkVersionFromManifest();
    checkJavaVersionCompatibility(javaFrameworkVersion, inputMinimumCompatibleVersion);
  }

  public static void main(String[] args) {
    checkVersionFromManifest(args[0]);
  }
}
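
The comparison rule used by checkJavaVersionCompatibility (drop any qualifier after the first hyphen, then compare the numeric components from left to right) can be prototyped in a few lines of Python before committing to the Java implementation. This is only a sketch: the name is_at_least is hypothetical, and unlike the Java code it treats missing trailing components as 0, so 1.3 and 1.3.0 compare as equal:

```python
def is_at_least(framework_version, minimum_version):
    """Return True if framework_version is not older than minimum_version.

    A qualifier after the first hyphen (such as "-SNAPSHOT") is ignored.
    Assumption: missing trailing components count as 0.
    """
    actual = [int(p) for p in framework_version.split("-")[0].split(".")]
    minimum = [int(p) for p in minimum_version.split(".")]

    # Pad the shorter list with zeros so both have the same length.
    width = max(len(actual), len(minimum))
    actual += [0] * (width - len(actual))
    minimum += [0] * (width - len(minimum))

    # Lists of ints compare element by element, left to right.
    return actual >= minimum
```

For instance, is_at_least("1.3.5-SNAPSHOT", "1.3.5") evaluates to True under these assumptions, while is_at_least("1.2.9", "1.3.0") evaluates to False.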
  • Step 2: Compile and Generate a .jar File

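Compile the Java class and generate a .jar file whose manifest carries an Implementation-Version attribute, since that is the value VersionGetter3 reads. The file name manifest.txt and the version 1.3.5 (the same version used in Methods 1 and 2) are example values:

echo "Implementation-Version: 1.3.5" > manifest.txt
javac VersionGetter3.java && jar cvfm versiongetter3.jar manifest.txt VersionGetter3.class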

  • Step 3: Write the Python Script (tester3.py)

Write a Python script named tester3.py to run version checking using the Java application:

import argparse
import re

def validate_version(value: str) -> str:
    # argparse "type" callable: accept only versions of the form x.y.z
    if not re.fullmatch(r"\d+\.\d+\.\d+", value):
        raise argparse.ArgumentTypeError(f"{value} is not a valid version number")
    return value

def check_version(version: str) -> None:
    from py4j.java_gateway import JavaGateway
    gateway = JavaGateway.launch_gateway(classpath="versiongetter3.jar")
    # checkVersionFromManifest returns nothing; it throws if the jar is too old
    gateway.jvm.VersionGetter3.checkVersionFromManifest(version)

if __name__ == '__main__':
    parser = argparse.ArgumentParser(
        formatter_class=argparse.RawTextHelpFormatter,
        prog="Haizy",
        description='Testing'
    )

    parser.add_argument('-v',
                        '--version',
                        dest='version',
                        help='provide the version',
                        type=validate_version,
                        default='1.0.1'
                        )
    args = parser.parse_args()
    check_version(args.version)

  • Step 4: Run the Python Script

Execute the Python script tester3.py with the desired minimum version, for example:

python tester3.py --version 1.0.1

Written by

Albert Oplog

Hi, I'm Albert Oplog. I would humbly like to share my tech journey with people all around the world.