Validate Apache Spark on Azure Cobalt 100 Arm64 VMs
After installing Apache Spark on your Arm64 virtual machine, run a simple baseline test to validate that Spark starts correctly and produces the expected output.
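Before running the test, it can help to confirm that you are on an Arm64 machine and that spark-submit is on your PATH. The following is a minimal sketch that uses only the Python standard library. The helper name check_environment and the accepted machine strings are assumptions made for illustration; they are not part of Spark.

```python
import platform
import shutil


def check_environment(machine=None, find=shutil.which):
    """Return a list of problems; an empty list means the pre-checks passed.

    Hypothetical helper for illustration. `machine` and `find` can be
    overridden so the function is easy to exercise on another machine.
    """
    machine = machine or platform.machine()
    problems = []
    # Linux reports Arm64 as "aarch64"; some other systems report "arm64".
    if machine.lower() not in ("aarch64", "arm64"):
        problems.append(f"expected an Arm64 machine, found {machine!r}")
    # spark-submit must be discoverable on PATH for the test run to work.
    if find("spark-submit") is None:
        problems.append("spark-submit not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    print("OK" if not issues else "\n".join(issues))
```

If the script prints anything other than OK, revisit the installation steps before continuing.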
Use a text editor of your choice to create a file named test_spark.py with the following content:
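from pyspark.sql import SparkSession
# Start (or reuse) a Spark session for this test application.
spark = SparkSession.builder.appName("Test").getOrCreate()
# Build a two-row DataFrame with the columns "id" and "name".
df = spark.createDataFrame([(1, "ARM64"), (2, "Azure")], ["id", "name"])
# Print the DataFrame as a table.
df.show()
# Shut down the session and release its resources.
spark.stop()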
Execute the test script with:
spark-submit test_spark.py
You should see output similar to the following (timestamps and code-generation times will differ):
25/07/22 05:16:00 INFO CodeGenerator: Code generated in 10.545923 ms
25/07/22 05:16:00 INFO SparkContext: SparkContext is stopping with exitCode 0.
+---+-----+
| id| name|
+---+-----+
|  1|ARM64|
|  2|Azure|
+---+-----+
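If you would rather not compare the table by eye, the check can be scripted. The sketch below assumes you have captured the output of the run as text, for example by redirecting it to a file. The helper parse_show_table is hypothetical and handles only simple tables whose cells contain no "|" characters.

```python
def parse_show_table(text):
    """Extract rows from DataFrame.show() output embedded in log text.

    Hypothetical helper: returns a list of rows (header first), each a
    list of cell strings. Border lines such as +---+ and log lines are
    skipped because they do not start and end with "|".
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("|") and line.endswith("|"):
            rows.append([cell.strip() for cell in line[1:-1].split("|")])
    return rows


# Sample text copied from the expected output shown above.
sample = """\
25/07/22 05:16:00 INFO CodeGenerator: Code generated in 10.545923 ms
+---+-----+
| id| name|
+---+-----+
|  1|ARM64|
|  2|Azure|
+---+-----+
"""

header, *data = parse_show_table(sample)
assert header == ["id", "name"]
assert data == [["1", "ARM64"], ["2", "Azure"]]
print("table matches expected output")
```

A failed assertion means the table is missing or its contents differ from the two rows defined in test_spark.py.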