[Question] Is there any documentation on how to connect to Amazon S3? #601
-
I have some Avro files in an S3 bucket that I'd like to load into a DataFrame. All the documentation I've been able to find online is for Scala, and since I'm not a Scala developer, I'm struggling to translate it to .NET for Apache Spark. Are there any docs or examples for how to do this in .NET?
Replies: 5 comments
-
Is there a specific doc you are referring to? I checked https://sparkbyexamples.com/spark/spark-read-write-avro-files-from-amazon-s3/ and translating it should be straightforward. If you share the Scala code, I am happy to translate it.
-
@rsaltrelli Were you able to get your original issue resolved? If so, please feel free to close the issue.
-
Unfortunately, no. I have abandoned the endeavor for now. There seem to be too many caveats and workarounds with .NET for Spark right now. Perhaps someone who is already familiar with Spark can manage, but that's not me. I'll check back later when the project and docs have matured a bit.
-
I hope this helps:

    using System;
    using Microsoft.Spark.Sql;

    namespace emrApp
    {
        class Program
        {
            static void Main(string[] args)
            {
                // Create (or reuse) the Spark session for this application.
                SparkSession spark = SparkSession
                    .Builder()
                    .AppName("myapp")
                    .GetOrCreate();

                // Read the Avro files from the bucket through the s3a:// connector.
                DataFrame dataFrame = spark
                    .Read()
                    .Format("avro")
                    .Load("s3a://<your_bucket_address>");

                // Print the first rows to verify the load.
                dataFrame.Show();
            }
        }
    }

You can try it out, and if you encounter AWS credential errors,
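For the AWS credential errors mentioned above, one possible approach is to pass the credentials to the S3A connector through the Spark configuration. The following is only a sketch under stated assumptions, not something confirmed in this thread: it assumes the keys are available in the standard `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables, that the `hadoop-aws` and `spark-avro` packages matching your Spark and Hadoop versions are on the classpath (for example via `spark-submit --packages`), and that your cluster does not already supply credentials through an instance role.

```csharp
using System;
using Microsoft.Spark.Sql;

namespace emrApp
{
    class Program
    {
        static void Main(string[] args)
        {
            // Assumption: credentials are exported as the standard AWS environment variables.
            string accessKey = Environment.GetEnvironmentVariable("AWS_ACCESS_KEY_ID");
            string secretKey = Environment.GetEnvironmentVariable("AWS_SECRET_ACCESS_KEY");

            if (string.IsNullOrEmpty(accessKey) || string.IsNullOrEmpty(secretKey))
            {
                Console.Error.WriteLine("AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY are not set.");
                return;
            }

            SparkSession spark = SparkSession
                .Builder()
                .AppName("myapp")
                // Settings prefixed with "spark.hadoop." are forwarded to the Hadoop
                // configuration, where the S3A connector reads fs.s3a.* keys.
                .Config("spark.hadoop.fs.s3a.access.key", accessKey)
                .Config("spark.hadoop.fs.s3a.secret.key", secretKey)
                .GetOrCreate();

            // Same read as in the example above; the bucket path is a placeholder.
            DataFrame dataFrame = spark
                .Read()
                .Format("avro")
                .Load("s3a://<your_bucket_address>");

            dataFrame.Show();
        }
    }
}
```

Reading the keys from the environment keeps them out of source control. If the job runs on a cluster whose nodes already have an IAM role with access to the bucket, the two `Config` calls may be unnecessary.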
-
@rsaltrelli Are you open to trying out the suggestion from @KimKiHyuk?