How to link the Virtuoso distributed version to Hadoop

I have a cluster of 4 nodes, on which I installed Hadoop + Spark (GraphX)…

Now I have to process a big RDF dataset. My question is: can I install Virtuoso on the cluster to store this RDF dataset and execute distributed SPARQL queries against it?

For context: I need a web endpoint that allows users to submit their SPARQL queries.

In other words: is Virtuoso a good solution that runs on a Hadoop cluster and can use Spark to execute the distributed queries?

Is the distributed version freely available for academic use?

Virtuoso does not support Hadoop storage (HDFS, etc.), as it uses its own storage system. Nor does Virtuoso support Spark, so it cannot be used as you describe.
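
For the web-endpoint requirement, note that Virtuoso exposes its own SPARQL-over-HTTP endpoint, which users can query directly without any Hadoop or Spark layer in between. A minimal sketch in Python, assuming the SPARQLWrapper library and a default install where the endpoint is served at port 8890 under /sparql (adjust the URL for your own host and configuration):

```
# Minimal sketch: querying a Virtuoso SPARQL endpoint over HTTP.
# Assumes the SPARQLWrapper library (pip install SPARQLWrapper) and a
# Virtuoso instance listening on the default port; the URL below is an
# assumption, not part of the original question.
from SPARQLWrapper import SPARQLWrapper, JSON

endpoint = SPARQLWrapper("http://localhost:8890/sparql")  # default Virtuoso HTTP endpoint
endpoint.setQuery("""
    SELECT ?s ?p ?o
    WHERE { ?s ?p ?o }
    LIMIT 10
""")
endpoint.setReturnFormat(JSON)

results = endpoint.query().convert()
for binding in results["results"]["bindings"]:
    print(binding["s"]["value"], binding["p"]["value"], binding["o"]["value"])
```

Since the endpoint follows the standard SPARQL Protocol, any HTTP client in any language can be used the same way; the Python library here is just one convenient option.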

What are these “big RDF datasets” you are seeking to host in terms of number of triples? What sort of distributed queries would you be executing against them that can’t be run using the Virtuoso SPARQL query engine?