In this post we will see how we can integrate Tableau Desktop with Spark SQL. Tableau’s integration with Spark brings tremendous value to the Spark community – we can visually analyse data without writing a single line of Spark SQL code. That’s a big deal because creating a visual interface to our data expands the Spark technology beyond data scientists and data engineers to all business users. The Spark connector takes advantage of Tableau’s flexible connection architecture that gives customers the option to connect live and issue interactive queries, or use Tableau’s fast in-memory database engine.
Software requirements:
We will use the following software for the integration:
1. Tableau Desktop 9.2.0
2. Hive 1.2.1
3. Spark 1.5.2 for Hadoop 2.6.0
We could skip Hive and work directly with Spark SQL; for this example, however, we will use Hive, import the Hive tables into Spark SQL, and connect to them from Tableau.
Hive setup:
1. Download and install Hive 1.2.1.
2. Download the MySQL connector JAR and copy it to the ${HIVE_HOME}/lib directory so that Hive can use a MySQL metastore.
3. Start Hive: ${HIVE_HOME}/bin $ ./hive
4. Create a table and insert some data into it:
create table product(productid INT, productname STRING, price FLOAT, category STRING) ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';
INSERT INTO TABLE product VALUES(1,'Book',25,'Stationery');
INSERT INTO TABLE product VALUES(2,'Pens',10,'Stationery');
INSERT INTO TABLE product VALUES(3,'Sugar',40.05,'House Hold Item');
INSERT INTO TABLE product VALUES(4,'Furniture',1200,'Interiors');
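As an optional sanity check, you can query the table from the same Hive shell before moving on to Spark:

```sql
-- Verify that the four rows above were inserted.
SELECT COUNT(*) FROM product;
-- Inspect the actual data.
SELECT * FROM product;
```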
Hive setup is now complete.
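A note on step 2: pointing Hive at a MySQL metastore is done through the JDBC connection properties in hive-site.xml. A minimal sketch might look like the following, where the database name, user name, and password are placeholders for your own MySQL setup:

```xml
<!-- Illustrative hive-site.xml fragment for a MySQL-backed metastore.
     The database name, user, and password below are placeholders. -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/hive_metastore?createDatabaseIfNotExist=true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hiveuser</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hivepassword</value>
</property>
```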
Spark setup:
1. Download and extract Spark 1.5.2 for Hadoop 2.6.0.
2. Copy hive-site.xml from the ${HIVE_HOME}/conf directory to the ${SPARK_HOME}/conf directory.
3. In the copied hive-site.xml, strip the time-unit suffixes from time values (for example, change 0s to 0 and <xyz>ms to <xyz>); otherwise Spark may throw a NumberFormatException when parsing them.
4. Define the Spark master IP by adding export SPARK_MASTER_IP=<host_ip_addr> to the spark-env.sh file in the ${SPARK_HOME}/conf directory (without this, the Thrift server will not work).
5. Start the Spark master and slave:
- ${SPARK_HOME}/sbin $ ./start-master.sh
- ${SPARK_HOME}/sbin $ ./start-slaves.sh
6. Go to http://localhost:8080/ and check that the worker has started.
Now it is time to start the Thrift server:
7. ${SPARK_HOME}/sbin $ ./start-thriftserver.sh --master spark://<spark_host_ip>:<port> --driver-class-path ../lib/mysql-connector-java-5.1.34.jar --hiveconf hive.server2.thrift.bind.host localhost --hiveconf hive.server2.thrift.port 10001
This starts the Thrift server on port 10001.
8. Go to http://localhost:8080/ and check that the Spark SQL application has started.
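Before switching to Tableau, you can optionally sanity-check the Thrift server from the command line using the beeline client that ships with Spark. The host and port below assume the values chosen in step 7:

```
# Connect to the Spark Thrift server (host/port as configured in step 7).
${SPARK_HOME}/bin/beeline -u jdbc:hive2://localhost:10001

# Then, at the beeline prompt, list the tables imported from the Hive
# metastore and query the sample data:
#   show tables;
#   select * from product;
```

If beeline connects and `show tables` lists the product table, Tableau should be able to connect with the same host and port.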
Now go to Tableau Desktop:
- Select Spark SQL as the connection type.
- Enter localhost as the host and the Thrift server port from step 7 (here it is 10001).
- Select SparkThriftServer as the type and User Name as the authentication method.
- Leave the remaining fields empty and click OK.
You are done! Happy report building with Tableau and Spark.