https://github.com/Thomas-George-T/Movies-Analytics-in-Spark-and-Scala
Change the execution engine to Tez or Spark (add the Tez/Spark client jars to HADOOP_CLASSPATH)
Partitioning - the PARTITIONED BY clause divides the table into partitions.
Bucketing - the CLUSTERED BY clause divides the table into buckets.
Map-side join, bucket map-side join, sorted bucket map-side join
Use a suitable file format, e.g. ORC (Optimized Row Columnar)
Indexing
Vectorization along with ORC
CBO (cost-based optimization)
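
The notes above can be sketched in HiveQL roughly as follows. This is a minimal illustration and not a tuned configuration: the table name `ratings`, its columns, the partition column `rating_year`, and the bucket count of 16 are all hypothetical and chosen only for the example. The `SET` properties are standard Hive settings, but their defaults and availability vary between Hive versions, so check them against your installation.

```sql
-- Hypothetical table and column names throughout; adjust to your schema.

-- Execution engine: tez or spark (client jars must be on HADOOP_CLASSPATH).
SET hive.execution.engine=tez;

-- Vectorized execution, typically used with ORC-backed tables.
SET hive.vectorized.execution.enabled=true;

-- Cost-based optimization.
SET hive.cbo.enable=true;

-- Map-side join, bucket map join, and sorted bucket map join.
SET hive.auto.convert.join=true;
SET hive.optimize.bucketmapjoin=true;
SET hive.optimize.bucketmapjoin.sortedmerge=true;

-- PARTITIONED BY creates one partition per rating_year value;
-- CLUSTERED BY hashes rows on movie_id into a fixed number of buckets;
-- SORTED BY keeps each bucket ordered, which the sorted bucket join relies on.
CREATE TABLE ratings (
  user_id  INT,
  movie_id INT,
  rating   DOUBLE
)
PARTITIONED BY (rating_year INT)
CLUSTERED BY (movie_id) SORTED BY (movie_id ASC) INTO 16 BUCKETS
STORED AS ORC;

-- CBO depends on table and column statistics being available.
ANALYZE TABLE ratings PARTITION (rating_year) COMPUTE STATISTICS;
ANALYZE TABLE ratings PARTITION (rating_year) COMPUTE STATISTICS FOR COLUMNS;
```

For the bucket-based joins to apply, both tables in the join generally need to be bucketed on the join key with compatible bucket counts.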
#!/bin/sh
#
# init.d script with LSB support.
#
# Copyright (c) 2007 Javier Fernandez-Sanguino <[email protected]>
#
# This is free software; you may redistribute it and/or modify
# it under the terms of the GNU General Public License as
# published by the Free Software Foundation; either version 2,
# or (at your option) any later version.