APACHE HADOOP INSTALLATION ON UBUNTU
On startup, Hive logs something like the following:

Logging initialized using configuration in jar:file:/usr/local/apache-hive-2.1.0-bin/lib/hive-common-2.1.0.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez).

We can exit from that Hive shell by using the exit command.
We may get a couple of errors when we try to start Hive via the bin/hive command. The following are the errors and corresponding fixes:

Exception in thread "main" : Couldn't create directory $

Now that we fixed the errors, let's start Hive:

hive
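A common class of startup failure in Hive 2.x is an unresolved ${system:java.io.tmpdir} placeholder in hive-site.xml; whether that is the exact directory error elided above is an assumption, but the usual fix is to point the scratch-directory properties at a concrete path:

```xml
<!-- Replace ${system:java.io.tmpdir} placeholders with a fixed path;
     /tmp/hive is a common choice here, adjust to your environment. -->
<property>
  <name>hive.exec.scratchdir</name>
  <value>/tmp/hive</value>
</property>
<property>
  <name>hive.exec.local.scratchdir</name>
  <value>/tmp/hive</value>
</property>
```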
Starting metastore schema initialization to 2.1.0
Initialization script hive-schema-2.1.0.derby.sql

Verifying Hive Installation by running Hive CLI

To use the Hive command line interface (CLI) from the shell, issue the bin/hive command (run echo $HIVE_HOME/bin/hive first to verify the launcher path).
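Before launching the CLI, it helps to confirm the environment points at the install; a minimal sketch, assuming Hive was unpacked to /usr/local/apache-hive-2.1.0-bin as the startup log above suggests:

```shell
# Point HIVE_HOME at the unpacked distribution (path taken from the
# hive-common jar location in the startup log) and extend PATH.
export HIVE_HOME=/usr/local/apache-hive-2.1.0-bin
export PATH=$PATH:$HIVE_HOME/bin
# Verify the launcher path resolves as expected:
echo $HIVE_HOME/bin/hive
# prints /usr/local/apache-hive-2.1.0-bin/bin/hive
```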
Configuring Metastore means specifying to Hive where the database is stored. The directory warehouse is the location to store the table or data related to hive, and the temporary directory tmp is the temporary location to store the intermediate result of processing.

We want to do this by editing the hive-site.xml file, which is in the $HIVE_HOME/conf directory. Let's copy the template file using the following:

cd $HIVE_HOME/conf
sudo cp hive-default.xml.template hive-site.xml

Make sure the following lines are between the <configuration> and </configuration> tags of hive-site.xml. The JDBC connect string for a JDBC metastore is:

jdbc:derby:;databaseName=metastore_db;create=true

To use SSL to encrypt/authenticate the connection, provide a database-specific SSL flag in the connection URL. For example, jdbc:postgresql://myhost/db?ssl=true for a postgres database.

Create a file named jpox.properties and add the following lines into it:

javax.jdo.option.ConnectionURL = jdbc:derby://hadoop1:1527/metastore_db;create = true

We need to set permission to Hive:

sudo chown -R hduser:hadoop apache-hive-2.1.0-bin

Starting from Hive 2.1, we need to run the schematool command below as an initialization step. In our case, we use derby as the db:

schematool -dbType derby -initSchema

SLF4J: Class path contains multiple SLF4J bindings.
Metastore connection URL: jdbc:derby:;databaseName=metastore_db;create=true
Metastore Connection Driver : org.apache.derby.jdbc.EmbeddedDriver
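The metastore properties described above would look like the following inside hive-site.xml; this is a minimal sketch, with the driver property name assumed from the standard Hive configuration:

```xml
<configuration>
  <!-- JDBC connect string for the embedded Derby metastore -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:;databaseName=metastore_db;create=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <!-- Driver class for embedded Derby (assumed standard value) -->
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.apache.derby.jdbc.EmbeddedDriver</value>
  </property>
</configuration>
```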
In addition, we must use the HDFS commands below to create /tmp and /user/hive/warehouse (the Hive warehouse directory) and set them chmod g+w before we can create a table in Hive:

hdfs dfs -ls /
drwxr-xr-x   - hduser supergroup 0 11:17 /hbase
drwx-        - hduser supergroup 0 16:04 /tmp
drwxr-xr-x   - hduser supergroup 0 09:13 /user

hdfs dfs -mkdir -p /user/hive/warehouse
hdfs dfs -chmod g+w /tmp
hdfs dfs -chmod g+w /user/hive/warehouse

hdfs dfs -ls /
drwx-w-      - hduser supergroup 0 16:04 /tmp
drwxr-xr-x   - hduser supergroup 0 17:18 /user

hdfs dfs -ls /user
drwxr-xr-x   - hduser supergroup 0 23:17 /user/hduser
drwxr-xr-x   - hduser supergroup 0 17:18 /user/hive

Hive uses Hadoop, so we must have Hadoop in our path:
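The effect of chmod g+w can be sketched on a local directory (the directory name here is hypothetical; on HDFS the analogous command is hdfs dfs -chmod g+w <path>, with the same mode-bit semantics):

```shell
# Illustrate what "chmod g+w" adds, using a local directory.
mkdir -p /tmp/hive_warehouse_demo
chmod 755 /tmp/hive_warehouse_demo   # start from rwxr-xr-x
chmod g+w /tmp/hive_warehouse_demo   # add group write -> rwxrwxr-x
# Print just the mode string of the directory:
ls -ld /tmp/hive_warehouse_demo | cut -c1-10
# prints drwxrwxr-x
```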