
Saturday, April 30, 2016

Ubuntu - Hadoop 2.7.1 - Pseudo-Distributed Mode Installation

This is pseudo-distributed mode, for a single machine:
the same node acts as both the namenode and a datanode.
Tested on Ubuntu 14.04.

1. Prepare Java and SSH

Install Java
http://jasonmun.blogspot.my/2016/04/ubuntu-java-6-7-8-9.html

Install SSH
http://jasonmun.blogspot.my/2015/08/ubuntu-ssh-scp.html
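
Hadoop's start scripts log in to each node (here, just localhost) over SSH, so pseudo-distributed mode needs passwordless SSH to localhost. A minimal key setup, assuming no key exists yet:

```shell
# Generate an RSA key pair with an empty passphrase (skipped if a key
# already exists) and authorize it for logins to this machine.
mkdir -p ~/.ssh && chmod 700 ~/.ssh
[ -f ~/.ssh/id_rsa ] || ssh-keygen -q -t rsa -N '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys

# ssh localhost   # should now log in without prompting for a password
```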


2. Edit /etc/hostname and /etc/hosts

$ sudo gedit /etc/hostname
ubuntu

# Apply the new hostname (run whichever of these exists on your release)
$ sudo /etc/init.d/hostname restart

$ sudo /etc/init.d/hostname.sh restart

$ sudo gedit /etc/hosts

127.0.0.1    localhost
127.0.0.1    ubuntu
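
After editing both files, it is worth confirming that the name resolves; the `ubuntu` hostname below is the example value from above:

```shell
# Print the active hostname and check that names resolve via /etc/hosts.
hostname
getent hosts localhost
# getent hosts ubuntu   # likewise, once the /etc/hosts entry above is in place
```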


3. Extract hadoop-2.7.1.tar.gz into /home/userID/hadoop-2.7.1
http://hadoop.apache.org/releases.html
http://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz


# Assuming hadoop-2.7.1.tar.gz has been placed in /home/userID
$ cd ~
$ tar zxvf hadoop-2.7.1.tar.gz
 

4. Edit /etc/profile

$ sudo gedit /etc/profile

#####################
### Oracle Java 8 ###
#####################
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export PATH="$JAVA_HOME/bin:$PATH"
export CLASSPATH="$JAVA_HOME/lib"

#############################
# hadoop 2.7.1
#############################
export HADOOP_PREFIX=/home/userID/hadoop-2.7.1

export HADOOP_HOME=$HADOOP_PREFIX

export PATH=$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin:$PATH

export HADOOP_MAPRED_HOME=$HADOOP_PREFIX
export HADOOP_COMMON_HOME=$HADOOP_PREFIX
export HADOOP_HDFS_HOME=$HADOOP_PREFIX
export YARN_HOME=$HADOOP_PREFIX
export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_PREFIX}/lib/native
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HADOOP_PREFIX/lib"


$ source /etc/profile
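
To confirm the exports took effect, a quick check (a sketch; `$HOME/hadoop-2.7.1` stands in for `/home/userID/hadoop-2.7.1`):

```shell
# Re-create the PATH export from /etc/profile and verify that the
# Hadoop bin directory actually ended up on the PATH.
export HADOOP_PREFIX="$HOME/hadoop-2.7.1"
export HADOOP_HOME="$HADOOP_PREFIX"
export PATH="$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin:$PATH"
case ":$PATH:" in
  *":$HADOOP_PREFIX/bin:"*) echo "PATH OK" ;;
  *)                        echo "PATH missing $HADOOP_PREFIX/bin" ;;
esac
```

Once Hadoop is unpacked, `hadoop version` is the usual end-to-end check that the PATH is right.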


The following steps edit the Hadoop 2.7.1 configuration files (6 files in total).

5. hadoop-2.7.1/etc/hadoop/slaves (the list of datanodes)

$ cd ~/hadoop-2.7.1/etc/hadoop
$ gedit slaves

Write:
localhost

6. Edit hadoop-2.7.1/etc/hadoop/hadoop-env.sh

$ cd ~/hadoop-2.7.1/etc/hadoop
$ gedit hadoop-env.sh

# Add the following line after the existing `export JAVA_HOME=${JAVA_HOME}` line
export JAVA_HOME=/usr/lib/jvm/java-8-oracle


7. Edit hadoop-2.7.1/etc/hadoop/hdfs-site.xml

$ gedit hdfs-site.xml

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>

    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:///home/userID/hadoop-2.7.1/hdfs/namenode</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:///home/userID/hadoop-2.7.1/hdfs/datanode</value>
    </property>
</configuration>
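
Hadoop will normally create these directories when HDFS is formatted, but creating them up front avoids permission surprises (a sketch, with `~/hadoop-2.7.1` standing in for `/home/userID/hadoop-2.7.1`):

```shell
# Create the local storage directories referenced by hdfs-site.xml.
mkdir -p ~/hadoop-2.7.1/hdfs/namenode
mkdir -p ~/hadoop-2.7.1/hdfs/datanode
```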


8. Edit hadoop-2.7.1/etc/hadoop/core-site.xml

$ gedit core-site.xml

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/userID/hadoop-2.7.1/tmp</value>
    </property>
</configuration>


9. Edit hadoop-2.7.1/etc/hadoop/yarn-site.xml

$ gedit yarn-site.xml

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>localhost:8025</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>localhost:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>localhost:8050</value>
    </property>
</configuration>


10. Copy the template hadoop-2.7.1/etc/hadoop/mapred-site.xml.template to mapred-site.xml and edit it

$ cp mapred-site.xml.template mapred-site.xml
$ gedit mapred-site.xml

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
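
With all six files in place, a typical first start looks like the following (a sketch; `hdfs`, `start-dfs.sh`, and `start-yarn.sh` are on the PATH thanks to step 4):

```shell
# One-time only: initialize the namenode metadata directory.
# Re-running this wipes any existing HDFS data.
hdfs namenode -format

start-dfs.sh    # starts NameNode, DataNode, SecondaryNameNode
start-yarn.sh   # starts ResourceManager, NodeManager

# jps should now list the five daemons above (plus Jps itself).
jps
```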
