How to quickly configure CentOS HDFS
Deploying the Hadoop Distributed File System (HDFS) on a CentOS system takes several steps. The guide below walks through a single-node (stand-alone) configuration; full cluster deployment is more involved.
1. Java environment configuration
First, make sure Java is installed on the system. Install OpenJDK 8 with the following command:
yum install -y java-1.8.0-openjdk-devel
Configure Java environment variables:
echo "export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk" >> /etc/profile echo "export PATH=$JAVA_HOME/bin:$PATH" >> /etc/profile source /etc/profile java -version
2. SSH password-free login settings
Hadoop's start-up scripts use SSH to reach each node (including localhost), so passwordless SSH login is required.
- Generate an SSH key pair:
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
- Copy the public key to all nodes. In a stand-alone setup there are no remote nodes to copy to, but start-dfs.sh still connects to localhost over SSH, so the key must be authorized locally (see the sketch below).
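A minimal sketch for authorizing the key on localhost, assuming the default key path generated above:
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
ssh localhost exit   # should log in without prompting for a password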
3. Hadoop download and decompression
Download the Hadoop distribution from the Apache Hadoop official website and unzip it to the specified directory:
wget https://downloads.apache.org/hadoop/core/hadoop-3.1.3/hadoop-3.1.3.tar.gz
tar -zxvf hadoop-3.1.3.tar.gz
mv hadoop-3.1.3 /opt/hadoop
4. Hadoop environment variable configuration
Edit the /etc/profile file and add the following environment variables:
export HADOOP_HOME=/opt/hadoop
export PATH=$HADOOP_HOME/bin:$PATH
Then reload the profile:
source /etc/profile
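Hadoop's own scripts do not always pick up JAVA_HOME from /etc/profile, so it is usually safer to also set it in Hadoop's environment file. A hedged sketch, assuming the OpenJDK path used earlier:
# Make JAVA_HOME visible to the Hadoop daemons themselves
echo "export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk" >> /opt/hadoop/etc/hadoop/hadoop-env.sh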
5. Hadoop configuration file modification
core-site.xml
Edit /opt/hadoop/etc/hadoop/core-site.xml and add the following (replace 192.168.1.1 with your host IP):
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://192.168.1.1:9000</value>
  </property>
</configuration>
hdfs-site.xml
Edit /opt/hadoop/etc/hadoop/hdfs-site.xml and add the following:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/opt/hadoop/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/opt/hadoop/hdfs/datanode</value>
  </property>
</configuration>
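The NameNode and DataNode storage directories referenced above do not exist yet; creating them before formatting avoids start-up errors. A small sketch, assuming the paths configured above:
mkdir -p /opt/hadoop/hdfs/namenode /opt/hadoop/hdfs/datanode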
6. NameNode formatting
Format NameNode:
/opt/hadoop/bin/hdfs namenode -format
7. HDFS startup
Start HDFS service:
/opt/hadoop/sbin/start-dfs.sh
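On Hadoop 3.x, start-dfs.sh refuses to start the daemons as root unless the operating users are declared. If you are running these steps as root (not needed when using a dedicated hadoop user), one common approach is to declare them in hadoop-env.sh; a hedged sketch:
echo "export HDFS_NAMENODE_USER=root" >> /opt/hadoop/etc/hadoop/hadoop-env.sh
echo "export HDFS_DATANODE_USER=root" >> /opt/hadoop/etc/hadoop/hadoop-env.sh
echo "export HDFS_SECONDARYNAMENODE_USER=root" >> /opt/hadoop/etc/hadoop/hadoop-env.sh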
8. HDFS status verification
Check HDFS status:
jps
You should see the NameNode, DataNode, and SecondaryNameNode processes running.
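To confirm the filesystem actually accepts reads and writes, run a quick smoke test; a minimal sketch in which the /test directory name is arbitrary:
/opt/hadoop/bin/hdfs dfs -mkdir /test
/opt/hadoop/bin/hdfs dfs -put /etc/hosts /test/
/opt/hadoop/bin/hdfs dfs -ls /test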
9. HDFS Web UI Access
Visit http://192.168.1.1:9870 (replace 192.168.1.1 with your host IP) to view the HDFS web interface. Hadoop 3.x serves the NameNode UI on port 9870; port 50070 applied only to Hadoop 2.x.
This guide covers a stand-alone HDFS configuration only. A multi-node cluster additionally requires configuring worker nodes, the Secondary NameNode, and (for high availability) ZooKeeper, and the configuration files must be kept consistent across all nodes.