Connecting To HDP from Talend Studio
Mohamad's interest is in Programming (Mobile, Web, Database and Machine Learning). He is studying at the Center For Artificial Intelligence Technology (CAIT), Universiti Kebangsaan Malaysia (UKM).
To connect Talend Open Studio to the Hortonworks Data Platform (HDP), first use the Ambari dashboard to confirm that your Hadoop services are running and properly configured.
Step 1: Access the Ambari Dashboard
Open Your Web Browser:
- Navigate to the Ambari dashboard, typically at http://<your-sandbox-ip>:8080 (replace <your-sandbox-ip> with the actual IP address of your Hortonworks Sandbox).
Log In:
- Use the default credentials.
Step 2: Verify Hadoop Services
Check Cluster Status:
- In the Ambari dashboard, check the status of your cluster components (HDFS, YARN, etc.). Ensure that all necessary services are running.
Service Information:
Click on each service (like HDFS) to review its configuration and ensure it is set correctly. Pay attention to:
- NameNode URI: For HDFS, note the NameNode address (usually hdfs://<namenode-host>:<port>).
- Web Interfaces: Ensure the web interfaces are accessible (e.g., HDFS at http://<your-sandbox-ip>:50070).
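The status check above can also be scripted against Ambari's REST API. The following is a minimal sketch, not a definitive implementation: it assumes the cluster is named `Sandbox` (a hypothetical name; use the one shown in your Ambari dashboard), that Ambari listens on port 8080, and that the response carries the state under `ServiceInfo.state`. The helper names are illustrative only.

```python
import base64
import json
import urllib.request


def service_url(ambari_host, cluster, service, port=8080):
    """Build the Ambari REST URL that returns one service's state."""
    return (f"http://{ambari_host}:{port}/api/v1/clusters/{cluster}"
            f"/services/{service}?fields=ServiceInfo/state")


def parse_state(body):
    """Extract the state string (for example 'STARTED') from the JSON body."""
    return json.loads(body)["ServiceInfo"]["state"]


def fetch_state(ambari_host, cluster, service, user, password):
    """Query Ambari with HTTP basic auth and return the service state."""
    request = urllib.request.Request(service_url(ambari_host, cluster, service))
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_state(response.read().decode("utf-8"))
```

Calling `fetch_state("<your-sandbox-ip>", "Sandbox", "HDFS", user, password)` with your own Ambari credentials should return the same state that the dashboard shows for HDFS.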
Step 3: Configure HDFS (if needed)
Edit Configurations:
- Click on the HDFS service in Ambari.
- Go to the Configs tab to view or modify settings like dfs.replication, dfs.namenode.name.dir, etc.
- Ensure the configurations match what you need for your Talend connection.
Save Changes:
- If you make any changes, click Save and then Restart the service if prompted.
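If you prefer to inspect these settings outside the browser, the Hadoop configuration files (such as hdfs-site.xml) use a simple property/name/value XML layout that is easy to read programmatically. The sketch below assumes you already have the file contents as a string; the function name and the sample values are illustrative, not taken from a real cluster.

```python
import xml.etree.ElementTree as ET

# Sample content in the standard Hadoop *-site.xml layout (values are made up).
SAMPLE_HDFS_SITE = """<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/hadoop/hdfs/namenode</value>
  </property>
</configuration>"""


def read_hadoop_properties(xml_text):
    """Return a dict mapping each property name to its value."""
    root = ET.fromstring(xml_text)
    properties = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            properties[name.strip()] = (value or "").strip()
    return properties


settings = read_hadoop_properties(SAMPLE_HDFS_SITE)
print(settings["dfs.replication"])
```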
Step 4: Retrieve Connection Details
Get Connection Information:
- Note the NameNode URI and any other relevant configuration details from the Ambari dashboard that you will need for Talend.
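Talend's connection settings ask for the NameNode host and port, so it can help to split the URI you noted into its parts. The following is a small sketch using Python's standard library; the function name is hypothetical, and the fallback port 8020 is an assumption based on the example URI used in this guide.

```python
from urllib.parse import urlparse


def split_namenode_uri(uri, default_port=8020):
    """Split an hdfs:// URI into (host, port).

    Falls back to default_port when the URI carries no explicit port.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "hdfs" or not parsed.hostname:
        raise ValueError(f"Not a valid NameNode URI: {uri!r}")
    return parsed.hostname, parsed.port or default_port


host, port = split_namenode_uri("hdfs://sandbox.example:8020")
print(host, port)
```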
Step 5: Connect from Talend Open Studio
Now that you have confirmed the services and gathered the necessary connection details, you can proceed to configure Talend:
Open Talend Open Studio and create or select your project.
Create a New Hadoop Connection as described in the previous instructions:
- Use the NameNode URI gathered from Ambari (e.g., hdfs://<your-sandbox-ip>:8020).
- Set up any required authentication if your cluster is secured.
Test the Connection to ensure that Talend can connect to the Hadoop services running on your Ambari-managed cluster.
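If the connection test fails in Talend, a quick way to narrow down the cause is to check whether the NameNode port is reachable at all from the machine running Talend. This is a rough sketch of such a check; it only confirms that a TCP connection can be opened and says nothing about authentication or HDFS itself. The function name is illustrative.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False
```

For example, `can_connect("<your-sandbox-ip>", 8020)` returning False points to a network, firewall, or port-forwarding problem rather than a Talend configuration issue.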
Step 6: Monitor and Manage
- After you've set up the connection in Talend, you can return to the Ambari dashboard to monitor the cluster's performance and check for any issues while running your Talend jobs.

Right-click Job Designs and create a new job.

Drag tHDFSConnection to the main panel.


Install the external JAR.