Java SE 8: Creating an HTTP Link Checker with Java
Overview
Purpose
This tutorial shows you how to use Java Platform, Standard Edition 8 (Java SE 8) and NetBeans 8 to create a link checker with the HTTPClient class.
Time to Complete
Approximately 80 minutes
Introduction
HTTP is the foundation for communication of data on the web. The proliferation of network-enabled applications has increased the use of the HTTP protocol beyond user-driven web browsers.
- The HTTPClient class helps build HTTP-aware client applications, such as web browsers and web service clients for distributed communication.
- The URL class is a pointer to a resource on the web. A resource can be something as simple as a file or a directory, or it can be a reference to a more complicated object, such as a query to a database or to a search engine.
- The HttpURLConnection class helps establish an HTTP connection between the HTTPClient and server.
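As a minimal sketch of what the URL class exposes, the following example splits a URL string into its parts without opening a connection. The class name UrlPartsSketch, the describe method, and the sample addresses are hypothetical and are not part of the tutorial project.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper class, not part of the LinkChecker project.
final class UrlPartsSketch {

    // Splits a URL string into protocol, host, path, and query.
    // No network connection is opened here.
    static String describe(String spec) {
        try {
            URL url = new URL(spec);
            return url.getProtocol() + " | " + url.getHost() + " | "
                    + url.getPath() + " | " + url.getQuery();
        } catch (MalformedURLException e) {
            // Thrown, for example, when the string has no known protocol.
            return "malformed: " + spec;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("https://www.example.com/search?q=java"));
        System.out.println(describe("not a url"));
    }
}
```

The URL constructor only parses the string. A connection is made later, when a method such as getResponseCode is called on the object returned by openConnection.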
Scenario
A testing team wants to verify and validate a given set of URLs.
Hardware and Software Requirements
- Install the Java SE 8 JDK.
- Install NetBeans 8.0.
- Extract the Url_file.zip file.
Creating a Java Application
In this section, you create a Java application that you will use to demonstrate the HTTP link checker application.
- In NetBeans IDE 8.0, select New Project from the File menu.
- On the Choose Project page, perform the following steps:
- Select Java from Categories.
- Select Java Application from Projects.
- Click Next.
- On the Name and Location page, perform the following steps:
- Enter LinkChecker as the project name.
- Select Create Main Class.
- Enter com.example.HTTPClient.
- Click Finish.


The Java SE 8 LinkChecker project is created in NetBeans. You're now ready to use the HTTPClient.java file to implement the link checker application.
Creating a Java enum Data Type
In this section, you create an enum data type to store the HTTP response codes. An enum data type is a special data type that includes a set of predefined constants for a variable. The variable must be equal to one of the predefined values. You declare the HTTP response codes and validate the URLs against them.
The enum is named URLStatus, and its constants include HTTP_OK(200, "OK", "SUCCESS"), NO_CONTENT(204, "No Content", "SUCCESS"), and INTERNAL_SERVER_ERROR(500, "Internal Server Error", "ERROR").
- Create URLStatus.java and initialize it by using a constructor.
- Retrieve the HTTP status message.
    public static String getStatusMessageForStatusCode(int httpcode) {
        String returnStatusMessage = "Status Not Defined";
        for (URLStatus object : URLStatus.values()) {
            if (object.statusCode == httpcode) {
                returnStatusMessage = object.httpMessage;
            }
        }
        return returnStatusMessage;
    }
The getStatusMessageForStatusCode() method receives httpcode as the input parameter. The httpcode parameter is compared against all defined enum values. If an enum constant is defined for httpcode, then the HTTP message for that code is returned; otherwise, "Status Not Defined" is returned.
- Retrieve the result of the URL.
    public static String getResultForStatusCode(int code) {
        String returnResultMessage = "Result Not Defined";
        for (URLStatus object : URLStatus.values()) {
            if (object.statusCode == code) {
                returnResultMessage = object.result;
            }
        }
        return returnResultMessage;
    }
}
The getResultForStatusCode() method receives code as the input parameter. The code parameter is compared against all defined enum values. If an enum constant is defined for code, then the result for that code is returned; otherwise, "Result Not Defined" is returned.
- Review the code. Your code should look like the following:
public enum URLStatus {
    HTTP_OK(200, "OK", "SUCCESS"),
    NO_CONTENT(204, "No Content", "SUCCESS"),
    MOVED_PERMANENTLY(301, "Moved Permanently", "SUCCESS"),
    NOT_MODIFIED(304, "Not modified", "SUCCESS"),
    USE_PROXY(305, "Use Proxy", "SUCCESS"),
    INTERNAL_SERVER_ERROR(500, "Internal Server Error", "ERROR"),
    NOT_FOUND(404, "Not Found", "ERROR");

    private int statusCode;
    private String httpMessage;
    private String result;

    public int getStatusCode() {
        return statusCode;
    }

    private URLStatus(int code, String message, String status) {
        statusCode = code;
        httpMessage = message;
        result = status;
    }
    // The getStatusMessageForStatusCode() and getResultForStatusCode() methods
    // follow here, before the closing brace of the enum.
You defined the set of HTTP response code values as constants inside the URLStatus enum type, and you initialized each constant by using the constructor.
Note: Here is an explanation of some of the HTTP response codes:
- 200, OK: The client request was received, understood, and processed successfully.
- 301, Moved Permanently: The location was moved, and you're directed to the new location.
- 500, Internal Server Error: An error occurred during execution.
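To illustrate how the two lookup methods behave, here is a minimal sketch that uses a trimmed copy of the URLStatus enum with only three constants. The wrapper class UrlStatusLookupSketch, the summarize method, and the choice of 418 as an undefined code are assumptions made for this example. The early return in each loop is a small variation on the tutorial code that produces the same result.

```java
// Hypothetical wrapper class used only for this illustration.
final class UrlStatusLookupSketch {

    // Trimmed copy of the tutorial's enum; only three constants are kept.
    enum URLStatus {
        HTTP_OK(200, "OK", "SUCCESS"),
        MOVED_PERMANENTLY(301, "Moved Permanently", "SUCCESS"),
        NOT_FOUND(404, "Not Found", "ERROR");

        private final int statusCode;
        private final String httpMessage;
        private final String result;

        URLStatus(int code, String message, String status) {
            statusCode = code;
            httpMessage = message;
            result = status;
        }

        // Returns the HTTP message for a code, or a default when no constant matches.
        static String getStatusMessageForStatusCode(int httpcode) {
            for (URLStatus status : values()) {
                if (status.statusCode == httpcode) {
                    return status.httpMessage;
                }
            }
            return "Status Not Defined";
        }

        // Returns SUCCESS or ERROR for a code, or a default when no constant matches.
        static String getResultForStatusCode(int code) {
            for (URLStatus status : values()) {
                if (status.statusCode == code) {
                    return status.result;
                }
            }
            return "Result Not Defined";
        }
    }

    // Combines both lookups into one line of text.
    static String summarize(int code) {
        return code + " -> " + URLStatus.getStatusMessageForStatusCode(code)
                + " (" + URLStatus.getResultForStatusCode(code) + ")";
    }

    public static void main(String[] args) {
        System.out.println(summarize(200)); // 200 -> OK (SUCCESS)
        System.out.println(summarize(404)); // 404 -> Not Found (ERROR)
        System.out.println(summarize(418)); // 418 -> Status Not Defined (Result Not Defined)
    }
}
```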
Verifying and Validating URLs
In this section, you verify and validate the URLs that are available in the url-list.txt file. You verify the URL for its correct format by using the verifyUrl method, and then you validate the verified URLs by using the validateUrl method to check for broken URLs. Add the url-list.txt file to the source package.
Verifying the URLs
In this section, you use a regular expression (the java.util.regex package in Java SE 8) to check the URL format. The HTTPClient.java file has a verifyUrl method, which accepts the URL as the input parameter.
- Import the following packages:
- Add the following method to the HTTPClient.java file to verify the URL format:
- Review the code. Your code should look like the following:
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HTTPClient {

    private boolean verifyUrl(String url) {
        String urlRegex = "^(http|https)://[-a-zA-Z0-9+&@#/%?=~_|,!:.;]*[-a-zA-Z0-9+@#/%=&_|]";
        Pattern pattern = Pattern.compile(urlRegex);
        Matcher m = pattern.matcher(url);
        if (m.matches()) {
            return true;
        } else {
            return false;
        }
    }
The verifyUrl method verifies the url parameter by matching it against the regular expression. If the match is successful, then it returns true; otherwise, it returns false.
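The following sketch shows which kinds of strings the regular expression accepts. It reuses the pattern from verifyUrl unchanged. The class name UrlFormatSketch, the hasValidFormat method, and the sample strings are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical class that exercises the tutorial's regular expression.
final class UrlFormatSketch {

    // Same expression as in verifyUrl; compiled once and reused.
    private static final Pattern URL_PATTERN = Pattern.compile(
            "^(http|https)://[-a-zA-Z0-9+&@#/%?=~_|,!:.;]*[-a-zA-Z0-9+@#/%=&_|]");

    // matches() succeeds only if the whole string fits the pattern.
    static boolean hasValidFormat(String url) {
        return URL_PATTERN.matcher(url).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "http://www.example.com",             // accepted
            "https://example.com/index.html?x=1", // accepted
            "ftp://example.com",                  // rejected: scheme is not http or https
            "www.example.com",                    // rejected: no scheme
            "http://example.com/."                // rejected: may not end with a period
        };
        for (String sample : samples) {
            System.out.println(sample + " : " + hasValidFormat(sample));
        }
    }
}
```

A format check like this says nothing about whether the address is reachable. That is the job of the validation step.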
Validating the URLs
In this section, you validate the URLs listed in the url-list.txt file.
- Modify HTTPClient.java.
- Declare the following variables:
    private String failedURLS = "";
    private String succeededURLS = "";
    private String incorrectURLS = "";
- Retrieve the URLs from the url-list.txt file.
    public void validateUrl() throws Exception {
        Path filePath = Paths.get("src/url-list.txt");
        List<String> myURLArrayList = Files.readAllLines(filePath);
You obtain filePath by calling the get method of the Paths class. Using the readAllLines method, you read the URLs in filePath into a List of type String.
- Invoke the verifyUrl method.
        myURLArrayList.forEach((String url) -> {
            if (verifyUrl(url)) {
                try {
Here, you're using the forEach method, which you write with a lambda expression. The loop retrieves each URL from myURLArrayList, and the retrieved URL is passed as an input parameter to the verifyUrl method. The verifyUrl method returns true for a valid URL format; otherwise, it returns false. If the verifyUrl method returns true, then the if condition is true, and the code in the try block is executed.
- Create the HttpURLConnection connection.
                    URL myURL = new URL(url);
                    HttpURLConnection myConnection = (HttpURLConnection) myURL.openConnection();
You create a URL instance named myURL, call its openConnection method, and cast the returned connection to HttpURLConnection.
- Validate the URL with the response code.
                    if (myConnection.getResponseCode() == URLStatus.HTTP_OK.getStatusCode()) {
                        succeededURLS = succeededURLS + "\n" + url + "****** Status message is : "
                                + URLStatus.getStatusMessageForStatusCode(myConnection.getResponseCode());
                    } else {
                        failedURLS = failedURLS + "\n" + url + "****** Status message is : "
                                + URLStatus.getStatusMessageForStatusCode(myConnection.getResponseCode());
                    }
The myConnection instance receives the URL's response code and verifies the status. If the status code is 200 (HTTP_OK), then the URL is added to succeededURLS; otherwise, it's added to failedURLS.
- Close try with the catch block.
                } catch (Exception e) {
                    // Print the URL and the exception message on a line of their own.
                    System.out.println("For url- " + url + " : " + e.getMessage());
                }
The catch block is executed when an exception is thrown while the HttpURLConnection is created or opened.
- Verify the incorrect URLs.
            } else {
                incorrectURLS += "\n" + url;
            }
        });
    }
The else block is executed when the verifyUrl method returns false because the URL format check failed.
- Add the following code to the main() method in the HTTPClient.java file:
    public static void main(String[] args) {
        try {
            HTTPClient myClient = new HTTPClient();
            myClient.validateUrl();
            System.out.println("Valid URLS that have successfully connected :");
            System.out.println(myClient.succeededURLS);
            System.out.println("\n--------------\n\n");
            System.out.println("Broken URLS that did not successfully connect :");
            System.out.println(myClient.failedURLS);
        } catch (Exception e) {
            System.out.print(e.getMessage());
        }
    }
}
The main() method creates an instance of HTTPClient named myClient. Using the myClient instance, you invoke the validateUrl method. The main() method then displays the valid URLs that connected, the broken URLs that did not connect, and their status messages in the console.
- On the Projects tab, right-click HTTPClient.java and select Run File.
- Review the set of URLs displayed in the url-list.txt file.
- Verify the output.


For the given set of URLs, the application retrieves each URL, verifies it, validates it, and classifies it accordingly.

You successfully used the URLs listed in the url-list.txt file and classified them as valid URLs or broken URLs. The status codes of the broken URLs are displayed in the console.
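The reading and format-checking steps can also be sketched on their own, without any network access. The example below writes a few sample lines to a temporary file, reads them back with Files.readAllLines, and sorts them into two lists. The class name UrlListSketch, the partition and demo methods, the sample addresses, the trimming of blank lines, and the use of lists in place of concatenated strings are all assumptions of this sketch and differ from the tutorial code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical class; it classifies URLs by format only and opens no connections.
final class UrlListSketch {

    // Same expression as in the tutorial's verifyUrl method.
    private static final Pattern URL_PATTERN = Pattern.compile(
            "^(http|https)://[-a-zA-Z0-9+&@#/%?=~_|,!:.;]*[-a-zA-Z0-9+@#/%=&_|]");

    // Returns two lists: index 0 holds well-formed URLs, index 1 holds the rest.
    static List<List<String>> partition(Path file) {
        List<String> wellFormed = new ArrayList<>();
        List<String> malformed = new ArrayList<>();
        try {
            Files.readAllLines(file).forEach(line -> {
                String url = line.trim();
                if (url.isEmpty()) {
                    return; // skip blank lines (an addition in this sketch)
                }
                if (URL_PATTERN.matcher(url).matches()) {
                    wellFormed.add(url);
                } else {
                    malformed.add(url);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Arrays.asList(wellFormed, malformed);
    }

    // Writes sample lines to a temporary file, partitions them, and cleans up.
    static List<List<String>> demo() {
        try {
            Path file = Files.createTempFile("url-list", ".txt");
            try {
                Files.write(file, Arrays.asList(
                        "http://www.example.com",
                        "",
                        "htp://typo.example.com",
                        "https://example.org/docs"));
                return partition(file);
            } finally {
                Files.delete(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<List<String>> result = demo();
        System.out.println("Well-formed: " + result.get(0));
        System.out.println("Malformed:   " + result.get(1));
    }
}
```

Collecting results in lists keeps each URL as a separate element, which can be easier to count or sort than one long string.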
Note: When you run this application on the Oracle network, some URLs may be blocked, and a connection timeout error is displayed.
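One way to keep a blocked or slow URL from stalling the whole run is to set explicit timeouts on the connection. The sketch below is an illustration under stated assumptions and is not part of the tutorial project: the class name ResponseCodeSketch, the statusOf and probeLocalServer methods, the 2000-millisecond timeout, and the -1 return value for a failed connection are all hypothetical. To avoid depending on external sites, it starts a small local server with the JDK's com.sun.net.httpserver.HttpServer class and checks two paths on it.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;

// Hypothetical class; checks response codes against a local test server.
final class ResponseCodeSketch {

    // Returns the HTTP status code, or -1 if the connection fails or times out.
    static int statusOf(String address, int timeoutMillis) {
        HttpURLConnection connection = null;
        try {
            connection = (HttpURLConnection) new URL(address).openConnection();
            connection.setConnectTimeout(timeoutMillis); // limit for establishing the connection
            connection.setReadTimeout(timeoutMillis);    // limit for waiting on the response
            return connection.getResponseCode();
        } catch (IOException e) {
            return -1;
        } finally {
            if (connection != null) {
                connection.disconnect();
            }
        }
    }

    // Starts a server on a free local port, requests two paths, and stops the server.
    static int[] probeLocalServer() {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/ok", exchange -> {
                exchange.sendResponseHeaders(200, -1); // -1 means no response body
                exchange.close();
            });
            server.createContext("/missing", exchange -> {
                exchange.sendResponseHeaders(404, -1);
                exchange.close();
            });
            server.start();
            try {
                String base = "http://127.0.0.1:" + server.getAddress().getPort();
                return new int[] {
                    statusOf(base + "/ok", 2000),
                    statusOf(base + "/missing", 2000)
                };
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] codes = probeLocalServer();
        System.out.println("/ok      -> " + codes[0]);
        System.out.println("/missing -> " + codes[1]);
    }
}
```

With the handlers defined above, the two requests should report 200 and 404. A URL that cannot be reached within the timeout would yield -1 in this sketch, where the tutorial code prints the exception message.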
Summary
In this tutorial, you learned how to create a Java SE project. You also learned how to use the URL and HttpURLConnection classes.
Resources
- To learn more about HttpURLConnection in Java, see Java SE docs: HttpURLConnection.
- To learn more about URL in Java, see Java SE docs: URL.
- To learn more about Java SE, refer to additional OBEs in the Oracle Learning Library.
- For more information about HTTP status codes.
Credits
- Curriculum Developer: Shilpa Chetan