Network Troubleshooting with Oracle Linux
by Robert Chase
Published June 2014
Use built-in Oracle Linux tools to troubleshoot network connectivity problems.
When applications that use network connectivity for communication are not working, the cause is often a mystery. Despite advances in modern operating systems, many users believe there is no way to directly "see" what's going over the wire, and that often leads to confusion and difficulties when something goes wrong.
The reality is that you can actually see what's going over the wire, and there are a number of tools built into Oracle Linux for troubleshooting network issues. This article will help solve some of the mystery and make network connectivity a bit more user friendly.
Tools for Network Troubleshooting
The most basic test you can perform to determine network connectivity is the ping test. The ping application is a simple utility that sends an Internet Control Message Protocol (ICMP) packet to a remote host on your network to see whether you get a reply. ICMP is part of the TCP/IP stack.
In the example shown in Listing 1, we are going to ping the test2 system from the test1 system to see whether test2 responds on the network. The ping application will continue to run until you terminate it by pressing Ctrl-C.
[root@test1 ~]# ping test2
PING test2 (10.0.0.2) 56(84) bytes of data.
64 bytes from test2 (10.0.0.2): icmp_seq=1 ttl=64 time=0.268 ms
64 bytes from test2 (10.0.0.2): icmp_seq=2 ttl=64 time=0.191 ms
64 bytes from test2 (10.0.0.2): icmp_seq=3 ttl=64 time=0.235 ms
64 bytes from test2 (10.0.0.2): icmp_seq=4 ttl=64 time=0.214 ms
64 bytes from test2 (10.0.0.2): icmp_seq=5 ttl=64 time=0.211 ms
64 bytes from test2 (10.0.0.2): icmp_seq=6 ttl=64 time=0.191 ms
^C
--- test2 ping statistics ---
6 packets transmitted, 6 received, 0% packet loss, time 5180ms
rtt min/avg/max/mdev = 0.191/0.218/0.268/0.029 ms
Listing 1. Example of using the ping command
In the output shown in Listing 1, you can see that test2 is responding. It was sent six ICMP packets and replied to all of them, with an average response time of 0.218 ms.
The ping command can also be used to identify network latency issues. If one of the packets had an abnormally long reply time, that might indicate network congestion or another problem along the path.
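To compare latency across runs, it can help to pull the average round-trip time out of the summary line that ping prints. The following is a minimal sketch, assuming the summary format shown in Listing 1; the function name ping_avg_rtt is hypothetical and not part of ping.

```shell
# Print the average round-trip time (in ms) from ping's summary line on stdin.
# Assumes a summary line of the form shown in Listing 1:
#   rtt min/avg/max/mdev = 0.191/0.218/0.268/0.029 ms
ping_avg_rtt() {
    # Splitting on "/" puts the average value in the fifth field.
    awk -F'/' '/^rtt/ { print $5 }'
}
```

One possible use is `ping -c 5 test2 | ping_avg_rtt`, where the `-c 5` option makes ping stop after five packets instead of waiting for Ctrl-C.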
Many networks are quite complicated, and packets move through many different layers of routing and switching equipment to reach the intended destination. The traceroute application can show us the path that packets are taking to reach their destination.
The example shown in Listing 2 uses traceroute to trace the path to a popular search engine from the demo cloud instance. The IP address of the cloud instance has been replaced with x.x.x.x in this example.
[user@test1 ~]$ traceroute google.com
traceroute to google.com (18.104.22.168), 30 hops max, 40 byte packets
 1 x.x.x.x (x.x.x.x) 0.553 ms 0.594 ms 0.624 ms
 2 core1-Aggr301a-3.ord1.rackspace.net (22.214.171.124) 1.072 ms 1.135 ms 1.135 ms
 3 corea.ord1.rackspace.net (126.96.36.199) 1.132 ms 1.162 ms corea.ord1.rackspace.net (188.8.131.52) 0.636 ms
 4 edge1.ord1.rackspace.net (184.108.40.206) 0.670 ms edge1.ord1.rackspace.net (220.127.116.11) 0.930 ms 0.901 ms
 5 18.104.22.168 (22.214.171.124) 1.146 ms 2.081 ms 1.018 ms
 6 126.96.36.199 (188.8.131.52) 1.151 ms 184.108.40.206 (220.127.116.11) 1.262 ms 18.104.22.168 (22.214.171.124) 1.310 ms
 7 ae-1-51.edge3.Chicago3.Level3.net (126.96.36.199) 1.180 ms 1.163 ms 1.312 ms
 8 GOOGLE-INC.edge3.Chicago3.Level3.net (188.8.131.52) 1.363 ms 1.649 ms 1.519 ms
 9 184.108.40.206 (220.127.116.11) 1.404 ms 1.549 ms 1.515 ms
10 18.104.22.168 (22.214.171.124) 1.721 ms 2.014 ms 2.205 ms
11 ord08s11-in-f8.1e100.net (126.96.36.199) 1.495 ms 1.609 ms 1.520 ms
Listing 2. Example of using the traceroute command
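A change in the number of hops between two runs can hint at a routing change. The sketch below counts the hops in traceroute output like that in Listing 2. The function name traceroute_hop_count is hypothetical, and the sketch assumes that each hop line starts with its hop number.

```shell
# Print the number of the last hop seen in traceroute output on stdin.
# Assumes hop lines begin with a hop number, as in Listing 2; the header
# line ("traceroute to ...") is skipped because its first field is not numeric.
traceroute_hop_count() {
    awk '$1 ~ /^[0-9]+$/ { hops = $1 } END { print hops + 0 }'
}
```

For example, `traceroute -n google.com | traceroute_hop_count` prints the hop count for the current path; the `-n` option skips reverse DNS lookups, which usually makes the trace finish sooner.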
The ping and traceroute utilities are quite useful on their own, but in many cases they provide only limited information because they sample the network for a short time. A very useful application for looking at network latency is mtr. (mtr stands for My Traceroute, although it originally stood for Matt's Traceroute because the application was written by Matt Kimball in 1997.)
The mtr application can provide real-time data about latency and routing changes for end users. Figure 1 shows an example of the output from the cloud instance to the same popular search engine used in Listing 2.
Figure 1. Example of output from the mtr application
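Besides its interactive display, mtr has a report mode that prints a plain-text summary, which is easier to save or filter. The sketch below lists hops whose packet loss exceeds a threshold. The function name mtr_lossy_hops is hypothetical, and the column layout (hop marker, host, then Loss%) is an assumption about the report format that may differ between mtr versions.

```shell
# Print "host loss%" for each hop whose loss exceeds the given threshold
# (default 0). Reads `mtr --report` output on stdin.
# Assumed hop line layout: "  2.|-- 10.0.1.1  20.0%  10  1.1 ..."
mtr_lossy_hops() {
    awk -v limit="${1:-0}" '
        $1 ~ /^[0-9]+\.[|]--$/ {
            loss = $3
            sub(/%/, "", loss)              # strip the percent sign
            if (loss + 0 > limit + 0) print $2, loss "%"
        }'
}
```

A possible invocation is `mtr --report --report-cycles 10 google.com | mtr_lossy_hops 5`, which sends ten probes per hop and shows only hops with more than 5% loss.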
Have you ever had a situation in which you could contact a host but were not sure you were communicating with the correct host or using the correct IP address? The nmap utility is a very powerful tool for network scanning and diagnostics, and it can provide exactly that information.
In the example shown in Listing 3, we are verifying that the ocfs2-2 node is actually the node we are looking for. You can verify you have the correct node by looking at its running services and its OS fingerprint. The nmap utility can identify a number of operating systems based on their fingerprints if the fingerprints are contained in the nmap database of known operating systems. In Listing 3, nmap found no exact OS match, so it printed the raw TCP/IP fingerprint instead.
[root@ocfs2-1 ~]# nmap -A ocfs2-2
Starting Nmap 5.51 ( http://nmap.org ) at 2013-09-27 16:14 EDT
Nmap scan report for ocfs2-2 (10.0.0.2)
Host is up (0.00019s latency).
Not shown: 997 closed ports
PORT     STATE SERVICE    VERSION
22/tcp   open  ssh        OpenSSH 5.3 (protocol 2.0)
| ssh-hostkey: 1024 5c:f8:d8:b1:c9:5b:f7:08:ae:f4:b5:78:d6:a0:9e:c0 (DSA)
|_2048 a1:2c:e2:f5:b8:b6:b8:cc:da:41:08:3c:f7:e1:24:06 (RSA)
111/tcp  open  rpcbind
7777/tcp open  tcpwrapped
MAC Address: 08:00:23:89:00:9D (Cadmus Computer Systems)
No exact OS matches for host (If you know what OS is running on it, see http://nmap.org/submit/ ).
TCP/IP fingerprint:
OS:SCAN(V=5.51%D=9/27%OT=22%CT=1%CU=35384%PV=Y%DS=1%DC=D%G=Y%M=080027%TM=52
OS:45E71A%P=x86_64-unknown-linux-gnu)SEQ(SP=106%GCD=1%ISR=10D%TI=Z%CI=Z%II=
OS:I%TS=A)OPS(O1=M5B4ST11NW5%O2=M5B4ST11NW5%O3=M5B4NNT11NW5%O4=M5B4ST11NW5%
OS:O5=M5B4ST11NW5%O6=M5B4ST11)WIN(W1=3890%W2=3890%W3=3890%W4=3890%W5=3890%W
OS:6=3890)ECN(R=Y%DF=Y%T=40%W=3908%O=M5B4NNSNW5%CC=Y%Q=)T1(R=Y%DF=Y%T=40%S=
OS:O%A=S+%F=AS%RD=0%Q=)T2(R=N)T3(R=N)T4(R=Y%DF=Y%T=40%W=0%S=A%A=Z%F=R%O=%RD
OS:=0%Q=)T5(R=Y%DF=Y%T=40%W=0%S=Z%A=S+%F=AR%O=%RD=0%Q=)T6(R=Y%DF=Y%T=40%W=0
OS:%S=A%A=Z%F=R%O=%RD=0%Q=)T7(R=Y%DF=Y%T=40%W=0%S=Z%A=S+%F=AR%O=%RD=0%Q=)U1
OS:(R=Y%DF=N%T=40%IPL=164%UN=0%RIPL=G%RID=G%RIPCK=G%RUCK=G%RUD=G)IE(R=Y%DFI
OS:=N%T=40%CD=S)
Network Distance: 1 hop
TRACEROUTE
HOP RTT     ADDRESS
1   0.19 ms ocfs2-2 (10.0.0.2)
OS and Service detection performed. Please report any incorrect results at http://nmap.org/submit/ .
Nmap done: 1 IP address (1 host up) scanned in 18.02 seconds
Listing 3. Example of using the nmap utility with a host's name
The example in Listing 3 is quite useful, but what if you did not have the correct host? That's no problem for nmap. You can scan the entire subnet, and nmap will report back which hosts are responding on the network.
In the example shown in Listing 4, we are using nmap with the "scan and ping" option (-sP), which only checks which hosts respond and does not scan their ports, to scan the 256 IP addresses of a /24 subnet (netmask 255.255.255.0). The output shows that we get a response back from three hosts that are up: 10.0.0.1, 10.0.0.2, and 10.0.0.3.
[root@ocfs2-1 ~]# nmap -sP 10.0.0.0/24
Starting Nmap 5.51 ( http://nmap.org ) at 2013-09-27 16:14 EDT
Nmap scan report for ocfs2-1 (10.0.0.1)
Host is up.
Nmap scan report for ocfs2-2 (10.0.0.2)
Host is up (0.00015s latency).
MAC Address: 08:00:27:89:00:9D (Cadmus Computer Systems)
Nmap scan report for ocfs2-3 (10.0.0.3)
Host is up (0.00087s latency).
MAC Address: 08:00:27:15:DD:5F (Cadmus Computer Systems)
Nmap done: 256 IP addresses (3 hosts up) scanned in 2.34 seconds
Listing 4. Example of using nmap with the "scan and ping" option
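The figure of 256 addresses follows from the prefix length: a /24 prefix leaves 32 - 24 = 8 host bits, and 2 to the power of 8 is 256. A small sketch of that arithmetic, using a hypothetical helper named subnet_size:

```shell
# Number of IPv4 addresses covered by a subnet with the given prefix length.
subnet_size() {
    echo $(( 1 << (32 - $1) ))
}

subnet_size 24    # prints 256
```

The same helper gives a quick estimate of how many addresses a scan of a larger or smaller range will cover before you start it.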
Once you know what hosts are available on your subnet, you can do an additional scan on the host that you want to explore further.
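To feed the responding hosts into a follow-up scan, you could extract their addresses from the scan report. This is a rough sketch that assumes report lines of the form shown in Listing 4; the function name list_up_hosts is hypothetical.

```shell
# Print one IP address per "Nmap scan report for ..." line read from stdin.
# The address is the last field, with or without surrounding parentheses.
list_up_hosts() {
    awk '/^Nmap scan report for/ {
        ip = $NF
        gsub(/[()]/, "", ip)    # "(10.0.0.2)" becomes "10.0.0.2"
        print ip
    }'
}
```

For instance, `nmap -sP 10.0.0.0/24 | list_up_hosts` would print each responding address on its own line, ready to be passed to a more detailed scan such as the one in Listing 3.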
Firewalls are indispensable tools for securing systems, but they often pose a problem when you inadvertently block traffic you are not intending to block. To further complicate matters, sometimes you don't have full administrative rights to the systems that you are connecting to, so you can't verify the firewall settings and make any needed changes. Even with these challenges, there are ways of determining connectivity through a firewall by using simple tools.
One of the first tools that you can use is telnet. Let's say, for example, that you want to check whether port 7777 is open on another host. You can use telnet to do this, as shown in Listing 5, which indicates you were able to connect to remote host 10.0.0.3 on port 7777.
[root@test1 ~]# telnet 10.0.0.3 7777
Trying 10.0.0.3...
Connected to 10.0.0.3.
Escape character is '^]'.
Connection closed by foreign host.
Listing 5. Example of using telnet to check whether a port is open
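If a firewall was rejecting the connection, or if nothing was listening on the port, the output would instead look like Listing 6. A firewall that silently drops packets would instead leave telnet waiting until the connection attempt times out.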
[root@test1 ~]# telnet 10.0.0.3 7777
Trying 10.0.0.3...
telnet: connect to address 10.0.0.3: Connection refused
Listing 6. Example of telnet output when a port is blocked
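If telnet is not installed, bash can attempt the same TCP connection through its /dev/tcp pseudo-device. The sketch below wraps the attempt in the coreutils timeout command so that a silently dropped connection does not hang indefinitely. The function names check_port and describe_port_status are hypothetical, and the three-second limit is an arbitrary choice.

```shell
# Map the exit status of a timed connection attempt to a rough diagnosis.
describe_port_status() {
    case "$1" in
        0)   echo "open" ;;
        124) echo "no response (possibly dropped by a firewall)" ;;   # timeout expired
        *)   echo "refused or unreachable" ;;
    esac
}

# Try to open a TCP connection to host $1 on port $2, giving up after 3 seconds.
check_port() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
    describe_port_status "$?"
}
```

Running `check_port 10.0.0.3 7777` against the host from Listing 5 would be expected to report the port as open, while a refused connection like the one in Listing 6 would fall into the last case.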
A much better way to check your ability to connect would be to use a utility called netcat. Before you can use netcat, you need to install it on your system by running the following command:
yum install nc
After netcat is installed, you can try to connect to remote host 10.0.0.3 again, just as you did in Listing 5 and Listing 6. Listing 7 shows the successful connection.
[root@test1 ~]# nc -zv 10.0.0.3 7777
Connection to 10.0.0.3 7777 port [tcp/cbt] succeeded!
Listing 7. Example of using netcat to successfully connect to a remote host
The output shown in Listing 8 is an example of a failed connection using netcat.
[root@test1 ~]# nc -zv 10.0.0.3 7777
nc: connect to 10.0.0.3 port 7777 (tcp) failed: Connection refused
Listing 8. Example of netcat output when connection to a remote host fails
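Because netcat reports success or failure through its exit status, it is easy to check several ports in one pass. The following is a minimal sketch: the function name check_ports and the NC override variable are hypothetical, and the `-w 3` option (a three-second timeout) is an addition to the `-z` option used in Listing 7.

```shell
# Report whether each listed TCP port on a host accepts connections.
# Usage: check_ports HOST PORT [PORT...]
# Set NC to use a different netcat binary; it defaults to "nc".
check_ports() {
    local host=$1
    local port
    shift
    for port in "$@"; do
        if ${NC:-nc} -z -w 3 "$host" "$port" >/dev/null 2>&1; then
            echo "$port open"
        else
            echo "$port closed-or-filtered"
        fi
    done
}
```

For example, `check_ports 10.0.0.3 22 111 7777` would test the three ports that nmap reported in Listing 3.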
These are just some of the techniques you can use to troubleshoot issues you are experiencing on the network. There are many other tools available and many other techniques that work just as well, depending on the specific needs of the environment that you are working in.
About the Author
Robert Chase is a member of the Oracle Linux product management team. He has been involved with Linux and open source software since 1996. He has worked with systems as small as embedded devices and with large supercomputer-class hardware.
Revision 1.0, 06/06/2014