Comparison of Solaris OS and Linux for Application Developers

   
By Max Bruning, June 2006  
Contents
 
Introduction
System Calls and Libraries
Conclusions
 

Many developers are writing applications to run under the Linux operating system. With the many new features of the Solaris 10 OS, and with the new emphasis Sun has placed on supporting the Solaris OS on AMD and Intel processor-based machines, developers are becoming interested in being able to develop their applications on the Solaris platform. This article examines similarities and differences in the development environments of both operating systems. Someone responsible for porting applications from Linux to the Solaris OS, or programmers with prior Linux experience that want to learn development on the Solaris OS, should benefit from this article.

In this article, the term "Solaris" refers to the Solaris 10 OS (and OpenSolaris), and "Linux" refers to Linux 2.6. Many of the details covered will also apply to earlier versions of Solaris and Linux. The Linux distribution is meant to be generic, though examples have been tested on SuSe 9.1. Also, the article concentrates on applications written using the C programming language, though C++ should behave the same. Since Java technology-based applications should not be making function calls specific to Linux or the Solaris OS, they should be portable as is.

Introduction

This article discusses similarities and differences that will be visible to application programmers and analysts on the Solaris OS and Linux. It is not meant as an exhaustive description of differences, nor is it meant to show that one OS is superior to the other. Rather, the article tries to help developers experienced in one of the OSes to work with the other OS as quickly as possible.

A simple application that is POSIX-compliant and doesn't use any system calls or library functions specific to the Solaris OS or Linux should be portable between the OSes without changes. You should be able to write your app, compile it for the Solaris OS or Linux, and simply recompile for the other OS, and it should work. Most of the system calls and library routines on both OSes fall into this category.

Many system calls in Linux exist as library functions in the Solaris OS, and vice versa. For instance, sched_setscheduler() is a system call in Linux, while in the Solaris OS it is a library function that calls the priocntl(2) system call. The priocntl(2) system call does not exist in Linux, but Linux has less need for it, since it does not support schedulers beyond time share and real time. The next section of this article groups system calls into functional sections and compares what is available in each OS.
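As a small illustration of why this distinction rarely matters to application code, the following sketch sticks to the POSIX scheduling interfaces, which should compile on either OS regardless of whether they are system calls or library wrappers underneath. The function names policy_name() and show_scheduler() are made up for this example, and on the Solaris OS the real-time library (-lrt) may be needed at link time.

```c
#include <sched.h>
#include <stdio.h>

/* Return a printable name for a POSIX scheduling policy. */
const char *
policy_name(int policy)
{
    switch (policy) {
    case SCHED_FIFO:  return "SCHED_FIFO";
    case SCHED_RR:    return "SCHED_RR";
    case SCHED_OTHER: return "SCHED_OTHER";
    default:          return "unknown";   /* OS-specific policy */
    }
}

/*
 * Print the scheduling policy of the calling process and the priority
 * range that the OS reports for that policy.
 * Returns the policy, or -1 on error.
 */
int
show_scheduler(void)
{
    int policy = sched_getscheduler(0);   /* 0 means the calling process */

    if (policy == -1) {
        perror("sched_getscheduler");
        return -1;
    }
    printf("policy %s, priority range %d..%d\n", policy_name(policy),
        sched_get_priority_min(policy), sched_get_priority_max(policy));
    return policy;
}
```

Calling show_scheduler() from main() on each OS gives a quick view of the default policy; the priority ranges reported are expected to differ between the systems.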

Most of the applications and toolkits from the Linux world will compile and run without changes. These include gcc, emacs, MySQL, perl, and many others. Precompiled binaries for many packages are available at http://www.sunfreeware.com.

A few articles are available comparing Linux and the Solaris OS, but most of them compare older versions of both. You can find them by searching for "Linux and Solaris comparison" on the web. See the Seal Rock Research White Paper (pdf) on the Solaris OS and Linux, which does cover the Solaris 10 OS and 2.6 Linux. Migrating Solaris Applications to Linux is the beginning of several pages that discuss issues in porting Solaris applications to Linux.

Various administrative differences exist between the Solaris OS and Linux, and within Linux, between different distributions. The Solaris 10 OS has introduced the "Service Management Framework" (SMF), which is a big change from previous versions of Solaris. Coverage of system administration differences will not be handled in this paper, except where it affects developers.

System Calls and Libraries

Most of the system calls and libraries that exist in Linux also exist in the Solaris OS. This section will cover system calls and library routines that are different between the two systems. The system calls and library routines are categorized as follows:

The Solaris OS keeps a list of system calls in /usr/include/sys/syscall.h. Linux maintains the same information in /usr/include/asm/unistd.h. (Note that both Linux and the Solaris OS have unistd.h and syscall.h files, and that in some cases, the files agree in content.)

Documentation for system calls is available in the Solaris OS and on Linux at /usr/share/man/man2. (The Solaris OS has a symbolic link from /usr/man to the same place.) Library routines are documented in various manual sections. See man intro.3 for an overview of the library sections on Linux and on the Solaris OS. Note that the Solaris OS breaks down the library routines more finely than Linux. For instance, aio_read() is documented at aio_read(3RT) on the Solaris OS, while on Linux, it is documented at aio_read(3). The result of this is that when compiling a program using aio_read() on the Solaris OS, one must include the real-time library via -lrt with the compilation/link command, which is not necessary on Linux.

Both Linux and the Solaris OS come with over 200 different libraries, with more than 50,000 functions defined within the libraries.

The following table lists some libraries on Linux and the Solaris OS. Note that this is not meant to be a complete listing. Also note that some of these libraries must be downloaded and installed separately from normal installation of the system.

Table 1: Some Libraries in Linux and Solaris OS

Solaris OS    Linux         Description
------------  ------------  ------------------------------------------------------------
libc          libc          The standard C library (POSIX, SysV, ANSI, etc.). See man libc on Solaris OS.
libucb        libc          UCB (University of California, Berkeley) compatibility library
libmalloc     libc          There are several different malloc libraries; the default is in libc.
libsocket     libc          Socket library (sockets are in libc on Linux).
libxnet       libc          X/Open Networking library
libresolv     libresolv     DNS routines (and on Solaris OS, inet_* routines)
libnsl        libnsl/libc   Network services library (on Linux, nis/nis+ routines)
librpc        librpc        RPC functions
libslp        libslp        Service Location Protocol
libsasl       libsasl       Simple Authentication and Security Layer
libaio        libaio        Asynchronous I/O library
libdoor       -             Door support (door_create(), door_return(), etc.)
librt         librt         POSIX Real Time library
libcfgadm     -             Configuration administration library
libcontract   -             Contract management library (see man contract.4 on Solaris OS)
libcpc        -             CPU performance counter library (on Linux, may need to install kernel module?)
libdat        -             Direct Access Transport Library (see http://www.datcollaborative.org)
libelf        libelf        ELF support library
libm          libm          Math library

The next sections take a closer look at some of the system calls and libraries. We'll concentrate on what's different between the systems.

Sockets and Networking

Most of the socket and networking code should simply need to be recompiled for the OS you are using, and the resulting executable should work. This section compares network-related system calls and library routines that are typically used on the Solaris OS and Linux.

socket()

The socket() routine, in addition to the AF_UNIX, AF_INET, and AF_INET6 domain arguments, has additional values on the Solaris OS and Linux. On the Solaris OS, the AF_NCA domain is used to specify the Network Cache and Accelerator (see nca(1)) for use with a socket. Most of the address families (domains) exist on both Linux and the Solaris OS. Note: See /usr/include/sys/socket.h on the Solaris OS and /usr/include/linux/socket.h for the possible address families. But you may need to download or write code to support some of the domains.
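Because the header files list more address families than a given system actually supports, it can help to probe at run time. The following is a minimal sketch, not taken from either OS's documentation; the helper name domain_supported() is invented here, and treating EINVAL as "unsupported" is an assumption that matches common behavior rather than a guarantee.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Return 1 if socket() accepts the domain/type pair, 0 if the system
 * reports the family or protocol as unsupported, and -1 on any other
 * error (for example, a domain that requires extra privileges).
 */
int
domain_supported(int domain, int type)
{
    int fd = socket(domain, type, 0);

    if (fd >= 0) {
        (void) close(fd);
        return 1;
    }
    if (errno == EAFNOSUPPORT || errno == EPROTONOSUPPORT ||
        errno == EINVAL)
        return 0;
    return -1;
}
```

For example, domain_supported(AF_INET6, SOCK_STREAM) can be checked before falling back to AF_INET. On the Solaris OS, link with -lsocket.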

Linux has several additional domains documented on the socket(2) man page. The additional documented domains on Linux are:

  • AF_IPX - Novell IPX protocols (may be for SuSe only?).
  • AF_NETLINK - Kernel/user interface device, allows users to access kernel modules. Note: Other ways exist to do this on the Solaris OS (and on Linux for that matter).
  • AF_X25 - X25 protocol. On the Solaris OS, this domain is included with Solstice X.25 product.
  • AF_AX25 - Amateur radio AX.25 protocol.
  • AF_ATMPVC - Permanent Virtual Circuits over ATM.
  • AF_APPLETALK - See man ddp on Linux. Also exists on the Solaris OS but not documented.
  • AF_PACKET - See man packet.7 on Linux. Raw packet interface. On the Solaris OS, open the NIC device and use getmsg(2)/putmsg(2) to receive/send raw packets using DLPI. (See Data Link Provider Interface (DLPI), Version 2 for details on DLPI).

bind()

The Linux man page (man bind.2) includes some information about different address families besides AF_INET and AF_UNIX. The Solaris man page is man bind.3socket.

listen()

On both Linux and the Solaris OS, the backlog argument (the second argument to listen()) refers to the queue length for established connections that are waiting to be accepted. The Linux man page says this, while the Solaris man page just refers to the "queue of pending connections".

accept()

Linux supports three connection-based socket types: SOCK_STREAM, SOCK_SEQPACKET, and SOCK_RDM, whereas the Solaris OS only documents SOCK_STREAM. On Linux, the socket returned by accept() does not inherit file status flags such as O_NONBLOCK and O_ASYNC from the listening socket. This may differ from other implementations.
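Flag inheritance across accept() is easy to check empirically. The sketch below assumes the flag of interest is O_NONBLOCK and uses an AF_UNIX stream socket so that it needs no network configuration; the function name accept_inherits_nonblock() and the /tmp path are choices made for this example only. Portable code should set the flags it needs on the accepted socket rather than rely on either result.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Returns 1 if a socket returned by accept() inherited O_NONBLOCK from
 * the listening socket, 0 if it did not, and -1 on error.
 */
int
accept_inherits_nonblock(void)
{
    struct sockaddr_un addr;
    struct pollfd pfd;
    int lfd, cfd, afd = -1, flags, result = -1;

    memset(&addr, 0, sizeof (addr));
    addr.sun_family = AF_UNIX;
    (void) snprintf(addr.sun_path, sizeof (addr.sun_path),
        "/tmp/accept_demo.%ld", (long) getpid());
    (void) unlink(addr.sun_path);

    lfd = socket(AF_UNIX, SOCK_STREAM, 0);
    cfd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (lfd < 0 || cfd < 0)
        goto out;
    if (bind(lfd, (struct sockaddr *) &addr, sizeof (addr)) < 0 ||
        listen(lfd, 1) < 0)
        goto out;

    /* Make only the listening socket non-blocking. */
    flags = fcntl(lfd, F_GETFL, 0);
    if (flags < 0 || fcntl(lfd, F_SETFL, flags | O_NONBLOCK) < 0)
        goto out;

    if (connect(cfd, (struct sockaddr *) &addr, sizeof (addr)) < 0)
        goto out;

    /* Wait (up to one second) until a connection can be accepted. */
    pfd.fd = lfd;
    pfd.events = POLLIN;
    if (poll(&pfd, 1, 1000) != 1)
        goto out;
    afd = accept(lfd, NULL, NULL);
    if (afd < 0)
        goto out;

    /* Inspect the file status flags of the accepted socket. */
    flags = fcntl(afd, F_GETFL, 0);
    if (flags >= 0)
        result = (flags & O_NONBLOCK) ? 1 : 0;
out:
    if (afd >= 0)
        (void) close(afd);
    if (cfd >= 0)
        (void) close(cfd);
    if (lfd >= 0)
        (void) close(lfd);
    (void) unlink(addr.sun_path);
    return result;
}
```

On a Linux 2.6 or later system this is expected to return 0. A return of 1 on another system would indicate that the accepted socket inherited the flag there.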

connect()

The Linux man page (man connect.2) documents SOCK_SEQPACKET, while the Solaris OS does not. Linux breaks the association between a connectionless socket and connect() by connecting to an address with sa_family in struct sockaddr set to AF_UNSPEC. This behavior is not documented in the Solaris OS.
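The AF_UNSPEC technique can be sketched as follows for a UDP socket. This is an illustration of the Linux behavior described above, not a statement about the Solaris OS; the function name udp_disconnect_demo() and the choice of loopback port 9 are arbitrary, and no packet is actually sent.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

/*
 * Connect a UDP socket to a peer, then dissolve the association by
 * connecting to an address whose family is AF_UNSPEC.
 * Returns 1 if the socket had a peer before and none after,
 * 0 if the association was not dissolved, and -1 on error.
 */
int
udp_disconnect_demo(void)
{
    struct sockaddr_in peer;
    struct sockaddr unspec;
    struct sockaddr_storage ss;
    socklen_t len;
    int fd, before, after;

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    memset(&peer, 0, sizeof (peer));
    peer.sin_family = AF_INET;
    peer.sin_port = htons(9);                      /* arbitrary port */
    peer.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (connect(fd, (struct sockaddr *) &peer, sizeof (peer)) < 0) {
        (void) close(fd);
        return -1;
    }
    len = sizeof (ss);
    before = (getpeername(fd, (struct sockaddr *) &ss, &len) == 0);

    /* Break the association: only sa_family matters here. */
    memset(&unspec, 0, sizeof (unspec));
    unspec.sa_family = AF_UNSPEC;
    if (connect(fd, &unspec, sizeof (unspec)) < 0) {
        (void) close(fd);
        return -1;
    }
    len = sizeof (ss);
    after = (getpeername(fd, (struct sockaddr *) &ss, &len) == 0);

    (void) close(fd);
    return (before && !after) ? 1 : 0;
}
```

Since the behavior is undocumented on the Solaris OS, code that must run on both systems may prefer to close the socket and create a new one.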

send()/recv()

As in the other socket library functions, these behave almost identically between the systems. Linux has some additional flags argument documentation on the man page.

shutdown()

No noticeable difference between the Solaris OS and Linux.

Networking Example

It can be useful to look at an application where some of the differences appear. The tcpdump program uses a packet capture library (libpcap) to read Ethernet packets at the user level. The code to read raw Ethernet is quite different between the Solaris OS and Linux. (libpcap can also be used to examine the differences with other systems, such as FreeBSD, HP-UX, and AIX.) The applicable code in libpcap is at pcap-linux.c and pcap-dlpi.c. The DLPI code is used for Solaris, HP-UX, AIX, and other operating systems. Linux provides a mechanism for reading raw packets via the standard socket calls. The Solaris OS uses the getmsg(2) and putmsg(2) calls to receive and send DLPI packets.

The following code demonstrates a way to do user-level packet capture on a network interface in the Solaris OS. This is followed by the analogous code in Linux. This code is a (very greatly) simplified extraction from the libpcap library.

#include <sys/types.h>
#include <sys/dlpi.h>
#include <sys/stream.h>
#include <stdio.h>

#include <errno.h>
#include <stropts.h>
#include <unistd.h>
#include <fcntl.h>
#include <string.h>

int
main(int argc, char *argv[])
{
    int fd;
    union DL_primitives dlp;       /* large enough for any DLPI reply */
    dl_bind_req_t bindreq;
    dl_attach_req_t attachreq;
    dl_promiscon_req_t promisconreq;
    struct strbuf ctl, data;
    int flags;
    char buffer[8192];

    if (argc < 2) {
        fprintf(stderr, "usage: %s device\n", argv[0]);
        return 1;
    }
    fd = open(argv[1], O_RDWR);  /* for instance, /dev/elxl0 */
    if (fd < 0) {
        perror(argv[1]);
        return 1;
    }

    /* attach to a specific interface */
    attachreq.dl_primitive = DL_ATTACH_REQ;
    attachreq.dl_ppa = 0;  /* assume we want /dev/xxx0 */
    ctl.maxlen = 0;
    ctl.len = sizeof(attachreq);
    ctl.buf = (char *)&attachreq;
    flags = 0;
    /* send attach req */
    putmsg(fd, &ctl, (struct strbuf *)NULL, flags);
    ctl.maxlen = sizeof(dlp);
    ctl.len = 0;
    ctl.buf = (char *)&dlp;
    /* get ok ack, may contain error */
    getmsg(fd, &ctl, (struct strbuf*)NULL, &flags);

    memset((char *)&bindreq, 0, sizeof(bindreq));
    /* the following bind might not need to be done */
    bindreq.dl_primitive = DL_BIND_REQ;
    bindreq.dl_sap = 0; 
    bindreq.dl_max_conind = 1;
    bindreq.dl_service_mode = DL_CLDLS;
    bindreq.dl_conn_mgmt = 0;
    bindreq.dl_xidtest_flg = 0;
    ctl.maxlen = 0;
    ctl.len = sizeof(bindreq);
    ctl.buf = (char *)&bindreq;
    flags = 0;
    /* send bind req */
    putmsg(fd, &ctl, (struct strbuf *)NULL, flags);
    ctl.maxlen = sizeof(dlp);
    ctl.len = 0;
    ctl.buf = (char *)&dlp;
    /* get bind ack */
    getmsg(fd, &ctl, (struct strbuf*)NULL, &flags);

    promisconreq.dl_primitive = DL_PROMISCON_REQ;
    promisconreq.dl_level = DL_PROMISC_PHYS;
    ctl.maxlen = 0;
    ctl.len = sizeof(promisconreq);
    ctl.buf = (char *)&promisconreq;
    flags = 0;
    /* send promiscuous on req */
    putmsg(fd, &ctl, (struct strbuf *)NULL, flags);
    ctl.maxlen = sizeof(dlp);
    ctl.len = 0;
    ctl.buf = (char *)&dlp;
    /* get ok ack */
    getmsg(fd, &ctl, (struct strbuf*)NULL, &flags);

    promisconreq.dl_primitive = DL_PROMISCON_REQ;
    promisconreq.dl_level = DL_PROMISC_SAP;
    ctl.maxlen = 0;
    ctl.len = sizeof(promisconreq);
    ctl.buf = (char *)&promisconreq;
    flags = 0;
    /* send promiscuous on req */
    putmsg(fd, &ctl, (struct strbuf *)NULL, flags);
    ctl.maxlen = sizeof(dlp);
    ctl.len = 0;
    ctl.buf = (char *)&dlp;
    /* get ok ack */
    getmsg(fd, &ctl, (struct strbuf*)NULL, &flags);

    /* read and echo to stdout whatever comes to us */
    while (1) {
      data.buf = buffer;
      data.maxlen = sizeof(buffer);
      data.len = 0;
      ctl.buf = (char *) &dlp;
      ctl.maxlen = sizeof(dlp);
      ctl.len = 0;
      flags = 0;
      getmsg(fd, &ctl, &data, &flags);
      write(1, "\nCTL:\n", 6);
      write(1, ctl.buf, ctl.len);
      write(1, "\nDAT:\n", 6);
      write(1, data.buf, data.len);
    }
}

The Solaris code forms DLPI requests and gets DLPI responses to tell the interface that the application wants a copy of all packets arriving at the interface.

The code in Linux is much simpler, as a socket(2) call allows one to specify raw packets. Linux does not use DLPI or STREAMS.

#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/ioctl.h>
#include <net/if.h>
#include <netinet/in.h>

#include <linux/if_ether.h>
#include <linux/if_packet.h>
#include <net/if_arp.h>
#include <stdio.h>

int
main(int argc, char *argv[])
{
    int    sock_fd = -1;
    struct sockaddr_ll    sll, from;
    struct packet_mreq    mr;
    socklen_t    fromlen;
    int        packet_len;
    char        buffer[8192];

    sock_fd = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (sock_fd < 0) {
        perror("socket");  /* packet sockets need root or CAP_NET_RAW */
        return 1;
    }

    memset(&sll, 0, sizeof(sll));
    sll.sll_family    = AF_PACKET;
    sll.sll_ifindex    = 0;
    sll.sll_protocol    = htons(ETH_P_ALL);

    bind(sock_fd, (struct sockaddr *) &sll, sizeof(sll));

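    while (1) {
      fromlen = sizeof(from);
      /* with MSG_TRUNC, the return value is the full packet length */
      packet_len = recvfrom(
        sock_fd, buffer, sizeof(buffer), MSG_TRUNC,
        (struct sockaddr *) &from, &fromlen);
      if (packet_len < 0) {
        perror("recvfrom");
        break;
      }
      if (packet_len > (int) sizeof(buffer))
        packet_len = sizeof(buffer);  /* packet was truncated */
      write(1, buffer, packet_len);
    }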
}

Process/Processor Management

A process on both the Solaris OS and Linux is a running instance of a program. In both the Solaris OS and in Linux (2.6), a process is a container for an address space and one or more threads. Every process in the system has a unique process ID (PID), which remains unique for some time after the process dies. Processes are created using fork(2) and its variants. On Linux, processes (and threads) can also be created using clone(2), but pthread_create(3) is more portable. On the Solaris OS, the undocumented lwp_create() system call is somewhat analogous to clone(2).

vfork() performs similarly on both systems. The Solaris OS has fork1() and forkall(). With fork1(), the child process has only the thread that executed the fork() call; with forkall(), all the threads that were in the parent are replicated in the child. The default fork() behaves as fork1(); forkall() must be used explicitly. forkall() does not exist in Linux (that is, Linux only supports fork1() semantics).

The ps -elfL command can be used on both the Solaris OS and Linux to see the threads in a process. Both systems report the number of LWPs and the lwpid for each thread in the process. Note that an lwpid is unique across processes in Linux. In the Solaris OS, the lwpid is unique within the process. In Linux, the process ID of a multithreaded process is actually a thread group ID. The thread group ID is equivalent to the process ID of the main thread. Sending a signal (via kill(1)/kill(2)) to any lwpid is equivalent to sending the signal to the process. In the Solaris OS, you send the signal to the pid. In both cases, if the default action is taken, the process typically exits and all threads are terminated. See the man page for ps(1) for more details.

Both Linux and the Solaris OS support the notion of binding a process or thread to a processor. Linux allows binding to a set of processors for non-exclusive use of those processors. The Solaris OS allows binding to a set of processors for exclusive use, (that is, CPU fencing), but does not allow binding to a group for non-exclusive use (except via Solaris Zones?). Linux does not have a mechanism for CPU fencing, though implementations can be found on the web (see, for example, the CPUSETS for Linux page on the bullopensource.org site). The Linux system calls that are processor affinity based are sched_setaffinity(2) and sched_getaffinity(2). The Solaris OS has the following:

  • processor_bind(2) to bind/unbind LWPs or processes to a processor
  • pset_create(2) to set up a processor set
  • pbind(1) and psrset(1), which are command-line interfaces
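On the Linux side, the affinity calls named above can be sketched as follows. The helper name pin_to_first_allowed_cpu() is invented for this example, and picking the lowest-numbered allowed CPU is just a convenient policy for demonstration. The cpu_set_t type and the CPU_* macros are GNU extensions, hence the _GNU_SOURCE definition. A Solaris counterpart would use processor_bind(2) instead and is not shown here.

```c
#define _GNU_SOURCE   /* for sched_setaffinity() and the CPU_* macros */
#include <sched.h>
#include <stdio.h>

/*
 * Restrict the calling thread to the lowest-numbered CPU it is
 * currently allowed to run on.
 * Returns that CPU number, or -1 on failure.
 */
int
pin_to_first_allowed_cpu(void)
{
    cpu_set_t allowed, target;
    int cpu;

    /* A pid of 0 refers to the calling thread. */
    if (sched_getaffinity(0, sizeof (allowed), &allowed) != 0) {
        perror("sched_getaffinity");
        return -1;
    }
    for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &allowed))
            break;
    }
    if (cpu == CPU_SETSIZE)
        return -1;

    CPU_ZERO(&target);
    CPU_SET(cpu, &target);
    if (sched_setaffinity(0, sizeof (target), &target) != 0) {
        perror("sched_setaffinity");
        return -1;
    }
    return cpu;
}
```

Note that this binding is non-exclusive: other processes may still be scheduled on the same CPU, which is the difference from a Solaris processor set described above.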

For completeness, output of the ps(1) command, first on Linux, then on the Solaris OS, is shown in the section on Threads.

On Linux and the Solaris OS, all forms of the exec system call result in calling execve(2). The Solaris OS documents all six flavors of exec(2) on the same manual page. The Linux man page exec(3) documents execv, execl, execle, execlp, and execvp. A separate page covers execve(2).

The /proc file system exists in slightly different variations on Linux and the Solaris OS. On both systems, /proc is a directory containing files whose names are the process IDs of the current active processes on the system. Each PID-named file is in turn a directory. /proc on Linux has various other directories besides processes. Most of these deal with processors, devices, and statistics on the system. On Linux, one looks in /proc to find information about processes, processors, devices, machine architecture, and so on. On the Solaris OS, the same kind of information is typically available by using a command. For instance, prtconf(1) can be used to learn about machine configuration on the Solaris OS. On Linux, this is done largely by looking at files in /proc.

The virtual address space used by processes can be examined using pmap(1) on the Solaris OS, and by catting the /proc/pid/maps file on Linux, as shown below. See pmap(1) on the Solaris OS and proc(5) on Linux for more details.

<-- on solaris, address space of this instance of bash -->
bash-3.00$ pmap -x $$  
1043:    /usr/bin/bash -i
 Address  Kbytes     RSS    Anon  Locked Mode   Mapped File
08045000      12      12       4       - rw---    [ stack ]
08050000     528     468       -       - r-x--  bash
080E3000      76      72       8       - rwx--  bash
080F6000     124     108      40       - rwx--    [ heap ]
FED8E000       4       4       -       - rwxs-    [ anon ]
FEDA0000       4       4       -       - rwx--    [ anon ]
FEDB0000     760     660       -       - r-x--  libc.so.1
FEE7E000      24      24       8       - rw---  libc.so.1
FEE84000       8       8       -       - rw---  libc.so.1
FEE90000      24       8       4       - rwx--    [ anon ]
FEEA0000     524     324       -       - r-x--  libnsl.so.1
FEF33000      20      20       4       - rw---  libnsl.so.1
FEF38000      32       -       -       - rw---  libnsl.so.1
FEF50000      44      40       -       - r-x--  libsocket.so.1
FEF6B000       4       4       -       - rw---  libsocket.so.1
FEF70000       4       4       4       - rwx--    [ anon ]
FEF80000     144     132       -       - r-x--  libcurses.so.1
FEFB4000      28      24       -       - rw---  libcurses.so.1
FEFBB000       8       -       -       - rw---  libcurses.so.1
FEFC0000       4       4       -       - r-x--  libdl.so.1
FEFC7000     140     140       -       - r-x--  ld.so.1
FEFFA000       4       4       4       - rwx--  ld.so.1
FEFFB000       8       8       4       - rwx--  ld.so.1
-------- ------- ------- ------- -------
total Kb    2528    2072      80       -
bash-3.00$ 

For the equivalent on Linux, see Figure 1. Note that Linux shows the full path name to libraries (the output has been edited to only show the library name). To get the full path names to libraries on the Solaris OS, use pldd(1).

 
Figure 1: Examining Virtual Address Space Used by Processes in Linux
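The same Linux information can also be read from inside a program, since /proc/self refers to the calling process. The following is a minimal sketch for Linux only; the function name print_own_maps() is made up for this example. The Solaris /proc files are binary structures rather than text, so this approach does not carry over directly.

```c
#include <stdio.h>
#include <string.h>

/*
 * Copy the address space mappings of the calling process to 'out'.
 * Returns the number of mappings (lines) printed, or -1 if the
 * maps file could not be opened.
 */
int
print_own_maps(FILE *out)
{
    FILE *fp = fopen("/proc/self/maps", "r");
    char line[512];
    int count = 0;

    if (fp == NULL) {
        perror("/proc/self/maps");
        return -1;
    }
    while (fgets(line, sizeof (line), fp) != NULL) {
        fputs(line, out);
        /* Count a mapping only once, even if its line was split. */
        if (strchr(line, '\n') != NULL)
            count++;
    }
    (void) fclose(fp);
    return count;
}
```

Calling print_own_maps(stdout) prints one line per mapping, with the address range, permissions, offset, device, inode, and path name.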


Threads

Linux and the Solaris OS support POSIX threads, Linux via The Native POSIX Thread Library for Linux, and the Solaris OS as part of the standard C library. See Multithreaded Programming Guide, specifically, Chapter 5 Programming with the Solaris Software, for details of threads on the Solaris OS. Also quite good is the white paper Multithreading in the Solaris Operating Environment.

In addition to POSIX threads, the Solaris OS supports "Solaris threads". The threads(5) man page describes the similarities and differences between the POSIX thread library and the Solaris thread library. The implementations are interoperable and can be used with care within the same application. The following is straight from the man page.

Similarities

Most of the functions in the libpthread and libthread libraries have a counterpart in the other corresponding library. POSIX function names, with the exception of the semaphore names, have a "pthread" prefix. Names for similar POSIX and Solaris functions have similar endings. Typically, similar POSIX and Solaris functions have the same number and use of arguments.

Differences

  • POSIX threads are more portable.
  • POSIX threads establish characteristics for each thread according to configurable attribute objects.
  • POSIX pthreads implement thread cancellation.
  • POSIX pthreads enforce scheduling algorithms.
  • POSIX pthreads allow for clean-up handlers for fork(2) calls.
  • Solaris threads can be suspended and continued.
  • Solaris threads implement interprocess robust mutex locks.
  • Solaris threads implement daemon threads, for whose demise the process does not wait.

The following is a very simple MT program. Very few differences are found in the ways in which multithreaded applications work between the two OSes. Of course, the underlying implementations have several differences.

#include <pthread.h>
#include <stdio.h>

void *fcn(void *);

int
main(int argc, char *argv[])
{
    pthread_t tid;

    pthread_create(&tid, NULL, fcn, NULL);
    /* pthread_t is opaque; cast it before printing */
    (void) printf("main thread id = %lx\n", (unsigned long) pthread_self());
    pthread_join(tid, NULL);
    return 0;
}

void *
fcn(void *arg)
{
    printf("new thread id = %lx\n", (unsigned long) pthread_self());
    return NULL;
}

Use the following to compile and run the program on the Solaris platform:

bash-3.00$ cc simplepthread.c -o simplepthread
bash-3.00$ ./simplepthread
main thread id = 1
new thread id = 2
bash-3.00$ 

Using gcc on the Solaris platform gives the same results. On Linux it appears thus:

max@linux:~/source> cc simplepthread.c
/tmp/cc8u7kZs.o(.text+0x1e): In function `main':
simplepthread.c: undefined reference to `pthread_create'
/tmp/cc8u7kZs.o(.text+0x4a):simplepthread.c: undefined reference
       to `pthread_join'
collect2: ld returned 1 exit status
max@linux:~/source> cc simplepthread.c -lpthread -o simplepthread
max@linux:~/source> ./simplepthread
main thread id = 4015c6c0
new thread id = 4035cbb0
max@linux:~/source> 

On Linux, the POSIX thread library needs to be explicitly linked. Note that Solaris 9 and earlier versions also require this. In the Solaris 10 OS, POSIX threads are in the standard C library (libc.so). Note also that the Solaris OS assigns thread IDs using a monotonically increasing integer starting at 1. Linux uses the user virtual address of the pthread structure (a structure used internally by the thread library).

Visibility to threads is provided on both systems by the ps(1) command, and via the /proc file system. See Figure 2 for the output of the ps(1) command on the Solaris platform and Figure 3 for the output on Linux. You'll see that, given the same options, the output is very similar between the machines.

 
Figure 2: Output of ps(1) Command on Solaris Platform


 
Figure 3: Output of ps(1) Command on Linux


The command shows state, user, PID, parent PID, LWP ID, number of LWPs (for user processes, this is the number of threads), scheduling class, scheduling priority, user virtual size, wait channel, start time, tty, time spent running, and command. Linux does not report ADDR, and the Solaris OS shows the (kernel) virtual address of the proc_t data structure, which the kernel uses to maintain the process. Linux shows WCHAN as a symbol, while the Solaris OS shows it as an address. In the Solaris OS, the WCHAN column is the address of a synchronization variable on which the thread is blocked. On Linux, WCHAN is the routine in which the thread is sleeping. To get the equivalent information in the Solaris OS, use ::threadlist -v inside of mdb -k.

Note that on a machine running a 64-bit kernel (that is, SPARC or AMD64 architecture based), the ADDR and WCHAN fields will display a question mark (?). To see the values for these two fields, use ps -e -o addr,wchan,comm.

More likely, you are interested in what the application threads are doing. For this, use pstack(1) on the process ID of interest. There is a pstack on Linux, but it must be downloaded. Search for it on http://rpmfind.net/linux/RPM/. Note that it only gives the stack backtrace of one thread (the thread ID that is passed to it as an argument). If you want a backtrace of all threads within a process, you need to pass the thread IDs as separate arguments.

 <-- get user-level stack(s) of a process on Solaris -->
bash-3.00$ pstack `pgrep mozilla-bin` 
21528: /usr/sfw/bin/../lib/mozilla/mozilla-bin -UILocale en-US
-----------------  lwp# 1 / thread# 1  --------------------
 fef68967 pollsys  (896dac8, 9, 0, 0)
 fef2b2aa poll     (896dac8, 9, ffffffff) + 52
 fe793242 g_main_context_iterate () + 39d
-----------------  lwp# 2 / thread# 2  --------------------
 fef68967 pollsys  (fbf5bd04, 1, 0, 0)
 fef2b2aa poll     (fbf5bd04, 1, ffffffff) + 52
 fede047d _pr_poll_with_poll (816fa0c, 1, ffffffff, fbf5bf64,
                                 fc0558aa, 816fa0c) + 2d5
 fede05f1 PR_Poll  (816fa0c, 1, ffffffff) + 11
 fc0558aa __1cYnsSocketTransportServiceEPoll6M_i_ (816f6b8) + 58
 fc055f7d __1cYnsSocketTransportServiceDRun6M_I_ (816f6b8) + 18f
 fc3d1262 __1cInsThreadEMain6Fpv_v_ (816eb60) + 32
 fede1693 _pt_root (816fcc0) + 9e
 fef67b30 _thr_setup (feec2400) + 51
 fef67f40 _lwp_start (feec2400, 0, 0, 0, 0, 0)
-----------------  lwp# 4 / thread# 4  --------------------
 fef67f7b lwp_park (0, fa87deb8, 0)
 fef620bb cond_wait_queue (825cfec, 816b8d0, fa87deb8, 0) + 3e
 fef62462 cond_wait_common (825cfec, 816b8d0, fa87deb8) + 1e9
 fef62691 _cond_timedwait (825cfec, 816b8d0, fa87df38) + 4a
 fef62722 cond_timedwait (825cfec, 816b8d0, fa87df38) + 27
 fef62761 pthread_cond_timedwait (825cfec, 816b8d0,
                                    fa87df38) + 21
 feddc598 pt_TimedWait (825cfec, 816b8d0, f1c) + b8
 feddc767 PR_WaitCondVar (825cfe8, f1c) + 64
 fc3d417e __1cLTimerThreadDRun6M_I_ (81e5108) + 16e
 fc3d1262 __1cInsThreadEMain6Fpv_v_ (820d690) + 32
 fede1693 _pt_root (820e6b0) + 9e
 fef67b30 _thr_setup (fb520400) + 51
 fef67f40 _lwp_start (fb520400, 0, 0, 0, 0, 0)
bash-3.00$ 

Here is an equivalent on Linux. It is interesting that programs like Mozilla and xemacs are stripped on Linux and not stripped on the Solaris OS.

max@linux:~> cd /proc/`pgrep mozilla`/task
max@linux:/proc/3991/task> pstack *

3991: /opt/mozilla/lib/mozilla-bin
(No symbols found)
0xffffe410: ???? (8803488, 8, ffffffff, 8803488, 9, 400fbea0) + 40
0x404b0a6d: ???? (8129258, 4035236c, 57f, 4011e4e6, 4048de14,
                   403513c4) + 20
0x404b0d07: ???? (814b898, 814b898, 0, 0, 415a8f64, 814b898) + 30
0x401dc11f: ???? (8106350, bfffee80, bfffede8, 807673e, 8084cf4, 0)
0x415c4006: ???? (8106350, 0)
0x414fbae4: ???? (8105ee8, 0, 8079c2c, bfffee90, 80a67b8,
                      40ad841c) + 1f0
0x08059b7c: ???? (80e7f08, bffff058, 40017068, 14, 4081ccf8,
                       1) + 90
0x08055a47: ???? (1, bffff134, bffff13c, 4081ccf8, 406eebd0,
                     400168c0) + 40
0x405f2500: ???? (8055840, 1, bffff134, 80557b0, 8055740, 
                     4000d330) + 40000ed8

4001: /opt/mozilla/lib/mozilla-bin
(No symbols found)
0xffffe410: ???? (413eb7f0, 1, ffffffff, 18, 413eb7f8, 0) + 230
0x400c7439: ???? (818911c, 1, ffffffff, 40c5a0a8, ffffffff,
                   8188dec)
0x40bc8a52: ???? (8188dc8, 8188df4, 1, 8188dec, 8188f7c, 1) + 10
0x40bc8bcb: ???? (8188dc8, 413ebbb0, 40102ce0, 400d5238,
                   8189478, 0)
0x40a8da6b: ???? (81893f8, 8189478, 4000ca40, 40102be8, 0, 0)
0x400cb7a6: ???? (8189478, 413ebac4, 0, 0, 0, 0) + 54
0x400fa9dd: ???? (413ebbb0, 0, 0, 0, 0, 0) + bec144d4

4004: /opt/mozilla/lib/mozilla-bin
(No symbols found)
0xffffe410: ???? (40656756, 400d5238, 81ed160, 81ed2d0, 41ffba08,
                   400c5721) + 170fd55
crawl: Input/output error
Error tracing through process 4004
0x1afcdbf8: ????max@linux:/proc/3991/task> 

Solaris threads are given a default user stack size of 1MB. For Linux, the default stack size is 2MB (SuSe 9.1).
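Rather than relying on the figures above, a program can ask the thread library what a default attribute object contains. The sketch below uses only POSIX calls; the function name default_thread_stack_size() is invented here. As an assumption worth checking on each system: an implementation may report 0 to mean "use the system default" instead of reporting the actual size, so a result of 0 should not be read as an error.

```c
#include <pthread.h>
#include <stdio.h>

/*
 * Return the stack size recorded in a freshly initialized thread
 * attribute object. A value of 0 means either that the call failed
 * or that the implementation uses 0 to stand for its default.
 */
size_t
default_thread_stack_size(void)
{
    pthread_attr_t attr;
    size_t size = 0;

    if (pthread_attr_init(&attr) != 0)
        return 0;
    if (pthread_attr_getstacksize(&attr, &size) != 0)
        size = 0;
    (void) pthread_attr_destroy(&attr);
    return size;
}
```

To request a specific size portably instead of depending on the default, call pthread_attr_setstacksize() on the attribute object before passing it to pthread_create().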

Synchronization

Both OSes support POSIX synchronization mechanisms, i.e., mutexes, condition variables, reader/writer locks, semaphores, and barriers. The underlying mechanisms rely on mutexes. In Solaris, user-level mutexes are implemented using "adaptive" spin locks. On Linux, the mechanism is the "futex", or fast user level mutex. Both mechanisms avoid going into the kernel in the non-contention case, and should give comparable performance and behavior.

The Solaris user-level adaptive spin mutexes are described in Multithreading in the Solaris Operating Environment (pdf). Linux futexes are described in Futexes Are Tricky (pdf).

The Solaris OS mechanisms lwp_park() and lwp_unpark(), and Linux mechanisms futex_up() and futex_down(), can be used by applications. However, I have not found any source code examples. It is probably best to stick with the POSIX APIs. If you want to compare relative speeds of the POSIX locking mechanisms (as well as performance of various other library routines and system calls), I recommend getting a copy of the libmicro micro benchmark and trying it out on both the Solaris OS and Linux. (You can download libmicro from the OpenSolaris site.) Be aware that the upcoming Solaris 11 release (the latest build available through OpenSolaris and Solaris Express, code named Nevada) is a debug build, which will have an effect on any performance numbers you see.
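Sticking with the POSIX APIs, the following sketch shows a mutex protecting a shared counter. It is meant to compile unchanged on both OSes (with -lpthread where the thread library is separate). The names worker() and run_counter_demo(), and the thread and iteration counts, are arbitrary choices for this example. It is not a benchmark.

```c
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4
#define NITER    100000

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long counter;

/* Each worker increments the shared counter NITER times. */
static void *
worker(void *arg)
{
    int i;

    (void) arg;
    for (i = 0; i < NITER; i++) {
        pthread_mutex_lock(&counter_lock);
        counter++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/*
 * Start NTHREADS workers, wait for all of them, and return the final
 * counter value. With correct locking this is NTHREADS * NITER.
 */
long
run_counter_demo(void)
{
    pthread_t tids[NTHREADS];
    int i, started = 0;

    counter = 0;
    for (i = 0; i < NTHREADS; i++) {
        if (pthread_create(&tids[i], NULL, worker, NULL) != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++)
        (void) pthread_join(tids[i], NULL);
    return counter;
}
```

In the uncontended case, neither implementation should need to enter the kernel for the lock and unlock calls, which is the property the adaptive mutex and futex designs share.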

Memory Management

Without describing differences in the kernels' handling of memory, we can say that at user level several different memory allocation (malloc) libraries exist, most of which are available (or can be built) for either OS. A comparison of some of the user-level memory allocators can be found in the Sun Developer Network article A Comparison of Memory Allocators in Multiprocessors. "A Memory Allocator" at http://gee.cs.oswego.edu/dl/html/malloc.html contains a (dated) description of a memory allocator used on Linux. More comments can be found in the source code.

Timers

At application level, the Solaris OS and Linux both offer POSIX timer routines, including timer_create(), timer_delete(), and nanosleep(). The Solaris OS has an additional timer, CLOCK_HIGHRES, that attempts to use an optimal hardware source, and may give close to nanosecond resolution. A CLOCK_HIGH_RES timer may give similar resolution on Linux, but needs to be installed as a kernel patch (see the home page of the high resolution timers project at http://high-res-timers.sourceforge.net/ for details).

The following is example code that uses the CLOCK_HIGHRES timer to fire at user-specified intervals for a user-specified duration. The interval is specified in nanoseconds, and the duration in seconds. When the program completes, it prints the number of times the timer fired, and the number of times the timer was "overrun". The "overrun" value is a count of the timer expirations that occurred between the time a timer fired (causing a signal to be generated) and the time the signal was handled (see timer_getoverrun(3RT)). Running the program real-time with too short an interval may cause the system to hard hang.

#include <pthread.h>
#include <sys/types.h>
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <time.h>

#include <errno.h>

#define DURATION 120    /* default time to run in seconds */

  /* default .5 seconds in nanosecs */
#define INTERVAL (1000*1000*500)

void* timer_fcn(void* arg);
void* signaler_thd(void* arg);

/* Program globals (errno itself is declared by <errno.h>) */
int duration = DURATION;
int interval = INTERVAL;

int
main(int argc, char *argv[]) 
{
   sigset_t mask;
   pthread_t wtid = 0;
   pthread_t stid = 0;
   int rval;
   int n;

   if (argc >=2) {
       errno = 0;
       if (argc == 2)
         duration = strtol(argv[1], NULL, 0);
       else if (argc == 3) {
         interval = strtol(argv[1], NULL, 0);
         duration = strtol(argv[2], NULL, 0);
       }
       if (errno || argc > 3 || interval <= 0
          || duration <= 0) {
           fprintf(stderr, "Usage: %s [[interval] duration]\n",
                  argv[0]);
           fprintf(stderr, "interval nsecs, duration seconds\n");
           exit(1);
       }
   }
     
   /* block SIGALRM and SIGUSR1; the signaler thread collects them */
   sigemptyset(&mask);
   sigaddset(&mask, SIGALRM);
   sigaddset(&mask, SIGUSR1);
   rval = pthread_sigmask(SIG_BLOCK, &mask, NULL);
   if(rval != 0) {
      printf("%s: pthread_sigmask failed, errno = %d.\n",
             argv[0], rval);
      exit(1);
   }

   rval = pthread_create(&wtid, NULL, timer_fcn, NULL);
   if (rval != 0) {  /* Waiter create call failed */
    /* pthread_create returns the error number; errno is not set */
    printf ("Waiter create call failed: %d.\n", rval);
    exit (1);
   }


   /* Do signaler thread */
   rval = pthread_create(&stid, NULL, signaler_thd, &mask);
   if (rval != 0) {  /* Signaler call create failed */
    printf ("Signaler call create failed: %d.\n", rval);
    exit (1);
   }

   /* Wait for waiter and signaler to finish */    
   rval = pthread_join(stid, NULL);
   if (rval != 0) {  /* Signaler call join failed */
    printf ("Signaler call join failed: %d.\n", rval);
    exit (1);
   }

   rval = pthread_join(wtid, NULL);
   if (rval != 0) {  /* Waiter call join failed */
    printf ("Waiter call join failed: %d.\n", rval);
    exit (1);
   }

   printf("done\n");
   exit(0);
}

pthread_mutex_t mp = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cv = PTHREAD_COND_INITIALIZER;
int time_expired = 0;
int timerentered;
int timeroverrun;
timer_t itimerid;

void *
timer_fcn(void *arg)
{
  struct itimerspec value;
  struct sigevent event;

  value.it_interval.tv_sec = 0;
  value.it_interval.tv_nsec = interval;  /* nsec intervals  */
  value.it_value.tv_sec = 1;  /* starting in 1 second */
  value.it_value.tv_nsec = 0;  /* plus 0 nanosecs */

  event.sigev_notify = SIGEV_SIGNAL;
  event.sigev_signo = SIGALRM;
  event.sigev_value.sival_int = 0;
  

  if (timer_create(CLOCK_HIGHRES, &event,
     &itimerid) == -1) {
       perror("timer_create failed");
       exit(1);
  }

  /* the second arg can be set to TIMER_ABSTIME; with 0, the time
     value is relative to when the call is made */
  if (timer_settime(itimerid, 0, &value, NULL) == -1) {
    perror("timer_settime failed");
    exit(1);
  }

  pthread_mutex_lock(&mp);
  while (time_expired == 0)
    pthread_cond_wait(&cv, &mp);
  printf("timerentered = %d\n", timerentered);
  printf("timeroverrun = %d\n", timeroverrun);
  pthread_mutex_unlock(&mp);
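  /* exit() ends the whole process, including the signaler thread,
     so main() never reaches its "done" message */
  exit(0);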
}

int timerset;

void *
signaler_thd(void *arg)
{
    int signo;
    
    while (1) {
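      /* one-argument Solaris form of sigwait(); the POSIX form is
         sigwait(set, &signo) and returns an error number */
      signo = sigwait(arg);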
      if (signo == SIGALRM) {
       if (!timerset) {
        struct itimerspec value;
        struct sigevent event;

        timer_t endtimerid;

        ++timerset;
        value.it_interval.tv_sec = 0;
        value.it_interval.tv_nsec = 0;
        value.it_value.tv_sec = duration; /*wait duration secs*/
        value.it_value.tv_nsec = 0;  /* plus 0 nanosecs */

        event.sigev_notify = SIGEV_SIGNAL;
        event.sigev_signo = SIGUSR1;
        event.sigev_value.sival_int = 0;
  

        if (timer_create(CLOCK_HIGHRES, &event,
         &endtimerid) == -1) {
           perror("timer_create failed");
           exit(1);
        }

        /* the second arg can be set to TIMER_ABSTIME */
        if (timer_settime(endtimerid, 0, &value, NULL)
          == -1) {
          perror("timer_settime failed");
          exit(1);
        }
       } else {  /* if (!timerset) */
        ++timerentered;
        timeroverrun += timer_getoverrun(itimerid);
       }
      } else {  /* SIGUSR1 */

       struct itimerspec value;

       /* cancel the interval timer */
       value.it_interval.tv_sec = 0;
       value.it_interval.tv_nsec = 0;
       /* setting the following to 0 should stop the timer */
       value.it_value.tv_sec = 0;
       value.it_value.tv_nsec = 0;
  
       pthread_mutex_lock(&mp);
       if (timer_settime(itimerid, 0, &value, NULL) == -1) {
        perror("timer_settime failed");
        exit(1);
       }

       ++time_expired;
       pthread_cond_signal(&cv);
       pthread_mutex_unlock(&mp);
      }
   }
}

And here are some examples of running the compiled code.

  <-- realtime library and best optimization -->
bash-3.00$ cc timerex1.c -lrt -o timerex1 -O -fast
bash-3.00$ ./timerex1  <-- only root can use high res timer
timer_create failed: Not owner
bash-3.00$ su
Password: 
  <-- default interval is .5 seconds, duration is 120 seconds -->
# ./timerex1  
timerentered = 240  <-- timer fired every .5 seconds
timeroverrun = 0
# ./timerex1 1000000 10  <-- interval is 1 msec for 10 secs
timerentered = 9912
timeroverrun = 88
# priocntl -e -c RT ./timerex1 1000000 10  <-- run it real time
timerentered = 10000  <-- timer fired once each msec for 10 secs
timeroverrun = 0
# ./timerex1 100000 10  <-- interval is 100 usecs for 10 seconds
timerentered = 99615  <-- we missed a few
timeroverrun = 386
# priocntl -e -c RT ./timerex1 100000 10  <-- try real time 
timerentered = 99871  <-- almost 1 every 100 microseconds
timeroverrun = 129
# ./timerex1 10000 10  <-- interval is 10 microseconds
timerentered = 485905  <-- here we miss over half
timeroverrun = 514125  <-- (sig handler takes > 10 usecs?)
 <-- using RT 1 usec interval causes hang on my machine -->

# priocntl -e -c RT ./timerex1 1000 10 

IPC

Both the Solaris OS and Linux support System V IPC (shared memory, message queues, and semaphores). Both systems also support pipes and the real-time shared memory operations (shm_open(), shm_unlink(), and so on). Both systems support the tmpfs file system (using memory and swap space for files). The Solaris OS places /tmp, /var/run, and /etc/svc/volatile in tmpfs. Linux uses /dev/shm. Both systems allow other mount points to be added.

Here are the steps for using tmpfs on the Solaris OS; steps for Linux are shown below. Note that "swap" on the Solaris OS uses memory as well as disk (if needed). In other words, files created in /tmp are stored in memory. If memory gets full, the pageout daemon may write data from /tmp to swap space on disk.

# mkdir /foo
<-- create a tmpfs file system using swap on /foo
# mount -F tmpfs swap /foo  
# df -h /foo
Filesystem         size   used  avail capacity  Mounted on
swap           652M     0K   652M     0%    /foo
# df -h /tmp
Filesystem         size   used  avail capacity  Mounted on
swap           652M    52K   652M     1%    /tmp
# 

And here are the analogous steps on Linux.

linux:/home/max # mkdir /foo
 <-- tmpfs also uses swap space and memory -->
linux:/home/max # mount tmpfs /foo -t tmpfs 
linux:/home/max # df -h /foo
Filesystem        Size  Used Avail Use% Mounted on
tmpfs         248M     0  248M   0% /foo
linux:/home/max # df -h /dev/shm
Filesystem        Size  Used Avail Use% Mounted on
tmpfs         248M   16K  248M   1% /dev/shm
linux:/home/max # 
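To go with the System V IPC support mentioned at the start of this section, here is a minimal sketch of the shared memory calls, which have the same interface on both OSes. It creates a private segment, writes a string through one attachment, reads it back through a second, and removes the segment. The names SEG_SIZE and shm_roundtrip() are invented for this illustration. A real application would normally share the segment between processes, which this sketch does not do.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

#define SEG_SIZE 4096   /* illustrative segment size in bytes */

/*
 * Copy msg into a new System V shared memory segment, read it back
 * through a second attachment, and remove the segment.
 * Returns 0 on success (out holds the string), -1 on failure.
 */
int
shm_roundtrip(const char *msg, char *out, size_t outlen)
{
    int id;
    char *p;

    if (outlen == 0 || strlen(msg) >= SEG_SIZE)
        return -1;

    id = shmget(IPC_PRIVATE, SEG_SIZE, IPC_CREAT | 0600);
    if (id == -1)
        return -1;

    /* first attachment: write the message */
    p = shmat(id, NULL, 0);
    if (p == (void *)-1) {
        shmctl(id, IPC_RMID, NULL);
        return -1;
    }
    strcpy(p, msg);
    shmdt(p);

    /* second attachment: the data is still in the segment */
    p = shmat(id, NULL, 0);
    if (p == (void *)-1) {
        shmctl(id, IPC_RMID, NULL);
        return -1;
    }
    strncpy(out, p, outlen - 1);
    out[outlen - 1] = '\0';
    shmdt(p);

    /* mark the segment for removal so it does not outlive the test */
    shmctl(id, IPC_RMID, NULL);
    return 0;
}
```

While a segment exists, it can be listed with ipcs(1) on either system.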

It might be interesting to run the libmicro benchmarks mentioned earlier in the article to get some idea of relative performance between the systems.

Signal Handling

The Solaris OS and Linux treat signals similarly. Some signals exist in the Solaris OS and not in Linux, and vice versa. Also, some of the signals common to both use different signal numbers. Both OSes recommend using sigaction(2) over signal() to catch signals, and the use of sigwait() to handle asynchronous signals in multithreaded applications.

The sigwait(3) manual page on Linux has a BUGS section: Linux signal handling differs from the POSIX standard. POSIX states that an asynchronously delivered signal (a signal sent externally to the process) is handled by any thread that does not have the signal currently blocked. In Linux, asynchronous signals may be sent to specific threads (signals can be sent to the thread ID via kill(1)). The Solaris OS implements the POSIX standard for this: kill(1) sends a signal to the process, and there is no way to send a signal to a specific thread from outside the process.

Some of the differences are described in "Building Applications with the Linux Standard Base" at http://lsbbook.gforge.freestandards.org/sig-handling.html. Note that this page may not be entirely accurate. For instance, the page says that Linux sets SIGBUS to SIGUNUSED because there is no "bus error" in Linux. However, the Linux man page for mmap(2) documents receiving SIGBUS when accessing a memory range that does not correspond to a valid location in the file that mmap was used with. (The Solaris OS does the same.)

On both the Solaris OS and Linux, signals are handled when a non-held, non-ignored signal is found pending for a thread returning from kernel to user mode. On both systems, SIGKILL and SIGSTOP take priority over other signals. Otherwise, on Solaris signals are handled in an undocumented order (lowest signal number first). On Linux, signals are handled in the order they are delivered (again, excepting SIGKILL and SIGSTOP).

On the Solaris OS, to see the signal settings for a running process, use psig.

bash-3.00$ psig $$  <-- signal disp for current shell
954:    /usr/bin/bash -i
HUP     caught  termination_unwind_protect      0       HUP,INT,
  ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
  VTALRM,XCPU,XFSZ,LOST
INT     caught  sigint_sighandler       0
QUIT    ignored
ILL     caught  termination_unwind_protect      0       HUP,INT,
  ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
  VTALRM,XCPU,XFSZ,LOST
TRAP    caught  termination_unwind_protect      0       HUP,INT,
  ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
  VTALRM,XCPU,XFSZ,LOST
ABRT    caught  termination_unwind_protect      0       HUP,INT,
  ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
  VTALRM,XCPU,XFSZ,LOST
EMT     caught  termination_unwind_protect      0       HUP,INT,
  ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
  VTALRM,XCPU,XFSZ,LOST
FPE     caught  termination_unwind_protect      0       HUP,INT,
  ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
  VTALRM,XCPU,XFSZ,LOST
KILL    default
BUS     caught  termination_unwind_protect      0       HUP,INT,
  ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
  VTALRM,XCPU,XFSZ,LOST
SEGV    caught  termination_unwind_protect      0       HUP,INT,
  ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
  VTALRM,XCPU,XFSZ,LOST
SYS     caught  termination_unwind_protect      0       HUP,INT,
  ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
  VTALRM,XCPU,XFSZ,LOST
PIPE    caught  termination_unwind_protect      0       HUP,INT,
  ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
  VTALRM,XCPU,XFSZ,LOST
ALRM    caught  termination_unwind_protect      0       HUP,INT,
  ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
  VTALRM,XCPU,XFSZ,LOST
TERM    ignored
USR1    caught  termination_unwind_protect      0       HUP,INT,
  ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
  VTALRM,XCPU,XFSZ,LOST
USR2    caught  termination_unwind_protect      0       HUP,INT,
  ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
  VTALRM,XCPU,XFSZ,LOST
CLD     blocked,caught  0x807d4d7       0
PWR     default
WINCH   caught  0x807e182   0  <-- not all syms are present
URG     default
POLL    default
STOP    default
TSTP    ignored
CONT    default
TTIN    ignored
TTOU    ignored
VTALRM  caught  termination_unwind_protect      0       HUP,INT,
  ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
  VTALRM,XCPU,XFSZ,LOST
PROF    default
XCPU    caught  termination_unwind_protect      0       HUP,INT,
  ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
  VTALRM,XCPU,XFSZ,LOST
XFSZ    caught  termination_unwind_protect      0       HUP,INT,
  ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
  VTALRM,XCPU,XFSZ,LOST
WAITING default
LWP     default
FREEZE  default
THAW    default
CANCEL  default
LOST    caught  termination_unwind_protect      0       HUP,INT,
  ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
  VTALRM,XCPU,XFSZ,LOST
XRES    default
JVM1    default
JVM2    default
RTMIN   default
RTMIN+1 default
RTMIN+2 default
RTMIN+3 default
RTMAX-3 default
RTMAX-2 default
RTMAX-1 default
RTMAX   default
bash-3.00$ 

As far as I can tell, there is no easy way to do this in Linux, but someone has probably implemented a kernel patch or module to give you the information. Certainly it should be doable with User Mode Linux.

Conclusions

Generally, if you are developing a POSIX-compliant application on Linux or the Solaris OS, the application should port to the other OS simply by recompilation. Of course, many applications will have parts that are not addressed by POSIX. For instance, device ioctl(2) handling tends to be OS (and, of course, device) specific.

Getting documentation for the Solaris OS is reasonably straightforward, since most of the documentation is at http://docs.sun.com. Getting documentation for Linux is sometimes simple (search on the web), and sometimes not so simple. You'll find that Linux typically offers multiple ways to do the same thing (different implementations of threads, for example). My impression is that much of the Linux documentation is in the source code itself. This is fine if you have access to all the source code. You do have access to all of the source code, but it is not all in one place. In fact, it seems scattered all over the place. Sun's source is currently available all in one place (http://www.opensolaris.org), but not all of it is there. I expect that over time, developers will add software to OpenSolaris that may not be available in the OpenSolaris source tree.

This article touched on some of the visibility tools available on the two systems, but did not get into much detail. Prior to Sun's coming out with OpenSolaris, Linux advocates could always point to the source as a differentiator when it came to visibility as to how things work. Now, with OpenSolaris and tools such as DTrace, Linux will have to play catch-up. And at the rate of change of Linux, I'm sure it won't take long. I'm looking forward to both systems benefiting from each other's good features, and learning from each other's mistakes.
