by Jonathan Knudsen
September 2002
This is the first of a series of four articles about building security into wireless Java applications.
Security is Not Absolute
Secure systems protect something valuable, like money or personal property. Secure computer applications protect valuable information. The challenge of building secure systems is finding and defending every vulnerability.
One common misconception is that either systems are secure or they're not. People often ask "Is it secure?" when they should ask "How secure is it?" In truth, a sufficient investment of money and time can break any system. There are more secure and less secure systems, but perfect security is unattainable.
The goal of secure system design is to make the cost of breaking the system exceed the value of the information being protected. The cost of breaking the system is measured in both money and personal risk. Secure systems are usually products of tradeoffs among the value of the information being protected, the cost of breaking in, and the system's usability.
An Example Application
To help you understand the challenges of building secure systems, I'll describe the concerns and vulnerabilities of a fictional application. The overall architecture is shown here:
What the application does is not really important; it could be an online store, or a bank, or a travel reservation service. The salient features of the application from a security standpoint are:
There are many ways to attack such a system:
These are only a few of the possibilities. More elaborate attacks are possible, ranging from spoofing the server (setting up a machine to look and act like the server) to social engineering (asking users for passwords or credit card numbers under the guise of providing technical support). The client device may be stolen.
Introduction to Cryptography
Cryptography is a branch of mathematics that has powerful implications for data security. The basic principle of cryptography is that some math problems are computationally expensive, which is a fancy way of saying they take a long time to solve. Cryptography relies on the use of keys. A key is a number that makes it easy to solve a math problem. By keeping the key a secret, you can build a system that protects data from people who don't have the key.
Cryptography is one tool in the belt of the security architect. It's a big tool, and an effective one, but adding cryptography to a system is useless unless every other part of the system is also made more secure. If you don't lock the door to the server room, all the cryptography in the world won't help.
Most cryptographic operations are based on fairly simple equations using very large numbers, numbers much larger than an int or a long. In the J2SE APIs, these numbers are instances of java.math.BigInteger. BigIntegers are commonly represented as byte arrays. Although BigInteger is not part of CLDC or MIDP, a pure Java implementation is available with the Bouncy Castle Cryptography APIs, which will be covered in later parts of this series.
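As a small illustration of the point about number sizes, the following J2SE sketch squares the largest possible long and converts the result to and from a byte array. The class and method names are made up for this example, and the code assumes a J2SE runtime rather than CLDC or MIDP.

```java
import java.math.BigInteger;

// Hypothetical class name; chosen only for this illustration.
class BigNumbers {
    // Squares a value; the result may be far wider than 64 bits.
    static BigInteger square(BigInteger n) {
        return n.multiply(n);
    }

    public static void main(String[] args) {
        BigInteger big = BigInteger.valueOf(Long.MAX_VALUE);
        BigInteger squared = square(big);          // no longer fits in a long

        // Byte-array form: big-endian two's complement.
        byte[] encoded = squared.toByteArray();
        BigInteger decoded = new BigInteger(encoded);

        System.out.println(squared.bitLength() + " bits in " + encoded.length + " bytes");
        System.out.println("round trip ok: " + decoded.equals(squared));
    }
}
```

The byte-array form is what usually travels over the network or gets stored in a key file.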
Ciphers and Keys
A cipher is an algorithm useful for keeping data confidential. It can translate between regular data, called plaintext, and an encrypted form of the data, called ciphertext. Essentially, the cipher is an equation that takes one number (the plaintext) and makes it into another number (the ciphertext).
Most ciphers use keys to encrypt and decrypt data. Keys are just numbers that are used in the cipher's equation. Different keys produce different ciphertexts from the same plaintext.
Ciphers provide confidentiality for data because it's extremely hard for attackers to decrypt the ciphertext without the right key, even if they know the algorithm. The example application shown above could use a cipher to encrypt credit card information on its way from the wireless client to the server. Even if an attacker intercepts this information, either from the air or using a wired packet sniffer, decrypting it will be prohibitively difficult without the right key.
There are two types of ciphers, symmetric and asymmetric. A symmetric cipher uses a single key for both encryption and decryption. Two people with the same key on opposite sides of the Internet can use a symmetric cipher to send encrypted messages to each other. Symmetric cipher keys are sometimes called secret keys or private keys. Despite their usefulness, symmetric ciphers can be tricky because both people using the cipher must have the same key. One person can generate the key, but it must be safely transmitted to the other person.
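To make the single-key idea concrete, here is a minimal sketch using the J2SE javax.crypto API, which is not part of MIDP. It assumes AES (the standardized form of Rijndael) in GCM mode with a 128-bit key. The mode, the key size, and all class and method names are choices made for this example, not something the article prescribes.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical helper class; the names are not from the article.
class SymmetricSketch {
    // Generates a random 128-bit AES key.
    static SecretKey newKey() throws GeneralSecurityException {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(128);
        return generator.generateKey();
    }

    // GCM mode needs a fresh nonce for every message encrypted under one key.
    static byte[] newNonce() {
        byte[] nonce = new byte[12];
        new SecureRandom().nextBytes(nonce);
        return nonce;
    }

    static byte[] encrypt(SecretKey key, byte[] nonce, byte[] plaintext)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, nonce));
        return cipher.doFinal(plaintext);
    }

    // The same key (and nonce) that encrypted the data is needed to decrypt it.
    static byte[] decrypt(SecretKey key, byte[] nonce, byte[] ciphertext)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, nonce));
        return cipher.doFinal(ciphertext);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        SecretKey key = newKey();
        byte[] nonce = newNonce();
        byte[] message = "made-up card number".getBytes(StandardCharsets.UTF_8);
        byte[] ciphertext = encrypt(key, nonce, message);
        byte[] recovered = decrypt(key, nonce, ciphertext);
        System.out.println(new String(recovered, StandardCharsets.UTF_8));
    }
}
```

Both sides must hold the same SecretKey, which is exactly the distribution problem described above. The nonce is not secret and can travel alongside the ciphertext.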
Asymmetric ciphers use a key pair, two keys that are related to each other. One is a public key, the other is a private key. Data encrypted using one key can be decrypted using the other key. The public key can be freely distributed without compromising security; the private key must be kept private. Imagine how paired keys might work in practice: Someone wanting to send you a secret message can encrypt it using your public key and send the ciphertext to you. Assuming you haven't let anyone steal your private key, you are the only person who can decrypt the ciphertext.
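The paired-key scenario can be sketched with the J2SE APIs as follows. The example assumes RSA with a 2048-bit key and OAEP padding. Those parameters, and the class and method names, are illustrative assumptions rather than recommendations from the article.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import javax.crypto.Cipher;

// Hypothetical helper class; the names are not from the article.
class AsymmetricSketch {
    static final String TRANSFORMATION = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    // Generates a related public/private key pair.
    static KeyPair newKeyPair() throws GeneralSecurityException {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        return generator.generateKeyPair();
    }

    // Anyone holding the public key can encrypt a (short) message.
    static byte[] encrypt(PublicKey recipient, byte[] plaintext)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.ENCRYPT_MODE, recipient);
        return cipher.doFinal(plaintext);
    }

    // Only the holder of the matching private key can decrypt it.
    static byte[] decrypt(PrivateKey recipient, byte[] ciphertext)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.DECRYPT_MODE, recipient);
        return cipher.doFinal(ciphertext);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        KeyPair you = newKeyPair();
        byte[] secret = "meet at noon".getBytes(StandardCharsets.UTF_8);
        byte[] ciphertext = encrypt(you.getPublic(), secret);
        byte[] recovered = decrypt(you.getPrivate(), ciphertext);
        System.out.println(new String(recovered, StandardCharsets.UTF_8));
    }
}
```

Only you.getPublic() ever needs to leave your machine; the sender never sees the private half of the pair.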
Asymmetric ciphers are useful for authentication, which means proving identity. Anyone sending you a message encrypted with your public key is sure that only you can decrypt the message, so long as you keep your private key hidden. You are effectively authenticated to the sender. Asymmetric ciphers work the other way around, too. If you encrypt a message with your private key, anyone decrypting it with your public key is assured that you originated the message, because only you possess your private key. Here you have authenticated yourself to the recipient. If the recipient uses your public key to decipher a message from anyone lacking your private key, the result is gibberish.
The math for asymmetric ciphers is more complicated than for symmetric ciphers, so symmetric ciphers usually run faster. Encrypting large messages using an asymmetric cipher usually takes too long. A hybrid approach is sometimes useful, where two systems use an asymmetric cipher to agree on a symmetric cipher key. They then use a symmetric cipher and this session key for the remainder of the interaction.
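One way to picture the hybrid approach is key transport: one side generates a random session key and sends it wrapped under the other side's public key. The sketch below assumes RSA with OAEP padding for the wrapping and AES for the session key. These algorithm choices and all names are hypothetical, and real protocols such as SSL add considerably more around this step.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

// Hypothetical helper class; the names are not from the article.
class HybridSketch {
    static final String WRAP_TRANSFORMATION = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    // Sender: encrypt the symmetric session key with the recipient's public key.
    static byte[] wrapSessionKey(PublicKey recipient, SecretKey sessionKey)
            throws GeneralSecurityException {
        Cipher rsa = Cipher.getInstance(WRAP_TRANSFORMATION);
        rsa.init(Cipher.WRAP_MODE, recipient);
        return rsa.wrap(sessionKey);
    }

    // Recipient: recover the session key with the private key.
    static SecretKey unwrapSessionKey(PrivateKey recipient, byte[] wrapped)
            throws GeneralSecurityException {
        Cipher rsa = Cipher.getInstance(WRAP_TRANSFORMATION);
        rsa.init(Cipher.UNWRAP_MODE, recipient);
        return (SecretKey) rsa.unwrap(wrapped, "AES", Cipher.SECRET_KEY);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        KeyPairGenerator pairs = KeyPairGenerator.getInstance("RSA");
        pairs.initialize(2048);
        KeyPair server = pairs.generateKeyPair();

        KeyGenerator keys = KeyGenerator.getInstance("AES");
        keys.init(128);
        SecretKey sessionKey = keys.generateKey();

        byte[] wrapped = wrapSessionKey(server.getPublic(), sessionKey);
        SecretKey recovered = unwrapSessionKey(server.getPrivate(), wrapped);

        // Both sides now hold the same key for the fast symmetric cipher.
        System.out.println(Arrays.equals(sessionKey.getEncoded(), recovered.getEncoded()));
    }
}
```

The slow asymmetric operation happens once per session; everything afterward uses the symmetric cipher.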
Common cipher algorithms are DES, Rijndael, Blowfish, and ElGamal. Keys are specific to cipher algorithms; if you are using a Rijndael cipher, you have to have a Rijndael key. Many algorithms can use keys of different lengths, commonly measured in bits. Longer keys are slower to use than shorter keys but the ciphertext they produce is harder to break.
Where Do Keys Come From?
Keys can be generated from random numbers. The public and private keys in a key pair are mathematically related to each other but can be generated randomly. "Random" is a dubious word in this context. Computers are surprisingly bad at finding random numbers. Most use a pseudo-random number generator (PRNG), which produces a repeatable sequence of bits. Use two PRNGs, initialized identically, and you'll get exactly the same sequence of numbers. What good is a supposedly random key if an attacker can use the same PRNG to determine its value? I won't cover this subject exhaustively, but be aware that java.util.Random won't meet your needs. See the java.security.SecureRandom documentation for more details.
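The repeatability problem is easy to demonstrate. In the sketch below, two java.util.Random instances given the same seed produce the same values, while key material is drawn from java.security.SecureRandom instead. The class and method names and the seed value are invented for this example.

```java
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.Random;

// Hypothetical class name; chosen only for this illustration.
class RandomSketch {
    // Draws n values from the given generator.
    static long[] draw(Random generator, int n) {
        long[] values = new long[n];
        for (int i = 0; i < n; i++) {
            values[i] = generator.nextLong();
        }
        return values;
    }

    // Key material should come from a cryptographically strong source.
    static byte[] keyMaterial(int lengthInBytes) {
        byte[] bytes = new byte[lengthInBytes];
        new SecureRandom().nextBytes(bytes);
        return bytes;
    }

    public static void main(String[] args) {
        // Two identically seeded PRNGs: an attacker who learns or guesses
        // the seed can reproduce every "random" value.
        long[] first = draw(new Random(42L), 4);
        long[] second = draw(new Random(42L), 4);
        System.out.println("same sequence: " + Arrays.equals(first, second));

        byte[] key = keyMaterial(16);
        System.out.println("key bytes: " + key.length);
    }
}
```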
Another way to generate keys is to use a key agreement protocol. This is a clever mathematical trick two parties can use to agree on a session key. Neither party needs prior knowledge of the other, and eavesdroppers who listen to the entire exchange will still be unable to determine the value of the session key.
The most common key agreement protocol is Diffie-Hellman. A key agreement protocol is used by SSL and TLS, as you'll see in the next article in this series.
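A Diffie-Hellman exchange can be sketched with the J2SE KeyAgreement class. In this example both parties run in one program, which a real exchange would not do; only the public keys would cross the network. The 2048-bit size and all class, method, and variable names are assumptions made for illustration.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.util.Arrays;
import javax.crypto.KeyAgreement;
import javax.crypto.interfaces.DHPublicKey;
import javax.crypto.spec.DHParameterSpec;

// Hypothetical class name; chosen only for this illustration.
class KeyAgreementSketch {
    // Combines my private key with the other party's public key.
    static byte[] agree(PrivateKey mine, PublicKey theirs) throws GeneralSecurityException {
        KeyAgreement agreement = KeyAgreement.getInstance("DH");
        agreement.init(mine);
        agreement.doPhase(theirs, true);
        return agreement.generateSecret();
    }

    public static void main(String[] args) throws GeneralSecurityException {
        // The first party picks the group parameters and a key pair.
        KeyPairGenerator clientGenerator = KeyPairGenerator.getInstance("DH");
        clientGenerator.initialize(2048);
        KeyPair client = clientGenerator.generateKeyPair();

        // The second party must use the same (public) parameters.
        DHParameterSpec params = ((DHPublicKey) client.getPublic()).getParams();
        KeyPairGenerator serverGenerator = KeyPairGenerator.getInstance("DH");
        serverGenerator.initialize(params);
        KeyPair server = serverGenerator.generateKeyPair();

        // Each side uses its own private key and the other's public key.
        byte[] clientSecret = agree(client.getPrivate(), server.getPublic());
        byte[] serverSecret = agree(server.getPrivate(), client.getPublic());
        System.out.println("secrets match: " + Arrays.equals(clientSecret, serverSecret));
    }
}
```

An eavesdropper sees the parameters and both public keys but neither private key, so cannot compute the shared secret. In practice the raw secret is usually run through a digest or key derivation step before being used as a session key.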
Message Digests and Signatures
A message digest is used to create a "fingerprint" of a piece of data. It takes an arbitrarily large message or file and mashes it down into a short, "digested" version, called the message digest value. Change just one bit of the original message and the digest value will be entirely and unpredictably different. You can use message digests to assure data integrity. When you download a file from a server, you can compute its digest value and compare it to the value computed by the server. If the two are the same, and you obtained the server's value through a channel you trust, you can be confident that the file has not been modified on its way to you.
Common message digest algorithms are SHA-1 and RIPEMD.
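Computing and comparing digest values takes only a few lines in J2SE. The sketch below uses SHA-256, a later member of the SHA family, swapped in here for SHA-1; passing "SHA-1" to MessageDigest.getInstance works the same way. The class and method names are invented for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical class name; chosen only for this illustration.
class DigestSketch {
    // Returns the SHA-256 digest value of the data.
    static byte[] digest(byte[] data) throws NoSuchAlgorithmException {
        return MessageDigest.getInstance("SHA-256").digest(data);
    }

    // Hex form is how digest values are usually published.
    static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Compares the digest of received data with the published value.
    static boolean matches(byte[] received, byte[] publishedDigest)
            throws NoSuchAlgorithmException {
        return MessageDigest.isEqual(digest(received), publishedDigest);
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        byte[] original = "pay Pablo 10 kroner".getBytes(StandardCharsets.UTF_8);
        byte[] modified = "pay Pablo 90 kroner".getBytes(StandardCharsets.UTF_8);
        byte[] published = digest(original);

        System.out.println(toHex(published));
        System.out.println("original matches: " + matches(original, published));
        System.out.println("modified matches: " + matches(modified, published));
    }
}
```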
You use your handwritten signature to guarantee the validity of checks, contracts, and other documents. Digital signatures perform the same function, more reliably, on electronic documents. Supply a message and a private key (the signing key) to a digital signature algorithm, and out pops a number that, in essence, is an encrypted message digest value. This signature is unique to your private key and the message itself.
Suppose you sign a file, then send the file and the signature to your friend. She can use your public key and the message itself to verify your signature. She uses your public key to decrypt your signature, which gives her the digest value you computed. For comparison, she then computes her own digest value for the message. If her value matches yours, she knows that she received the file exactly as you sent it. (This process is a good way to conceptualize the verification of a digital signature, but the steps may not be explicit in practice.) If an attacker intercepts the file and modifies it, the message digest values won't match. He can't simply create a new signature for the modified file because he doesn't have your private key.
Common signature algorithms are DSA and RSA.
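The sign-and-verify exchange described above might look like this with the J2SE Signature class. The sketch assumes RSA with a SHA-256 digest ("SHA256withRSA") and a 2048-bit key; those choices and every class, method, and variable name are assumptions for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

// Hypothetical class name; chosen only for this illustration.
class SignatureSketch {
    static KeyPair newKeyPair() throws GeneralSecurityException {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        return generator.generateKeyPair();
    }

    // The signer supplies the message and the private (signing) key.
    static byte[] sign(PrivateKey signingKey, byte[] message)
            throws GeneralSecurityException {
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(signingKey);
        signer.update(message);
        return signer.sign();
    }

    // The recipient needs the message, the signature, and the public key.
    static boolean verify(PublicKey signerKey, byte[] message, byte[] signature)
            throws GeneralSecurityException {
        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(signerKey);
        verifier.update(message);
        return verifier.verify(signature);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        KeyPair you = newKeyPair();
        byte[] file = "contents of the file".getBytes(StandardCharsets.UTF_8);
        byte[] signature = sign(you.getPrivate(), file);

        byte[] tampered = "contents of the fake".getBytes(StandardCharsets.UTF_8);
        System.out.println("genuine file verifies: " + verify(you.getPublic(), file, signature));
        System.out.println("tampered file verifies: " + verify(you.getPublic(), tampered, signature));
    }
}
```

Note that the digest and the private-key operation are hidden inside the Signature object, which is why the individual steps are rarely explicit in practice.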
The example application could use digital signatures to authenticate users to the server. Suppose the user's private key is stored on the device and the corresponding public key is stored on the server. The MIDlet can generate a signature of a message and send it to the server. Knowing the public key, the server can verify the signature and trust the user's identity. Note that anyone who steals the device is also stealing a private key. A runtime password challenge would make it harder for the thief to use the application.
Certificates and Key Management
All these discussions of ciphers and signatures neatly sidestep the ugly monster in cryptography's closet: key management. How do you find someone's public key? Where do you keep your private key? Suppose someone you don't know, Pablo, sends you a message with a signature. You need to get his public key to verify the signature. How do you get it? How do you know you've got Pablo's real public key and not a fake?
A cryptographic certificate offers one solution. A certificate is a container for a public key. It's an electronic document that says something like "Violet certifies that Pablo's public key has this value". The certificate would contain information about Violet, information about Pablo, and the value of Pablo's public key. The whole thing would be signed by Violet. The certificate allows for extension of trust. If you know Violet's public key, and if you think she's reliable and a good judge of character, and if you can verify the certificate signature, then you can be pretty sure that Pablo's public key has the value contained in the certificate.
You've really only shifted the problem, however. Fine, Violet's public key verifies Pablo's public key, but who verifies hers, and how do you get it? Where's the end of the chain? The answer is a self-signed certificate, a certificate asserting the value of a public key, signed using the corresponding private key. This kind of certificate is called a root certificate, and companies or institutions that use them to sign other certificates are called certificate authorities (CAs). CAs generate their own root certificates and distribute them as widely as possible. The problem with self-signed certificates is that anyone can generate one, claiming to be the U.S. Post Office or the King of Norway. Trust in root certificates is based on the fact that they are widely published, making them hard to spoof. If you have a root certificate in hand, you should be able to verify its validity by comparing its signature to the signature published on the CA's web site.
Certificates can be assembled in chains, a ladder of verification starting at the bottom and ending at a CA. For example, Pablo could send you a message and a certificate chain consisting of the following certificates:
Assuming you're sure you have the King of Norway's genuine root certificate, you can verify the whole chain of certificates, assuring you that Pablo's public key is authentic.
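The logic of walking a chain up to a trusted root can be sketched with a deliberately simplified, made-up certificate type. This is not X.509 and not the J2SE certificate API: the Cert class, its fields, and every method name below are hypothetical, and the example assumes RSA signatures with SHA-256. Each toy certificate is signed by the issuer above it, and the last one is self-signed and checked against a root key you already trust.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Arrays;
import java.util.List;

// Toy model only; real certificates carry far more information.
class ToyCertificates {
    static final class Cert {
        final String subject;
        final String issuer;
        final PublicKey subjectKey;
        final byte[] signature;

        Cert(String subject, String issuer, PublicKey subjectKey, byte[] signature) {
            this.subject = subject;
            this.issuer = issuer;
            this.subjectKey = subjectKey;
            this.signature = signature;
        }
    }

    // Feeds the signed fields (names and subject key) into a Signature object.
    private static void feed(Signature s, String subject, String issuer, PublicKey key)
            throws GeneralSecurityException {
        s.update((subject + "\n" + issuer + "\n").getBytes(StandardCharsets.UTF_8));
        s.update(key.getEncoded());
    }

    // "issuer certifies that subject's public key has this value."
    static Cert issue(String subject, PublicKey subjectKey, String issuer, PrivateKey issuerKey)
            throws GeneralSecurityException {
        Signature s = Signature.getInstance("SHA256withRSA");
        s.initSign(issuerKey);
        feed(s, subject, issuer, subjectKey);
        return new Cert(subject, issuer, subjectKey, s.sign());
    }

    static boolean signedBy(Cert cert, PublicKey issuerKey) throws GeneralSecurityException {
        Signature s = Signature.getInstance("SHA256withRSA");
        s.initVerify(issuerKey);
        feed(s, cert.subject, cert.issuer, cert.subjectKey);
        return s.verify(cert.signature);
    }

    // The chain runs from the end entity up to the root. The root is checked
    // against a key obtained out of band, never against the key it presents.
    static boolean verifyChain(List<Cert> chain, PublicKey trustedRootKey)
            throws GeneralSecurityException {
        if (chain.isEmpty()) {
            return false;
        }
        for (int i = 0; i < chain.size(); i++) {
            boolean isRoot = (i == chain.size() - 1);
            PublicKey issuerKey = isRoot ? trustedRootKey : chain.get(i + 1).subjectKey;
            if (!signedBy(chain.get(i), issuerKey)) {
                return false;
            }
        }
        return true;
    }

    static KeyPair newKeyPair() throws GeneralSecurityException {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        return generator.generateKeyPair();
    }

    public static void main(String[] args) throws GeneralSecurityException {
        KeyPair king = newKeyPair();
        KeyPair violet = newKeyPair();
        KeyPair pablo = newKeyPair();

        Cert root = issue("King of Norway", king.getPublic(), "King of Norway", king.getPrivate());
        Cert violetCert = issue("Violet", violet.getPublic(), "King of Norway", king.getPrivate());
        Cert pabloCert = issue("Pablo", pablo.getPublic(), "Violet", violet.getPrivate());

        List<Cert> chain = Arrays.asList(pabloCert, violetCert, root);
        System.out.println("chain verifies: " + verifyChain(chain, king.getPublic()));
    }
}
```

The three-link chain in main is a hypothetical arrangement of the names used in this article. A production system would rely on X.509 certificates and a tested validation library rather than hand-rolled checks like these.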
Key management is a matter of keeping track of your private key (or keys), and of managing certificates from other people and companies, including root certificates you can use to verify certificate chains.
The de facto standard for certificates is X.509v3.
The example application could use certificates in several places. First, the server needs users to authenticate themselves, to verify they are paid subscribers entitled to request services. The client device can send a message, the user's signature of the message, and a certificate chain to the server, enabling the server to authenticate the user.
On the flip side, the client may well wish to authenticate the server to make sure it's not talking to an attacker instead. The server can cooperate by following a similar strategy, sending the client a signed message and its certificate chain. Another possibility: a client that already has the server's certificate embedded in the application can verify signed messages sent by the server immediately.
In this article, you learned how you can build security into the system to balance the value of its contents with the cost of breaking it. Cryptography is a powerful tool for security. It includes ciphers for encrypting and decrypting data, signatures for assuring data integrity, and certificates for authentication. In the next part of this series, you'll learn about the Secure Sockets Layer and Transport Layer Security protocols, and how they are implemented in MIDP.
For more information about every aspect of cryptography, I heartily recommend Bruce Schneier's authoritative and exhaustive Applied Cryptography. For an introduction to cryptography in Java (slightly out of date), try my own Java Cryptography. Chapter 2 goes beyond this article with an expanded, illustrated introduction to cryptographic concepts.