by Michael Padilla
Content personalization seems simple: Certain users should be able to access certain content while others should not. The concept, however, is deceptively complex. While you may be tempted to jump in and start assigning entitlements to users based on arbitrary classifications, taking the time to properly classify users and content with an extensible model will go a long way toward meeting your personalization needs far into the future. Classifying distinct users and content is a challenge. How can you best classify users so that you can easily and accurately target both broad and finely focused audiences? How can you minimize the administrative overhead of maintaining the classification information of both users and content? To what level of granularity should the content be divided to support personalization? Finally, how can all this be implemented with BEA WebLogic Portal 8.1? Content personalization is certainly complex, but if you tackle its issues logically, it need not be complicated.
Personalization sounds like a great concept. I'm a unique user, with unique interests and needs (and perhaps even purchasing power). Wouldn't it be great if a Web site somehow knew those applicable characteristics so it could selectively filter content based on those characteristics? I could be spared from having to wade through content that is of no personal interest. The Web site publisher could selectively push content based on the intersection of site goals and the user's characteristics. Even more importantly, content that absolutely must not be exposed to certain users for security purposes can be filtered so only permissible users may view it.
Personalization is powerful and complex, so let's start with the basics. Conceptually, you have a pile of content and many users. You need to place locks on each piece of content and have custom keys created for each of those locks. Users are given a key ring with a collection of keys with which they can access content deemed applicable to them. If none of a user's keys fit a lock, that user can't access the content.
Locksmithing the locks and keys can be tricky. The content locks and keys are made of characteristics that describe the user and the content. For example, a message about a scheduled office fire alarm should be viewable only by users at that office location (123 Main Street). The lock for the message is simply Location = 123 Main Street. Users may have many attributes in their profiles, but as long as they have Location = 123 Main Street, they have the key to view the content.
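The lock-and-key check can be sketched in a few lines. This is a minimal illustration in plain Python, not the WebLogic Portal API; the function name `can_view`, the `Department` attribute, and the second address are hypothetical. It assumes a lock is a set of required attribute values and that a profile opens the lock only if it carries every one of them.

```python
def can_view(user_profile, content_lock):
    """Return True if the profile holds every attribute value the lock requires."""
    return all(
        user_profile.get(category) == required_value
        for category, required_value in content_lock.items()
    )


# Lock for the fire alarm message described above.
fire_alarm_lock = {"Location": "123 Main Street"}

# Hypothetical profiles: extra attributes do not matter, only the ones the lock names.
alice = {"Location": "123 Main Street", "Department": "Finance"}
bob = {"Location": "500 Oak Avenue", "Department": "Finance"}

print(can_view(alice, fire_alarm_lock))  # True
print(can_view(bob, fire_alarm_lock))    # False
```

Note that under this assumption a lock with no attributes is open to everyone, which is one reasonable way to model unrestricted content.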
You can further decouple users and content by applying the locks and keys to groups of users and groups of content. If several users fit the same profile, you can define a user group with a single profile that every user assigned to the group automatically inherits. Likewise, you can assign metadata to your content and define groups of content based on the metadata. For example, if you have 100 pieces of content that are all about sharks, you can define a metadata attribute "Subject = Sharks". You can then allow all users who belong to the user group "Marine Biologist" to see any content whose Subject equals Sharks. This decoupling of individual users from specific pieces of content makes highly dynamic, targeted content possible.
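The group-based approach might look like the following sketch. It is plain Python rather than WebLogic Portal code, and the user names, content titles, and the `GROUP_MEMBERS`, `GROUP_RULES`, and `visible_titles` names are all hypothetical. The point it illustrates is that no rule mentions an individual user or an individual piece of content.

```python
# Users are assigned to groups; rules are attached to groups, not to users.
GROUP_MEMBERS = {"Marine Biologist": {"dana", "eli"}}

# Each rule names a content metadata attribute and the value it must have.
GROUP_RULES = {"Marine Biologist": [("Subject", "Sharks")]}

# Content carries metadata; nothing here refers to a user or a group.
CONTENT = [
    {"title": "Great white migration", "Subject": "Sharks"},
    {"title": "Quarterly budget", "Subject": "Finance"},
]


def groups_for(user):
    """Return the names of every group the user belongs to."""
    return {group for group, members in GROUP_MEMBERS.items() if user in members}


def visible_titles(user):
    """Return titles of content matched by any rule of any of the user's groups."""
    rules = [rule for group in groups_for(user) for rule in GROUP_RULES.get(group, [])]
    return [
        item["title"]
        for item in CONTENT
        if any(item.get(attribute) == value for attribute, value in rules)
    ]


print(visible_titles("dana"))   # ['Great white migration']
print(visible_titles("frank"))  # []
```

Adding a new shark article or a new marine biologist requires only tagging the content or assigning the user to the group; the rule itself does not change.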
User characteristics describe the users, but your content targeting needs determine which characteristics must be captured in a user's profile. The personalization model reflects how the characteristics should be organized to achieve the desired personalization goals. Users can be described in innumerable ways: by age, height, sex, eye color, department, location, even shoe size. But unless there is a need to filter content based on a particular characteristic, that characteristic does not need to be captured. You need to determine the minimal set of characteristics required to definitively describe who can access a particular piece of content. Taken across your entire pile of content, these sets give you your initial pass at the characteristics that must be included in your user profiles.
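One way to picture this first pass: if each piece of content has a lock listing the characteristics it filters on, the profile needs exactly the union of those characteristics and nothing more. The sketch below assumes locks are represented as dictionaries; the function name `required_categories` and the sample locks are hypothetical.

```python
def required_categories(content_locks):
    """Return the set of profile categories that at least one lock filters on."""
    return set().union(*(lock.keys() for lock in content_locks))


# Hypothetical locks for three pieces of content.
locks = [
    {"Location": "123 Main Street"},
    {"Department": "Finance"},
    {"Location": "123 Main Street", "Department": "Facilities"},
]

print(sorted(required_categories(locks)))  # ['Department', 'Location']
```

Shoe size never appears in the result because no lock refers to it, so there is no reason to capture it.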
When defining the necessary user characteristics, you need to avoid poorly defined, blurred attributes. Blurring occurs most often when two distinct types of characteristics overlap heavily. For example, a company may have multiple business lines, each operating primarily from a unique geographic location. Acme Airlines is based in New York, Acme Cigarettes is based in Chicago, and Acme Oil is based in Dallas. While the majority of employees in a particular business line work in the same location, this is not true for all employees. In defining personalized content for Acme's global intranet, it's important to avoid the mistake of categorizing all users as only New York users, Chicago users, or Dallas users. Is it the user's geographic location that drives the content or the user's business line? For weather information it's the former, while for an oil press release, it's the latter. Consequently, each user must be classified with two sets of information: business line and geographic location.
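The Acme example can be sketched with two separate categories per profile. This is an illustration in plain Python with hypothetical user names and a hypothetical `audience` helper; the only assumption taken from the text is that weather is targeted by location and the press release by business line.

```python
def matches(profile, lock):
    """Return True if the profile carries every attribute value in the lock."""
    return all(profile.get(category) == value for category, value in lock.items())


# Business line and location are kept as independent categories.
PROFILES = {
    "pat": {"BusinessLine": "Acme Oil", "Location": "New York"},
    "sam": {"BusinessLine": "Acme Oil", "Location": "Dallas"},
    "lee": {"BusinessLine": "Acme Airlines", "Location": "New York"},
}

CONTENT_LOCKS = {
    "New York weather": {"Location": "New York"},
    "Oil press release": {"BusinessLine": "Acme Oil"},
}


def audience(title):
    """Return the sorted names of users whose profile opens the content's lock."""
    lock = CONTENT_LOCKS[title]
    return sorted(name for name, profile in PROFILES.items() if matches(profile, lock))


print(audience("New York weather"))   # ['lee', 'pat']
print(audience("Oil press release"))  # ['pat', 'sam']
```

Pat, an Acme Oil employee working in New York, receives both items. A single "Dallas user" label would have sent Pat the wrong weather or withheld the press release.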
A user's profile consists of categories containing attributes, each having a particular value based on the user. For example, a category could be Hair Color with the following attributes: a) black, b) blond, c) brown, d) red, e) none. For a company intranet, a category could be Organization with a hierarchical set of attributes including specific business lines, divisions, departments, and working groups. The attributes are the building blocks from which you can forge your locks and keys for your content. Each lock and key consists of at least one attribute, but may consist of many.
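A hierarchical category such as Organization could be modeled as a path from business line down to working group, so that a lock placed at any level also covers everything beneath it. The sketch below is one possible representation, not how WebLogic Portal stores profiles; the division and department names and the `org_matches` function are hypothetical.

```python
def org_matches(user_org, lock_org):
    """Return True if the user's organization path sits at or below lock_org."""
    return tuple(user_org[: len(lock_org)]) == tuple(lock_org)


# Path: business line, division, department (division and department are made up).
geologist_org = ("Acme Oil", "Exploration", "Geology")

print(org_matches(geologist_org, ("Acme Oil",)))                # True
print(org_matches(geologist_org, ("Acme Oil", "Exploration")))  # True
print(org_matches(geologist_org, ("Acme Oil", "Refining")))     # False
print(org_matches(geologist_org, ("Acme Airlines",)))           # False
```

With this design, content locked to a business line reaches every division and department under it, while content locked to a department stays narrow.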
Content targeting needs drive how many categories and values within each category are required for your user profiles. Reality checks are important in defining these characteristics. If you find yourself with hundreds of attributes that are apparently required to support highly targeted content, you may want to rethink your content distribution medium altogether. Do you really need all those characteristics defined across both users and content so that you can target content to an innumerable set of small groups? Keep in mind there may be significant administrative overhead in maintaining highly detailed profiles for users and correspondingly classifying content. If you get to the point where you think you may be architecting a replacement for email, it's time to take a step back and look into generalizing some of your categories and/or attributes.
Once you have made your initial attempt at defining categories and their respective attributes, you should check the quality of your categorization. Categories should be mutually exclusive, and so should the attributes within each category. If any of the attributes overlap, you may run into problems targeting content. For example, if your attributes for Hair Color were defined as a) black, b) blond, c) brown, d) red, e) none, and f) dark, you would have trouble accurately classifying both content and users. If a user has black hair, do you select black, or dark, or both? You may need to remove attributes, define a hierarchy among the attributes, or create new categories.
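One rough way to audit a draft categorization is to classify a handful of sample users, recording every attribute that seems to apply, and then flag any pair of attributes in the same category that lands on the same user. The sketch below assumes that trial-classification approach; the function name `overlapping_attributes` and the sample data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def overlapping_attributes(sample_profiles):
    """Map each category to the attribute pairs that were applied to the same user."""
    overlaps = defaultdict(set)
    for profile in sample_profiles:
        for category, attributes in profile.items():
            # Any user holding two attributes from one category exposes an overlap.
            for pair in combinations(sorted(attributes), 2):
                overlaps[category].add(pair)
    return dict(overlaps)


# Trial classification of two sample users against the draft Hair Color attributes.
samples = [
    {"Hair Color": {"black", "dark"}},  # both seemed to apply
    {"Hair Color": {"blond"}},
]

print(overlapping_attributes(samples))  # {'Hair Color': {('black', 'dark')}}
```

A category that never appears in the result produced no conflicts in the sample, which suggests, but does not prove, that its attributes are mutually exclusive.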
The granularity of content for personalization also needs to be defined. How small do you need to "chop up the content"? Do you need to limit access to an entire site, a section of a site, a page, a section on a page, or even a single link? As with classification definition, the granularity at which content is targeted comes at a price. As you work with smaller and smaller pieces of content, the number of pieces multiplies. Each piece of content comes with the administrative overhead required to properly classify it so that it may be targeted to a specific audience. Once again, if you find yourself dividing content into exceedingly small pieces, you may want to consider email as the more appropriate way to deliver your highly targeted content.
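A back-of-the-envelope calculation shows how quickly the classification workload grows with finer granularity. The counts below are invented purely for illustration (8 sections per site, 15 pages per section, and so on), and `pieces_per_level` is a hypothetical helper; substitute figures from your own site.

```python
from itertools import accumulate
from operator import mul

# (level name, how many of this level exist inside each unit of the level above)
# All counts are illustrative assumptions, not measurements.
LEVELS = [
    ("site", 1),
    ("section", 8),
    ("page", 15),
    ("page section", 5),
    ("link", 6),
]


def pieces_per_level(levels):
    """Return the total number of pieces to classify if targeting at each level."""
    running_totals = accumulate((count for _, count in levels), mul)
    return {name: total for (name, _), total in zip(levels, running_totals)}


for name, total in pieces_per_level(LEVELS).items():
    print(f"{name}: {total}")
```

Under these assumed counts, targeting at the page level means classifying 120 pieces, while targeting individual links means classifying 3,600.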