Disclaimer and License

Opinions expressed here by Tim Tripcony are his own and not representative of his employer.

Creative Commons License
Tip of the Iceberg is licensed under a Creative Commons Attribution 3.0 Unported License.
Based on a work at timtripcony.com.

Unless otherwise explicitly specified, all code samples and downloads are copyright Tim Tripcony and licensed under Apache License 2.0.



Rolling your own authentication for REST API's, part 4 - Welcome to the club

Category xpages

Before we can allow users to log in, we must first, of course, have users. Since we're using a custom authentication scheme, we obviously need a custom registration mechanism as well.


For the purposes of this authentication approach, a user consists only of a document within our application that stores a single item value: the user's password. This, again, may seem to be unconventional if not heretical. Remember, however, that no end user will have direct access to any data. As long as our API does not programmatically expose a user's password, our system will remain secure. But you may also have noticed that we're not storing the user's ID in the document that stores their password... there's no need, as you'll soon see.


Earlier we discussed an option built into Domino for obtaining an MD2 hash of any string value. Because this implementation is specific to Domino, it is actually a variation on the standard algorithm, and will not return the same encoded result that would be returned by other implementations. Furthermore, the token format used in the authentication technique this article proposes relies upon the use of MD5 hashes; not only is there no implementation of this algorithm built directly into Domino, but we need to be absolutely sure that the token value we receive from third-party applications is the precise value we have calculated in advance that we are expecting to receive. We need, therefore, a more generic method for hashing string values.


One of the key benefits of XPages is that it is now easier than ever to incorporate external code into our applications: specifically, Java code written by developers with no knowledge of Domino can be imported directly into, and leveraged within, Domino applications – nearly always without any additional modification needed. In many cases, features built into the base Java language can be very useful in our applications. As an initial example, let's take a look at a small custom utility class, called HashMaster:


    import java.math.BigInteger;
    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;

    public class HashMaster {

        public static String md2(String unhashed) {
            return encode(unhashed, "MD2");
        }

        public static String md5(String unhashed) {
            return encode(unhashed, "MD5");
        }

        // Hashes the string with the named algorithm and returns lowercase hex.
        private static String encode(String unhashed, String algorithmId) {
            String hashed = "";
            byte[] defaultBytes = unhashed.getBytes(); // platform default charset
            try {
                MessageDigest algorithm = MessageDigest.getInstance(algorithmId);
                algorithm.reset();
                algorithm.update(defaultBytes);
                byte[] messageDigest = algorithm.digest();
                // toString(16) omits leading zeros, so the result can be
                // shorter than 32 characters.
                hashed = new BigInteger(1, messageDigest).toString(16);
            } catch (NoSuchAlgorithmException nsae) {
                nsae.printStackTrace();
            }
            return hashed;
        }
    }
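
One caveat with the BigInteger conversion: toString(16) drops leading zeros, so a digest whose first byte is below 0x10 comes back shorter than 32 characters. The following is a minimal sketch of one way to get fixed-width output. It is not part of HashMaster; the class name PaddedHash and the choice of UTF-8 are assumptions made purely for illustration.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper (not part of HashMaster): fixed-width hex digests.
final class PaddedHash {

    // Returns the digest of `text` as lowercase hex, left-padded with zeros
    // to two characters per digest byte (32 characters for MD2 and MD5).
    static String hex(String text, String algorithmId) {
        try {
            MessageDigest digest = MessageDigest.getInstance(algorithmId);
            // Assumption: UTF-8, so the result does not depend on the
            // platform default charset.
            byte[] raw = digest.digest(text.getBytes(StandardCharsets.UTF_8));
            String unpadded = new BigInteger(1, raw).toString(16);
            StringBuilder padded = new StringBuilder();
            for (int i = unpadded.length(); i < raw.length * 2; i++) {
                padded.append('0');
            }
            return padded.append(unpadded).toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("Unsupported algorithm: " + algorithmId, e);
        }
    }
}
```

For example, PaddedHash.hex("a", "MD5") returns "0cc175b9c0f1b6a831c399e269772661", keeping the leading zero that toString(16) alone would drop. HashMaster itself is unchanged by this sketch.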


HashMaster not only generates the MD2 hash we need for the reasons already mentioned; it can also produce an MD5 hash, which will be useful later, both in generating and in validating authentication tokens. In the meantime, let's take a look at a method from another class (DominoUtil), which will be used extensively in our API:


    public Document getDocumentByPrimaryKey(Database source, String key,
        boolean createOnFail) {
      Document result = null;
      String unid = "";
      try {
        // The hash of the key doubles as the document's UNID
        unid = HashMaster.md2(key);
        result = source.getDocumentByUNID(unid);
      } catch (NotesException ne) {
        // 4091: no document exists with this UNID ("Invalid universal id")
        if (ne.id == 4091 && createOnFail) {
          try {
            // Create the document (not yet saved) under the hashed UNID
            result = source.createDocument();
            result.setUniversalID(unid);
          } catch (Exception e) {
            e.printStackTrace();
          }
        }
      } catch (Exception e) {
        e.printStackTrace();
      }
      return result;
    }


This method does the following:

  1. Obtains an MD2 hash of the "primary key" passed to the method

  2. Since the result of the hash is a syntactically valid UNID, it attempts to locate a document within the specified database that has the same UNID

  3. The hash may or may not be the UNID of an existing document; if it is not, the lookup throws an exception. In that scenario we may optionally create a new document and set its UNID to match the hash. Take note that, if a new document is created, it is not yet saved. Either way, any document we get back has a UNID that is an encoded version of the text used to request it... in other words, the UNID is meaningful: if I pass "Tim Tripcony" as the key, the resulting UNID is an encoded representation of my name.


Finally, let's examine how this concept is put to good use within the register() method of our UserRegistration class:


    public boolean register() {
      try {
        DominoUtil util = new DominoUtil(getDatabase());
        // Look up the account by the hash of the user ID, creating it if absent
        Document userRecord = util.getDocumentByPrimaryKey(getUserId(),
            true);
        if (userRecord.isNewNote()) {
          // Unsaved document: the ID is free, so store the password and save
          userRecord.replaceItemValue("userPass", getPassword());
          userRecord.replaceItemValue("Form", "user");
          setSuccess(userRecord.save());
        } else {
          setError("User Id " + getUserId() + " already exists");
        }
      } catch (NotesException e) {
        setError("An unexpected error occurred while attempting to register user");
        e.printStackTrace();
      }
      return isSuccess();
    }


As you can see, this method attempts to locate a document that has a UNID matching the MD2 hash of the user ID we are attempting to register. If the document it returns is not a new note, then the user has already been registered: the database already contains a document with a UNID representative of the specified user ID. If, however, the document is a new note, then we can safely create the new account.


As was previously mentioned, we're not storing the user ID in their account record – just their password. For the sake of convenience, we are writing a Form item, in case we want to create a view to display all user accounts, though it's not likely we ever would. Similarly, you may choose in your own application to store additional information about each user to allow users to search for other users and view information about them, such as a profile picture or contact information. For the purposes of authentication, however, we never need to search any index for a user's account record, because we can always obtain a hash of their ID and navigate straight to the account record via the resulting UNID.
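
The lookup-by-hash idea does not depend on Domino. As a rough illustration only, the sketch below swaps the NSF for an in-memory map: each record is keyed by a hash of the user ID, so registration and lookup never consult a secondary index. InMemoryRegistry and its method names are hypothetical, MD5 stands in for the MD2 hash used above, and UTF-8 is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory analog of the "UNID is a hash of the user ID" idea.
final class InMemoryRegistry {

    // Record key: hex hash of the user ID. Record value: the stored password item.
    private final Map<String, String> records = new HashMap<String, String>();

    // Derives the record key from the user ID (MD5 here, as an assumption).
    static String recordKey(String userId) {
        try {
            byte[] raw = MessageDigest.getInstance("MD5")
                    .digest(userId.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : raw) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Registers a new user; returns false if the ID is already taken.
    boolean register(String userId, String password) {
        String key = recordKey(userId);
        if (records.containsKey(key)) {
            return false;
        }
        records.put(key, password);
        return true;
    }

    // Goes straight to the record by its derived key; null if not registered.
    String storedPassword(String userId) {
        return records.get(recordKey(userId));
    }
}
```

Registering "jdoe" twice returns true and then false, mirroring the isNewNote() check in register().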


One concept that should be noted is the format used for the user ID. This could be a canonical name (e.g. "CN=John Doe/OU=..."), a shortname (e.g. "jdoe"), an email address... when using this style of authentication, the name format is inconsequential as long as it's consistent. The key limitation is that – unlike traditional Domino authentication – this particular technique would not allow the same user to enter their canonical name, shortname or email address and still be authenticated as the same user... you have to choose one.


An additional implication of using a hash of the user ID as the UNID for the account record is that this makes the user ID case sensitive, which is atypical; most authentication systems expect the user's password to be entered using the same case as the stored value, but do not require the case to match for the user ID. Because an MD2 hash of a lowercase string will not return the same value as an MD2 hash of an upper- or mixed-case version of the same string, however, our implementation requires that the case of the ID match each time it is evaluated – at least, in the low-level code. You can certainly shield your users from this requirement by converting the case of the ID to ensure it is always the same prior to being hashed. If, however, you prefer that the ID be case-sensitive, hashing the ID as passed without case modification will automatically enforce that requirement for you.
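
If you opt to shield users from case sensitivity, the conversion has to happen in one place, before the ID is hashed. A minimal sketch, assuming a hypothetical UserIds.normalize helper; trimming whitespace and lowercasing with Locale.ROOT are choices made for this illustration, not something the approach requires.

```java
import java.util.Locale;

// Hypothetical helper: canonicalize a user ID before it is hashed.
final class UserIds {

    // Trims surrounding whitespace and lowercases with a fixed locale so the
    // same ID always yields the same string (and therefore the same hash).
    static String normalize(String userId) {
        if (userId == null) {
            throw new IllegalArgumentException("userId must not be null");
        }
        return userId.trim().toLowerCase(Locale.ROOT);
    }
}
```

Both the registration path and the login path would then hash UserIds.normalize(id) rather than the raw input.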

Comments

1 - Rob, you're quite right: sadly, without session cookies, there's no feasible way (that I've yet found) to leverage true Domino authentication over HTTP. The security manager prevents us from using NotesFactory.createSession() to establish a new session on behalf of a user even if we have access to their credentials. Perhaps in a future version, in addition to "session" and "sessionAsSigner", IBM will give us a "sessionOnBehalfOf(userName)" so that we can execute operations as though the user were truly logged in, but in the meantime - as you implied - we're stuck duplicating portions of the existing security architecture that's already sitting there ready to be used but inaccessible to us due to portions of that very security architecture.

It's not all bad news, however. If you're mapping a "real user" to each user's API ID (either by storing that ID in canonical format to begin with or defining a relationship somewhere between their API user ID and their actual Notes name), then you can at least determine some information about their permissions to documents. For example:

var isAuthor = function(doc, userId) {
  var result = false;
  var docAuthors = session.evaluate("@Author", doc);
  if (docAuthors.contains(userId)) {
    result = true;
  } else {
    var roles = signerDatabase.queryAccessRoles(userId);
    for (var i = 0; i < roles.size(); i++) {
      if (docAuthors.contains(roles.get(i))) {
        result = true;
        break;
      }
    }
  }
  return result;
}

In other words, you can logically determine whether the user has author access to a certain document. Technically this will still work even if the user isn't listed anywhere in the NAB but is still listed in Authors fields on the document; of course, if you are using Authors fields, you'll need to ensure that the signer also has Author access... but I'm assuming you already know that anyway.

And we definitely lose the convenience of using Readers fields to cause records to simply "disappear" from anyone who shouldn't see them. We have to use the same type of logic to determine record by record whether the user has Reader access - but the logic is even more convoluted because there's no "@Reader" we can call to determine that; we have to loop through all the items in the record, check to see if each is a Readers field, and then perform the same evaluation to see if the user is either explicitly listed, or implicitly included by role or (ideal in most situations, but atrocious in this case) as a member of some cascaded group. When faced with the need to perform all this logic ourselves, it becomes easy to see why using Readers fields always slows down the entire application... often to the point that the security is not worth the degradation; if we're using the built-in security, Domino still has to perform these evaluations, even though we don't have to explicitly instruct it to do so.

So I guess the question becomes: in situations where it makes sense to implement a REST API for an application - which will certainly be an exception for most organizations - which of the following options is ideal?

A) "Settle" for an inferior alternative logic-based security scheme, which forces us to manually do a portion of the work that Domino would otherwise do automatically for us, in order to continue to allow Domino to do everything it does do well that we can still leverage.
B) Since we'd just have to implement our own security anyway, which we can do in any platform, abandon ALL of our existing investment and just migrate the application to another platform.
C) Simply choose not to bother: keep the application locked behind the API's that Domino supports out of the box, limiting its adoption to those who are able and willing to access it via those interfaces.

That list isn't quite as snarky as it probably comes across: those really are our options. There's been an abundance of discussion lately about what IBM or its customers and partners should do to ensure that Domino outlives us all, and those discussions have yielded many suggestions. But none of us (myself included) has created something on Domino that has achieved adoption on the scale of Twitter or Facebook... and then announced to the world that we did so. I know that's a rather high bar to set, but it's possible. But the only way it could happen is if either it has a web front-end that is so utterly compelling that everyone with any computing device capable of an Internet connection wants, for some reason, to navigate to it on a regular basis... or if it uses a globally standard format to surface an API that allows third-party applications and devices to turn the service the application represents into something everyone wants to use. That's certainly within our reach.

And imagine what it would be like if that happened... if I were to write an app called... I dunno, maybe "MyTwitFace". And suddenly that's the app that everyone's using and talking about and posting to from 5 different devices, and CNN is telling their viewers, "to keep up on the latest news, follow us on MyTwitFace"... and then, when a half billion people are using it, I tell the world that it's running on Domino, and that's why it doesn't crash every two hours or suddenly get slow just because it's lunchtime and everybody's hitting our servers. I can't help thinking it would be rather difficult for Microsoft to convince a CIO that they could ever pull that off with Sharepoint. And, while most businesses aren't in the business of creating that type of application, why would they ever abandon Domino in favor of Sharepoint if they're already using a platform that can scale to support one twelfth of the world's population?

2 - I'd prefer to not store passwords in cleartext in the database even though there is no username associated with it. Otherwise, you're creating a nice big password dictionary for brute force attacks. Maybe you could send the password through a one-way hash function like MD5.

3 - Karsten, I believe it would depend heavily on how the data is structured. We've experimented with "viewless" architectures (using the predictable UNID technique discussed earlier in this post, as well as other approaches to making views irrelevant), and were able to maintain sub-second response times to complex queries under moderate load. The application that used that technique the most heavily didn't even use scope variables to maintain a cache of query results... when data is appropriately scoped and distributed across NSF's, and indexes are kept to a minimum, the speed that we can get now is ridiculous.

Additionally, with the new page serialization settings slated to be available in 8.5.2, we'll be able to balance raw speed vs. overall scalability. While I'm going to be too busy between now and the end of August to be able to justify playing with this, I'm planning to create a free iOS (and possibly Android) app this fall that uses the authentication technique described here to connect to data stored on a Domino server sitting in my living room, and just see what happens. It won't be an enterprise app, purely a novelty one. If it goes viral, I'll up the price to $0.99 just so I can pay to move the server to an EC2 instance, then put up an actual website for it that publishes the API so that third parties can build their own apps to connect to it. In the meantime, I'm not going to mention here what the app is... I want to first try to prove the point that Domino deserves to be the back end for some of these trendy apps - sure, the enterprise is its natural home, but who says it can't play games too?

4 - Interesting thought, whether a Domino server (cluster) could handle a platform like Facebook. How does Domino scale compared to distributed databases like Bigtable or Cassandra, either with real or fake users?

5 - That's a good idea, Karsten. The next section in the series discusses creation of temporary tokens and validation of API keys, which in the example I'm using, conform to the following syntax:

md5 ( md5 ( password ) + token + userId )

The only time we ever query the password is when we're determining what the temporary API key should be based on the temporary token we send to the consumer, so if we're always storing the password as an MD5 of the actual value, we can skip hashing it each time we calculate the API key. It's an inexpensive calculation to perform, but if our user base were to grow into the millions, any additional efficiency would add up over time.

Similarly, if we were to implement a password change feature, where the user provides the existing password and a new password, we can hash both, and if the hash of the existing matches what we've stored, then it's okay to store the new password - alternatively, we could require that the consumer hash both prior to sending them so that the plain text password is never sent over the wire.
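
To make the key syntax above concrete, here is a minimal sketch of both sides of that calculation. ApiKeys and its method names are hypothetical, and UTF-8 and zero-padded lowercase hex are assumptions; the actual token handling is the subject of the next part of the series.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch of md5( md5(password) + token + userId ).
final class ApiKeys {

    // MD5 of `text` as 32 lowercase hex characters.
    static String md5(String text) {
        try {
            byte[] raw = MessageDigest.getInstance("MD5")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : raw) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Consumer side: starts from the plain-text password.
    static String fromPassword(String password, String token, String userId) {
        return fromStoredHash(md5(password), token, userId);
    }

    // Server side: the record already holds md5(password), so the inner
    // hash is skipped.
    static String fromStoredHash(String passwordHash, String token, String userId) {
        return md5(passwordHash + token + userId);
    }

    // Compares the expected and received keys via MessageDigest.isEqual
    // rather than String.equals.
    static boolean matches(String expected, String received) {
        return MessageDigest.isEqual(
                expected.getBytes(StandardCharsets.UTF_8),
                received.getBytes(StandardCharsets.UTF_8));
    }
}
```

The server would compute fromStoredHash(...) from the account record and the token it issued, then accept the request only if matches(...) returns true for the key the consumer sent.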

6 - Nice series, Tim. Lots of 'internals' that many people don't know.

7 - This is quite fascinating. I especially like the Java source code in this installment. I'm looking forward to how the whole series unfolds.

I'm just wondering how many of the Domino features you're going to re-implement. The Lotus Notes/Domino security model with its very fine granularity is one of the things that keeps me from moving to another platform. So far the authentication methods you've described could be implemented in most any environment.

I use reader and author fields in all of my designs. The fact that I can put a role in a reader field gives me fantastic control over who can view documents with very little development overhead. I have Java classes that I've developed over many years that let me automate updating groups, roles and ACLs from web interfaces as well as on a date/time basis. It would take a lot of persuasion for me to give all of that up and move to a new model.

Or maybe I'm not clear on where you're going with this. It kind of reads like a nerd suspense novel. I can hardly wait for the next installment. And your writing is very clear.

Peace,

Rob:-]

8 - *sigh* Ah, document digests. They're like the secret kung-fu in a bootleg Hong Kong action flick.

Soon, we'll have the army of interns we need...

9 - This is cool stuff, thanks for sharing!

I'm wondering why you're using MD2 for creating the unid and not MD5. From my own tests MD5 is a bit quicker and also produces unid-like hashes.

Warning: The function HashMaster.md2 doesn't always produce 32 char unids. For instance test with "The quick brown fox jumps over the lazy dog" and it returns the 31 char string "3d85a0d629d2c442e987525319fc471". I've added a "unid" method to the HashMaster class:

public static String unid(String unhashed) {
  String hashed = encode(unhashed, "MD2");
  while (hashed.length() < 32) {
    hashed = "0" + hashed;
  }
  return hashed;
}
