Announcing results with/of CouchDB

I described the scenario in the previous post. Now it’s time to get down to business, with CouchDB, since it is my favorite nowadays. The application is dead simple: one screen asks for your ID and password, and the next one lists your results. To make it look like its real-life counterpart (not that one exists), I added some styling to the page but kept its size to a minimum. You can see the screenshots in the story-telling post.

All of this is served by a single show function to keep the design doc minimal. However, to make it resemble a real-life application, with all its maintainability concerns, I used a mustache template to render the screen, which means I used mustache.js too. User passwords should really live in a separate authentication system, but since we are aiming for high performance, we cannot rely on an external auth system. So every user/result doc also contains an MD5-hashed password. That in turn means we need to generate MD5 hashes from user input, so I used md5.js too. The whole design doc (a special doc in which you define your application) looks like this:

{
   "_id": "_design/app",
   "_rev": "6-c80ec7e20e6ded17bf0e048fff596665",
   "templates": {
       "result": "[see below]"
   },
   "lib": {
       "mustache": "[mustache.js]",
       "md5": "[md5.js]"
   },
   "language": "javascript",
   "shows": {
       "result": "[see result.js below]"
   }
}

The show function, result.js, is:

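function(doc, req) {
  // Libraries live in the design doc and are loaded as CommonJS modules.
  var Mustache = require("lib/mustache");
  var md5 = require("lib/md5").hex;
  // Default: render the login form.
  var ctx = {
    form: true
  };
  if (req.query.password) {
    // A password was submitted: compare its hash with the one stored in the doc.
    if (!doc || doc.passwd != md5(req.query.password)) {
      ctx.formErr = "Wrong ID number and/or password";
    } else {
      // Credentials match: hide the form and hand the doc to the template.
      ctx.form = false;
      ctx.doc = doc;
    }
  }
  return Mustache.to_html(this.templates.result, ctx);
}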

For CouchDB outsiders: a show function runs on CouchDB when a request is received at a certain URL. If you add a doc ID to that URL, the function receives the corresponding doc (a JSON object) in its doc parameter. The req parameter contains details about the HTTP request. Whatever the function returns is sent back to the user. Simple.
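To poke at this logic outside CouchDB, it can be exercised in plain Node.js. This is only a minimal sketch: it assumes Node’s built-in crypto module stands in for md5.js, and buildContext is a hypothetical helper that returns the template context instead of rendered HTML. The sample doc and its password are made up.

```javascript
// Sketch: the show function's decision logic, runnable in plain Node.js.
// Assumption: Node's crypto module replaces lib/md5; buildContext is a
// hypothetical helper that returns the template context, not HTML.
const crypto = require("crypto");

function md5(text) {
  return crypto.createHash("md5").update(text, "utf8").digest("hex");
}

function buildContext(doc, req) {
  const ctx = { form: true };
  if (req.query.password) {
    if (!doc || doc.passwd !== md5(req.query.password)) {
      ctx.formErr = "Wrong ID number and/or password";
    } else {
      ctx.form = false;
      ctx.doc = doc;
    }
  }
  return ctx;
}

// Hypothetical user doc whose password is "secret".
const sampleDoc = { _id: "1000002", name: "John Knife", passwd: md5("secret") };

// No password in the query string: plain login screen.
const loginScreen = buildContext(null, { query: {} });
// Wrong password: the form stays visible and an error is set.
const rejected = buildContext(sampleDoc, { query: { password: "nope" } });
// Right password: the form is hidden and the doc is exposed to the template.
const accepted = buildContext(sampleDoc, { query: { password: "secret" } });

console.log(loginScreen.form, rejected.formErr, accepted.doc.name);
```

The three calls correspond to the three things a visitor can see: the bare form, the form with an error, and the result screen.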

As you can see, both the login and the result screens come from the same function; which one you get depends on the query string of the request. More specifically, if the user tries to log in, we check the credentials and pass the doc to the template; if not, we pass form: true to the template so that the login form is rendered. The relevant parts of the template look like this:

{{#form}}
<p>Please enter your identification number and password</p>
<form onsubmit="sb()">
  <table border="0" cellpadding="3" cellspacing="0">
    <tr><th><label for="idn">ID Number</label></th>
      <td><input type="text" id="idn"/></td></tr>
    <tr><th><label for="pass">Password</label></th>
      <td><input type="password" name="password" id="pass"/></td></tr>
  </table>
  <p><input type="submit"/></p>
</form>
{{/form}}

<p class="err">{{formErr}}</p>

{{#doc}}
...Result screen...
{{/doc}}

<script type="text/javascript">
  function sb() {
    document.forms[0].setAttribute(
      "action", "/"+document.getElementById("idn").value);
  }
</script>

That last script changes the form action to include the user’s ID number in the request path, which means the show function will run on the doc with that ID. Short version: document IDs are user IDs, and to ‘show’ a document we request it by the user’s ID. An example user/result doc looks like this:

{
   "_id": "1000002",
   "_rev": "1-72b2b17d3c46a69464c55c80373abc01",
   "name": "John Knife",
   "passwd": "877466ffd21fe26dd1b3366330b7b560"
   // result data
}

So if we request something like “_show/result/1000002”, our result show function will run with that doc.
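For reference, here is a rough sketch of what the full request URL could look like when talking to CouchDB directly. The database name "results" and the host are assumptions (the post does not name them), and buildShowUrl is a hypothetical helper; the path layout is CouchDB’s standard one for show functions.

```javascript
// Sketch: build the URL for a show function request.
// Assumptions: the database is called "results" (not stated in the post);
// the design doc is "_design/app" with a show function named "result".
function buildShowUrl(base, db, docId, password) {
  const path = [db, "_design", "app", "_show", "result", docId]
    .map(encodeURIComponent)
    .join("/");
  // The password travels in the query string, so it must be escaped.
  return base + "/" + path + "?password=" + encodeURIComponent(password);
}

const url = buildShowUrl("http://localhost:5984", "results", "1000002", "s3cret&x");
console.log(url);
// → http://localhost:5984/results/_design/app/_show/result/1000002?password=s3cret%26x
```

Escaping the ID and password matters here because both come straight from user input.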

I am omitting the result data and the template for it because they may change for every exam. Besides, it is extremely easy to render JSON with mustache.

The application is done, but the system isn’t. I need 3 million docs in that db, so I wrote a simple script to generate them all. It took me 15 minutes to prepare and around 20 minutes to run. In just over half an hour I had 3 million records ready to be read by the application. Add that to the 2.5 hours spent mostly on styling, and I had a complete single-node system in about 3 hours. Isn’t that relaxing!
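The generator script itself is not shown here, so what follows is only a sketch of one way such a script could work, not the one I ran. It assumes sequential IDs starting at 1000001, random throwaway passwords, and Node’s crypto module for MD5; makeDoc and bulkBatches are hypothetical names. Each batch is shaped as a request body for CouchDB’s _bulk_docs endpoint.

```javascript
// Sketch: generate user/result docs in batches shaped for POST /{db}/_bulk_docs.
// Assumptions: sequential IDs from 1000001 and random passwords; the post
// does not describe the real script.
const crypto = require("crypto");

function md5(text) {
  return crypto.createHash("md5").update(text, "utf8").digest("hex");
}

function makeDoc(id) {
  const password = crypto.randomBytes(6).toString("hex");
  return { _id: String(id), name: "User " + id, passwd: md5(password) };
}

// Yields one { docs: [...] } body per batch so memory use stays flat.
function* bulkBatches(firstId, total, batchSize) {
  for (let start = 0; start < total; start += batchSize) {
    const count = Math.min(batchSize, total - start);
    const docs = [];
    for (let i = 0; i < count; i++) {
      docs.push(makeDoc(firstId + start + i));
    }
    yield { docs: docs };
  }
}

// Small demo; a real run would use total = 3000000 and POST each batch.
const batches = Array.from(bulkBatches(1000001, 25, 10));
console.log(batches.map(function (b) { return b.docs.length; }));
// → [ 10, 10, 5 ]
```

Sending docs in batches rather than one request per doc is what keeps a 3-million-doc load down to minutes.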

Speaking of the single-node deployment (my PC, a quad-core machine with 4 GB of RAM), here are some numbers. Using ab on the login screen, I got around 500 RPS; with 100 concurrent users, 90% of the requests were served in under 290 ms. But that is hardly a simulation of the running system. Normally, a user lands on the login screen and then asks for his/her results using credentials. Using JMeter to model this scenario, I got 390 RPS and 360 ms for the 90th percentile. In either case, a single machine will take hours to serve 3 million results, so we need to scale.
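The “hours” figure is simple arithmetic. A back-of-the-envelope sketch, assuming each of the 3 million users makes two requests (login screen, then results), that the 390 RPS counts individual requests, and that throughput scales linearly with the number of nodes. All three are assumptions, and the 30-minute target is purely illustrative.

```javascript
// Sketch: rough capacity estimate. Assumes 2 requests per user, a sustained
// 390 requests/second per node, and linear scaling across nodes.
function hoursToServe(users, requestsPerUser, rps) {
  return (users * requestsPerUser) / rps / 3600;
}

function nodesNeeded(users, requestsPerUser, rpsPerNode, targetHours) {
  return Math.ceil(hoursToServe(users, requestsPerUser, rpsPerNode) / targetHours);
}

const singleNodeHours = hoursToServe(3000000, 2, 390);
const nodesForHalfHour = nodesNeeded(3000000, 2, 390, 0.5);

console.log(singleNodeHours.toFixed(1)); // → 4.3
console.log(nodesForHalfHour);           // → 9
```

Even counting only one request per user, a single node would still need a bit over two hours.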

Since the application is totally stateless, I should be able to run a number of couches behind some kind of load balancer. Just put the data on all of them and assign an usher to show people where to sit. And since there are no writes, I shouldn’t even need replication among them. Simple enough. The hard part is convincing some sysadmins to install CouchDB on some servers. I need some time. I’ll update here when I’m done.
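The usher’s job can be as dumb as round-robin. A toy sketch of the idea follows; the hostnames are hypothetical, and in practice an off-the-shelf load balancer would do this rather than hand-written code.

```javascript
// Sketch: round-robin selection across identical, read-only CouchDB nodes.
// The hostnames are hypothetical placeholders.
function roundRobin(backends) {
  let next = 0;
  return function pick() {
    const backend = backends[next];
    next = (next + 1) % backends.length; // wrap around after the last node
    return backend;
  };
}

const pick = roundRobin([
  "http://couch1:5984",
  "http://couch2:5984",
  "http://couch3:5984"
]);

// Four consecutive requests: the fourth goes back to the first node.
const order = [pick(), pick(), pick(), pick()];
console.log(order);
```

Because every node holds the same data and nothing is written, it does not matter which node a given user lands on.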

Posted by agaoglu