Using superagent to authenticate a user-agent in node.js, plus a bonus bug!

Summary

This post describes how I use the superagent library to test access to restricted resources on my web server. This is something that I found to take a bit more effort than I expected, so I thought I’d write this up for the greater good.

Context

I am running a website in which some resources are open to the internet, while others require authentication against our CAS server.

I have been logging into the CAS server using request. But in the interest of trying out different libraries, I decided to rewrite my method using superagent.

I need to log into the CAS server from node.js because I am writing tests that verify that the protected resources are hidden from non-authenticated users, and available to authenticated ones.

Plot development

When I first tried superagent, it didn’t have any way to accept and keep cookies. Or rather, perhaps it did but it wasn’t documented. Now it is documented in the GitHub readme. That code is reproduced below:

var superagent = require('superagent');
var user1 = superagent.agent();
user1
  .post('http://localhost:4000/signin')
  .send({ user: 'hunter@hunterloftis.com', password: 'password' })
  .end(function(err, res) {
    // user1 will manage its own cookies
    // res.redirects contains an Array of redirects
  });

My situation is slightly more complicated. First, the CAS server expects to interact with a human on a web page, so it buries secret messages in the form fields in the generated login page. For example:

<input type="hidden" value="e1s1" name="lt">
<input type="hidden" value="submit" name="_eventId">

Even more irritating, the CAS server implementation that we are running apparently only responds to known user agents. It plays nicely with Firefox and curl, but it pukes and throws an error when confronted with node.js libraries such as request or superagent. So one has to remember to lie about the user agent.

Rising action

The first step is to get the login page, as performed by the following code block:

var ua = superagent.agent()
// casurl is the CAS login page; callback comes from the enclosing function (see the Appendix)
ua.get(casurl)
  .set('User-Agent','Mozilla/5.0 (X11; Linux x86_64; rv:12.0) Gecko/20100101 Firefox/12.0')
  .end(_login_handler(ua,function(err,res){
      if(err) return callback(err)
      var success_regex = /Log In Successful/i
      if(success_regex.test(res.text)){
          return callback()
      }else{
          return callback(new Error('CAS login failed'))
      }
  }))

The login page itself is passed to the function generated by the call to _login_handler(), which is reproduced below. In this case, I am setting up a test, so I know the user and the user’s password, defined by cas_user and cas_pass.

function _login_handler(ua,callback){
    return function(err,res){
        if(err) return callback(err)
        // parse the body for the form url, with the correct jsessionid
        var form_regex = /id="fm1".*action="(.*)" method="post"/
        var result = form_regex.exec(res.text)
        if(!result) return callback(new Error('CAS login form not found'))
        var url='https://'+cashost+result[1]

        // cas_user and cas_pass are the known test credentials
        var form={'username':cas_user
                  ,'password':cas_pass
                  ,'submit':'LOGIN'
                  }

        // scrape hidden input values
        var name_regex = /name="(.*?)"/
        var value_regex = /value="(.*?)"/
        var hidden_regex = /<input.*type="hidden".*?>/g
        while ((result = hidden_regex.exec(res.text)) !== null)
        {
            var n = name_regex.exec(result[0])
            var v = value_regex.exec(result[0])
            // skip inputs that lack a name or a value
            if(n && v) form[n[1]]=v[1]
        }
        // ready to post as a form, again with the fake user agent
        ua.post(url)
        .type('form')
        .set('User-Agent','Mozilla/5.0 (X11; Linux x86_64; rv:12.0) Gecko/20100101 Firefox/12.0')
        .send(form)
        .end(callback)
        return null
    }
}
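
The hidden-field scraping can also be pulled out into a small standalone function, which makes it easy to check against a saved copy of the login page. What follows is only a minimal sketch: the name extractHiddenFields and the sample HTML are made up for illustration, and it matches one tag at a time with [^>]* so that two inputs on the same line are not merged into a single match.

```javascript
// Hypothetical helper: collect name/value pairs from hidden <input> tags.
function extractHiddenFields(html) {
  const fields = {};
  // [^>]* keeps each match inside a single tag
  const inputRegex = /<input\b[^>]*>/gi;
  let match;
  while ((match = inputRegex.exec(html)) !== null) {
    const tag = match[0];
    // only hidden inputs are of interest here
    if (!/\stype="hidden"/i.test(tag)) continue;
    const name = /\sname="([^"]*)"/i.exec(tag);
    const value = /\svalue="([^"]*)"/i.exec(tag);
    // skip tags that lack either attribute
    if (name && value) fields[name[1]] = value[1];
  }
  return fields;
}

// Sample markup: two hidden fields like the ones CAS generates, plus a
// visible text input that should be ignored.
const sample =
  '<input type="hidden" value="e1s1" name="lt">\n' +
  '<input type="hidden" value="submit" name="_eventId">' +
  '<input type="text" name="username" value="">';

console.log(extractHiddenFields(sample));
```

The result can be merged into the form object before posting, in place of the while loop in _login_handler.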

I glossed over the tricky bits that had me stuck through a bunch of tweak-try-fail iterations. First, every time the agent is used to make a request, the CAS server must be sent the fake user agent header.

ua.get(casurl)
  .set('User-Agent','Mozilla/5.0 (X11; Linux x86_64; rv:12.0) Gecko/20100101 Firefox/12.0')
// ...
// ...
ua.post(url)
  .type('form')
  .set('User-Agent','Mozilla/5.0 (X11; Linux x86_64; rv:12.0) Gecko/20100101 Firefox/12.0')
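
Because it is easy to forget the header on one of the requests, one option is a thin wrapper that sets it every time. This is a sketch under assumptions, not part of superagent: withUserAgent and makeFakeAgent are hypothetical names, and the wrapper only assumes an agent whose get and post return an object with a chainable .set(header, value). The fake agent stands in for the real one so the example runs without a network.

```javascript
const FAKE_UA = 'Mozilla/5.0 (X11; Linux x86_64; rv:12.0) Gecko/20100101 Firefox/12.0';

// Hypothetical wrapper: every get/post goes out with the same User-Agent.
function withUserAgent(agent, userAgent) {
  return {
    get: function (url) { return agent.get(url).set('User-Agent', userAgent); },
    post: function (url) { return agent.post(url).set('User-Agent', userAgent); }
  };
}

// Stand-in agent that only records requests instead of sending them.
function makeFakeAgent(log) {
  function request(method, url) {
    const req = { method: method, url: url, headers: {} };
    req.set = function (key, value) { req.headers[key] = value; return req; };
    log.push(req);
    return req;
  }
  return {
    get: function (url) { return request('GET', url); },
    post: function (url) { return request('POST', url); }
  };
}

const log = [];
const ua = withUserAgent(makeFakeAgent(log), FAKE_UA);
ua.get('https://cas.example.org/cas/login');   // hypothetical URL
ua.post('https://cas.example.org/cas/login');  // hypothetical URL

// Both recorded requests carry the fake user agent.
console.log(log.map(function (r) { return r.method + ' ' + r.headers['User-Agent']; }));
```

With the real library, passing superagent.agent() in place of the fake agent should allow the usual .type('form').send(form).end(callback) chain to continue from the returned request.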

Second, superagent must be told to send the data object as form-encoded text, not as JSON. This wrinkle I missed entirely until I started reading the superagent docs carefully. The docs clearly state that the default POST transport is JSON:

“Since JSON is undoubtably the most common, it’s the default!”

The next paragraph down in the docs explains the use of the .type('form') option:

To send the data as application/x-www-form-urlencoded simply invoke .type() with “form”, where the default is “json”. This request will POST the body “name=tj&pet=tobi”.

The second call above uses this option to post the login credentials and hidden fields to the CAS server.
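
To make the difference concrete, here is a small sketch of what the two body formats look like for the same object. It uses only Node built-ins (JSON.stringify and URLSearchParams) to illustrate the wire formats; it is not superagent's own serializer, and the loginForm values are made up.

```javascript
// The example object from the superagent docs
const form = { name: 'tj', pet: 'tobi' };

// What a JSON body looks like (Content-Type: application/json)
const asJson = JSON.stringify(form);

// What a form body looks like (Content-Type: application/x-www-form-urlencoded)
const asForm = new URLSearchParams(form).toString();

console.log(asJson); // {"name":"tj","pet":"tobi"}
console.log(asForm); // name=tj&pet=tobi

// Hypothetical login fields: special characters are escaped in the form body
const loginForm = { lt: 'e1s1', _eventId: 'submit', password: 'p&ss word' };
console.log(new URLSearchParams(loginForm).toString());
```

A login form handler like the one on the CAS server expects the second format, which is why the JSON default gets rejected.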

With that, if the credentials are correct, the user agent will have an established session with the CAS server that can be verified by each of the web hosts using the CAS authentication system.

Conflict

But merely logging in to the CAS server and establishing a session with it is only half of the story. The main feature that CAS supports is single sign-on. This is achieved in practice as follows. First, a user logs on to the CAS system somehow. Then the user agent visits a CAS-protected website. This website redirects the incoming request to the CAS server, with a query parameter called “service”. If the user agent is already logged in to the CAS server, then the server returns the user to the url defined in the “service” query parameter, along with a one-time-use ticket. The CAS-protected web service then checks this ticket by sending a request directly to the CAS server to verify that the service ticket it just received from the incoming request is valid. If it is valid, then the user agent is considered to be an authenticated user.
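
The URLs involved in that exchange can be sketched as follows. The host names are hypothetical, and the sketch assumes the server uses the standard CAS protocol paths (/login and /serviceValidate) and returns the ticket in a “ticket” query parameter; a particular installation may be mounted differently.

```javascript
// Hypothetical hosts, for illustration only
const CAS_BASE = 'https://cas.example.org/cas';
const SERVICE = 'https://app.example.org/protected_resource';

// Step 1: the protected site redirects the user agent to CAS,
// naming itself in the "service" query parameter.
function casLoginUrl(casBase, service) {
  return casBase + '/login?service=' + encodeURIComponent(service);
}

// Step 2: CAS redirects back to the service url with a one-time ticket.
function ticketFromUrl(url) {
  return new URL(url).searchParams.get('ticket');
}

// Step 3: the protected site asks CAS directly whether the ticket is valid.
function casValidateUrl(casBase, service, ticket) {
  return casBase + '/serviceValidate?service=' + encodeURIComponent(service) +
    '&ticket=' + encodeURIComponent(ticket);
}

const loginUrl = casLoginUrl(CAS_BASE, SERVICE);
const ticket = ticketFromUrl(SERVICE + '?ticket=ST-1-abc'); // made-up ticket
const validateUrl = casValidateUrl(CAS_BASE, SERVICE, ticket);

console.log(loginUrl);
console.log(ticket);
console.log(validateUrl);
```

Steps 1 and 2 happen as redirects that the user agent has to follow with its CAS session cookie intact, which is the part that matters for the test agent.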

Unfortunately, this is where I found a bug in SuperAgent that is not present in request, and to hold up my end of the open source bargain, I created a test case and opened a bug report. I also tried to fix it myself, but it wasn’t quite as simple as I thought. Merely copying the headers during the redirect doesn’t solve the issue.

Conclusion

In true postmodern fiction style, there is no real end to this story. It works to log in to CAS, but it doesn’t work to establish an authenticated session with a CAS-protected website. As very similar code works fine using Request, I am returning to using that code, and will perhaps be writing another, similar blog post in the future containing that code.

Appendix

For completeness, the chain of functions I use is reproduced below.

function _login_handler(ua,callback){
    return function(err,res){
        if(err) return callback(err)
        // parse the body for the form url, with the correct jsessionid
        var form_regex = /id="fm1".*action="(.*)" method="post"/
        var result = form_regex.exec(res.text)
        if(!result) return callback(new Error('CAS login form not found'))
        var url='https://'+cashost+result[1]

        // cas_user and cas_pass are the known test credentials
        var form={'username':cas_user
                  ,'password':cas_pass
                  ,'submit':'LOGIN'
                  }

        // scrape hidden input values
        var name_regex = /name="(.*?)"/
        var value_regex = /value="(.*?)"/
        var hidden_regex = /<input.*type="hidden".*?>/g
        while ((result = hidden_regex.exec(res.text)) !== null)
        {
            var n = name_regex.exec(result[0])
            var v = value_regex.exec(result[0])
            // skip inputs that lack a name or a value
            if(n && v) form[n[1]]=v[1]
        }
        // ready to post as a form, again with the fake user agent
        ua.post(url)
        .type('form')
        .set('User-Agent','Mozilla/5.0 (X11; Linux x86_64; rv:12.0) Gecko/20100101 Firefox/12.0')
        .send(form)
        .end(callback)
        return null
    }
}


function cas_login_function(ua,callback){

    console.log('hitting '+casurl)

    ua.get(casurl)
    .set('User-Agent','Mozilla/5.0 (X11; Linux x86_64; rv:12.0) Gecko/20100101 Firefox/12.0')
    .end(_login_handler(ua
                        ,function(err,res){
                            if(err) return callback(err)
                            var success_regex = /Log In Successful/i
                            if(success_regex.test(res.text)){
                                return callback()
                            }else{
                                return callback(new Error('CAS login failed'))
                            }
                        }) )
    return null
}

// ...

describe('get files with login',function(){
    var agent =  superagent.agent()
    before(function(done){
        cas_login_function(agent,function(err){
            if(err) return done(err)
            done()
        })
    })
    // in all of the following tests, agent is logged in to CAS
    it('should get index.html from /protected_resource'
      ,function(done){
           agent
           .get(rooturl+'/protected_resource')
           .end(function(err,res){
               if(err) return done(err)
               res.ok.should.be.true;
               ...
               return done()
           })
    })
})

3 thoughts on “Using superagent to authenticate a user-agent in node.js, plus a bonus bug!”

  1. Thanks for this. I’m still struggling though with my case, which is very similar to yours. I always get the 403 on the second request.
    The only difference is that my jsessionid is not passed on the URL, it’s in a cookie. My guess is that superagent doesn’t send cookies on second requests, but I’m still debugging this. Maybe you have an idea?

  2. At the moment my approach is to use request for applications that require features like authentication and cookies and so on (which is most of my application type projects), and to use superagent when I write tests (unless those tests require authentication and cookies).

    But I highly recommend filing a bug report with your case.
