How to implement an HTTP output filter in nginx

Internally, the nginx HTTP server has a stack of output filters. These filters affect the reply generated by other sources, such as phase handlers or upstream. There are two types of output filters: header filters and body filters. Header filters apply to the header of the reply and body filters apply to the body of the reply.

Header filters

Header filters adapt the reply by manipulating the status and header lines of the reply. Nginx has a structure called ngx_http_headers_out_t that stores data to be sent out in the reply header. This structure appears as headers_out in the HTTP request structure ngx_http_request_t.

Let’s take a look at the most important internals of this structure (from version 1.17.6):

typedef struct {
    ngx_list_t                        headers;
    ngx_list_t                        trailers;

    ngx_uint_t                        status;
    ngx_str_t                         status_line;

    [...]
} ngx_http_headers_out_t;

Here we see headers, a list of header lines to be sent; trailers, a list of trailer lines to be sent; status, the status code of the reply; and status_line, the status line of the upcoming reply. The remaining members of this structure are omitted because they serve only specific purposes in the codebase.

Any time before the reply header is sent you can extend the headers, extend the trailers or change the status of the reply. This is sufficient for the majority of tasks that a header filter performs.

A header filter itself is a function with the following signature:

ngx_int_t header_filter(ngx_http_request_t *r);

That is, it takes a request as the only argument and returns a status code. The only status code that this function can return by itself is NGX_ERROR, indicating that an error has occurred and the header filter considers that processing of the reply cannot continue. In any other case a header filter must call the next header filter in the chain.

How is the filter chain formed? The nginx codebase has a global variable ngx_http_top_header_filter. It always points to the top header filter in the stack. Upon startup nginx calls initialization functions for each module. Each module can capture the pointer to the top filter in the stack and replace it with its own filter. Typically the top filter is saved into a variable called ngx_http_next_header_filter by every module that wants to install a header filter:

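/* The filter that was on top of the stack before this module was installed */
static ngx_http_output_header_filter_pt ngx_http_next_header_filter;

/* Forward declaration; the filter itself is defined further down */
static ngx_int_t my_header_filter(ngx_http_request_t *r);

static ngx_int_t my_module_init(ngx_conf_t *cf)
{
    /* Remember the current top filter, then put our filter on top */
    ngx_http_next_header_filter = ngx_http_top_header_filter;
    ngx_http_top_header_filter = my_header_filter;

    return NGX_OK;
}

This variable is declared static. This way every module that installs a header filter has its own instance of this variable. By calling ngx_http_next_header_filter a header filter passes execution to the next header filter in the chain.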

At the bottom of the stack there is a function that takes everything from headers_out, transforms this data into a sequence of bytes and calls the so-called write filter, which writes everything to the socket. Thus, the reply header gets sent out.

Consequently, if a header filter does not pass execution to the next filter in the chain, the HTTP reply header never gets written. This is an error.
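To see how the chain hangs together, here is a minimal stand-alone sketch of the same mechanism. It does not use nginx at all: request_t, header_filter_pt, top_header_filter, the two modules "A" and "B" and run_header_filters are names made up for this illustration. It only shows that the filter installed last runs first, and that the bottom filter is reached because every filter calls its own saved "next" pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the nginx types; NOT the real definitions. */
typedef struct {
    char trace[64];               /* records which filters ran, in order */
} request_t;

typedef int (*header_filter_pt)(request_t *r);

/* Plays the role of ngx_http_top_header_filter. */
static header_filter_pt top_header_filter;

/* Bottom of the stack: stands in for the function that serializes
   headers_out and hands the bytes to the write filter. */
static int write_header_filter(request_t *r)
{
    strcat(r->trace, "W");
    return 0;
}

/* Hypothetical module A with its own private "next" pointer. */
static header_filter_pt a_next_header_filter;

static int a_header_filter(request_t *r)
{
    strcat(r->trace, "A");
    return a_next_header_filter(r);
}

static void a_module_init(void)
{
    a_next_header_filter = top_header_filter;
    top_header_filter = a_header_filter;
}

/* Hypothetical module B, installed after A, so it ends up on top. */
static header_filter_pt b_next_header_filter;

static int b_header_filter(request_t *r)
{
    strcat(r->trace, "B");
    return b_next_header_filter(r);
}

static void b_module_init(void)
{
    b_next_header_filter = top_header_filter;
    top_header_filter = b_header_filter;
}

/* Builds the stack (W at the bottom, then A, then B) and runs it once. */
int run_header_filters(request_t *r)
{
    r->trace[0] = '\0';

    top_header_filter = write_header_filter;
    a_module_init();
    b_module_init();

    assert(top_header_filter != NULL);
    return top_header_filter(r);
}
```

After run_header_filters() returns, the trace shows B first, then A, then the bottom filter. If a_header_filter returned without calling a_next_header_filter, the "W" step would never happen, which is the error case described above.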

So, what is the typical flow of a header filter? From the above it appears very simple. Step 1: check whether any modification of the reply is needed. Step 2: modify the reply. Step 3: pass execution to the next filter:

static ngx_int_t my_header_filter(ngx_http_request_t *r)
{
    ngx_table_elt_t  *h;

    /* Step 1: check whether the reply needs any adaptation.
       The status test below is only an example condition. */
    if (r->headers_out.status == NGX_HTTP_OK) {

        /* Step 2: modify the reply by adding or changing header lines
           or by changing the status. Here we add one header line. */
        h = ngx_list_push(&r->headers_out.headers);
        if (h == NULL) {
            return NGX_ERROR;
        }

        h->hash = 1;
        ngx_str_set(&h->key, "X-Header");
        ngx_str_set(&h->value, "x-value");
    }

    /* Step 3: call the next header filter in the chain */
    return ngx_http_next_header_filter(r);
}

When a header filter decides that no adaptation of the reply is needed, the execution is passed straight to the next filter and the reply remains intact.
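Both paths can be tried outside nginx with a small model. This is only a sketch: reply_t, header_t, push_header, next_header_filter and my_header_filter_model are hypothetical stand-ins rather than nginx APIs, and "status is 200" is an assumed example condition.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_HEADERS 8

typedef struct {
    const char *key;
    const char *value;
} header_t;

/* Simplified stand-in for the reply part of a request. */
typedef struct {
    unsigned  status;
    header_t  headers[MAX_HEADERS];
    size_t    nheaders;
    int       passed_on;          /* set to 1 by the stand-in next filter */
} reply_t;

/* Stand-in for ngx_list_push(): returns a free slot or NULL when full. */
static header_t *push_header(reply_t *r)
{
    if (r->nheaders == MAX_HEADERS) {
        return NULL;
    }
    return &r->headers[r->nheaders++];
}

/* Stand-in for the next header filter in the chain. */
static int next_header_filter(reply_t *r)
{
    r->passed_on = 1;
    return 0;
}

/* Same three steps as a real header filter: check, modify, pass on. */
int my_header_filter_model(reply_t *r)
{
    header_t *h;

    assert(r != NULL);

    if (r->status == 200) {       /* assumed example condition */
        h = push_header(r);
        if (h == NULL) {
            return -1;            /* plays the role of NGX_ERROR */
        }
        h->key = "X-Header";
        h->value = "x-value";
    }

    /* Reached whether or not the reply was adapted. */
    return next_header_filter(r);
}
```

A reply with status 200 leaves the model with one extra header, while a reply with status 404 passes through untouched. In both cases the next filter still runs.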

Getting cookie value in nginx

nginx makes it very easy to extract the value of a cookie. Simply use the $cookie_<name> meta variable in whatever context you need.

Here is an example:

location / {
     proxy_set_header X-Session-id $cookie_sid;
     proxy_pass http://upstream;
}

In this configuration snippet we pass the request to the upstream named “upstream” and extend it with a header “X-Session-id” set to the value of the cookie named “sid”. Being a meta variable, $cookie_<name> can be used to control redirects, conditional configuration sections and upstream selection.
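To make it concrete what such a variable evaluates to, here is a small stand-alone C sketch that pulls one cookie out of a Cookie header value. It is not nginx's implementation; cookie_lookup and its parameters are made up for this illustration, and it assumes the usual "name=value; name=value" layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Looks up the cookie called `name` in a Cookie header value such as
 * "lang=en; sid=abc123". On success stores a pointer to the value in
 * *value and its length in *len and returns 1. Returns 0 if the cookie
 * is absent. The value is not copied and not NUL-terminated.
 */
int cookie_lookup(const char *header, const char *name,
                  const char **value, size_t *len)
{
    size_t      name_len = strlen(name);
    const char *p = header;

    while (*p != '\0') {
        /* Skip pair separators and the whitespace after them. */
        while (*p == ' ' || *p == ';') {
            p++;
        }

        /* The current pair ends at the next ';' or at the end of string. */
        const char *end = strchr(p, ';');
        if (end == NULL) {
            end = p + strlen(p);
        }

        /* Match "name=" exactly, so that "si" does not match "sid". */
        if ((size_t) (end - p) > name_len
            && strncmp(p, name, name_len) == 0
            && p[name_len] == '=')
        {
            *value = p + name_len + 1;
            *len = (size_t) (end - *value);
            return 1;
        }

        p = end;
    }

    return 0;
}
```

For the header value "lang=en; sid=abc123; theme=dark" and the name "sid", the function reports the six characters "abc123", which is what $cookie_sid would carry in the configuration above.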

Should you trust free SSL certificates?

Risk management in business is everything. In fact it’s the main job of an entrepreneur to manage risks related to the business. One of the tools to manage risk is a binding contract. By means of a binding contract it is possible to hand over some risk to another party in exchange for a fee. Now, the fee is the key element to it. Why should another party be liable for your risk in exchange for nothing? That’s abusive! That’s why only a fee makes a contract binding. But that’s not a sufficient condition. The fee must also be reasonable (so-called “reasonable consideration”). If the fee is not reasonable, the contract is abusive again.

Now, imagine some complex piece of technology, e.g. secure communication. You don’t know how it works. You only need an SSL certificate that works. You go to a certification authority, you pay money and get an SSL certificate.


Let’s welcome the fluid time

The Industrial Revolution made a profound impact on our perception of time.

Before the pendulum clock was invented around the middle of the seventeenth century, people perceived time differently. Time was not something related to a periodic rhythm, but rather to solar activity. If you asked somebody “what time is it now”, you’d get an answer like: “the sun is still very high, so I still have time to harvest my crops”.

This all changed with the invention of the steam engine and eventually the railroad. Operation of the railroad required more precise time coordination.

Nginx Essentials is available for pre-order

Nginx Essentials is available for pre-order! This is my first book ever and my first book about Nginx. The book is conceived as a “programmer’s view of Nginx administration” and intends to enrich web masters’ and site reliability engineers’ knowledge with subtleties known to those who have a deep understanding of the Nginx core.

At the same time, Nginx Essentials is a ‘from the start guide’ allowing newcomers to easily switch to Nginx under experienced guidance.

I’ve enjoyed writing this book, and it’s been a great re-solidification of my knowledge. I hope you’ll enjoy reading it as much as I enjoyed writing it!

Distributed caching using Redis

And so, two weeks later, I’m about to complete my first project on my own. I’m working on a distributed cache module for Nginx that stores cached content in Redis. Redis is a superfast persistent key-value store that is easy to integrate and easy to work with.

The choice fell on Redis because ...

The value of qualitative programming

Imagine that you run an online service company and you do some service for people, and your business becomes so big that your departments no longer fit on entire floors of a big building, and your budget is so big that you don’t give a damn about caring for people and raising their salaries, because you can hire someone else of the same level the next day. Well, in fact you don’t give a damn about people’s level at all, because they write throw-away code anyway and they do jobs that can be trained for in no time anyway.

And you don’t care about code quality, because it sort of “does not sell” and is not measurable. And one day ...

Bitcoin Magazine: Stop Thinking Bitcoin is Just a New Kind of Currency

I guest-posted on Bitcoin Magazine:

http://bitcoinmagazine.com/16297/stop-thinking-bitcoin-just-new-kind-currency/

The main idea of the article is that Bitcoin is not simply a new kind of currency, but an innovation that can lead to the emergence of a new information space.

Queueing systems

Waiting in Line

Queueing systems manage and process queues of jobs. The presence of a queue implies that something cannot be done immediately and one has to wait until some resource is available. When you respond to an HTTP request you usually want to do it interactively, that is, you want to respond within some reasonable amount of time. What can prevent you from doing this? Various things: external data sources, long-running tasks. Your request might consume too much memory and the system might decide to swap out something, which is time-consuming.
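As a rough illustration of jobs waiting for a resource, here is a minimal bounded first-in, first-out job queue in C. It is only a sketch: job_queue_t, QUEUE_CAPACITY and the queue_* functions are made up for this example, jobs are reduced to integer identifiers, and there is no locking, so it suits a single thread only.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAPACITY 4

/* A fixed-size ring buffer of job identifiers. */
typedef struct {
    int    jobs[QUEUE_CAPACITY];
    size_t head;                  /* index of the oldest waiting job */
    size_t count;                 /* number of jobs currently waiting */
} job_queue_t;

void queue_init(job_queue_t *q)
{
    q->head = 0;
    q->count = 0;
}

/* Adds a job at the tail. Returns 0 on success, -1 if the queue is full. */
int queue_push(job_queue_t *q, int job)
{
    if (q->count == QUEUE_CAPACITY) {
        return -1;
    }
    q->jobs[(q->head + q->count) % QUEUE_CAPACITY] = job;
    q->count++;
    return 0;
}

/* Removes the oldest job. Returns 0 on success, -1 if the queue is empty. */
int queue_pop(job_queue_t *q, int *job)
{
    if (q->count == 0) {
        return -1;
    }
    *job = q->jobs[q->head];
    q->head = (q->head + 1) % QUEUE_CAPACITY;
    q->count--;
    return 0;
}
```

A producer calls queue_push() when a job cannot be handled immediately, and a worker calls queue_pop() whenever the resource becomes free. A full queue is reported to the caller instead of blocking, so the caller can decide whether to reject the job or retry later.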

From a web app’s point of view external data sources are beyond its influence: the only way to get a faster response from an external data source is to make the data source itself faster. For example, if you are unsatisfied with how long a MySQL database takes to respond to a certain query, fixing this requires changing something in the database. Fixing the query would produce a new query, which is a different story.

When there is a long-running task, however, and it cannot be redesigned to run interactively, ...

Logging modules re-released!

A few months ago I shut down the pages of the nginx socketlog module and the nginx redislog module. This was due to the excessive volume of support they required. Some people, however, found these to be interesting pieces of technology and I got several requests to release them.

Today I release these modules under a free licence. Meet the released nginx socketlog module and nginx redislog module!