Looking for a good solution, at least for my use case, I stumbled upon an interesting approach to finding and removing nginx cached files by regex pattern. It works on proxy_pass and fastcgi cached files alike. The source code is on GitHub: https://github.com/perusio/nginx-cache-purge

The script takes a regular expression as an argument and uses it to find matches within the caching directory nginx uses. It is very important that the script can both read and write this directory.

To get back to the common use case, with nginx as a caching system, we have to define nginx vhosts. For this showcase I'll use nginx.conf for both vhosts, to mimic a potential multi-user environment; in the real world this would be a list of vhost files, one per host. For test purposes I created a directory "/var/cache/idabic/" for nginx to use for caching and included this path in nginx.conf. Each server block within nginx.conf, or within a vhost file, is a separate server entity with separate cache keys, caching rules, server names, etc.

When there is a strict requirement to purge a large number of files, and the only known criteria is something that can be mapped via the cache key, wildcard purge support saves the day. Usually this is triggered by an API call, so the API server can initiate the purge/delete from cache based on a regex alone. Below is what I need to implement the above solution.
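To give an idea of how purging by regex can work at all: nginx stores the configured cache key inside each cache file on a "KEY: <key>" line, so a purge script can grep the cache directory for keys matching the regex and delete the matching files. This is only a minimal sketch of that idea (not the perusio script itself), demonstrated against a throwaway directory with two fake cache files:

```shell
# Minimal sketch of purge-by-regex: nginx writes the cache key into each
# cache file on a "KEY: <key>" line, so we grep for it and delete matches.
# Uses a throwaway directory here instead of the real cache path.
cachedir="$(mktemp -d)"
printf 'KEY: local/img/a.png\n<binary payload>\n'     > "$cachedir/aaa111"
printf 'KEY: localhost/img/a.png\n<binary payload>\n' > "$cachedir/bbb222"

pattern='(local/).*(\.png)'   # same regex shape as the purge calls later on
# -a: treat binary cache files as text; -l: print matching file names only
grep -Elar "^KEY: .*${pattern}" "$cachedir" | while read -r f; do
    rm -f "$f"
done

ls "$cachedir"   # only the "localhost" entry should remain
```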
The installation on my test box isn't using vhost files (I'm too lazy to reorganize the current installation); instead, each server block needed to create the multi-host environment lives in the main nginx.conf file at "/etc/nginx/nginx.conf". First, create the caching space for nginx to use:
~$ mkdir /var/cache/idabic
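Since both the nginx workers and the purge script must be able to read and delete files here, the directory ownership matters. Assuming workers run as the `nginx` user (as in the nginx.conf shown later), something like this would do; run the purge script as the same user or a group with write access:

```shell
# nginx workers run as user "nginx"; the purge script also needs read and
# write (delete) access to this directory.
chown nginx:nginx /var/cache/idabic
chmod 700 /var/cache/idabic
```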
Then define the caching space and caching zone in nginx.conf by adding the following line under the http block:
proxy_cache_path /var/cache/idabic keys_zone=idabic:10m inactive=10h;
This creates a caching "zone" named "idabic" and tells nginx to expire any cached request that hasn't been requested for more than 10 hours. It also sets the index key size to 10MB. To create the multi-host environment, create two server blocks as follows:
server {
    listen       80;
    listen       [::]:80;
    server_name  local;
    root         /usr/share/nginx/html;

    include /etc/nginx/default.d/*.conf;

    location / {
        resolver 8.8.8.8;
        proxy_pass https://www.maxcdn.com$request_uri;
        add_header ID $upstream_cache_status;
        proxy_cache_min_uses 2;
        proxy_cache idabic;
    }

    error_page 404 /404.html;
    location = /40x.html {
    }

    error_page 500 502 503 504 /50x.html;
    location = /50x.html {
    }

    proxy_cache_key $http_host$uri$is_args$args;
}
And another one with slightly different parameters:
server {
    listen       80 default_server;
    listen       [::]:80 default_server;
    server_name  localhost;
    root         /usr/share/nginx/html;

    include /etc/nginx/default.d/*.conf;

    location / {
        resolver 8.8.8.8;
        proxy_pass https://www.maxcdn.com$request_uri;
        add_header ID $upstream_cache_status;
        proxy_cache_min_uses 2;
        proxy_cache idabic;
    }

    error_page 404 /404.html;
    location = /40x.html {
    }

    error_page 500 502 503 504 /50x.html;
    location = /50x.html {
    }

    proxy_cache_key $http_host$uri$is_args$args;
}
You'll notice the cache keys are custom, to meet a common requirement: they combine the Host, the requested URI (without query string), and the arguments (if any). The final nginx.conf looks like this:
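This cache key is also what ties everything together on disk: nginx names each cache file after the MD5 of the cache key, and since no "levels=" parameter was given on proxy_cache_path, those files sit flat in /var/cache/idabic/. A small Python sketch (Python 3 here for brevity) of how a request maps to a file name, showing why the same URL cached under the two vhosts produces two distinct entries:

```python
import hashlib

# nginx names each cache file after the MD5 hex digest of the cache key,
# which here is $http_host$uri$is_args$args ($is_args is "?" when there
# are arguments, empty otherwise).
def cache_file_name(host, uri, args=""):
    key = host + uri + ("?" + args if args else "")
    return hashlib.md5(key.encode()).hexdigest()

# Same URL under the two vhosts -> two different cache files:
print(cache_file_name("localhost", "/wp-content/uploads/2015/06/splash-cp.png"))
print(cache_file_name("local", "/wp-content/uploads/2015/06/splash-cp.png"))
```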
user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;

events {
    worker_connections 1024;
}

http {
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    proxy_cache_path /var/cache/idabic keys_zone=idabic:10m inactive=10h;

    access_log  /var/log/nginx/access.log  main;

    sendfile            on;
    tcp_nopush          on;
    tcp_nodelay         on;
    keepalive_timeout   65;
    types_hash_max_size 2048;

    include             /etc/nginx/mime.types;
    default_type        application/octet-stream;

    include /etc/nginx/conf.d/*.conf;

    server {
        listen       80;
        listen       [::]:80;
        server_name  local;
        root         /usr/share/nginx/html;

        include /etc/nginx/default.d/*.conf;

        location / {
            resolver 8.8.8.8;
            proxy_pass https://www.maxcdn.com$request_uri;
            add_header ID $upstream_cache_status;
            proxy_cache_min_uses 2;
            proxy_cache idabic;
        }

        error_page 404 /404.html;
        location = /40x.html {
        }

        error_page 500 502 503 504 /50x.html;
        location = /50x.html {
        }

        proxy_cache_key $http_host$uri$is_args$args;
    }

    server {
        listen       80 default_server;
        listen       [::]:80 default_server;
        server_name  localhost;
        root         /usr/share/nginx/html;

        include /etc/nginx/default.d/*.conf;

        location / {
            resolver 8.8.8.8;
            proxy_pass https://www.maxcdn.com$request_uri;
            add_header ID $upstream_cache_status;
            proxy_cache_min_uses 2;
            proxy_cache idabic;
        }

        error_page 404 /404.html;
        location = /40x.html {
        }

        error_page 500 502 503 504 /50x.html;
        location = /50x.html {
        }

        proxy_cache_key $http_host$uri$is_args$args;
    }
}
The API side is a pseudo server, since building a real one (with OAuth and so on) would take too much time for the sake of a showcase. This time I'll be using a Python service I just wrote for this purpose. It uses no authentication, as its sole purpose is to accept a DELETE request with arguments and call the purge script accordingly:
# Python 2 service: accepts DELETE /?match=<regex> and shells out to the
# purge script. No authentication -- showcase only.
from urlparse import urlparse
from BaseHTTPServer import HTTPServer, BaseHTTPRequestHandler
from SocketServer import ThreadingMixIn
import threading
import SocketServer
import SimpleHTTPServer
import HTMLParser
import sys
import hashlib
import json
import cgi
import logging
import time
import BaseHTTPServer
import urllib
import subprocess


class Handler(BaseHTTPRequestHandler):

    def do_DELETE(self):
        self.send_response(200)
        self.end_headers()
        message = threading.currentThread().getName()
        self.wfile.write(message)
        # Pull the "match" regex out of the query string; ignore other params.
        match = None
        for name in self.path.split("&"):
            if name.split("=")[0] in ("match", "/?match"):
                match = name.split("=")[1]
        if match is None:
            self.wfile.write('No "match" parameter given.')
            return
        self.wfile.write('Calling: /tmp/purge "' + match + '" /var/cache/idabic/')
        subprocess.call('/tmp/purge "' + match + '" /var/cache/idabic/', shell=True)
        self.wfile.write('OK.1')


class ThreadedHTTPServer(ThreadingMixIn, HTTPServer):
    """Handle requests in a separate thread."""


if __name__ == '__main__':
    server = ThreadedHTTPServer(('', 6969), Handler)
    print 'Starting server, use <Ctrl-C> to stop'
    server.serve_forever()
NOTE: the extra imports belong to parts of this script I've removed, as they were used by the GET and POST handlers. Run the server and call it on port 6969 (defined in the script under ThreadedHTTPServer) as follows:
curl "http://localhost:6969/?match=(localhost\/).*(\.png)" -X DELETE
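A word of caution on the handler above: it builds a shell command by string concatenation from user input and runs it with shell=True, which opens it to shell injection. A safer sketch (Python 3, with the same hypothetical /tmp/purge path) parses the query string properly and passes arguments as a list, so no shell is involved and the regex reaches the script verbatim:

```python
import subprocess
from urllib.parse import urlparse, parse_qs

def purge(path, script="/tmp/purge", cachedir="/var/cache/idabic/"):
    """Extract the 'match' regex from a request path and run the purge
    script with it. Returns the script's exit code, or None if no
    'match' parameter was supplied."""
    qs = parse_qs(urlparse(path).query)
    match = qs.get("match", [None])[0]
    if match is None:
        return None
    # argv list instead of shell=True: no quoting or injection issues.
    return subprocess.call([script, match, cachedir])
```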
At the beginning I created two server blocks to handle two virtual hosts, where each responds to requests carrying either "localhost" or "local" as the Host header. I used the same backend server intentionally so I can show the relation between cache keys and the purging script. First, let's make sure two sample files are cached for each vhost:
~$ curl -I localhost/wp-content/uploads/2015/06/splash-cp.png -H 'Host: localhost'
HTTP/1.1 200 OK
ID: HIT

~$ curl -I localhost/wp-content/uploads/2015/06/splash-cp.png -H 'Host: local'
HTTP/1.1 200 OK
ID: HIT

~$ curl -I localhost/wp-content/uploads/2015/06/home-map-1.png -H 'Host: localhost'
HTTP/1.1 200 OK
ID: HIT

~$ curl -I localhost/wp-content/uploads/2015/06/home-map-1.png -H 'Host: local'
HTTP/1.1 200 OK
ID: HIT
Now to execute the purge call:
curl "localhost:6969/?match=(local\/).*(\.png)" -X DELETE
The expected result is that requests with Host "local" are removed from cache and now show ID: MISS, while the rest stay intact:
$ curl -I localhost/wp-content/uploads/2015/06/splash-cp.png -H 'Host: localhost'
HTTP/1.1 200 OK
ID: HIT

$ curl -I localhost/wp-content/uploads/2015/06/home-map-1.png -H 'Host: local'
HTTP/1.1 200 OK
ID: MISS

$ curl -I localhost/wp-content/uploads/2015/06/home-map-1.png -H 'Host: localhost'
HTTP/1.1 200 OK
ID: HIT

$ curl -I localhost/wp-content/uploads/2015/06/home-map-1.png -H 'Host: local'
HTTP/1.1 200 OK
ID: MISS
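Why only the "local" entries were purged comes straight from the cache keys: the regex from the purge call is matched against the stored keys, and "localhost/..." never contains the literal substring "local/". A quick Python check against the four keys from this showcase:

```python
import re

# The purge regex, matched against the cache keys ($http_host$uri$is_args$args).
pattern = re.compile(r"(local/).*(\.png)")
keys = [
    "localhost/wp-content/uploads/2015/06/splash-cp.png",
    "local/wp-content/uploads/2015/06/splash-cp.png",
    "localhost/wp-content/uploads/2015/06/home-map-1.png",
    "local/wp-content/uploads/2015/06/home-map-1.png",
]
purged = [k for k in keys if pattern.search(k)]
print(purged)  # only the two keys starting with "local/"
```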
I hope this comes in handy to whom it may concern.