How to disable IP address logging for Apache web server and Tomcat

The General Data Protection Regulation (GDPR) prohibits excessive or unnecessary collection of personal data. IP adresses in server log files are considered personal data.

Logging of IP addresses is usually enabled by default in fresh web server installations. This article describes how to disable it for the Apache web server and the Tomcat application server, which are a common combination for JVM based web applications.

Apache Web Server

The apache web server logs the HTTP requests from web clients in log files called *access.log, which includes IP addresses. Another apache log file, which can contain IP addresses is error.log.

To disable IP address logging for Apache edit the main configuration file, usually called httpd.conf and scan it for LogFormat entries, for example:

LogFormat "%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" combined

The format ‘verb’ %h stands for the hostname or IP address. This is what you want to remove. You may also want to remove the “Referer” and “User-Agent” components of the log format for privacy:

LogFormat "%l %u %t \"%r\" %>s %O" combined

Restart the server for the changes to take effect.

Tomcat Application Server

For Tomcat you have to configure the so-called AccessLogValve in the server.xml configuration file located in the $CATALINA_HOME/conf directory. The configuration entry looks like this:

<Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs"
       prefix="localhost_access_log." suffix=".txt"
       pattern="%h %l %u %t "%r" %s %b" />

The pattern attribute specifies the log format. Again, the relevant format verb you want to remove is %h for the hostname or IP address:

       pattern="%l %u %t "%r" %s %b" />

Restart the server for the changes to take effect.

Making CherryPy Application WSGI compatible

When choosing a micro web framework evolving it to fit your needs is key. As CherryPy is one of our choices I want to show you how to evolve it in terms of web server. Of course you can use the embedded CherryPy web server in development and for small sites. It is fast enough for many use cases and supports important features like SSL so you may come a long way just using it. There are several reasons to put your CherryPy behind a tried and trusted native web server like Apache or nginx:

  • Consistent production environment using different application servers (e.g. for Java and Python) using a powerful and uniform frontend
  • Many options and possibilites using Apache modules
  • Well known and understood environment for administrators
  • Separation of web-facing http server concerns and your web application
  • Improved performance and security

Making CherryPy a WSGI-compatible

The good news is that CherryPy application objects are already a WSGI-compliant application. So creating a like the following will enable integration with mod_wsgi of Apache:

def application(environ, start_response):
    cherrypy.tree.mount(MyApp(), script_name=None, config=None)
    return cherrypy.tree(environ, start_response)

Integrating with Apache’s mod_wsgi

It is quite easy to integrate a Python WSGI application with apache using mod_wsgi. If the module is present you just need to add some directives telling Apache where to mount the wsgi application defined by your script:

WSGIScriptAlias /my_app /path/to/
# May be required to allow your web app using libraries installed on the system
<Directory /usr/lib/python2.7/site-packages/ >
    Order deny,allow
    Allow from all
    Require all granted

After you have such a setup working properly you can consult the mod_wsgi documentation on how to improve in regards to threading, script reloading etc.

Configuring the WSGI-app

Many web applications need some form of configuration. Your application should not make assumptions on its install location or some directory structure. Generally speaking, an application should never assume that it can use relative path names for accessing the filesystem. Also access to operating system environment variables is dangerous because the application may run in different contexts. But we can specify WSGI-environment variables in the web servers’ configuration. An easy and safe way is to provide the configuration directory and other values using WSGI-environment variables that we can specify in the mod_wsgi configuration:

WSGIScriptAlias /my_app /path/to/
SetEnv configuration_dir /etc/my_shiny_web_app

We can access the wsgi-environment in python like so:

def application(environ, start_response):
    configdir = environ['configuration_dir']
    cherrypy.config.update(os.path.join(configdir, 'global.conf'))

    cherrypy.tree.mount(MyApp(), config=os.path.join(configdir, 'my_app.conf'))
    return cherrypy.tree(environ, start_response)

Note: Because your web app can be mounted to other locations than “/” on the the web server your application should not hard-code absolute links and the like. They all will be dead if your app is mounted at a different location.