Co-Written by Hari Dara and uenz
Last week I’ve had to wrangle with Python’s documentation because I needed one of the apps I’m writing to centrally remove sensitive information from the logs. After several attempts at understanding the documentation, I’ve come to appreciate the built-in logging framework. I also have to admit that it’s more powerful than Ruby’s built in logger.
There are definitely more than one approach to solve this problem, but I will discuss a couple of approaches below:
Creating a logger
To create a logger, use the class method (aka module-level function)
logging.gettLogger()
. For example:
1
2
3
import logging
logger = logging.getLogger("myapp")
This creates a logging object named “myapp”. Calling this same line from anywhere in your application will return the same logging object. You can also create hierarchies by separating names using dot notation. For example:
1
2
3
import logging
logger = logging.getLogger("myapp.my_object")
Creates a logger that is a descendant of the “myapp” logger we created above.
This is handy and allows you to just say logging.getLogger(__name__)
in your
modules and they will be organized automatically.
After you’ve created your logger, creating log entries is similar to other logging frameworks:
1
2
3
4
5
logger.debug("OHAI, I CAN HAZ DEBUG MSG?")
logger.info("I'M IN UR LOGGER. INFOING LOGS")
logger.warning("YOU HAZ RIGHT TO REMAIN SILENT")
logger.error("ERROR MSGZ")
logger.critical("OH NOEZ!")
More where that came from.
Configuring your loggers
There are different ways to centrally configure your application’s logs, but I prefer to use YAML. For example:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
version: 1
disable_existing_loggers: False
formatters:
simple:
format: '%(asctime)s %(name)-30s %(levelname)-7s %(message)s'
handlers:
file:
class: logging.handlers.RotatingFileHandler
formatter: simple
filename: /var/log/myapp/myapp.log
root:
level: DEBUG
handlers:
- file
Let’s break this down:
version
tells the logging framework what configuration file schema version we’re using.disable_existing_loggers
tells the framework what to do with loggers that were created before we configured itformatters
defines the formatters that we later use in the config file. The variables in theformat
key are actually attributes of the LogRecord class. The30s
and7s
suffixes tell the formatter the minimum number of chars to allocate to the string. Handy for lining up your entries.handlers
declares and configures the handlers that we want to use with our loggers. In this example, we have a handler namedfile
which uses the built in RotatingFileHandler class.- Notice how we associated our
simple
formatter with ourfile
handler in line 11? - Finally, we define our root logger which is the ancestor of all loggers we will create from here onwards. We set it to log from levels
DEBUG
and up and also specify that it use ourfile
handler. - Alternatively, you could define other loggers here such as
myapp
and use the same method that we used withroot
to configure it. - Notice how, in our
root
logger we havehandlers
(plural)? That means we can add as many handlers as we want here.
Before we can use this YAML file though, we have to convert it into a dictionary
so that Python’s logging framework can understand it. Simple enough. Let assume we
have a configure_logging.py
file with the following contents:
1
2
3
4
5
6
7
8
import logging
import logging.config
import yaml
with open("path/to/yaml", 'r') as the_file:
config_dict = yaml.load(the_file)
logging.config.dictConfig(config_dict)
After that, calling logging.getLogger()
from anywhere in your app will give you
a logger configured as the above.
Redacting logs using a Filter
Here’s a detailed flowchart of how the logging framework handles log messages. The handler flow on the right side gives us our first hint on how to centrally filter out sensitive information from our logs.
Declaring a filter requires that we create a new section in our config file called
filters
like so:
1
2
3
4
5
6
7
filters:
myfilter:
(): filters.RedactingFilter
patterns:
- "filterthisstring"
- "thistoo"
- "metoo!"
I declared only one filter there and the class name is RedactingFilter
as indicated
by the ()
key. All other keys will be supplied to the RedactingFilter
constructor. In
the above example, an array of strings called patterns
will be passed. Let’s
see how the RedactingFilter
class might be defined. Let’s assume we have a filters.py
file with the following contents:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
import logging
class RedactingFilter(logging.Filter):
def __init__(self, patterns):
super(RedactingFilter, self).__init__()
self._patterns = patterns
def filter(self, record):
record.msg = self.redact(record.msg)
if isinstance(record.args, dict):
for k in record.args.keys():
record.args[k] = self.redact(record.args[k])
else:
record.args = tuple(self.redact(arg) for arg in record.args)
return True
def redact(self, msg):
msg = isinstance(msg, basestring) and msg or str(msg)
for pattern in self._patterns:
msg = msg.replace(pattern, "<<TOP SECRET!>>")
return msg
- Notice how the
RedactingFilter
class inherits fromlogging.Filter
- As you can see, our initializer expects a
patterns
argument - Our second method
filter
is called automatically by the framework. This corresponds with the last diamond in this flowchart. If this method returns True, the log is written. Otherwise it is rejected. In our case, it always approves all log entries. - Notice how our
filter
method expects arecord
argument. This is always aLogRecord
instance which we can manipulate according to our needs. In our case, we take the log’smsg
and replace all occurances of the patterns we declared in the config file with “«TOP SECRET!»”. You’ll notice that it’s just a small step from here to regular expressions. I’ll let you experiment with that on your own. :-)
Finally, let’s attach this filter to our handler:
1
2
3
4
5
6
7
handlers:
file:
class: logging.handlers.RotatingFileHandler
formatter: simple
filename: /var/log/myapp/myapp.log
filters:
- myfilter
Notice how filters
is an array. Therefore we can add as many filters as we need here.
Redacting logs using a Custom Formatter
In this approach, we will use a custom formatter objects that simply wrap the real formatter objects such that, messages can be intercepted right after they have been formatted but before the handler
gets a chance to write them. This has the advantage that it is very simple, but unlike the previous approach, it can only be configured programmatically (which is actually good in a way, as it will always get applied irrespective of what the user configures).
First we will create a formatter class which is just a wrapper that simply proxies all the original formatter attributes, except for the format()
method. Let’s assume we have a formatters.py file with the following contents:
1
2
3
4
5
6
7
8
9
10
11
12
13
class RedactingFormatter(object):
def __init__(self, orig_formatter, patterns):
self.orig_formatter = orig_formatter
self._patterns = patterns
def format(self, record):
msg = self.orig_formatter.format(record)
for pattern in self._patterns:
msg = msg.replace(pattern, "***")
return msg
def __getattr__(self, attr):
return getattr(self.orig_formatter, attr)
In the code right after logging
is configured (see), we need to add the below code to wrap all formatters (a word of caution: the below code assumes that all the handlers are on the root logger, but this may not always be the case):
1
2
for h in logging.root.handlers:
h.setFormatter(RedactingFormatter(h.formatter, patterns=['hide me','me also']))
Additional Reading
There’s more to the logging framework than what I’ve discussed here. So go ahead and read some more from the official docs.