Logger for D Design ARCHITECTURE ------------ The logger will consist of three distinct components: 1) A public interface to allow a user to write messages to a log. 2) A background thread to periodically poll the configuration file for changes. 3) A writer thread that handles the actual writing of log messages to some location. The writer thread is the most interesting as it will support being able to use a user-defined log writer method that conforms to a specific interface. At least one log writer will be provided by default, therefore this is not needed. The reason for this capability is not only to make it easier to add different writers in the future, but to remain flexible. Individuals may require a custom solution, such as logging to a database or a custom file format. The current plan is to store function pointers in an associative array. A mixin will be provided that accepts an ID/name for the function and a pointer to the function. The function can then be specified in the configuration file. Also, it is expected that any writer can be configured using the same configuration file. INITIALIZATION -------------- Some basic initialization needs to occur for the logger to work. This is because two threads are spawned in the background and I also want to capture the location where the logging is occurring. There are two possible techniques for each. Thread Initialization --------------------- The first method, and probably the most desirable, is to have thread initialization occur using a static constructor. This way the user does not have to ensure that it was done before using the logger. This should be very feasible as the constructor would be part of the class that does the log writing, which would be used by the methods described below under LOGGING. E.g. a simplified version of the log function could look like this: void log(enum level, string msg) { logWriter.writeToLog(level, msg); } The second method is the one currently used, which is to have the user call a constructor and the constructor starts the threads. Logger Initialization --------------------- If a constructor is explicitly called, then the location where the logging is occurring can be passed in as a parameter. However, it appears that the more desirable way to do this is using a mixin. This way, we can use __FILE__ as the default. Unfortunately, it means that something similar to this needs to appear in the code: mixin(initLogger!()) The location is important as it lets the user know where a log message originated from. However, it is a pain to have to include it in every log message, therefore being able to have it set once in a module/class ensures we get this information without being a chore. Note: If mixins could be called like normal functions, this could be even cleaner as the "log" statements could possibly be mixins. However, having a person do this every time seems very ugly: mixin(log!(INFO, "...")); The big question here is whether a mixin, or mixin template, can be used as a Variadic Template. LOGGING ------- The current plan is to have all logging statements look similar to this: log(INFO, "some message", 91, " foo"); logf(DEBUG, "%s - %d", "Current value", 57); logc((a > b), TRACE, "'a' should not be greater than 'b'"); logcf((a % 10 == 0), INFO, "%s - %d", "current count", count); However, if we go with a class-based approach, then they would look more like this: logger.log(INFO, "some message", 91, " foo"); Aliases can then be used to make them look like this: logger.INFO("some message", 91, " foo"); Writing to the log ------------------ Each of these methods are simply wrappers around the log writer function. These wrappers are responsible for checking the current log level and any user-provided boolean conditions to determine if logging should occur. If logging is to occur, then the message is passed to the log writer. Writing to the log will be done by sending messages to a writer thread. This was done primarily to support applications that use multiple threads, however it also helps ensure that logging has a minimal impact on the normal operation of a program. NOTE: Need to determine if send/receive support variadic parameters. If not, all formatting will need to be done before passing the message to the log writer. If it is possible, formatting can be performed in the log writer and help keep the logging overhead from impacting program execution. Buffered vs. Unbuffered Writes ------------------------------ When writing out the log entries, one may be tempted to try to improve performance by buffering the entries and only writing them out periodically. After examining some scenarios, this appears to be the incorrect solution. Whether one only cares about errors or a developer is trying to debug an application, ensuring the capturing the log entries is very important. No entries should be lost, therefore unbuffered writes are more desirable. Writing to stderr ----------------- Log entries can be written to stderr, however they will also be copied to a log file. The reason for this is to ensure that the output can be reviewed as needed without needing to execute the program again. NOTE: Will this work if the writer is in a separate thread? If not, may need to drop this feature. Timestamps ---------- There are two possible times to record the timestamp: when the log statement is called and when the message is written to the log. The advantage of the first is that the timestamps are guaranteed to be in the order that the messages were written, however when they are written the order may not be preserved with respect to messages from other threads. The latter will generate the timestamps before writing, thus the ordering may not be quite right with respect to the messages, but correct in respect to when they are written. Also, the creation and formatting of the timestamp is handled in the writer thread, thus less overhead with respect to the rest of the program. The current plan is to create the timestamp upon writing the message. This ensures that program execution is not impacted and to ensure that there is no confusion due to timestamps potentially being out of order. There currently is a configuration option that is required due to std.date reporting UTC time and not local time. Once this is resolved, it can be removed. Until then, to get the correct timestamp the timezone needs to be adjusted. COMPILATION ----------- A compile-time option will be provided to turn on/off logging as desired. The question is which is the preferred solution: 1) Provide a version identifier to turn logging on. 2) Provide a version identifier to turn logging off. Option 2 may be preferable because if one has taken the time to add logging, chances are it is a key part of the system. Also, it is hoped that if someone decides to turn off logging, they do so because they are very confident that it is not necessary. E.g. the developer is very confident that the software works properly. CONFIGURATION FILE ------------------ To keep things simple, a configuration file will be used to store various settings. This not only keeps the code free of configuration information, options can be changed during runtime and a background thread will reread the configuration file periodically. This was done in anticipation of long-running applications, specifically servers, where stopping and starting the application is not desirable. The exact options are currently not know, however the following are anticipated: minlevel - The minimum level to log. tzoffset - The timezone offset. This is used to ensure the logger writes the local time to the log and not UTC time. filename - The name and location of the logfile. writer - The name of the log writer to user. NOTES ----- - Most initialization will occur in a static constructor. - Background threads - Read initial values from the configuration file. - Individual files/modules can have their own log levels. - Each file/module has a unique ID. - IDs are keys to a has table (AA) containing log level values. - Synchronization may cause performance issues. - Need to figure out. - Set up the logger to know the current file that it is in. - Mixin? E.g. every file starts with something like: mixin(initLogger!()); - "location" defaults to __FILE__, but user can override. - If embedding classes, this may be very useful. - Setting are contained within a config file. - Keeps code cleaner. - Settings can be changed without stopping the program. - Version option available to remove all logging during compilation. - Logging to stderr is also logged to a log file. - If for some reason the amount of data is large, the log file is also available to peruse. - Logger can accept formatted or unformatted input. - Logger decide to log or not log via user-supplied boolean statements. - Timestamp - When written or when the application calls a log function? - If not when written, timestamps may appear out of order. (Not confirmed) - Still have the timezone offset issue... LEVELS ------ fatal - Used for when an error has occured where the only solution is to terminate the application. error - Used for when an error occured, but it is recoverable. warning - Used for when something has occured during program execution that still results in a valid execution of the program, but may have unexpected side effects or can indicate that a problem is going to occur. E.g. compiler warnings or noting that the number of allowed connections is close to the limit. info - Used for recording key aspects of the normal execution of the program. This can be useful for servers to know what aspects of the server is being used the most. debug - Used for capturing the minimal amount of information for debugging a program. Some examples are entering and leaving methods, capturing the original and transformed copies of data, and recording when key steps in an algorithm are performed. trace - Used for when normal debug level logging is not enough. This should be used for very fine grain logging, such as recording the state of every variable or each step of an algorithm. Per "Module" Levels ------------------- Glog has a feature where specific modules can have different log levels associated with them. For example, the default log level may be set to "warning", but a module responsible for parsing a complex file may be set to "debug" as there may still be some issues that occur during testing or production use. Currently, the only way to properly implement this is to have only the log writer thread determine if the message is to be written to the log. This means that every log message, regardless of level, is passed to the log writer thread. The concern here is that if there is a significant amount of logging in the system, the threads mailbox can be filled very quickly. If a logger is instantiated in a new location at runtime, the log writer code would have to handle this accordingly. If the location is currently not known, it's level would be set to the minimum level. Otherwise, it's level would be set to whatever is specified for that location in the config file.