You can configure the Common Log Format (CLF) Apache parser for both FileLog and WinLog collectors.

Common Log Format (Apache) Parser

The default CLF parser defines the following order and names of fields.

host ident authuser datetime request statuscode bytes

Parser name: clf

The CLF parser-specific option is format.

format Option

The format option specifies the format with which Apache logs are generated. The option is not mandatory.

If no format is specified, the following default common log format is used.

%h %l %u %t \"%r\" %s %b

The CLF parser format string does not accept regular expressions. For example, specify a space instead of the expression \s+.
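The literal treatment of the format string can be illustrated with a minimal Python sketch. It is not the agent's implementation: the compile_format helper and the reduced FIELD_NAMES table are assumptions made for illustration, covering only the specifiers of the default format. Every character that is not a specifier is escaped, so it matches only itself.

```python
import re

# Subset of specifiers, mapped to the field names listed in this section.
FIELD_NAMES = {
    "h": "remote_host",
    "l": "remote_log_name",
    "u": "remote_auth_user",
    "t": "timestamp",
    "r": "request",
    "s": "status_code",
    "b": "response_size",
}


def compile_format(fmt):
    """Build a regex from a format string; non-specifier text is literal."""
    fmt = fmt.replace('\\"', '"')  # the configuration writes quotes as \"
    parts = []
    i = 0
    while i < len(fmt):
        if fmt[i] == "%":
            i += 1
            if fmt[i] == ">":  # %>s selects the final status
                i += 1
            # Unsupported specifiers raise KeyError in this sketch.
            parts.append("(?P<%s>.*?)" % FIELD_NAMES[fmt[i]])
        else:
            parts.append(re.escape(fmt[i]))  # literal, never a regex
        i += 1
    return re.compile("^" + "".join(parts) + "$")


DEFAULT_FORMAT = r'%h %l %u %t \"%r\" %s %b'
line = ('127.0.0.1 - - [13/May/2015:14:44:05 +0400] '
        '"GET /xampp/navi.php HTTP/1.1" 200 4023')
fields = compile_format(DEFAULT_FORMAT).match(line).groupdict()
print(fields["request"], fields["status_code"], fields["response_size"])
```

In this sketch, a format such as %h\s+%l matches only input that contains the literal characters \s+ between the two fields, which is why a plain space is the right separator.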

To parse other log formats, specify that format in the agent's configuration. Parsed fields appear on the server side with the following names.

Note: For specifiers that require a variable, the field is ignored if {VARNAME} is not provided in the configuration.

Fields Value
'%a': "remote_ip"
'%A': "local_ip"
'%B', '%b': "response_size"
'%C': Depends on the name of variable specified in the format
'%c': Depends on the name of variable specified in the format
'%D': "request_time_mcs"
'%E': "error_status"
'%e': Depends on the name of variable specified in the format
'%F', '%f': "file_name"
'%h': "remote_host"
'%H': "request_protocol"
'%i': Depends on the name of variable specified in the format
'%k': "keepalive_request_count"
'%l': "remote_log_name"
'%L': "request_log_id"
'%M': "log_message" (parser stops parsing the input log after reaching this specifier)
'%m': "request_method"
'%n': Depends on the name of variable specified in the format
'%o': Depends on the name of variable specified in the format
'%p': "server_port"

Additional formats can be used with this specifier: %{format}p. Supported formats are "canonical", "local", or "remote". When the "canonical" format is used, the field name remains as "server_port". When the "local" format is used, the field name will be "local_server_port", and when the "remote" format is used, the field name will be "remote_server_port".

'%P': "process_id"

Additional formats can be used with this specifier: %{format}P. Supported formats are "pid", "tid", and "hextid". When the "pid" format is used, the field name is "process_id". The "tid" and "hextid" formats generate a field named "thread_id".

'%q': "query_string"
'%r': "request"
'%R': "response_handler"
'%s': "status_code", which generates the final status of the request.
'%t': "timestamp", which serves as the event timestamp on ingestion and engages the timestamp parser. To override automatic timestamp detection, specify the date and time format in curly braces, for example %{%Y-%m-%d %H:%M:%S}t. See Timestamp Parser for more details.

The timestamp format for the CLF parser can start with the "begin:" or "end:" prefix. If the format starts with begin: (default), the time is taken at the beginning of the request processing. If it starts with end:, it is the time when the log entry is written, close to the end of the request processing. For example, the CLF parser supports formats such as the following: %h %l %u [%{begin:%d/%b/%Y %T}t.%{msec_frac}t] \"%r\" %>s %b

The following format tokens are also supported for the CLF parser's timestamp format specifier:
sec: Number of seconds since the Epoch. Equivalent to the Timestamp parser's %s specifier.
msec: Number of milliseconds since the Epoch.
usec: Number of microseconds since the Epoch.
msec_frac: Millisecond fraction. Equivalent to the Timestamp parser's %f specifier.
musec: Microsecond fraction. Equivalent to the Timestamp parser's %f specifier.
To parse logs where timestamp is represented with format tokens, the following formats can be used in the configuration:
format=%h %l %u %{sec}t \"%r\" %s %b
format=%h %l %u %{msec}t \"%r\" %s %b
format=%h %l %u %{usec}t \"%r\" %s %b

These tokens cannot be combined with each other or with Timestamp parser formatting in the same format string. Use multiple %{format}t tokens instead. For example, to parse a timestamp that includes milliseconds without using the Timestamp parser's %f specifier, use the following combined timestamp: %{%d/%b/%Y %T}t.%{msec_frac}t

'%T': "request_time_sec"
'%u': "remote_auth_user"
'%U': "requested_url"
'%v': "server_name"
'%V': "self_referential_server_name"
'%X': "connection_status", which depends on the name of variable specified in the format
'%x': Depends on the name of variable specified in the format
'%I': "received_bytes"
'%O': "sent_bytes"
'%S': "transferred_size"
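The timestamp tokens described for the %t specifier in the table above can be illustrated with a short Python sketch. It is only an approximation of the behavior described here, not the agent's code. The helper names from_epoch_token and from_combined are hypothetical, and the sketch assumes that the msec_frac value is a three-digit millisecond fraction.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
UNITS_PER_SECOND = {"sec": 1, "msec": 1_000, "usec": 1_000_000}


def from_epoch_token(token, value):
    """Convert a %{sec}t, %{msec}t, or %{usec}t value to a UTC datetime."""
    micros = int(value) * (1_000_000 // UNITS_PER_SECOND[token])
    return EPOCH + timedelta(microseconds=micros)


def from_combined(text):
    """Parse text logged as %{%d/%b/%Y %T}t.%{msec_frac}t (no time zone)."""
    base, msec_frac = text.rsplit(".", 1)
    parsed = datetime.strptime(base, "%d/%b/%Y %H:%M:%S")
    return parsed.replace(microsecond=int(msec_frac) * 1000)


# The same instant expressed with each epoch token.
print(from_epoch_token("sec", "1431513845").isoformat())
print(from_epoch_token("msec", "1431513845000").isoformat())
print(from_epoch_token("usec", "1431513845000000").isoformat())
# Two %{format}t tokens joined by a literal dot.
print(from_combined("13/May/2015 14:44:05.042").isoformat())
```

Each epoch token carries the whole timestamp on its own, which is consistent with the rule that these tokens are not mixed inside one %{format}t specifier.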

For example, to parse logs collected from either winlog or filelog sources with the CLF parser, specify the following configuration:

[filelog|clflogs]
directory=D:\Logs
include=*.txt
parser=myclf

[parser|myclf]
debug=yes ;Note: use this option only while debugging and set it to ‘no’ when used in production.
base_parser=clf
format=%h %l %u %b %t \"%r\" %s

With this configuration, logs that are collected from the clflogs source, in this example from the D:\Logs directory, are parsed by myclf. The myclf parser parses only those logs that were generated with the format described in the configuration.

For parsers, the default value of the debug option is no.

Parsing Logs that were Generated Using CLF

To parse logs that were generated using CLF, you must define the corresponding format in the configuration. For example,

format=%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User_Agent}i\"

Non-empty fields captured by the specifiers %{Referer}i and %{User_Agent}i appear on the VMware Aria Operations for Logs server with the names referer and user_agent, respectively.
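The naming of variable-based fields can be sketched in Python as follows. This is an illustration under stated assumptions rather than the agent's logic. Lowercasing the variable name is inferred from the two examples above, treating "-" as an empty value is an assumption, and the helper names variable_field_name and collect_fields are hypothetical.

```python
import re

# Specifiers whose field name depends on the variable in curly braces.
VARIABLE_SPECIFIER = re.compile(r"%(?:\{([^}]+)\})?[Cceinox]")


def variable_field_name(specifier):
    """Return the field name for a specifier, or None if no variable is given."""
    match = VARIABLE_SPECIFIER.fullmatch(specifier)
    if match is None:
        raise ValueError("not a variable-based specifier: %r" % specifier)
    name = match.group(1)
    return name.lower() if name else None  # assumption: names are lowercased


def collect_fields(pairs):
    """Keep only named, non-empty fields from (specifier, value) pairs."""
    result = {}
    for specifier, value in pairs:
        name = variable_field_name(specifier)
        if name is None or value in ("", "-"):  # assumption: "-" means empty
            continue
        result[name] = value
    return result


pairs = [
    ("%{Referer}i", "http://localhost/xampp/"),
    ("%{User_Agent}i", "-"),
    ("%i", "value without a variable name"),
]
print(collect_fields(pairs))
```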

Integrating the Timestamp Parser with the CLF Parser

You can parse Apache logs with a custom time format.

For access logs that have a custom time format, specify the format as follows.

format = %h %l %u %{%a, %d %b %Y %H:%M:%S}t \"%r\" %>s %b

If a custom time format is specified, the CLF parser uses it. Otherwise, the parser attempts to deduce the time format automatically by running the automatic timestamp parser.
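The choice between a custom time format and automatic detection can be sketched in Python. The candidate list below is hypothetical and only stands in for the automatic timestamp parser, whose actual detection rules are not described in this section.

```python
from datetime import datetime

# Hypothetical stand-in for automatic detection: formats tried in order.
CANDIDATE_FORMATS = [
    "%d/%b/%Y:%H:%M:%S %z",
    "%Y-%m-%d %H:%M:%S",
]


def parse_time(text, custom_format=None):
    """Use the custom format when given; otherwise try the candidates."""
    if custom_format is not None:
        return datetime.strptime(text, custom_format)
    for candidate in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text, candidate)
        except ValueError:
            continue
    raise ValueError("no known time format matches %r" % text)


# Custom format from the example above: %{%a, %d %b %Y %H:%M:%S}t
print(parse_time("Wed, 13 May 2015 14:44:05", "%a, %d %b %Y %H:%M:%S"))
# No custom format: fall back to the candidate list.
print(parse_time("2015-05-13 14:44:05"))
```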

The following custom time formats are supported for error logs:

Custom Time Format | Description | Configuration Format
%{u}t | Current time including microseconds | format=[%{u}t] [%l] [pid %P] [client %a] %M
%{cu}t | Current time in compact ISO 8601 format, including microseconds | format=[%{cu}t] [%l] [pid %P] [client %a] %M

For a full list of supported timestamp specifiers, see Timestamp Parser.

Apache Default Access Logs Configuration for Windows

This example shows how you can format Apache v2.4 access log configurations for Windows.

;ACCESS LOG
;127.0.0.1 - - [13/May/2015:14:44:05 +0400] "GET /xampp/navi.php HTTP/1.1" 200 4023 "http://localhost/xampp/" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:37.0) Gecko/20100101 Firefox/37.0"
;format=%h %l %u %{%d/%b/%Y:%H:%M:%S %z}t \"%r\" %>s %b \"%{Referer}i\" \"%{User_agent}i\"
 
; Section to collect Apache ACCESS logs
[filelog|clflogs-access]
    directory=C:\xampp\apache\logs
    include=acc*
    parser=clfparser_apache_access
    enabled=yes

;Parser to parse Apache ACCESS logs
[parser|clfparser_apache_access]
    debug=yes
    base_parser=clf
    format=%h %l %u %{%d/%b/%Y:%H:%M:%S %z}t \"%r\" %>s %b \"%{Referer}i\" \"%{User_agent}i\"
Define the access log format:
  1. Configure Apache for the access log format (httpd.conf):
     LogFormat "%h %l %u %{%d-%b-%Y:%H:%M:%S}t \"%r\" %a %A %e %k %l %L %m %n %T %v %V %>s %b \"%{Referer}i\" \"%{User_Agent}i\"" combined
    
  2. Define the CLF parser configuration:
;ACCESS LOG
;127.0.0.1 unknown - 21-May-2015:13:59:35 "GET /xampp/navi.php HTTP/1.1" 127.0.0.1 127.0.0.1 - 0 unknown - GET - 1 localhost localhost 200 4023 "http://localhost/xampp/" "-"
[filelog|clflogs-access]
    directory=C:\xampp\apache\logs
    include=acc*;_myAcc*
    parser=clfparser_apache_access
    enabled=yes
; Parser to parse Apache ACCESS logs
[parser|clfparser_apache_access]
   debug=yes
   base_parser=clf
   format=%h %l %u %{%d-%b-%Y:%H:%M:%S}t \"%r\" %a %A %e %k %l %L %m %n %T %v %V %>s %b \"%{Referer}i\" \"%{User_Agent}i\"
For the sample entry logged at 13/May/2015:14:44:05 +0400, the CLF parser returns the following fields:
remote_host=127.0.0.1
timestamp=2015-05-13T10:44:05
request=GET /xampp/navi.php HTTP/1.1
status_code=200
response_size=4023
referer=http://localhost/xampp/
user_agent=Mozilla/5.0 (Windows NT 6.1; WOW64; rv:37.0) Gecko/20100101 Firefox/37.0
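The timestamp value in the fields above differs from the logged time because of the +0400 offset. A minimal Python sketch of that conversion follows. The helper name to_utc_iso is hypothetical, and the sketch assumes the reported timestamp is expressed in UTC.

```python
from datetime import datetime, timezone


def to_utc_iso(raw, fmt="%d/%b/%Y:%H:%M:%S %z"):
    """Parse a logged time with its offset and render it in UTC."""
    parsed = datetime.strptime(raw, fmt)
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")


print(to_utc_iso("13/May/2015:14:44:05 +0400"))  # 2015-05-13T10:44:05
```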

This example shows how you can format Apache v2.4 error log configurations for Windows.

;ERROR LOG
;[Wed May 13 14:37:17.042371 2015] [mpm_winnt:notice] [pid 4488:tid 272] AH00354: Child: Starting 150 worker threads.
;[Wed May 13 14:37:27.042371 2015] [mpm_winnt:notice] [pid 5288] AH00418: Parent: Created child process 3480
;format=[%{%a %b %d %H:%M:%S%f %Y}t] [%m:%{severity}i] [pid %P:tid %{thread_id}i] %E: %M
;format=[%{%a %b %d %H:%M:%S%f %Y}t] [%m:%{severity}i] [pid %P] %E: %M 
 
; Section to collect Apache ERROR logs
[filelog|clflogs-error]
    directory=C:\xampp\apache\logs
    include=err*
    parser=clfparser_apache_error
    enabled=yes
 
;Parser to parse Apache ERROR logs
[parser|clfparser_apache_error]
    debug=yes
    base_parser=clf
    format=[%{%a %b %d %H:%M:%S%f %Y}t] [%m:%{severity}i] [pid %P:tid %{thread_id}i] %E: %M
    next_parser=clfparser_apache_error2
 
;Parser to parse Apache ERROR logs
[parser|clfparser_apache_error2]
    debug=yes
    base_parser=clf
    format=[%{%a %b %d %H:%M:%S%f %Y}t] [%m:%{severity}i] [pid %P] %E: %M
Note: The provided names correspond to the combined log format. Apache error logs are also described using the above formatting keys, not the Apache error log format.
Define the error log format:
  1. Configure Apache for the error log format (httpd.conf):
     LogFormat "%h %l %u %{%d-%b-%Y:%H:%M:%S}t \"%r\" %a %A %e %k %l %L %m %n %T %v %V %>s %b \"%{Referer}i\" \"%{User_Agent}i\"" combined
    
  2. Define the CLF parser configuration:
;Parser to parse Apache ERROR logs
[parser|clfparser_apache_error]
   debug=yes
   base_parser=clf
   format=[%{%a %b %d %H:%M:%S%f %Y}t] [%m:%{severity}i] [pid %P] %E: %M
   next_parser=clfparser_apache_error2

;Parser to parse Apache ERROR logs
[parser|clfparser_apache_error2]
   debug=yes
   base_parser=clf
   format=[%{%a %b %d %H:%M:%S%f %Y}t] [%m:%{severity}i] [pid %P:tid %{thread_id}i] %E: %M

Log entry:

[Wed May 13 14:37:17.042371 2015] [mpm_winnt:notice] [pid 4488:tid 272] AH00354: Child: Starting 150 worker threads.
The CLF parser returns the following fields for the log entry (if the parser runs in a +0400 time zone):
timestamp=2015-05-13T10:37:17.042371
request_method=mpm_winnt
severity=notice
process_id=4488
thread_id=272
error_status=AH00354
log_message=Child: Starting 150 worker threads.

Log entry:

[Wed May 13 14:37:27.042371 2015] [mpm_winnt:notice] [pid 5288] AH00418: Parent: Created child process 3480
The CLF parser returns the following fields for the log entry (if the parser runs in a +0400 time zone):
timestamp=2015-05-13T10:37:27.042371
request_method=mpm_winnt
severity=notice
process_id=5288
error_status=AH00418
log_message=Parent: Created child process 3480
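The two chained error log parsers can be approximated with a short Python sketch. It assumes that next_parser receives a line when the first format does not match it. The hand-written regular expressions below stand in for the two format strings and are not generated by the agent. Timestamps are kept as raw text.

```python
import re

PREFIX = (r"^\[(?P<timestamp>[^\]]+)\] "
          r"\[(?P<request_method>[^:\]]+):(?P<severity>[^\]]+)\] ")
SUFFIX = r" (?P<error_status>[^:]+): (?P<log_message>.*)$"

# Stand-ins for "[pid %P:tid %{thread_id}i]" and "[pid %P]".
WITH_TID = re.compile(
    PREFIX + r"\[pid (?P<process_id>\d+):tid (?P<thread_id>\d+)\]" + SUFFIX)
WITHOUT_TID = re.compile(PREFIX + r"\[pid (?P<process_id>\d+)\]" + SUFFIX)


def parse_error_line(line, chain=(WITH_TID, WITHOUT_TID)):
    """Try each parser in order; return the fields of the first match."""
    for pattern in chain:
        match = pattern.match(line)
        if match:
            return match.groupdict()
    return None


child = ("[Wed May 13 14:37:17.042371 2015] [mpm_winnt:notice] "
         "[pid 4488:tid 272] AH00354: Child: Starting 150 worker threads.")
parent = ("[Wed May 13 14:37:27.042371 2015] [mpm_winnt:notice] "
          "[pid 5288] AH00418: Parent: Created child process 3480")
print(parse_error_line(child))
print(parse_error_line(parent))
```

In this sketch the order of the chain does not change the result, because each sample line matches only one of the two patterns.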