
Wednesday, September 12, 2018

Level up logs and ELK - VRR Curator configuration

Articles index:

  1. Introduction (Everyone)
  2. JSON as logs format (Everyone)
  3. Logging best practices with Logback (Targeting Java DEVs)
  4. Logging cutting-edge practices (Targeting Java DEVs)
  5. Contract first log generator (Targeting Java DEVs)
  6. ElasticSearch VRR Estimation Strategy (Targeting OPS)
  7. VRR Java + Logback configuration (Targeting OPS)
  8. VRR FileBeat configuration (Targeting OPS)
  9. VRR Logstash configuration and Index templates (Targeting OPS)
  10. VRR Curator configuration (Targeting OPS)
  11. Logstash Grok, JSON Filter and JSON Input performance comparison (Targeting OPS)

VRR Curator configuration

 


The last piece of the puzzle: Curator takes care of deleting information from ElasticSearch once it has expired according to our policies.

There are two ways of deleting information from ElasticSearch using its API:
  1. By deleting documents through a query on their timestamp -> SLOW; deleting a document requires modifying indices and caches.
  2. By deleting indices as a whole -> FAST, but you cannot choose what to delete inside an index; it all goes down.
The way to get fast whole-index deletion while still removing only old information is to create one index per day (or another granularity) and let Curator parse the date from the index name.
Curator can then delete the oldest indices while keeping the newest.
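
For illustration, dropping a whole daily index is a single cheap call, while the query-based alternative has to touch every matching document. The index name below is hypothetical, following the naming convention used later in this series:

# Fast: delete an entire daily index in a single call
curl -XDELETE 'localhost:9200/vrr-myservice-low-2018-09-01'

# Slow: delete matching documents one by one (shown only for contrast)
curl -XPOST 'localhost:9200/vrr-myservice-low-*/_delete_by_query' -H 'Content-Type: application/json' -d'
{ "query": { "range": { "@timestamp": { "lt": "now-14d" } } } }'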

Finally, Curator is not a daemon; it needs to be executed on a schedule. Normally once a day is enough (really, as often as your smallest granularity).
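
For instance, a crontab entry along these lines (paths are hypothetical) would run Curator daily:

# Hypothetical crontab entry: run Curator every day at 01:00,
# pointing at the client config and the action file shown below.
0 1 * * * /usr/local/bin/curator --config /etc/curator/curator.yml /etc/curator/vrr-actions.yml >> /var/log/curator.log 2>&1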

Configuration example:

File available here

---
actions:
  1:                                   # sets the order of execution
    action: delete_indices             # the action to perform
    description: >-
      Delete indices older than 371 days (based on index name), for vrr-*-crit-
      prefixed indices. Application/Service name is irrelevant, but this only
      applies to "crit" indices. Ignore the error if the filter does not result in an
      actionable list of indices (ignore_empty_list) and exit cleanly.
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: False            # otherwise the action would not execute
    filters:
    - filtertype: pattern
      kind: regex
      # Applies to all VRR critical indices, no matter the application.
      # This is only valid if you get an agreement that retention is the same
      # for all applications of the same importance.
      value: vrr-.*-crit-
      exclude:
    - filtertype: age                  # second filter: we are filtering by age
      source: name
      direction: older
      timestring: '%Y-%m-%d'           # date format in the index name (set in Logstash!)
      unit: days
      unit_count: 371                  # delete if older than 371 days ("days" defined in unit, just above)
      exclude:
  2:
    action: delete_indices
    description: >-
      Delete indices older than 90 days (based on index name), for vrr-*-imp-
      prefixed indices. Application/Service name is irrelevant, but this only
      applies to "imp" indices. Ignore the error if the filter does not result in an
      actionable list of indices (ignore_empty_list) and exit cleanly.
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: regex
      value: vrr-.*-imp-               # applies to all VRR important indices
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y-%m-%d'
      unit: days
      unit_count: 90                   # delete if older than 90 days
      exclude:
  3:
    action: delete_indices
    description: >-
      Delete indices older than 14 days (based on index name), for vrr-*-low-
      prefixed indices. Application/Service name is irrelevant, but this only
      applies to "low" indices. Ignore the error if the filter does not result in an
      actionable list of indices (ignore_empty_list) and exit cleanly.
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: regex
      value: vrr-.*-low-               # applies to all VRR low-importance indices
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y-%m-%d'
      unit: days
      unit_count: 14                   # delete if older than 14 days
      exclude:
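
Before scheduling this for real, Curator's --dry-run flag can be used to log what the action file would delete without actually deleting anything (same hypothetical paths as above):

curator --dry-run --config /etc/curator/curator.yml /etc/curator/vrr-actions.yml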


Next: 11 - Logstash Grok, JSON Filter and JSON Input performance comparison

Level up logs and ELK - VRR FileBeat configuration

Articles index:

  1. Introduction (Everyone)
  2. JSON as logs format (Everyone)
  3. Logging best practices with Logback (Targeting Java DEVs)
  4. Logging cutting-edge practices (Targeting Java DEVs)
  5. Contract first log generator (Targeting Java DEVs)
  6. ElasticSearch VRR Estimation Strategy (Targeting OPS)
  7. VRR Java + Logback configuration (Targeting OPS)
  8. VRR FileBeat configuration (Targeting OPS)
  9. VRR Logstash configuration and Index templates (Targeting OPS)
  10. VRR Curator configuration (Targeting OPS)
  11. Logstash Grok, JSON Filter and JSON Input performance comparison (Targeting OPS)

VRR FileBeat configuration



Filebeat doesn't need much configuration for JSON log files, just our typical agreement between parties:
  • DEVs agree to
    • use JSON for logs,
    • use VRR as the log retention strategy,
    • use an "importance" JSON field for VRR, with values LOW, IMP, CRIT,
    • treat a missing "importance" field as LOW importance.
  • OPS agree to
    • take this file and apply retention and replication depending on those fields,
    • add "service" in Filebeat for the application name,
    • add "environment" in Filebeat where applicable,
    • add "logschema":"vrr" to distinguish this common approach to logs.

As the contract is mostly the same for all applications, the Filebeat configuration is very reusable: one entry per application per box.

A working example of this file can be found here.

- type: log
  enabled: true                  # important ;)
  paths:
    - /path/to/logFile.json
  encoding: utf-8
  fields:
    logschema: vrr               # this value will be reused in the Logstash configuration
    service: leveluplogging      # application / service name
    environment: production      # optional, but recommended where applicable
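
As a quick sanity check, recent Filebeat versions can validate the configuration and the connection to the output before the agent is left running (the config path is hypothetical):

filebeat test config -c /etc/filebeat/filebeat.yml
filebeat test output -c /etc/filebeat/filebeat.yml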


Next: 9 - VRR Logstash configuration and Index templates

Level up logs and ELK - VRR Logstash configuration

Articles index:

  1. Introduction (Everyone)
  2. JSON as logs format (Everyone)
  3. Logging best practices with Logback (Targeting Java DEVs)
  4. Logging cutting-edge practices (Targeting Java DEVs)
  5. Contract first log generator (Targeting Java DEVs)
  6. ElasticSearch VRR Estimation Strategy (Targeting OPS)
  7. VRR Java + Logback configuration (Targeting OPS)
  8. VRR FileBeat configuration (Targeting OPS)
  9. VRR Logstash configuration and Index templates (Targeting OPS)
  10. VRR Curator configuration (Targeting OPS)
  11. Logstash Grok, JSON Filter and JSON Input performance comparison (Targeting OPS)

VRR Logstash configuration and Index templates

The Logstash configuration from the example can be found here.

input {
  beats {
    port => 5044
    codec => json
  }
}

output {
    if [fields][logschema] == "vrr" {        # for ALL VRR applications
        if [importance] == "CRIT" {          # for ALL CRITICAL LINES
            elasticsearch {                  # send to the LOGSCHEMA-SERVICE-IMPORTANCE-DATE index (vrr-loggingup-crit-2018-09-10) with template-max.json
                hosts => "localhost:9200"
                index => "vrr-%{[fields][service]}-crit-%{+YYYY-MM-dd}"   # ONE INDEX PER APPLICATION, IMPORTANCE AND DAY
                template => "/path/to/templates/template-max.json"        # USING THIS TEMPLATE, SEE BELOW!
                template_overwrite => true
                template_name => "vrr-max"
            }
        } else if [importance] == "IMP" {    # for ALL IMPORTANT LINES
            elasticsearch {                  # send to the LOGSCHEMA-SERVICE-IMPORTANCE-DATE index (vrr-loggingup-imp-2018-09-10) with template-mid.json
                hosts => "localhost:9200"
                index => "vrr-%{[fields][service]}-imp-%{+YYYY-MM-dd}"    # ONE INDEX PER APPLICATION, IMPORTANCE AND DAY
                template => "/path/to/templates/template-mid.json"        # USING THIS TEMPLATE, SEE BELOW!
                template_overwrite => true
                template_name => "vrr-mid"
            }
        } else {                             # for BOTH "LOW" AND UNTAGGED LINES
            elasticsearch {                  # send to the LOGSCHEMA-SERVICE-IMPORTANCE-DATE index (vrr-loggingup-low-2018-09-10) with template-min.json
                hosts => "localhost:9200"
                index => "vrr-%{[fields][service]}-low-%{+YYYY-MM-dd}"    # ONE INDEX PER APPLICATION, IMPORTANCE AND DAY
                template => "/path/to/templates/template-min.json"        # USING THIS TEMPLATE, SEE BELOW!
                template_overwrite => true
                template_name => "vrr-min"
            }
        }
    } else {                                 # OTHER, NON-VRR APPLICATIONS
        elasticsearch {
            hosts => "localhost:9200"
            index => "logstash-classic-%{[fields][service]}-%{+YYYY-MM-dd-HH}"   # STILL ONE SEPARATE INDEX PER APPLICATION AND HOUR
        }
    }
}
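
A syntax check before reloading Logstash can catch typos in blocks like the one above; the pipeline path is hypothetical:

bin/logstash -f /etc/logstash/conf.d/vrr.conf --config.test_and_exit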

Template template-max.json (here)

{
  "index_patterns": ["vrr-*-crit-*"],   // FOR ALL INDICES THAT MATCH THIS EXPRESSION
  "order" : 1,                          // ORDER 1 OVERRIDES ORDER-0 SETTINGS (default values like number of shards or mappings)
  "settings": {
    "number_of_replicas": 2             // WE WANT 2 EXTRA COPIES BESIDES THE PRIMARY
  }
}

Template template-mid.json (here)

{
  "index_patterns": ["vrr-*-imp-*"],    // FOR ALL INDICES THAT MATCH THIS EXPRESSION
  "order" : 1,                          // ORDER 1 OVERRIDES ORDER-0 SETTINGS (default values like number of shards or mappings)
  "settings": {
    "number_of_replicas": 1             // WE WANT ONE EXTRA COPY BESIDES THE PRIMARY
  }
}

Template template-min.json (here)

{
  "index_patterns": ["vrr-*-low-*"],    // FOR ALL INDICES THAT MATCH THIS EXPRESSION
  "order" : 1,                          // ORDER 1 OVERRIDES ORDER-0 SETTINGS (default values like number of shards or mappings)
  "settings": {
    "number_of_replicas": 0             // WE DON'T WANT EXTRA COPIES, JUST THE PRIMARY
  }
}
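
Once Logstash has created the first indices, the installed templates can be checked through ElasticSearch's template API (hostname as in the examples above):

curl -XGET 'localhost:9200/_template/vrr-max?pretty'
curl -XGET 'localhost:9200/_template/vrr-mid?pretty'
curl -XGET 'localhost:9200/_template/vrr-min?pretty'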


Next: 10 - VRR Curator configuration


Level up logs and ELK - VRR Java + Logback configuration

Articles index:

  1. Introduction (Everyone)
  2. JSON as logs format (Everyone)
  3. Logging best practices with Logback (Targeting Java DEVs)
  4. Logging cutting-edge practices (Targeting Java DEVs)
  5. Contract first log generator (Targeting Java DEVs)
  6. ElasticSearch VRR Estimation Strategy (Targeting OPS)
  7. VRR Java + Logback configuration (Targeting OPS)
  8. VRR FileBeat configuration (Targeting OPS)
  9. VRR Logstash configuration and Index templates (Targeting OPS)
  10. VRR Curator configuration (Targeting OPS)
  11. Logstash Grok, JSON Filter and JSON Input performance comparison (Targeting OPS)

VRR Java + Logback configuration



Applying VRR to a Java + Logback application.


An example application defining importance per line using Structured Arguments.

Product Owner, OPS and Developers have agreed to use the tag/flag/mark "importance", with possible values "LOW" for lowest importance, "IMP" for mid importance and "CRIT" for critical importance.

Example code without comments available here.


import static java.lang.System.currentTimeMillis;
import static net.logstash.logback.argument.StructuredArguments.kv;

import java.util.UUID;

import net.logstash.logback.argument.StructuredArgument;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;

public class VRR {

    private static final String IMPORTANCE = "importance";
    private static final StructuredArgument LOW = kv(IMPORTANCE, "LOW");   //CREATING OBJECTS
    private static final StructuredArgument IMP = kv(IMPORTANCE, "IMP");   //TO REUSE AND
    private static final StructuredArgument CRIT = kv(IMPORTANCE, "CRIT"); //AVOID REWRITING

    private static final Logger logger = LoggerFactory.getLogger(VRR.class);

    public static void main(String[] args) {
        MDC.put("rid", UUID.randomUUID().toString()); //SAME MDC USAGE STILL AVAILABLE
        try {
            long startTime = currentTimeMillis();
            someFunction();
            logger.info("important message, useful to do some metrics {} {}",
                    kv("elapsedmillis", currentTimeMillis() - startTime),
                    IMP); //IMPORTANT MESSAGE
        } catch (Exception e) {
            logger.error("This is a low importance message as it won't have value after a few weeks",
                          e); //A LOW IMPORTANCE MESSAGE, AS IT IS NOT TAGGED
        }
    }

    static void someFunction() throws Exception {
        logger.info("low importance message, helps to trace errors, begin someFunction {} {} {}",
                kv("user", "anavarro"),
                kv("action", "file-create"),
                LOW); //LOW IMPORTANCE TAGGED MESSAGE, SLIGHTLY REDUNDANT, SAME AS UNTAGGED

        Thread.sleep(500L); //some work

        logger.info("critical message, audit trail for user action {} {} {}",
                kv("user", "anavarro"),
                kv("action", "file-create"),
                CRIT); //CRITICAL MESSAGE
    }
}
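
For reference, StructuredArguments and the JSON encoder used below come from the logstash-logback-encoder library; a Maven dependency roughly like this is assumed (the version shown is only indicative of the time of writing):

<dependency>
    <groupId>net.logstash.logback</groupId>
    <artifactId>logstash-logback-encoder</artifactId>
    <version>5.2</version> <!-- indicative version: use a current one -->
</dependency>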

The previously mentioned logback.xml configuration:

<configuration>
    <appender name="stash" class="ch.qos.logback.core.rolling.RollingFileAppender">
        <file>logFile.json</file>
        <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
            <fileNamePattern>file.log.%d{yyyy-MM-dd}</fileNamePattern>
            <maxHistory>30</maxHistory>
        </rollingPolicy>
        <encoder class="net.logstash.logback.encoder.LoggingEventCompositeJsonEncoder">
            <providers>
                <timestamp/>
                <threadName/>
                <mdc/>
                <loggerName/>
                <message/>
                <logLevel/>
                <arguments/>
                <stackTrace/>
                <stackHash/>
            </providers>
        </encoder>
    </appender>

    <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
        <encoder>
            <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
        </encoder>
    </appender>

    <root level="all">
        <appender-ref ref="stash"/>
        <appender-ref ref="STDOUT"/>
    </root>
</configuration>

Putting code and configuration together, we get JSON log files like the following:


{"@timestamp":"2018-09-11T00:05:11.746+02:00","thread_name":"main",
 "rid":"7fac3070-0d7e-40a6-a3e8-246ec95e86e7","logger_name":"VRR", 
"message":"low importance message, helps to trace errors, begin someFunction 
user=anavarro action=file-create importance=LOW","level":"INFO","user":"anavarro", 
"action":"file-create","importance":"LOW"}

{"@timestamp":"2018-09-11T00:05:12.271+02:00","thread_name":"main",
 "rid":"7fac3070-0d7e-40a6-a3e8-246ec95e86e7","logger_name":"VRR",
 "message":"critical message, audit trail for user action 
user=anavarro action=file-create importance=CRIT","level":"INFO","user":"anavarro", 
"action":"file-create","importance":"CRIT"}

{"@timestamp":"2018-09-11T00:05:12.274+02:00","thread_name":"main", 
"rid":"7fac3070-0d7e-40a6-a3e8-246ec95e86e7","logger_name":"VRR",
 "message":"important message, useful to so some metrics elapsedmillis=528 importance=IMP", 
"level":"INFO","elapsedmillis":528,"importance":"IMP"}
 

Next: 8 - VRR FileBeat configuration