yahi

Versatile parallel log parser
Download

yahi Ranking & Summary

Advertisement

  • Rating:
  • License:
  • Python License
  • Price:
  • FREE
  • Publisher Name:
  • Julien Tayon
  • Publisher web site:
  • https://github.com/jul/

yahi Tags


yahi Description

yahi is a versatile log parser providing default extractors for apache/lighttpd.Command line usageExample of data parsed with yahi: http://wwwstat.julbox.fr/Simplest usage is:speed_shoot -g /usr/local/data/geoIP /var/www/apache/access*logit will return a json in the form:{ "by_date": { "2012-5-3": 11 }, "total_line": 11, "ip_by_url": { "/favicon.ico": { "192.168.0.254": 2, "192.168.0.35": 2 }, "/": { "74.125.18.162": 1, "192.168.0.254": 1, "192.168.0.35": 5 } }, "by_status": { "200": 7, "404": 4 }, "by_dist": { "unknown": 11 }, "bytes_by_ip": { "74.125.18.162": 151, "192.168.0.254": 489, "192.168.0.35": 1093 }, "by_url": { "/favicon.ico": 4, "/": 7 }, "by_os": { "unknown": 11 }, "week_browser": { "3": { "unknown": 11 } }, "by_referer": { "-": 11 }, "by_browser": { "unknown": 11 }, "by_ip": { "74.125.18.162": 1, "192.168.0.254": 3, "192.168.0.35": 7 }, "by_agent": { "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:12.0) Gecko/20100101 Firefox/12.0,gzip(gfe) (via translate.google.com)": 1, "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:12.0) Gecko/20100101 Firefox/12.0": 10 }, "by_hour": { "9": 3, "10": 4, "11": 1, "12": 3 }, "by_country": { "": 10, "US": 1 }}If you use:speed_shoot -f csv -g /usr/local/data/geoIP /var/www/apache/access*logYour result is:by_date,2012-5-3,11total_line,11ip_by_url,/favicon.ico,192.168.0.254,2ip_by_url,/favicon.ico,192.168.0.35,2ip_by_url,/,74.125.18.162,1ip_by_url,/,192.168.0.254,1ip_by_url,/,192.168.0.35,5by_status,200,7by_status,404,4by_dist,unknown,11bytes_by_ip,74.125.18.162,151bytes_by_ip,192.168.0.254,489bytes_by_ip,192.168.0.35,1093by_url,/favicon.ico,4by_url,/,7by_os,unknown,11week_browser,3,unknown,11by_referer,-,11by_browser,unknown,11by_ip,74.125.18.162,1by_ip,192.168.0.254,3by_ip,192.168.0.35,7by_agent,"Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:12.0) Gecko/20100101 Firefox/12.0,gzip(gfe) (via translate.google.com)",1by_agent,Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:12.0) Gecko/20100101 Firefox/12.0,10by_hour,9,3by_hour,10,4by_hour,11,1by_hour,12,3by_country,,10by_country,US,1Well I guess, it does not work because you first need to fetch geoIP data file:wget -O- "http://www.maxmind.com/download/geoip/database/GeoLiteCountry/GeoIP.dat.gz" | zcat > /usr/local/data/GeoIP.datOf course, this is the geoLite database, I don't include the data in the package since geoIP must be updated often to stay accurate.Default path for geoIP is data/GeoIP.datUse as a scriptspeed shoot is in fact a template of how to use yahi as a module:#!/usr/bin/env pythonfrom archery.bow import Hankyu as _dictfrom yahi import notch, shootfrom datetime import datetimecontext=notch()date_formater= lambda dt :"%s-%s-%s" % ( dt.year, dt.month, dt.day)context.output( shoot( context, lambda data : _dict({ 'by_country': _dict({data: 1}), 'by_date': _dict({date_formater(data): 1 }), 'by_hour': _dict({data.hour: 1 }), 'by_os': _dict({data: 1 }), 'by_dist': _dict({data: 1 }), 'by_browser': _dict({data: 1 }), 'by_ip': _dict({data: 1 }), 'by_status': _dict({data: 1 }), 'by_url': _dict({data: 1}), 'by_agent': _dict({data: 1}), 'by_referer': _dict({data: 1}), 'ip_by_url': _dict({data: _dict( {data: 1 })}), 'bytes_by_ip': _dict({data: int(data)}), 'week_browser' : _dict({data.weekday(): _dict({data :1 })}), 'total_line' : 1, }), ),)Installationeasy as:pip install yahior:easy_install yahiRecommanded usage- for basic log aggregation, I do recommand using command line;- for one shot metrics I recommend an interactive console (bpython or ipython);- for specific metrics or elaborate filters I recommand using the API.Product's homepage


yahi Related Software