Sunday, April 6, 2014

Sysdig - new Linux tracing tool for sysadmins

Last week the company Draios made a bold move: they open-sourced their Linux tracing tool, Sysdig.
What is Sysdig? As its own website puts it: "strace + tcpdump + lsof + awesome sauce".
And I think the tool really is quite awesome.
Installation for daredevils is quite simple:
curl -s https://s3.amazonaws.com/download.draios.com/stable/install-sysdig | sudo bash
But for more responsible sysadmins there's a good manual: set up the sysdig repo and install the package from it (you'll need the Linux headers and DKMS for the automatic kernel module build and installation).
You can start learning sysdig with these simple examples:
  • See the top processes in terms of network bandwidth usage:
    sysdig -c topprocs_net
  • See the top local server ports (in terms of total bytes):
    sysdig -c fdbytes_by fd.sport
  • See the top client IPs (in terms of established connections):
    sysdig -c fdcount_by fd.cip "evt.type=accept"
  • See the top processes in terms of disk usage:
    sysdig -c topprocs_file
  • Print the top files that apache has been reading from or writing to:
    sysdig -c topfiles_bytes proc.name=httpd
  • List the processes that are using a high number of files:
    sysdig -c fdcount_by proc.name "fd.type=file"
  • See the files where most time has been spent:
    sysdig -c topfiles_time
  • See queries made via apache to an external MySQL server, in real time:
    sysdig -A -c echo_fds fd.sip=192.168.30.5 and proc.name=apache2 and evt.buffer contains SELECT
You can also record all events server-wide (or for a single process, or with any other sysdig filter):
sysdig -w out.scap proc.name=httpd
and analyze the capture later with sysdig -r out.scap, even on a Mac or Windows workstation.
There is also a Lua scripting framework, chisels: you can write a simple script and run it immediately from sysdig.
However, there's still an open question: how much additional load does sysdig put on the server?
Let's run a simple test. I have a small virtual machine running Ubuntu 12.04 with 1 GB of RAM and Percona MySQL 5.6 installed.
  1. Install sysbench:
    sudo apt-get install sysbench
  2. Create an empty database 'sbtest' and fill it with test data:
    sysbench --test=oltp --mysql-table-engine=innodb --oltp-table-size=10000 --mysql-user=root --mysql-password=root --db-driver=mysql prepare

  3. Run sysbench:
    root@ubuntu:~# sysbench --num-threads=8 --max-requests=5000 --oltp-table-size=10000 --mysql-user=root --mysql-password=root --db-driver=mysql --test=oltp run

  4. Results:
    sysbench 0.4.12:  multi-threaded system evaluation benchmark
    
    Running the test with following options:
    Number of threads: 8
    
    Doing OLTP test.
    Running mixed OLTP test
    Using Special distribution (12 iterations,  1 pct of values are returned in 75 pct cases)
    Using "BEGIN" for starting transactions
    Using auto_inc on the id column
    Maximum number of requests for OLTP test is limited to 5000
    Threads started!
    Done.
    
    OLTP test statistics:
        queries performed:
            read:                            70014
            write:                           25003
            other:                           10001
            total:                           105018
        transactions:                        5000   (123.58 per sec.)
        deadlocks:                           1      (0.02 per sec.)
        read/write requests:                 95017  (2348.49 per sec.)
        other operations:                    10001  (247.19 per sec.)
    
    Test execution summary:
        total time:                          40.4587s
        total number of events:              5000
        total time taken by event execution: 323.5886
        per-request statistics:
             min:                                  5.83ms
             avg:                                 64.72ms
             max:                               8020.75ms
             approx.  95 percentile:             168.71ms
    
    Threads fairness:
        events (avg/stddev):           625.0000/24.51
        execution time (avg/stddev):   40.4486/0.01
    
Now delete the sbtest database, reboot the virtual machine, and repeat steps 1 and 2.
  1. Run sysdig in a separate terminal:
    root@ubuntu:~# sysdig -w /root/mysqld.scap proc.name=mysqld
  2. Re-run the test:
    root@ubuntu:~# sysbench --num-threads=8 --max-requests=5000 --oltp-table-size=10000 --mysql-user=root --mysql-password=root --db-driver=mysql --test=oltp run

  3. Results:
    sysbench 0.4.12:  multi-threaded system evaluation benchmark
    
    Running the test with following options:
    Number of threads: 8
    
    Doing OLTP test.
    Running mixed OLTP test
    Using Special distribution (12 iterations,  1 pct of values are returned in 75 pct cases)
    Using "BEGIN" for starting transactions
    Using auto_inc on the id column
    Maximum number of requests for OLTP test is limited to 5000
    Threads started!
    Done.
    
    OLTP test statistics:
        queries performed:
            read:                            70014
            write:                           25002
            other:                           10001
            total:                           105017
        transactions:                        5000   (71.62 per sec.)
        deadlocks:                           1      (0.01 per sec.)
        read/write requests:                 95016  (1360.97 per sec.)
        other operations:                    10001  (143.25 per sec.)
    
    Test execution summary:
        total time:                          69.8150s
        total number of events:              5000
        total time taken by event execution: 558.1830
        per-request statistics:
             min:                                  9.35ms
             avg:                                111.64ms
             max:                               1590.65ms
             approx.  95 percentile:             304.89ms
    
    Threads fairness:
        events (avg/stddev):           625.0000/39.17
        execution time (avg/stddev):   69.7729/0.02
    
    
So the average request time has almost doubled: 111 ms instead of 65 ms. Not very impressive. To be honest, though, the test was quite artificial and not very methodologically sound...
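The overhead can be put into numbers straight from the two reports. Here is a minimal sketch in Python, assuming the sysbench 0.4.12 output format shown above; the excerpts are copied from the two runs, and the helper name parse is only an illustration:

```python
import re

# Lines copied from the two sysbench reports above.
BASELINE = """
    transactions:                        5000   (123.58 per sec.)
    total time:                          40.4587s
         avg:                                 64.72ms
"""
WITH_SYSDIG = """
    transactions:                        5000   (71.62 per sec.)
    total time:                          69.8150s
         avg:                                111.64ms
"""


def parse(report):
    """Extract throughput, wall-clock time and mean latency from a report."""
    tps = re.search(r"transactions:\s+\d+\s+\(([\d.]+) per sec\.\)", report)
    total = re.search(r"total time:\s+([\d.]+)s", report)
    avg = re.search(r"avg:\s+([\d.]+)ms", report)
    return {
        "tps": float(tps.group(1)),
        "total_s": float(total.group(1)),
        "avg_ms": float(avg.group(1)),
    }


base = parse(BASELINE)
traced = parse(WITH_SYSDIG)

# Relative change of each metric with sysdig capturing in the background.
latency_ratio = traced["avg_ms"] / base["avg_ms"]
throughput_drop = 1 - traced["tps"] / base["tps"]
time_ratio = traced["total_s"] / base["total_s"]

print(f"avg latency: x{latency_ratio:.2f}")
print(f"throughput:  -{throughput_drop:.0%}")
print(f"total time:  x{time_ratio:.2f}")
```

On these numbers the mean latency and the total run time both grow by roughly 1.7x, and throughput drops by about 42%. A single run per configuration says nothing about variance, so treat the figures as a rough indication only.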

Sunday, February 9, 2014

FreeBSD Journal Rant

OK, in the year of our Lord 2013 some people set out to publish a magazine dedicated to FreeBSD: the FreeBSD Journal. The idea was to put it on iPad and Android tablets; it's the XXI century, right? OK, that's a good initiative.
And in 2014 the magazine finally came out. And what do we see? In the Apple App Store and the Amazon Appstore (NOT in Google Play, although they say a version for Play is on the way) we must "buy" a (free) app (!).
Let's download it, run it, and here is what we see:

WHAT???
What login? What password? Where do I need to register? Why do I need to register?
Why is everything repeated twice? Isn't 7 USD a bit much for a single issue in an electronic format that isn't even PDF/ePUB?

And all this despite the fact that Amazon, the App Store and Google Play all offer a way to sell magazines in "one click": without additional registration, without entering your payment information...
Yes, it costs some money, but even if the fee is the same 30% of the price (I don't know, maybe the rates are different for magazines, of course), really, has FreeBSD become such a "niche thing" that giving up 30% of the price would not be repaid by a several-fold increase in audience? :(

Sunday, January 19, 2014

Big Data is easy

Big Data is very easy. :)
1. Read http://adambard.com/blog/top-github-languages-for-2013-so-far/
2. Check http://www.githubarchive.org/
3. Check https://developers.google.com/bigquery/
4. Read https://github.com/igrigorik/githubarchive.org/tree/master/bigquery
5. Change Adam's query to
SELECT repository_language, count(repository_language) AS repos_by_lang
FROM [githubarchive:github.timeline]
WHERE repository_fork == "false"
AND type == "CreateEvent"
AND PARSE_UTC_USEC(repository_created_at) >= PARSE_UTC_USEC('2013-01-01 00:00:00')
AND PARSE_UTC_USEC(repository_created_at) < PARSE_UTC_USEC('2014-01-01 00:00:00')
GROUP BY repository_language
ORDER BY repos_by_lang DESC
LIMIT 100
6. Run it on BigQuery - Query complete (2.3s elapsed, 6.80 GB processed)
7. PROFIT:
Apart from GitHub's language detection errors (for example, most projects that include jQuery are detected as JavaScript), it would be interesting to find the top language by number/size of commits, but I think that's not very easy to do: GitHub doesn't report commits, only pushes... OK, let's do it for pushes:

SELECT repository_language, count(repository_language) as pushes
FROM [githubarchive:github.timeline]
WHERE type="PushEvent"
AND PARSE_UTC_USEC(created_at) >= PARSE_UTC_USEC('2013-01-01 00:00:00')
AND PARSE_UTC_USEC(created_at) < PARSE_UTC_USEC('2014-01-01 00:00:00')
GROUP BY repository_language
ORDER BY pushes DESC
LIMIT 100
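
For readers who don't speak SQL, here is a rough sketch in plain Python of what the push query computes: filter the events by type and date, count them per language, and sort by count. The sample rows and the function name pushes_by_language are made up for illustration (the real table has many more columns), and NULL languages are not handled:

```python
from collections import Counter

# Made-up sample rows standing in for the github.timeline table.
events = [
    {"type": "PushEvent", "repository_language": "JavaScript", "created_at": "2013-03-01 10:00:00"},
    {"type": "PushEvent", "repository_language": "Ruby", "created_at": "2013-05-12 08:30:00"},
    {"type": "PushEvent", "repository_language": "JavaScript", "created_at": "2013-11-30 23:59:59"},
    {"type": "WatchEvent", "repository_language": "JavaScript", "created_at": "2013-06-01 00:00:00"},
    {"type": "PushEvent", "repository_language": "Perl", "created_at": "2014-01-01 00:00:00"},
]


def pushes_by_language(rows, start="2013-01-01 00:00:00",
                       end="2014-01-01 00:00:00", limit=100):
    """Count PushEvent rows per language within [start, end)."""
    # Timestamps in the fixed "YYYY-MM-DD HH:MM:SS" format sort
    # correctly as plain strings, so no date parsing is needed here.
    counts = Counter(
        row["repository_language"]
        for row in rows
        if row["type"] == "PushEvent" and start <= row["created_at"] < end
    )
    # Equivalent of ORDER BY pushes DESC LIMIT 100.
    return counts.most_common(limit)


print(pushes_by_language(events))
```

The point of BigQuery is that it does the same filter-and-count over gigabytes of events in a couple of seconds.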


The results are almost the same.
Anyway, now you know that Big Data is not scary at all :)



Sunday, September 1, 2013

Perl 5 Internals

Long time, no posts... Shame on me.
Nothing more than a couple of interesting links for weekend reading.
A nice post from Rob Hoelz about Perl 5 internals, worth reading (an understanding of perlguts or illguts is required).
And something totally different: Cloudera has a nice post about Hadoop hardware planning.

Sunday, March 31, 2013

Saturday, March 30, 2013

Scala news :)


The Russian version of this post is here.
There's a good article on the LinkedIn engineering blog: "Play Framework: async I/O without the thread pool and callback hell". As you could guess, it's about programming in Scala with the Play Framework. :) It starts from the event-driven vs. threaded programming models, and then explains how Scala-specific features help avoid the flaws of event-driven programming, such as callback hell.
What if you don't know Scala? You can master it with the fun game Scalatron: you'll need to write bots that compete in a virtual arena for energy and survival.
But if you prefer more traditional learning, Martin Odersky himself is running the second iteration of the Functional Programming Principles in Scala course on coursera.org, starting March 25; you still have time to sign up. It's quite interesting and not very hard: even I managed to finish it (although not with very great results :)
IMHO Scala is quite an interesting language. I think it's somewhat similar to Perl, not in syntax but in... I don't know... impression? It's also quite concise (especially compared to Java) and powerful.

Saturday, March 16, 2013

Velocity 2011: Theo Schlossnagle, "Career Development"

A very honest talk from Theo, I like it. A must-see for Ops people. :)