I came across a situation where I needed to generate a bz2-compressed archive of a bunch of files extracted from GridFS. This process is going to run regularly, so I had to consider the performance hit on the server. I felt it would be best to take the data as it is extracted from GridFS and write it directly to the compressed archive, instead of writing each file to disk first and then adding it to the archive.

Python has a library for generating tarballs called tarfile. This was actually very useful since you can write files directly to a bz2 compressed archive. The issue I ran into is that in order for the data coming out of GridFS to be written to the archive, it had to be written as if it were a file (with file attributes). Using any sort of file IO would force me to write to disk, and that’s not what I wanted to do.

Luckily there is StringIO. Using this in combination with tarfile’s TarInfo object, this became a very easy task to accomplish:

import tarfile
import time
from StringIO import StringIO

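# Open the archive for writing with bz2 compression.
tar = tarfile.open("sometarfile.tar.bz2", "w:bz2")
for gridfs_file in gridfs_files:
    # TarInfo carries the metadata tar expects for each member.
    info = tarfile.TarInfo(name="%s" % gridfs_file.name)
    info.mtime = time.time()
    info.size = len(gridfs_file.data)
    # StringIO wraps the in-memory data in a file-like object.
    tar.addfile(info, StringIO(gridfs_file.data))
tar.close()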

The basic idea is that you use TarInfo to specify the filename, size, modified time, etc. (the modified time is important; otherwise tar will complain about an implausibly old timestamp). You use StringIO to turn your data into an object tarfile will accept, and you use the two together to add the file to the archive. This works really well, except for one issue that I am still working on. If you bunzip2 the compressed archive and then attempt to do a tar -t on it, it hangs and does nothing. It’s possible that GNU tar has a problem, or that the way tarfile is creating the file isn’t correct, but it does decompress and explode properly, which is the important part!
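If you are on Python 3, the StringIO module no longer exists and tarfile wants bytes, so the same idea can be sketched with io.BytesIO instead. This is only a rough sketch under a few assumptions: build_archive is a made-up helper name, the (name, data) pairs are a hypothetical stand-in for whatever comes out of GridFS, and the archive itself is built in memory purely so the example runs on its own.

```python
import io
import tarfile
import time


def build_archive(files, fileobj):
    """Write (name, data) pairs into a bz2-compressed tar on fileobj.

    `files` is a hypothetical stand-in for the GridFS results: any
    iterable of (name, bytes) tuples.
    """
    with tarfile.open(fileobj=fileobj, mode="w:bz2") as tar:
        for name, data in files:
            info = tarfile.TarInfo(name=name)
            info.mtime = int(time.time())  # otherwise mtime stays at 0
            info.size = len(data)          # must match the bytes handed over
            tar.addfile(info, io.BytesIO(data))


# Build the archive in memory, then read it back to check the contents.
buffer = io.BytesIO()
build_archive([("a.txt", b"hello"), ("b.txt", b"world!")], buffer)

buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r:bz2") as tar:
    names = tar.getnames()
    first = tar.extractfile("a.txt").read()
```

Passing a real file opened in binary mode instead of the BytesIO buffer should give you an archive on disk without ever writing the individual members out first.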

It’s that time of year again! Tonight marks the start of CarolinaCon 7. If you’ve never been, it is a great little tech conference put on by nc2600. It’s grown over the years and is now up to around 200 attendees. I’ve been very lucky to be around for the last 3 (including the one this weekend) and have also been extremely lucky to have the honor of giving a talk at each one.

This year I will be giving a talk on Malware Identification and Classification. Specifically, I will be showing how to do this using Yara and Python. Since malware has become a major problem and is exploding in growth, I thought it would be a great topic to talk about. If you’re in the Raleigh area and want to attend, the conference is extremely cheap to get into, and it gives you access to an entire weekend full of over 15 talks, trivia, lock picking, capture the flag, and more! If you don’t get a chance to attend but are still interested in my talk, I’ll be posting the slides and demo content after I’m done presenting. My talk is at 2pm tomorrow. Also, on Sunday Gerry Brunelle will be giving a talk on Malware Analysis, which fits in beautifully with my talk. Between those two talks, you should have a great intro to the world of Malware.

Hope to see you there!

I’m heading up to DC tomorrow for ShmooCon. I’ll potentially be stopping by “the mothership” and then heading out for a dinner meeting with management if they aren’t snowed in. After that, it’s all con all the time until Sunday evening. I’m looking forward to the great times we shall have and the great stories to share. Hopefully this year we won’t get snowed in by 50+” of snow (even though, looking back on it, it was pretty awesome).

Once I get back and I can detox from the con, I’ll post a summary of the shenanigans that went down. If you’re going, hope to see you there!

Tabs and spaces are code killers, visually and syntactically. Dealing with Python, you might often see an “unexpected indent” error when sharing code with others. Focusing on VIM, people love to set up their rcfile so they get the most out of what VIM has to offer and so their code is easy to write and easy to read. Unfortunately, many times it’s just easy to read for them and no one else.

Python is very picky about spacing and indentation since it’s used to determine the flow of your script or program. Miss a tab or a series of spaces somewhere and all of a sudden you are executing code that’s not inside the if statement you just made, or you’ll get errors like the one above because your code just flat out fails.
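To see how picky, here is a small sketch that feeds a few snippets to the built-in compile() and reports which indentation error comes back, if any. It assumes Python 3, which is stricter about mixing tabs and spaces than Python 2 was; indentation_problem is just a name I made up for the example.

```python
def indentation_problem(source):
    """Return the name of the indentation error in source, or None."""
    try:
        compile(source, "<example>", "exec")
    except IndentationError as exc:  # TabError is a subclass of this
        return type(exc).__name__
    return None


clean = "if True:\n    x = 1\n    y = 2\n"
stray = "x = 1\n    y = 2\n"                  # second line indented for no reason
mixed = "if True:\n\tx = 1\n        y = 2\n"  # a tab on one line, spaces on the next

results = [indentation_problem(s) for s in (clean, stray, mixed)]
```

On Python 3 the clean snippet compiles, the stray indent is reported as an IndentationError, and the tab/space mix is reported as a TabError, even though the last one can look perfectly lined up in an editor.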

When dealing with Git, and having several people committing code to the same script or project, you are bound to run into situations where tabstops and shiftwidths are in conflict. It may look like all of the code lines up visually, but where one person uses spaces, another may have used actual tabs. Then you notice they made over 100 changes to the code you’re working on, and there are tabs and spaces mixed all over the place. You have the following in your .vimrc:

set expandtab
set tabstop=4
set shiftwidth=4

It would be great to fix all of the tabs and spaces to be the same. Now what? VIM to the rescue! Bear with me, this is a bit complicated. Once you open the file in VIM, execute the following:

:%retab

You’re done, by the way. This uses the settings in your .vimrc to convert all of the tabs in the file. Now your entire file has consistent spacing and you can say goodbye to your unexpected indent errors. If you find yourself having to do this often, you can bind a key to automagically make the changes for you by adding this to your .vimrc (using F2 in this example):

nnoremap <F2> :retab<CR>:wq!<CR>

This will convert all of the tabs to spaces, save the file, and quit VIM for you. Enjoy!
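If you ever need the same cleanup outside of VIM (in a script that runs over a whole repo, for example), a rough Python equivalent is str.expandtabs. This is only a sketch of the idea, not what VIM does internally: retab_text is a made-up helper name, and it assumes you always want tabs expanded to spaces with a tabstop of 4, like the .vimrc settings above.

```python
def retab_text(text, tabstop=4):
    """Replace each tab with spaces up to the next multiple of tabstop.

    str.expandtabs resets its column count at every newline, so the
    whole file contents can be passed in as one string.
    """
    return text.expandtabs(tabstop)


tabbed = "def f():\n\tif True:\n\t\treturn 1\n"
fixed = retab_text(tabbed)

# Tabs in the middle of a line are padded to the next tab stop too.
aligned = retab_text("a\tb")
```

Keep in mind this touches every tab in the text, including any inside string literals, so it is worth looking at the diff before committing the result.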

Ever since I started gaming when I was a kid, everyone used WASD for their primary movement controls. Back when I was a kid it made sense too. The keys are in the same layout as the arrow keys, but they allow you to have access to other keys around it (with the added bonus of not having to reach across the keyboard with your left hand). But by the time I got to college, WASD didn’t work for me anymore:

  • My hands grew and my fingers felt very cramped.
  • The W key seemed way too shifted to the left to be comfortable, resulting in my index and ring finger fighting for key space.
  • Pushing my hand so far left on the keyboard resulted in me having access to fewer keys easily.

I spent time thinking of a better keybinding layout. Moving my hand to the middle of the keyboard started to get uncomfortable, and my wrist started to curve in a way that made using an “arrow keys” layout impossible. Here’s what I wanted to accomplish:

  • More comfortable, natural keybinding layout for my hand.
  • A layout that also gave me access to more keys quickly.
  • The new keys that are available need to be used in a way that makes sense.
  • The layout needs to work across multiple genres of games so I am always using the same comfortable binding scheme.

While I was thinking of a new layout, I noticed my hand resting on certain keys of the keyboard, and it hit me. My new layout was EASF. The interesting thing is by putting my hand on these keys, a couple things happen naturally:

  • My pinky finger rests nicely on the shift key (used commonly for walk/run/use) and CTRL becomes easier to hit (which I use for Ventrilo).
  • My thumb rests nicely on the space bar (used commonly for jumping).
  • Instead of just having Q available to my index/ring finger, I now had W as well (I tend to use W for things like Flashlight, which are normally bound to F).
  • R for reload and G for grenade are much easier to use, and T and 5 are exposed as new easy-access keys.
  • Instead of using C for duck/crouch, I can now use D, which is much closer to the rest of my movement keys.
  • Z, X, C, and V are easier to hit now that my hand isn’t cramped up and can be used for misc stuff (although I use V as push-to-talk in-game voice chat primarily).
  • TAB is still just as easy to hit except now I don’t have to raise my pinky off of shift to get to it.

I tried using RSDG (shifting one key to the right). It’s also pretty comfortable, but I find that hitting TAB becomes annoying, SHIFT is ok but sometimes my pinky hits Z, CTRL is too hard to use quickly, H doesn’t make sense for grenade and T doesn’t make sense for reload, and Q becomes difficult to hit, making it useless.

Of course, you need to use what is right for you, but from a strategic standpoint, I found that moving away from WASD made using the keyboard so much easier and more comfortable. Maybe EASF isn’t for you, but I highly recommend experimenting and finding the best keybinding layout for your hand. I’ve been using this layout for 9 years and it hasn’t let me down yet. It’s also really fun to watch people try and play a game on my machine!