Wednesday, February 16, 2011
Closing the Loop -- Re-Engineering Android Applications
This technique is very common in the Android Market right now, with people modifying apps for good and bad reasons -- at some point, Google is going to have to start verifying "good" applications and "responsible" developers to some degree, because the current market is packed with apps that demonstrate varying levels of naughtiness.
Monday, January 31, 2011
mp3collect.go -- reorganizing mp3 files by hashes of their mpeg-1 content
A friend asked me a couple weeks ago for a sample of what a "real" Go program looks like. I have been using Go quite a bit for fuzzers and analysis packages at IOActive for the last few months, but I obviously can't share those with anyone else. On the flight back from Shmoocon, I decided to write a Go program to solve a problem that has been slowly building up in my ~/music directory.
It's a real trainwreck; between cycles of using iTunes and copying my music between devices, I now have a mass of duplicated songs with tweaked ID3 tags, so I cannot simply de-duplicate them by hashing whole files. The solution is to calculate a hash of the actual MPEG frames in each file and ignore all the helpful metadata.
The program does just that by constructing hard links between a file and the hash of its media contents; duplicates are reported and left intact. The plan is to go back through those files and normalize their ID3 metadata using a program that doesn't try to "organize" my music -- Quod Libet. (Or the Android music player, which is too dumb to attempt any of this.) There is, of course, room for improvement -- it does not handle FLAC, OGG or M4A files, which do occur in my library due to certain stores using non-MP3 formats. (Trent Reznor, Rhythmbox and iTunes, respectively.) It also should have a way of properly handling cross-filesystem collections by copying the file instead of hardlinking.
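The idea can be sketched in a few lines of Python. This is not a port of mp3collect.go: it approximates "hash the MPEG frames" by stripping a leading ID3v2 tag and a trailing ID3v1 tag rather than parsing frames, and the choice of SHA-1, the `<hash>.mp3` link name, and the function names are all assumptions made for illustration.

```python
import hashlib
import os


def audio_bytes(data):
    """Return data with a leading ID3v2 tag and a trailing ID3v1 tag removed."""
    start, end = 0, len(data)
    if len(data) >= 10 and data[:3] == b"ID3":
        # The ID3v2 size field is "syncsafe": four bytes, 7 bits each.
        size = 0
        for b in data[6:10]:
            size = (size << 7) | (b & 0x7F)
        start = 10 + size
        if data[5] & 0x10:  # footer-present flag adds another 10 bytes
            start += 10
    # ID3v1 is a fixed 128-byte trailer that begins with "TAG".
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128
    return data[start:end]


def media_hash(path):
    """Hash only the audio portion of the file (SHA-1 is an assumption)."""
    with open(path, "rb") as f:
        return hashlib.sha1(audio_bytes(f.read())).hexdigest()


def collect(paths, dest_dir):
    """Hard-link each file to dest_dir/<hash>.mp3; report duplicates, leave them intact."""
    os.makedirs(dest_dir, exist_ok=True)
    duplicates = []
    for path in paths:
        link = os.path.join(dest_dir, media_hash(path) + ".mp3")
        if os.path.exists(link):
            duplicates.append((path, link))
        else:
            os.link(path, link)  # fails across filesystems, as noted above
    return duplicates
```

Two copies of a song that differ only in their tags produce the same `media_hash`, so the second one lands in the duplicates list instead of getting a link.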
mp3collect.go
Tuesday, December 21, 2010
Introducing Fuzzex, Generating Random Data From Regexes
Fuzzex produces sequences of random bytes using a generation language that is similar to that commonly used by regular expressions for parsing data. This similarity enables testers who are familiar with regular expressions to produce test data that can satisfy an application's superficial input validation and parsing without getting bogged down in specialized frameworks such as Sulley or Peach.
In situations where the regular expressions used for parsing and validation are available, Fuzzex enables using these expressions directly to develop tests that demonstrate potential weaknesses and exercise internal surfaces.
Example, a Very Permissive Email Address Regex:
>>> fuzzex.generate( '[^@]+@([^.]+)([.][^.]+)+' )
'\x07m\x10@\x0cI\x12%.\x1a.f.:'
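To see why such output gets past superficial validation, here is a small Python sketch. It does not use Fuzzex or reproduce its engine; `random_run` and `generate_email_like` are hypothetical helpers that expand this one pattern by hand, and the standard `re` module then plays the part of the application's validator.

```python
import random
import re

EMAIL_RE = '[^@]+@([^.]+)([.][^.]+)+'


def random_run(exclude, rng, max_len=4):
    """One or more random ASCII characters, none of them in `exclude`."""
    alphabet = [chr(c) for c in range(128) if chr(c) not in exclude]
    return "".join(rng.choice(alphabet) for _ in range(rng.randint(1, max_len)))


def generate_email_like(rng):
    """Hand-expanded generator for the shape [^@]+@([^.]+)([.][^.]+)+"""
    local = random_run("@", rng)   # [^@]+
    host = random_run(".", rng)    # ([^.]+)
    labels = "".join("." + random_run(".", rng)   # ([.][^.]+)+
                     for _ in range(rng.randint(1, 3)))
    return local + "@" + host + labels


# The validator accepts the sample shown above as well as fresh samples.
sample = '\x07m\x10@\x0cI\x12%.\x1a.f.:'
accepted = re.fullmatch(EMAIL_RE, sample) is not None
```

Every string this produces is a "valid" address as far as the permissive regex is concerned, control characters and all.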
Thursday, November 18, 2010
Spot the Crypto Bug
iv := read_cprng( 16 )
enc := aes_enc( key )
ciphertext := cbc_enc( iv, enc, iv + plaintext )
Where cbc_enc is a function that accepts an initialization vector, a block encryption function, and a buffer containing the plaintext to encrypt, and applies that function using the Cipher Block Chaining mode and the initialization vector.
Can you spot why, given a constant plaintext and key, the ciphertext never varies, no matter how the IV changes?
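The symptom can be reproduced without an AES library. The Python sketch below is a stand-in under loud assumptions: `toy_block_enc` is a hypothetical keyed function built from HMAC-SHA256 truncated to 16 bytes (it is not AES and is not even invertible), which is enough here because CBC encryption only ever calls the block function in the forward direction. It shows the behaviour and leaves the why to the reader.

```python
import hashlib
import hmac
import os

BLOCK = 16


def toy_block_enc(key):
    """Stand-in for aes_enc(key). NOT AES: a keyed 16-byte function for demo use only."""
    def enc(block):
        return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]
    return enc


def cbc_enc(iv, enc, buf):
    """CBC mode: XOR each block with the previous ciphertext block, then encrypt."""
    assert len(buf) % BLOCK == 0, "sketch assumes pre-padded input"
    prev, out = iv, b""
    for i in range(0, len(buf), BLOCK):
        mixed = bytes(a ^ b for a, b in zip(buf[i:i + BLOCK], prev))
        prev = enc(mixed)
        out += prev
    return out


def encrypt(key, plaintext):
    """Mirrors the three lines above, including the bug."""
    iv = os.urandom(BLOCK)  # read_cprng( 16 )
    enc = toy_block_enc(key)
    return cbc_enc(iv, enc, iv + plaintext)
```

Calling `encrypt` twice with the same key and plaintext returns identical bytes, even though each call draws a fresh random IV.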
Sunday, November 14, 2010
Lexical Analysis of C using Python and Ply
I use a hybrid strategy, involving a simple webapp that does syntax highlighting and grep with a few simple features that let me combine common browsing habits (history, document tabs and linking) with a minimal-expectations environment. It isn't beautiful, or featureful, but it doesn't interrupt my flow.
Of course, there's always room for improvement, like a cross-reference of identifiers and the source files that mention them. This requires simple lexical analysis, which is where a smart C programmer reaches for Flex. So, where does a Python programmer go? My best guess is Ply -- a Python Lexical Analyzer that merges Lex semantics with Python metaprogramming.
So, in WEPMA fashion, here is the interesting bit, a lexical analyzer that produces identifiers, line numbers, and tokens indicating the start and end of lexical scopes. It is barely smart enough to filter out comments and strings, and tolerant of unanticipated syntactic elements because, obviously, I couldn't be bothered to implement a full C lexer.
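A minimal stand-in for such a lexer, written against the standard `re` module rather than Ply so that it runs without extra packages. The token names (`IDENT`, `SCOPE_START`, `SCOPE_END`) and the `lex` function are assumptions for illustration; keywords are reported as identifiers, and anything unanticipated is silently skipped.

```python
import re

# One alternative per token class; anything else falls through to "other".
TOKEN_RE = re.compile(r'''
    (?P<comment> /\*.*?\*/ | //[^\n]* )
  | (?P<string>  "(?:\\.|[^"\\])*" | '(?:\\.|[^'\\])*' )
  | (?P<ident>   [A-Za-z_][A-Za-z0-9_]* )
  | (?P<open>    \{ )
  | (?P<close>   \} )
  | (?P<other>   . )
''', re.VERBOSE | re.DOTALL)


def lex(source):
    """Yield (kind, text, line) for identifiers and scope delimiters only."""
    line = 1
    for m in TOKEN_RE.finditer(source):
        kind, text = m.lastgroup, m.group()
        if kind == "ident":
            yield ("IDENT", text, line)
        elif kind == "open":
            yield ("SCOPE_START", text, line)
        elif kind == "close":
            yield ("SCOPE_END", text, line)
        # Comments, strings and everything else are dropped, but their
        # newlines still count toward the line number.
        line += text.count("\n")
```

Feeding the resulting `(identifier, line)` pairs into a dictionary keyed by identifier gives the cross-reference described above.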
Enjoy, and no, you can't have my review tool. :)
Saturday, October 30, 2010
JavaScript, Closures, and Wasteful API's
Yahoo has written roughly 30 lines to encapsulate and abstract the simple functionality of passing a thunk to either setInterval or setTimeout. An example, stripped from Todd Kloots' YUI 3 demo:
var args = [ 1,2 ]
Y.later( 50, gizmo, gizmo.foo, args )
Could be more simply expressed as:
setTimeout( function( ){ gizmo.foo( 1, 2 ) }, 50 )
And, hey look, no CDN callout required. No need for a code reviewer to reach for YUI's documentation to find out the special semantics of YUI, and the code says exactly what it means. And, bonus, fewer keystrokes.
Libraries like jQuery and YUI have valuable capabilities, such as concealing all of the W3C's pointless DOM verbosity behind more modern XPath-like selectors. But when these frameworks feel the need to abstract away closures, all I really see is a developer who has lost touch with the clear simplicity of JavaScript, and I start wondering if they get paid by the API function.
Monday, October 18, 2010
Long Polling with Node.JS and Express
JavaScript is regarded by Lisp hackers as Lisp without parentheses, shackled by the problem domain of browser scripting. It's a great, powerful language for people who think in closures, but until the recent introduction of libraries like jQuery, it was also shackled to really cruddy libraries. When Google released V8 under the BSD license, I think many of us immediately ran to check out the source, wrote a partial general-purpose environment, then wandered off to do better things. Like bugfixes for MOSREF. *cough*
Ryan Dahl, unlike the rest of us, stuck with it, and fused V8 with the similarly fascinating libev to produce a JavaScript environment for I/O-centric problems that don't live solely within the browser. The resulting Node.JS strikes an interesting balance between minimalism, functionality, and performance thanks to its reliance on existing projects with great characteristics.
When I encounter a new language or framework, I fall back on a problem dear to my heart -- writing a MUD server. With web frameworks, lately, this has been simplified down into "can I write a long-polling message wall with it?" It's a simple problem, but it tends to break simplistic web frameworks because requests are often deferred, waiting for an update.
Here it is in Node.JS, using Express, about 50 lines of overcommented code. I'm sure it could be written faster, but probably not as concisely.
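The deferred-request idea at the heart of long polling, sketched in Python with `asyncio`. This is not the Node.JS/Express code and leaves out the HTTP layer entirely; `MessageWall`, `post`, `poll` and `demo` are hypothetical names chosen for illustration.

```python
import asyncio


class MessageWall:
    """Create inside a running event loop (e.g. from a coroutine)."""

    def __init__(self):
        self.messages = []
        self._changed = asyncio.Condition()

    async def post(self, text):
        """Append a message and wake every deferred poll."""
        async with self._changed:
            self.messages.append(text)
            self._changed.notify_all()

    async def poll(self, since, timeout=30.0):
        """Return messages after index `since`, deferring until one exists
        or the timeout expires (in which case the result is empty)."""
        async def wait_for_news():
            async with self._changed:
                await self._changed.wait_for(lambda: len(self.messages) > since)
        try:
            await asyncio.wait_for(wait_for_news(), timeout)
        except asyncio.TimeoutError:
            pass
        return self.messages[since:]


async def demo():
    wall = MessageWall()
    pending = asyncio.create_task(wall.poll(since=0, timeout=5.0))
    await asyncio.sleep(0)           # give the poll a chance to start waiting
    await wall.post("hello")         # completes the deferred poll
    got = await pending
    idle = await wall.poll(since=1, timeout=0.01)  # nothing new: times out
    return got, idle
```

In a web handler, `poll` would sit behind the GET request and `post` behind the POST; the client passes the count of messages it has already seen as `since` and re-issues the request as soon as a response arrives.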
Next up, making a kobold walk around a message board.. ;)
Thursday, August 12, 2010
More Fun With Nessus Reports
nscross.py
(I reserve the right to be somewhat embarrassed if the Nessus experts come out of the woodwork with an option to do this, too, from the Nessus GUI..)
Wednesday, August 11, 2010
Nessus False Positives Getting Underfoot?
nsfix.py
This may work on OpenVAS reports, let me know if it causes a problem. As always, improvements are welcome.
Updated: pauldotcom from Twitter makes an excellent point that this can be achieved using the "Report Filters" interface. I blame my fear of Flash GUIs for not finding this.
Monday, July 19, 2010
Cross-Platform Raw Character Input in Python
getch.py
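For readers without the file at hand, the usual shape of such a function is sketched below. This is not the contents of getch.py; it is the common recipe of using `msvcrt` on Windows and `termios`/`tty` elsewhere, and it assumes standard input is an interactive terminal.

```python
import sys


def getch():
    """Read one character from stdin without waiting for Enter."""
    if sys.platform == "win32":
        import msvcrt
        return msvcrt.getwch()

    import termios
    import tty
    fd = sys.stdin.fileno()
    saved = termios.tcgetattr(fd)      # remember the terminal settings
    try:
        tty.setraw(fd)                 # no line buffering, no echo
        return sys.stdin.read(1)
    finally:
        # Always restore the terminal, even if the read is interrupted.
        termios.tcsetattr(fd, termios.TCSADRAIN, saved)
```

Raw mode also disables Ctrl-C handling for the duration of the read, so callers typically check for `'\x03'` themselves.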