OS X Anti-Forensics Techniques 3 - Expanding the Attack Space
The final part of the Grugq’s HITBSecConf presentation outlines and describes more attack vectors, and includes a Questions and Answers section.
When Unicode Attacks
So, let’s look at something else. We’re going to look at how we can use Unicode; there’re Unicode bugs that we can exploit, and we can actually do data storage inside Unicode strings. Everyone’s very excited by this, I can tell.
What we’re going to be looking at is zero width Unicode glyphs. Basically, a glyph is something that gets displayed to the screen or advances the cursor on the screen. UTF-16 has non-glyph characters; they basically occupy space within the UTF-16 stream, but they don’t display any glyphs when they are printed to the screen, and they don’t actually cause the cursor to move. So they’re essentially invisible. And we only need two of those. The first one we have is the ‘null character’, which is basically all zeros. The other one is the ‘zero width space’, it’s the space that occupies zero width.
Now we can do data storage by converting our bytes into chunks of Unicode characters. So we actually now have a way of converting a stream of bytes into a stream of zero width Unicode characters. How awesome is that? Very awesome! The cool thing about this is that mostly what happens is UTF-16 gets converted to UTF-8 on the way back, so it will be read up as UTF-16 off the disk and it will get converted to UTF-8 to be displayed on a screen. And the awesome thing is the way that the null character over here gets converted into UTF-8 is it becomes a null terminator, which means your null character actually terminates your UTF-8 string, which means all of the rest of the data is now invisible, completely. It vanishes in the conversion process itself. So you can actually hide data in a file system by converting your data to invisible UTF-16 characters, appending them to filenames, so you’ll have a filename that’s, like, 2 characters plus 200 characters of UTF-16 data.
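The byte-to-glyph conversion described above can be sketched in a few lines of Python. This is a minimal illustration, not the code shown in the talk: the function names `to_zero_width` and `from_zero_width` are hypothetical, and the mapping (bit 0 becomes the null character, bit 1 becomes the zero width space, most significant bit first) is an assumption, since the talk only says that two invisible characters are needed.

```python
# Assumed mapping (not confirmed by the talk): bit 0 -> U+0000, bit 1 -> U+200B.
ZERO = "\u0000"   # null character
ONE = "\u200b"    # zero width space


def to_zero_width(data: bytes) -> str:
    """Encode each byte as 8 invisible characters, most significant bit first."""
    chars = []
    for byte in data:
        for shift in range(7, -1, -1):
            chars.append(ONE if (byte >> shift) & 1 else ZERO)
    return "".join(chars)


def from_zero_width(text: str) -> bytes:
    """Recover bytes from the carrier characters, ignoring everything else."""
    bits = [1 if ch == ONE else 0 for ch in text if ch in (ZERO, ONE)]
    out = bytearray()
    # Only whole groups of 8 bits are turned back into bytes.
    for i in range(0, len(bits) - len(bits) % 8, 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


secret = b"hello world"            # 11 bytes of example payload
hidden = to_zero_width(secret)     # 8 characters per byte
filename = "ab" + hidden           # visible name plus invisible payload
recovered = from_zero_width(filename)
```

With this mapping every byte costs 8 characters, so the 11-byte example produces an 88-character string, and decoding simply skips the visible part of the name.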
When the forensics investigator comes by trying to look at where all the stuff is, he will read up the filename, convert it to UTF-8 which strips out all of your data, and then start looking inside the contents of that file for your data, when his tool has already basically erased the data for you. So it’s data that will get printed to the screen; it’s incredibly ironic. The data he is looking for is displayed directly in front of him, but it’s invisibly displayed.
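To make the truncation effect concrete, here is a small Python sketch. It assumes a consumer that converts the name to UTF-8 and then treats it as a C-style string, stopping at the first zero byte; the helper name `c_string_view` and the sample payload are hypothetical, and real tools may behave differently.

```python
def c_string_view(name: str) -> str:
    """Mimic a C-style consumer: convert to UTF-8, stop at the first 0x00 byte."""
    utf8 = name.encode("utf-8")
    # U+0000 is the only character whose UTF-8 encoding contains a zero byte,
    # so the part before it is always valid UTF-8.
    return utf8.split(b"\x00", 1)[0].decode("utf-8")


# Visible part "ab" followed by a short invisible payload (assumed layout).
stored_name = "ab" + "\u0000\u200b\u0000\u0000\u200b"

on_disk = stored_name.encode("utf-16-le")   # what a UTF-16 file system would hold
read_back = on_disk.decode("utf-16-le")     # the full name is still there
shown = c_string_view(read_back)            # what the C-style consumer ends up with
```

In this sketch the full 7-character name survives on disk, but the C-style view sees only `ab`. If the payload happened to start with zero width spaces before its first null, those would survive the conversion but still render as nothing.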
So, where can we use UTF-16? There’re some cool things, like the iPod iTunes database actually uses UTF-16 for everything. That means the artist name, the album name, the genre name, the path of the file – basically every single piece of metadata associated with a track inside an iTunes database is stored in UTF-16 characters. And it’s duplicated for every single entry, so if you’ve got 13 tracks from the same album, each of them will have the album name individually, embedded within that entry. It’s also used for filenames, for example in HFS+, NTFS and in FAT32, in what’s called the ‘long filename format’, so you can actually do attacks with UTF-16 characters against FAT32 on your phone. Pretty cool stuff. And obviously, you can do it inside documents, so you can have your document encoded into the UTF-16 character set and you can do your data storage in there.
I call this the Z triple U attack, ZWU (see right-hand image). We’re going to look only at how we can attack an iPod database. Basically, each entry in an iPod database has a maximum of 255 UTF-16 characters. It takes us 8 characters to encode a single byte, which means that at a maximum we can encode about 31 bytes per entry. Looking only at the artist, the album, the track name and the path, we theoretically get about 100k of storage per 1000 tracks in an iPod, which isn’t too bad: if you’ve got a large iPod with a large number of tracks, you can actually have a couple of megs of storage space in there that’s invisibly embedded.
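The capacity figures can be checked with a few lines of arithmetic. The sketch below uses only the numbers from the talk and assumes, as the estimate seems to, that each of the four fields is filled entirely with payload characters.

```python
MAX_CHARS_PER_FIELD = 255   # UTF-16 characters per entry, per the talk
CHARS_PER_BYTE = 8          # one invisible character per bit
FIELDS_PER_TRACK = 4        # artist, album, track name, path

bytes_per_field = MAX_CHARS_PER_FIELD // CHARS_PER_BYTE   # whole bytes only
bytes_per_track = bytes_per_field * FIELDS_PER_TRACK
bytes_per_1000_tracks = bytes_per_track * 1000

print(bytes_per_field, bytes_per_track, bytes_per_1000_tracks)
```

That works out to 31 bytes per field, 124 bytes per track and 124,000 bytes per 1000 tracks, which is consistent with the "about 100k" estimate. Leaving room for a visible name in each field would reduce it.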
There’re some downsides with iPods. I’ve been working a lot on iPod attacks; there’re also attacks against the structured storage format that they use, the ‘mhbd’ database; you can do things in there. This code (left-hand image) actually does work. Basically, it only has two methods: I’ve got ‘tounicode’ and ‘fromunicode’. As you can see here, if we call ‘tounicode’, what we get back is an 88-character length string which, when we print out, is zero width. The problem is, if you print it directly without encoding it like this, all of the zero width space characters get converted into normal space characters, and then it shows up on the screen. Honestly, it does work; it’s one of my few demos that does work, please believe me. Okay, so the cool thing is you can basically attack pretty much anything, including even Unicode. Unicode is itself simply a structured data storage format, so you can FIST the Unicode.
Application File Formats
So, the next stuff we’re going to look at is how we can move beyond structured formats and look at other attacks that we can do. This one (see right-hand image) is kind of funny because it simply takes a slightly different way of thinking. We’re going to look at browser cookies. First of all, what is a cookie? Very simple – a cookie is a URL that’s been associated with some sort of key, a variable name, and a value. And chances are that value is a base64-encoded chunk of encrypted binary data that’s been sent to you by the website. Well, what do we want to store? We want to store a base64-encoded chunk of encrypted binary data. So, the idea is that you can actually use your cookies file for hidden data storage, because you can create fake cookies that contain nothing but your base64-encoded encrypted data.
Looking just quickly at Mozilla’s cookie file, it’s a SQLite database; you can access it using only the sqlite3 tool. The schema is very simple and looks like this (see left-hand image). Basically, the only columns that you need to worry about are the name, the value, the host sometimes, and the id, which should be incremented anyway.
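As a rough sketch of the fake-cookie idea, the Python below stores a base64 payload in a cookie-shaped row. The schema is a deliberately simplified assumption that keeps only the four columns mentioned above; the real Firefox table has more columns. The host `www.example.com`, the cookie name `session` and the payload are placeholders.

```python
import base64
import sqlite3

# In-memory stand-in for cookies.sqlite, with an assumed, simplified schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE moz_cookies ("
    "id INTEGER PRIMARY KEY, name TEXT, value TEXT, host TEXT)"
)

# Placeholder payload; in practice this would already be encrypted.
payload = b"pretend this is already encrypted"
encoded = base64.b64encode(payload).decode("ascii")

# The id column is filled in automatically by SQLite.
conn.execute(
    "INSERT INTO moz_cookies (name, value, host) VALUES (?, ?, ?)",
    ("session", encoded, "www.example.com"),
)
conn.commit()

# Reading it back is an ordinary query plus a base64 decode.
row = conn.execute(
    "SELECT value FROM moz_cookies WHERE host = ?", ("www.example.com",)
).fetchone()
recovered = base64.b64decode(row[0])
```

The stored value looks like any other opaque cookie value, which is the point of the technique.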
So, I’ll now show you another awesome demo on how you can do data storage. This is guaranteed to be ignored by just about any forensics investigator. You create your data storage, you put it under www.lolitapictures.com – and you’re away (image to the right).
One of the other things that you can do is take the Mozilla cookies file and create your own table, and then do your data storage directly in that table. So you can create a table “my hidden secret data” with a text value and then put your stuff in there. When a forensics tool comes along, what it’s going to do is open the file up, go to the moz_cookies table, dump all of the data out of the moz_cookies table, and then close it. It’s not going to look at all of the tables that have been defined for that particular database, because the people who implement these tools are stupid. So, we can do really simple things like that and we’re going to get away with it. But let’s do something cooler.
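The difference between dumping one known table and enumerating every table can be sketched as follows. The schema is again a simplified assumption, and `my_hidden_secret_data` is just the table name from the example above written as a valid identifier. The query against `sqlite_master` is standard SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the cookie table (assumed schema).
conn.execute(
    "CREATE TABLE moz_cookies ("
    "id INTEGER PRIMARY KEY, name TEXT, value TEXT, host TEXT)"
)
# The extra table that a cookie-only dump never touches.
conn.execute("CREATE TABLE my_hidden_secret_data (value TEXT)")
conn.execute("INSERT INTO my_hidden_secret_data VALUES (?)", ("stashed text",))
conn.commit()

# What a tool sees if it only dumps the table it expects.
cookies_only = conn.execute("SELECT * FROM moz_cookies").fetchall()

# What it would see if it listed every table defined in the file.
all_tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
```

Here `cookies_only` is empty while `all_tables` lists both tables, so a single extra query is enough to expose this particular trick.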
So, what we’re going to look at is the SQLite file format. SQLite, once again – structured data storage. All SQLite does is take a file and divide that file up into pages; the very first page contains a header; then there’re all of these free pages which are not allocated, and they’re stored as a binary tree. Basically, all of the free pages are stored in a tree, and you can store your data inside those free pages, which we will do.
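The header on the first page is easy to inspect by hand. The sketch below reads three fields whose offsets come from the published SQLite file format description: the page size at byte 16, the page number of the first freelist trunk page at byte 32, and the total number of freelist pages at byte 36, all big-endian. That description presents the freelist as a chain of trunk pages. The helper name `read_sqlite_header` and the temporary `test.sqlite` file are illustrative only.

```python
import os
import sqlite3
import struct
import tempfile


def read_sqlite_header(path: str) -> dict:
    """Parse a few fields from the 100-byte header at the start of the file."""
    with open(path, "rb") as fh:
        header = fh.read(100)
    page_size = struct.unpack(">H", header[16:18])[0]
    if page_size == 1:          # the value 1 is used to mean 65536
        page_size = 65536
    first_trunk, free_count = struct.unpack(">II", header[32:40])
    return {
        "magic": header[:16],
        "page_size": page_size,
        "first_freelist_trunk": first_trunk,
        "freelist_pages": free_count,
    }


# Build a small throwaway database to inspect.
tmp_dir = tempfile.mkdtemp()
db_path = os.path.join(tmp_dir, "test.sqlite")
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE t (x TEXT)")
conn.commit()
conn.close()

info = read_sqlite_header(db_path)
print(info)
```

A freshly created database reports zero freelist pages; the count only becomes non-zero once pages have been released inside the file.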
There’re more exciting things that you can do inside the page contents, inside what’s called a cell. That’s what is used to create the binary trees for the tables. Unfortunately, that code is very, very complicated and ugly, so it’s not working properly, unlike all my other much cleaner code that is also not working properly. Anyway, SQLite is really ugly inside.
Actually, if you have a table in SQLite and it grows very big, SQLite will allocate additional blocks; it will expand the size of that file. If you then drop that table or shrink it, SQLite doesn’t release those blocks back to the system. The file keeps the same large size. What that allows you to do is take a SQLite database, make it very big, drop the table, and then use all of that free space for data storage. The one problem with that is if auto-vacuum is enabled it can cause your internal databases to be adjusted, and that means when the B-trees get re-balanced, all of your stuff gets wiped out. So, don’t do it on live databases – that’s bad.
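The grow-then-drop behaviour can be observed with Python’s built-in sqlite3 module. This sketch only demonstrates that the file keeps its size and that the released pages land on the freelist; it does not write anything into them. The table name `filler`, the row count and the file name are arbitrary choices, and auto-vacuum is switched off explicitly so the result does not depend on how the library was built.

```python
import os
import sqlite3
import tempfile

tmp_dir = tempfile.mkdtemp()
db_path = os.path.join(tmp_dir, "test.sqlite")

conn = sqlite3.connect(db_path)
conn.execute("PRAGMA auto_vacuum = NONE")   # must be set before any table exists
conn.execute("CREATE TABLE filler (blob_data BLOB)")
conn.executemany(
    "INSERT INTO filler VALUES (?)",
    [(b"x" * 1000,) for _ in range(500)],    # roughly 500 KB of filler rows
)
conn.commit()
size_before = os.path.getsize(db_path)

# Dropping the table frees its pages inside the file, not back to the OS.
conn.execute("DROP TABLE filler")
conn.commit()
size_after = os.path.getsize(db_path)
free_pages = conn.execute("PRAGMA freelist_count").fetchone()[0]
conn.close()

print(size_before, size_after, free_pages)
```

After the drop the file is no smaller than before and `freelist_count` is non-zero; running `VACUUM`, or having auto-vacuum enabled, would reclaim that space and destroy anything stored there.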
Alright, now what we’re going to do is run Python and try and remember the command line options that it takes. Okay, Projects, Anti-Forensics, SQLite – yes. So, what this is going to do is erase my ‘/etc/passwd’ file, overwrite a portion of my disk, and…crash horribly. Okay, in theory now, what we have is a ‘cookies.sqlite’ file which has my ‘/etc/passwd’ stored inside. And we can probably do strings.
Okay, what happened is rather than modifying the existing database I created a new database called ‘test.sqlite’ which you can see here, and it’s just being created now, honestly. If we run SQLeez with just this as the input, it will print out the contents.
Basically, what that attack is doing is it is injecting data into the free space within the SQLite file. Anyone doing forensic analysis on SQLite is going to have a significant disadvantage in this case because there are no forensics tools for SQLite, which basically brings us to the conclusion that the future of anti-forensics is out of the file system, into the files. File systems are known; there’re only a few different implementations that you can go after, and it’s possible for people to figure out the bugs and fix them.
File formats, on the other hand, are pretty much a mystery. You can take an existing closed-source application on Windows, reverse-engineer the file format that it uses for storage, do your attack against that – and it will never ever be found. Basically, attacking file systems is old hat these days; it’s not the place to go after anymore. You want to move into the application space. Once you’re in the application space, it’s a lot easier to do data storage.
So, my recommendation is to go for the stuff that I’m working on right now, such as using the iPod database as a data storage mechanism. One of the reasons that’s a really cool way of doing it is that it’s very, very easy to erase: when you plug your iPod into an iTunes computer, it will rewrite the database from scratch and erase all of the work that you’ve spent three nights on, because you didn’t know that it automatically erases it as soon as it gets plugged in. That’s a great feature for destruction, because you can sit there, you see the cops coming, you plug your iPod in – bang, data gone! Goddamn!
Essentially, the idea that I’m suggesting to everyone is that from now on you should really focus on data contraception attacks: how you can stay in RAM longer; the fewer things you do that actually interact with a computer, the better; and also, move into the application space, go after file formats – that’s much more exciting than going after file systems.
Q & A
Question: How many real-world attacks have you triggered?
Answer: Actually, I didn’t trigger any. I didn’t do it on my main disk because I’m not that dumb. I learned this one when I used to do all my stuff on Linux. I only had to rewrite my source code twice from scratch. The thing is, file system code is actually very poorly written, because the assumption is always made that the only person who is writing to the file system is the kernel. And since you’re writing the code for the kernel, you can assume that anything that you read back is something that you wrote. So they don’t do a lot of error checking. I think they got hammered heavily with this about a year or two ago. But basically, the HFS+ code isn’t that bad, it’s kind of elegant. Most of the attacks that I’ve been doing – I don’t screw up pointers that badly – I write stuff out, something flushes over, and my data vanishes in between. And it’s very frustrating.
Question: What main problems do you encounter when doing anti-forensics?
Answer: There’re significant bootstrapping problems. You need to have a tool that allows you to access your hidden data store. But that tool cannot be hidden itself otherwise you can’t access your hidden data store. And if that tool isn’t hidden, then it’s basically like leaving the keys out next to the safe: no one can get in unless they know to look underneath the flower pot next to the front door – that sort of thing. So, there are problems with that. Most of the anti-forensics, as I understand it, is useful primarily as a post-break-in, so you break into a box and then you need to clean up to make sure that there’s less evidence that can be used against you, or used to track you down. That interests me a lot more than hiding pornography, for example.
The other thing is that most of the anti-forensics stuff here is not worth it, because you don’t actually need to do any of this to hide your stuff from a forensics investigator. The best advice is: #1 – don’t get caught. If you do get caught, get a good lawyer. Generally speaking, with all of the stuff like the child pornography, they always go for the low-hanging fruit; they go for the easiest guys to find. With hackers, they also go for the low-hanging fruit: the young kids who don’t know how to clean up after themselves, who haven’t developed all of the good habits of “wipe the logs, make sure no rootkit is running,” all that stuff. The people that they capture tend to be the low-hanging fruit because it’s easier for them to do that.
There’re various sorts of things that make anti-forensics as a specific area of study not very useful; like putting that dot in front of the filename on Unix – that still works, that is a very effective attack. And it’s going to continue to work for a very, very long time, because from an anti-forensics perspective your first objective is to not be detected. So, your adversary is not necessarily the forensics investigator; it’s the system administrator who has day-to-day interaction with that box. If that system administrator isn’t suspicious, then there’s not going to be an investigation. And if there’s no investigation, there’s no forensics officer to come and look at the file system. So, don’t piss off the sysadmin – and you don’t have to worry about hiding all of your stuff properly.