RLE (Run Length Encoding) compression

Author : cthug
Created : 08 August 2009
Edited : 09 August 2009
System : Cross Platform
Language : Monkey

A simple way of compressing data

I have been writing a simple archiving library to protect my game data (sounds, images, etc.), using a custom format based on TAR archives. Each file in the archive is compressed using RLE (Run Length Encoding), a very simple form of lossless compression, and then the whole archive is encrypted.

I will now explain the code to compress and decompress using my RLE (RSRC_Compress and RSRC_Deflate). The code uses SDL types such as Uint8 and Uint32 for specific integer sizes, so include the SDL header.

    // include
    #include <stdlib.h> // stdlib for malloc and free
    #include <SDL/SDL.h>

Now the compression code:

    // variables
    Uint8 *RSRCBuffer = NULL;

    Uint8 *RSRC_Compress(const Uint8 *in, Uint32 len, Uint32 *outLen)
    {
        // local variables
        Uint32 n, j = 0, i = 0;
        Uint8 *buf, *out, last = 0, c;

        // allocate memory for the worst case: every byte becomes a (count, byte) pair
        // (+1 so an empty input still allocates something)
        buf = (Uint8 *)malloc(len * 2 * sizeof(Uint8) + 1);
        if (buf == NULL) return NULL;

        // compress using RLE compression
        for (n = 0; n < len; n++)
        {
            c = in[n];
            if (j != 0 && c == last && j < 255) j++;
            else
            {
                // the run ended (or reached 255): store the count, then the byte
                if (j != 0)
                {
                    buf[i] = (Uint8)j; i++;
                    buf[i] = last; i++;
                }
                last = c;
                j = 1;
            }
        }
        // store the final run
        if (j != 0)
        {
            buf[i] = (Uint8)j; i++;
            buf[i] = last; i++;
        }

        if (i >= len)
        {
            // no gain: return the original with a zero marker in front
            out = (Uint8 *)malloc(len * sizeof(Uint8) + sizeof(Uint8));
            if (out == NULL) { free(buf); return NULL; }
            out[0] = 0;
            for (n = 0; n < len; n++)
            {
                out[n + 1] = in[n];
            }
            free(buf);
            i = len + 1;
        }
        else
        {
            out = buf; // keep the compressed data without copying it again
        }

        // transfer data to RSRCBuffer; free the old buffer only now, in case 'in' pointed into it
        free(RSRCBuffer);
        RSRCBuffer = out;

        // return data
        *outLen = i;
        return RSRCBuffer;
    }

The reason for the global RSRCBuffer variable is memory conservation. In the library's archive loading function, the file loading, compression and encryption functions all use this buffer. The only downside is that you can only have one archive open at any one time, but this works fine for me.

What this code does is count how many identical bytes appear in a row, then place that count, followed by the byte itself, in the compressed buffer. For example, QQQFFFFZZZ would produce, in hexadecimal: 03 51 04 46 03 5A. That compresses 10 bytes down to 6.

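The counting scheme above can be sketched independently of the archive library. This is a minimal illustration, not the library's RSRC_Compress: the function name rle_encode is made up for this example, it writes into a caller-supplied buffer instead of RSRCBuffer, and it uses the standard uint8_t type in place of SDL's Uint8 so that it builds without SDL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not the article's RSRC_Compress: encodes `len` bytes
 * from `in` as (count, value) pairs and returns the number of bytes written.
 * `out` must have room for 2 * len bytes, the worst case when no two
 * neighbouring bytes match. */
size_t rle_encode(const uint8_t *in, size_t len, uint8_t *out)
{
    size_t n = 0, written = 0;

    assert(in != NULL && out != NULL);
    while (n < len) {
        uint8_t value = in[n];
        uint8_t count = 0;

        /* Extend the run while the byte repeats; one count byte holds at most 255. */
        while (n < len && in[n] == value && count < 255) {
            count++;
            n++;
        }
        out[written++] = count;
        out[written++] = value;
    }
    return written;
}
```

Calling rle_encode on the 10 bytes of QQQFFFFZZZ writes the 6 bytes 03 51 04 46 03 5A and returns 6.
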
This is good for images and other repetitive sources, but not so good for text: if there are no repeated bytes, it can actually double the file size. Because the count is stored in only 1 byte, it has a maximum of 255, so a longer run is stored as 255 and the count starts again. So if you had 300 bytes of 'Q' it would be FF 51 2D 51. This also limits the compression ratio: it can range from about 128 : 1 at best (255 identical bytes become 2) to 1 : 2 at worst (every byte becomes 2), depending on the repetitiveness of the source.

|edit| Now if the compressed version is not smaller than the original, the original is returned with a zero at the front. A zero count at the beginning is impossible unless specifically placed there, so the deflate function can pick it up easily and just return the rest of the file unchanged. |edit|

And the decompression code:

    Uint8 *RSRC_Deflate(const Uint8 *in, Uint32 len, Uint32 *outLen)
    {
        // local variables
        Uint32 n, i = 0, j, z = 0;
        Uint8 *buf;

        // check if file was indeed compressed
        if (len > 0 && in[0] == 0)
        {
            // stored uncompressed: skip the zero marker and copy the rest
            i = len - 1;
            buf = (Uint8 *)malloc(i * sizeof(Uint8) + 1);
            if (buf == NULL) return NULL;
            for (n = 1; n < len; n++)
            {
                buf[n - 1] = in[n];
            }
        }
        else
        {
            // get length by adding up the count bytes
            for (n = 0; n + 1 < len; n += 2)
            {
                i += in[n];
            }

            // allocate memory (+1 so an empty result still allocates something)
            buf = (Uint8 *)malloc(i * sizeof(Uint8) + 1);
            if (buf == NULL) return NULL;

            // write each byte in[n+1] out in[n] times
            for (n = 0; n + 1 < len; n += 2)
            {
                for (j = 0; j < in[n]; j++)
                {
                    buf[z] = in[n + 1];
                    z++;
                }
            }
        }

        // move data into RSRCBuffer; free the old buffer only now, in case 'in' pointed into it
        free(RSRCBuffer);
        RSRCBuffer = buf;

        // return data
        *outLen = i;
        return RSRCBuffer;
    }

This reverses what RSRC_Compress does: it puts in[n+1] into the buffer in[n] times.

|edit| Changed the source to use unsigned bytes. I just did a test with a 900.1KB BMP, which compressed to 31KB. |edit|
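As a rough sketch of the decoding rules described above (a leading zero means the data was stored as-is, otherwise the data is a list of count and byte pairs), the following standalone function decodes into a caller-supplied buffer. The name rle_decode, the capacity parameter and the choice to return 0 for malformed input are assumptions made for this example and are not part of the library; uint8_t stands in for SDL's Uint8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not the article's RSRC_Deflate. Returns the decoded
 * size, or 0 if the input is malformed or `out` (capacity `cap`) is too small. */
size_t rle_decode(const uint8_t *in, size_t len, uint8_t *out, size_t cap)
{
    size_t n, z = 0;

    assert(in != NULL && out != NULL);
    if (len == 0)
        return 0;

    /* A leading zero marks data that was stored uncompressed. */
    if (in[0] == 0) {
        if (len - 1 > cap)
            return 0;
        memcpy(out, in + 1, len - 1);
        return len - 1;
    }

    /* Otherwise the stream is a sequence of (count, value) pairs. */
    if (len % 2 != 0)
        return 0;
    for (n = 0; n < len; n += 2) {
        uint8_t count = in[n];

        /* The encoder never writes a zero count, so treat one as an error
         * (an assumption of this sketch), and never write past `cap`. */
        if (count == 0 || count > cap - z)
            return 0;
        memset(out + z, in[n + 1], count);
        z += count;
    }
    return z;
}
```

Given the four bytes FF 51 2D 51 and an output buffer of at least 300 bytes, it fills the output with 300 copies of 'Q' and returns 300.
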

Comments
Sunday, 09 August 2009, 09:13
Mog	I hated RLE so much, it was used in my Infantry project to store media files, and we had such a fun time decoding it. So by explaining the horridness of it, Good tutorial!
Sunday, 09 August 2009, 09:44
Afr0	Why not use zlib and encrypt a zlib compressed buffer instead?
Monday, 10 August 2009, 03:54
cthug	RLE is faster than zlib, but zlib gives a far better compression ratio for the time it takes. I kinda wish I had just used zlib: I thought this would take half an hour and it took an hour longer, and there is still a slight bug that I found last night with signed/unsigned values (loss of 4 bytes in the middle of the code, but only with BMP files?). BUT I think that is in my file loading code.
