SBM MODSCRIPT, PART 16 - BASE64DECODE
I recently needed a way to do base64 decoding in a ModScript I was writing. This can be tricky, because the decoded output may be binary data with embedded zeros. You could handle that case by returning a Vector in which each entry is a uint8_t (unsigned byte). In my use case, however, I knew that the data was text and could be represented as a string, so I wrote the following:
add_global_const("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/", "CONST_BASE64TABLE");

def Base64DecodeAsText( string input )
{
  // assumes output is valid text (not binary)
  var sOut = "";
  var buf = [uint8_t(), uint8_t(), uint8_t(), uint8_t()];
  var encoded = int( input.size() );
  var count = 3 * ( encoded / 4 ); // upper bound on the number of decoded characters
  var i = 0; // position in the input string
  var j = 0; // position in the current group of four

  while ( sOut.size() < count )
  {
    // Get the next group of four characters
    // 'xx==' decodes to 8 bits
    // 'xxx=' decodes to 16 bits
    // 'xxxx' decodes to 24 bits
    for_each( buf, fun( entry ){ entry = 0; } ); // zero out buffer
    var stop = min( encoded - i + 1, 4 );
    for ( j = 0; j < stop; ++j )
    {
      if ( input[i] == '=' )
      {
        // '=' indicates less than 24 bits
        buf[j] = 0;
        --j; // j now tells the code below how many characters to emit
        break;
      }
      // find the index_of inside CONST_BASE64TABLE for our value
      buf[j] = fun( s, c )
      {
        for ( var i = 0; i < s.size(); ++i )
        {
          if ( s[i] == c )
          {
            return i;
          }
        }
        return string_npos;
      }( CONST_BASE64TABLE, input[i] );
      ++i;
    }

    // Assign value to output buffer
    sOut += char(buf[0] << 2 | buf[1] >> 4);
    if ( sOut.size() == count || j == 1 ) { break; }
    sOut += char(buf[1] << 4 | buf[2] >> 2);
    if ( sOut.size() == count || j == 2 ) { break; }
    sOut += char(buf[2] << 6 | buf[3]);
  }
  return sOut;
}
The function above walks the input string four characters at a time and builds the decoded output string. Notice that the "buf" variable is a Vector of four unsigned 8-bit integers. Because the decoding relies on bit shifting, it is important to use unsigned byte data so that the shifts give the expected result. For each input character, we look up its index in CONST_BASE64TABLE, which gives the 6-bit value that the character represents; bit shifting then recombines each group of four 6-bit values into up to three output characters. The result is the original text that was base64 encoded. A possible use case for this might be decoding HTTP headers from a REST call.
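To make the bit shifting concrete, here is a small sketch that works through a single group of four characters, "TWFu", by hand. It is only an illustration of the arithmetic, not part of the function above. The variable names (b0 to b3, byte0 to byte2) are made up for this example, and the "& 255" mask is an addition for illustration that keeps only the low 8 bits of each result; the original function does not use it.

```chaiscript
// Worked example for one group: "TWFu"
// Indexes in CONST_BASE64TABLE: 'T' = 19, 'W' = 22, 'F' = 5, 'u' = 46.
// Each index is a 6-bit value, so four of them carry 24 bits = 3 bytes.
var b0 = 19;
var b1 = 22;
var b2 = 5;
var b3 = 46;

// Byte 0: all 6 bits of b0 followed by the top 2 bits of b1.
var byte0 = ( ( b0 << 2 ) | ( b1 >> 4 ) ) & 255; // 77, the code for 'M'

// Byte 1: the low 4 bits of b1 followed by the top 4 bits of b2.
// (illustrative mask: drops the bits of b1 that were shifted past bit 7)
var byte1 = ( ( b1 << 4 ) | ( b2 >> 2 ) ) & 255; // 97, the code for 'a'

// Byte 2: the low 2 bits of b2 followed by all 6 bits of b3.
var byte2 = ( ( b2 << 6 ) | b3 ) & 255;          // 110, the code for 'n'
```

The three bytes spell "Man", which is the text that Base64DecodeAsText( "TWFu" ) is meant to return. When the group ends in one or two '=' characters, only the first two or the first one of these steps applies, which is what the "j == 1" and "j == 2" checks in the function are for.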