GIDForums  

#1
01-Sep-2004, 05:17
bigbangman

Encryption implementation issue


Hi there,

I am not very experienced with C/C++ but I have a project that requires some C++ code.

What I want to do is get a basic encryption module working. I don't want to rely on large libraries such as Crypto++ or mcrypt; I just want a simple encryption algorithm in a self-contained block of code.

The encryption algorithm I have selected is called the Tiny Encryption Algorithm or "TEA". It is a very simple block cipher that accepts a 128-bit key.

All I want to do is to create simple C++ wrapper functions to encrypt and decrypt data.

The example attached *looks* like it is working properly. It encrypts a block of text and then decrypts it successfully. As you'll see, the strings are cast to unsigned long pointers so the encryption algorithm can do its thing.

However, what I need to be able to do is access the encrypted text as either a char array or a string so I can return it to my application. When I cast it back to char, I get a string that is much shorter than the plain text string. This is wrong; they should have identical lengths.

What I think is happening is that when I cast back to char from the unsigned long, any bytes that would cast back to "non-printable" characters are being lost. Is that likely? Do I have to perform some sort of encoding, like Base64 on the byte stream so it can be successfully output as a string?

Please bear in mind that in this case,

- I cannot use platform-specific libraries
- The output must be of type string or char; I can't use the standard file libraries.

CPP / C++ / C Code:
/*

Tea.cp

A simple C++ implementation of the Tiny Encryption Algorithm (new variant), described here:

[url]http://www.simonshepherd.supanet.com/tea.htm[/url]

The encryption routines have the general form

void encipher(const unsigned long *const v,unsigned long *const w,
   const unsigned long * const k);

void decipher(const unsigned long *const v,unsigned long *const w,
   const unsigned long * const k);

TEA takes 64 bits of data in v[0] and v[1], and 128 bits of key in k[0] - k[3].

The result is returned in w[0] and w[1]. 

*/

#include <cstring>
#include <cstdlib>
#include <iostream>
#include <string>

using namespace std;


//function prototypes

void encipher(const unsigned long *const v,unsigned long *const w, const unsigned long * const k);
void decipher(const unsigned long *const v,unsigned long *const w, const unsigned long * const k);

int main()
	{
	
	// this is the encryption key (16 bytes = 128 bits)
	const char *key= "1234567890123456";
	
	// this is an unsigned long view of the encryption key which is passed to the encryption algorithm
	const unsigned long * l_key=(const unsigned long*)key;
	
	// this is the plainText that we want to encrypt
	const char *input="Should you ask me,\nwhence these stories?\n"
"Whence these legends and traditions,\nWith the odors of the forest\n"
"With the dew and damp of meadows,\nqwertyuiop1\n"
"Should you ask me,\nwhence these stories?\nWhence these legends and"
"traditions,\nWith the odors of the forest\nWith the dew and damp of"
"meadows,\nqwertyuiop1\noctopus";
	
	// the length of the plainText
	int len=strlen(input);	
	
	// this is an unsigned long view of the text which is passed 
     // to the encryption algorithm
	const unsigned long * l_input=(const unsigned long*)input;


	// this will be the encrypted text (one extra zeroed byte so the
	// buffer is always terminated when it is treated as a C string)
	char *encryptedText=new char[len+1]();
	// this is the unsigned long view of the encrypted text
	unsigned long * l_encryptedText=(unsigned long*)encryptedText;

	// this will be the decrypted text (also zero-terminated)
	char *decryptedText=new char[len+1]();
	// this is the unsigned long view of the decrypted text
	unsigned long * l_decryptedText=(unsigned long*)decryptedText;

	// calculate the number of 8-byte blocks (pairs of longs) in the input text
	int numberOfPairs = len/8;
	
	cout << "Looping through " << numberOfPairs << " longs" << endl;
	
	// first, encrypt the data 
	
	//loop through the longs, using the encryption algorithm to fill
     //l_encryptedText with an encrypted copy of l_input
	int i;		
    for (i=0; i < numberOfPairs; ++i)
    	{
		encipher( l_input+2*i, l_encryptedText+2*i, l_key);
		}
	
	// copy any leftover bytes (fewer than 8) unencrypted
	if(len%8)
		{
		memcpy(l_encryptedText+2*i,l_input+2*i,len%8);
		}
	
	
	//now, decrypt it 
	
	//loop through the longs, using the decryption algorithm to fill
     // l_decryptedText with a decrypted copy of l_encryptedText
    for (i=0; i < numberOfPairs; ++i)
    	{
		decipher( l_encryptedText+2*i, l_decryptedText+2*i, l_key);
		}
	
	// copy any leftover bytes (fewer than 8) unchanged
	if(len%8)
		{
		memcpy(l_decryptedText+2*i,l_encryptedText+2*i,len%8);
		}
	

	cout << "Length of input: " << len << endl;
	
	cout << "Length of encrypted output (should match length of input): " << strlen(encryptedText) << endl;

	cout << endl << endl << encryptedText;
		
	cout << "Length of decrypted output (should match length of input): " << strlen(decryptedText) << endl;
	cout << endl << endl << decryptedText;
	
	
	// arrays allocated with new[] must be released with delete[]
	delete[] encryptedText;
	delete[] decryptedText;
	return 0;
	}

/*

These are the functions used in the reference implementation of TEA

*/
	
void encipher(const unsigned long *const v,unsigned long *const w, const unsigned long * const k)
	{
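	unsigned long y=v[0],z=v[1],sum=0,delta=0x9E3779B9,n=32;
	while(n-->0)
		{
		// '+' binds tighter than '^'; the extra parentheses only make the original grouping explicit
		y += (((z << 4) ^ (z >> 5)) + z) ^ (sum + k[sum&3]);
		sum += delta;
		z += (((y << 4) ^ (y >> 5)) + y) ^ (sum + k[(sum>>11) & 3]);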
		}
	w[0]=y; w[1]=z;
	}

void decipher(const unsigned long *const v,unsigned long *const w, const unsigned long * const k)
	{
	unsigned long y=v[0], z=v[1], sum=0xC6EF3720, delta=0x9E3779B9, n=32;

	/* sum = delta<<5, in general sum = delta * n */
	while(n-->0)
		{
		// same grouping as in encipher(), applied in reverse order
		z -= (((y << 4) ^ (y >> 5)) + y) ^ (sum + k[(sum>>11) & 3]);
		sum -= delta;
		y -= (((z << 4) ^ (z >> 5)) + z) ^ (sum + k[sum&3]);
		}
	w[0]=y; w[1]=z;
	}
	
#2
01-Sep-2004, 09:35
davekw7x
Quote:
Originally Posted by bigbangman
What I think is happening is that when I cast back to char from the unsigned long, any bytes that would cast back to "non-printable" characters are being lost. Is that likely? Do I have to perform some sort of encoding, like Base64 on the byte stream so it can be successfully output as a string?

Not only is it not likely, it is not possible. Casting a pointer does not change any of the data it points to. The cast only tells the compiler to treat the memory at that address as a type different from the one it was declared with.

Now, the function strlen looks at the sequence of chars, starting with the address that you give it, and counts chars until it finds a char with value of 0.



If the encrypted data has any single byte with value 00, strlen thinks that's the end of the string. Say, for example, an unsigned long has the value 0x12003456. strlen would stop at the 00 byte and return however many bytes it had seen up to that point.

In other words, this use of strlen() is not appropriate. (It might give the "right answer", but you shouldn't count on it.)
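
To make the point concrete, here is a minimal sketch (not taken from the original program) of keeping the byte count alongside the data instead of asking strlen() for it. The helper names bytes_to_string and strlen_view are made up for illustration. A std::string built from a pointer plus an explicit length keeps embedded zero bytes, whereas strlen() stops at the first one.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: wrap raw cipher bytes in a std::string using an
// explicit byte count, so embedded zero bytes are preserved.
std::string bytes_to_string(const unsigned char *data, std::size_t len)
{
    return std::string(reinterpret_cast<const char *>(data), len);
}

// What strlen() reports for the same buffer: the number of bytes before
// the first zero byte. The buffer must contain a zero byte within bounds.
std::size_t strlen_view(const unsigned char *data)
{
    return std::strlen(reinterpret_cast<const char *>(data));
}
```

For the five bytes 0x12 0x00 0x34 0x56 0x00, strlen_view() gives 1, while bytes_to_string(buf, 4).size() gives 4, so the std::string carries all of the data.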

As a matter of fact, I compiled and executed your code and got 311 for both the input text and the encrypted text with every compiler I tried (Borland bcc32, Microsoft Visual C++ and gnu g++ on my Windows XP box, and g++ on my Linux box).

Dave
#3
02-Sep-2004, 00:04
bigbangman
Thanks, Dan, that makes perfect sense. I completely overlooked the null byte issue. I checked and the encryption code is definitely writing null bytes so this explains it.

It's interesting that code compiled and ran successfully on your systems, I will try a different compiler to check this. I have been using Codewarrior 8 on Mac OS X (which is needed for this particular project).
#4
02-Sep-2004, 10:03
davekw7x
Quote:
Originally Posted by bigbangman
Thanks, Dan, that makes perfect sense. I completely overlooked the null byte issue. I checked and the encryption code is definitely writing null bytes so this explains it.

It's interesting that code compiled and ran successfully on your systems, I will try a different compiler to check this. I have been using Codewarrior 8 on Mac OS X (which is needed for this particular project).

(It's Dave)

I just now did a copy and paste from your original post, and I have run your original code on two different PCs. The Windows compilers that I used are Borland bcc32, Microsoft Visual C++, and gnu g++. On my Linux box I used gnu g++. The bytes written from your encryptedText array are identical for all of them (304 non-zero bytes from encipher(), followed by "octopus" in the last seven).

I am looking for system-dependent behavior, and I realize that I don't know anything about Mac systems.

It occurs to me that when you cast (char *) to (unsigned long *) it doesn't change the bytes that are in memory, but the values used in encipher() will be different for little-endian systems and big-endian systems.
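
As a rough illustration of that difference (a sketch only, assuming a compiler that provides <cstdint>; the function names are invented for this example): the same four bytes produce different 32-bit values depending on the byte order used to read them. load_host() gives what the pointer cast effectively gives on the machine it runs on, while load_be() and load_le() give the same answer on every machine.

```cpp
#include <cstdint>
#include <cstring>

// Host byte order: the value a (char *) -> (32-bit word *) cast would see.
// memcpy is used instead of a cast to sidestep alignment problems.
std::uint32_t load_host(const unsigned char *p)
{
    std::uint32_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}

// Fixed big-endian order: the first byte is the most significant one.
std::uint32_t load_be(const unsigned char *p)
{
    return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
           (std::uint32_t(p[2]) << 8)  |  std::uint32_t(p[3]);
}

// Fixed little-endian order: the first byte is the least significant one.
std::uint32_t load_le(const unsigned char *p)
{
    return (std::uint32_t(p[3]) << 24) | (std::uint32_t(p[2]) << 16) |
           (std::uint32_t(p[1]) << 8)  |  std::uint32_t(p[0]);
}
```

For the key bytes '1' '2' '3' '4' (0x31 0x32 0x33 0x34), load_be() returns 0x31323334 and load_le() returns 0x34333231. The cipher is therefore fed different key and data words on the two kinds of machine, even though the bytes in memory are identical.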

What is the size of your unsigned long type? (Mine is 4, which is what your program appears to assume.) Is your system big-endian or little-endian? (Mine are little-endian, as are all 32-bit systems based on Intel/AMD Pentium/Athlon, etc.)

Don't know? Try the following:

CPP / C++ / C Code:
#include <stdio.h>

int main()
{

  unsigned char *chpoint;
  unsigned long test;
  size_t i;

  /* sizeof yields a size_t, so cast it to match the %lu conversion */
  printf("\n\n  sizeof(unsigned long) = %lu\n\n", (unsigned long)sizeof(unsigned long));
  printf("Conclusions below are based on systems with sizeof(unsigned long) == 4\n\n");
  test = 0x12345678;
  chpoint = (unsigned char *)&test;

  /* print the bytes of test in the order they are stored in memory */
  printf(" The bytes are:          ");
  for (i = 0; i < sizeof(unsigned long); i++) {
    printf("0x%02x ", chpoint[i]);
  }
  printf("\n\n\n");
  printf(" If the first byte is 0x12, this system is big-endian\n");
  printf(" If the first byte is 0x78, this system is little-endian\n");


  return 0;
}


Note that this is a valid C program and a valid C++ program, so compile it as either. The results on a given system will be the same.

Let me know what you discover.

Dave
#5
02-Sep-2004, 10:30
bigbangman
Sorry Dave, I don't know where I pulled "Dan" from.

Here's the output of your code:
Code:
  sizeof(unsigned long) = 4

Conclusions below are based on systems with sizeof(unsigned long) == 4

 The bytes are:          0x12 0x34 0x56 0x78

 If the first byte is 0x12, this system is big-endian
 If the first byte is 0x78, this system is little-endian

As I suspected, the Mac is big-endian.

Quote:
It occurs to me that when you cast (char *) to (unsigned long *) it doesn't change the bytes that are in memory, but the values used in encipher() will be different for little-endian systems and big-endian systems.

Ah, that's interesting. I will have to investigate this.
#6
02-Sep-2004, 10:33
LuciWiz (Moderator)
Quote:
Originally Posted by davekw7x
I am looking for system-dependent behavior, and I realize that I don't know anything about Mac systems.
Dave

The Motorola and PowerPC processors used in Macs (and, I think, the IBM S/390) use "Big Endian" byte order.

Regards,
Luci
#7
02-Sep-2004, 12:21
davekw7x
Quote:
Originally Posted by bigbangman
As I suspected, the Mac is big-endian.

Ah, that's interesting. I will have to investigate this.

When I took your input string and flipped the bytes (so that the array of unsigned long is presented to the encipher() function as it would be on a big-endian system), and I also flipped the bytes in your key, I got some zero bytes in the encrypted text.

Other inputs and other keys might, or might not, give 00 bytes in little-endian or big-endian systems, so it's never OK to use strlen() to discover the length of the encrypted data.
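
On the original question of returning the result as a printable string: one common option is to encode the raw bytes as text. Base64 would work; the sketch below uses hexadecimal only because it is shorter to write. The function names are hypothetical, and this is an illustration under those assumptions rather than a recommendation of one encoding over another. The output is twice as long as the input, but it contains no zero bytes and no non-printable characters.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: encode len raw bytes as 2*len lowercase hex digits.
std::string to_hex(const unsigned char *data, std::size_t len)
{
    static const char digits[] = "0123456789abcdef";
    std::string out;
    out.reserve(len * 2);
    for (std::size_t i = 0; i < len; ++i) {
        out.push_back(digits[data[i] >> 4]);    // high nibble
        out.push_back(digits[data[i] & 0x0F]);  // low nibble
    }
    return out;
}

// Value of one hex digit, or -1 if c is not a hex digit.
int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Inverse of to_hex(). Returns false (leaving out untouched) if the text
// has odd length or contains a character that is not a hex digit.
bool from_hex(const std::string &hex, std::string &out)
{
    if (hex.size() % 2 != 0) return false;
    std::string bytes;
    for (std::size_t i = 0; i < hex.size(); i += 2) {
        const int hi = hex_value(hex[i]);
        const int lo = hex_value(hex[i + 1]);
        if (hi < 0 || lo < 0) return false;
        bytes.push_back(static_cast<char>((hi << 4) | lo));
    }
    out = bytes;
    return true;
}
```

With the example bytes 0x12 0x00 0x34 0x56, to_hex() returns "12003456", and from_hex() turns that text back into the same four bytes, including the zero.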

Now, since you can't use strlen() to determine the length of the encrypted text, you, the decipherer, must know the length somehow. Note that, since your encrypted text could have a number (between 0 and 7) of unencrypted bytes at the end, you can't simply use the file size of the encrypted text.
So you must tell the length to the decipherer (either by separate communication or, maybe, by putting the length somewhere in the file). Of course the decipherer must also know the key.
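
One way to "put the length somewhere" is to store it in front of the data. The following is only a sketch of that idea: the 4-byte big-endian prefix and the names frame and unframe are assumptions made for this example, not part of the original program.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical framing: prepend the payload length as 4 big-endian bytes.
std::string frame(const std::string &payload)
{
    const std::uint32_t n = static_cast<std::uint32_t>(payload.size());
    std::string out;
    out.push_back(static_cast<char>((n >> 24) & 0xFF));
    out.push_back(static_cast<char>((n >> 16) & 0xFF));
    out.push_back(static_cast<char>((n >> 8) & 0xFF));
    out.push_back(static_cast<char>(n & 0xFF));
    out += payload;
    return out;
}

// Recover the payload. Returns false if the frame is shorter than its
// own length field claims.
bool unframe(const std::string &framed, std::string &payload)
{
    if (framed.size() < 4) return false;
    std::uint32_t n = 0;
    for (int i = 0; i < 4; ++i)
        n = (n << 8) | static_cast<unsigned char>(framed[i]);
    if (framed.size() - 4 < n) return false;
    payload = framed.substr(4, n);
    return true;
}
```

Because the length is written byte by byte in a fixed order, the prefix reads the same on big-endian and little-endian machines. The payload may contain zero bytes, since nothing here relies on a terminator.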

It does point out an interesting thing: if you encode on a big-endian system and decode on a little-endian system, one or the other must flip the bytes when using a method that casts (char *) to (unsigned long *). The decrypter must also apply the decrypt algorithm consistently with the endianness of the encryption. Also, note that not all systems have 32-bit unsigned longs, so that convention would also have to be included in the specification.
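
A sketch of how both conventions could be pinned down in code, assuming <cstdint> is available: the round function is the same one used by encipher() and decipher() in the first post, but the words are fixed at 32 bits and are assembled from bytes in big-endian order, so no pointer casts are needed. The choice of big-endian and the function names are assumptions for this example. Any fixed order works as long as both sides agree on it.

```cpp
#include <cstdint>

typedef std::uint32_t u32;

// Read and write a 32-bit word in big-endian order, whatever the host is.
static u32 load_be32(const unsigned char *p)
{
    return (u32(p[0]) << 24) | (u32(p[1]) << 16) | (u32(p[2]) << 8) | u32(p[3]);
}

static void store_be32(unsigned char *p, u32 v)
{
    p[0] = static_cast<unsigned char>(v >> 24);
    p[1] = static_cast<unsigned char>(v >> 16);
    p[2] = static_cast<unsigned char>(v >> 8);
    p[3] = static_cast<unsigned char>(v);
}

// Encipher one 8-byte block with a 16-byte key (32 rounds).
void encipher_block(const unsigned char in[8], unsigned char out[8],
                    const unsigned char key[16])
{
    u32 k[4];
    for (int i = 0; i < 4; ++i) k[i] = load_be32(key + 4 * i);
    u32 y = load_be32(in), z = load_be32(in + 4), sum = 0;
    const u32 delta = 0x9E3779B9;
    for (int n = 0; n < 32; ++n) {
        y += (((z << 4) ^ (z >> 5)) + z) ^ (sum + k[sum & 3]);
        sum += delta;
        z += (((y << 4) ^ (y >> 5)) + y) ^ (sum + k[(sum >> 11) & 3]);
    }
    store_be32(out, y);
    store_be32(out + 4, z);
}

// Decipher one 8-byte block: the same steps in reverse order.
void decipher_block(const unsigned char in[8], unsigned char out[8],
                    const unsigned char key[16])
{
    u32 k[4];
    for (int i = 0; i < 4; ++i) k[i] = load_be32(key + 4 * i);
    u32 y = load_be32(in), z = load_be32(in + 4), sum = 0xC6EF3720;
    const u32 delta = 0x9E3779B9;
    for (int n = 0; n < 32; ++n) {
        z -= (((y << 4) ^ (y >> 5)) + y) ^ (sum + k[(sum >> 11) & 3]);
        sum -= delta;
        y -= (((z << 4) ^ (z >> 5)) + z) ^ (sum + k[sum & 3]);
    }
    store_be32(out, y);
    store_be32(out + 4, z);
}
```

Calling encipher_block() on 8 bytes and then decipher_block() on the result with the same 16-byte key gives back the original 8 bytes, and the ciphertext bytes should come out the same on a Mac and on a PC because nothing depends on the host's byte order or on sizeof(unsigned long).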

Are we having fun yet? I've learned a few things today, so I have to say it's a Good Day.

Regards,

Dave
Last edited by davekw7x : 02-Sep-2004 at 13:07.
 
 
