GIDForums  

  #11  
Old 25-Aug-2007, 15:11
davekw7x davekw7x is offline
Outstanding Member
 
Join Date: Feb 2004
Location: Left Coast, USA
Posts: 5,218

Re: MACRO to detect big / little endian


Quote:
Originally Posted by ahbi82
I would look for the 2 functions that you recommended from the manual.

So, if you get the bytes of the header in the right order (independent of the endianness of the machine), then you can extract the information from the individual fields.

I would just read them a byte at a time in an array of chars and then examine the bytes:

CPP / C++ / C Code:
#include <stdio.h>
int main()
{
    unsigned char header[4];
    char *inname = "funny.mp3";
    FILE *infile;
    int i;
    infile = fopen(inname, "rb");
    if (!infile) {
        printf("There was a problem opening %s for reading.\n", inname);
        return 1; /* nonzero exit status signals failure */
    }
    printf("Opened %s for reading.\n", inname);
    i = fread(header, 1, 4, infile); /* read four one-byte items */
    if (i != 4) {
        printf("Error reading the header. Number of bytes read = %d\n", i);
        fclose(infile);
        return 1;
    }
    printf("Here are the bytes of the header:\n");
    for (i = 0; i < 4; i++) {
        printf("  0x%02x ", header[i]);
    }
    printf("\n");
    return 0;
}
Output for a file that I just happened to have lying around:
Code:
Opened funny.mp3 for reading.
Here are the bytes of the header:
  0xff   0xe3   0x20   0xc4

(I'll let you figure out the sample rates, etc.)

Now if you want to read the four bytes into a 32-bit integer, you will have to be concerned about endianness, and you could do something like the following:
CPP / C++ / C Code:
#include <stdio.h>
#include <stdint.h>
#include <arpa/inet.h>

int main()
{
    uint32_t four_bytes; /* raw bytes, in file (big-endian) order */
    uint32_t header;     /* same value, in host byte order        */
    char *inname = "funny.mp3";
    FILE *infile;
    int i;
    infile = fopen(inname, "rb");
    if (!infile) {
        printf("There was a problem opening %s for reading.\n", inname);
        return 1; /* nonzero exit status signals failure */
    }
    printf("Opened %s for reading.\n", inname);
    i = fread(&four_bytes, 4, 1, infile); /* read one four-byte quantity */
    if (i != 1) {
        /* with this call, fread returns a count of items, not bytes */
        printf("Error reading the header. Number of items read = %d\n", i);
        fclose(infile);
        return 1;
    }
    header = ntohl(four_bytes);
    printf("Here is the header: ");
    printf("0x%08x\n", header);

    return 0;
}

Output
Code:
Opened funny.mp3 for reading.
Here is the header: 0xffe320c4

Notice the similarity between network operations and file operations: you are reading and writing with a stream of bytes. Bytes-in are the same order as bytes-out. If you store them a byte at a time, the code will not depend on the endianness of the host.
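
As a rough sketch of the "byte at a time" idea on the writing side: the helper below (store_be32 is a made-up name, not a library function) places a 32-bit value into a buffer with the most significant byte first. It uses only shifts and masks, so it should behave the same on any host, whatever its endianness.

```c
#include <stdint.h>

/* Hypothetical helper: write value into buf, most significant byte first. */
void store_be32(unsigned char buf[4], uint32_t value)
{
    buf[0] = (unsigned char)((value >> 24) & 0xff);
    buf[1] = (unsigned char)((value >> 16) & 0xff);
    buf[2] = (unsigned char)((value >> 8) & 0xff);
    buf[3] = (unsigned char)(value & 0xff);
}
```

The four bytes in buf could then be handed to fwrite(buf, 1, 4, outfile) and would land in the file in the same order on every machine.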

If some of the bytes have to be put together as 16-bit or 32-bit integers for some reason, you can read and write multiple bytes at a time and then use some kind of function to swap bytes if the machine architecture is differently-ended than the byte stream.
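
For illustration only, "some kind of function to swap bytes" might look like the following. The name swap32 is an assumption for this sketch; real code would more likely use ntohl/htonl or whatever the platform provides.

```c
#include <stdint.h>

/* Hypothetical helper: reverse the order of the four bytes in x. */
uint32_t swap32(uint32_t x)
{
    return ((x & 0x000000ffu) << 24) |
           ((x & 0x0000ff00u) << 8)  |
           ((x & 0x00ff0000u) >> 8)  |
           ((x & 0xff000000u) >> 24);
}
```

Swapping twice gives back the original value, so the same function serves for both reading and writing.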

Anyhow the bottom line is: Deal with bytes and there is never a need to take endianness into account. Deal with multi-byte quantities, and you may be able to use library functions (albeit not standard library functions) to help you write endian-agnostic code.

We could have saved a little board bandwidth if I hadn't made the wrong guess as to what you really wanted to do. (But that's OK, I hope. I still would like to know if anyone comes up with better "macro" stuff than my negative comments.)


Regards,

Dave
  #12  
Old 25-Aug-2007, 16:09
davis
 


I don't know of a good macro way of testing big/little endianness, but here is an easy to use C way:

CPP / C++ / C Code:
#include <stdio.h>

int is_big_endian(void)
{
    int rc = 0;
    short word = 0x4321;
    /* look at the byte stored at the lowest address of word */
    if( (*(char*)& word) != 0x21 )
    {
        rc = 1;
    }
    return rc;
}

int main(void)
{
    if( is_big_endian() )
    {
        printf( "Big Endian Machine\n" );
    }
    else
    {
        printf( "Little Endian Machine\n" );
    }

    return 0;
}

Output:

Code:
./endianness
Big Endian Machine
uname -a
SunOS shell 5.8 Generic_117350-11 sun4u sparc SUNW,Ultra-2

./endianness
Little Endian Machine
uname -a
Linux localhost.localdomain 2.6.9-5.EL #1 Wed Jan 5 19:22:18 EST 2005 i686 i686 i386 GNU/Linux


:davis:
  #13  
Old 25-Aug-2007, 16:40
davekw7x davekw7x is offline


Quote:
Originally Posted by davis
I don't know of a good macro way of testing big/little endianness, but here is an easy to use C way:
This and the two methods previously shown are all variations on a theme: Put some multi-byte integer data object into memory and look at the byte at the lowest address. Then return a value that depends on endianness. See footnote.

The point is that in order to know about endianness, you have to be able to examine something that you put into memory (or somehow know the contents of some memory locations, regardless of how they got there) and that you know has different values for one strategically placed byte than for other bytes.

So what's the problem with making a macro out of it?

You could make the macro that consists of a block, inside of which you declare an integer data type variable and store something. But I don't see how to make the result of invoking such a macro be the value of an expression that you could use in an expression like if (is_big_endian()){...}.
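
One possible way around that, offered as a sketch rather than a recommendation: with a C99 compiler, a compound literal lets the whole test be written as a single expression, so the macro can sit inside an if. The macro name IS_BIG_ENDIAN is made up for illustration. The result is a run-time value, so it still cannot be used in #if.

```c
#include <stdint.h>

/* Hypothetical macro; relies on a C99 compound literal.
   It creates an unnamed 16-bit object holding 0x0102 and looks at
   the byte stored at its lowest address. */
#define IS_BIG_ENDIAN() \
    (*(const unsigned char *)&(uint16_t){0x0102} == 0x01)
```

It would be used the same way as the function versions: if (IS_BIG_ENDIAN()) {...}. An inline function does the same job and is easier to read, which is rather the point made here.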

I don't know of a particularly important (practical) reason to knock ourselves out trying to make it a macro instead of a function, but I don't think it hurts to think about such things from time to time.

I mean, as much as I have expressed my disdain for macros in general, sometimes you are faced with OPC (Other People's Code) that is rife with insipid, useless, confusing macros and you have to try to make heads-or-tails out of it. So thinking about such things ahead of time could give you a little edge.

Sometimes you may even see some code where intricate, clever, extremely time-efficient or memory-efficient, if somewhat obfuscatory, macros do a fantastic job, but it seems to me that most (all??) of them that I can remember lately could have been done as effectively and more safely with in-line functions.


Regards,

Dave

Footnote: As far as being "easy to use," they are all "easy to use": if (is_big_endian()){...}, right?
  #14  
Old 26-Aug-2007, 04:37
aijazbaig1 aijazbaig1 is offline
Member
 
Join Date: May 2006
Location: India
Posts: 156


Hello everyone.
I have some questions from a rather earlier post in this thread. Pardon me for taking the flow a bit backwards, but I hope you guys understand.
Dave said:
Quote:
Literal strings have a zero byte at the end (defined as a byte whose bits are all zero), and sometimes we speak of a null-terminated "C-style" string. But that's just English, not C. Lower-case "null" is not defined as an identifier in C or anywhere in the standard C library. Upper-case "NULL" is #defined as a pointer in <stdio.h> (or some file that gets included when you include <stdio.h>).
So what you mean is that those strings (i.e. arrays of char) do not have a NULL appended at the end of the sequence. Instead there is a byte which contains all zeros, which I assume means all bits are 0s. Is that right?

Quote:
People sometimes think that NULL is a synonym for zero. Actually, NULL is defined as a pointer whose value corresponds to an integer with value zero cast as a pointer. I think that if you want to know whether an integer data type is zero, it would be better to compare it with zero rather than NULL. However, in this case comparing it with NULL wouldn't cause it not to work.
So could you elaborate a bit more on "NULL is defined as a pointer whose value corresponds to an integer with value zero cast as a pointer"? What do you mean by 'with value zero cast as a pointer'?
Then again, in my original program I had compared 'cptr' directly with NULL. And you corrected me to have '*cptr' compared with '0' instead. Well...does dereferencing 'NULL' return a value '0'?
Quote:
This would return one if the most significant byte of test_var is stored first, and it would return zero if the least significant byte of test_var is stored first. So now, it is possible to return a value that depends on the endianness of the machine.
Hmm..so when I assign the address of test_var to cptr, where is cptr pointing to? Is it at the base of the test_var, i.e. the lowest memory byte out of the 4 bytes in case of test_var?
If that's the case, then *cptr would give the 'item' stored at the lowest memory location for the variable test_var, and if that's zero then it means little endian architecture and if it isn't then it means it is big endian architecture.

Quote:
I thought that the reason you needed a function or macro to determine system endianness was because getting 32-bit ints (or 16-bit ints) into and out of memory would require byte-swapping for little-endian machines but not for big-endian ones (or some such thing)
Why is it that a little endian architecture needs byte-swapping? Hmm.. this is kind of new to me, so please bear with me for asking such a (maybe, for the veterans) primitive question... but I am eager to learn and to contribute.

Waiting for your answers.
__________________
Hope to hear from you guys!

--------------------------------------------------

Best Regards,
Aijaz Baig.
  #15  
Old 26-Aug-2007, 11:33
davekw7x davekw7x is offline


Quote:
Originally Posted by aijazbaig1
Hello everyone.
Quote:
Originally Posted by davekw7x
Literal strings have a zero byte at the end (defined as a byte whose bits are all zero)

So what you mean is that those strings (i.e. arrays of char) do not have a NULL appended at the end of the sequence. Instead there is a byte which contains all zeros, which I assume means all bits are 0s. Is that right?


Strictly speaking, a string literal is a piece of source-code syntax rather than an array; the compiler uses it to initialize an unnamed array of char with static storage duration. The official terminology for what I called a "literal string" is "character string literal". I hope that was not a point of confusion. (Note that there is also a thing called a "wide string literal" with wide chars. I am not addressing that here.) When I refer to "literal string" or "character string" I am referring to the following:

From the C standard (ISO/IEC 9899, Section 6.4.5, Paragraph 2)
Quote:
A character string literal is a sequence of zero or more multibyte characters enclosed in double-quotes, as in "xyz"

The string literal resides somewhere in program memory, and consists of the sequence of chars that you put inside the quote marks and is followed by a byte whose bits are all zero.

Quote:
Originally Posted by aijazbaig1
So could you elaborate a bit more on "NULL is defined as a pointer whose value corresponds to an integer with value zero cast as a pointer"? What do you mean by 'with value zero cast as a pointer'?
OK. You asked for it (you may soon be asking yourself, "Why?")
The term "null character" (lower case "null") is defined in the C standard in the following sentence (ISO/IEC 9899, Section 5.2.1, Paragraph 2):
Quote:
A byte with all bits set to 0, called the null character, shall exist in the basic execution character set; it is used to terminate a character string.

Now, one example of a character string would be stuff inside the quote marks of the string literal in the following example:
CPP / C++ / C Code:
char *cptr;
cptr = "Hello";
The first statement (a declaration) causes a memory location to be allocated for a variable named cptr. The identifier cptr can be used any place in a C program where it is legal to use a variable whose type is "pointer to char". (Note that there are a lot of illegal things that you can do with pointers, and the language doesn't protect you from doing them. It's your job (you, the programmer) to make sure nothing bad can happen when you use a pointer.) At this point, since the value of cptr is not initialized, if you were to dereference cptr, the result would be undefined behavior.

The second statement does two things:
1. It causes the following six chars to be stored somewhere in program data space (it should be read-only, but not all compilers enforce the read-only part):
'H', 'e', 'l', 'l', 'o', '\0'
(That last one is a byte whose bits are all zero. In other words: it is a null byte.)

2. Then it sets the value of cptr equal to the address of the 'H' character in the string literal. Now when you dereference cptr, you are referring to the memory location where the 'H' was stored.
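
A small sketch that checks the two points above. The function name is invented for this example, and it assumes nothing beyond the standard library: sizeof counts the terminating null byte, strlen does not.

```c
#include <string.h>

/* Returns 1 if the literal "Hello" occupies six bytes and the sixth
   byte (index 5) is the null character; returns 0 otherwise. */
int hello_has_terminator(void)
{
    const char *cptr = "Hello";   /* cptr holds the address of the 'H' */
    return sizeof("Hello") == 6   /* five letters plus the null byte   */
        && strlen(cptr) == 5      /* strlen stops before the null byte */
        && cptr[0] == 'H'
        && cptr[5] == '\0';
}
```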

Quote:
Originally Posted by aijazbaig1
Then again, in my original program I had compared 'cptr' directly with NULL. And you corrected me to have '*cptr' compared with '0' instead. Well...does dereferencing 'NULL' return a value '0'?

Dereferencing NULL is always (that's always) an error. It may or may not cause a run-time error, but it "usually" does, I think.

NULL is a macro, which is one of the things that is defined whenever you include <stdio.h> in your program. From ISO/IEC 9899, Section 7.17:
"The macros are
NULL
which expands to an implementation-defined null pointer constant"

Now, from ISO/IEC 9899, Section 6.3.2.3, Paragraph 3:
"An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant. If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function."

In other words, NULL is typically defined by something like #define NULL ((void *)0), although an implementation may also define it as a plain 0.

For Linux and all versions of Windows for which I have (or have had) compilers, a pointer with value NULL turns out to have all zeros as its value. This is not necessarily true, but it "just happens to be" so for my implementations.
But that is implementation-defined. Some systems might have something else. So, comparing an integer with NULL may actually give different results than comparing an integer with an integer zero.

On the other hand, comparing a pointer with 0 (even without the cast) is OK, since that is defined to be a null pointer constant. Is it just me, or does anyone else think that this is a little ambiguous? Go figure. (And see the Footnote.)
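
To make the pointer side of this concrete, here is a minimal sketch (the function name is made up) showing three spellings of the same null-pointer test. All three are legal C, and the standard requires them to agree regardless of how the implementation represents a null pointer.

```c
#include <stddef.h>

/* Returns how many of the three tests consider p to be a null pointer.
   For any p the answer should be either 0 or 3. */
int count_null_tests(const char *p)
{
    int n = 0;
    if (p == NULL) n++;   /* compare with the NULL macro            */
    if (p == 0)    n++;   /* compare with the null pointer constant */
    if (!p)        n++;   /* shorthand for p == 0                   */
    return n;
}
```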

However, back to your question:
It is not illegal or in any way improper to compare a pointer with NULL. It simply is not useful in this case, since we already know that cptr is not equal to NULL.

Why do I say that about the value of cptr?

Well:

1. cptr was declared as a pointer to char in that function.
2. The value of cptr was set equal to the address of another variable.
3. No address will have a value of NULL
4. Nothing has changed the value of cptr since the assignment statement.
5. Therefore, the value of cptr in that statement can not be NULL
QED

What you obviously needed to do was to dereference cptr, since your routine needed to determine if the char that was being pointed to by cptr was zero.

Summary:
Consider the expression being tested by the following
CPP / C++ / C Code:
/* first example */
char *cptr;
.
.
.
if (cptr == NULL)
The result is true if the pointer cptr is a null pointer.

Consider the following:
CPP / C++ / C Code:
/* second example */
char *cptr;
.
.
.
if (*cptr == 0)
The result is true if the char pointed to by cptr is equal to zero.

Now, sometimes you see something like the following in programs, where people really mean the previous expression
CPP / C++ / C Code:
/* third example */
char *cptr;
.
.
.
if (*cptr == NULL)
This is not legal C when NULL is defined as ((void *)0), because it compares a char with a pointer. From ISO/IEC 9899, Section 6.5.9, Paragraph 2, talking about expressions with two operands connected by "==" or "!=":
"One of the following shall hold:
— both operands have arithmetic type;
— both operands are pointers to qualified or unqualified versions of compatible types;
— one operand is a pointer to an object or incomplete type and the other is a pointer to a
qualified or unqualified version of void; or
— one operand is a pointer and the other is a null pointer constant."


So, an equality expression involving an integer and a pointer is not allowed (unless the integer is a null pointer constant). My Borland compilers don't complain about the expression of example 3, but GNU gcc does. It gives a warning about comparison between pointer and integer. Microsoft compilers also give warnings to the effect that 'int' differs in levels of indirection from 'void *'.

Because NULL is a macro, what the compiler actually sees depends on how the implementation defines it, and in all cases an executable is created. For the implementations that I have, example 3 "just happens" to work the same way as example number 2 when I run the following example code:
CPP / C++ / C Code:
#include <stdio.h>
#include <string.h>
int main()
{
    int i;
    char *cptr;
    cptr = "Hello";
    if (cptr == NULL) {
        printf("(cptr == NULL) is true\n");
    }
    else {
        printf("(cptr == NULL) is not true\n");
    }
    for (i = 0; i <= (int)strlen(cptr); i++) {
        printf("cptr[%d] = 0x%02x\n", i, (unsigned char)cptr[i]);
    }
    printf("\n");
    if (cptr[5] == 0) {
        printf("(cptr[5] == 0) is true\n");
    }
    else {
        printf("(cptr[5] == 0) is not true\n");
    }
    printf("\n");

    if (cptr[5] == NULL) {
        printf("(cptr[5] == NULL) is true\n");
    }
    else {
        printf("(cptr[5] == NULL) is not true\n");
    }
    return 0;
}

Output:
Code:
F:\home\dave\cprogs\forum
(cptr == NULL) is not true
cptr[0] = 0x48
cptr[1] = 0x65
cptr[2] = 0x6c
cptr[3] = 0x6c
cptr[4] = 0x6f
cptr[5] = 0x00

(cptr[5] == NULL) is true
So, it "works" for my compilers if I have illegal C code comparing an int with a pointer. It probably works for yours. (I seem to recall that for VAX VMS systems the pointer NULL actually was not all zeros, but don't quote me on that.)
Quote:
Originally Posted by aijazbaig1
Hmm..so when I assign the address of test_var to cptr, where is cptr pointing to?
Your code is:
CPP / C++ / C Code:
    char *cptr = (char*)&test_var;
The value of cptr is initialized to be the address of the int test_var. (That is, the address of the first byte of the integer as it is stored in memory.) If the decimal value of test_var is 1, and an int is a 32-bit quantity then:
1. If that implementation stores integers as big-endian, the four bytes are stored in the following order: 0x00, 0x00, 0x00, 0x01. Therefore *cptr is equal to zero.

2. If that implementation stores integers as little-endian, the four bytes are stored consecutively in the following order: 0x01, 0x00, 0x00, 0x00. Therefore *cptr is equal to 1.
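
If you want to see the storage order for yourself, one way (a sketch with an invented function name) is to copy the bytes of an int into an unsigned char array and print or inspect them. memcpy is used here instead of a pointer cast; both approaches look at the same memory.

```c
#include <string.h>

/* Copies the bytes of value, lowest address first, into out.
   out must have room for sizeof(int) bytes. */
void int_bytes(int value, unsigned char *out)
{
    memcpy(out, &value, sizeof value);
}
```

With value equal to 1, a little-endian machine puts the 0x01 in out[0], and a big-endian machine puts it in out[sizeof(int) - 1].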
Quote:
Originally Posted by aijazbaig1
Why is that little endian architecture needs byte-swapping?
For "normal" programs, data goes into memory and is retrieved by assignment statements and things like printf("%d"...) and stuff like that. The order of bytes actually stored in memory is all taken care of by hardware/software so that you never have to worry about it. I'll repeat that: You don't have to worry about endianness if you always store integer data types as integers of that type. (The problem only occurs when you store things a byte at a time, and then retrieve multiple bytes with a single integer operation. Or vice-versa.)

That's why, in introductory computer science courses, when the professor talks about such things as big-endian and little-endian, the all-knowing C.S. sophomores roll their eyes and say, "Well, that's just too weird for words. I mean, after all, I taught myself BASIC programming on my grandpa's Apple II when I was, like, eight years old, and since I have never heard of such a thing, it is obviously not important. Or, actually, the professor is some kind of nut case who obviously doesn't know what he/she is talking about."

I hate to repeat myself (again), but byte-swapping is only important when you have stored things a byte at a time and you want to retrieve four consecutive bytes as a 32-bit int. Or vice versa. Suppose I have a binary file whose first four bytes are 0x12, 0x34, 0x56, 0x78. (Or I receive that byte stream from a communications channel.)

I store the four bytes into memory in the order in which they were received into a character array. I know that those four bytes actually make up a 32-bit integer and I want to know the value of the integer. Is it 0x12345678, or is it 0x78563412?

I have to know whether the bytes were written to the file in big-endian order or little-endian order. (Or, at least, I have to know whether the bytes in the file have the same endianness of the machine that is converting them to an integer.)
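
The two readings of those four bytes can be sketched as follows. The names load_be32 and load_le32 are assumptions made for this example. Because they work with shifts on individual bytes, the result depends only on the order in the array, not on the host.

```c
#include <stdint.h>

/* Interpret b[0..3] as a big-endian 32-bit value (b[0] most significant). */
uint32_t load_be32(const unsigned char b[4])
{
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/* Interpret b[0..3] as a little-endian 32-bit value (b[0] least significant). */
uint32_t load_le32(const unsigned char b[4])
{
    return ((uint32_t)b[3] << 24) | ((uint32_t)b[2] << 16) |
           ((uint32_t)b[1] << 8)  |  (uint32_t)b[0];
}
```

For the bytes 0x12, 0x34, 0x56, 0x78, load_be32 gives 0x12345678 and load_le32 gives 0x78563412. Which one is "right" depends on the file format.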

File standards (like the mp3 format definition) specify what kind of endianness is used to write multi-byte quantities. Essentially all network standards specify that multi-byte quantities are sent in big-endian order.

When you write a program to read (or write) an mp3 file, or when you write a network (sockets) program to convert a 32-bit integer to the four bytes of an IPv4 address for transmission in an IP packet, or when you convert the 16-bit port number to the two bytes that go into the struct whose address you feed to the connect() function (and will eventually be sent in the packet), you must take endianness into consideration.
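
As a hedged illustration of the port-number case: the sketch below assumes a POSIX system that provides htons in <arpa/inet.h>, and port_to_wire is a name invented for the example. It shows the two bytes that would end up in the sin_port field.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Copies the network-order (big-endian) bytes of a host-order
   port number into out[0] and out[1]. */
void port_to_wire(uint16_t host_port, unsigned char out[2])
{
    uint16_t net_port = htons(host_port); /* host order to network order */
    memcpy(out, &net_port, sizeof net_port);
}
```

For port 8080 (0x1f90) the bytes come out as 0x1f, 0x90 whether the host is big-endian or little-endian.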

Traditional network functions like the ones I enumerated will help you write source code that will work equally well even if it is compiled on another machine whose endianness is different from yours.

Not doing network programming? Not doing mp3 (or .wav or .bmp or .ttf or...) programming? Don't worry about implementing special code that deals with endianness.

Writing multi-byte integers to binary files? If you know (absolutely know) that no one will ever want to read that file on a different-endian machine, then it might be OK to ignore endianness issues. Just make sure that you know that.

Text files? Endianness is never an issue, since everything is written a byte at a time. (That's why I always---well usually always---prefer text files over binary for generic programs.)
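
A minimal sketch of the text approach, using a character buffer in place of a file (the function name is made up): the number is written as decimal digits and parsed back, so byte order never enters into it.

```c
#include <stdio.h>
#include <stdlib.h>

/* Formats value as a line of decimal text, then parses it back. */
unsigned long text_round_trip(unsigned long value)
{
    char line[32];
    snprintf(line, sizeof line, "%lu\n", value); /* what fprintf would write */
    return strtoul(line, NULL, 10);              /* what a reader would parse */
}
```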


Regards,

Dave

Footnote:
The formulator(s) of the C++ standard corrected the potential difficulties associated with the C definition for NULL as follows. Instead of quoting from the several sections of the C++ standard, I will go to good old Wikipedia: http://en.wikipedia.org/wiki/C++0x
"In C, NULL is a preprocessor macro defined to be ((void*)0) or 0. In C++, implicit conversions from void* to other pointer types is not allowed, so something as simple as char* c = NULL would fail to compile under the former definition. To fix this, C++ ensures that NULL expands to 0, which as a special case is allowed to be converted to any pointer type. This interacts poorly with the overloading mechanism. For instance, suppose a program has declarations void foo( char* ); void foo( int ); and then calls foo(NULL); this will call the foo(int) version, which is almost certainly not what the programmer intends."

So, even in C++, where the ambiguity of the definition of NULL has been removed, it is still possible for a careless or unknowing programmer to write bad code. Yes, it's true. I know that it's hard to believe, but you can actually write bad code in C++

(I wanted to put an exclamation point at the end of the previous line, but then someone would have asked what "C++ factorial" means.)
---davekw7x
 
 


All times are GMT -6. The time now is 17:30.

