GIDForums  

#1
08-Apr-2009, 04:56
bobby2009 (New Member)
Join Date: Apr 2009
Posts: 4

Working with binary files


hi,

i am working on some binary files. basically, i have to process every element of a binary file, then output the results to another binary file. there are some things which i cant understand:

1. if say my binary file is 1000 bytes in size, and i know that the file contains 1000 data points, can i assume that each data point is 1 byte in size? does that also mean i can read each element as a char, since sizeof(char) = 1?

2. if in the case that i read each element as char, can i directly do mathematical operations on it? or do i need to convert each char to int first, in which case the output will no longer be 1 byte per element, right?
#2
08-Apr-2009, 09:22
davekw7x (Outstanding Member)
Join Date: Feb 2004
Location: Left Coast, USA
Posts: 5,218

Re: working with binary files


Quote:
Originally Posted by bobby2009

... my binary file is 1000 bytes in size, and i know that the file contains 1000 data points, can i assume that each data point is 1 byte ...
What else could it be? 1000 data points, 1000 bytes so number of bytes per data point has to be one, right?
Quote:
Originally Posted by bobby2009
does that also mean i can read each element as a char, since sizeof(char) = 1?
Yes.
Quote:
Originally Posted by bobby2009
...do i need to convert each char to int first...
Not necessarily. See footnote. A char is an integer data type.

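All operators that work on ints will work on chars. In arithmetic, char operands are first promoted to int, so the operations themselves are well-defined. The catch comes when you store a result back into a char: a value too large for an unsigned char wraps modulo 256, and for a signed char the stored result is implementation-defined. The same applies to equivalent stuff like shifting. C and C++ programs will not report any of this for you (and overflow of a signed int itself is undefined behavior), so it is up to you to keep values in range.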

There might be a reason to explicitly convert the data values to ints (or floats or doubles) if you are going to do something like "multiply each data value by 4 and divide by 3," in which case working on a larger data item might reduce the chance of overflow.

Similarly, if you are doing filtering with something like "add 10 data values together and divide by 10," you would (probably) want to convert data values to ints (or floats or doubles) before the operations.

Stuff like that.
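
The two examples above ("multiply by 4 and divide by 3" and "add 10 values and divide by 10") might look something like the sketch below. It assumes the data are unsigned 8-bit samples and that out-of-range results should be clamped to 0..255; the function names `scale_clamped` and `average` are made up for illustration and are not from the original program.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: scale one 8-bit sample by num/den.
 * The intermediate result lives in an int, so 240 * 4 does not wrap;
 * the final value is clamped to 0..255 before going back into a byte.
 * Assumes small factors (no int overflow) and den != 0. */
unsigned char scale_clamped(unsigned char value, int num, int den)
{
    int wide;
    assert(den != 0);
    wide = (int)value * num / den;
    if (wide < 0)
        wide = 0;
    if (wide > 255)
        wide = 255;
    return (unsigned char)wide;
}

/* Hypothetical helper: average of n samples, summed in a wider type
 * so that adding many bytes together cannot wrap at 255. */
unsigned char average(const unsigned char *data, size_t n)
{
    unsigned long sum = 0;
    size_t i;
    if (n == 0)
        return 0;
    for (i = 0; i < n; i++)
        sum += data[i];
    return (unsigned char)(sum / n);
}
```

Because each result is narrowed back to one byte, the output file stays at one byte per data point.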

Regards,

Dave

Footnote:

The generally correct answer to questions like "do I need..." is (almost) always: "It depends."

It depends on what the heck you are really going to do with the data and what you expect of the program's output. Since you didn't give us a program specification, I can only answer in generalities. I give a couple of examples where the answer is, "Probably yes."

I am sometimes known as a kind of a specialist in vague generalities. I tried to be more vague in my "generally correct" answer, but that's the best I could come up with on the spur of the moment.

Bottom line:

"All generalities are false. Vague generalities are even more false."
---davekw7x
#3
08-Apr-2009, 20:22
bobby2009

Re: Working with binary files


i am working with binary files that open with specific image viewers, so each data point is one pixel.

i want to read every data point (as char), do mathematical manipulations, which include:

1. comparing every point to some pre-fixed numerical value -- if the point is smaller than this value, leave it; if the point is bigger than this value, i want to change that point to 0 (yes, i am doing some sort of masking)

2. simple manipulations such as addition and multiplication, etc

and finally output the processed data points (hopefully as char too, to match the output file size to the original file size) into a binary file that i will need to open with my image viewer again. for the masking part, i have tried

CPP / C++ / C Code:
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

int main()
{
    FILE *fp3;
    FILE *fp3_o;

    unsigned long int i, size = 1249820;

    char *b3_dn;
    b3_dn = malloc(size * sizeof(char));

    fp3 = fopen("xxx\\b3", "rb");
    fp3_o = fopen("xxx\\b3_o", "wb");

    fread(b3_dn, sizeof(char), size, fp3);

    for (i = 0; i < size; i++) {
        if (b3_dn[i] >= 50) b3_dn[i] = 0;
    }

    fwrite(b3_dn, sizeof(char), size, fp3_o);

    free(b3_dn);
    fclose(fp3);
    fclose(fp3_o);

    return 0;
}

but this code simply left the majority of the data points untouched while randomly changing some to 0. any hints?
#4
08-Apr-2009, 23:38
davekw7x

Re: Working with binary files


Quote:
Originally Posted by bobby2009
...
i want to read every data point (as char)
Signed chars or unsigned chars? It makes a difference when comparing numerical values, you know.
Quote:
Originally Posted by bobby2009
simple manipulations such as addition and multiplication, etc
So, maybe you should convert the chars to ints or floats or doubles.
Quote:
Originally Posted by bobby2009
for the masking part, i have tried
.
.
.

but this code simply left the majority of the data points untouched while randomly changing some to 0. any hints?

Randomly? Really?

How about this:

1. create a file with known values in all of the places.
2. run the file through your program
3. look at what happens.

Here's what I might do:

1. Write character values 0x00 through 0xff to a file, maybe with something like
CPP / C++ / C Code:
/* Write binary file */
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    FILE *outfile;
    int i;
    if (argc < 2) {
        printf("Give the name of the file to be written.\n");
        exit(EXIT_FAILURE);
    }
    outfile = fopen(argv[1], "wb");
    if (!outfile) {
        printf("Can't open file %s for writing.\n", argv[1]);
        exit(EXIT_FAILURE);
    }
    printf("Opened file %s for writing.\n", argv[1]);
    for (i = 0; i < 256; i++) {
        if (fputc(i, outfile) == EOF) {
            printf("There was a problem writing byte number %d\n", i);
            exit(EXIT_FAILURE);
        }
    }
    printf("Total number of bytes written = %d\n", i);
    fclose(outfile);
    return 0;
}

2. Make a program to read and display all of the characters from the file, one at a time. I mean you could put them into an array, but why bother?

CPP / C++ / C Code:
/*
 * Read and display bytes of a binary file
 * The bytes are assumed to be signed chars
 */

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int input_value;
    char inchar;
    FILE *infile;
    int charcount = 0;
    if (argc < 2) {
        printf("Give the name of the file to be read\n");
        exit(EXIT_FAILURE);
    }
    infile = fopen(argv[1], "rb");
    if (!infile) {
        printf("Can't open file %s for reading.\n", argv[1]);
        exit(EXIT_FAILURE);
    }
    printf("Opened file %s for reading.\n", argv[1]);
    while ((input_value = fgetc(infile)) != EOF) {
        ++charcount;
        inchar = input_value;
        printf(" %4d", inchar);
        if (charcount % 10 == 0) {
            printf("\n");
        }
    }
    printf("\nTotal number of bytes read = %d\n", charcount);
    fclose(infile);
    return 0;
}

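Note that my compiler uses signed chars by default. Whether plain char is signed or unsigned is implementation-defined, so check yours: many x86 compilers default to signed, while some other platforms (ARM, for example) default to unsigned.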

Here's the output of this program when it reads the file written by the first example:
Code:
    0    1    2    3    4    5    6    7    8    9
   10   11   12   13   14   15   16   17   18   19
   20   21   22   23   24   25   26   27   28   29
   30   31   32   33   34   35   36   37   38   39
   40   41   42   43   44   45   46   47   48   49
   50   51   52   53   54   55   56   57   58   59
   60   61   62   63   64   65   66   67   68   69
   70   71   72   73   74   75   76   77   78   79
   80   81   82   83   84   85   86   87   88   89
   90   91   92   93   94   95   96   97   98   99
  100  101  102  103  104  105  106  107  108  109
  110  111  112  113  114  115  116  117  118  119
  120  121  122  123  124  125  126  127 -128 -127
 -126 -125 -124 -123 -122 -121 -120 -119 -118 -117
 -116 -115 -114 -113 -112 -111 -110 -109 -108 -107
 -106 -105 -104 -103 -102 -101 -100  -99  -98  -97
  -96  -95  -94  -93  -92  -91  -90  -89  -88  -87
  -86  -85  -84  -83  -82  -81  -80  -79  -78  -77
  -76  -75  -74  -73  -72  -71  -70  -69  -68  -67
  -66  -65  -64  -63  -62  -61  -60  -59  -58  -57
  -56  -55  -54  -53  -52  -51  -50  -49  -48  -47
  -46  -45  -44  -43  -42  -41  -40  -39  -38  -37
  -36  -35  -34  -33  -32  -31  -30  -29  -28  -27
  -26  -25  -24  -23  -22  -21  -20  -19  -18  -17
  -16  -15  -14  -13  -12  -11  -10   -9   -8   -7
   -6   -5   -4   -3   -2   -1
Total number of bytes read = 256

Now run this file through a program that converts byte values greater than or equal to some threshold to zero and writes the results to another file. You could modify your program to read and write 256 bytes. I would probably not hard-code a fixed file size in the program, but for testing purposes, you can just go with what you have. See Footnote.

If I set the threshold to 80, for example, the result would look like
Code:
    0    1    2    3    4    5    6    7    8    9
   10   11   12   13   14   15   16   17   18   19
   20   21   22   23   24   25   26   27   28   29
   30   31   32   33   34   35   36   37   38   39
   40   41   42   43   44   45   46   47   48   49
   50   51   52   53   54   55   56   57   58   59
   60   61   62   63   64   65   66   67   68   69
   70   71   72   73   74   75   76   77   78   79
    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0 -128 -127
 -126 -125 -124 -123 -122 -121 -120 -119 -118 -117
 -116 -115 -114 -113 -112 -111 -110 -109 -108 -107
 -106 -105 -104 -103 -102 -101 -100  -99  -98  -97
  -96  -95  -94  -93  -92  -91  -90  -89  -88  -87
  -86  -85  -84  -83  -82  -81  -80  -79  -78  -77
  -76  -75  -74  -73  -72  -71  -70  -69  -68  -67
  -66  -65  -64  -63  -62  -61  -60  -59  -58  -57
  -56  -55  -54  -53  -52  -51  -50  -49  -48  -47
  -46  -45  -44  -43  -42  -41  -40  -39  -38  -37
  -36  -35  -34  -33  -32  -31  -30  -29  -28  -27
  -26  -25  -24  -23  -22  -21  -20  -19  -18  -17
  -16  -15  -14  -13  -12  -11  -10   -9   -8   -7
   -6   -5   -4   -3   -2   -1
Total number of bytes read = 256

Now, if the bytes in your file are supposed to be unsigned chars (which I suspect is the case for image files), then declare inchar to be an unsigned char in the program that reads and displays the file. (If you are going to use an array and compare values in the array to an arithmetic value, declare the array to be unsigned chars.)
Then the file written by the first example would be displayed by the modified reading program as
Code:
    0    1    2    3    4    5    6    7    8    9
   10   11   12   13   14   15   16   17   18   19
   20   21   22   23   24   25   26   27   28   29
   30   31   32   33   34   35   36   37   38   39
   40   41   42   43   44   45   46   47   48   49
   50   51   52   53   54   55   56   57   58   59
   60   61   62   63   64   65   66   67   68   69
   70   71   72   73   74   75   76   77   78   79
   80   81   82   83   84   85   86   87   88   89
   90   91   92   93   94   95   96   97   98   99
  100  101  102  103  104  105  106  107  108  109
  110  111  112  113  114  115  116  117  118  119
  120  121  122  123  124  125  126  127  128  129
  130  131  132  133  134  135  136  137  138  139
  140  141  142  143  144  145  146  147  148  149
  150  151  152  153  154  155  156  157  158  159
  160  161  162  163  164  165  166  167  168  169
  170  171  172  173  174  175  176  177  178  179
  180  181  182  183  184  185  186  187  188  189
  190  191  192  193  194  195  196  197  198  199
  200  201  202  203  204  205  206  207  208  209
  210  211  212  213  214  215  216  217  218  219
  220  221  222  223  224  225  226  227  228  229
  230  231  232  233  234  235  236  237  238  239
  240  241  242  243  244  245  246  247  248  249
  250  251  252  253  254  255
Total number of bytes read = 256

And the file after being run through the modified "threshold" program would be displayed as
Code:
    0    1    2    3    4    5    6    7    8    9
   10   11   12   13   14   15   16   17   18   19
   20   21   22   23   24   25   26   27   28   29
   30   31   32   33   34   35   36   37   38   39
   40   41   42   43   44   45   46   47   48   49
   50   51   52   53   54   55   56   57   58   59
   60   61   62   63   64   65   66   67   68   69
   70   71   72   73   74   75   76   77   78   79
    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0
Total number of bytes read = 256
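
The core of the modified "threshold" step is not shown in the thread. A minimal sketch of it, assuming the bytes are treated as unsigned char and that the whole file has already been read into a buffer, might be the following. The name `apply_threshold` and its return value are illustrative choices, not the poster's code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: set every byte that is >= threshold to zero.
 * The buffer is unsigned char, so values 128..255 compare as
 * 128..255 rather than as negative numbers.
 * Returns the number of bytes that met the condition. */
size_t apply_threshold(unsigned char *buf, size_t n, unsigned char threshold)
{
    size_t i;
    size_t hits = 0;
    assert(buf != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (buf[i] >= threshold) {
            buf[i] = 0;
            hits++;
        }
    }
    return hits;
}
```

Run over the 256-byte test file with a threshold of 80, this leaves bytes 0 through 79 alone and zeroes the remaining 176, which matches the listing above.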

Bottom line: After testing my threshold program on a known file containing every possible byte value, I might run the program on the (larger) actual image file. If the bytes really were changed "randomly," I might look for some kind of undefined behavior (reading or writing beyond the limits of allocated memory, stuff like that).


Regards,

Dave

Footnote: I would never write a program that assumes files were opened successfully or that they were read correctly. I really mean this: Never. Period. I showed how I test for open files.

If I were going to read into an array using fread(), I might do something like
CPP / C++ / C Code:
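.
.
.
    size_t number_read; /* fread() returns a size_t count */
.
.
.
    b3_dn = malloc(size); /* sizeof(char) is always 1 */
    if (!b3_dn) {
        /* size is an unsigned long in the original program */
        printf("Can't allocate %lu chars for the buffer\n", size);
        exit(EXIT_FAILURE);
    }
    number_read = fread(b3_dn, 1, size, infile);
    printf("Number of bytes read = %lu\n", (unsigned long)number_read);

    /* Only process the bytes that were actually read */
    for (i = 0; i < number_read; i++) {
        if (b3_dn[i] >= 50)
            b3_dn[i] = 0;
    }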

I would also check the number of bytes written by fwrite and report this also.
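
A rough sketch of both ideas (checking the fwrite count, and not hard-coding the file size) is below. The helper names `stream_size` and `write_checked` are hypothetical. Note also that finding a binary file's size with fseek(..., SEEK_END) and ftell() works on common desktop platforms but is not strictly guaranteed by the C standard, so treat that part as an assumption.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: size in bytes of an open binary stream,
 * or -1 on error. Leaves the stream positioned at the start. */
long stream_size(FILE *fp)
{
    long size;
    assert(fp != NULL);
    if (fseek(fp, 0L, SEEK_END) != 0)
        return -1L;
    size = ftell(fp);
    if (fseek(fp, 0L, SEEK_SET) != 0)
        return -1L;
    return size;
}

/* Hypothetical helper: write n bytes, report how many were actually
 * written, and return 0 on success or -1 on a short write. */
int write_checked(const unsigned char *buf, size_t n, FILE *out)
{
    size_t written;
    assert(out != NULL);
    written = fwrite(buf, 1, n, out);
    printf("Number of bytes written = %lu\n", (unsigned long)written);
    if (written != n) {
        fprintf(stderr, "Short write: expected %lu bytes\n",
                (unsigned long)n);
        return -1;
    }
    return 0;
}
```

The value returned by stream_size() could then be passed to malloc() and fread() in place of the fixed 1249820.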
#5
09-Apr-2009, 00:07
bobby2009

Re: Working with binary files


my data points are in unsigned char - they run from 0-255.

i will try out your suggestions and get back. thanks
#6
11-Apr-2009, 17:50
Mexican Bob (Member)
Join Date: Mar 2008
Location: Chicxulub, Yucatán
Posts: 226

Re: working with binary files


Quote:
Originally Posted by davekw7x
What else could it be? 1000 data points, 1000 bytes so number of bytes per data point has to be one, right?

...unless the data is compressed


MxB
#7
11-Apr-2009, 19:01
davekw7x

Re: working with binary files


Quote:
Originally Posted by Mexican Bob
...unless the data is compressed...
Huh?
I was responding to the Original Post:
Quote:
Originally Posted by bobby2009
...say my binary file is 1000 bytes in size, and i know that the file contains 1000 data points...
According to my arithmetic:

1000 bytes divided by 1000 data points = 1 byte per data point.

1000 data points divided by 1000 bytes = 1 data point per byte.

Or am I missing something?

Regards,

Dave
#8
11-Apr-2009, 19:06
Mexican Bob

Re: working with binary files


Quote:
Originally Posted by davekw7x
Huh?
I was responding to the Original Post:

According to my arithmetic:

1000 bytes divided by 1000 data points = 1 byte per data point.

1000 data points divided by 1000 bytes = 1 data point per byte.

Or am I missing something?

Regards,

Dave

Lighten up Dave...it was just a joke. (note big smilie) Obviously, "it could be" something else if the data was compressed, but (also obviously) highly unlikely considering the provided info.


MxB
#9
12-Apr-2009, 10:13
davekw7x

Re: working with binary files


Quote:
Originally Posted by Mexican Bob
...note big smilie...
Well
Quote:
Originally Posted by davekw7x
am I missing something?

I tend to overlook emoticons as well as the execrable and ubiquitous presence of "lol" in e-mail and other messages these days. My poor little peanut brain just filters them out unless someone draws particular attention. (Thank you, at least, for not putting in a dancing banana!)

Oh, well...

Regards,

Dave
#10
13-Apr-2009, 01:53
bobby2009

Re: Working with binary files


relax guys, turns out it's a simple mistake that was bugging me - i had declared a signed char array to read the data, which means the values in my array are interpreted as -128 to 127, whereas my data runs from 0-255, so i have to use unsigned char instead. thx for pitching in
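
For anyone landing here later, a small sketch of why the original masking looked "random" is below. It assumes a typical two's-complement platform, where converting a byte such as 200 to signed char gives -56 (strictly, that conversion is implementation-defined). The function names are invented for illustration.

```c
/* Hypothetical helper: compare a raw byte against the threshold
 * after viewing it as a signed char, as the original program did
 * on a platform where plain char is signed. */
int passes_signed(unsigned char raw, int threshold)
{
    /* Implementation-defined for raw > 127; commonly raw - 256. */
    signed char s = (signed char)raw;
    return s >= threshold;
}

/* Hypothetical helper: the same comparison on the unsigned value,
 * which is what 0..255 pixel data needs. */
int passes_unsigned(unsigned char raw, int threshold)
{
    return raw >= threshold;
}
```

With a threshold of 50, pixels 50 to 127 are zeroed either way, but pixels 128 to 255 look negative through the signed view and are left untouched, so only part of the image gets masked.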
 
 
