GIDForums  

  #1  
Old 01-Jun-2008, 09:39
ngjackie

Hex dump to string


Quote:
0000: FF D8 FF E0 00 10 4A 46 49 46 00 01 02 01 00 48 ......JFIF.....H
0010: 00 48 00 00 FF ED 0A 96 50 68 6F 74 6F 73 68 6F .H......Photosho
0020: 70 20 33 2E 30 00 38 42 49 4D 04 04 07 43 61 70 p 3.0.8BIM...Cap
This is an extract from Wikipedia showing the ASCII text translation that corresponds to the hex dump above. Why are some codes translated to a dot (.)? What should FF, D8, E0, and so on be translated to? What do they mean? Are they possibly Unicode characters? Can Unicode characters be converted to a hex dump and vice versa? Or is there no corresponding translation for them? Is there a list or table showing the hex dump-to-string translation? I want to understand hex dumps better. Thanks in advance.
  #2  
Old 01-Jun-2008, 11:28
davekw7x

Re: Hex dump to string


Quote:
Originally Posted by ngjackie
This is an extract from Wikipedia showing the ASCII text translation that corresponds to the hex dump above. Why are some codes translated to a dot (.)?
If you read the Wikipedia entry http://en.wikipedia.org/wiki/ASCII#A...ble_characters, you will see that ASCII printable characters have values from 0x20 up to and including 0x7e. Anything outside that range is shown as a '.' on the right-hand side of each line of the hex dump.
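To make the rule concrete, here is a minimal sketch of how a dump program might build one line of output. The names dump_char and dump_row are made up for this illustration, and the layout simply imitates the three lines quoted in the first post; real hex dump tools differ in details.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Character shown in the right-hand column for one byte:
// printable ASCII (0x20..0x7E) is shown as itself, anything else as '.'.
char dump_char(unsigned char byte) {
    return (byte >= 0x20 && byte <= 0x7E) ? static_cast<char>(byte) : '.';
}

// Format one row of bytes as "OFFSET: HH HH ... text".
// (Hypothetical helper; the layout copies the dump in the first post.)
std::string dump_row(std::size_t offset, const std::vector<unsigned char>& row) {
    char buf[32];
    std::snprintf(buf, sizeof buf, "%04zX:", offset);
    std::string out = buf;
    std::string text;
    for (unsigned char b : row) {
        std::snprintf(buf, sizeof buf, " %02X", static_cast<unsigned>(b));
        out += buf;              // hex column
        text += dump_char(b);    // text column
    }
    return out + " " + text;
}
```

Feeding it the first 16 bytes of the quoted dump should reproduce the first quoted line, dots included.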
Quote:
Originally Posted by ngjackie
What should FF, D8, E0, and so on be translated to?
They are not ASCII printable chars, so they are shown as dots.
Quote:
Originally Posted by ngjackie
What do they mean?
By themselves they don't "mean" anything. They are characters in a file. The "meaning" of file contents depends on the file.
Quote:
Originally Posted by ngjackie
Are they possibly Unicode characters?
Well, they could be anything. That hex dump is apparently from a JPEG File Interchange Format (JFIF) file, which you can read about, starting here: http://www.w3.org/Graphics/JPEG/. If, indeed, it is a JPEG file, it's a binary file. None of the chars is Unicode for anything.

The "meaning" of the bytes in a JFIF file can be seen by reading the JFIF specification: www.w3.org/Graphics/JPEG/jfif3.pdf
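As a rough illustration of what "meaning" looks like once you know the format, the sketch below reads the first few fields of a JFIF header. It assumes the usual layout: the SOI marker FF D8, the APP0 marker FF E0, a two-byte big-endian segment length, the identifier "JFIF" followed by a zero byte, and then the major and minor version bytes. The struct and function names are made up for the example, and nothing past the version bytes is examined.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Fields taken from the start of a suspected JFIF file (illustrative only).
struct JfifHeader {
    bool valid;                 // true if SOI, APP0 and "JFIF\0" were all found
    std::uint16_t app0_length;  // APP0 segment length, stored big-endian in the file
    int version_major;
    int version_minor;
};

JfifHeader parse_jfif_header(const unsigned char* data, std::size_t size) {
    JfifHeader h{false, 0, 0, 0};
    if (size < 13) return h;                            // too short to hold the fields
    if (data[0] != 0xFF || data[1] != 0xD8) return h;   // SOI marker
    if (data[2] != 0xFF || data[3] != 0xE0) return h;   // APP0 marker
    h.app0_length = static_cast<std::uint16_t>((data[4] << 8) | data[5]);
    if (std::memcmp(data + 6, "JFIF\0", 5) != 0) return h;  // identifier + NUL
    h.version_major = data[11];
    h.version_minor = data[12];
    h.valid = true;
    return h;
}
```

For the dump in the first post this would report an APP0 length of 16 and version 1.02, so FF D8 FF E0 are markers in the file structure, not characters.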
Quote:
Originally Posted by ngjackie
Are Unicode characters able to be converted to hex dump and vice versa?
You can write a program to convert anything to anything. A hex dump program like the one in your example doesn't print any chars other than ASCII printable chars. Note that, generally speaking, trying to interpret byte sequences inside a binary file (such as a JPEG file) as Unicode chars might not be very enlightening.
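For text that really is Unicode, the conversion in both directions is mechanical once an encoding is chosen. The following is a small sketch assuming the text is stored as UTF-8 in a std::string; to_hex and from_hex are names invented for this example, and from_hex expects space-separated hex pairs with no error handling.

```cpp
#include <cstdio>
#include <sstream>
#include <string>

// Bytes of `s` as space-separated uppercase hex pairs.
std::string to_hex(const std::string& s) {
    std::string out;
    char buf[4];
    for (unsigned char b : s) {
        if (!out.empty()) out += ' ';
        std::snprintf(buf, sizeof buf, "%02X", static_cast<unsigned>(b));
        out += buf;
    }
    return out;
}

// Inverse of to_hex: parse space-separated hex pairs back into bytes.
std::string from_hex(const std::string& hex) {
    std::istringstream in(hex);
    std::string out;
    unsigned value = 0;
    while (in >> std::hex >> value) {
        out += static_cast<char>(value & 0xFF);
    }
    return out;
}
```

For instance, to_hex("A") gives "41", while the UTF-8 encoding of U+00E9 (e with acute accent), written here as "\xC3\xA9", gives "C3 A9". A dump program that only knows printable ASCII would show those two bytes as two dots.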


Regards,

Dave
  #3  
Old 01-Jun-2008, 12:03
ocicat

Re: Hex dump to string


Quote:
Originally Posted by ngjackie
Why are some codes translated to a dot (.)?
Not every two-digit hexadecimal sequence represents a printable ASCII character. Because of this, most programs which attempt to translate hexadecimal dumps will use a period as a placeholder for a character which is non-printable.
Quote:
What do they mean?
You have not provided information as to what the hexdump above represents, so it is impossible to give a definitive answer.
Quote:
Are they possibly Unicode characters?
Perhaps.
Quote:
Is there a list or table showing the hex dump-to-string translation?
Yes, most tables showing ASCII conversions will display this information.

Perhaps the following two examples will help provide perspective.
  • Dumping the contents of the first sector of a Windows XP installation yields:
    Code:
    0000 FA 33 C0 8E D0 BC 00 7C 8B F4 50 07 50 1F FB FC |.3.....|..P.P...|
    0010 BF 00 06 B9 00 01 F2 A5 EA 1D 06 00 00 BE BE 07 |................|
    0020 B3 04 80 3C 80 74 0E 80 3C 00 75 1C 83 C6 10 FE |...<.t..<.u.....|
    0030 CB 75 EF CD 18 8B 14 8B 4C 02 8B EE 83 C6 10 FE |.u......L.......|
    0040 CB 74 1A 80 3C 00 74 F4 BE 8B 06 AC 3C 00 74 0B |.t..<.t.....<.t.|
    0050 56 BB 07 00 B4 0E CD 10 5E EB F0 EB FE BF 05 00 |V.......^.......|
    0060 BB 00 7C B8 01 02 57 CD 13 5F 73 0C 33 C0 CD 13 |..|...W.._s.3...|
    0070 4F 75 ED BE A3 06 EB D3 BE C2 06 BF FE 7D 81 3D |Ou...........}.=|
    0080 55 AA 75 C7 8B F5 EA 00 7C 00 00 49 6E 76 61 6C |U.u.....|..Inval|
    0090 69 64 20 70 61 72 74 69 74 69 6F 6E 20 74 61 62 |id partition tab|
    00A0 6C 65 00 45 72 72 6F 72 20 6C 6F 61 64 69 6E 67 |le.Error loading|
    00B0 20 6F 70 65 72 61 74 69 6E 67 20 73 79 73 74 65 | operating syste|
    00C0 6D 00 4D 69 73 73 69 6E 67 20 6F 70 65 72 61 74 |m.Missing operat|
    00D0 69 6E 67 20 73 79 73 74 65 6D 00 00 00 00 00 00 |ing system......|
    00E0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    00F0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    0100 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    0110 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    0120 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    0130 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    0140 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    0150 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    0160 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    0170 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    0180 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    0190 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    01A0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    01B0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 80 01 |................|
    01C0 01 00 0B 7F BF FD 3F 00 00 00 C1 40 5E 00 00 00 |......?....@^...|
    01D0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    01E0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    01F0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 55 AA |..............U.|
    Obviously some error strings are present, but most of the rest of what is displayed is the numerical values of executable instructions. Without getting too deeply into this example, I will also point out that the last 66 bytes hold the table of four partition entries found in the first sector of any Intel-based system (64 bytes, starting at offset 01BE) plus the two-byte magic number 55 AA needed by the BIOS.

    The fundamental point here is that this is what is going on at the machine code level. Machine code mixes data & instructions together, & it can be difficult to distinguish the two simply from a hexadecimal dump.
  • In comparison, look at the hexadecimal dump of the first 512 bytes of a PDF file:
    Code:
    00000000 25 50 44 46 2d 31 2e 32 0d 25 e2 e3 cf d3 0d 0a |%PDF-1.2.%......|
    00000010 38 31 39 31 20 30 20 6f 62 6a 0d 3c 3c 20 0d 2f |8191 0 obj.<< ./|
    00000020 4c 69 6e 65 61 72 69 7a 65 64 20 31 20 0d 2f 4f |Linearized 1 ./O|
    00000030 20 38 31 39 36 20 0d 2f 48 20 5b 20 31 32 36 33 | 8196 ./H [ 1263|
    00000040 36 20 35 32 38 30 20 5d 20 0d 2f 4c 20 33 36 33 |6 5280 ] ./L 363|
    00000050 39 31 37 32 20 0d 2f 45 20 39 30 37 35 33 20 0d |9172 ./E 90753 .|
    00000060 2f 4e 20 33 36 30 20 0d 2f 54 20 33 34 37 35 32 |/N 360 ./T 34752|
    00000070 33 32 20 0d 3e 3e 20 0d 65 6e 64 6f 62 6a 0d 20 |32 .>> .endobj. |
    00000080 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 |                |
    00000090 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 |                |
    000000a0 20 20 20 20 20 20 20 20 20 20 20 20 20 78 72 65 |             xre|
    000000b0 66 0d 38 31 39 31 20 35 39 38 20 0d 30 30 30 30 |f.8191 598 .0000|
    000000c0 30 30 30 30 31 36 20 30 30 30 30 30 20 6e 0d 0a |000016 00000 n..|
    000000d0 30 30 30 30 30 31 32 33 31 36 20 30 30 30 30 30 |0000012316 00000|
    000000e0 20 6e 0d 0a 30 30 30 30 30 31 32 35 30 31 20 30 | n..0000012501 0|
    000000f0 30 30 30 30 20 6e 0d 0a 30 30 30 30 30 31 32 35 |0000 n..00000125|
    00000100 33 34 20 30 30 30 30 30 20 6e 0d 0a 30 30 30 30 |34 00000 n..0000|
    00000110 30 31 32 35 39 33 20 30 30 30 30 30 20 6e 0d 0a |012593 00000 n..|
    00000120 30 30 30 30 30 31 37 39 31 36 20 30 30 30 30 30 |0000017916 00000|
    00000130 20 6e 0d 0a 30 30 30 30 30 31 38 31 30 36 20 30 | n..0000018106 0|
    00000140 30 30 30 30 20 6e 0d 0a 30 30 30 30 30 31 38 31 |0000 n..00000181|
    00000150 37 35 20 30 30 30 30 30 20 6e 0d 0a 30 30 30 30 |75 00000 n..0000|
    00000160 30 31 38 33 35 33 20 30 30 30 30 30 20 6e 0d 0a |018353 00000 n..|
    00000170 30 30 30 30 30 31 38 35 32 39 20 30 30 30 30 30 |0000018529 00000|
    00000180 20 6e 0d 0a 30 30 30 30 30 31 38 37 33 31 20 30 | n..0000018731 0|
    00000190 30 30 30 30 20 6e 0d 0a 30 30 30 30 30 31 38 38 |0000 n..00000188|
    000001a0 34 35 20 30 30 30 30 30 20 6e 0d 0a 30 30 30 30 |45 00000 n..0000|
    000001b0 30 31 39 30 32 36 20 30 30 30 30 30 20 6e 0d 0a |019026 00000 n..|
    000001c0 30 30 30 30 30 31 39 32 32 32 20 30 30 30 30 30 |0000019222 00000|
    000001d0 20 6e 0d 0a 30 30 30 30 30 31 39 33 36 36 20 30 | n..0000019366 0|
    000001e0 30 30 30 30 20 6e 0d 0a 30 30 30 30 30 31 39 35 |0000 n..00000195|
    000001f0 36 37 20 30 30 30 30 30 20 6e 0d 0a 30 30 30 30 |67 00000 n..0000|
    Obviously, there are some embedded strings, but the rest is sequences used in defining the PDF file format. Without knowing that this is a portion of a data file, it would be difficult to discern whether this represented a mixture of instructions & data, or in this case, all data.
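To show how knowing the layout turns the tail of the boot sector dump into data, here is a minimal sketch that reads one partition entry. It assumes the classic master boot record layout: four 16-byte entries starting at offset 0x1BE, with the boot flag at entry offset 0, the partition type at offset 4, the first sector (LBA) at offset 8 and the sector count at offset 12, both stored little-endian, followed by the signature 55 AA at offsets 510 and 511. All type and function names are invented for this example.

```cpp
#include <cstddef>
#include <cstdint>

// A few fields of one MBR partition entry (illustrative subset).
struct PartitionEntry {
    bool bootable;               // status byte is 0x80
    std::uint8_t type;           // partition type code
    std::uint32_t first_lba;     // first sector of the partition
    std::uint32_t sector_count;  // number of sectors in the partition
};

// Read a 32-bit little-endian value from four consecutive bytes.
std::uint32_t read_le32(const unsigned char* p) {
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// True if the 512-byte sector ends with the 55 AA magic number.
bool has_boot_signature(const unsigned char* sector) {
    return sector[510] == 0x55 && sector[511] == 0xAA;
}

// Decode partition entry `index` (0..3) from a 512-byte sector.
PartitionEntry read_partition(const unsigned char* sector, int index) {
    const unsigned char* e = sector + 0x1BE + 16 * index;
    return PartitionEntry{e[0] == 0x80, e[4], read_le32(e + 8), read_le32(e + 12)};
}
```

Applied to the boot sector dump above, the first entry would come out as bootable, type 0x0B, starting at sector 63 and spanning 6176961 sectors, while the other three entries are all zeros. The same bytes read as text are just a row of dots.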
What you should take away from these examples is that context is everything. Since everything ultimately is represented as numerical values, knowing what is data & what are instructions depends upon how the stream of hexadecimal values is being used. A raw hexadecimal dump doesn't provide this perspective.
  #4  
Old 03-Jun-2008, 03:38
ngjackie

Re: Hex dump to string


Thanks to davekw7x and ocicat for your explanations. I thought those non-printable characters might be something other than letters, numbers, and symbols, such as an instruction, a character from another language, or encrypted data. By the way, could they turn out to be one of those (characters of other languages or encrypted data)?

Quote:
"You have not provided information as to what the hex dump above represents, so it is impossible to give a definitive answer," said ocicat.

I'm not very clear about this part you said. Do you mean that if we know the file format of a particular hex dump, then we are able to work out the character or symbol it represents?

Here's another question. In a hex dump, might the printed ASCII characters not carry their literal meaning? In other words, a printed character might stand for something other than itself. For example, a number or character such as 0 might mean something else. It might mean true or false (Boolean). Would it?

Thanks again.
  #5  
Old 03-Jun-2008, 04:28
ocicat

Re: Hex dump to string


Quote:
Originally Posted by ngjackie
By the way, could they turn out to be one of those (characters of other languages or encrypted data)?
Yes, these are possibilities.
Quote:
I'm not very clear about this part you said. Do you mean that if we know the file format of a particular hex dump, then we are able to work out the character or symbol it represents?
If you know what the file is, or if you know the file format, yes, you may be able to reverse engineer back to the original contents if that is your goal, but this is dependent upon a lot of issues falling into place. Your original example was approximately 50 bytes. If the heading had not revealed the obvious string "Photoshop", then it would be very hard, if not impossible, to figure out the original meaning or source.
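A common first step in working out what a dump came from is to compare its leading bytes with known signatures. The sketch below checks only the three formats that appear in this thread: FF D8 FF at the start for JPEG, the text "%PDF-" at the start for PDF, and 55 AA at offsets 510 and 511 for a boot sector. The function name guess_format is made up, and a match is only a hint, not proof.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Guess a format from well-known signature bytes (very incomplete on purpose).
std::string guess_format(const unsigned char* data, std::size_t size) {
    if (size >= 3 && data[0] == 0xFF && data[1] == 0xD8 && data[2] == 0xFF)
        return "JPEG";
    if (size >= 5 && std::memcmp(data, "%PDF-", 5) == 0)
        return "PDF";
    if (size >= 512 && data[510] == 0x55 && data[511] == 0xAA)
        return "boot sector";
    return "unknown";
}
```

Anything that falls through to "unknown" is where the detective work starts.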
Quote:
Here's another question. In a hex dump, might the printed ASCII characters not carry their literal meaning?
Correct. For example, 'A' by itself could mean anything, but a string of characters which easily translates into a spoken/written language is most likely text. Again, look at the examples I provided above.
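One simple way to act on that observation is to pull out only the longer runs of printable characters and ignore isolated ones. Here is a minimal sketch of the idea; the function name printable_runs and the default minimum length of 4 are choices made for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Collect runs of printable ASCII (0x20..0x7E) that are at least min_len long.
std::vector<std::string> printable_runs(const std::vector<unsigned char>& data,
                                        std::size_t min_len = 4) {
    std::vector<std::string> runs;
    std::string current;
    for (unsigned char b : data) {
        if (b >= 0x20 && b <= 0x7E) {
            current += static_cast<char>(b);   // extend the current run
        } else {
            if (current.size() >= min_len) runs.push_back(current);
            current.clear();                   // a non-printable byte ends the run
        }
    }
    if (current.size() >= min_len) runs.push_back(current);  // run at end of data
    return runs;
}
```

On the 48 bytes from the first post this yields "JFIF", "Photoshop 3.0" and "8BIM", while the lone 'H' characters and the trailing "Cap" are dropped as too short to say anything about.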
Quote:
For example, a number or character such as 0 might mean something else. It might mean true or false (Boolean). Would it?
Remember that '0' in ASCII is 0x30. '0' might represent a Boolean value, but most likely it will be part of something else. Isolating any single character in a hexdump without looking at what is around it, the origin of the file, the hardware it was found on, or the file format makes it very difficult to say definitively what it is.
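The difference between the character '0' and the value zero can be seen by reading one byte three ways. This is only a sketch: describe_byte is a made-up helper, the "anything non-zero is true" rule is the C and C++ convention rather than something a file format has to follow, and the static_assert assumes an ASCII-compatible execution character set.

```cpp
#include <string>

// Assumption: the compiler uses an ASCII-compatible character set.
static_assert('0' == 0x30, "character '0' is expected to be byte 0x30");

// Show one byte as a character, as an integer, and as a C++-style Boolean.
std::string describe_byte(unsigned char b) {
    std::string as_char =
        (b >= 0x20 && b <= 0x7E) ? std::string(1, static_cast<char>(b)) : ".";
    std::string as_bool = (b != 0) ? "true" : "false";
    return "char='" + as_char + "' int=" + std::to_string(static_cast<int>(b)) +
           " bool=" + as_bool;
}
```

So describe_byte(0x30) reports the character '0', the integer 48 and true, whereas describe_byte(0x00) reports a dot, the integer 0 and false. Which of those readings is the right one depends entirely on the format the byte came from.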

Welcome to file system & disk forensics. It isn't an obvious set of technologies or skills. Again, context is everything.
  #6  
Old 04-Jun-2008, 05:21
ngjackie

Re: Hex dump to string


Thank you very much. I got it now.
 
 
