
The dots between the letters in the Unicode version do not indicate an abbreviation or some ASCII art: they represent non-printable bytes (in a two-byte encoding such as UTF-16, the extra byte of each basic Latin letter is 0x00). Thanks to backward compatibility with the single-byte ASCII mapping, the basic alphanumeric characters are still readable in their 1-byte representation, hence the letters between the dots :) Now, when searching for text inside a binary file, some editors let you specify whether you are searching for 1-byte or 2-byte encoded strings.
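If your editor has no such option, the same kind of search can be approximated in a few lines of Python. This is only a rough sketch: the function name `find_text` and the sample bytes are made up for illustration, and it assumes the 2-byte variant is UTF-16 little-endian, which is common but not the only possibility.

```python
def find_text(data, phrase):
    """Return the byte offsets of `phrase` in `data`, per candidate encoding."""
    hits = {}
    # Assumption: "1-byte" means ASCII and "2-byte" means UTF-16 little-endian.
    for encoding in ("ascii", "utf-16-le"):
        needle = phrase.encode(encoding)
        offsets = []
        pos = data.find(needle)
        while pos != -1:
            offsets.append(pos)
            pos = data.find(needle, pos + 1)
        hits[encoding] = offsets
    return hits


# Made-up binary data containing "hello" once in each encoding.
sample = b"\x00\x01hello\x20" + "hello".encode("utf-16-le") + b"\xff"
result = find_text(sample, "hello")
print(result)  # {'ascii': [2], 'utf-16-le': [8]}
```

For a real file, you would read its contents with `open(path, "rb").read()` and pass the resulting bytes in place of `sample`.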

Single- and multi-byte character encoding of "hello" next to each other (separated by a space character, 0x20).
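The "hello" example can be reproduced with a short Python sketch that prints one row the way a hex editor would: hex values on the left, text column on the right. The helper `hexdump_row` is a hypothetical name, and the 2-byte encoding is assumed to be UTF-16 little-endian.

```python
def hexdump_row(data):
    """Render bytes as one hex-editor-style row: hex column, then text column."""
    hex_col = " ".join(f"{b:02x}" for b in data)
    # Printable ASCII is shown as-is; every other byte is shown as a dot.
    text_col = "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in data)
    return f"{hex_col}  |{text_col}|"


single = "hello".encode("ascii")      # 1 byte per letter
double = "hello".encode("utf-16-le")  # 2 bytes per letter (assumed encoding)
row = hexdump_row(single + b"\x20" + double)
print(row)
# 68 65 6c 6c 6f 20 68 00 65 00 6c 00 6c 00 6f 00  |hello h.e.l.l.o.|
```

Each dot in the text column corresponds to a 0x00 byte in the hex column.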

If it's a number, you already know what to do, but if it's text, you may find two different ways in which it's displayed in the text column. A made-up example illustrates how these two encodings would look in a hex editor.
This part is about handling the data in the file itself, such as searching or editing text, or reading and editing binary data with the help of a file format specification. This article is mainly aimed at archivists with digital preservation needs, and perhaps also at beginners in data forensics. There is also a PDF version of this document available for download, as well as its Markdown source code.

Handling text in binary files

Searching Text

As already mentioned, you'll encounter text phrases inside binary files. If you look at a lossy-compressed audio file like MP3 or MP4, it's quite likely to contain embedded descriptive metadata. In order to read embedded metadata, you need to know at which byte offsets to find which data field and how to interpret it.
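To make the point about byte offsets and field interpretation concrete, here is a minimal Python sketch that reads an ID3v1 tag, a simple metadata block that some MP3 files carry in their last 128 bytes. ID3v1 is chosen here purely as an illustration and is not taken from the text above; the function name `read_id3v1` and the fake file contents are made up, and decoding the text fields as Latin-1 is an assumption, since ID3v1 itself does not fix a character encoding.

```python
import struct

# ID3v1 layout (128 bytes): "TAG" marker, title, artist, album, year, comment, genre
ID3V1_FORMAT = "<3s30s30s30s4s30sB"


def read_id3v1(data):
    """Parse an ID3v1 tag from the final 128 bytes of `data`, or return None."""
    tag = data[-128:]
    if len(tag) != 128 or not tag.startswith(b"TAG"):
        return None
    _marker, title, artist, album, year, _comment, genre = struct.unpack(ID3V1_FORMAT, tag)

    def text(raw):
        # Fields are fixed-width and padded; cut at the first 0x00 byte.
        # Assumption: text is Latin-1 encoded.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(title),
        "artist": text(artist),
        "album": text(album),
        "year": text(year),
        "genre": genre,  # numeric index into a genre list
    }


# Made-up "file": a few placeholder bytes followed by a hand-built tag.
fake_tag = struct.pack(ID3V1_FORMAT, b"TAG", b"Demo", b"Nobody", b"", b"2020", b"", 12)
fake_file = b"\xff\xfb" + b"\x00" * 16 + fake_tag
info = read_id3v1(fake_file)
print(info)
```

The format string plays the role of the specification here: it states which field sits at which offset and how many bytes it occupies, while the `text` helper decides how those bytes are interpreted.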

In the previous part, we covered the basics of "what is hex?" and how to use a hex editor for binary files.
