Greek letter encoding inside zip files on Linux

Greek characters in filenames inside zip files created on Windows, which use only a Latin encoding for filenames, are not handled correctly on Linux. This article presents a solution to the problem.

If you uncompress such a file, you end up with garbled filenames. One solution, of course, is to run a Windows product such as WinZip through Wine. Here we show a method that uses the convmv tool instead.

I am using Arch Linux, where convmv is available in the extra repository.

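Let’s say we have a file.zip containing filenames with Greek letters. First we create a directory and put file.zip inside it, so that the wildcard in the commands below touches only the extracted files. From inside that directory, we uncompress the zip file: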

unzip -^ file.zip

The -^ option allows non-printable (control) characters in the names of extracted files.

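Now we convert the filenames from cp1252 to cp850 encoding. The --notest option makes convmv actually rename the files (by default it only prints what it would do), and -r recurses into subdirectories: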

convmv --notest -r -f cp1252 -t cp850 *

And then, from cp737 to utf8 encoding:

convmv --notest -r -f cp737 -t utf8 *

The Greek filenames are now encoded in UTF-8 and display correctly! The above procedure can be written as a shell script:

#!/bin/sh
# Extract files from a ZIP with Windows-encoded Greek filenames,
# then try to convert all filenames to UTF-8.
unzip -^ "$@"
convmv --notest -r -f cp1252 -t cp850 *
convmv --notest -r -f cp737 -t utf8 *
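
To see why two conversions are needed, the chain can be reproduced on a plain string with iconv. This is only a sketch under an assumption inferred from the convmv steps, not something the original procedure states: the name was stored in the zip as cp737 bytes, read as if it were cp850, and written to disk as cp1252. The function names garble and repair are made up for this illustration.

```shell
#!/bin/sh
# Sketch: reproduce the garbling and the two-step repair on a string.
# Assumption (inferred, not verified): the name was stored as cp737
# bytes, read as cp850, and written out as cp1252.

# garble: UTF-8 text on stdin -> the garbled bytes that end up on disk
garble() {
    iconv -f UTF-8 -t CP737 | iconv -f CP850 -t CP1252
}

# repair: garbled bytes on stdin -> UTF-8 text,
# mirroring the two convmv calls (cp1252 -> cp850, then cp737 -> utf8)
repair() {
    iconv -f CP1252 -t CP850 | iconv -f CP737 -t UTF-8
}

# Round trip: garble a Greek name, then repair it again
printf '%s\n' 'αβγ.txt' | garble | repair
```

The first step of repair undoes the wrong cp850-to-cp1252 rewrite and recovers the original bytes; only then can those bytes be decoded as cp737, the Greek DOS code page.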

Note that this solution was found at forum.ubuntu-gr.org.
