Difference between revisions of "Unzip"

From CBLFS
(64Bit)
 
(13 intermediate revisions by 3 users not shown)
Line 2: Line 2:
  |-valign="top"
  !Download Source:
- | http://www.mirrorservice.org/sites/ftp.info-zip.org/pub/infozip/src/unzip{{Unzip-Version}}.tar.gz
+ | ftp://ftp.info-zip.org/pub/infozip/src/unzip{{Unzip-Version}}.tgz
- |-valign="top"
- !Download Source:
- | ftp://ftp.info-zip.org/pub/infozip/src/unzip{{Unzip-Version}}.tar.gz
  |}

  ----

- == Introduction to Unzip ==
+ {{Package-Introduction|UnZip is an extraction utility for archives compressed in .zip format.|http://www.info-zip.org/UnZip.html}}
- The UnZip package contains ZIP extraction utilities. These are useful for extracting files from ZIP archives. ZIP archives are created with PKZIP or Info-ZIP utilities, primarily in a DOS environment.

  == Dependencies ==
Line 18: Line 13:
  == Unzip Locale Related Issues ==

- '''Note:'''
+ {{Note|Use of UnZip in the JDK, Mozilla, DocBook or any other CBLFS package installation is not a problem, as CBLFS instructions never use UnZip to extract a file with non-ASCII characters in the file's name.}}
- Use of UnZip in the JDK, Mozilla, DocBook or any other CBLFS package
- installation is not a problem, as CBLFS instructions never use UnZip
- to extract a file with non-ASCII characters in the file's name.

  The UnZip package assumes that filenames stored in the ZIP archives created on non-Unix systems are encoded in CP850, and that they should be converted to ISO-8859-1 when writing files onto the filesystem. Such assumptions are not always valid. In fact, inside the ZIP archive, filenames are encoded in the DOS codepage that is in use in the relevant country, and the filenames on disk should be in the locale encoding. In MS Windows, the OemToChar() C function (from User32.DLL) does the correct conversion (which is indeed the conversion from CP850 to a superset of ISO-8859-1 if MS Windows is set up to use the US English language), but there is no equivalent in Linux.
Line 46: Line 38:
  == Non-Multilib ==

- Compile the package for Intel processor systems:
+ Compile the package:
-  make -f unix/Makefile LOCAL_UNZIP=-D_FILE_OFFSET_BITS=64 linux
- Compile the package for '''non'''-Intel processor systems:

-  make -f unix/Makefile LOCAL_UNZIP=-D_FILE_OFFSET_BITS=64 linux_noasm
+  make -f unix/Makefile generic CC="gcc -DUSE_BZIP2 -lbz2"

  Install the package

-  make prefix=/usr install
+  make -f unix/Makefile prefix=/usr install
- === Command Explanations ===
- ''linux'': This target in the Makefile makes assumptions that are useful for a Linux system when compiling the executables. To obtain alternatives to this target, use '''make list'''.
- ''LOCAL_UNZIP=...'': This sets the compilation flags to allow UnZip to handle files up to 4 GB.

  == Multilib ==
- '''''This package does not provide any libraries so only one installation is needed.'''''

  === 32Bit ===
Line 71: Line 52:
  Compile the package:

-  make -f unix/Makefile LOCAL_UNZIP=-D_FILE_OFFSET_BITS=64 \
+  make -f unix/Makefile generic CC="gcc ${BUILD32} -DUSE_BZIP2 -lbz2"
-    CC="gcc ${BUILD32}" LD='$(CC)' CF="-O -Wall -I. -DUSE_UNSHRINK" \
-    unzips

  Install the package
Line 83: Line 62:
  Compile the package:

-  make -f unix/Makefile LOCAL_UNZIP=-D_FILE_OFFSET_BITS=64 \
+  make -f unix/Makefile generic CC="gcc ${BUILDN32} -DUSE_BZIP2 -lbz2"
-    CC="gcc ${BUILDN32}" LD='$(CC)' CF="-O -Wall -I. -DUSE_UNSHRINK" \
-    unzips

  Install the package
Line 95: Line 72:
  Compile the package:

-  make -f unix/Makefile LOCAL_UNZIP=-D_FILE_OFFSET_BITS=64 \
+  make -f unix/Makefile generic CC="gcc ${BUILD64} -DUSE_BZIP2 -lbz2"
-    CC="gcc ${BUILD64}" LD='$(CC)' CF="-O -Wall -I. -DUSE_UNSHRINK" \
-    unzips

  Install the package
Line 107: Line 82:
  {| style="text-align: left;"
  |-valign="top"
- ! Installed Programs:
+ !Installed Programs:
- | funzip, unzip, unzipfsx, zipgrep, and zipinfo
+ |funzip, unzip, unzipfsx, zipgrep, and zipinfo
  |-valign="top"
- ! Installed Libraries:
+ !Installed Libraries:
- | None
+ |libunzip.so
  |-valign="top"
- ! Installed Directories:
+ !Installed Directories:
- | None
+ |None
  |}

Line 121: Line 96:
  {| style="text-align: left;"
  |-valign="top"
- ! funzip
+ !funzip
- | allows the output of '''unzip''' commands to be redirected.
+ |allows the output of '''unzip''' commands to be redirected.
  |-valign="top"
- ! unzip
+ !unzip
- | lists, tests or extracts files from a ZIP archive.
+ |lists, tests or extracts files from a ZIP archive.
  |-valign="top"
- ! unzipfsx
+ !unzipfsx
- | is a self-extracting stub that can be prepended to a ZIP archive. Files in this format allow the recipient to decompress the archive without installing UnZip.
+ |is a self-extracting stub that can be prepended to a ZIP archive. Files in this format allow the recipient to decompress the archive without installing UnZip.
  |-valign="top"
- ! zipgrep
+ !zipgrep
- | searches files in a ZIP archive for lines matching a pattern.
+ |searches files in a ZIP archive for lines matching a pattern.
  |-valign="top"
- ! zipinfo
+ !zipinfo
- | produces technical information about the files in a ZIP archive, including file access permissions, encryption status, type of compression, etc.
+ |produces technical information about the files in a ZIP archive, including file access permissions, encryption status, type of compression, etc.
  |-valign="top"
- ! libunzip.so
+ !libunzip.so
- | contains the API functions required by the UnZip programs.
+ |contains the API functions required by the UnZip programs.
  |}
+
+ [[Category:General Utilities]]

Latest revision as of 21:34, 19 August 2009

Download Source: ftp://ftp.info-zip.org/pub/infozip/src/unzip60.tgz

Introduction to Unzip

UnZip is an extraction utility for archives compressed in .zip format.

Project Homepage: http://www.info-zip.org/UnZip.html

Dependencies

Unzip Locale Related Issues

Note

Use of UnZip in the JDK, Mozilla, DocBook or any other CBLFS package installation is not a problem, as CBLFS instructions never use UnZip to extract a file with non-ASCII characters in the file's name.

The UnZip package assumes that filenames stored in the ZIP archives created on non-Unix systems are encoded in CP850, and that they should be converted to ISO-8859-1 when writing files onto the filesystem. Such assumptions are not always valid. In fact, inside the ZIP archive, filenames are encoded in the DOS codepage that is in use in the relevant country, and the filenames on disk should be in the locale encoding. In MS Windows, the OemToChar() C function (from User32.DLL) does the correct conversion (which is indeed the conversion from CP850 to a superset of ISO-8859-1 if MS Windows is set up to use the US English language), but there is no equivalent in Linux.

When using unzip to unpack a ZIP archive containing non-ASCII filenames, the filenames are damaged because unzip uses improper conversion when any of its encoding assumptions are incorrect. For example, in the ru_RU.KOI8-R locale, conversion of filenames from CP866 to KOI8-R is required, but conversion from CP850 to ISO-8859-1 is done, which produces filenames consisting of undecipherable characters instead of words (the closest equivalent understandable example for English-only users is rot13). There are several ways around this limitation:

1) For unpacking ZIP archives with filenames containing non-ASCII characters, use WinZip while running the Wine Windows emulator.

2) After running unzip, fix the damage made to the filenames using the convmv tool. The following is an example for the ru_RU.KOI8-R locale:

Step 1. Undo the conversion done by unzip:

convmv -f iso-8859-1 -t cp850 -r --nosmart --notest \
   </path/to/unzipped/files>

Step 2. Do the correct conversion instead:

convmv -f cp866 -t koi8-r -r --nosmart --notest \
   </path/to/unzipped/files>

3) Apply this patch to unzip: https://bugzilla.altlinux.ru/attachment.cgi?id=532

The patch lets you specify the assumed filename encoding inside the ZIP archive with the -O charset_name option and the on-disk filename encoding with the -I charset_name option. By default, the on-disk filename encoding is the locale encoding, and the encoding inside the ZIP archive is guessed from a built-in table keyed on the locale encoding. For US English users, this still means that unzip converts from CP850 to ISO-8859-1 by default.

Caveat: this method works only with 8-bit locale encodings, not with UTF-8. Attempting to use a patched unzip in UTF-8 locales may result in a segmentation fault and is probably a security risk.
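
The two convmv steps from option 2 can be wrapped in a small shell function. This is only a sketch: fix_zip_names and the CONVMV override variable are hypothetical names introduced here, and the default encodings are simply the ones from the ru_RU.KOI8-R example. Because --notest makes convmv rename files for real, it is safer to try this on a copy of the extracted tree first.

```shell
#!/bin/sh
# Sketch only: fix_zip_names and CONVMV are hypothetical names, not part of
# unzip or convmv. Usage: fix_zip_names DIR [ARCHIVE_ENCODING] [LOCALE_ENCODING]
fix_zip_names() {
    dir=$1
    archive_enc=${2:-cp866}        # DOS codepage actually used inside the archive
    locale_enc=${3:-koi8-r}        # encoding wanted for filenames on disk
    convmv_cmd=${CONVMV:-convmv}   # override point, e.g. CONVMV=echo to print the calls

    if [ -z "$dir" ]; then
        echo "usage: fix_zip_names DIR [ARCHIVE_ENCODING] [LOCALE_ENCODING]" >&2
        return 2
    fi

    # Step 1: undo the CP850 -> ISO-8859-1 conversion that unzip applied.
    "$convmv_cmd" -f iso-8859-1 -t cp850 -r --nosmart --notest "$dir" || return 1

    # Step 2: convert from the real archive encoding to the locale encoding.
    "$convmv_cmd" -f "$archive_enc" -t "$locale_enc" -r --nosmart --notest "$dir"
}
```

Running CONVMV=echo fix_zip_names </path/to/unzipped/files> prints the two command lines without touching any file, which is a cheap way to check the encodings before renaming anything.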

Non-Multilib

Compile the package:

make -f unix/Makefile generic CC="gcc -DUSE_BZIP2 -lbz2"

Install the package:

make -f unix/Makefile prefix=/usr install

Multilib

32Bit

Compile the package:

make -f unix/Makefile generic CC="gcc ${BUILD32} -DUSE_BZIP2 -lbz2"

Install the package:

make -f unix/Makefile prefix=/usr install

N32

Compile the package:

make -f unix/Makefile generic CC="gcc ${BUILDN32} -DUSE_BZIP2 -lbz2"

Install the package:

make -f unix/Makefile prefix=/usr install

64Bit

Compile the package:

make -f unix/Makefile generic CC="gcc ${BUILD64} -DUSE_BZIP2 -lbz2"

Install the package:

make -f unix/Makefile prefix=/usr install

Contents

Installed Programs: funzip, unzip, unzipfsx, zipgrep, and zipinfo
Installed Libraries: libunzip.so
Installed Directories: None
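
After installation, the list of installed programs can be checked against the PATH. The following is a minimal sketch; check_installed is a hypothetical helper name, and it only tests that each program can be found, not that it works.

```shell
#!/bin/sh
# Sketch only: check_installed is a hypothetical helper.
# It prints the name of every program that is NOT found on the PATH.
check_installed() {
    for prog in "$@"; do
        command -v "$prog" >/dev/null 2>&1 || printf '%s\n' "$prog"
    done
}

# Programs listed under "Installed Programs"; no output means all were found.
check_installed funzip unzip unzipfsx zipgrep zipinfo
```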

Short Descriptions

funzip allows the output of unzip commands to be redirected.
unzip lists, tests or extracts files from a ZIP archive.
unzipfsx is a self-extracting stub that can be prepended to a ZIP archive. Files in this format allow the recipient to decompress the archive without installing UnZip.
zipgrep searches files in a ZIP archive for lines matching a pattern.
zipinfo produces technical information about the files in a ZIP archive, including file access permissions, encryption status, type of compression, etc.
libunzip.so contains the API functions required by the UnZip programs.