[clue] Unzipping zip archives with duplicates?
Sean LeBlanc
seanleblanc at comcast.net
Wed Aug 29 18:14:31 MDT 2012
On 8/29/12 8:22 AM, David L. Anselmi wrote:
> Sean LeBlanc wrote:
>> In case anyone was interested, I used Ruby to do this.
> Thanks, I'm interested.
>
>> Ultimately, it will most likely be launched by Ant - modern versions now
>> support a "script" task that utilizes Apache's BSF which implements
>> JSR-223 - which means that with the JRuby jar, I can run Ruby within Ant
>> without shelling out. This is so I don't have to write all the crufty
>> Ant stuff to detect OS version, and then try to either guess where the
>> ruby binary might be installed or what shell to launch....bash or
>> cmd...which is a long way of saying "this is why there isn't a shebang
>> at the top of the script".
> I don't know much about that, but doesn't Ruby just do the right thing? I'd guess that if you use
> back ticks in perl it would get the right interpreter on both Linux and Windows. Although you don't
> seem to use any of that in your script.
I think Ruby does the right thing - whether Ant does the right thing is
debatable. :)
I'm talking about shelling out from Ant to invoke a shell or Ruby
directly. There may be a better way for this, but going from what I've
done in the past, it's pretty crufty. You can check environment
variables and, if you are not running on Windows, assume that you should
invoke bash so that it can use the environment (PATH) to find ruby, OR
you can hard-code the path to ruby and exec it that way.
> Do the files in the zip have paths? If not I'm not sure why you have 2 loops to figure out the
> version number.
>
> In perl I'd have made each path a hash key and incremented the value each time. Then you shouldn't
> need to check whether something exists--if it's in the hash you use the value as the version number
> and if it isn't you use it as is.
Yes, the files in the zip have paths. So, the problem I was running into
if I *only* considered uniqueness by the entire path was this:
/foo <- Actually a FILE
/foo/ <- Actually a DIR
/foo/file1
/foo/file2... etc.
If I only checked the full name for uniqueness, I'd be shafted when I
went to write out file1 and file2: the file /foo came first in the list
from the zip, so the directory can't also be written as /foo. Instead I
need to write out /foo.1/ and put file1 and file2 in there, if that
makes sense.
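Here is a rough sketch of that naming logic, not the actual script. It
works on a plain list of entry names, so no zip gem is needed, and it
assumes names ending in "/" are directories. The method name
resolve_names is made up for illustration, and appending ".1", ".2" to
duplicate files as well as directories is my assumption about the
scheme. Leading slashes are dropped, so the results are relative paths.

```ruby
# Assign a collision-free output path to every zip entry name.
# Hypothetical helper: names ending in "/" are treated as directories.
def resolve_names(entries)
  taken   = {}  # output path already used => :file or :dir
  dir_map = {}  # original directory path  => output directory path

  # Return base if it is free, otherwise base.1, base.2, ...
  unique = lambda do |base|
    candidate = base
    n = 0
    candidate = "#{base}.#{n += 1}" while taken.key?(candidate)
    candidate
  end

  entries.map do |name|
    is_dir = name.end_with?("/")
    parts  = name.split("/").reject(&:empty?)
    leaf   = is_dir ? nil : parts.pop

    original = ""
    resolved = ""
    # Walk the directory components, renaming any directory whose
    # name is already taken (for example by a file called "foo").
    parts.each do |part|
      original = original.empty? ? part : "#{original}/#{part}"
      dir_map[original] ||= begin
        base = resolved.empty? ? part : "#{resolved}/#{part}"
        out  = unique.call(base)
        taken[out] = :dir
        out
      end
      resolved = dir_map[original]
    end

    if leaf
      base = resolved.empty? ? leaf : "#{resolved}/#{leaf}"
      out  = unique.call(base)
      taken[out] = :file
      out
    else
      "#{resolved}/"
    end
  end
end

names = ["/foo", "/foo/", "/foo/file1", "/foo/file2", "/foo/file1"]
names.zip(resolve_names(names)).each { |from, to| puts "#{from} -> #{to}" }
```

With the list above, the file keeps the name foo, the directory becomes
foo.1/, file1 and file2 land inside foo.1/, and the second copy of
file1 becomes foo.1/file1.1.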
> I didn't look at perl's zip library so I'd have used unzip to get the list of files and do the
> unzipping, so you're more elegant there.
>
> And looking a little closer, would unzip -B have done what you need?
>
I'm not sure if it works or not - I see the option in the man page, even
on the Mac, but I cannot get unzip to recognize it. :)
The man page has this, too, which doesn't give me a warm and fuzzy for
the Mac version....or the Cygwin/Windows one...
    -B     [Unix only, and only if compiled with UNIXBACKUP defined]
           save a backup copy of each overwritten file with a tilde
           appended (e.g., the old copy of ``foo'' is renamed to
           ``foo~''). This is similar to the default behavior of
           emacs(1) in many locations.
Turns out now I'm trying to figure out how to have my Ruby script
embedded in Ant be able to find the gems it needs....I might end up
re-writing the Ruby script in Groovy instead. :)