[clue] Unzipping zip archives with duplicates?

Sean LeBlanc seanleblanc at comcast.net
Wed Aug 29 18:14:31 MDT 2012


On 8/29/12 8:22 AM, David L. Anselmi wrote:
> Sean LeBlanc wrote:
>> In case anyone was interested, I used Ruby to do this.
> Thanks, I'm interested.
>
>> Ultimately, it will most likely be launched by Ant - modern versions now
>> support a "script" task that utilizes Apache's BSF which implements
>> JSR-223 - which means that with the JRuby jar, I can run Ruby within Ant
>> without shelling out. This is so I don't have to write all the crufty
>> Ant stuff to detect OS version, and then try to either guess where the
>> ruby binary might be installed or what shell to launch....bash or
>> cmd...which is a long way of saying "this is why there isn't a shebang
>> at the top of the script".
> I don't know much about that, but doesn't Ruby just do the right thing?  I'd guess that if you use
> back ticks in perl it would get the right interpreter on both Linux and Windows.  Although you don't
> seem to use any of that in your script.

I think Ruby does the right thing - whether Ant does the right thing is 
debatable. :)

I'm talking about shelling out from Ant to invoke a shell or Ruby 
directly. There may be a better way for this, but going from what I've 
done in the past, it's pretty crufty. You can check environment 
variables, and assume if you are running on non-Windows that you invoke 
bash so that it can use environment to find ruby, OR you can hard-code 
the path to ruby and exec it that way.

> Do the files in the zip have paths?  If not I'm not sure why you have 2 loops to figure out the
> version number.
>
> In perl I'd have made each path a hash key and incremented the value each time.  Then you shouldn't
> need to check whether something exists--if it's in the hash you use the value as the version number
> and if it isn't you use it as is.

Yes, the files in the zip have paths. So, the problem I was running into 
if I *only* considered uniqueness by the entire path was this:

/foo <- Actually a FILE
/foo/ <- Actually a DIR
/foo/file1
/foo/file2... etc.

If I only checked the full name for uniqueness, I'd be shafted when I 
went to write out file1 and file2: the file /foo came first in the list 
from the zip, so that name is already taken, and I need to write out 
/foo.1/ and put file1 and file2 in there, if that makes sense.
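
To make that concrete, here is a minimal sketch of the bookkeeping, 
not the actual script. The method name resolve_paths, the ".N" suffix 
for every kind of collision, and the use of relative entry names (no 
leading slash) are all assumptions for illustration. It only maps names 
to names; it does not touch the zip or the disk.

```ruby
# Hypothetical helper: map zip entry names to collision-free output paths.
# Assumes relative entry names, with directories marked by a trailing "/".
def resolve_paths(entry_names)
  used = {}  # every output path handed out so far (files and directories)
  dirs = {}  # original directory path => output directory path

  # Return the first of "path", "path.1", "path.2", ... that is still free.
  fresh = lambda do |path|
    candidate = path
    n = 0
    candidate = "#{path}.#{n += 1}" while used.key?(candidate)
    used[candidate] = true
    candidate
  end

  join = ->(parent, base) { parent.empty? ? base : "#{parent}/#{base}" }

  # Resolve a directory once and remember it, so that everything below
  # "foo/" follows it to "foo.1/" if a file named "foo" got there first.
  resolve_dir = lambda do |parts|
    return "" if parts.empty?
    parent = resolve_dir.call(parts[0...-1])
    dirs[parts.join("/")] ||= fresh.call(join.call(parent, parts.last))
  end

  entry_names.map do |name|
    parts = name.split("/").reject(&:empty?)
    if name.end_with?("/")
      resolve_dir.call(parts) + "/"
    else
      fresh.call(join.call(resolve_dir.call(parts[0...-1]), parts.last))
    end
  end
end

p resolve_paths(["foo", "foo/", "foo/file1", "foo/file2"])
# => ["foo", "foo.1/", "foo.1/file1", "foo.1/file2"]
```

The dirs hash is what a plain "count each full path" approach lacks: it 
remembers that the directory was renamed, so the children land in the 
renamed directory instead of colliding with the file.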


> I didn't look at perl's zip library so I'd have used unzip to get the list of files and do the
> unzipping, so you're more elegant there.
>
> And looking a little closer, would unzip -B have done what you need?
>

I'm not sure if it works or not - I see the option in the man page, even 
on the Mac, but I cannot get unzip to recognize it. :)
The man page has this, too, which doesn't give me a warm and fuzzy 
feeling about the Mac version....or the Cygwin/Windows one...

        -B     [Unix only, and only if compiled with UNIXBACKUP defined]
               save a backup copy of each overwritten file with a tilde
               appended (e.g., the old copy of ``foo'' is renamed to
               ``foo~''). This is similar to the default behavior of
               emacs(1) in many locations.



Turns out I'm now trying to figure out how to let my Ruby script, 
embedded in Ant, find the gems it needs....I might end up rewriting 
the Ruby script in Groovy instead. :)

