Let’s say you run a network with a large number of Mac OS X or iOS (or, more likely, both) devices. Software Update and the two App Stores (Mac App Store and iOS App Store) make keeping all those devices up-to-date a pretty straightforward process. They are a huge improvement compared with the rather old-fashioned practice of looking through applications, visiting the web site for each one and manually downloading updated versions. When updating two or more similar machines, of course, one only needed to download the updated version once, then copy it to each other machine. That was better, but across a lot of machines it was still a lot of work.
However, even though the App Store and Software Update Server in Mac OS X Server make things easier, there’s no simple way to download things once and distribute the downloaded files to multiple machines for items purchased on the App Store. When large updates come out (such as a new version of iOS), you’re essentially downloading huge amounts of data to each and every machine, and if machines are set to automatically download updates, you could even have a large number of them downloading simultaneously.
Of course you can run your own Software Update service in Mac OS X Server, but this requires that every client machine be configured to use the local server. This works well for machines under your control, but for all those people who bring in their own laptops this doesn’t help.
What’s worse is that there’s currently no way whatsoever to run a Software Update-like service for App Store purchases. Imagine you have a lab of dozens or hundreds of Macs with Final Cut X, or iPads (or iPhones, iPod Touches, or whatever comes out next) with iMovie. Any time there’s an update, you’re potentially downloading over a gigabyte per machine in the case of Final Cut X, or 70 megabytes or so in the case of iMovie. That can easily add up to a tremendous amount of traffic, along with the congestion, complaints and headaches which go with it.
What’s needed is an easy way to cache App Store downloads. While we’re at it, it would also be nice to transparently have machines use our own Software Update server. Let’s be even a little more ambitious and do this without needing Mac OS X Server. Aw, heck – let’s make it work on any reasonably Unix-like OS.
So how do we do this? The App Stores and Software Update services use HTTP for fetching files. So what we need to do is capture those HTTP requests and redirect them either to a local store of Software Update files or to locally cached App Store files.
Just as an aside, it’d be tremendously difficult to create a local store of App Store files if for no other reason than the fact that there are currently more than half a million applications. Add to this the rate at which updates become available and your machine would probably never be finished attempting to download all of the applications! Considering this, we’re looking at running Apache and squid on our Unix-like machine and doing a little redirection magic on whatever device does NAT or routes for us.
Note: There’s no reason that the same machine can’t do both NAT/routing and Apache/squid, although in most environments we are assuming that the machine would simply be a proxy for Mac or iOS-based devices. To make this example end-to-end though, we’ll run the router on the host.
Our example uses a Mac OS X (non-Server) machine running Leopard, which is doing NAT and running our Apache and squid software. We’re simply using the Internet Sharing service. The public network interface is en0 (which we don’t use anywhere), and the interface which will serve our iOS and Mac clients is en1 and has the address 10.0.2.1.
Everyone has their own favorite way of installing software on Unix-like OSes and a discussion about which is best and why would certainly be outside the scope of this article. In these examples we’re using NetBSD’s pkgsrc for no other reason than the fact that it will compile packages from source with a base directory which is easily configurable (feel free to use ports or some other automated tool according to what platform you are using). Get pkgsrc (usually via cvs); we’ll assume it’s put into /usr, which can be as simple as:
cd /usr ; cvs -d :pserver:anoncvs@anoncvs.netbsd.org:/cvsroot checkout -P pkgsrc
And then run /usr/pkgsrc/bootstrap/bootstrap like so:
cd /usr/pkgsrc/bootstrap/
./bootstrap --prefix /usr/local --pkgdbdir /usr/local/var/db/pkg --sysconfdir /usr/local/etc --varbase /usr/local/var --ignore-case-check
This puts all files into /usr/local including logs and configuration files, so keeping your system clean is simple and keeping track of the differences between built-in and pkgsrc software is easy. Next, install pkgsrc’s www/squid and www/apache22 (and net/wget if your Unix doesn’t already have it):
cd /usr/pkgsrc/www/squid
bmake update
cd /usr/pkgsrc/www/apache22
bmake update
cd /usr/pkgsrc/net/wget
bmake update
Note that pkgsrc builds with bmake, which is why we use it on systems like Mac OS X that ship GNU make as make; if your system’s make is already BSD make, just use make. Also note that /usr/local/sbin is not in Mac OS X’s path by default, so add /usr/local/sbin to /etc/paths if you’re going to use it.
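A quick way to see whether /usr/local/sbin is already on your path is sketched below. The function name path_contains is ours, not part of any tool, and the script only reports; it does not edit /etc/paths for you.

```sh
#!/bin/sh
# Succeed if directory $1 is one of the colon-separated components of $2.
# Wrapping both sides in colons avoids matching /usr/local/sbin2 and the like.
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

if path_contains /usr/local/sbin "$PATH"; then
    echo "/usr/local/sbin is already in PATH"
else
    echo "/usr/local/sbin is missing; add it to /etc/paths (Mac OS X) or your shell profile"
fi
```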
Now that the software is installed in consistent locations we can configure it. The squid.conf file only needs one line to be changed; everything else is added. Find the line which says:
http_port 3128
And change it to:
http_port 3128 intercept
Then add the following lines:
maximum_object_size_in_memory 4096 KB
cache_replacement_policy heap LFUDA
maximum_object_size 2097152 KB
cache_dir ufs /usr/local/var/squid/cache 16384 16 256
refresh_pattern -i \.ipa$ 360 90% 10080 override-expire ignore-no-cache ignore-no-store ignore-private ignore-reload ignore-must-revalidate
refresh_pattern -i \.pkg$ 360 90% 10080 override-expire ignore-no-cache ignore-no-store ignore-private ignore-reload ignore-must-revalidate
acl no_cache_local dstdomain 10.0.2.1
cache deny no_cache_local
redirect_program /usr/local/bin/rewrite.pl
These settings are chosen to cache large files up to 2 gigabytes in size in a 16 gigabyte cache on disk and to ignore cache directives with regard to .pkg and .ipa files. Adjust to your own liking. Of course, replace 10.0.2.1 with the private IP of your machine. The cache deny with that address makes sure that redirected Software Update files are not cached in squid, where they would just take up room which is better used for App Store files.
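If you do adjust the numbers, it helps to keep the units straight: maximum_object_size is given in KB here, the first number after the cache_dir path is in megabytes, and the refresh_pattern minimum and maximum ages are in minutes. The following sketch just does the arithmetic for the values used above (10080 is the maximum age on the .pkg line); the variable names are ours.

```sh
#!/bin/sh
# Values copied from the squid.conf lines above.
max_object_kb=2097152   # maximum_object_size, in KB
cache_dir_mb=16384      # first number after the cache_dir path, in MB
min_age_min=360         # refresh_pattern minimum age, in minutes
max_age_min=10080       # refresh_pattern maximum age, in minutes

max_object_gb=$((max_object_kb / 1024 / 1024))
cache_dir_gb=$((cache_dir_mb / 1024))

echo "largest cacheable object: ${max_object_gb} GB"
echo "on-disk cache size: ${cache_dir_gb} GB"
# How many maximum-size objects could sit in the cache at once.
echo "maximum-size objects that fit: $((cache_dir_gb / max_object_gb))"
echo "minimum age: $((min_age_min / 60)) hours"
echo "maximum age: $((max_age_min / 60 / 24)) days"
```

With the values above this reports a 2 GB object limit, a 16 GB cache, room for 8 maximum-size objects, a 6 hour minimum age and a 7 day maximum age.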
The URL rewriting script (create /usr/local/bin/rewrite.pl and make it executable with chmod +x) just changes Apple Software Update URLs to point to our server:
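#!/usr/bin/env perl
# squid URL rewriter: reads one request per line on stdin and prints it back,
# with Apple Software Update hosts pointed at our local Apache server.
use strict;
use warnings;
$| = 1; # unbuffered output, so squid gets each answer immediately
while (<>) {
s@http://swscan\.apple\.com@http://10.0.2.1/swscan.apple.com@;
s@http://swcdn\.apple\.com@http://10.0.2.1/swcdn.apple.com@;
s@http://swquery\.apple\.com@http://10.0.2.1/swquery.apple.com@;
print;
}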
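To see what squid gets back from the rewriter without touching a live proxy, the same three substitutions can be imitated with sed. This is only an illustration of the rewriting, not a replacement for rewrite.pl; the function name rewrite_line, the client address and the request paths are made-up sample values. Each input line is in the form squid hands to a rewriter, with the URL as the first field.

```sh
#!/bin/sh
# Illustration only: a sed stand-in for the three substitutions in rewrite.pl.
# 10.0.2.1 is the example private IP used throughout this article.
rewrite_line() {
    sed -e 's@http://swscan\.apple\.com@http://10.0.2.1/swscan.apple.com@' \
        -e 's@http://swcdn\.apple\.com@http://10.0.2.1/swcdn.apple.com@' \
        -e 's@http://swquery\.apple\.com@http://10.0.2.1/swquery.apple.com@'
}

# A Software Update request is pointed at the local server...
printf '%s\n' 'http://swscan.apple.com/content/catalogs/index.sucatalog 10.0.2.50/- - GET' | rewrite_line
# ...while any other request passes through unchanged.
printf '%s\n' 'http://www.example.com/index.html 10.0.2.50/- - GET' | rewrite_line
```

The first line comes back starting with http://10.0.2.1/swscan.apple.com/ and the second is printed exactly as it went in, which is why App Store traffic is left for squid to cache as usual.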
Next we configure Apache. The location you choose for the Software Update files can be anywhere (in our example, they’re on a FireWire attached drive mounted at /Volumes/sw_updates/) which needs to be allowed in the Apache configuration.
Add to /usr/local/etc/httpd/httpd.conf:
<Directory "/Volumes/sw_updates/">
Options Indexes FollowSymLinks
AllowOverride None
Order allow,deny
Allow from all
</Directory>
<VirtualHost *:80>
ServerAdmin hostmaster@318.com
DocumentRoot "/Volumes/sw_updates"
ErrorLog "/usr/local/var/log/httpd/swupdate_error_log"
CustomLog "/usr/local/var/log/httpd/swupdate_access_log" common
</VirtualHost>
The log lines are purely optional. If you don’t add them, logs will still be written at /usr/local/var/log/httpd/access_log and error_log.
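It may help to see how the pieces line up: the rewriter turns an Apple URL into a URL on our server whose path begins with the original host name, and wget --mirror (used in the mirror script later in this article) stores files under a directory named after the host. The sketch below maps an original URL to the file Apache would end up serving, assuming the DocumentRoot above. The function name url_to_path and the sample file name are hypothetical.

```sh
#!/bin/sh
# Map an original Apple URL to the file Apache would serve for it, assuming
# the DocumentRoot from the VirtualHost above and the host/path directory
# layout that wget --mirror produces.
docroot=/Volumes/sw_updates

url_to_path() {
    # Strip the scheme; what remains (host/path) is the path under docroot.
    printf '%s/%s\n' "$docroot" "${1#http://}"
}

# The file name below is a hypothetical example, not a real update.
url_to_path 'http://swcdn.apple.com/content/downloads/example/Example.pkg'
```

If a client reports a missing update, checking whether the corresponding file exists under the DocumentRoot is a reasonable first step.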
Next, we configure ipfw (in the case of Mac OS X or FreeBSD) to redirect all port 80 traffic transparently to our squid instance. If you’re using a different device for NAT/routing or different firewalling software such as ipfilter, see the examples listed below.
ipfw add 333 fwd 10.0.2.1,3128 tcp from any to any 80 recv en1
Note that on Snow Leopard and Lion you’ll need to make this change, too:
sysctl -w net.inet.ip.scopedroute=0
With ipfilter (common on NetBSD and Solaris, for instance), the equivalent of the ipfw rule above would look like this:
rdr en1 0.0.0.0/0 port 80 -> 10.0.2.1 port 3128 tcp
Again, the local private IP is 10.0.2.1 and the local private interface is en1; substitute your IP and interface.
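On a Linux router using iptables, a REDIRECT rule in the nat table is the usual way to get the same effect. We have not tested this against the setup in this article, so treat it as a sketch: the script below only prints the rule so you can review it before running it as root. The function name print_redirect_rule and the interface name eth1 are assumptions; 3128 matches the http_port line in squid.conf.

```sh
#!/bin/sh
# Print (rather than run) an iptables rule that redirects port 80 traffic
# arriving on the private interface to squid's intercept port on this host.
print_redirect_rule() {
    iface=$1        # private interface facing the clients
    squid_port=$2   # port squid listens on in intercept mode
    echo "iptables -t nat -A PREROUTING -i $iface -p tcp --dport 80 -j REDIRECT --to-ports $squid_port"
}

# eth1 is an assumed interface name; substitute your own.
print_redirect_rule eth1 3128
```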
Finally, we need to mirror all Apple Software Updates. A simple shell script can do this. Save this file somewhere (named mirror_swupdate.sh, for instance) and run it from cron now and then, perhaps once a night:
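#!/bin/sh
# Mirror Apple's Software Update catalogs and every file they reference.
location=$1 # This is the root of our Software Update tree
if [ -z "$location" ]; then
echo "usage: $0 /path/to/sw_updates" >&2
exit 1
fi
mkdir -p "$location"
cd "$location" || exit 1
for index in index-leopard-snowleopard.merged-1.sucatalog index-leopard.merged-1.sucatalog index-lion-snowleopard-leopard.merged-1.sucatalog
do
wget --mirror "http://swscan.apple.com/content/catalogs/others/$index"
# Each catalog is a plist; pull out every URL found between > and <
for swfile in `grep "http://" "swscan.apple.com/content/catalogs/others/$index" | awk -F">" '{ print $2 }' | awk -F"<" '{ print $1 }'`
do
echo "$swfile"
wget --mirror "$swfile"
done
done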
Invoke this with the top of the tree of your Software Update files as you’ve used in the Apache config, like so:
./mirror_swupdate.sh /Volumes/sw_updates
Expect the first run to take a long time, because you’ll be downloading around 60 gigabytes of updates. On later runs, though, files won’t be downloaded again unless they change (which they won’t; new updates will show up as new files).
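The heart of the mirror script is the grep and awk pipeline that pulls URLs out of a catalog. To see what it does without downloading anything, you can run the same pipeline over a small sample. The catalog fragment below is made up for illustration (real catalogs are much larger plists), and the function name extract_urls is ours.

```sh
#!/bin/sh
# Pull every http:// URL out of a Software Update catalog on stdin, using the
# same grep/awk pipeline as the mirror script: keep lines containing a URL,
# take the text after the first ">", then drop everything from the next "<".
extract_urls() {
    grep "http://" | awk -F">" '{ print $2 }' | awk -F"<" '{ print $1 }'
}

# A made-up catalog fragment, not real catalog contents.
cat <<'EOF' | extract_urls
<dict>
    <key>URL</key>
    <string>http://swcdn.apple.com/content/downloads/example/Example.pkg</string>
    <key>Size</key>
    <integer>12345</integer>
</dict>
EOF
```

Only the line holding a URL survives, printed with its surrounding tags removed. Note that the pipeline relies on each URL sitting alone inside a single element on its own line, which is how the sample is laid out.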
Start squid and Apache, then tail your Apache log and run Software Update to test:
/usr/local/share/examples/rc.d/apache start
/usr/local/share/examples/rc.d/squid start
tail -f /usr/local/var/log/httpd/swupdate_access_log
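The Apache log only covers Software Update. To check that App Store downloads are being served from the cache, look at squid's own access log, where the fourth field of each line in the native format is a result code such as TCP_MISS/200 or TCP_HIT/200. The sketch below counts hits and misses; the function name count_results and the sample log lines are made up, and where access.log lives depends on how your squid was built, so feed it your real log in place of the sample.

```sh
#!/bin/sh
# Count cache hits and misses in squid's native access.log format, read from
# stdin. The result code is the part of the fourth field before the slash.
count_results() {
    awk '{ split($4, r, "/"); if (r[1] ~ /HIT/) hit++; else miss++ }
         END { printf "hits=%d misses=%d\n", hit, miss }'
}

# Made-up sample: one client fetches a file, then two more get it from cache.
cat <<'EOF' | count_results
1325376000.123   5120 10.0.2.50 TCP_MISS/200 73400320 GET http://example.invalid/app.ipa - DIRECT/192.0.2.10 application/octet-stream
1325376100.456     40 10.0.2.51 TCP_HIT/200 73400320 GET http://example.invalid/app.ipa - NONE/- application/octet-stream
1325376200.789     38 10.0.2.52 TCP_HIT/200 73400320 GET http://example.invalid/app.ipa - NONE/- application/octet-stream
EOF
```

For the sample this prints hits=2 misses=1. On a real network you would hope to see the hit count climb as more devices download the same update.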
At this point, you can redirect your software updates to the host. Updates for both the Mac App Store and iOS are also now cached. In the next article we’ll look at using some squid extensions to enable you to block applications from the App Stores or block updates in the event that an update is problematic.