The mirroring procedure
Having finished with the requirements, we now move on to actually setting up a Fedora mirror. Before you get your hands dirty, spend some time studying the directory structure of the Fedora repository. You can find it here.
Synchronising content
Synchronising content means, put simply, copying the content of a Fedora mirror onto your server in such a way that all the properties of the files and directories being transferred remain unchanged. As this is the most time-consuming step, involving a large number of file downloads, it is best to start it first and carry out the other necessary configuration while it pulls content from the server. The only reliable way to mirror is to use rsync, a utility for incremental file transfer. Like FTP, rsync transfers files between a server and a client, but if the transfer breaks midway because of a network or power outage, rerunning the same command lets it carry on from where it left off instead of starting over. From now on, we shall use the terms synchronise or pull instead of ‘file transfer’.
It is best to set up a new user account on your system, which will perform the synchronisation.
# useradd -r -m mirror
The directory structure you mirror should match that of Fedora’s master mirrors. To achieve this, create the directories and give your mirror user write permission:
# mkdir -p /var/www/html/pub/fedora/linux/releases
# chown -R mirror:mirror /var/www/html/pub
# find /var/www/html/pub -type d -exec chmod 0755 {} \;
If you wish to exclude some content from synchronisation, create an exclude.txt file. Put one pattern per line into that file and, when rsync is pointed at it, it won’t pull the matching content. You can do this as your new mirror user:
# su - mirror
$ touch exclude.txt
An exclude.txt file typically looks like what follows:
#don’t sync any ppc content
ppc*
#don’t sync debug directories
debug*
#don’t sync source directories
source*
As you can see, the exclude file accepts wildcard patterns, so you need not list the name of every directory that you want to exclude. Note that these are shell-style wildcards, not regular expressions. When you put ppc* in the exclude.txt file, any file or directory whose name starts with ppc will not be pulled.
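To get a feel for how such patterns behave before handing the file to rsync, the matching can be approximated in plain shell. The sketch below is only an approximation of what rsync does: it assumes simple patterns without slashes, like the ones in the exclude.txt above, and is_excluded is a hypothetical helper name introduced here, not something rsync provides.

```shell
#!/bin/sh
# is_excluded NAME FILE
# Hypothetical helper: succeeds if NAME matches any pattern listed in FILE.
# Only approximates rsync for simple, slash-free patterns such as ppc*.
is_excluded() {
    name=$1      # final path component to test, e.g. "ppc64"
    patterns=$2  # path to the exclude file
    while IFS= read -r pattern || [ -n "$pattern" ]; do
        # Skip blank lines and comment lines.
        case $pattern in
            ''|'#'*|';'*) continue ;;
        esac
        # An unquoted $pattern is treated as a wildcard by case.
        case $name in
            $pattern) return 0 ;;
        esac
    done < "$patterns"
    return 1
}

# Demo with a throwaway copy of the patterns shown above.
demo=$(mktemp)
printf '%s\n' '#dont sync any ppc content' 'ppc*' 'debug*' 'source*' > "$demo"
for candidate in ppc64 i386 debug x86_64 source; do
    if is_excluded "$candidate" "$demo"; then
        echo "$candidate: skipped"
    else
        echo "$candidate: pulled"
    fi
done
rm -f "$demo"
```

rsync itself remains the authority on what gets excluded; this is merely a quick way to sanity-check a pattern list.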
Now that we are finished with the exclude part, we are ready to pull in the actual content. The rsync command may look like what’s given below:
$ rsync -vaH --exclude-from=/home/mirror/exclude.txt --numeric-ids --delete --delete-after --delay-updates rsync://mirror.anl.gov/fedora/linux/releases/11 /var/www/html/pub/fedora/linux/releases/
This command will start pulling the Fedora 11 repository and put it into /var/www/html/pub/fedora/linux/releases/11.
Now, let’s see what this means. rsync, as stated earlier, is an incremental file transfer utility. -v stands for verbose mode, -a means archive mode (recurse into directories and preserve permissions, timestamps, ownership and symbolic links), and -H tells rsync to preserve hard links between the files (which saves considerable amounts of disk space and reduces file transfers).
We then name the file that lists what not to synchronise using --exclude-from, while --numeric-ids keeps user and group IDs as numbers instead of mapping them by name. --delete removes local files that no longer exist on the server, --delete-after postpones those deletions until the transfer has finished, and --delay-updates holds each updated file in a temporary directory and moves them all into place at the end of the transfer. Together, they tell rsync to keep the old files and directories until the synchronisation is complete. Finally, we give the remote rsync server and the destination directory.
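Since the same command will be run again and again, it may be convenient to keep it in a small script. The following is a minimal sketch that assumes the server, destination and exclude file used in the command above; sync_release and the RSYNC override variable are hypothetical names introduced here. Passing --dry-run, a standard rsync option, shows what would be transferred or deleted without changing anything.

```shell
#!/bin/sh
# Values copied from the command in the article; adjust to your own mirror.
REMOTE="rsync://mirror.anl.gov/fedora/linux/releases/11"
DEST="/var/www/html/pub/fedora/linux/releases/"
EXCLUDES="/home/mirror/exclude.txt"

# sync_release is a hypothetical wrapper name. Extra arguments
# (for example --dry-run) are passed straight through to rsync.
# Setting RSYNC=echo prints the command instead of running it.
sync_release() {
    ${RSYNC:-rsync} -vaH --exclude-from="$EXCLUDES" --numeric-ids \
        --delete --delete-after --delay-updates "$@" "$REMOTE" "$DEST"
}

# Preview first, then run the real transfer:
#   sync_release --dry-run
#   sync_release
```

Keeping the paths in variables at the top means only one place has to change if you later switch to another upstream server.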
If you are unsure which server to pull the repositories from, you can get a list of servers that provide the rsync service from the Fedora mirrorlist. Choose a reliable server near you. Also, don’t forget to drop a mail to the admin of the server, both as a matter of courtesy and to make sure that no outage is planned at their end in the next couple of days.
Saving some bandwidth
A little trick can save you a few gigabytes of download. If you are not sure about the directory structure Fedora repositories have, be a bit careful about this.
The ISO of the Fedora DVD resides in the Fedora/$architecture/iso/ directory. The same content is also available under Fedora/$architecture/os/, but as extracted files and directories. For example, http://118.102.181.66/releases/11/Fedora/i386/os/ contains the files of http://118.102.181.66/releases/11/Fedora/i386/iso/Fedora-11-i386-DVD.iso. So if you download the ISO image first and then copy its content over to the os/ directory, you need not download the same content twice. Let’s see how we do it.
Once the download of the DVD ISO file is completed, mount it somewhere, copy its content into the os/ directory and unmount it:
# mount -o loop /var/www/html/pub/fedora/linux/releases/11/Fedora/i386/iso/Fedora-11-i386-DVD.iso /mnt
# cp -prv /mnt/* /var/www/html/pub/fedora/linux/releases/11/Fedora/i386/os/
# umount /mnt
Similarly, you can repeat this for the x86_64 DVD ISO, if you are mirroring that architecture too.
Note: Be sure you use the -p option with cp. If you don’t, the copied files will be stamped with the current time instead of their original timestamps, and rsync, which by default decides what to transfer by comparing file size and modification time, will treat them as out of date. rsync will then pull all the content again, overwriting the copied files, and in the process thwart all your efforts to save bandwidth.
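The effect of -p is easy to see on a couple of throwaway files. The sketch below is only an illustration: same_mtime is a hypothetical helper name, the files live in a temporary directory rather than on a mounted ISO, and it relies on the -nt test, which common shells such as bash and dash support.

```shell
#!/bin/sh
# same_mtime A B
# Hypothetical helper: succeeds when neither file is newer than the
# other, i.e. their modification times match.
same_mtime() {
    [ ! "$1" -nt "$2" ] && [ ! "$2" -nt "$1" ]
}

workdir=$(mktemp -d)
# Stand-in for a file on the mounted ISO, given an old timestamp.
touch -t 200906090000 "$workdir/original"

cp -p "$workdir/original" "$workdir/with_p"     # timestamp preserved
cp    "$workdir/original" "$workdir/without_p"  # timestamp reset to now

same_mtime "$workdir/original" "$workdir/with_p" \
    && echo "cp -p: timestamps match"
same_mtime "$workdir/original" "$workdir/without_p" \
    || echo "plain cp: timestamps differ"

rm -rf "$workdir"
```

A file whose timestamp differs from the server’s copy is exactly what rsync would fetch again.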
If the download stops
In the course of synchronising, it is quite possible that you will receive a few messages like this: “Suddenly the Dungeon collapses!! – You die…” and the download will stop. Don’t panic. It only means that rsync has stopped for some reason. Just press the up arrow key and then Enter to run the same command again. rsync will pick up from where it left off. Also, you won’t be able to see any files in a directory until all the content of that directory has been pulled. You can reassure yourself that the download is indeed progressing by running this command periodically:
# du -m /var/www/html/ | tail -n 1
Let rsync run its own course. You have nothing to do other than periodically check if it has stopped. In the meantime, let’s do the other necessary configurations.
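If you would rather not babysit the transfer, rerunning the command by hand could be replaced with a small retry loop. This is a minimal sketch, not part of rsync: retry is a hypothetical helper name, and the attempt count and delay in the usage comment are arbitrary example values.

```shell
#!/bin/sh
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: rerun COMMAND until it exits successfully,
# giving up after MAX_ATTEMPTS tries.
retry() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        echo "attempt $attempt failed, retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}

# Example: up to 10 attempts, 60 seconds apart, with the command used earlier:
#   retry 10 60 rsync -vaH --exclude-from=/home/mirror/exclude.txt \
#       --numeric-ids --delete --delete-after --delay-updates \
#       rsync://mirror.anl.gov/fedora/linux/releases/11 \
#       /var/www/html/pub/fedora/linux/releases/
```

Capping the number of attempts keeps the loop from hammering the upstream server indefinitely if something is wrong at their end.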




