Save Bandwidth by Setting Up a Fedora Mirror

Configuring the Apache server

Enable KeepAlive

Enabling KeepAlive in httpd allows persistent connections. These long-lived HTTP sessions let multiple requests travel over the same TCP connection, so a separate connection does not have to be set up for each file. This reduces overhead and noticeably lowers latency. By default, Fedora’s Apache httpd package has KeepAlive disabled. It should be enabled, with a timeout of two seconds. Don’t set the timeout much higher, since idle connections tie up server resources and may overload your server. Take a look at Figure 1 to see the changes required in the Apache configuration file.

Figure 1: Enabling KeepAlive in Apache
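
As a minimal sketch of the change described above, assuming the stock KeepAlive and KeepAliveTimeout directives in /etc/httpd/conf/httpd.conf are edited in place, the two settings would look roughly like this:

```apache
# Allow several requests to share one TCP connection.
KeepAlive On
# Close an idle persistent connection after two seconds.
KeepAliveTimeout 2
```

Both directives normally already exist in the default file, so it should be enough to change their values rather than add new lines.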

Handling of metadata

Metadata is typically defined as ‘data about data’. When you try to install a package or update a system, the first things that get downloaded are the package metadata. These are files with information about the packages, their age and other details. If, for example, a computer has old metadata cached, according to which all the packages are up-to-date, no new updates will be installed on the system. To work around this, we explicitly add the Cache-Control: must-revalidate header, which insists that Yum or any other client revalidate the metadata against the server before using a cached copy. To do this, add the following section to your /etc/httpd/conf/httpd.conf near the <Location> directives (around line 900; take a look at Figure 2 for the exact location):

<LocationMatch "\.(xml|xml\.gz|xml\.asc|sqlite)">
    Header set Cache-Control "must-revalidate"
    ExpiresActive On
    ExpiresDefault "now"
</LocationMatch>
Figure 2: Configuring metadata handling in Apache
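
The section above relies on two modules: Header is provided by mod_headers, while ExpiresActive and ExpiresDefault come from mod_expires. If Apache refuses to start with a message such as "Invalid command 'Header'", the corresponding module is most likely not loaded. A minimal sketch of the lines that load them, assuming Fedora's default module directory and file names:

```apache
# Provides the Header directive.
LoadModule headers_module modules/mod_headers.so
# Provides ExpiresActive and ExpiresDefault.
LoadModule expires_module modules/mod_expires.so
```

Check first whether these lines are already present (possibly commented out) in httpd.conf; loading a module twice only produces a warning, but it is cleaner to enable the existing line.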

Content types

ISO and RPM files should be served using MIME Content-Type: application/octet-stream. In Apache, this can be done inside a VirtualHost or similar section:

<VirtualHost *:80>
    AddType application/octet-stream .iso
    AddType application/octet-stream .rpm
</VirtualHost>

Limiting download accelerators

Download accelerators try to open the same file many times and request it in chunks, hoping to download the chunks in parallel. This can overload already heavily loaded mirror servers and cause a denial of service. To limit the number of simultaneous connections each IP address may open (three in this example), add this to your Apache configuration file. It only takes effect if the mod_limitipconn module is installed and loaded:

<IfModule mod_limitipconn.c>
    MaxConnPerIP 3
</IfModule>
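
To apply the limit only to the ISO directories rather than to the whole server, the directive can be placed inside a container. The following is a sketch under stated assumptions: /pub/fedora/ is a hypothetical URL path standing in for wherever your ISO images are served from.

```apache
<IfModule mod_limitipconn.c>
    # Hypothetical path; replace it with the URL path of your ISO directories.
    <Location /pub/fedora/>
        # At most three simultaneous connections per client IP under this path.
        MaxConnPerIP 3
    </Location>
</IfModule>
```

Older releases of mod_limitipconn also need mod_status loaded with ExtendedStatus On in order to count connections, so check the documentation of the version you install.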

Download accelerators work by sending ranged requests. To block such requests for ISO files, add this section to your Apache configuration file:

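RewriteEngine on
# Match Range headers that end in a digit (closed ranges such as bytes=0-499);
# open-ended resume requests such as bytes=500- do not match.
RewriteCond %{HTTP:Range} [0-9]$
# Answer such requests for .iso files with 403 Forbidden.
RewriteRule \.iso$ / [F,L]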

Restart Apache

Now restart Apache, for example with service httpd restart. If the configuration is valid, the server starts without reporting an error, and most of the web server setup for the mirror is done.

About Susmit Shannigrahi

Susmit is a long-time contributor to the Fedora Project. He is a member of the Fedora Ambassadors Steering Committee (FAmSCo), which oversees the worldwide activity of Fedora Ambassadors. He also leads the Fedora Freemedia program. His main area of interest, however, is designing and implementing IT infrastructure using only free software for enhanced efficiency and productivity. He also works with Fedora Infrastructure and several other Fedora groups.

5 Comments

  • ashishkumar2703
    October 15, 2009 | Permalink |

    Just commenting to get a wave invite. ;)

  • tinhed
    January 12, 2010 | Permalink |

    Thanks for this very useful article.

  • March 17, 2010 | Permalink |

    Re the mount ISO “use cp -p” or DIE … errr, not quite. ;-)

    It’s a good idea to get used to doing the right thing, but if you don’t, then rsync -a (which implies -t) will save you in this case.

    Missing files get copied completely. Identical date/time/size/name files get skipped. But filenames with different date stamps get checksummed on both sides — and only the differences are sent. In this case there *are* none, so you spend a bit of time (but not much bandwidth) figuring that out — but the entire file doesn’t come down.

    (If a file _really has_ changed, then just the checksum and changes are transmitted, not the entire file.)

    VERY good article besides that single nit.

  • susmit
    March 26, 2010 | Permalink |

    @DamnitDog, you are right.

    If the file is already present, rsync will only change the timestamp; it won’t pull the entire content.

    Sorry for the mistake.

  • June 23, 2010 | Permalink |

    Thanks for the useful post.

    I tried the same, but I get an error when starting the Apache service after adding the following lines to httpd.conf:

    Header set Cache-Control "must-revalidate"
    ExpiresActive On
    ExpiresDefault "now"

    ERROR:
    Invalid command 'Header', perhaps misspelled or defined by a module not included in the server configuration

    What can I do to resolve this? I think it is related to a module that is not included in my conf file. Can you please tell me its name?

    Thanks & Regards,
    Your Fan :-)
    Rahul Panwar
