Small Office/Home Networking - Part 3
File/Print Servers

Presentation Goal:
To provide enough information to allow a first-time sysadmin to make the necessary hardware and software choices to set up a file/print server that best suits his needs.

Introduction:
First, this presentation will not give extensive details about how to configure samba, nfs, appletalk, lpr/lprng, cups, etc. Likewise, although we will talk about hardware choices one might make to set up a new file/print server, we will not discuss how one installs the various pieces of hardware. Rather, we will cover what each of these software and hardware components may be used for and why you might choose one over another.

Layout of presentation:

1. Off-the-shelf vs. Do-it-yourself
   A. Print Servers
      i. Off-the-shelf
         a. advantages
            o Usually easier to install, configure and maintain
            o Smaller, and thus easier to place wherever you need it (networkable)
         b. disadvantages
            o Not very "customizable": often leave file format processing up to the printer
            o Not very flexible or resilient: little or no spooling, do not multitask well
      ii. Do-it-yourself
         a. advantages
            o Very "customizable"
            o Flexible and resilient: spooling, multitasking
         b. disadvantages
            o Harder to install, configure and maintain
            o Larger, and thus harder to place
      iii. Third Alternative: Combine i and ii
         Although much more complicated, it combines many of the advantages of i and ii. This is what I do. I have a Netgear print server that allows me to place a Canon BJC250 color inkjet printer on the family computer desk, while the file/print server is remotely located in a closet. Because all the workstations use the file/print server as the network printer, I can spool jobs, massage file formats, etc., then send the data to the Netgear print server. I also have an HP Laserjet4 connected to the parallel port on the file/print server. This printer is also located in the computer closet, but is used much less than the inkjet printer.
   B. File Servers
      i.
         Off-the-shelf
         OTS here means buying a commercially built server, for example, from Dell, HP, etc., that has all the hardware (and software) put together by the manufacturer.
         a. advantages
            o Much less work for the sysadmin: she only has to specify the hardware/software components (if possible), and the vendor does all the assembly work. The sysadmin winds up with a server that is ready to go out of the box (almost).
         b. disadvantages
            o Components may not be the optimum choices
            o Hardware/software compatibility issues may arise
            o Aftermarket configuration and customization may take a good deal of effort (the poor sysadmin may have to undo or redo a bunch of stuff)
      ii. Do-it-yourself
         DIY here means you specify the needed components, buy or download them, and assemble them yourself.
         a. advantages
            o The sysadmin ensures that all components meet specified requirements, so hardware/software compatibility issues can be avoided up front.
         b. disadvantages
            o Quite a bit of time and energy must be invested in component assembly/customization and QC/troubleshooting.

2. Server Hardware Issues
   A. Disk power
      Can we ever get enough disk space? Keep in mind, though, that RAID can make a series of smaller disks provide more space. Even so, I recommend buying the biggest disks you can afford.
      i. SCSI or IDE
         SCSI is better at multitasking (part of the design) but more expensive. Best for a server that gets heavy use or simultaneous use by multiple users.
         IDE is cheaper, but handles multitasking poorly. Best for lightweight use by no more than one or two users at a time.
      ii. RAID: Hardware or Software
         (Note-to-self: The brief RAID tutorial goes here)
         For reliability, a server should at minimum use mirrored drives.
         Hardware RAID is much easier to set up, but much more expensive. Hardware RAID adapters usually have a menu-driven BIOS/CMOS setup. SCSI RAID adapters cost $500+, or you can spend even more to get a dedicated RAID device that looks like a single drive to the OS.
         The real problem is this: Can you manage and monitor the RAID from the OS? And is it compatible with the OS? These are questions that you need answered BEFORE you invest in a hardware RAID solution.
         Software RAID is much cheaper, but can be much harder to set up: some Linux distros (RedHat) let you create a RAID array, install the OS to it, and boot directly from the array, whereas others (Debian) have to be mightily hacked to allow booting from a RAID. Another advantage is that the array is integrated into the OS toolset, and thus can be easily monitored and maintained without booting into CMOS.
   C. Memory power
      Memory = disk buffer = speed increase
      The more users and the heavier the use, the more memory you should have.
   D. Network power
      Better network cards rely much less on the system's CPU to do their processing, and are usually designed for greater efficiency/throughput. However, they are usually more expensive, and any network card that is compatible with the OS will work. Just keep in mind that the more your server will be pounded on by users, the greater your need for a quality network card (SMC, Intel, etc.).
   E. CPU power
      On a Linux file/print server, the CPU is not nearly as important as the disk, memory and network subsystems. Older/slower CPUs are more than adequate, and cost less than the latest and greatest. Keep in mind, though, that if you scrimp on the other subsystems, the CPU you buy will make a difference. I strongly recommend spending your dollars on disk, memory and network cards first.
   G. Backup power
      Although this item is often overlooked or under-emphasized, the first time you experience a catastrophic failure you'll wish you'd invested! Most folks use tapes for backup. (Note-to-self: recount my tape repairman's story about disks being the best backup devices.) Whatever you use, make sure it is big enough, or can automatically feed itself. If not, you'll relegate backups to the back burner, and come to regret it!
      Also, test your backup system for adequate restores before you need it. You may think you've got backups when you don't!!
      I'm in the process of re-evaluating my backup system. I used to use a custom shell script and afio, and have toyed with using Amanda.
   H. Electric power (UPS is a must!)
      A crashed or dead file/print server causes no end of misery to a sysadmin. Do yourself a favour and buy a UPS, and install and TEST the UPS software. How long after the power goes out do you want your server running? Long enough to shut down cleanly? Until the power comes back? Maybe you need a generator!
   I. Upgradeability
      Keep in mind that you may want to add more disk space, printers, and memory down the road, and buy accordingly.

3. Server Software Issues
   A. Print Servers
      i. Lpr
         Practically speaking, for Linux, Lpr is dead. Most Linux distributions that have Lpr are actually using Lprng. And if you're going to use another Unix, I'd highly recommend one of the other two.
      ii. Lprng
         This is what I've used in the past. Very difficult to set up, configure and administer. (NoteToSelf: research web/gui interface options) If you are an old hack like me, you'll find that this one is the most like the good old Lpr we all know and love (hate).
      iii. CUPS
         This is what I'm moving to. MacOSX uses this, and it has a web admin interface that makes it easier for sysadmins and end-users to deal with. If starting fresh, this is what I recommend, as it is based on the new IPP (Internet Printing Protocol) standard. One caveat, however: after setting up a wireless network environment, the MacOSX laptop has had several problems accessing the file/print server's CUPS server. This may be due to my having misunderstood and misconfigured something, though.
   B. File Servers
      i. NFS
         This is the ideal in an all-Unix environment, but be prepared to deal with different configuration setups/syntax on the different Unices.
      ii.
         Samba
         This works best for an all-Windows or mixed Windows/Unix environment, or if you are using MacOSX.
      iii. Appletalk
         This works for an all-Apple environment or if you need to support older Macs.
      iv. Web
         If your users only need read-only access to files, you may want to consider using a web server.
   C. Backup Software
      i. Dump/restore
         Dump and restore are the traditional system-level Unix programs for doing backups.
         1. Advantages
            - Inherently does incremental backups (dump levels).
            - As a system-level program, it is well-integrated with the traditional Unix filesystem.
            - Newer versions can use compression inherently.
         2. Disadvantages
            - Only backs up entire filesystems; the sysadmin cannot pick and choose what gets backed up and what doesn't on the filesystem.
            - May not work with newer, non-traditional filesystem types, e.g. Reiserfs, XFS, etc.
            - Doesn't do error-checking/-correction to/from the backup medium.
            - Older versions cannot use compression inherently.
      ii. Tar
         Tar is also a regularly-used user-level Unix program for doing backups.
         1. Advantages
            - As a user-level program, it doesn't care about filesystem types, and thus works with all of them.
            - The sysadmin can pick and choose what gets backed up and what doesn't.
            - Newer versions can back up links, pipes, sockets, etc.
            - Newer versions can use compression using external programs.
         2. Disadvantages
            - No incremental backups inherent to the program.
            - Doesn't do error-checking/-correction to/from the backup medium.
            - Older versions cannot use compression using external programs.
      iii. Cpio
         Cpio is also a user-level Unix program for doing backups. It has many of the same advantages and disadvantages as tar; however, newer versions can do limited error-checking on files.
      iv. Afio
         Afio is a user-level program that is often overlooked. It has many of the same advantages and disadvantages as tar; however, it is designed to do error-checking and correction.
         Also, you will most likely have to find and compile the sources for this one, as it doesn't usually come with Linux distributions (or any other Unix, for that matter).
      v. Amanda
         Amanda is a user-level program that is designed to provide a way to administer single local backups or multi-system backups across a network. It relies on dump/restore and tar to do the underlying backups, so the features and problems of each apply to it as well.
      vi. Other Software
         There are several open source and commercial software packages for doing backups, none of which I'm familiar enough with to evaluate, other than to say that the obvious disadvantage of the commercial types is that they cost money, and frequently the disadvantage of the open source types is a lack of "easy-to-use" documentation.
      vii. Summary
         I've used several combinations of the aforementioned programs. I currently use afio with a customised shell script wrapper, but am evaluating the idea of creating a "backup server" that uses Amanda, so that I can back up several of the machines on my network. Because there are so many alternatives, it is really hard for me to recommend a particular choice that will best meet your needs. Additionally, this is one of those issues that sysadmins like to argue about. What it boils down to is this: Find the solution that works best for you, BUT make sure you test the restoration process BEFORE you need to rely on it!

4. System Administration Issues
   A. DNS
      Make sure your workstations can "see" the file server on the network. You'll need to use your ISP's public DNS, a private local DNS server, or an /etc/hosts file on every machine. If your file/print server will double as a DNS server, beef up your resources accordingly.
   B. NTP
      File timestamps can be extremely important to some users. If you need accurate system time, set up a local timeserver on your network. If your file/print server will double as the time server, add to your server's resources accordingly.
   C.
      Security
      Answering the following questions will help you get an idea of what you need in a file/print server:
         o If the local network on which your file/print server resides is connected to the Internet, have you made sure your network is secured?
         o How much do you trust your users? Have you trained them to act securely? Do they choose secure passwords? Do they give their passwords out to others? Are they likely to want to crack into your network or server(s)?
         o Do you need to keep track of what your users are doing on your servers?
         o When your server gets cracked (not if), will you need to do offline forensics on it?
         o Do you have adequate resources to do a complete install and restore from backups? To a spare machine?
         o If a hard drive dies, how long can you wait for a new one? What if a second dies while you're waiting?
         o What if your network card dies? Do you want to have spares? Can you afford them? Can you afford not to?
         o How much downtime _can_ you afford?
      Security is a tradeoff between convenience and lowering risk, so there is no magic answer that is right for everyone. You have to answer these questions, then act accordingly. (I highly recommend Bruce Schneier's "Beyond Fear" as a first reading, then "Practical Unix & Internet Security" after that. Info on both is listed below.)

5. Resources
   A. Books
      "Essential System Administration" - Æleen Frisch. ISBN: 0596003439
         - I have this one; it's worth putting on your bookshelf.
      "Linux Network Administrator's Guide" - Kirch, Dawson. ISBN: 1565924002
         - I have read this one, but recommend TCP/IP Network Administration instead, as it has a broader scope (much more than just a "Linux" perspective).
      "Practical Unix & Internet Security" - Garfinkel, Spafford, Schwartz. ISBN: 0596003234
         - I have read this one; it's worth putting on your bookshelf.
      "TCP/IP Network Administration" - Craig Hunt. ISBN: 0596002971
         - I have read this one; it's worth putting on your bookshelf.
      "Unix System Administration Handbook" (aka The Red Book, The Yellow Book, The Purple Book, etc.) - Evi Nemeth, et al. Purple Book ISBN: 0130206016 (Just Linux) Green Book ISBN: 0130084662
         - I have the Yellow Book, and have perused the Red Book; both are worth putting on your bookshelf. Although I haven't checked out the Purple Book, if it only covers Linux, I recommend the Red Book instead.
      "Beyond Fear" - Bruce Schneier. ISBN: 0387026207
         - I have this book; it's an excellent discussion of what security is and how to evaluate "security measures". It's about security in its broadest sense, not just computer security.
      "Managing NFS and NIS" - Stern, Eisler, Labiaga. ISBN: 1565925106
         - I have this book, but unless you're doing a lot of NFS or NIS stuff, stick with TCP/IP Network Administration.
      "Managing RAID on Linux" - Derek Vadala. ISBN: 1565927303
         - I've not read this one, so I can't give an evaluation: caveat emptor.
      "Network Printing" - Gast, Radermacher. ISBN: 0596000383
         - I've not read this one, so I can't give an evaluation: caveat emptor.
      "Unix Backup & Recovery" - W. Curtis Preston. ISBN: 1565926420
         - I've not read this one, but parts of it (pertaining to Amanda) are on the Web. I have read those and found them somewhat useful, but they didn't go into enough depth to cover my need for advanced configuration info.
      "Using Samba" - Ts, Eckstein, Collier-Brown. ISBN: 0596002564
         - I've not read this one, so I can't give an evaluation: caveat emptor.
   B.
      Websites
      RAID
         - http://www.acnc.com/04_01_00.html
         - http://www.prepressure.com/techno/raid.htm
      SAMBA - http://www.samba.org
      APPLETALK - http://netatalk.sourceforge.net
      APACHE - http://www.apache.org
      CUPS - http://www.cups.org
      LPRNG - http://www.lprng.com
      AMANDA - http://www.amanda.org
      AFIO - http://freshmeat.net/projects/afio/

Conclusion:
It is my sincere hope that by now you've accumulated enough information to decide what you need to implement a file/print server that best suits your needs, and/or that you are aware of the various information resources available to help you with your decision and implementation processes.
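
Example (restore test):
The advice repeated throughout this presentation is to test restores before you need them. The following is only a minimal sketch of that idea, not a replacement for any of the backup programs discussed in 3.C. It assumes Python 3.9 or later and uses only the standard library's tarfile module; the function names file_digests and backup_and_verify are hypothetical, made up for this illustration. It archives a directory, restores it into a scratch directory, and compares SHA-256 checksums of every file.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def file_digests(root: Path) -> dict[str, str]:
    """Map each regular file under root (by relative path) to its SHA-256 digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def backup_and_verify(source: Path, archive: Path) -> bool:
    """Write a gzip-compressed tar of source, restore it to a scratch
    directory, and report whether the restored files match the originals.

    The archive must live outside source, or it would try to back up itself.
    """
    with tarfile.open(archive, "w:gz") as tar:
        # arcname="." stores paths relative to source, not absolute paths.
        tar.add(source, arcname=".")

    with tempfile.TemporaryDirectory() as scratch:
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(scratch)
        return file_digests(source) == file_digests(Path(scratch))


if __name__ == "__main__":
    # Self-contained demo on throwaway data (hypothetical file names).
    with tempfile.TemporaryDirectory() as work:
        src = Path(work) / "shared"
        (src / "docs").mkdir(parents=True)
        (src / "docs" / "readme.txt").write_text("hello from the file server\n")
        (src / "notes.txt").write_text("test your restores!\n")

        ok = backup_and_verify(src, Path(work) / "backup.tar.gz")
        print("restore verified" if ok else "RESTORE MISMATCH")
```

A check like this only proves that the archive can be read back on the same machine; it says nothing about whether the tape or disk it is eventually copied to is readable, so a periodic full restore to a spare machine is still worth doing.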