Network Attached Storage, or NAS, is file storage attached to a network. The term refers to storage elements that connect to a network and provide file access services to computer systems. A NAS storage element consists of an engine, which implements the file services, and one or more devices on which data is stored. NAS elements may be attached to any type of network. When attached to SANs, NAS elements may be considered members of the SAS (SAN attached storage) class of storage elements. NAS provides file services to host computers. A host system that uses NAS typically uses a file system device driver to access data through a file access protocol such as NFS and WebNFS (Network File System, from Sun Microsystems), CIFS (Common Internet File System, supported by Microsoft, Intel, SCO, Intergraph, DEC, and Network Appliance Inc.), or DFS (Distributed File System, from Microsoft). Refer to PROTOCOL CONCEPTS below for more information.
NFS is a client/server distributed file service that provides transparent file sharing for network environments. NFS was originally designed by a small team at Sun Microsystems in the 1980s, but is now an open Internet protocol. Version 2 is defined in RFC 1094, and version 3 followed in RFC 1813 in 1995. NFS is also a set of X/Open specifications defined as X/Open90 and X/Open91. (X/Open is now part of The Open Group.) AFS (Andrew File System) is related to NFS.
NFS runs on a full range of systems from PCs to mainframes in local and global environments. Multiple types of clients can access shared NFS file systems. In 1995, Dataquest estimated that NFS had an installed base of 8.5 million systems with an expected installed base of 12 million systems by 1997. While NFS is available in the public domain and from many vendors, Sun Microsystems also sells and supports the service. Sun includes the NIS+ enterprise naming service with its version of NFS and offers an Internet version of NFS called WebNFS, which is discussed below.
An important distinction between NFS and Internet FTP (File Transfer Protocol) is that NFS does not need to fully transfer a file to a client’s system. Instead, only the blocks of the file that the client needs are transferred across the network link, thus reducing network traffic.
NFS servers broadcast or advertise the directories that they share. A shared directory is often called a published or exported directory. Information about shared directories and who can access them is stored in a file that is read by the operating system when it boots. The process of making the files in a shared directory accessible to a client is called file mounting. An automounter process is available that automatically mounts files on demand when a user attempts to access a file. Prior knowledge of a file’s location is not necessary.
NFS version 3 provides integrity features for files that may be opened by multiple users simultaneously.
If several people were accessing the same file and some of those people were writing changes to the file, then other people must know about the changes that have been made. NFS solves this problem by implementing a lock manager to lock the sections of a file that are currently being accessed by different people. A status monitor works with the lock manager to ensure that people accessing and making changes to a file do not “collide.” NFS version 3 also implements a global namespace that lets users move to different network locations but still access files with the same naming scheme used at the “home site.” This feature also benefits applications that are configured to access files based on specific location names. Security features in NFS include an authentication and authorization service to check user IDs and access rights before allowing them to access a file. NFS can also be configured to use other security services such as Kerberos. Encryption services such as DES (Data Encryption Standard) can also be implemented. NFS also implements ACLs (access control lists) which hold authorization information that defines exactly how an authenticated user can access a file.
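The locking behavior described above can be illustrated with a small sketch. This is not the NFS lock manager protocol itself, only a toy in-memory model of the idea that overlapping sections of a file cannot be locked by two different users at once. The class name `ByteRangeLockManager`, the file path, and the user names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ByteRangeLockManager:
    """Toy lock manager: tracks exclusive byte-range locks per file path."""

    # Maps a file path to a list of (start, end, owner) tuples; end is exclusive.
    locks: dict = field(default_factory=dict)

    def acquire(self, path: str, start: int, length: int, owner: str) -> bool:
        """Grant the lock only if no other owner holds an overlapping range."""
        end = start + length
        for held_start, held_end, held_owner in self.locks.get(path, []):
            overlaps = start < held_end and held_start < end
            if overlaps and held_owner != owner:
                return False  # would "collide" with another user's section
        self.locks.setdefault(path, []).append((start, end, owner))
        return True

    def release(self, path: str, owner: str) -> None:
        """Drop every lock the owner holds on the file."""
        self.locks[path] = [
            lock for lock in self.locks.get(path, []) if lock[2] != owner
        ]


manager = ByteRangeLockManager()
first = manager.acquire("/export/report.txt", 0, 100, "alice")
second = manager.acquire("/export/report.txt", 50, 100, "bob")   # overlaps alice
third = manager.acquire("/export/report.txt", 200, 50, "bob")    # disjoint range
manager.release("/export/report.txt", "alice")
fourth = manager.acquire("/export/report.txt", 50, 100, "bob")   # free after release
```

In this sketch the second request is refused because it overlaps a section another user holds, while the disjoint request succeeds. A real lock manager also needs a status monitor to recover locks after a client or server crash, which the sketch leaves out.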
With WebNFS Sun has extended NFS to operate on the Internet. WebNFS basically makes information on NFS servers (versions 2 and 3) available to users of Web browsers and Java applets over the Internet. It also allows users to access NFS servers through corporate firewalls. WebNFS is a complete file system for the Web, unlike FTP and HTTP. It supports in-place editing of files so users don’t need to download and edit a file, then send it back in separate operations. Files on the Internet appear as local files to users and are accessed using the NFS URL (Uniform Resource Locator) format such as nfs://server/directory/filename. WebNFS builds on NFS technology to bring file access and distribution to the Web. WebNFS has been designed to be much more efficient than NFS at bandwidth utilization over the Internet. This improves performance when downloading software from Web sites. Automatic error and crash recovery is also provided.
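As a small illustration of the NFS URL format mentioned above, the following sketch splits such a URL into its server and path parts using Python's standard `urllib.parse` module. The helper name `parse_nfs_url` is hypothetical, and the sketch only parses the URL; it does not contact a server.

```python
from urllib.parse import urlsplit


def parse_nfs_url(url: str) -> tuple:
    """Return (server, path) for a URL of the form nfs://server/directory/filename."""
    parts = urlsplit(url)
    if parts.scheme != "nfs":
        raise ValueError(f"not an NFS URL: {url!r}")
    return parts.hostname, parts.path


server, path = parse_nfs_url("nfs://server/directory/filename")
```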
CIFS is a new file system supported by Microsoft and other vendors such as Data General, Digital Equipment Corp., Intel Corporation, Intergraph, Network Appliance Inc., and SCO. It is an extension of Microsoft’s existing SMB (Server Message Block) file-sharing protocol and allows individuals and organizations to run file systems over the Internet. In the past, most Internet file exchanges have been one-way transfers. CIFS goes beyond this by allowing groups of users to work together and share documents over the Internet in the same way they share documents when running peer networking services on Windows clients. Users can collaborate over the Internet by defining shared folders and files on systems that are connected to the Internet. CIFS has been submitted by Microsoft to the IETF (Internet Engineering Task Force) as a proposed Internet standard. Here are some of its important features:
i) CIFS uses the same multiuser read and write operations, locking, and file-sharing semantics that are used on most networks.
ii) It runs over TCP/IP and uses the Internet’s global DNS (Domain Name System).
iii) It supports multiple clients accessing and updating the same file without conflicts over the Internet.
iv) CIFS supports fault tolerance and can automatically restore connections and reopen files that were open prior to interruption.
v) CIFS is “tuned” to provide optimal performance over dial-up links.
vi) Users refer to remote file systems with an easy-to-use file-naming scheme.
vii) CIFS is also widely available on UNIX, VMS, and other platforms.
CIFS is basically an enhanced version of Microsoft’s open, cross-platform SMB protocol, the native file-sharing protocol in the Windows 95, Windows NT, and OS/2 operating systems. CIFS complements standard Web protocols such as HTTP (Hypertext Transfer Protocol) by providing a more sophisticated file-sharing protocol.
Users do not need to rely solely on their Web browsers to access Internet information, because with CIFS most existing applications can access that data directly by using the standard Open and Save dialog boxes that users are already familiar with. The security features in CIFS include support both for anonymous transfers and for secure, authenticated access to named files. File and directory security policies are easy to administer, and use the same paradigm as share-level and user-level security policies in Windows environments. Most major operating system and application developers support CIFS. CIFS competes with Sun Microsystems’ WebNFS, a distributed file system that Sun is attempting to integrate directly into Web browsers and other clients. Netscape is embedding WebNFS into Navigator.
Microsoft DFS is designed to make it easier to access files on networks. It provides a way to unite files on different computers under a single name space. To the user, files appear as if they are in one location rather than on separate computers. A hierarchical tree provides a view of these files, and users can “drill down” through the tree to find just the information they are looking for. The user does not need to know or care about the physical location of the file, only where it is located in the hierarchical view. That means that users no longer search for files by opening file servers and disk drives and looking through a separate directory structure on each. Instead, users look through a logical directory that places shared information in a place that makes more sense to users and administrators alike. With DFS, an administrator does up-front work to logically organize information, so users don’t have trouble finding it later on. Some of the benefits of DFS are outlined here:
i) Users can access information with DFS’s hierarchical view of network resources. Administrators can create custom views to make file access easier for users.
ii) Volumes consist of individual shares and those shares can be at many different locations. A share can be taken offline without affecting the rest of the volume.
iii) To ensure that critical data is always available, administrators can set up alternate locations for accessing data by simply including the alternate locations under the same logical DFS name. If one of the locations goes down, the other location is automatically selected.
iv) Response time can be improved by load-balancing the system. Often-accessed files can be stored in multiple locations and the system will automatically distribute requests across the drives to balance traffic during peak usage periods.
v) Users don’t need to know about the physical location of files. Administrators can physically move files to other drives, but to the user the files still appear under the same location in the hierarchical tree.
vi) Client access to shares is cached to improve performance. The first time a user accesses a published directory, the information is cached and used for future references.
vii) DFS simplifies enterprise backups. Since a DFS tree can be built to cover an entire enterprise, the backup software can back up this single “tree,” no matter how many servers/shares are part of the tree. The tree can include Microsoft Windows desktops as well.
viii) A graphical administration tool makes it easy to configure volumes, DFS links, and remote DFS roots.
DFS fits into an organization’s Internet and intranet strategy. The Web pages of individual departments or even users can be included within the directory tree. DFS can also hold HTML links, so if linked pages are moved to a different physical location, the links pointing to those pages do not have to be reconfigured.
DFS Volumes
A DFS volume starts out by being hosted by a specific computer. There may be many individual DFS volumes available on a network, and each will have its own distinct name. Windows NT 4.0 (or greater) servers are currently the only systems that can host DFS volumes. An organization might have a master DFS volume that contains links to other DFS volumes at the department or division level. Another volume might tie together shares that are common in each department, such as public documents. In the DFS volume name shown here, the hosting computer name is \\Server_Name:
\\Server_Name\DFS_Share_Name\path\name
Like a local file system, a DFS volume has a root that is its starting point. This is represented by DFS_Share_Name. The reference to path\name can be any valid pathname. The figure below illustrates how links work. Three departments (Marketing, Engineering, and Research) have set up their own name spaces to fit their own needs. The corporate DFS volume links into specific parts of these shares as needed to provide corporate users with information from other locations in the organization. When a link is accessed, the junction between two different DFS volumes is crossed and the server that provides the DFS root changes. This is transparent to the user, however.

Structure of a distributed file system
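The way a logical DFS name maps onto physical shares can be sketched in a few lines. This is only an illustrative model, not Microsoft's implementation: the link table `LINKS`, the server and share names in it, and the helpers `split_dfs_path` and `resolve` are all hypothetical. It shows the volume name being split into server, root, and path, and an alternate location being chosen when the first one is offline.

```python
def split_dfs_path(unc_path):
    r"""Split \\Server_Name\DFS_Share_Name\path\name into (server, root, rest)."""
    if not unc_path.startswith("\\\\"):
        raise ValueError("expected a path starting with two backslashes")
    server, share, *rest = unc_path[2:].split("\\")
    return server, share, "\\".join(rest)


# Hypothetical link table for one DFS root: logical folder -> alternate shares.
LINKS = {
    "Marketing": [r"\\mkt01\public", r"\\mkt02\public"],
    "Engineering": [r"\\eng01\docs"],
}


def resolve(unc_path, links, offline=frozenset()):
    """Map a logical DFS path to the first physical target that is online."""
    _server, _share, rest = split_dfs_path(unc_path)
    folder, _, remainder = rest.partition("\\")
    for target in links.get(folder, []):
        if target not in offline:
            return target + ("\\" + remainder if remainder else "")
    raise LookupError(f"no available target for {folder!r}")


parts = split_dfs_path(r"\\Server_Name\DFS_Share_Name\path\name")
normal = resolve(r"\\Corp\Root\Marketing\plans\q3.doc", LINKS)
failover = resolve(
    r"\\Corp\Root\Marketing\plans\q3.doc", LINKS, offline={r"\\mkt01\public"}
)
```

The user always asks for the same logical name. Only the table lookup decides which server answers, which is why files can be moved or a location can fail without the user noticing.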
In common usage, a NAS system is a special-purpose device that is designed to serve files to clients over a LAN (see Figure below). The clients request access to files using standard network file system (NFS) or common Internet file system (CIFS) commands. NAS devices typically contain embedded processors hosting a specialized operating system, or microkernel, and a highly optimized file system, both designed to enable the NAS device to serve files to clients with very high performance. Because they can serve multiple heterogeneous clients, NAS devices provide a form of heterogeneous data sharing.

NAS system serves files over a LAN.
Although the attributes of specific NAS products vary, NAS vendors generally attempt to adhere to the "appliance" model of computing. That is, NAS devices are designed to do one thing, file serving, and to do it very well. Moreover, they are typically designed to be very simple to install and configure. The storage they provide is often housed within the device's enclosure, though some NAS devices allow for the attachment of external storage.
The NAS market was pioneered by companies such as Network Appliance and Auspex, which provide NAS systems for workgroup and enterprise customers. As the NAS market has grown, new vendors such as Connex and CDS are attempting to stake out niches in the mid-range and low end, while system and storage vendors such as HP, Sun and EMC have also entered the market.
NAS components range from Quantum Corp.'s tiny 5GB Snap Servers to 12-terabyte-size units from Network Appliance Inc. For the low-end to midrange market, the chief attraction of NAS is that it is easy to implement, and some NAS schemes can be up and running in less than 15 minutes.
NAS communicates through the ubiquitous IP protocol (over Ethernet and Gigabit Ethernet), so it doesn't require the expensive host bus adapters that Fibre Channel requires. It is thus a good solution for providing general-purpose storage services to low-end clients as well as high-performance servers.
The addition of Fibre Channel storage subsystems gives EMC Corp.'s and Network Appliance's enterprise-class NAS systems high performance and scalability.
For the most part, enterprise-class NAS systems have been useful as content caching appliances for static content and streaming media, as well as some high-performance file serving. But high-end databases and server clusters primarily use SANs for their storage needs.
The major bottleneck in NAS products is the thin-server processing unit, which manages the file system and TCP/IP operations. Thin servers have highly optimized operating systems that are designed to deliver files as quickly as possible, but the software processes that sit between clients and their data introduce a large amount of latency compared with SAN systems.
However, the tables seem set to be turned with the emergence of startups such as BlueArc Corp. that aim to move software-based operations to hardware.
BlueArc's SiliconServer (which will be available later this spring) uses four customized processors to handle TCP/IP, protocol subsystem management, file system management and storage subsystem management. In early demonstrations, the SiliconServer has been a fast mover in basic storage benchmarks, but its viability in the enterprise storage market will have to be earned.
A storage area network (SAN) is, quite simply, a network dedicated to storage. It is a network whose primary purpose is the transfer of data between computer systems and storage elements and among storage elements. A SAN consists of a communication infrastructure, which provides physical connections, and a management layer, which organizes the connections, storage elements and computer systems so that data transfer is secure and robust.
Unlike the traditional direct attach storage model, a SAN attaches storage devices to servers in a networked fashion, using hubs, switches, routers and bridges to build the topology (see Figure below). Both the systems and the storage devices can, in theory, be heterogeneous in nature, though today interoperability concerns limit some customers to building homogeneous SANs. Although the network could conceivably be built with any networking technology, Fibre Channel has emerged as the technology of choice for SANs.

Topology of a SAN
SANs provide a number of advantages over direct attached storage. They provide any-to-any connectivity between servers and storage devices, making possible the sharing of storage resources between multiple servers and thus enabling IT managers to consolidate storage on a few large storage platforms. They also provide any-to-any connectivity between the storage devices themselves, opening the way for direct movement of data between storage devices, vastly improving efficiency of data movement and processes such as data backup or replication. The use of Fibre Channel, or most any other networking technology proposed for SANs, enables longer connectivity distances and higher performance than currently possible with SCSI technology. Over time, SAN technology will ease the task of centralized storage management and drive the adoption of remote management and data protection strategies, storage consolidation, system clustering and cross-platform data sharing.
The SAN market is made up of the vendors of Fibre Channel interconnect technology, as well as the vendors of the systems and storage devices that attach to the network. The Fibre Channel vendors are primarily new, relatively small companies such as Brocade, Vixel, Gadzoox and Crossroads. The storage companies are the same ones that have been providing direct-attach storage for years, such as EMC, Hitachi, Sun, HP and Compaq; and it is no exaggeration to say that every storage company is involved in the SAN market.
A typical SAN consists of Fibre Channel storage units (tape or RAID) that are networked to servers via Fibre Channel switches (from vendors including Gadzoox Networks Inc.).
The primary benefit of a SAN setup is its ability to provide device sharing, which allows storage resources to be consolidated. With a SAN in place, an IT manager can make single storage purchases instead of buying separate external storage units for each hardware platform.
A few weeks ago, Nishan Systems began shipping switches that have the ability to convert Fibre Channel traffic into Gigabit Ethernet traffic and vice versa. As Fibre Channel and networking vendors continue to grow, we should expect to see more hybrid systems like these in the future.
SAN describes a networked storage topology and NAS describes a highly optimized network file server. The questions asked by the IT managers, then, typically come down to some variation of the following:
The first question arises because, just as NAS provides high-performance shared access to (file system) data, one of the promises of SAN is also to provide high-performance storage and data sharing. The good news is that the choice between SAN and NAS is not an either/or decision. SAN topologies and NAS devices do, in fact, peacefully co-exist in many data centers. For example, a SAN in the data center may network database and application servers with a number of large storage devices on which their data resides, while one or more NAS devices are attached to the LAN providing file access to clients (see Figure below).

SAN and NAS can co-exist.
The choice of which technology to use is driven mainly by the requirement being addressed and partly by timing. If the requirement is to provide shared file access to a number of clients, NAS is generally the answer. NAS devices meet this need today with great efficiency. Because NAS systems are built on existing LAN and file system protocols, NAS technology is relatively mature in comparison with SANs. While a few SAN file sharing solutions exist, they are generally aimed at specialized markets such as video editing. Generalized SAN file sharing solutions will probably require a distributed SAN file system, which could be years away from appearing and maturing.
On the other hand, many users are grappling with the need to consolidate data used by large databases or applications such as Microsoft Exchange onto a small number of shared storage platforms to improve centralized management. Or, they want to take advantage of device-to-device data movement for applications such as backup or data replication. In this case, SAN topologies can provide unique capabilities to address these requirements.
Clients requiring file access still get the performance benefit of a highly optimized file server. One can take advantage of the efficiencies of storage consolidation by placing the NAS file system on a shared SAN storage device. And one benefits from the plug-and-play features of NAS setup and administration. Also the SNIA definition of NAS specifically allows for a NAS device to be connected to any type of network including a SAN. For this to be meaningful, the SAN would have to be capable of carrying file traffic in addition to the block protocols (i.e., SCSI) that it typically carries today. Fibre Channel, for instance, is capable of carrying both SCSI and IP traffic simultaneously. This capability is occasionally exploited today, but mainly for the transmission of management commands to a device via IP. It is relatively rare for clients to use Fibre Channel as the interconnect for accessing file servers. While theoretically possible, few people advocate the use of Fibre Channel as a generalized messaging network technology.
SAN topologies offer the ability to consolidate storage and improve data protection and storage management processes with a dedicated, high-performance storage network. NAS systems offer high-performance, low-administration file serving and file sharing for heterogeneous systems. Used together, they provide a potent one-two punch for addressing data center requirements. The fact that the two technologies can co-exist and work together means that investments in either or both will pay off well into the future. SAN and NAS are the reason that the future of storage is networked.
==================================================
Network communication protocols are defined within the context of a layered architecture, usually called a protocol stack. The OSI (Open Systems Interconnection) protocol stack is often used as a reference to define the different types of services that are required for systems to communicate. Figure 1 compares the OSI protocol stack to the more common protocols found today.
FIGURE 1: Common protocol stacks
The lowest layers define physical interfaces and electrical transmission characteristics. The middle layers define how devices communicate, maintain a connection, check for errors, and perform flow control to ensure that one system does not receive more data than it can process. The upper layers define how applications can use the lower network layer services.
The protocol stack defines how communication hardware and software interoperate at various levels. Layering is a design approach that specifies different functions and services at levels in the protocol stack. Layering allows vendors to build products that interoperate with products developed by other vendors.
Each layer in a protocol stack provides services to the protocol layer just above it. The service accepts data from the higher layer, adds its own protocol information, and passes it down to the next layer. Each layer also carries on a "conversation" with its peer layer in the computer it is communicating with. Peers exchange information about the status of the communication session in relation to the functions that are provided by their particular layer.
As an analogy, imagine the creation of a formal agreement between two embassies. At the top, formal negotiations take place between ambassadors, but in the background, diplomats and officers work on documents, define procedures, and perform other activities. Diplomats have rank, and diplomats at each rank perform some service for higher-ranking diplomats. The ambassador at the highest level passes orders down to a lower-level diplomat. That diplomat provides services to the ambassador and coordinates his or her activities with a diplomat of equal rank at the other embassy. Likewise, diplomats of lower rank, who provide services to higher-level diplomats, also coordinate their activities with peer diplomats in the other embassy. Diplomats follow established diplomatic procedures based on the ranks they occupy. For example, a diplomatic officer at a particular level may provide language translation services or technical documentation. This officer communicates with a peer at the other embassy regarding translation and documentation procedures.
In the diplomatic world, a diplomat at one embassy simply picks up the phone and calls his or her peer at the other embassy. In the world of network communication, software processes called entities occupy layers in the protocol stack instead of diplomats of rank. However, these entities don't have a direct line of communication between one another. Instead, they use a virtual communication path in which messages are sent down the protocol stack, across the wire, and up the protocol stack of the other computer, where they are retrieved by the peer entity. This whole process is illustrated in Figures 2 and 3. Note that the terminology used here is for the OSI protocol stack. The more popular TCP/IP protocol suite uses slightly different terminology, but the process is similar.
As information passes down through the protocol layers, it forms a packet called the PDU (protocol data unit). Entities in each layer add PCI (protocol control information) to the PDU in the form of messages that are destined for peer entities in the other system. Although entities communicate with their peers, they must utilize the services of lower layers to get those messages across. SAPs (service access points) are the connection points that entities in adjacent layers use to communicate messages; they are like addresses that entities in other layers or other systems can use when sending messages to a system. When the packet arrives at the other system, it moves up through the protocol stack, and information for each entity is stripped off the packet and passed to the entity.

FIGURE 2: Communication process between two separate protocol stacks
Figure 3 illustrates what happens as protocol data units are passed down through the layers of the protocol stack. Using the previous diplomatic analogy, assume the ambassador wants to send a message to the ambassador at the other embassy. He or she creates the letter and passes it to an assistant, who is a diplomat at the next rank down. This diplomat places the letter in an envelope and writes an instructional message on the envelope addressed to his or her peer at the other embassy. This package then goes down to the next-ranking diplomat, who puts it in yet another envelope and writes some instructions addressed to his or her peer at the other embassy. This process continues down the ranks until it reaches the "physical" level, where the package is delivered by a courier to the other embassy. At the other embassy, each diplomat reads the message addressed to him or her and passes the enclosed envelope up to the next-ranking officer.

FIGURE 3: How data and/or messages are packaged for transport to another computer
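The envelope-within-envelope process described above can be sketched as follows. This is a simplified model under stated assumptions: the four layer names in `LAYERS` and the bracketed text headers stand in for real protocol control information, and the function names are hypothetical. Each layer on the sending side wraps the data it receives from above, and each layer on the receiving side removes only the header addressed to it.

```python
# Illustrative layer names, ordered from the top of the stack to the bottom.
LAYERS = ["application", "transport", "network", "data-link"]


def encapsulate(payload: bytes) -> bytes:
    """Pass data down the stack; each layer prepends its own header (PCI)."""
    pdu = payload
    for layer in LAYERS:
        header = f"[{layer}]".encode()
        pdu = header + pdu
    return pdu


def decapsulate(pdu: bytes) -> bytes:
    """Pass data up the stack; each layer strips the header meant for it."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]".encode()
        if not pdu.startswith(header):
            raise ValueError(f"missing header for layer {layer!r}")
        pdu = pdu[len(header):]
    return pdu


wire = encapsulate(b"hello")
delivered = decapsulate(wire)
```

The outermost header belongs to the lowest layer, just as the courier in the analogy handles the outermost envelope.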
Each layer performs a range of services. In particular, you should refer to "Data Communication Concepts," "Data Link Protocols," "Network Layer Protocols," and "Transport Protocols and Services" for more information. The sections "IP (Internet Protocol)" and "TCP (Transmission Control Protocol)" also provide some insight into the functions of the two most important layers as related to the Internet protocol suite.
Data communications is all about transmitting information from one device to another. All the controls and procedures for communicating information are handled by communication protocols. At the most basic level, information is converted into signals that can be transmitted across a guided (copper or fiber-optic cable) or unguided (radio transmission) medium. At the highest level, users interact with applications. In between is software that defines and controls how applications take advantage of the underlying network.
Communication Protocols
Any discussion of data communications must begin with a discussion of protocols. Communication protocols are the rules and procedures that networked systems use to communicate on a transmission medium. Communication protocols are responsible for establishing and maintaining communication sessions. Two computers engage in a session to coordinate the transfer of data. Sessions are connection-oriented. In contrast, a connectionless transmission occurs when data is sent to a device without the sender first establishing contact with the receiver. The Internet is a connectionless system. Connection-oriented and connectionless sessions are discussed under “Connection-Oriented and Connectionless Services.” Communication protocols can be compared to the diplomatic protocols used by foreign embassies. Diplomats of various rank handle different types of negotiations. They communicate with peer diplomats in other embassies. Likewise, communication protocols have a layered structure in which protocols at one layer in the transmitting system communicate with protocols in the same peer layer of the receiving system. A simplified diagram is pictured in Figure 4. Note the top layer is a high-level, network-enabled application where users make requests for network services. This layer talks with its peer layer in the computer it is communicating with. The messages sent by this layer travel down the protocol stack, across the wire and up through the protocol stack to the destination.
FIGURE 4: Layered network architecture simplified for clarity
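The difference between a connection-oriented session and a connectionless transmission, described above, can be seen with Python's standard `socket` module. This is a minimal sketch that assumes a working loopback interface (127.0.0.1); the function names are hypothetical. The UDP sender transmits without first contacting the receiver, while the TCP client must establish a connection before any data moves.

```python
import socket


def udp_exchange(message: bytes) -> bytes:
    """Connectionless: the sender transmits without first contacting the receiver."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))      # port 0 lets the OS pick a free port
        receiver.settimeout(2.0)
        sender.sendto(message, receiver.getsockname())
        data, _address = receiver.recvfrom(4096)
        return data


def tcp_exchange(message: bytes) -> bytes:
    """Connection-oriented: a session is established before data is sent."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        listener.settimeout(2.0)
        with socket.create_connection(listener.getsockname(), timeout=2.0) as client:
            conn, _address = listener.accept()   # the established session
            with conn:
                conn.settimeout(2.0)
                client.sendall(message)
                return conn.recv(4096)


udp_reply = udp_exchange(b"ping")
tcp_reply = tcp_exchange(b"ping")
```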
The top layer is where applications interact with the network and is called the application layer protocol. The middle protocol layer, generically called the transport layer in this case, is responsible for keeping the communication session alive and running and for coordinating the transfer of information. It also provides “services” to the upper application layer. The lower layer defines connections to the physical transmission medium and the signaling techniques used on the medium. Note that the physical layer might provide modem connections, network connections, or even connections to satellites. For later reference, you should know that data passes through the protocol stack in blocks. For example, a file transfer might be broken up into any number of pieces, then transmitted one piece at a time. If one of the pieces is lost, it can be re-sent without needing to retransmit the entire file. Technically, pieces of data passing through the protocol stack are called PDUs (protocol data units). This is discussed further under “Protocol Concepts.” In more general terms, people talk about packets of data moving from one system to another. Another term you will encounter is frames, which has to do with dividing serial streams of data into manageable blocks for transmission as discussed later in this topic under the subheading “Framing in Data Transmissions.” The reason for layering the protocol stacks is simple. Protocols are published as worldwide standards so that one vendor can create network hardware or software that will work with another vendor’s hardware or software. A developer references a particular part of the protocol stack that is appropriate for the product being developed. Long ago, the ISO (International Organization for Standardization) developed the seven-layer OSI (Open Systems Interconnection) model. This model was supposed to have provided a framework for integrating data processing systems everywhere. 
However, to date it has only served as a very useful model for discussing how other more popular protocols operate and work together. The Internet protocols, including TCP/IP, are now commonly used throughout the world. Only a few years ago, a number of other protocols were vying for this top spot, including the OSI protocols. Other network protocol suites include Novell’s IPX/SPX, AppleTalk, and IBM SNA. The remainder of this section looks at layers of the protocol stack from the bottom physical layer to the upper application layer, with an emphasis on TCP/IP and other Internet protocols. Each section explains the basic terminology only and refers you to appropriate headings in this book.
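The earlier point about breaking a file transfer into pieces, so that a lost piece can be re-sent without retransmitting the whole file, can be sketched as follows. This is a toy simulation rather than any real transport protocol: the function names, the piece size, and the `lost_on_first_try` set that marks which pieces go missing are all illustrative assumptions.

```python
def split_into_pieces(data: bytes, size: int) -> list:
    """Break the data into fixed-size pieces; the last piece may be shorter."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def transfer(data: bytes, size: int, lost_on_first_try: set) -> tuple:
    """Simulate a transfer; return (reassembled data, number of pieces sent)."""
    pieces = split_into_pieces(data, size)
    received = {}
    sent_count = 0

    # First pass: every piece is sent once, but some never arrive.
    for seq, piece in enumerate(pieces):
        sent_count += 1
        if seq not in lost_on_first_try:
            received[seq] = piece

    # Second pass: only the missing pieces are re-sent, not the whole file.
    for seq, piece in enumerate(pieces):
        if seq not in received:
            sent_count += 1
            received[seq] = piece

    reassembled = b"".join(received[seq] for seq in range(len(pieces)))
    return reassembled, sent_count


result, sent = transfer(b"abcdefghij", 4, lost_on_first_try={1})
```

Here the ten bytes travel as three pieces, and losing the second piece costs one extra transmission rather than a full retransfer.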
Transmission Media and Signaling at the Physical Layer
A communication system consists of a transmission medium and the devices that connect to it. The medium may be guided or unguided, where guided media are metal or optical cables and unguided media carry signals through air or the vacuum of space. A communication system that connects two devices is said to be a point-to-point system. In contrast, a shared system connects a number of devices that can transmit on the same medium, but only one at a time. Both systems are illustrated in Figure 5. Note that system A and system Z have an end-to-end link that crosses over several individual data links.
FIGURE 5: Shared and point-to-point communication systems
Analog and Digital Signaling
Devices are connected to a transmission medium with an adapter that generates signals for transmitting data over that medium. For digital communication systems, discrete high- and low-voltage values are generated to provide the signaling for binary 1s and 0s, respectively. In contrast, an analog communication system like the voice telephone network transmits continuous analog signals that vary in amplitude and frequency over time. The frequency of these sine wave signals is measured in cycles per second, or Hz (hertz). As you’ll see, the frequency of the signals plays a role in the amount of data that can be transmitted without distortion over an analog telephone line. The bandwidth of a system refers to its data-carrying capacity. A modem (modulator/demodulator) is a device that can be used to transmit digital signals over analog transmission lines. A modem is required at both ends of a transmission to modulate, then demodulate the signal. As shown in Figure 6, the transmitting modem converts a digital signal into an analog signal and the receiving modem converts the signal back to discrete digital signals.
FIGURE 6: Digital-to-analog-to-digital conversion
There are a number of factors that limit the data rate (bandwidth) of a transmission system. One is the frequency allowed on the channel. It may be limited for a number of reasons, including government restrictions or the specifications of the transmission system. The telephone system has bandwidth limitations due to its use as a voice communication system. When transmitting digital data over analog systems, the higher the frequency, the higher the data rate. Figure 7 illustrates why this is so. In A, the frequency is low, so it is more difficult to transpose the discrete digital signal on the analog transmission. Note that the discrete signal is poorly represented, and this will result in distortion at the receiving end. In B, the bandwidth is much higher and more capable of representing the discrete digital signal without distortion.
FIGURE 7: Representing discrete digital signals on analog transmissions
Data Encoding
In its simplest form, digital data is transmitted as high- or low-voltage pulses. In a one-to-one relationship, a binary 0 may be transmitted as a zero-voltage level, and a binary 1 may be transmitted as a +5V voltage level. However, special encoding schemes are used to transmit signals more efficiently. In these encoding schemes, 1s are not always represented by a high voltage and 0s by a low voltage (or vice versa). Instead, a change in polarity may reverse the scheme at any time, depending on the bit value. This is explained next. A scheme called Manchester encoding is used on Ethernet LANs. Its most important feature is that it provides a way for sender and receiver to synchronize and track the exact location of bits in a transmission without the need for a separate clocking mechanism. Note in Figure 8 that a transition always takes place in the middle of each bit period. This transition serves as a built-in clocking mechanism that the receiver can track. It also divides each bit period into two intervals in which bits are represented as follows:
• A binary 1 is represented by the first interval set high and the second interval set low.
• A binary 0 is represented by the first interval set low and the second interval set high.
FIGURE 8: Manchester encoding
Note that Manchester encoding is not the most efficient of the encoding schemes, but it is easy to implement and is used on many LANs today.
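The two-interval rule described above can be sketched in a few lines of Python. This is only an illustration under the convention stated in this section (1 is high then low, 0 is low then high); the function names and the use of the integers 1 and 0 for high and low levels are assumptions made for the example.

```python
def manchester_encode(bits):
    """Return two half-bit signal levels per data bit: 1 -> (1, 0), 0 -> (0, 1)."""
    levels = []
    for bit in bits:
        levels.extend((1, 0) if bit else (0, 1))
    return levels


def manchester_decode(levels):
    """Recover data bits from pairs of half-bit levels.

    A pair without a mid-bit transition is rejected, because the
    transition is what the receiver uses as its clock.
    """
    if len(levels) % 2:
        raise ValueError("expected an even number of half-bit levels")
    bits = []
    for first, second in zip(levels[0::2], levels[1::2]):
        if first == second:
            raise ValueError("missing mid-bit transition")
        bits.append(1 if (first, second) == (1, 0) else 0)
    return bits


data = [1, 0, 1, 1, 0]
signal = manchester_encode(data)
print(signal)                      # two signal intervals for every data bit
print(manchester_decode(signal))   # the original bits
```

Because every data bit occupies two signal intervals, the encoded stream is twice as long as the data, which is one way to see why the scheme is not the most efficient.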
Synchronous and Asynchronous Transmissions
Not all transmissions are a steady flow of characters. A transmission that consists of many starts and stops is an asynchronous transmission. Assume you are back in the 1960s, sitting at a dumb terminal connected to a mainframe computer. As you type, each character is transmitted to the computer over an asynchronous link. You pause and the transmission pauses. Because the systems operate in asynchronous mode, the receiver is not expecting a steady stream of bits. It waits for further transmission at any time and does not assume that the link has been disrupted when transmissions stop. In contrast, a synchronous transmission is characterized by a long string of bits in which each character in the string is demarcated with a timing signal. Both types of transmissions are commonly used to connect computer systems over telephone lines or other channels. The choice of one over the other depends on the installation. In fact, modems that provide asynchronous operation for users may switch to synchronous mode for extended transmissions.
Serial Interfaces
A standard interface is required to connect communication devices like modems to computers. The most common interface for modems is the EIA-232 standard, which was originally called RS-232. In this scheme, computers or other similar devices are called DTE (data terminal equipment) and devices like modems are called DCE (data circuit-terminating equipment). The interface connector has 25 pins that are wired through to the opposite connector. Each pin represents a channel on which data is transferred or a specific control signal is sent. For example, pin 4 is the request to send line and the DTE uses it to signal that it wants to transmit. Pin 5 is the clear to send line and the DCE uses it to indicate that it is ready to receive.
Transmission Media
There are a variety of transmission media including copper cable, fiber-optic cable, and unguided wireless techniques. Each has transmission characteristics that restrict data transmission rates. Some of these restrictions are imposed by the designers of the communication systems on which the cables are used, based on various factors such as a need to reduce signal emanation. Other restrictions are based on signal loss over distance or even curvature of the earth in the case of ground-based microwave transmission systems. Designers of communication systems take all of these factors into consideration when designing network systems such as Ethernet, token ring, FDDI (Fiber Distributed Data Interface), and others. Therefore, networks should be assembled within the standard specifications to avoid problems. Computer data can be transmitted over RFs (radio frequencies) in cases where wires are impractical. These RF transmissions take place between a transmitter and a receiver within a single room or across town. RF networks provide unique solutions for campus and business park environments where links are required across roads, rivers, and physical space (in general, where it is not practical to run a cable). Terrestrial microwave systems are commonly seen on the tops of buildings and towers everywhere. The telephone companies have built networks of microwave transmitters and receivers for the telephone network. Satellite communication systems provide another solution for long-distance communication.
The Telephone System
The telephone system has always been an integral part of data communications. If an organization needs 24-hour connections to remote sites, it can lease dedicated digital transmission lines from the telephone company or other service providers or it can take advantage of packet-switched networks. An emerging trend is to build VPNs (virtual private networks) over the Internet. This saves much of the cost of leasing long-distance lines.
Now the discussion moves up the protocol stack above the hardware level. The next layer up is commonly referred to as the data link layer. The primary purpose of the data link layer is to manage the flow of bits between systems that are connected to a transmission medium. It is helpful to think of water flowing through a hose. Once transmission starts, the physical network sends raw bits through the hose to the receiver. Interference and electrical problems can disturb an electrical transmission, just like a kink in a hose can disrupt the flow of water. Also, the “bit buckets” on the receiving end may fill up quickly and overflow before the receiving system can process the data. The data link layer can provide a mechanism for controlling the transmission of bits across the physical layer. If necessary, it can detect and correct errors in transmission and tell the sending system to slow down or stop sending data until the receiving system catches up. On the other hand, performing all these tasks in the data link layer can reduce performance, so many networks only rely on the data link layer for fast data transmission. Higher-level protocols in the transport layer handle error detection and recovery.
Framing
Framing provides a controlled method for transmitting bits across a physical medium and provides error control and data retransmission in the event of an error. It is helpful to think of a freight train. A block of bits is put into each frame and delivered to the destination. A checksum is appended so the frame can be checked for corruption. If a frame is corrupted or lost, only that frame needs to be re-sent, rather than the entire set of data. Frames have a specific structure, depending on the data link protocol in use. The frame structure for a popular data link protocol called HDLC (High-level Data Link Control) is pictured in Figure 9. Note that the information field is where data is placed, and it is variable in length. An entire packet of information may be placed into the information field. The beginning flag field indicates the start of the frame. The address field holds the address of the destination, and the control field describes whether the information field holds data, commands, or responses. The FCS field contains error-detection coding.

FIGURE 9: HDLC frame format
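As a rough illustration of the frame layout just described (flag, address, control, information, FCS, flag), the following Python sketch builds and checks a frame. It is a simplification, not a conforming HDLC implementation: the flag value 0x7E comes from the HDLC standard rather than from this text, bit stuffing is omitted, and the standard library's `binascii.crc_hqx` (a 16-bit CRC-CCITT) stands in for the exact HDLC FCS calculation. The function names are hypothetical.

```python
import binascii

FLAG = 0x7E  # HDLC flag byte (01111110); bit stuffing is omitted in this sketch


def build_frame(address: int, control: int, info: bytes) -> bytes:
    """Wrap the information field with flag, address, control, and FCS fields."""
    body = bytes([address, control]) + info
    fcs = binascii.crc_hqx(body, 0xFFFF)  # 16-bit CRC used here as a stand-in FCS
    return bytes([FLAG]) + body + fcs.to_bytes(2, "big") + bytes([FLAG])


def parse_frame(frame: bytes):
    """Return (address, control, info), or raise ValueError if the frame is damaged."""
    if len(frame) < 6 or frame[0] != FLAG or frame[-1] != FLAG:
        raise ValueError("not a complete frame")
    body = frame[1:-3]
    fcs = int.from_bytes(frame[-3:-1], "big")
    if binascii.crc_hqx(body, 0xFFFF) != fcs:
        raise ValueError("FCS mismatch: frame corrupted")
    return body[0], body[1], body[2:]


frame = build_frame(0x03, 0x00, b"hello")
print(parse_frame(frame))

damaged = bytearray(frame)
damaged[4] ^= 0x01  # flip one bit inside the information field
try:
    parse_frame(bytes(damaged))
except ValueError as error:
    print("receiver would request retransmission:", error)
```

Only the damaged frame has to be sent again, which is the point of dividing a transmission into frames in the first place.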
Error Detection and Control
The data link layer is also responsible for error detection and control. One error control method is to detect errors and then request a retransmission. This method is easy to implement, but if errors are high, it affects network performance. Another method is for the receiver to detect an error and then rebuild the frame. This latter method requires that enough additional information be sent with the frame so the receiver can rebuild it if an error is detected. This method is used when retransmissions are impractical, such as a transmission to a space probe.
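The second method, in which the receiver rebuilds a damaged frame from extra information sent along with it, can be illustrated with a Hamming(7,4) code. The text above does not name a particular code, so the choice of Hamming(7,4) is an assumption made for this sketch: three parity bits are added to every four data bits, which is enough to locate and correct any single flipped bit.

```python
def hamming74_encode(data):
    """Encode 4 data bits as 7 bits laid out as p1 p2 d1 p4 d2 d3 d4."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(code):
    """Correct at most one flipped bit, then return the 4 data bits."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4   # 0 means no error was detected
    if position:
        c[position - 1] ^= 1          # repair the bit the syndrome points at
    return [c[2], c[4], c[5], c[6]]


sent = hamming74_encode([1, 0, 1, 1])
received = list(sent)
received[5] ^= 1                      # one bit is damaged in transit
print(hamming74_decode(received))     # the receiver recovers the data without a retransmission
```

The cost is visible in the encoding: seven bits are transmitted for every four bits of data, which is the "additional information" the text refers to.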
Flow Control
Finally, we get to flow control. As mentioned earlier, if a data transmission is like water flowing through a hose, some control is needed to prevent the bucket at the other end from overflowing. In this analogy, the bucket is the data buffer that the receiver uses to hold data until it can be processed. The buffers on some NICs (network interface cards) are large enough to hold an entire transmission until the processor can get to it. When buffers overflow, frames are usually dropped, so it is useful for the receiver to have some way to tell the sender to slow down or stop sending frames. Only one device can transmit on the network at a time, so a medium access control method is needed to provide arbitration. In the local area network environments defined by the IEEE, medium access protocols reside in a sublayer of the data link layer called the MAC (Medium Access Control) sublayer. The MAC sublayer sits below the LLC sublayer, which provides the data link control for any installed MAC drivers below it. The subdivision of the layers can be seen in Figure 10.

FIGURE 10: The data link layer consists of two sublayers: MAC (Medium Access Control) and LLC (Logical Link Control)
The MAC sublayer supports a variety of different network types, each of which has a specific way of arbitrating access to the network. Three different access methods are described here.
• Carrier sense methods: With this technique, devices listen on the network for transmissions and wait until the line is free before transmitting their own data. If two stations attempt to transmit at the same time, both devices back off and wait a random amount of time before retransmitting.
• Token access methods: A token ring network forms a logical ring on which each transmission travels around the ring from station to station. Only a station that has possession of a special token can transmit.
• Reservation methods: In this scheme, every transmitting device has a specific slot of time or frequency allotted to it. A device can choose to place data in the slot for transmission. This technique can waste bandwidth if a device has nothing to transmit.
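The "back off and wait a random amount of time" step of the carrier sense method can be sketched as follows. The doubling of the waiting window after each collision (binary exponential backoff, capped at 2^10 slots, with a limit of 16 attempts) is how Ethernet is commonly described, but it is an assumption here and is not stated in the text above; the function names are hypothetical.

```python
import random


def backoff_slots(attempt: int, rng: random.Random) -> int:
    """Pick a random wait, in slot times, from a window that doubles per collision."""
    return rng.randrange(2 ** min(attempt, 10))


def resolve_collision(rng: random.Random, max_attempts: int = 16):
    """Two colliding stations back off repeatedly until their waits differ.

    Returns (attempts needed, station that transmits first).
    """
    for attempt in range(1, max_attempts + 1):
        wait_a = backoff_slots(attempt, rng)
        wait_b = backoff_slots(attempt, rng)
        if wait_a != wait_b:
            return attempt, ("A" if wait_a < wait_b else "B")
    raise RuntimeError("too many collisions, transmission abandoned")


rng = random.Random(1)  # fixed seed so the run is repeatable
attempts, winner = resolve_collision(rng)
print(f"station {winner} transmits first after {attempts} backoff attempt(s)")
```

Because both stations draw their waits independently, the chance of colliding again shrinks as the window grows.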
Bridging
A bridge is a device that connects two network segments. In this discussion, the segments are IEEE 802.x LANs. A bridge can extend the distance of a LAN and can be used to split a shared LAN into two segments so there are fewer stations trying to share each segment of the medium. Bridges operate in the MAC sublayer of the data link layer, as shown in Figure 11. Note that two Ethernet networks are joined by a bridge. A frame from one Ethernet LAN enters one port of the bridge and exits out the other port for transmission on the adjoining Ethernet segment. The bridge forwards only those frames whose destination address is on the other segment, thus minimizing unnecessary frame deliveries.

FIGURE 11: Bridge operation
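The forwarding decision made by a two-port bridge can be sketched as a small table lookup. This example assumes a transparent "learning" bridge that records the port on which each source address was last seen and floods frames for destinations it has not yet learned; that learning behavior is an assumption for illustration and is not described in the text above. The class and station names are hypothetical.

```python
class Bridge:
    """Two-port bridge that filters frames staying on their own segment."""

    PORTS = (1, 2)

    def __init__(self):
        self.table = {}  # station address -> port on which it was last seen

    def receive(self, port, source, destination):
        """Return the list of ports on which the frame is sent out."""
        self.table[source] = port                      # learn where the sender lives
        out_port = self.table.get(destination)
        if out_port is None:
            return [p for p in self.PORTS if p != port]  # unknown destination: flood
        if out_port == port:
            return []                                  # same segment: do not forward
        return [out_port]                              # other segment: forward


bridge = Bridge()
print(bridge.receive(1, "A", "B"))  # B not yet learned, so the frame is flooded
print(bridge.receive(1, "B", "A"))  # A is on the same segment, so the frame is filtered
print(bridge.receive(2, "C", "A"))  # A is on the other segment, so the frame is forwarded
```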
Switching
As mentioned, a bridge can be used to split a LAN into two segments, which effectively makes two smaller shared segments. A switch is a device that expands on this concept. Whereas a traditional bridge has two ports to join two LAN segments, a switch has an array of ports for joining more than two segments. As shown in Figure 12, a hub is usually attached to a port. Then, only the workstations on that hub contend for access to the LAN segment. If a workstation needs to transmit to a workstation on another port, the switch will quickly set up a temporary connection between the ports so that all the workstations attached to the two ports share what is essentially a dedicated network segment. For example, in Figure 12, the switch could join segment A/B/C to G/H/I.

FIGURE 12: A switched network
The purpose of switching is to boost LAN performance by reducing the number of workstations on each LAN segment. The switch itself moves frames between ports at very high speeds, so it introduces very little delay to the network. The best performance is achieved with one workstation per port so that there is no contention at all when that workstation wants to transmit. The switch sets up a port connection between the sender and receiver for the duration of the transmission. Note that a switch operates in the data link layer of the OSI model. The industry refers to this as Layer 2 switching. The technique of dividing LANs is often called microsegmentation because a network can be split into smaller and smaller segments up to the point where a single port segment may be dedicated to a single computer. Most switching devices provide a way to configure VLANs (virtual LANs) as well. With a traditional hub, all the connected workstations are part of the same LAN segment. In a VLAN-capable network, workstations can be configured to belong to one or more logical LANs. For example, if the hubs in Figure 12 are replaced with VLAN-capable switches, workstations A and D could be configured into a VLAN, and workstations B, E, and H could be configured into another VLAN. Broadcasts from A are heard by D, and broadcasts from B are heard by E and H.
Routing, Internetworking, and the Network Layer
Only a few years ago, bridges were essential devices in corporate networks. Today, routers are more often selected because they provide a better way to connect the individual networks an organization may have installed over the years. Internetworking is all about joining networks with routers. Routers provide the following important services:
i) Limit broadcast traffic between networks and intelligently forward packets between networks.
ii) Provide a security barrier between networks (i.e., routers can filter traffic based on IP address, application, etc.).
iii) Provide connections to wide area networks.
iv) Provide a way to build a network with redundant paths, as shown in Figure 13.

FIGURE 13: Routers are used to build networks with multiple connection points and redundant paths
Routers join the autonomous networks of the Internet. Each individual network has its own network address as defined by the IP (Internet Protocol). What IP offers is a higher-level internetwork addressing scheme similar to the way U.S. ZIP codes provide a way to identify individual cities throughout the nation. In this analogy, each individual network attached to the Internet is like a city or town. Routers examine the IP address and determine the port on which to forward the packet. To understand the role of routers, it may be useful to consider how the Internet joins the autonomous networks of organizations throughout the globe. The TCP/IP addressing scheme is an important part of the Internet because it provides a way to assign a unique address to all the networks and hosts attached to it. Keep in mind that individual networks already have a MAC layer addressing scheme that identifies individual nodes on that network. IP identifies individual networks in an internetwork.
Transport Layer Services
The transport layer provides a unique service. It allows two systems to set up a “conversational” session with one another so they can reliably exchange data. The session achieves reliability because the transport layer processes in each system exchange messages about the status of the session. Figure 14 illustrates how a transport layer session is a logical end-to-end connection that spans intermediate devices like routers. The two peer transport layers appear to be talking to one another.

FIGURE 14: The transport layer can engage in end-to-end “conversations” across internetworks
IP, the network layer protocol, is a connectionless service, while the transport layer provides reliable connection-oriented services, in some cases over highly unreliable networks. For example, if a network link temporarily fails, a connection-oriented session does not immediately give up the connection, but attempts to keep it alive until the underlying link is reestablished or until a time-out occurs. Once the session is reestablished, data transmission continues from where it was interrupted. A connection-oriented session is actively monitored and dynamically managed to ensure proper delivery of data. While connection-oriented virtual circuits take time to set up, they are appropriate for lengthy “conversations” and data transmissions. In contrast, connectionless services like IP send datagrams to recipient systems without first notifying them. The recipient is expected to accept the datagrams and handle them as appropriate. If datagrams are lost, the recipient must detect that a packet is missing and request a retransmission from the sender. Interestingly, the Internet is based on IP, an unreliable protocol, but TCP adds reliability to the Internet.
The Application Layer
Applications that run at the highest level of the protocol stack are not really involved in communications, but they do use communication services and so have appropriate features and user interfaces that take advantage of the underlying network. Network file-sharing services like NCP (NetWare Core Protocol), NFS (Network File System) in the UNIX environment, or SMB (Server Message Block) in the Windows environment are specifically designed to use network services so that users can share files over networks. These systems are designed to work with most underlying networks.
In the OSI (Open Systems Interconnection) protocol stack, the network layer is layer 3, just above the physical layer and the data link layer. The physical layer is concerned with moving bits across a wire, while the data link layer is concerned with the point-to-point connection between two devices. The network layer is concerned with moving data across multiple data link connections, or put another way, moving data across multiple networks that are connected by routers. Assume you connect three networks together as shown in Figure 15. The three networks are interconnected with two routers. Traffic from network A must cross network B to reach network C. The illustration shows that there are three separate data link layer connections to perform this task. In contrast, network layer protocols are concerned with moving packets across multiple networks and provide an addressing scheme and routing functions for doing this. The network layer is connectionless, meaning that packets are addressed to a destination but no connection negotiation is performed in advance of sending packets. Transport layer protocols provide connection-oriented sessions.

FIGURE 15: Data link connections and network layer functionality
IP (Internet Protocol) is the most well-known network layer protocol. It provides an internetwork addressing scheme that allows devices to address packets to devices on other networks. Data link level addresses, which are typically the hardwired addresses on NICs (network interface cards) only work on the local LAN. To send a packet to a device on another LAN across a router, you need a higher-level addressing scheme. For example, in many small towns, it is possible for someone in the same town to address a letter to someone else in the town by putting only the street address on the envelope. However, a letter addressed to someone in another town will require the city, state, and ZIP code on the envelope. IP addresses can be compared to ZIP codes. ZIP codes provide a high-level addressing scheme that indicates a specific town in a vast web of towns across the country. See “Internetworking” and “IP (Internet Protocol)” for more information about internetworks and internetwork addressing.
The network layer is the “internetwork” layer of the protocol stack. It is involved with the topology of the internetwork and how all the subnetworks are connected together. This is the function of routers and routing protocols. A router connects two or more networks and runs a routing protocol that “discovers” the layout of the internetwork by exchanging routing information with other routers. The routers then determine which paths through the interconnected network are the best paths for sending a packet from a source to a destination. Network administrators can also get involved in specifying what these paths are because some may be preferable over others. For example, one path may be a low-speed link that is used for backup while a more preferable path provides high-speed data transmission.
Transport Protocols and Services
In the OSI protocol stack, transport protocols occupy layer 4, which is just above the network layer. Of all the layers, it could be said that the transport layer is the most important because it provides network applications with reliable data delivery services. In the TCP/IP protocol suite, TCP provides transport services while IP provides network services. In the Novell SPX/IPX protocol suite, SPX (Sequenced Packet Exchange) provides transport services while IPX (Internetwork Packet Exchange) provides network services. As pictured in Figure 16, the simplest model of a network consists of three layers, with an application layer at the top, a transport layer in the middle, and a network layer at the bottom. In this model, an application running in one computer communicates with an application running in another computer. The source application relies on the lower two layers to move messages, files, and other information to the application running in the other computer. Top-level applications are built by relying on the underlying services.

FIGURE 16: A basic protocol stack
The network layer is involved with actually transmitting information from one system to another. It deals with physical interfaces, cabling schemes, putting data in frames, and delivery of data across a series of point-to-point links (i.e., frames and router-connected internetworks). To use an analogy, if network layer protocols were airline systems, the transport layer protocols would be the air traffic controllers. While airplane pilots obviously have the ability to fly from one airport to another on their own, doing so on a busy flying day would be unsafe without the traffic controllers. Figure 16 illustrates a phone connection between transport service providers. This analogy is appropriate because the two layers engage in a conversation to make sure that data is reliably delivered. However, the “connection” is virtual because the messages that make up the conversation are not exchanged directly between the two layers but are put in packets and delivered as frames across the physical layer. The transport layer establishes connection-oriented sessions over which data is reliably transmitted during the period that the session is open. The following events take place during such a session:
i) Establish connections
ii) Negotiate session parameters
iii) Manage the transfer of data
iv) Terminate the connection
Establishing a connection is a simple matter of sending a connection request to the target host. If it is available, it responds with a connection acknowledgment message. The systems then negotiate session parameters such as timing, packet size, and syntax. A virtual circuit may also be established through a routed network so that each router along the way does not need to make a decision about how to route packets to the destination. The services provided by transport protocols are outlined here.
In general, these topics are described in more detail where mentioned, or you can refer to “TCP (Transmission Control Protocol)” for a description of how transport layer services are implemented in the Internet environment.
RELIABILITY SERVICES These provide error-recovery and retransmission mechanisms to ensure that data is delivered to the destination. If a packet is lost along the way, either the sender or the receiver must detect the loss and recover from it. If the sender is responsible for error recovery, then the receiver sends an ACK (acknowledgment) back to the sender when it receives packets. If the sender does not receive an ACK within a period of time, it may assume the receiver never received the packet and send another one. If the receiver is responsible for detecting lost packets, it reads the sequence numbers in packets to determine if a packet is missing. TCP uses a combination of these techniques, as described in “TCP (Transmission Control Protocol).”
SEQUENCING As mentioned above, adding sequence numbers to packets allows the receiver to detect missing packets. Sequence numbers are also used to reorder packets that arrive out of sequence. Packets may arrive out of order if they take different routes through an internetwork, where some routes cross slow links or links that are experiencing problems.
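A receiver's use of sequence numbers, both to put packets back in order and to notice a gap, can be sketched as follows. The function name and the (sequence number, payload) tuple format are assumptions made for this example; real transport protocols such as TCP number bytes rather than whole packets.

```python
def deliver_in_order(arrivals, next_seq=0):
    """Reorder packets by sequence number and report any that are still missing.

    `arrivals` is an iterable of (sequence number, payload) pairs in the
    order they arrived. Returns (payloads delivered in order, missing numbers).
    """
    buffer = {}
    delivered = []
    for seq, payload in arrivals:
        if seq < next_seq or seq in buffer:
            continue                      # duplicate of something already seen
        buffer[seq] = payload
        while next_seq in buffer:         # release every packet that is now in sequence
            delivered.append(buffer.pop(next_seq))
            next_seq += 1
    highest = max(buffer, default=next_seq - 1)
    missing = [s for s in range(next_seq, highest + 1) if s not in buffer]
    return delivered, missing


# Packets 1 and 2 arrive swapped, and packet 4 never arrives.
arrivals = [(0, "a"), (2, "c"), (1, "b"), (5, "f"), (3, "d")]
delivered, missing = deliver_in_order(arrivals)
print(delivered)   # payloads handed to the application, in order
print(missing)     # sequence numbers the receiver would ask to have re-sent
```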
FLOW CONTROL Senders and receivers will not always have the same ability to receive and process packets. A sender can overflow a receiver with too many packets and still continue to send packets if it does not know about the overflow. Packets dropped in this condition will eventually need to be retransmitted. If the receiver can signal to the sender that it is overflowing, the sender can slow down or stop its transmissions and thus reduce the need to retransmit at a later time. See “Flow-Control Mechanisms” for more details.
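The effect of letting the receiver signal how much room it has can be illustrated with a toy model. In this sketch, which is an assumption for illustration rather than any specific protocol, the receiver advertises its free buffer space (a "window") and the sender never sends more than that per round; the class and function names are hypothetical.

```python
from collections import deque


class Receiver:
    """Receiver with a fixed-size buffer that drops packets when it is full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = deque()
        self.dropped = 0

    def window(self):
        """Free buffer space the receiver can advertise to the sender."""
        return self.capacity - len(self.buffer)

    def accept(self, packet):
        if self.window() == 0:
            self.dropped += 1             # overflow: the packet is lost
            return
        self.buffer.append(packet)

    def process(self, count=1):
        """The receiving application consumes up to `count` buffered packets."""
        for _ in range(min(count, len(self.buffer))):
            self.buffer.popleft()


def send_blind(packets, receiver):
    """Send everything at once, ignoring the receiver's state."""
    for packet in packets:
        receiver.accept(packet)
    return receiver.dropped


def send_with_window(packets, receiver):
    """Each round, send no more than the window the receiver advertises."""
    pending = deque(packets)
    rounds = 0
    while pending:
        for _ in range(min(receiver.window(), len(pending))):
            receiver.accept(pending.popleft())
        receiver.process()
        rounds += 1
    return receiver.dropped, rounds


blind_drops = send_blind(range(10), Receiver(capacity=3))
window_drops, rounds = send_with_window(range(10), Receiver(capacity=3))
print("dropped without flow control:", blind_drops)
print("dropped with flow control:", window_drops, "in", rounds, "rounds")
```

The windowed sender takes more rounds to finish, but nothing has to be retransmitted.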
FRAGMENTATION/REASSEMBLY When packets travel across internetworks, they may encounter networks that cannot handle large packet sizes. The router leading into such networks must fragment such packets and insert information in the fragmented packets that helps the receiver to reassemble them.
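Fragmentation and reassembly can be sketched with a few fields of bookkeeping. The field names below (an identifier, a byte offset, and a "more fragments" flag) are modeled loosely on what IP carries, but the dictionary layout, the byte-granular offsets, and the `mtu` parameter are simplifying assumptions for this example.

```python
def fragment(packet_id, payload: bytes, mtu: int):
    """Split a payload into fragments of at most `mtu` bytes each."""
    fragments = []
    for offset in range(0, len(payload), mtu):
        fragments.append({
            "id": packet_id,                        # ties fragments to one original packet
            "offset": offset,                       # where this piece belongs
            "more": offset + mtu < len(payload),    # False only on the final fragment
            "data": payload[offset:offset + mtu],
        })
    return fragments


def reassemble(fragments):
    """Rebuild the payload; raise ValueError if any fragment is missing."""
    ordered = sorted(fragments, key=lambda f: f["offset"])
    expected = 0
    for frag in ordered:
        if frag["offset"] != expected:
            raise ValueError("missing fragment")
        expected += len(frag["data"])
    if not ordered or ordered[-1]["more"]:
        raise ValueError("final fragment missing")
    return b"".join(frag["data"] for frag in ordered)


payload = bytes(range(50))
pieces = fragment(7, payload, mtu=16)
print(len(pieces), "fragments")
print(reassemble(list(reversed(pieces))) == payload)   # arrival order does not matter
```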
IP (currently IP version 4, or IPv4) is the underlying protocol for routing packets on the Internet and other TCP/IP-based networks. This section discusses IP in general and unicast IP, which is host to host. Multicast IP is a one-to-many transmission scheme, and it is discussed under “IP Multicast.” IP is an internetwork protocol. It provides a communication system that works across linked networks. In an internetwork, the individual networks that are joined are called subnetworks or subnets. An internetwork is pictured in Figure 17. A router joins two subnetworks (A and B) to create the internetwork (A/B).

FIGURE 17: IP lets you build internetworks and address hosts on the internetwork
Each subnetwork in this scheme can be different; for example, one subnetwork can be Ethernet while another can be token ring. Therefore, each subnetwork has its own MAC (medium access control) methods for putting information into frames and addressing those frames for transmission to other nodes on the same network. However, these frames cannot be reliably sent to other networks because those other networks probably use different framing formats, access methods, and addressing schemes. IP provides a universal way of packaging information for delivery across network boundaries. Whereas frames are used to transmit information on subnetworks, IP datagrams are the “envelopes” for transmitting information across the internetwork. But datagrams do not replace frames. Frames are the only way to transmit across subnetworks. As datagrams cross a subnetwork, they “piggyback” a ride in the frames of that subnetwork. Upon arrival at a router, the datagrams are removed from the frames and repackaged into the frame type of the next network. When a router extracts a datagram from a frame, it looks at the destination IP address and then makes a decision about where to route it. If the destination IP address matches a host on the next network, the datagram is put in a frame and addressed to that host. Otherwise, the datagram is put in a frame and addressed to the next router that will get the datagram to its destination. During this process, ARP (Address Resolution Protocol) is used to resolve IP addresses into MAC addresses, if necessary. This process is pictured in Figure 18 and outlined here. For simplicity, the numeric IP addresses of networks, hosts, and routers are replaced with abbreviated letters. (For example, the source resides on subnet A and is called A1, and router A/B connects subnets A and B.)
1. At the source (A1), a datagram is created with the IP address of the destination host (C1).
2. Software in A1 determines that the IP address is for a host on another subnetwork. Therefore, it must be sent to router A/B to reach that network. The software puts the datagram in a frame and inserts the MAC address of router A/B.
3. The frame arrives at router A/B on port A. The datagram is extracted and the IP address is inspected. The router determines that the destination can be reached through router B/C, so it puts the datagram in a frame type to match subnet B and attaches the MAC address of router B/C.
4. At router B/C, the frame arrives on port B. The datagram is extracted and the IP address is inspected. The router determines that the host is attached to subnet C, so it does a table lookup to resolve the IP address into a MAC address. The router then puts the datagram in a frame, attaches the MAC address of destination C1, and transmits the frame on the network.
5. Host C1 sees the frame on the network as being addressed to it, accepts the frame, and processes it.

FIGURE 18:IP packets across mixed networks
Note that the path between the source and destination is not a straight-through circuit. The path is a series of individual data links first between the source and its local router, then router to router, then router to destination. These links are handled by data link protocols associated with the underlying networks.
Routers are responsible for determining the next hop that will get a packet to its destination, not the complete path to the destination. This is like getting directions—a person may point you in the right direction at an intersection. At the next intersection, another person points you in the right direction. Eventually, you get to where you wanted to go. In some cases, you might be pointed in a direction that avoids construction or congestion. On a large meshed network consisting of many possible paths, routers can do the same thing for packets, helping them to avoid downed or congested links.
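The hop-by-hop behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `NEXT_HOP` table, the `subnet_of` naming rule, and the `trace` function are hypothetical names invented here, and the labels A1, A/B, B/C, and C1 are borrowed from the walkthrough. Each node knows only the next hop toward a destination subnet, never the complete path.

```python
# Hypothetical topology from the walkthrough: subnets A, B, and C are joined
# by routers A/B and B/C. Each node maps a destination subnet to a next hop.
NEXT_HOP = {
    "A1":  {"A": "deliver", "B": "A/B", "C": "A/B"},
    "A/B": {"A": "deliver", "B": "deliver", "C": "B/C"},
    "B/C": {"A": "A/B", "B": "deliver", "C": "deliver"},
}


def subnet_of(host):
    """Return the subnet letter of a host label such as 'C1' (illustrative rule)."""
    return host[0]


def trace(source, destination):
    """Return the sequence of data-link hops a datagram takes."""
    hops = []
    current = source
    while current != destination:
        decision = NEXT_HOP[current][subnet_of(destination)]
        # "deliver" means the destination sits on a directly attached subnet,
        # so the frame is addressed to the destination host itself.
        next_node = destination if decision == "deliver" else decision
        hops.append((current, next_node))
        current = next_node
    return hops


print(trace("A1", "C1"))
```

Each tuple in the result corresponds to one frame on one subnet; the datagram inside the frame is the same at every hop.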
Note that IP is a connectionless service, unlike the higher-level TCP protocol, which is connection oriented. IP does its best to deliver packets, but they may be dropped or lost. It is up to end systems to recover those packets and provide other service features such as flow control and packet sequencing by using TCP. See “TCP (Transmission Control Protocol)” for more details.
IP Addressing and Host Names
There are three ways to identify a host computer system in a TCP/IP network environment: the physical address, the IP host address, or the domain name. The physical address is the MAC address that is hardwired into network interface cards. It is used for LAN addressing, not internetwork addressing. The IP host address identifies a specific host on an IP internetwork. The domain name provides an easily recognized name for a host on an IP internetwork. While humans use domain names, they are resolved into IP addresses by DNS (Domain Name System) for general addressing on IP internetworks. In February of 1997, the IAHC (International Ad Hoc Committee) announced seven new gTLDs (generic top-level domains), in addition to the existing ones (.com, .net, and .org), under which Internet names may be registered. The new fields are as follows:
.firm Businesses or firms
.store Businesses offering goods
.arts Culture and entertainment
.rec Recreational entertainment
.info Information services
.web Entities related to the Web
.nom For individual or personal nomenclature
An IP address is a numeric address that uniquely identifies a host system on an internetwork. The address is a 32-bit (4-byte) binary number (called the address space) that contains two important pieces of information:
i) Network identifier Indicates the network (a group of computers).
ii) Host identifier Indicates a specific computer on the network.
An Internet address uses the dotted-decimal notation format similar to the following, in which a period separates each byte of the 32-bit address:
192.100.10.5
When you connect to the Internet, you must obtain an IP address from the InterNIC (http://www.internic.net). The address you are assigned is just the network identifier portion. You are responsible for assigning host identifiers.
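To make the dotted-decimal notation concrete, the following minimal sketch converts between the dotted form and the underlying 32-bit number. The function names `dotted_to_int` and `int_to_dotted` are made up for this illustration.

```python
def dotted_to_int(address):
    """Convert a dotted-decimal string such as '192.100.10.5' to a 32-bit integer."""
    parts = [int(p) for p in address.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError("expected four byte values separated by periods")
    value = 0
    for p in parts:
        value = (value << 8) | p  # shift in one byte at a time, left to right
    return value


def int_to_dotted(value):
    """Convert a 32-bit integer back to dotted-decimal notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


value = dotted_to_int("192.100.10.5")
print(format(value, "032b"))   # the address as 32 binary digits
print(int_to_dotted(value))    # back to dotted-decimal form
```

Python's standard `ipaddress` module performs the same conversion; the manual version is shown only to expose the byte-by-byte layout.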
IP Address Classes
The 32-bit IP address space is divided into two parts, with the left part identifying a particular network and the right part identifying a host on that network. There are three ways to split the address (after the first byte, the second byte, or the third byte), as pictured in Figure 19, forming class A, class B, and class C addresses. What is the significance of splitting the IP address in this way? First, it creates many millions of possible addresses out of the rather limited 32-bit address space. Basically, three addressing schemes are derived from the 32-bit scheme, but all can be used over the Internet. Second, the different classes support organizations of different sizes, as will become clear. The IP address classes are described here:
i) Class A The first bit set as 0 identifies class A. The next 7 bits define the network address, and the remaining 24 bits identify hosts. The 7-bit network address space allows 126 usable network addresses (network numbers 0 and 127 are reserved), and the 24-bit host address space identifies 16,777,214 hosts per network.
ii) Class B The first 2 bits set as 10 identify class B. The next 14 bits define the network address, and the remaining 16 bits identify hosts. This scheme defines 16,382 networks and 65,534 hosts per network.
iii) Class C The first 3 bits set as 110 identify class C. The next 21 bits define the network address, and the remaining 8 bits identify hosts. This scheme defines 2,097,150 networks and 254 hosts per network.
FIGURE 19:IP address classes
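The class of an address can be read from the leading bits of its first byte. The sketch below assumes the standard classful prefixes (0 for class A, 10 for class B, 110 for class C, 1110 for class D, and everything else class E); the function name `address_class` is hypothetical.

```python
def address_class(address):
    """Return the classful category ('A' through 'E') of a dotted-decimal address."""
    first_byte = int(address.split(".")[0])
    if first_byte & 0b10000000 == 0b00000000:   # leading bit 0
        return "A"
    if first_byte & 0b11000000 == 0b10000000:   # leading bits 10
        return "B"
    if first_byte & 0b11100000 == 0b11000000:   # leading bits 110
        return "C"
    if first_byte & 0b11110000 == 0b11100000:   # leading bits 1110 (multicast)
        return "D"
    return "E"                                  # leading bits 1111 (reserved)


for example in ("10.1.2.3", "128.10.50.25", "192.100.10.5", "224.0.0.1"):
    print(example, "is class", address_class(example))
```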
A class D scheme also exists for multicasting. The first 4 bits identify the class, and the remaining 28 bits refer to a group of hosts, all of which receive the same IP packet. Refer to “IP Multicast.” A class E is also defined for future use. Unfortunately, most class A network schemes were assigned to U.S. government agencies and large companies early in the history of the Internet, so if you have a network with 16 million hosts, you're out of luck! The class B scheme provides for over 16,000 networks, but these addresses are also allocated. Only class C addresses are still available (as of this writing). Some organizations will find the 254-host limit a bit restrictive, but as discussed shortly, there are ways to get around this problem. Also note that IPv6 will alleviate some of these addressing problems, as discussed later. When you configure a host or router with an IP address, a subnet mask must also be specified. The subnet mask basically serves as a sort of template to indicate which part of the address defines the network and which part defines the host. The subnet masks for the different classes of networks are shown in the following table, along with the binary equivalent:
| Class | Subnet Mask (Decimal) | Subnet Mask (Binary) |
| Class A | 255.0.0.0 | 11111111 00000000 00000000 00000000 |
| Class B | 255.255.0.0 | 11111111 11111111 00000000 00000000 |
| Class C | 255.255.255.0 | 11111111 11111111 11111111 00000000 |
Binary 1s mask out the network address to reveal the host address. As an example, a class B address of 128.10.50.25 and a class B subnet mask of 255.255.0.0 are shown in the following table. If you put the subnet mask over the IP address, the 1s basically mask out the first two bytes and reveal the host address in the last two bytes. Routers are interested in the network address, so they reverse the process to extract the network portion of the IP address.
| Class B address | 128.10.50.25 | 10000000 00001010 00110010 00011001 |
| Class B subnet mask | 255.255.0.0 | 11111111 11111111 00000000 00000000 |
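The masking operation in the table is a bitwise AND. The following sketch uses Python's standard `ipaddress` module to split the example address into its network and host portions; the helper name `split_address` is an assumption made for this illustration.

```python
import ipaddress


def split_address(address, mask):
    """Return (network part, host part) of an IPv4 address as dotted strings."""
    addr_bits = int(ipaddress.IPv4Address(address))
    mask_bits = int(ipaddress.IPv4Address(mask))
    network = addr_bits & mask_bits                 # keep bits where the mask is 1
    host = addr_bits & ~mask_bits & 0xFFFFFFFF      # keep bits where the mask is 0
    return str(ipaddress.IPv4Address(network)), str(ipaddress.IPv4Address(host))


network, host = split_address("128.10.50.25", "255.255.0.0")
print("network:", network)   # 128.10.0.0
print("host:   ", host)      # 0.0.50.25
```

A router performs the first AND to find the network; the second expression is the complementary view that exposes the host bytes.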
As mentioned previously, class C addresses restrict the number of hosts per network to 254. To get around this problem, a subnetting scheme was devised that basically divides the host portion of the address into two parts and uses some of the bits to identify subnetworks within your own network. However, there is a trade-off in doing this. If you use some of the bits in the host address to identify a subnet, then you reduce the number of bits that are available for host addressing. This is outlined in the following table. For example, if you split your network into two subnets, you can have 126 hosts per subnet. With 16 subnets, only 14 hosts are possible per subnet.
| Subnet Mask | Binary Value of Last Byte | Number of Subnetworks Allowed | Number of Hosts per Subnet |
| 255.255.255.128 | x.x.x.10000000 | 2 | 126 |
| 255.255.255.192 | x.x.x.11000000 | 4 | 62 |
| 255.255.255.224 | x.x.x.11100000 | 8 | 30 |
| 255.255.255.240 | x.x.x.11110000 | 16 | 14 |
Note how the last byte in the subnet mask adds binary 1s to the mask in the second column. In the first case, decimal 128 adds a single binary 1 to the last byte of the mask. This single bit is the subnet address space, but only two values are possible (binary 0 and 1), so only two subnets are allowed. In the second case, decimal 192 adds two binary 1s to the last byte of the mask. With two bits, four subnets are possible: 00, 01, 10, and 11. Alert readers might notice that the number of possible hosts is shy by two. This is because the first and last host values are reserved: the all-zeros value identifies the subnet itself, and the all-ones value is the subnet's broadcast address.
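The arithmetic behind the table can be reproduced with a short sketch. It assumes a class C network, so 8 bits are available to divide between subnet and host; the function name `subnet_row` is hypothetical.

```python
def subnet_row(subnet_bits, available_bits=8):
    """Return (last mask byte, number of subnets, hosts per subnet)."""
    host_bits = available_bits - subnet_bits
    mask_byte = (0xFF << host_bits) & 0xFF   # 1s on the left, 0s on the right
    subnets = 2 ** subnet_bits
    hosts = 2 ** host_bits - 2               # all-zeros and all-ones are reserved
    return mask_byte, subnets, hosts


for bits in range(1, 5):
    mask_byte, subnets, hosts = subnet_row(bits)
    print(f"255.255.255.{mask_byte}  {mask_byte:08b}  {subnets:2d} subnets  {hosts:3d} hosts")
```

Every bit moved from the host field to the subnet field doubles the number of subnets and roughly halves the hosts available in each one.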
IP Datagram
The IP datagram header, pictured in Figure 20, is the envelope in which data is transmitted. The datagram is sometimes referred to as a packet in general discussions. The datagram fields are described in the following list. Note that the maximum length of the datagram, including header and data, cannot exceed 65,535 bytes.
i) Version The version number of the protocol.
ii) IHL (Internet header length) Length of the header, in 32-bit words.
iii) Type of service The various levels of speed and/or reliability.
iv) Total length The total length of the datagram.
v) Identification If a datagram is fragmented, a value that identifies a fragment as belonging to a particular datagram.
vi) Flags DF (Don't Fragment) or MF (More Fragments). MF indicates whether more fragments follow; it is cleared on the last fragment.
vii) Fragment offset Where the datagram fragment belongs in the set of fragments.
viii) Time to live A counter that is decremented with every pass through a router. When 0, the datagram is discarded.
ix) Protocol The transport layer process to receive the datagram.
x) Header checksum Error detection for the header.
xi) Source address The IP address of the host sending the datagram.
xii) Destination address The IP address of the host to receive the datagram.
xiii) Options/padding Optional information and filler to ensure the header is a multiple of 32 bits.
xiv) Data The user data (a variable field, not shown in the figure).

FIGURE 20:IP datagram header
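As a rough illustration of the header layout, the sketch below packs and unpacks a 20-byte IPv4 header (no options) with Python's `struct` module. The function names `build_header`, `parse_header`, and `checksum` are invented for this example, and the default field values (TTL of 64, protocol number 6 for TCP) are arbitrary assumptions, not values taken from the text.

```python
import socket
import struct

# Version/IHL, type of service, total length, identification,
# flags/fragment offset, TTL, protocol, header checksum, source, destination.
IP_HEADER = struct.Struct("!BBHHHBBH4s4s")


def checksum(data):
    """16-bit one's-complement checksum over 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_header(src, dst, payload_len, ttl=64, protocol=6, ident=1):
    """Build a 20-byte header with a valid checksum (no options, no fragmentation)."""
    fields = [(4 << 4) | 5, 0, 20 + payload_len, ident, 0, ttl, protocol, 0,
              socket.inet_aton(src), socket.inet_aton(dst)]
    fields[7] = checksum(IP_HEADER.pack(*fields))   # computed with checksum field = 0
    return IP_HEADER.pack(*fields)


def parse_header(raw):
    """Decode the fixed part of an IPv4 header into a dictionary."""
    (version_ihl, tos, total_length, ident, flags_frag,
     ttl, protocol, _csum, src, dst) = IP_HEADER.unpack(raw[:20])
    return {
        "version": version_ihl >> 4,
        "header_bytes": (version_ihl & 0x0F) * 4,   # IHL counts 32-bit words
        "total_length": total_length,
        "identification": ident,
        "dont_fragment": bool(flags_frag & 0x4000),
        "more_fragments": bool(flags_frag & 0x2000),
        "fragment_offset": flags_frag & 0x1FFF,
        "ttl": ttl,
        "protocol": protocol,
        "checksum_ok": checksum(raw[:20]) == 0,     # a valid header sums to zero
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }


header = build_header("192.100.10.5", "128.10.50.25", payload_len=100)
print(parse_header(header))
```

Note that the checksum only detects corruption of the header; the data carried in the datagram is protected, if at all, by higher-layer protocols.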
IPv6 (Internet Protocol version 6)
IPv4 has served the Internet community well, but its limited address space is causing major problems as more and more hosts connect to the Internet. A partial solution was developed with the creation of CIDR (Classless Interdomain Routing), which allocates class C addresses as variable-size blocks. A block is a range of addresses (without excess addresses) appropriate for an organization's needs. This leaves addresses free for other users. Still, CIDR only buys time. The IETF began working on an IP protocol update in 1990. What was eventually hammered out was IPv6, which supports all the other Internet protocols but is not backward-compatible with IPv4. IPv6 is outlined in RFC 1883 and RFC 1887, which are available at the IETF Web site (http://www.ietf.org).
The most important feature of IPv6 is its longer address. It is 16 bytes (128 bits) long, compared to 4 bytes (32 bits) for IPv4! That will provide enough addresses to assign an IP address to every person and every conceivable device on the planet. Imagine your home entertainment system has an IP address. From your office, you could send a command to have it record a television show. Imagine that every person has their own personal IP address, allowing anyone to communicate with you anywhere. GPS (Global Positioning System) is a satellite system that lets a GPS receiver determine its position anywhere on the planet. GPS devices can be assigned IP addresses so they are locatable through the Internet. Imagine you're waiting for a city bus that is equipped with such a device. You open your portable Web browser, locate the bus system's Web page, and then display a map of the bus route. The current location of the bus is shown via information provided by GPS. You can take this a step further. Imagine implanting such GPS devices in criminals on probation so you always know their whereabouts. You could even implant them in kids to track them in case of a kidnapping.
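The difference in address space is easy to quantify. The short sketch below compares a 32-bit IPv4 address with a 128-bit IPv6 address; `address_count` is a hypothetical helper, and the `packed` attribute comes from Python's standard `ipaddress` module.

```python
import ipaddress


def address_count(bits):
    """Number of distinct addresses that fit in an address of the given width."""
    return 2 ** bits


print("IPv4 addresses:", address_count(32))
print("IPv6 addresses:", address_count(128))

# The packed (on-the-wire) forms show the size difference directly.
print(len(ipaddress.IPv4Address("192.100.10.5").packed), "bytes for IPv4")
print(len(ipaddress.IPv6Address("::1").packed), "bytes for IPv6")
```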
Another important feature of IPv6 is support for multimedia. Basically, source and destination can establish a path through the network that can provide guaranteed delivery of real-time audio and video. An extension scheme is also included so that senders can add custom information into a datagram. This will allow flexible expansion of the design as new requirements appear. There are many other changes in IPv6 when compared to IPv4.
TCP (Transmission Control Protocol)
TCP is a transport layer component of the Internet’s TCP/IP protocol suite. It sits above IP (Internet Protocol) in the protocol stack and provides reliable data delivery services over connection-oriented links. To put this into perspective, assume you are writing one of the many thousands of network applications that must exchange messages and data with other network computers. Your program should be able to make requests of lower-layer network protocols to have data delivered. At the same time, you should not need to write routines into your program that verify whether messages and data were received. This is a task that reliable protocols such as TCP perform. TCP in turn uses IP to deliver information across a network. Interestingly, IP is a connectionless network protocol that does not guarantee reliable delivery. While IP provides an efficient data delivery mechanism, TCP makes up for its deficiencies by providing reliable services. TCP messages and data (officially called segments, as described later) are encapsulated into IP datagrams and IP delivers them across the network. An interesting aspect of TCP is that early on, in the days when the Internet was still being defined, IP was not part of the design. During early development, Danny Cohen at USC argued that the connection-oriented features of TCP are unnecessary for some types of data transmissions and that they created excess overhead and traffic. He recommended splitting TCP to accommodate “timeliness rather than accuracy.” What was needed was a way to quickly get data to another system. Thus, TCP became TCP and IP. At that time, UDP (User Datagram Protocol) was also created to provide an alternative application interface to IP. Both TCP and UDP use IP. While UDP resides in the transport layer, it does not have any of the reliability features of TCP.
What it does have are fields in the header that identify a source and destination port address, which basically identifies a particular process (an application) running in the destination computer. Thus, an application can connect with an application in another system without using the reliability features of TCP. The original TCP was developed to interconnect many different types of computers at research institutes, universities, and government agencies. An encapsulation scheme was implemented because the designers did not want the owners of the various networks to alter their internal networking schemes to accommodate internetworking. It was assumed that every network would implement its own communication techniques. Routers (originally called gateways) provided this encapsulation service.
TCP Features Perhaps the most important characteristic of TCP is that it sets up end-to-end connections between two computers that need to exchange data. An end-to-end connection is virtual because it is created in software and extends across all the point-to-point connections that make up a typical router-connected network. This is shown in Figure 21. Note that point-to-point connections are between two physical systems such as a host to router or router to router, while an end-to-end connection is between the end systems of a communication session.

FIGURE 21:TCP establishes end-to-end connections over router-connected networks
An end-to-end connection does not simply terminate at the network interface. It actually extends up into the application layer to a specific process running on a computer. Each computer creates a socket, and the endpoints of the connection attach to this socket. Each socket has an address, called the port number. You can think of a socket as being like the telephones at either end of a phone call and the port as being like a phone number. Ports have specific addresses that are “well known” throughout the computer industry. For example, it is “well known” that Web servers operate at port 80, so Web clients always connect with this port when accessing Web servers. The Web client and server then set up a temporary end-to-end connection at this port to exchange data. The full IP address of a Web server running in a computer has the form x.x.x.x:80, but it’s usually not necessary to enter the port number. A connection must first be requested by the sender and granted by the receiver. This provides the first level of reliability by ensuring that the receiver is ready to receive data. It also points out how TCP manages data delivery. If an application were to pass data directly to IP for delivery, IP would simply start sending packets to the destination. But if the destination is off-line or busy, those packets will be dropped and IP by itself has no way to inform the application that the packets were not delivered. TCP manages this by starting with a simple connection request, which IP delivers. When the recipient responds, TCP then starts passing more information to IP for delivery, making sure that IP doesn’t go out of control. In this respect, one might think of TCP as a traffic controller for IP. Some of the other features that TCP provides are outlined here: i) TCP connections are full-duplex, two-way virtual channels that allow either end system to send data at any time. In this respect, a connection is really like two separate transmit and receive channels. 
Buffers are used to hold incoming and outgoing data so other activities are not held up by the communication process. ii) The receiver can acknowledge receipt of segments to provide assurance of delivery to the sender. This acknowledgment scheme is used in a number of ways, as discussed in a moment. iii) Flow control provides a way for two systems to actively cooperate in the transmission of data to prevent overflows and lost segments caused by fast senders. This feature lets transmitting systems quickly adapt to the traffic loads on the network and/or the available buffer size on the receiver. iv) Sequencing is a technique for numbering segments so the receiver can put them back into the correct order and determine if any are missing. v) A checksumming feature is used to ensure the integrity of packets.
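The sequencing feature can be illustrated with a small sketch of the receiving side. It assumes each arriving segment is represented as a (sequence number, payload) pair, where the sequence number is the position of the payload's first byte in the stream; the function name `reassemble` and this representation are assumptions made for the example, not a description of any particular TCP implementation.

```python
def reassemble(segments, first_seq):
    """Rebuild the byte stream from segments that may be reordered or duplicated.

    Returns (data, missing_at), where missing_at is the sequence number of the
    first gap that stopped reassembly, or None if every segment was consumed.
    """
    by_seq = {}
    for seq, payload in segments:
        by_seq.setdefault(seq, payload)     # ignore duplicate retransmissions

    data = bytearray()
    expected = first_seq
    while expected in by_seq:
        payload = by_seq.pop(expected)
        data += payload
        expected += len(payload)            # next byte the receiver is waiting for

    missing_at = expected if by_seq else None
    return bytes(data), missing_at


arrived = [(105, b"world"), (100, b"hello"), (100, b"hello")]
print(reassemble(arrived, first_seq=100))

with_gap = [(100, b"hello"), (110, b"!")]
print(reassemble(with_gap, first_seq=100))
```

In the second call the receiver holds data starting at 110 but is still waiting for byte 105, which is exactly the situation an acknowledgment number reports back to the sender.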
TCP Segments
A TCP segment is the official name for what is often loosely referred to as a packet (where a packet is some package of information). A segment is the actual entity that TCP uses to exchange data with its peers. The segment is what gets encapsulated into an IP datagram and transmitted across the network. Segments have a 20-byte header and a variable-length Data field. The fields of the TCP segment are described below and pictured in Figure 22. Keep in mind that either station may send a segment that contains just header information and no data to provide the other system with connection information, such as an acknowledgment that a segment was received.
i) Source and destination port: Contain the port numbers of the sockets at the source and destination sides of the connection.
ii) Sequence number: This field contains information for the receiver, which is a sequential number that identifies the data in the segment and where it belongs in the stream of data that has already been sent. The receiver can use the sequence number to reorder segments that have arrived out of order. It can also indicate that a segment is missing.
iii) Acknowledgment number: This field is used by the receiver to indicate to the sender in a return message that it has received a previously sent segment. The number in this field is actually the sequence number for the next segment that the receiver expects. That number is calculated by adding the amount of data received to the value in the Sequence Number field of the segment being acknowledged (during connection setup, the sequence number is simply incremented by 1).
iv) TCP header length: Specifies the length of the header.
v) Codes: This field contains the following bit codes, which serve as flags to indicate specific conditions:
URG (urgent): This bit is set to 1 if there is information in the Urgent Pointer field of the header.
ACK (acknowledgment): If ACK is set to 1, it indicates that the segment is part of an ongoing conversation and the number in the Acknowledgment Number field is valid. If this flag is set to 0 and SYN is set to 1, the segment is a request to establish a connection.
PSH (push): A bit set by the sender to request that the receiver send data directly to the application and not buffer it.
RST (reset): When set, the connection is invalid for a number of reasons and must be reset.
SYN (synchronize): Used in conjunction with ACK to request a connection or accept a connection. SYN=1 and ACK=0 indicates a connection request. SYN=1 and ACK=1 indicates a connection accepted. SYN=0 and ACK=1 is an acknowledgment of the acknowledgment.
FIN (finish): When set, this bit indicates that the connection should be terminated.
vi) Sliding-window size: Transmits information about how much space is available in the receiver’s buffers. This field is used by the receiver to inform the sender to slow down transmissions because the sender is sending data faster than the receiver can process it. If the receiver wants the sender to stop transmitting altogether, it can return a segment with 0 in this field. Later, when it can resume receiving data, it can send a segment with this field set to a nonzero value and an appropriate value in the Acknowledgment Number field to indicate which segment it needs.
vii) Checksum: Provides an error-checking value to ensure the integrity of the segment.
viii) Urgent pointer: This field can be used by the sender to indicate a location in the data where some urgent data is located.
ix) Options: A variable-length field set aside for special options.
x) Data: A variable-length field that holds the messages or data from applications.

FIGURE 22:The TCP segment layout
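The 20-byte segment header can be sketched with Python's `struct` module in the same way as the IP header. The names `build_segment_header` and `parse_segment_header` are hypothetical, and the checksum is left at zero here because the real TCP checksum also covers a pseudo-header and the data, which this illustration omits.

```python
import struct

# Source port, destination port, sequence number, acknowledgment number,
# header length + code bits, window size, checksum, urgent pointer.
TCP_HEADER = struct.Struct("!HHIIHHHH")

FLAGS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def build_segment_header(src_port, dst_port, seq, ack, flags, window):
    """Pack a 20-byte TCP header with no options (checksum left as 0)."""
    code_bits = 0
    for name in flags:
        code_bits |= FLAGS[name]
    offset_and_flags = (5 << 12) | code_bits   # header length = 5 words of 32 bits
    return TCP_HEADER.pack(src_port, dst_port, seq, ack,
                           offset_and_flags, window, 0, 0)


def parse_segment_header(raw):
    """Decode the fixed part of a TCP header into a dictionary."""
    (src_port, dst_port, seq, ack,
     offset_and_flags, window, csum, urgent) = TCP_HEADER.unpack(raw[:20])
    return {
        "source_port": src_port,
        "destination_port": dst_port,
        "sequence": seq,
        "acknowledgment": ack,
        "header_bytes": (offset_and_flags >> 12) * 4,
        "flags": {name for name, bit in FLAGS.items() if offset_and_flags & bit},
        "window": window,
        "checksum": csum,
        "urgent_pointer": urgent,
    }


# A connection request to a Web server: SYN=1, ACK=0.
syn = build_segment_header(49152, 80, seq=1000, ack=0, flags=["SYN"], window=8192)
print(parse_segment_header(syn))
```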
Establishing Connections The transport layer establishes connection-oriented sessions over which data is reliably transmitted during the period that the session is running. A connection is first established, data is transferred, and the connection is terminated. Establishing a connection is a simple matter of sending a connection request to the target host. If it is available, it responds with a connection acknowledgment message. The systems may then negotiate session parameters (or some parameters may be included in the Options field of the TCP segment). For example, a station may indicate in a connection setup message that it cannot handle payloads larger than 2,000 bytes. The opposite station may indicate that it cannot handle payloads larger than 1,000 bytes. The lower value is then negotiated. There are a number of other parameters that can be negotiated to improve the efficiency of transmissions. Connections are established using a three-way handshake mechanism, which takes place as follows:
1. Host A (the sender) sends a TCP segment to Host B with the SYN flag set to 1 and the ACK flag set to 0. A proposed initial sequence number is also inserted into the Sequence Number field of the header. This is the sequence number that Host A will use to send segments to Host B.
2. Host B stores the sequence number and returns a segment to Host A in which both the SYN and ACK flags are set to 1. To acknowledge that it has received Host A’s sequence number, Host B increments the sequence number by 1 and inserts it into the Acknowledgment field of the segment it returns to Host A. In addition, it inserts its own sequence number into the Sequence Number field. This is the sequence number that Host B will use to send segments to Host A.
3. Host A can now acknowledge to Host B that it received its acknowledgment. It sends a segment in which ACK=1 and SYN=0. It also increments the sequence number received from Host B by 1 and inserts it into the Acknowledgment field to indicate that it accepts B’s sequence numbering scheme.
After data is transmitted, the session is terminated. Host A sends a FIN=1 to Host B. Host B then responds with ACK=1 and FIN=1 and Host A responds to that with ACK=1.
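The sequence and acknowledgment numbers exchanged during the three-way handshake can be traced with a small sketch. The function name `three_way_handshake` and the dictionary representation of a segment are assumptions made for illustration; the initial sequence numbers are arbitrary.

```python
SEQ_SPACE = 2 ** 32   # sequence numbers are 32 bits wide and wrap around


def three_way_handshake(isn_a, isn_b):
    """Return the three segments exchanged when Host A connects to Host B."""
    # 1. Host A proposes its initial sequence number.
    syn = {"from": "A", "SYN": 1, "ACK": 0, "seq": isn_a, "ack": None}
    # 2. Host B acknowledges A's number (+1) and proposes its own.
    syn_ack = {"from": "B", "SYN": 1, "ACK": 1, "seq": isn_b,
               "ack": (syn["seq"] + 1) % SEQ_SPACE}
    # 3. Host A acknowledges B's number (+1).
    ack = {"from": "A", "SYN": 0, "ACK": 1, "seq": syn_ack["ack"],
           "ack": (syn_ack["seq"] + 1) % SEQ_SPACE}
    return [syn, syn_ack, ack]


for segment in three_way_handshake(isn_a=1000, isn_b=5000):
    print(segment)
```

After the third segment, each side knows the other's starting sequence number, which is what allows later segments to be ordered and acknowledged.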
TCP may have to deal with a number of connection parameters during the connection setup phase. One of these is to establish transmission delay parameters. Suppose Host A sends a segment to Host B and Host B returns an acknowledgment, but for some reason the acknowledgment does not arrive at Host A in a reasonable time. Host A must assume that Host B did not receive its transmission, so it retransmits the segment. In the meantime, the “lost” acknowledgment eventually finds its way to Host A and the retransmission arrives at Host B, which now has two of the same segment to deal with.
The amount of time that a sender should wait for an acknowledgment cannot be a fixed value because some links, such as satellites, have longer delays than others. TCP derives this value by measuring the time it takes to receive responses. It then estimates a round-trip delay value and uses this value to clock transmissions and acknowledgments for a connection.
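One common way to turn measured response times into a retransmission timeout is a smoothed average plus a margin for variation. The sketch below follows the style of the estimator standardized in RFC 6298, with smoothing factors of 1/8 and 1/4 and a timeout of the smoothed round-trip time plus four times its variation. The text above does not specify a formula, so treat these details, and the class name `RttEstimator`, as assumptions for illustration; the standard's clock-granularity and minimum-timeout rules are omitted.

```python
class RttEstimator:
    """Smoothed round-trip time estimator (simplified, RFC 6298 style)."""

    ALPHA = 1 / 8   # weight of a new sample in the smoothed RTT
    BETA = 1 / 4    # weight of a new sample in the RTT variation

    def __init__(self):
        self.srtt = None     # smoothed round-trip time, in seconds
        self.rttvar = None   # smoothed variation of the round-trip time

    def update(self, sample):
        """Fold in one measured round-trip time and return the new timeout."""
        if self.srtt is None:
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - sample)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * sample
        return self.timeout()

    def timeout(self):
        """Retransmission timeout: smoothed RTT plus a safety margin."""
        return self.srtt + 4 * self.rttvar


estimator = RttEstimator()
for measured in (0.100, 0.100, 0.600, 0.100):   # one slow reply among fast ones
    print(f"sample {measured:.3f}s -> timeout {estimator.update(measured):.3f}s")
```

A single slow response raises the timeout sharply and it then decays gradually, which keeps the sender from retransmitting segments whose acknowledgments are merely delayed.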
There are many other communication parameters that TCP must deal with in order to provide reliable services.