This document is intended to be a synthesis of all the things I learned at Wireless Battle Mesh v4. It is intended to be useful as a guide to a person wishing to start a wireless network in a community that has none.
Wireless Mesh Networks
XXX in this section, we describe what wireless networks are about, and the various reasons you might have for implementing one.
Purpose
When you are forming your organization, carefully consider your goals and priorities, as this will guide how your organization (and physical network) grow.
Wireless mesh networks can serve many purposes. Which are the most important to your organization? Do you want to give free Internet to people who cannot afford it? Do you just want to play with technology? Do you want to build an alternative, robust network so that some communication channels will stay open even in the case of censorship or (natural) disasters? Do you want to cover open spaces (parks and so on) so that people have access to the network/Internet while they are mobile? Do you want to cover rural areas? City areas? Do you want to cover as much area as possible or to make it as fast as possible (mesh vs. point-to-point)? Do you want to have an emphasis on connectivity (larger coverage) or throughput (faster)?
How will you finance everything? Donations? Everybody will participate with their own hardware? Will you sell some services/premium bandwidth to cover other costs? Will you sell technical support? Quality of service? Do you want to grow fast (probably need to have money) or are you willing to grow slowly, but be strictly non-commercial?
These are all decisions that your organization will have to make. Of course things are not black-and-white, so you don't need to have one rigid answer to each of these questions. Instead, think about these questions, and your priorities, as you grow your organization and network. Of course, along the way, you will get some opportunities for your less-prioritized ideas.
Hardware
Hardware Basics
Radio Channels
There are two primary radio bands that are available for civilian use: 2.4GHz and 5GHz. Equipment for these bands is easily available. In addition, there are some more exotic bands that may be used.
2.4GHz
Most equipment that people have operates in 2.4GHz. This band is split up into 11 channels (in the US; some regions allow 13), each about 20MHz in width. The channel centers are only 5MHz apart, so neighboring channels overlap with one another. It is therefore customary to use channels 1, 6, and 11, which are far enough apart that they do not overlap with each other.
There is a lot of noise on the 2.4GHz band, because it's explicitly designated for unlicensed consumer use. This is the band of wifi and microwave ovens.
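To see why 1, 6, and 11 are the customary choice, here is a minimal Python sketch of the channel arithmetic. It assumes the usual 2.4GHz numbering (channel n is centered at 2407 + 5n MHz, for channels 1 through 13) and treats each channel as a simple 20MHz-wide block, as quoted above. Real transmit masks are not perfectly rectangular, so take it as an approximation.

```python
def center_mhz(channel: int) -> int:
    """Center frequency in MHz of a 2.4GHz wifi channel (1 to 13)."""
    if not 1 <= channel <= 13:
        raise ValueError("channel must be between 1 and 13")
    return 2407 + 5 * channel


def overlaps(a: int, b: int, width_mhz: int = 20) -> bool:
    """True if two channels of the given width share any spectrum.

    Assumption: channels are modeled as flat blocks of width_mhz.
    """
    return abs(center_mhz(a) - center_mhz(b)) < width_mhz


if __name__ == "__main__":
    for a, b in [(1, 2), (1, 6), (6, 11)]:
        print(f"channels {a} and {b} overlap: {overlaps(a, b)}")
```

Channels 1, 6, and 11 are centered 25MHz apart, so each pair leaves a small gap between the blocks.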
When using wifi in 2.4GHz, it is important to pay attention to the channels you are using. Routers that can overhear each other should be set to different channels, to lessen interference and thus increase throughput. (Note that all nodes of one ad-hoc network must share the same channel in order to communicate, so this strategy only applies to routers that do not need to talk to each other directly over that radio. Likewise, in 802.11s mode, all meshed routers must be on the same channel, so you can't use this strategy to avoid interference.)
Side note: The four color theorem from mathematics proves that, given any map of countries, four colors are enough to color the countries such that no two adjacent countries share a color. A wifi map is only loosely analogous, since 2.4GHz has just three non-overlapping channels and radio coverage does not follow tidy borders, but the same idea applies: try to give neighboring routers different channels.
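The map-coloring idea can be sketched as a simple greedy channel assignment. This is only an illustration, not something I saw used at the Battle Mesh: the node names and the "who can overhear whom" table are made up, and the function picks, for each router, the channel that the fewest of its already-assigned neighbors use.

```python
CHANNELS = [1, 6, 11]  # the customary non-overlapping 2.4GHz channels


def assign_channels(neighbors):
    """Greedy channel assignment.

    neighbors maps each node name to the set of nodes it can overhear.
    Returns a dict mapping node name -> channel.
    """
    assignment = {}
    # Handle the most crowded nodes first; break ties by name.
    for node in sorted(neighbors, key=lambda n: (-len(neighbors[n]), n)):
        used = [assignment[m] for m in neighbors[node] if m in assignment]
        # Pick the channel used by the fewest already-assigned neighbors.
        assignment[node] = min(CHANNELS, key=lambda ch: (used.count(ch), ch))
    return assignment


# Hypothetical example: a, b, and c all overhear each other; d only hears c.
example = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

if __name__ == "__main__":
    print(assign_channels(example))
```

With only three channels, a dense neighborhood will eventually force two neighbors onto the same channel. The sketch then simply picks the least crowded one.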
5GHz
5GHz is the band of radar, and it is required that if you operate in 5GHz, you must avoid interfering with radar (usually handled automatically by the drivers).
Due to the noise on 2.4GHz, you might want to make higher-throughput links in 5GHz instead, which is quieter, but the equipment can be more expensive, and it can't interact with most people's machines. Therefore, if you use a 5GHz link for a high-throughput link, you will probably have to also provide a 2.4GHz one for people to actually connect to. (See #Point-to-point links).
However, some wifi cards in laptops can indeed work in 5GHz - mostly with 802.11n. At the time of writing, these cards are somewhat rare.
5GHz is likewise divided into channels, and it has more non-overlapping channels than 2.4GHz.
Exotic radio bands
900MHz has been used in some cordless phones. It may be difficult or impossible to find 900MHz equipment that can do wifi.
In the US, the switch from analog to digital TV freed up a bunch of "whitespace" in the TV spectrum (the VHF and UHF TV bands, below 1GHz). It may be possible to obtain a license to operate in this band, but it may be difficult to find equipment capable of speaking wifi on this band.
In general, if you use lower frequencies, it is possible for the radio signal to penetrate more material, but the signal will be lower throughput. Higher frequencies have less penetrating power, but have higher throughput.
You can create a data link in the spectrum of HAM radio or CB radio. To do this, you will want to create a full-duplex link. That is, each end of the link transmits on one channel, and receives on another. Use two separate antennas, physically separated. Combine the signals into one full-duplex signal, using a circulator. Then, you can plug this full-duplex audio signal into an ordinary computer modem, and speak PPP over it.
Layer 1 protocols
Commercially available hardware implements one or more of these three Layer 1 protocols: 802.11a, 802.11g, and 802.11n.
In 802.11a, g, and n, wireless radios can be set in one of these "Modes":
- Master - the node is a wifi access point. Clients will set their hardware to Managed mode to connect to it. Together, Master and Managed are referred to as "Infrastructure mode".
- Managed - the node is a wifi client. It will connect to exactly one Master node.
- Ad-Hoc - the node joins an ad-hoc group. All nodes that are in ad-hoc mode communicate with each other, given that they have the same essid. This is the traditional way to implement a mesh network, since Ad-Hoc mode is supported by all hardware and drivers, and does not require you to manually configure nodes to be masters or managed nodes.
- Mesh - There is a new protocol defined by IEEE, called "mesh mode", or 802.11s. See #802.11s.
802.11a
802.11a is one of the oldest wifi protocols. It always operates in 5GHz. Its maximum theoretical throughput is 54Mbit/s, the same as 802.11g, but lower than that of 802.11n.
802.11g
802.11g is the "normal" wifi protocol, spoken by all laptops and all cellphones. On almost all commercial routers, 802.11g is the only available protocol. The maximum theoretical throughput of this protocol is 54Mbit/s. It can only operate on 2.4GHz.
802.11n
802.11n is a new protocol. It is available on some commercial routers, few laptops, and no cellphones. It has several features for increasing throughput, one of which is MIMO (multiple-in multiple-out), a way of using "spatial coding" so that you can transmit more data with two antennas than you can with one.
The maximum possible throughput of 802.11n on a normal 20MHz channel is about 150Mbit/s for the two-antenna devices that are commonly sold. However, it is possible to use 802.11n devices in HT40- or HT40+ mode, where you transmit on a channel that is 40MHz in width, rather than the usual 20MHz. Since this occupies quite a bit of the available band, the law requires that you scan the channel for interference before initiating an HT40 connection. Hardware and drivers implement this automatically. You should therefore only use HT40 if you can ensure that the band is clear, which makes it best suited to #Point-to-point links. If you use HT40, the maximum possible throughput is 300Mbit/s.
In general, 802.11n is split up into the same channels as 802.11g. (XXX Is this true? What channels are available in 5GHz? How does that compare with the channels on 802.11a?)
802.11s
This is the new "mesh mode", a layer 1 protocol explicitly designed for meshing. It is an IEEE standard. This contrasts with the "Ad-Hoc Mode" that is the traditional way to implement mesh networks.
(XXX It does the mesh routing on layer 1 rather than layer 2? It has some relationship with OLSR?)
Since this protocol is new, very few devices which implement it are commercially available, and the driver support is poor. In my experiments with the TP-Link TL-841ND, I saw latencies in excess of 900ms, even though the machines were in the same room and under okay interference conditions. The same machines, speaking 802.11g in "Ad-Hoc Mode", had latencies of around 60ms.
I admit, my knowledge of 802.11s is not as comprehensive as that of the other layer 1 protocols, because I got the impression that it was less useful, so I stopped researching.
Transmit Rate
Wifi radios usually have a range of bitrates they can transmit at. Usually, they are configured such that they will attempt to transmit at the highest bitrate, but will back off to lower bitrates if there is too much noise, or if for some reason there is low transmit quality.
Multicast packets, however, are usually transmitted at the lowest possible bitrate: 1Mbit/s. That's because lower-bitrate traffic will usually have less packet-loss. If we're broadcasting a packet, it's probably because we feel it's pretty important, and want everybody to be able to hear it, so that's why we decrease the bitrate.
But sending at 1Mbit/s means that protocol traffic needs more airtime, which becomes an issue especially in large networks, where there is more protocol traffic. In addition, a 1Mbit/s multicast rate means that link-quality detection based on the packet loss of protocol traffic will test the link at a quite "unrealistic" rate, leading it to accept links that are undesirable at any higher throughput.
Therefore, many community networks use a higher multicast rate, such as 6Mbit/s. This parameter is configurable in OpenWRT, e.g.:
- uci set wireless.@wifi-iface[0].mcast_rate=6000
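To get a feel for the airtime difference, here is a rough Python sketch. It only counts the time needed to send the payload bits at a given rate and ignores preambles, headers, and channel contention, so real airtime is higher. The 100-byte packet size is just an assumed example, not a measured protocol packet.

```python
def airtime_ms(payload_bytes: int, rate_mbit: float) -> float:
    """Milliseconds needed to send payload_bytes at rate_mbit (Mbit/s).

    Assumption: only the payload bits are counted, no per-frame overhead.
    """
    bits = payload_bytes * 8
    return bits / (rate_mbit * 1_000_000) * 1000


if __name__ == "__main__":
    for rate in (1, 6):
        print(f"{rate} Mbit/s: {airtime_ms(100, rate):.3f} ms per 100-byte packet")
```

Under these assumptions, the same protocol packet occupies the channel six times longer at 1Mbit/s than at 6Mbit/s.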
Antennas
There are a wide variety of possible antenna types. Two major considerations are: Band and directionality.
One minor detail to pay attention to is the connector type.
A final decision to be made is whether to build antennas yourself or buy commercially available ones.
Band
Antennas must be made specifically for the band they are intended to operate on. Thus, you can buy 2.4GHz antennas, and 5GHz antennas. When buying an antenna, pay attention to what band you want to operate that link on. (XXX Some antennas are available that operate in both bands.)
Directionality
Antennas are available with varying degrees of directionality. Commercial antennas advertise the spread in degrees -- for example, an antenna might say that it has 60 degrees of spread.
Omnidirectional
The antennas that come with commercial wifi routers are omnidirectional. Omnidirectional antennas are perfect for #Mesh nodes because they provide 360 degrees of connectivity. However, this means that they are also susceptible to interference from 360 degrees. You may consider placing artificial obstacles to prevent interference between nodes that use omnidirectional antennas.
The antennas that come with commercial routers do not have the highest possible sensitivity. More expensive replacement antennas can be purchased to increase the radius in which these nodes can communicate. This works best in a quiet area where interference is less of an issue.
You cannot use the 5GHz band in an omnidirectional fashion, because you will not be able to guarantee that you are not interfering with radar.
Directional
Directional antennas are sensitive mostly in one direction. This cuts down on the possible interference, but also decreases its ability to connect nodes (vs omnidirectional antennas). Thus, directional antennas are best for high-throughput #Point-to-point links.
Having more directionality on an antenna means that it will be effective at a longer range. However, the more directionality an antenna has, the more precisely it must be aimed, and the more likely it is that the aim will be disturbed by the wind or other environmental factors. In aiming directional antennas, you may want to use Horst, an ncurses-based tool for live-monitoring signal strength and watching the Layer 1 traffic for interference.
Directional antennas come in many types: Parabolic dish, flat panel, yagi-type (shaped like a cylinder).
Connector type
All antennas have a standard connector -- the so-called "N-type" connector. However, routers sold for home use often have a different connector with a smaller form factor. To connect an antenna to this, you will need an adaptor cable called a "pigtail". Make sure to find out the type of connector that your router uses, so that you can buy the right pigtail. You can also buy the connectors separately and solder stock cables onto them. This lets you make cables in the appropriate length for your setup.
Homemade antennas
There are many guides on the internet about how to build DIY antennas -- usually, directional antennas. There are guides on how to make cylinder-type antennas, as well as parabolic dishes using biquad antennas.
The DIY approach to antennas did not seem popular amongst the people at the Battle Mesh. They felt that the cost savings were small, so the labor required was not justified. They felt that there were affordable, high quality antennas available commercially. These antennas will be much more weatherproof than home-built ones. Furthermore, an improperly built antenna may very slowly damage the wireless chip.
Power Over Ethernet
A common scenario is that you have to deliver both power and ethernet to some device that is outside and far from an electrical outlet. It might be hard to deliver power over that distance. The solution is Power Over Ethernet. You can get devices that take DC and ethernet in, and have an output that is an ethernet cable with the power on it. Thus, you only have to drill one hole in the wall, and string one cable.
Routers sold for home use do not have any special provision for power over ethernet. Here, the power and ethernet must then be split when it gets to the router. Splitter devices are sold for this purpose.
Higher-end routers, though -- those that are intended for outdoor use -- may support Power Over Ethernet directly, (XXX allowing you to plug the powered ethernet cord directly into the ethernet port).
Types of links
If you are starting from nothing, you will need to purchase hardware. A network may be composed of two types of nodes -- point-to-point links and mesh nodes. Different types of hardware are recommended for each.
Point-to-point links
Point-to-point links may be required for two reasons:
- When you are linking two nodes that are distant from one another.
- To build a higher-bandwidth "backbone" in the network.
The same hardware is useful for both of these purposes.
The Ubiquiti Nanostation is an all-in-one system that includes a router, a flat-panel antenna which has good directionality, a weatherproof case, and power-over-ethernet. The case is built with the idea that it will be mounted on a pole by tying zip-ties to it. Nanostations can be purchased for around $70. Different models of the Nanostation are available for 2.4GHz and 5GHz, and you can choose which of the 802.11 protocols you want it to support (a, g, n). Find the machine that suits your needs and budget.
(an anecdote: As a test of its weatherproofness, a Nanostation was placed in a flowerpot over an entire winter, as snow and rain piled on top of it. The machine was left on and running the whole time, and did not fail to function. Heat from the machine melted some of the snow, and still the machine functioned perfectly.)
For longer-distance links, Ubiquiti sells a system in two parts: An Airgrid antenna can be purchased, that works for either 2.4GHz or 5GHz. This is a parabolic antenna that has a place to screw in a "Bullet" (another Ubiquiti product). The Bullet is the same machine as the Nanostation, but in a different case (it looks like a lightsaber hilt). Thus, the Bullet can be purchased in any configuration of 2.4 vs 5GHz, and 802.11{a/g/n}. Since this system comes in two parts, it is a little more expensive than the Nanostation, costing around $50 for each of the two components, for a total of around $100.
Note that, in order to make a successful point-to-point link, you will need one machine at each end of the link. Thus, the total cost of the link is double the amounts quoted above.
Software
Ubiquiti machines come with an operating system called "AirOS". It is built with BusyBox and Linux, but it is not based on any of the traditional open-source router distros (OpenWRT, DD-WRT, Tomato). Instead, Ubiquiti has rolled its own, and has written a proprietary web front-end. However, Ubiquiti machines do run OpenWRT perfectly well.
Is it a good idea to replace the stock AirOS with OpenWRT? Opinions vary. Some believe that it is a good idea to keep the original firmware, which is believed to be reliable and stable, and sufficiently feature-rich to provide for your needs. The theory goes that, in these high-bandwidth backbone links, you don't need the features that OpenWRT would provide, so installing it would only expose you to unnecessary risk (the risk that is inherent in changing anything).
The other theory is that OpenWRT should be on everything, so that you have absolute control over it.
In either case, it is agreed that the point-to-point links should not be running in Ad-Hoc mode. Instead, they should run in infrastructure mode (one end as Master, the other as Managed), with encryption. If the link is in ad-hoc mode, then any mesh nodes in range will cause the point-to-point link to divert its attention unnecessarily. Also, the routes should be set manually, rather than using a routing protocol. The route should be to the other side of the link -- any change in that could only be a mistake.
Mesh links
The mesh nodes are the heart of the mesh network, obviously. They are the key to spreading the mesh cheaply. These nodes can be made with normal commodity routers. In fact, you could even use routers that people already have, and leave them in place, if they happen to be placed sufficiently densely that they can see one another. You can just convince people to let you flash the mesh firmware onto their machines. (See #Politics).
For convenience, it is probably best to use the 2.4GHz band for the mesh nodes and provide 802.11g, since these are the nodes that people will connect to, and almost nobody has 5GHz and most people don't have 802.11n support.
If you need to buy new machines, one choice would be the very cheap TP-Link TL-841ND. This machine operates in 2.4GHz and has 802.11g and 802.11n. It has 4MB of flash, but the max size of the image is 3.6MB. It does not have Power Over Ethernet, and it is not weatherproof. You may choose to buy some random weatherproof box from the hardware store.
Another choice for mesh nodes is the Fonera machines. They are a bit more expensive. They are in 2.4GHz and provide 802.11g (XXX And maybe 802.11n). You can get ones that have USB. They have 8MB of flash, so they could have extra fancy gewgaws on there. However, it is difficult to flash them from the original firmware to OpenWRT. You need a JTAG->USB cable. (XXX They are weatherproof???) They do not have Power Over Ethernet.
Depending on the materials in the houses, and the distance between them, they may not be able to see each other from their positions in the houses. In that case, you may want to mount the machines outside, which is why you might think about weatherproof boxes.
Access point vs mesh
Almost all wifi clients support ad-hoc mode, but it may be inconvenient to have clients connect directly to the ad-hoc mesh, because the clients will not be speaking your routing protocols, so they may not be routable. Furthermore, the mesh nodes will not want to be pushing things out over dhcp -- you will want to manually configure the IP address of the node.
For these reasons, it would be convenient if a mesh node had two interfaces -- one for mesh and one to provide access for the clients.
However, it is possible to have more than one virtual interface share the same radio. These interfaces will have to share the same channel, of course. You can have one virtual interface for the mesh network, which can be in Ad-Hoc mode, and one virtual interface in Master mode, serving dhcp for the clients. The virtual interface for the mesh can hide its essid, so people don't try to connect to that as a client.
(XXX See this guide on setting up virtual interfaces).
Landing page
You may want to provide a Landing page on your network. That is, when users attach to a node for the first time, their web traffic will get intercepted and they'll be redirected to a page where they have to agree to your terms and conditions. If they agree, their MAC address will be added to a whitelist. This system is known as "Captive Portal".
The landing page would also be a great place to put propaganda about your network and your community. You might talk about your philosophy, or advertise services that are only available on your network (such as local bulletin boards or IRC channels).
The Freifunk firmware has some custom software that implements this feature. Another system that implements this is Coova.
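The redirect-then-whitelist flow described above can be sketched in a few lines of Python. This is not how the Freifunk firmware or Coova implement it. The class name, the "/terms" path, and the return values are all made up for illustration, and a real portal works at the firewall level rather than in a script like this.

```python
class CaptivePortal:
    """Toy model of a captive portal keyed on client MAC addresses."""

    def __init__(self, terms_page="/terms"):
        self.terms_page = terms_page  # hypothetical landing page path
        self.whitelist = set()

    def accept_terms(self, mac: str) -> None:
        """Called when a user agrees to the terms and conditions."""
        self.whitelist.add(mac.lower())

    def handle_request(self, mac: str, url: str):
        """Forward known clients, redirect everybody else."""
        if mac.lower() in self.whitelist:
            return ("forward", url)
        return ("redirect", self.terms_page)


if __name__ == "__main__":
    portal = CaptivePortal()
    print(portal.handle_request("AA:BB:CC:DD:EE:FF", "http://example.org/"))
    portal.accept_terms("AA:BB:CC:DD:EE:FF")
    print(portal.handle_request("AA:BB:CC:DD:EE:FF", "http://example.org/"))
```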
Hop Length
Throughput decreases very quickly with the hop length to a gateway. It decreases according to this table:
- 1 hop - 50% of the possible bandwidth.
- 2 hops - 25% of the possible bandwidth.
- 3 hops - <10% of the possible bandwidth.
- 4 or more hops - it stabilizes at 7% of the available bandwidth.
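As a quick planning aid, the table above can be turned into a small Python lookup. The fractions are taken straight from the table. The 3-hop value is treated as an upper bound (the table says "less than 10%"), and the 54Mbit/s figure in the example is just an assumed nominal link rate.

```python
# Fraction of the possible bandwidth left after a given number of hops.
HOP_FRACTION = {1: 0.50, 2: 0.25, 3: 0.10}  # 3 hops: upper bound
FLOOR_FRACTION = 0.07                        # 4 or more hops


def usable_bandwidth(link_mbit: float, hops: int) -> float:
    """Estimated bandwidth (Mbit/s) at a node 'hops' hops from a gateway."""
    if hops < 1:
        raise ValueError("hops must be at least 1")
    return link_mbit * HOP_FRACTION.get(hops, FLOOR_FRACTION)


if __name__ == "__main__":
    for hops in range(1, 6):
        print(f"{hops} hops: at most {usable_bandwidth(54, hops):.2f} Mbit/s")
```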
This is one reason why you might be interested in making some high-throughput point-to-point links, to reduce the number of hops required to reach a gateway.
Bufferbloat
This is tricky to put simply, although the problem is a simple one. Bufferbloat is the congestion-induced delay that occurs when buffers "larger" than a path's latency are filled up. That is, if a buffer is large enough such that it takes more milliseconds for a packet to flow into and out of that buffer than the number of milliseconds it takes for a packet to traverse the path between that buffer and the packet's destination, and that buffer is full, it will introduce extra delay into a network segment. Full buffers wreak havoc on networks because transport protocols don't handle them correctly. The solution, according to Jim Gettys, is to employ active queue management.
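The comparison in the paragraph above can be written out as arithmetic. The following Python sketch computes how long a full buffer takes to drain onto a link and compares that with the path latency. The buffer size (256 packets of 1500 bytes), the link rates, and the 50ms path latency are assumed numbers for illustration, not measurements.

```python
def queue_delay_ms(buffer_bytes: int, link_mbit: float) -> float:
    """Milliseconds for a full buffer to drain onto a link of link_mbit Mbit/s."""
    return buffer_bytes * 8 / (link_mbit * 1_000_000) * 1000


def is_bloated(buffer_bytes: int, link_mbit: float, path_latency_ms: float) -> bool:
    """True if a full buffer adds more delay than the path itself has."""
    return queue_delay_ms(buffer_bytes, link_mbit) > path_latency_ms


if __name__ == "__main__":
    buffer_bytes = 256 * 1500  # assumed: 256 full-size packets
    for rate in (1, 6, 54):
        delay = queue_delay_ms(buffer_bytes, rate)
        print(f"{rate} Mbit/s: {delay:.1f} ms, bloated: {is_bloated(buffer_bytes, rate, 50)}")
```

The same buffer that is harmless on a fast link can add seconds of delay on a slow wireless link, which is one reason fixed buffer sizes are a problem.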
Tools
When building a network, you will frequently need to use special diagnostic and management tools. This section describes the tools that are available and their use cases.
Tools to diagnose individual links
This section describes tools that are used to diagnose individual links, both at the Layer 1 and Layer 2 levels.
Horst
http://br1.einfach.org/tech/horst/
Horst is a diagnostic tool that displays Layer 1 information about the link. You can see the signal strength of the link. Use this when pointing your antenna. You can also see what frames your router can overhear. Use this to determine how quiet or loud a particular channel is. This will help you choose a frequency for this link.
This tool uses ncurses to display the information. It also has history and summary screens.
Horst is available in the OpenWRT repositories. It is not available in the Ubuntu repositories at this time, (XXX At least, not 10.04. Could it be available later?) but the source code is available, and compiles in Ubuntu just fine.
If you have Horst running on your router, you can either display the ncurses info screens over ssh, or you can run it in server mode, and run a local client to draw the ncurses screens (use this option if you're diagnosing a slow link, and the ncurses is updating slowly over ssh).
MTR
MTR (My Traceroute) is a tool that continuously runs traceroutes and shows the latency at each hop of the path. Use this if you have a slow link somewhere, and don't know which link it is.
In the Ubuntu repositories, the mtr package that is available presents a gtk interface. You might prefer the mtr-tiny package, which provides an ncurses interface. mtr can be run on your client machine, or run directly on the router.
The tools mentioned by Jim Gettys
(XXX the tools mentioned by Jim Gettys).
Tools to manage the network
It is fun and informative to have a map of where each node on the network is physically located, and also a map of the network topology. It is very useful for a network to have this software. This is a summary of the state of these tools.
Freifunk's approach
The Freifunk firmware has a Wizard built in that asks users to put in some registration information when they first set up their router. For example, they can pick their location from the map (openstreetmap). This data is sent to a (XXX central server) that is running a custom Django application to keep track of this information. Users of nodes can then look at their router's webpages and download a KML file generated by the Django app, and have it displayed in Google Earth or Google Maps (or any other application that understands KML).
The Freifunk firmware is available in the OpenWRT repositories. Freifunk's custom Django app is not available, and has been abandoned.
The Freifunk firmware also has network topology visualization tools, but they only work with OLSR.
nodewatcher
However, the maintainer of Freifunk's custom Django app has abandoned the project, and Freifunk is looking to transition to the nodewatcher software, developed for the wlan slovenia network in Slovenia.
nodewatcher (also a Django app) is a more comprehensive piece of software. It serves two purposes: To display network data and to manage the network. It is capable of building firmware for various configurations, over the web interface. It also has a network topology visualizer, which - again - currently only supports OLSR.
nodewatcher currently has a bunch of code that is specific to the wlan slovenia setup. However, the effort is underway to make nodewatcher useful for any network. To this end, they have set up a Trac server and a mailing list (in English). The code is currently available through a Mercurial repository. More information is available under http://dev.wlan-si.net/.
Routing Protocols
In a network, a major task is to maintain routing information, so that packets from one node can find their way to other nodes (perhaps gateways to other networks). These routes could be maintained by hand -- adding and removing entries to the kernel routing table -- but this would be an impossibly large amount of work for a human, especially in a wireless mesh network, where changing interference and environmental factors will cause the quality of links to change constantly. The routes should adapt quickly to changes in the environment.
Therefore, routing protocols have been developed that will add and remove routes automatically, in response to the changing environment.
Intra-network routing
There is a staggering array of intra-network routing protocols available for wireless mesh networks. The ostensible purpose of the Battle Mesh was to compare the effectiveness of each of these protocols.
The purpose of an intra-network routing protocol is so that the nodes of the network can communicate their views of the world to one another and determine the best path from any source to any destination, in the network.
For wired networks, you might choose to use one of the older routing protocols, such as OSPF, IS-IS, or RIP. These protocols are not optimal for wireless networks, however. This is because they attempt to find the shortest path, but do not take into account characteristics of wireless networks, such as the reliability of a link, or the throughput.
Therefore, a new generation of routing protocols has been developed, with wireless in mind. These include OLSR, Babel, Batman, and several Batman variants: BMX, a one-person fork of Batman, and Batman-adv, a Layer 2 routing protocol implemented as a kernel module that was recently accepted into the Linux mainline.
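To make the difference between hop-count routing and quality-aware routing concrete, here is a small Python sketch. It is not the algorithm of any particular protocol named above. It runs a plain shortest-path search twice over a made-up four-node network: once counting hops, and once using an assumed cost of 1 / delivery ratio per link (the expected number of transmissions, in the spirit of the ETX metric). The node names and delivery ratios are invented.

```python
import heapq


def best_path(links, src, dst, cost):
    """Dijkstra over an undirected graph.

    links maps (node_a, node_b) -> delivery ratio between 0 and 1.
    cost is a function turning a delivery ratio into a link cost.
    """
    graph = {}
    for (a, b), ratio in links.items():
        graph.setdefault(a, []).append((b, cost(ratio)))
        graph.setdefault(b, []).append((a, cost(ratio)))
    queue = [(0.0, src, [src])]
    seen = set()
    while queue:
        total, node, path = heapq.heappop(queue)
        if node == dst:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt, link_cost in graph.get(node, []):
            if nxt not in seen:
                heapq.heappush(queue, (total + link_cost, nxt, path + [nxt]))
    return None


def hop_count(ratio):
    return 1.0          # every link costs the same


def expected_transmissions(ratio):
    return 1.0 / ratio  # lossy links cost more (assumed metric)


# Hypothetical network: one direct but very lossy link, and a clean detour.
LINKS = {("A", "D"): 0.2, ("A", "B"): 0.9, ("B", "C"): 0.9, ("C", "D"): 0.9}

if __name__ == "__main__":
    print("hop count:", best_path(LINKS, "A", "D", hop_count))
    print("quality-aware:", best_path(LINKS, "A", "D", expected_transmissions))
```

The hop-count search picks the direct lossy link, while the quality-aware search prefers the three-hop detour over reliable links.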
Using multiple protocols
The good news is that you may not have to choose just one routing protocol. You can, for example, assign each node two addresses, each with a different IP prefix, and then have one routing protocol for each of the prefixes. This will help in case you are having problems with one routing protocol (perhaps it has created a routing loop or a black hole). The other protocol might have found a way to nodes that are not reachable over the first one.
However, running more routing protocols will cause more traffic to be generated on the network, in order to keep the tables synchronized. Also, running more routing protocols will cause more CPU load, to compute the latest routing tables. For these reasons, you might want to watch how many protocols you use.
Also note that Batman-adv cannot be used in conjunction with other protocols, because it is operating on Layer 2 rather than Layer 3, like the other protocols. Therefore, you can't use the trick of routing different prefixes. Batman-adv uses MAC addresses.
If you are running multiple protocols, though, you must eventually choose what prefix to send out to clients on DHCP. This will determine which protocol's routes the client will use when traversing your network. However, if users want, users can manually set their IP address to one managed by another protocol, and be routed over that protocol's routes.
But this method is mostly useful for experimentation and debugging.
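The two-prefix trick can be sketched with Python's ipaddress module. The prefixes and the pairing of prefix to protocol below are made up for illustration. Pick whatever private ranges and protocols fit your network.

```python
import ipaddress

# Hypothetical plan: one private prefix per routing protocol.
PREFIXES = {
    "olsr": ipaddress.ip_network("10.1.0.0/16"),
    "babel": ipaddress.ip_network("10.2.0.0/16"),
}


def node_addresses(node_id: int):
    """Give a node one address in each protocol's prefix."""
    return {proto: net[node_id] for proto, net in PREFIXES.items()}


def protocol_for(address: str):
    """Which protocol's routes will carry traffic to this address?"""
    ip = ipaddress.ip_address(address)
    for proto, net in PREFIXES.items():
        if ip in net:
            return proto
    return None


if __name__ == "__main__":
    print(node_addresses(5))
    print(protocol_for("10.2.0.5"))
```

A client that is handed an address from one prefix over DHCP is routed by that prefix's protocol. Manually switching to the node's address in the other prefix switches protocols.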
Inter-network routing
So far, I have discussed the routing of traffic inside the network. However, when you want to access resources that are not on your network, you need to route your packets to another network. This is what "internet" means.
BGP
On the Internet, each network is assigned an Autonomous System Number (ASN). An Autonomous System (AS) is a network that runs intra-network routing protocols and does not share this information with neighboring networks. In order for two networks to communicate, you will need an ASN from your regional authority, such as RIPE or ARIN. These cost $1300 per year.
To speak with neighboring networks, routers at the edge of each network speak BGP with each other -- the Border Gateway Protocol. An AS compiles lists of prefixes that it knows how to reach and exchanges these lists with other ASes at the edge routers.
Non-BGP
However, a small network may not be able to convince other ISPs to speak BGP with them, and may not be able to afford an ASN. The problem is that, in order for traffic to get routed back to your network, the remote server must be able to find you. If you are not speaking BGP, then you will not be able to advertise prefixes that are available on your network, so nobody will know how to route to you.
However, it so happens that we have another tool at our disposal, due to the way broadband is currently sold: NAT. Most people have one IP address for their house, and communicate to the outside only with NAT. Return communication from the remote server is routed to the house's IP address, where the house's router has stored the association of the outgoing port with the internal address of the machine. Thus, packets from the outside can find their way to machines inside the house. This only works, however, for connections that are initiated from inside the house. The world outside cannot find particular machines that are hidden behind NAT, and are using private, unrouteable IP addresses.
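The port bookkeeping that NAT does can be sketched as a pair of lookup tables. This is a toy model in Python, not how a real router implements it: the addresses are examples, the starting port number is arbitrary, and real NAT also tracks protocols, remote endpoints, and timeouts.

```python
class Nat:
    """Toy NAT: maps internal (address, port) pairs to ports on one public IP."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outgoing = {}  # (private_ip, private_port) -> public_port
        self.incoming = {}  # public_port -> (private_ip, private_port)

    def outbound(self, private_ip: str, private_port: int):
        """A machine inside opens a connection; return what the world sees."""
        key = (private_ip, private_port)
        if key not in self.outgoing:
            self.outgoing[key] = self.next_port
            self.incoming[self.next_port] = key
            self.next_port += 1
        return (self.public_ip, self.outgoing[key])

    def inbound(self, public_port: int):
        """A reply arrives; return the inside machine, or None if unknown."""
        return self.incoming.get(public_port)


if __name__ == "__main__":
    nat = Nat("203.0.113.7")
    print(nat.outbound("192.168.1.10", 5555))
    print(nat.inbound(40000))
    print(nat.inbound(40001))  # nobody inside opened this: no mapping
```

The last lookup shows the limitation described above: a packet from outside that does not match a connection opened from inside has nowhere to go.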
To use this method, you must take care when configuring any "gateway" routers -- that is, routers on your network that also have connections to the internet. You must configure your internal routing protocols so that the gateway routers advertise a route of 0.0.0.0/0, with the next hop being the gateway provided by the ISP.
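Why a 0.0.0.0/0 route does not swallow traffic meant for the mesh itself: routers pick the most specific matching prefix, and 0.0.0.0/0 is the least specific one possible. Here is a minimal Python sketch of that lookup. The mesh prefix and the next-hop labels are made up for illustration.

```python
import ipaddress

# Hypothetical routing table: (prefix, next hop label).
ROUTES = [
    (ipaddress.ip_network("10.1.0.0/16"), "mesh"),
    (ipaddress.ip_network("0.0.0.0/0"), "isp-gateway"),
]


def lookup(routes, destination: str):
    """Return the next hop of the longest (most specific) matching prefix."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in routes if dest in net]
    if not matches:
        return None
    return max(matches, key=lambda match: match[0].prefixlen)[1]


if __name__ == "__main__":
    print(lookup(ROUTES, "10.1.0.5"))       # stays inside the mesh
    print(lookup(ROUTES, "93.184.216.34"))  # leaves through the gateway
```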