Many IT administrators use proxies extensively in their networks; however, the concept of reverse proxying is slightly less common. So what is a reverse proxy? It refers to a setup where a proxy server like this is run in such a way that it appears to clients just like a normal web server.
Specifically, the client connects directly to the proxy, considering it to be the final destination, i.e. the web server itself; it is not aware that its requests may be relayed further to another server, possibly even an additional proxy. These ‘reverse proxy servers’ are also often referred to as gateways, although that term can have other meanings too. To avoid confusion we’ll avoid that description in this article.
In reality the word ‘reverse’ refers to the reversed role of the proxy server. In a standard proxy, the server acts as a proxy for the client: any request the proxy makes is made on behalf of a client request it has received. This is not the case in the ‘reverse’ scenario, because there the server acts as a proxy for the web server and not the client. The distinction can look confusing, as in effect the proxy forwards and receives requests from both the client and the server, but it is important. You can read RFC 3040 for further information on this branch of internet replication and caching.
A standard proxy is pretty much dedicated to the client’s needs: all configured clients forward all their requests for web pages to the proxy server. In a standard network architecture these proxies normally sit fairly close to the clients in order to reduce latency and network traffic. They are also normally run by the organisations themselves, although some ISPs will offer the service to larger clients.
A reverse proxy, by contrast, represents one or a small number of origin servers. You cannot normally access arbitrary servers through a reverse proxy, because it has to be configured to access specific web servers. Often these servers need to be highly available and the caching aspect is important; a large organisation like Netflix would probably have specific IP addresses (read this) pointing at reverse proxies. The list of servers that are accessible should always be available from the reverse proxy server itself. A reverse proxy will normally be used by all clients to access certain web resources; indeed, access by any other route may be completely blocked.
Obviously in this scenario it is usual for the reverse proxy to be both controlled and administered by the owner of the origin web server. This is because these servers are used for two primary purposes: one, to replicate content across a wide geographic area, and two, to replicate content for load balancing. In some scenarios they are also used to add an extra layer of security and authentication in front of a secure web server.
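To make the idea concrete, here is a minimal reverse-proxy sketch in Python. The client talks to this server as if it were the origin; each GET is silently relayed to a single configured backend. The origin address and port numbers are invented for the example, and a real deployment would handle headers, errors and other methods properly.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

# The hidden origin server; this address is an assumption for illustration.
ORIGIN = "http://127.0.0.1:9000"

class ReverseProxy(BaseHTTPRequestHandler):
    """Answers clients as if it were the origin, relaying each GET upstream."""

    def do_GET(self):
        # Relay the requested path to the origin and return its response.
        with urlopen(ORIGIN + self.path) as upstream:
            status = upstream.status
            body = upstream.read()
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the sketch quiet

# To serve: HTTPServer(("0.0.0.0", 8080), ReverseProxy).serve_forever()
```

The client never learns that the content came from another machine, which is exactly the behaviour described above.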
For many people, there is a very strong requirement to mask their true identity and location online. It might be for privacy reasons, perhaps to keep safe, or simply because you don’t want anyone to log everything you do online. There are other reasons too: using multiple accounts on websites, IP bans for whatever reason, or simple region locking – no, you can’t watch Hulu on your holidays in Europe. The solution usually revolves around hiding your IP address using a VPN or proxy as a minimum.
Yet the choice doesn’t end there; plain proxies are pretty much useless now for privacy and security. They’re easily detected when you log on and, to be honest, of very little use anymore. VPN services are much better, yet even here it’s becoming more complicated to access media sites, for example. The problem is that it’s no longer the technology that is the issue but the originating IP address. Addresses are classified into two distinct groups – residential and commercial – and most websites can detect which one you are using.
A residential IP address is one that appears to come from a domestic account assigned by an ISP. It’s by far the most discreet and secure type of address to use if you want to stay completely private. Unfortunately these addresses are difficult to obtain in any numbers and also tend to be very expensive. The bottom line is that the majority of people who hide their true IP address, for whatever reason, do it using commercial addresses rather than residential ones.
Most security systems can easily detect whether you are using a commercial or residential VPN service provider; how they use that information is less certain. At the bottom of the pile for security and privacy are the datacentre proxy servers, which add no encryption layer and are tagged with commercial IP addresses.
Do I really need a residential VPN IP address? That depends on what you are trying to achieve. For running multiple accounts on sites like Craigslist and iTunes, residential is best. If you want to access the US version of Netflix like this, then you’ll definitely need a residential address: Netflix last year filtered out all commercial addresses, which means that very few of the VPNs work anymore – and you can’t watch Netflix at work either.
If you just want to mask your real IP address, then a commercial VPN is normally enough. The security is fine and no one can detect your true location, although they can determine you’re not a home user if they check. People who need to switch IPs for multiple accounts or who use dedicated tools will probably be best advised to investigate the residential IP options.
Arguably the most important function of a web proxy, at least as far as performance is concerned, is on-demand caching: documents or web pages are cached when a client or application requests them. It’s important to remember that a document can only be cached if it has actually been requested by a user. Without a request it will not be cached, and indeed the proxy server will not even be aware of its existence.
This is a different method from the replication model typically used to distribute data and updates. Replication is more often used on larger, busier networks, where data is copied onto specific servers; this method is also known as mirroring and is also useful for sharing over the internet. One of the most common examples of mirroring is the distribution of a large software package: instead of a single server being responsible, duplicates are replicated onto multiple servers.
One of the best ways to increase performance is a method called round-robin DNS. This involves mapping a single host name to multiple physical servers. These servers must be assigned separate IP and physical addresses, and their addresses are distributed evenly among the requesting clients. When using the DNS method, clients are unaware of the existence of multiple servers because they appear as a single logical server.
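You can observe this from the client side with a standard resolver call. The sketch below (the function name is my own) asks for every address a name maps to and picks one at random, which is how load ends up spread across the physical servers behind a round-robin name:

```python
import random
import socket

def resolve_round_robin(hostname, port=80):
    """Look up every IPv4 address a name maps to and pick one at random.

    With round-robin DNS a single host name resolves to several physical
    servers; the client simply uses whichever address it is handed.
    """
    infos = socket.getaddrinfo(hostname, port, socket.AF_INET,
                               socket.SOCK_STREAM)
    addresses = [info[4][0] for info in infos]
    return random.choice(addresses), addresses
```

Running this against a round-robin name returns a different member of the address pool on different lookups, while the client code never needs to know more than the single logical host name.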
Most of the caching solutions used by proxies are centred around removing load from a specific server. When a proxy caches resources locally without mirroring or replication, the single origin server is still responsible for the content. The server’s role doesn’t change, but caching does reduce the number of network requests it has to serve. It also reduces the number of name lookups the server makes, which can otherwise introduce some latency.
Having caching enabled can improve the speed of server responses significantly. However, this depends largely on the sort of requests being made: imagine a proxy used specifically to obtain a Czech IP address and directly download a specific resource. Caching that resource locally would improve speed significantly as long as the content didn’t change much; the picture would be different for sites which stream audio or video and contain large amounts of multimedia content.
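The on-demand behaviour described above can be sketched in a few lines: the cache knows nothing about a resource until a client asks for it, and after the first request every subsequent hit is served locally. The class and its injectable `fetch` parameter are my own illustration, not any particular proxy’s API:

```python
import urllib.request

class OnDemandCache:
    """Cache a resource only after a client has requested it once."""

    def __init__(self, fetch=None):
        self._store = {}
        # fetch is injectable so the cache can be exercised without a network.
        self._fetch = fetch or (lambda url: urllib.request.urlopen(url).read())

    def get(self, url):
        if url not in self._store:          # cache miss: go to the origin
            self._store[url] = self._fetch(url)
        return self._store[url]             # cache hit: serve locally
```

Note that nothing is pre-populated: exactly as the article says, a document the proxy has never been asked for simply does not exist as far as the cache is concerned.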
The SSL tunneling protocol allows any proxy server which supports it to act as a tunnel for SSL-enhanced protocols. This feature is essential for normal web traffic, as SSL is increasingly being used to secure web requests which would previously have been sent in clear text. The client makes an initial HTTP request to the proxy and asks for an SSL tunnel. At the protocol level, the handshake to establish the SSL tunneling connection is fairly straightforward.
The connection is simple and in fact looks like virtually any other HTTP request; the only difference is that a new ‘CONNECT’ method is used. The format is also slightly different, as the parameter is not a full URL but rather the destination host address and port number, in the format 192.168.1.1:8080. The port number is always required with these connection requests, because the tunnel is generic and a default port cannot be assumed.
Once the client has received a successful response, the connection passes all data in both directions to the destination server. For the proxy server, most of its role in authentication and establishing the connection is over; its job is then limited to simply forwarding data across the connection. The final significant role for the proxy is to close the connection, which it will do when it receives a close request from either the client or the server.
Other situations where the connection will be closed mainly involve error status codes – for example, an error generated in response to failed authentication. Most proxies require some sort of authentication, especially high-quality US proxies such as this. The method might vary, from a simple username and password supplied via challenge and response, to pass-through authentication from a system like Active Directory or LDAP.
It’s interesting to note that the mechanism used to handle SSL tunneling is not actually specific to that protocol. It is in fact a generic technique which can be used to tunnel any protocol, including SSL. There is no actual reliance on any SSL support in the proxy, which can be confusing when you see people looking for ‘SSL enabled’ proxies online. It is not required on a properly configured proxy server: the data is simply transported, and there is no need for the actual protocol to be understood after the initial connection request.
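The client side of the handshake is small enough to show directly. The two helpers below (names are my own) build the CONNECT request exactly as described – host and port rather than a full URL – and check whether the proxy’s status line indicates the tunnel was opened:

```python
def build_connect_request(host, port):
    # The CONNECT parameter is host:port, not a full URL; the port is
    # mandatory because the tunnel is protocol-agnostic.
    return (f"CONNECT {host}:{port} HTTP/1.0\r\n"
            f"Host: {host}:{port}\r\n\r\n").encode("ascii")

def tunnel_established(response_bytes):
    # A 2xx status line means the proxy has opened the tunnel; anything
    # else (e.g. 407 Proxy Authentication Required) means it has not.
    status_line = response_bytes.split(b"\r\n", 1)[0]
    parts = status_line.split()
    return len(parts) >= 2 and parts[1].startswith(b"2")
```

After a successful response the client simply starts its normal TLS handshake over the same socket; the proxy never inspects those bytes, which is exactly why no SSL support is needed in the proxy itself.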
There are issues with some protocols passing through proxies; certain specialised protocols need more support than the standard tunneling mechanism offers. For example, for many years LDAP (Lightweight Directory Access Protocol) could not work across most common proxies. Some implementations support LDAP using SOCKS, although LDAP queries being cached can subsequently cause performance issues. Most protocols, however, work perfectly with this ‘hands-off’ tunneling mechanism, which you can see illustrated if you try to stream video through proxies like this, which used to circumvent the BBC iPlayer block abroad.
Most networks of any size need some sort of system for storing and managing their log files. Most network devices produce logs, and many of them contain lots of useful information. However, without a way of analysing and reporting on this data, it can simply become another system administration chore with little or no benefit.
One of the oldest methods of centralising these system messages and logs is a syslog server. Syslog messaging was originally used on UNIX systems for the logs produced by network devices, applications and operating systems. Most modern network devices can be configured to generate Syslog messages which can be picked up by a server. These messages are normally generated and then transmitted using UDP to a server running a Syslog daemon that accepts them.
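Sending messages to such a daemon is straightforward from most languages; as a sketch, Python’s standard library includes a `SysLogHandler` that speaks the classic UDP syslog transport described above (the server address and logger name here are placeholders):

```python
import logging
import logging.handlers

def make_syslog_logger(server, port=514):
    """Return a logger that ships records to a remote syslog daemon.

    SysLogHandler sends classic syslog messages over UDP by default,
    matching the daemon-on-port-514 setup described above.
    """
    handler = logging.handlers.SysLogHandler(address=(server, port))
    logger = logging.getLogger("netdevice")
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger

# Usage sketch (server address is an assumption):
# log = make_syslog_logger("syslog.example.internal")
# log.warning("default gateway unresponsive")
```

Anything logged through this logger arrives at the central server as a standard syslog datagram, so it can be stored, archived or matched against notification rules alongside messages from routers and switches.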
Over the years more and more devices have been created which can support and generate Syslog messages. Despite it being fairly old technology, many firms have started to move away from specialised technologies towards simply using a central Syslog server to receive, store and archive messages generated by network devices. These servers can also be used to create automatic notifications when specific critical events occur – for example, if an important default gateway becomes unresponsive. This means IT support personnel can be made aware of potential issues quickly, often before users are affected directly, or at least minimise downtime.
Although there are many other methods of sending and receiving system messages across a network, using Syslog has many advantages. For a start, it works directly with many reporting technologies, and almost all network devices support the Syslog message format. This is very important, because as soon as you have multiple logging formats you face the prospect of installing multiple log servers. This creates a hierarchy which can be difficult to support, especially for network support staff who need access to all logs in order to troubleshoot issues.
For example, if you have a RAS (Remote Access Server) configured to use a different messaging system from other devices in your network, you could miss vital pieces of information. Problems on these servers can go unnoticed, and so important devices can suffer longer periods of downtime. Many remote users rely on access through a good VPN service when travelling in order to connect back from remote networks.
If you do have devices which don’t support the Syslog standard and aren’t able to replace them, there are other options. You can use software like Microsoft’s Log Parser, which can convert many formats into a log message that Syslog can understand.
Author of a Polskie Proxy
There is little excuse for not installing an IDS (Intrusion Detection System) on your network; even the usual culprit of budget doesn’t apply. In fact one of the leading IDS systems, Snort, is available completely free of charge and is sufficient for all but the most complex network infrastructures. It is virtually impossible to effectively monitor and control your network, particularly if it’s connected to the internet, without some sort of IDS in place.
There are certain questions about the day-to-day operation of your network that you should be able to answer. Questions like the following will help you determine if you really have control over your network and its hardware:
- Can you tag and determine how much traffic on your network is associated with malware or unauthorised software?
- Are you able to determine which of your clients do not have the latest client build?
- Can you determine which websites are most frequently requested? Are these requests from legitimate users or the result of malware activity?
- Can you determine which users are the top web surfers (and whether that is justified)?
- How much mail are your SMTP servers processing?
It is surprising how many network professionals simply wouldn’t have a clue how to obtain this information from their network; however, it’s impossible to ensure that the network is efficient without it. For example, a few high-intensity web users can create much more traffic than the majority of ordinary business users. Imagine two or three users in a small department who used a working BBC VPN to stream TV to their computers eight hours a day. The traffic generated would be huge and could easily swamp an important network segment.
All security professionals should ensure that they have the tools and reporting capacity to answer simple questions like these about network usage. Knowing the answers will help you control and adapt your network to meet its users’ needs. Of course a simple IDS won’t provide the complete solution, but it will help keep control of your network. Malware can sit and operate for many weeks in a network which is not monitored properly, heavily impacting performance and enabling it to spread to other devices and eventually other networks. In environments where performance is important, being aware of these sorts of situations can make a huge difference.
Network Professional and Broadcaster, author of BBC News Streaming.
For many people, travel is becoming much easier, and as a species our geographical horizons are perhaps wider than ever. Inexpensive air travel and soft borders like the European Union mean that instead of just looking to work in another city or town, another country is just as viable. The internet of course enables this somewhat; many corporations have installed infrastructure to allow remote or home working, which means many people can work from wherever they wish. Instead of sitting in cubicles in vast, expensive office space, the reality is that people can work together just as easily using high-speed internet connections from home.
Unfortunately there are some issues with this digital utopia, most of which are self-inflicted. Instead of being a vast, unfettered global communications medium, the internet in some senses has begun to shrink – not so much in size but in the increasing number of restrictions, filters and blocks being applied to web servers across the planet. For instance, the company I work for has two main bases, one in the UK and the other in Poland, which means there is quite a bit of travel between the two countries. Not surprisingly, employees working away from home for some time use the internet to keep in touch with their home life, yet this can be frustrating.
A common issue is that many websites are not really accessible globally; they are locked to specific regions. Take for example the main Polish TV channel, TVN: it has a fantastic website and a media player through which you can watch all its shows. However, a Polish citizen who tries to watch the local news from Warsaw from a hotel in the UK will find themselves blocked; the content is only available to those physically located in Poland. It’s no one-off either: this behaviour is shared by pretty much every large media company on the web, which block access depending on your location.
There is a solution, and for our employees it’s actually quite simple: all they need to do is fire up their VPN client and remotely connect back to their home server in Poland. The instant they do this, their connection looks like it’s based in Poland and all the Polish TV channels work perfectly. There’s a post about something similar here – using a Polish proxy to watch TVN and some other channels – although that one uses a commercial service designed to hide your location. It’s a practice that is becoming increasingly necessary: the more we travel, the more we find our online access determined by our physical location.
The use of proxies and, more recently, VPNs allows you to break out of these artificial intranets which companies create by blocking access from other countries. The idea is that if you can switch between VPNs across the world, you can effectively take back control and access whatever website you need. Your physical location becomes unimportant again; by taking control of your virtual location, you have a huge advantage over other internet users in choosing the location you appear from. There are even other options now – take a look at this UK DNS proxy, which does something fairly similar and can be used to watch the BBC and Netflix from outside the UK.
Author of – Does BBC Iplayer Work in Ireland
In these times when security is becoming ever more important, the SSL tunneling protocol is extremely useful: it allows a web proxy server to act as a tunnel for SSL-enhanced protocols. The protocol is used when a connected client makes an HTTP request to the proxy server and asks for an SSL tunnel to be initiated. At the HTTP protocol level, the handshake required to initiate the SSL tunneling connection is simple. There is little difference from an ordinary HTTP request, except that a new ‘CONNECT’ method is used and the parameter passed is not a full URL but a destination hostname and port number separated by a colon.
The port number is always required with ‘CONNECT’ requests because the tunneling method is generic and no protocol is specified, hence default port numbers cannot be relied upon. The general syntax for the request is as below:
CONNECT <host>:<port> HTTP/1.0
[HTTP request headers, followed by an empty line]
The successful response is a ‘connection established’ message, followed by an empty line. After this the connection passes all data transparently to the destination server and passes back any replies. In practice the proxy validates the initial request, establishes the connection, and then takes a step back: from this point on it merely forwards data back and forth between the client and the server. If either side closes the connection, the proxy closes both connections, and no more tunneling takes place until a new connection is established between client and server.
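That ‘step back’ stage – shovelling bytes in both directions until either side hangs up – can be sketched as a small forwarding loop. The function below is my own illustration using Python’s standard `selectors` module, not code from any particular proxy:

```python
import selectors
import socket

def forward_until_closed(client_sock, server_sock):
    """Relay bytes both ways; tear down both sockets when either side closes.

    After the CONNECT handshake the proxy does no processing at all on
    the data, which is why the tunnel works for any protocol.
    """
    sel = selectors.DefaultSelector()
    peers = {client_sock: server_sock, server_sock: client_sock}
    for s in peers:
        sel.register(s, selectors.EVENT_READ)
    try:
        while True:
            for key, _ in sel.select():
                data = key.fileobj.recv(4096)
                if not data:          # one side closed: end the tunnel
                    return
                peers[key.fileobj].sendall(data)
    finally:
        for s in peers:               # close both halves together
            s.close()
```

Note that the loop never parses the traffic; it only moves bytes, exactly matching the article’s description of the proxy’s role after the handshake.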
The proxy does have the ability to respond with error messages within the SSL tunnel. If an error is generated in the initial stages, the connection is not established; if the tunnel is already connected, the proxy closes the connection after the error response has been sent. However, it is important to remember, especially where security matters, that this SSL tunneling protocol is not specific to SSL and provides no in-depth security itself. The tunneling mechanism is a generic one and can in fact be used for any protocol. This also means there is no requirement for the proxy to support SSL, as it merely establishes a connection and then forwards data without any processing.
BBC Iplayer Ireland – Here’s How you Can Watch
There is one technology normally associated with IP name resolution, and that’s DNS (Domain Name System), or Smart DNS – probably because it’s the dominant system on the internet. However, in the average corporate network you’ll find all sorts of alternative methods of resolving names and IP addresses which have been around for years. Here are just a few of the common ones you might come across:
Broadcasting: The use of mass broadcasts to resolve names is of course very inefficient – basically a plea to the whole network asking for an answer. You’d think this method isn’t used any more, and it’s true most network administrators have tried to remove it from their networks. However, anyone who has tried to troubleshoot a network of any size will almost certainly find devices that routinely broadcast looking for name resolution. A couple of reasons it doesn’t work well: it generates lots of unnecessary traffic, and most routers won’t forward the broadcasts anyway, so requests are frequently just lost. You can configure routers to pass on these messages using the IP helper address function, but this is not the way to run a fast, efficient network.
NetBIOS over TCP/IP
NetBIOS was the primary method used by Windows computers to resolve names to IP addresses, although again DNS has normally replaced it. There are four methods of NetBIOS name resolution, usually tried in a distinct order.
- p-node – Client contacts a WINS or NBNS server using unicast. This needs to be configured on the client to work properly, but then only requires IP connectivity.
- b-node – Client broadcasts a name query on the local subnet. This will only succeed if the target can be reached on the same subnet or routers are configured to forward the request.
- m-node – Client uses b-node first, then p-node if there is no reply to the initial broadcast.
- h-node – Client first uses a p-node unicast if configured, then falls back to a b-node broadcast.
Windows Internet Name Service (WINS) is Microsoft’s implementation of the NetBIOS Name Service (NBNS) protocol. It’s a dynamic, distributed method of name resolution used mainly in Windows environments. Name registrations are stored on central WINS servers, and in some implementations the WINS service was installed automatically with Microsoft Windows Server. Again, it works best when the WINS server is configured correctly on the client; otherwise the client falls back on broadcasts, as with NBNS.
The LMHOSTS file is a simple static file, similar to a hosts file, which must be created, distributed and kept updated by the network administrator. If a client is configured as h-node, the LMHOSTS file is consulted as a fallback method. It can create a lot of work and potential issues in large dynamic environments, although it can be used to distribute the names of key servers which are unlikely to be moved or modified.
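As a sketch of what such a file contains (the addresses and machine names here are invented for the example), each LMHOSTS entry maps an IP address to a NetBIOS name, with optional keywords such as #PRE to preload the entry into the name cache and #DOM to mark a domain controller:

```
192.168.1.10    FILESRV01     #PRE            ; preloaded into the name cache
192.168.1.20    DCSRV01       #PRE #DOM:CORP  ; domain controller for CORP
192.168.1.30    PRINTSRV01                    ; resolved only when looked up
```

Because the file is static, every change to a listed server means redistributing the file to every client, which is exactly why it is best reserved for key servers that rarely move.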
The network layer of the OSI protocol stack is often simply known as Layer 3. It is important for network troubleshooting, as it is where routing takes place, one level above the data link layer (Layer 2), where switching and bridging happen. A VLAN (virtual LAN) is a subnetwork of an internetwork, but it is normally defined on a switched network topology.
So what do we mean by a switched network? Simply put, it is a series of devices such as computers attached directly to some sort of multiport switching device. A network switch acts as a connecting medium between the ports that computers are connected to. In the perfect switching environment each port has only one device connected to it; in reality it is often another network device, like a bridge or hub, with many more clients indirectly connected to the switch. The ideal scenario has no contention between devices trying to use the same network cable; performance is maximised because there is none of the waiting or latency you would get on shared Ethernet. Just like the simple VPNs we use across the internet to watch BBC USA while hiding your IP address, VLANs segment and protect traffic.
An important reason for segmenting networks and then connecting them together again using routers is that it minimises the size of broadcast domains, with fewer devices competing for access. Switched topologies also reduce contention, and many networks have evolved into large, flat switched networks. If you remove routers, though, there is a price to pay, both in ease of administration and in being able to securely manage specific segments or devices. If you need to retain some sort of topological layout in this scenario, VLANs are probably the only feasible option.
A VLAN restores the advantages of a segmented network to a flat switched network. Network administrators can use VLANs to create pseudo-segments in an open network across the switches. This is important for creating security segments and managing large networks, as the computers joined to a VLAN can exist anywhere on the network. For example, you can create a high-security VLAN to connect secured servers together, where they can be managed and secured as a group. These servers can exist on different switches, on different ports, and across buildings and departments.
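As an illustration, creating such a high-security VLAN on a typical managed switch looks something like the following Cisco-IOS-style sketch (the VLAN number, name and interface are invented for the example; other vendors use different syntax):

```
! Define the VLAN once on the switch
vlan 100
 name SECURE_SERVERS
!
! Assign a server-facing port to it
interface GigabitEthernet0/1
 switchport mode access
 switchport access vlan 100
```

Repeating the port assignment on whichever switches the secured servers happen to be connected to is what lets the group span buildings and departments while still behaving as one managed segment.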
The next stage is to take these individual VLANs, which connect many groups of computers, and extend the model. A device can be a member of multiple VLANs, and messages can be broadcast to specific devices by sending them to specific VLANs only. The issue with this setup is that routers still need to transmit packets between the different VLANs; there is still a requirement for data to be routed, which can cause contention and performance issues.
Here the techniques of Layer 3 switching are useful: a routing algorithm is used to discover the fastest path through the switched network, and once a destination is located, a shorter Layer 2 switched path can be used. This is possible because the VLANs overlay the physical switching fabric of the network. There is of course much more to these techniques, and the design and construction of efficient switched networks is a large and interesting field.
John Simmons, American Version of Netflix?, Galsworthy Publications, 2013