Old Tom article
Continued from Page 4

Here's how your browser says "give me the web page /product/index.html?oldtom". Remember we're already talking to the otscripts.com server, so we know what domain we're dealing with:

GET /product/index.html?oldtom HTTP/1.0

The "web guy" listening on port 80 knows exactly how to interpret the above line of text, and feeds /product/index.html to your browser. We're done.

We're done. Except, that is, for one little detail.

Remember that socket connection? That's how the entire web works - socket connections. Your browser connects to the server's port 80, tells the server what it wants, and the server responds with the desired web page. So, what's the problem?

The problem is that socket connections *only* happen with IP addresses. Sockets don't know a thing about domains or domain names - they *only* know about IP numbers and ports. If you're connecting to otscripts.com, that means you open a socket connection to 209.215.97.148 port 80. However, if you're connecting to old-tom.com, you *also* connect to 209.215.97.148 port 80. For jojasa.com, connect to 209.215.97.148 port 80. Likewise for hotterpics.com or hotslutlinks.com.

Can you see the problem? Our browser makes the socket connection all right, but the server has absolutely no clue as to whether we're expecting to find jojasa.com, old-tom.com, or some other domain. Meanwhile, our browser has no way of knowing there's a problem. Back in the "good old days," every domain had a different IP address. You get the right IP number and port 80, and you've got the right domain. With name-based server configurations, that's no longer true.

Here's how we solve the problem. The browser must supply *two* lines of information instead of just one line:

GET /product/index.html?oldtom HTTP/1.0
Host: www.otscripts.com

Now the "web guy" listening on port 80 knows which page of which domain you want, and feeds it back to you straightaway.
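
If you'd like to see both halves of that conversation in code, here is a rough sketch in Python. It builds the two request lines and shows how a server might use the Host line to pick a site. The SITES table and its directory names are made up for illustration, and a real web server does a great deal more than this.

```python
# Hypothetical table: in the article, all of these names share one IP address.
# The directory names are invented for this sketch.
SITES = {
    "www.otscripts.com": "/var/www/otscripts",
    "old-tom.com": "/var/www/old-tom",
}

def build_request(path, host):
    """Return the two request lines, plus the blank line that ends the request."""
    return f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")

def pick_site(raw_request):
    """Read the Host line and return the matching directory, or None."""
    lines = raw_request.decode("ascii").split("\r\n")
    for line in lines[1:]:          # skip the GET line itself
        if line == "":              # a blank line ends the headers
            break
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            return SITES.get(value.strip().lower())
    return None                     # no Host line: the server has to guess
```

Calling pick_site(build_request("/product/index.html?oldtom", "www.otscripts.com")) returns the otscripts directory. Leave the Host line out and the function returns None, which is exactly the "no clue" situation described above.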

By the by, for hotlinking protection, your browser is expected to supply a *third* line of information, like this:

GET /product/index.html?oldtom HTTP/1.0
Host: www.otscripts.com
Referer: http://www.vnwr.com/main/index.html

The server uses your "referer" information when processing your .htaccess file. If I wanted to redirect any visitor coming from vnwr.com, I could put the appropriate RewriteCond and RewriteRule in my .htaccess. If I want to redirect anyone coming from a certain search engine, I just drop the lines into my .htaccess, and let the server take it from there.

This, of course, is how hotlinking protection works. It also shows how the site suckers get around your protection.

Suppose vnwr.com is trying to hotlink one of my pictures. What that means is that your browser will attempt to display the picture - but your browser will also report the fact that the link came from vnwr.com:

GET /images/logo.jpg HTTP/1.0
Host: otscripts.com
Referer: http://www.vnwr.com/main/index.html

Since the referer is *not* otscripts.com, my .htaccess file will tell the server to *not* hand out the image. You see a broken image icon, and my bandwidth is protected.
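
The decision the server makes here boils down to a few lines. This is only a sketch in Python of the logic, not what Apache actually runs. The ALLOWED_HOSTS set and the choice to refuse requests with no referer at all are my assumptions for the example.

```python
from urllib.parse import urlsplit

# Assumption: these are the only hosts allowed to embed my images.
ALLOWED_HOSTS = {"otscripts.com", "www.otscripts.com"}

def allow_image(referer):
    """Return True if the Referer line names one of my own pages."""
    if not referer:
        # Policy choice for this sketch. Many sites let an empty referer through.
        return False
    host = urlsplit(referer).hostname   # e.g. "www.vnwr.com", already lowercased
    return host is not None and host in ALLOWED_HOSTS
```

With the request above, allow_image("http://www.vnwr.com/main/index.html") comes back False, and the image is refused. Notice that the function only ever sees the text the visitor chose to send.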

But... what if your browser chose to lie about the situation? The following three lines ask for the same image, but notice the difference:

GET /images/logo.jpg HTTP/1.0
Host: otscripts.com
Referer: http://www.otscripts.com/index.html

Now we're asking for the image, and the referer is otscripts.com. This is *precisely* what the socket connection looks like when the browser is loading the logo on my otscripts.com splash page.

Browsers, of course, never lie. They can be trusted. Netscape, I believe, is owned by AOL. Internet Explorer, I believe, is owned by Microsoft. If you can't trust AOL software and Microsoft software, what can you trust? The whole basis of hotlink protection is the presumption that AOL software and Microsoft software would *never* lie to you, for any reason, ever.

Unfortunately, however, there are people out there capable of writing a script that opens a socket connection on port 80 and sends out those three lines of text. They are unscrupulous to the point that they *make things up* when filling in that Referer line. Yes, they lie to your server!
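
To show how little is involved, here is a minimal sketch in Python. The function names are mine, and send_request is a bare-bones illustration with no error handling. The point is that the referer is just another piece of text the caller types in.

```python
import socket

def build_request(path, host, referer):
    """Return the three request lines, plus the blank line that ends the request."""
    lines = [f"GET {path} HTTP/1.0", f"Host: {host}", f"Referer: {referer}", "", ""]
    return "\r\n".join(lines).encode("ascii")

def send_request(ip, request, port=80, timeout=10.0):
    """Open a socket, send the request, and return everything the server sends back."""
    with socket.create_connection((ip, port), timeout=timeout) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:            # the server hung up: we have the whole reply
                break
            chunks.append(data)
    return b"".join(chunks)

# Nothing checks this string. Whatever the caller writes is what the server sees.
request = build_request("/images/logo.jpg", "otscripts.com",
                        "http://www.otscripts.com/index.html")
```

The bytes in request are the same three lines shown earlier. The server receives nothing else, so it has no means of telling a script from a browser on this evidence alone.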

"I saw a sign: "Rest Area 25 Miles". That's pretty big. Some people must be really tired."
Steven Wright

What can your server do? Absolutely nothing. *All* your server can do is listen to port 80, pick up the phone when it rings, and listen to what you have to say. You can lie to it, ask trick questions, anything. So long as it keeps listening, you're in business. That's how the site suckers work... they tell the server what it wants to hear, and harvest everything in sight.

That also, by the way, is how the intruders work their way into your PC at home. Microsoft Windows (or so I am told) has all of those listeners sitting right there, listening, listening, listening. When the intruder happens to try your IP address, he may do a port scan. That is, he attempts a socket connection to each of those ports, checking to see if anyone answers. When somebody picks up the phone, he's in. He knows exactly how to sweet-talk his way into your computer *and* keep his intrusion a secret.
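
You can turn the same idea around and check your own machine for listeners you didn't know about. The sketch below, in Python, tries a single socket connection and reports whether anyone picked up. The function name and the one-second timeout are my own choices for the example.

```python
import socket

def is_listening(ip, port, timeout=1.0):
    """Return True if something accepts a socket connection on ip:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error number otherwise,
        # instead of raising an exception when nobody answers.
        return sock.connect_ex((ip, port)) == 0
```

For example, is_listening("127.0.0.1", 80) tells you whether a web server is answering on your own computer.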

While I'm at it, let me mention that there are some "special" IP addresses. Any address beginning with 192.168 is "nonroutable". What the heck does that mean? It means the address *must* be somewhere on your local network. It cannot be "out there" on the Internet somewhere. (This is also true for any address beginning with 10. and for any address beginning with 172.16 through 172.31.)
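
If you ever need to test for these ranges in a script, Python's standard ipaddress module makes it short. This is a small sketch. The function name is mine, and the three networks are simply the ranges listed above written in the usual slash notation.

```python
import ipaddress

# The three "nonroutable" ranges described above.
PRIVATE_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),       # anything beginning with 10.
    ipaddress.ip_network("172.16.0.0/12"),    # 172.16 through 172.31
    ipaddress.ip_network("192.168.0.0/16"),   # anything beginning with 192.168
]

def is_nonroutable(address):
    """Return True if the address falls in one of the local-only ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in PRIVATE_NETWORKS)
```

So is_nonroutable("192.168.0.3") is True, while is_nonroutable("209.215.97.148") is False.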

It may be handy to know that IP address 127.0.0.1 *always* refers to yourself. No matter what other IP address that computer may have, it will *also* answer to 127.0.0.1. In unix and linux, that's called "localhost". On my linux box at home, I can get to my at-home server as either
http://127.0.0.1/index.html or as
http://localhost/index.html. My browser simply connects to port 80, and my server answers.
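
Here is a small Python sketch of a computer talking to itself over 127.0.0.1. It plays both listener and caller in one function. I let the operating system pick a free port (that's what port 0 means here), because binding to port 80 normally needs administrator rights. The function name is made up for the example.

```python
import socket

def loopback_roundtrip(message):
    """Listen on 127.0.0.1, connect to ourselves, and return what arrives."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))       # port 0: let the OS choose a free port
        listener.listen(1)
        port = listener.getsockname()[1]
        with socket.create_connection(("127.0.0.1", port)) as caller:
            answered, _ = listener.accept()   # the listener picks up the phone
            with answered:
                caller.sendall(message)
                return answered.recv(1024)
```

Calling loopback_roundtrip(b"hello") hands back the same short message, and nothing ever leaves the machine.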

A "router" (which may also act as a firewall) takes requests and forwards them along to the right place, much like your local post office. The router is aware of one or more "gateway" IP addresses. That's the equivalent of "out of town" mail. If the mail is "local", the local post office can distribute it directly. Otherwise, it sends the mail to the next "gateway" upstream. The "netstat -r" command should show you your current routing configuration.

When passing through the gateway, the IP address needs to get translated. For example, my linux box has IP address 192.168.0.3. However, when I post to the VNWR board from this linux box, it appears as IP address 148.78.255.41. How did that happen? My router (winproxy) translated the "inside" address to the "outside" address.

Have you heard of DHCP? That's when IP addresses get assigned "by magic". When you first connect, something at the other end of the connection assigns you an IP address. For example, when I reboot my linux box, winproxy (on the windows PC with the Starband connection) assigns an IP address. Right now, it's 192.168.0.3. Anything beginning with 192.168, you will recall, is "nonroutable" and therefore safe for the local network.

Well. I seem to have wallowed in detail. Was there a point to all of this? Yes there was!

How does the entire Internet work? The whole thing runs on socket connections. Networks and hubs and backbones and modems and satellite uplinks serve one single purpose... to allow the socket connection to get through. In the same way, the highways serve a single purpose... to allow the mail trucks to get through, so that they can deliver your mail to its destination, and so you can receive mail addressed to yourself. I'm oversimplifying, obviously, but hopefully you get the idea.

Actually, the telephone is the better analogy. *If* there is a listener at the other end, he'll pick up the phone when you call. Once you've made the connection, you need to talk *his* language. If you do, you'll get what you want. What if the listener refuses to pick up the phone? That means the firewall works! What if the listener stopped listening? That means it's time to reboot the server! What if there's no route to host? That means all circuits are busy at this time; try your call again later.

What if a thousand different people call that same number at the same time? That means you just got listed with The Hun. No problem... the listener has already lined up 30-60 assistants to handle the calls. The listener simply passes you over to one of the assistants, and continues listening.
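
Here is a rough Python sketch of a listener with assistants. Please note the swap: I'm using a pool of threads because it keeps the example short, which is not necessarily how your web server arranges its helpers. The function names and the canned reply are made up for the example.

```python
import socket
from concurrent.futures import ThreadPoolExecutor

def handle(conn):
    """One assistant: hear the caller out, answer, and hang up."""
    with conn:
        conn.recv(4096)                               # the caller's request
        conn.sendall(b"HTTP/1.0 200 OK\r\n\r\nhello\n")

def serve(listener, calls, assistants=30):
    """Answer `calls` connections, passing each one to a pool of assistants."""
    with ThreadPoolExecutor(max_workers=assistants) as pool:
        for _ in range(calls):
            conn, _addr = listener.accept()           # pick up the phone
            pool.submit(handle, conn)                 # hand off, go back to listening
```

The listener itself never talks to anyone for long. It accepts the call, passes the connection along, and is back at accept() right away. That's why a burst of callers doesn't leave the line ringing.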

There once was a poor man named Crocket,
Whose balls got caught in a socket.
His wife was a bitch,
So she cranked on the switch,
And Crocket took off like a rocket!

Old Tom


©2001 VNWR. All rights reserved.