(Courtesy of fravia's advanced searching lores)

Follow Links in the Underground
by altosax
published at fravia's searchlores in September 2002

Slightly edited by fravia+
"Real searchers must be able to find what they are looking for in the most effective way, and when a site is built to fool its users, they should never have to click on a link just to discover it is not what they thought it was."
Let's not forget, also, that seekers travelling with a good hosts filter and all their holy shields up (junkbuster + proxomitron chained together: see Iefaf's, Bone Digger's and NME's chaining instructions elsewhere), and moreover without any active java or javascript inside their browser, won't need this kind of lore that badly. Even so, what is explained here could come in quite handy for every seeker... Thanks, Altosax! Let's hope to have more feedback on this from whoever researches further into it.

Many Searchlores users have probably read Rumsteack's essay about using GetRight as a bot to explore a site's structure. This is useful mainly on underground sites, to avoid popups, redirections and/or tracking links, links pointing to a false location and so on.

A different way to do the same thing, but with much more information about the site structure, is to use the freeware Xenu's Link Sleuth at http://home.snafu.de/tilman/xenulink.html. I know, there are a lot of commercial programs doing this, and, as I said, GetRight does it too, but I've learned to respect programmers' work, so when a freeware tool exists that does what I need, I never consider using another.

Real searchers must be able to find what they are looking for in the most effective way, and when a site is built to fool its users, they should never have to click on a link just to discover it is not what they thought it was. This means that a searcher must know, or must find, the tools to do his job the way he wants and not the way the webcoders want.

Xenu's Link Sleuth was written by Tilman Hausherr to help webmasters maintain their sites. It checks a site for dead links, external URLs, redirected URLs and more. But the most useful feature for searchers is the check for valid URLs, which can be used to find the right path to what we are searching for among all the other tiresome links.
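
To make these kinds of checks concrete, here is a minimal Python sketch of how a link checker might pull the links out of one page and sort them into local and external ones. It is not Xenu's actual code: the names LinkCollector and classify_links and the example.org addresses are hypothetical, and only the standard library is used.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag found in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def classify_links(page_url, html_text):
    """Resolve each link against page_url and split the links by host."""
    collector = LinkCollector()
    collector.feed(html_text)
    host = urlsplit(page_url).netloc
    result = {"local": [], "external": []}
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)
        # Same host as the scanned page means "local"; anything else is "external".
        kind = "local" if urlsplit(absolute).netloc == host else "external"
        result[kind].append(absolute)
    return result


if __name__ == "__main__":
    sample = '<a href="files/list.html">files</a> <a href="http://ads.example.net/x">ad</a>'
    print(classify_links("http://www.example.org/index.html", sample))
```

Reading the absolute address of every link before visiting it is exactly what lets a searcher spot a link that points somewhere other than where its label claims.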

You simply need to give Xenu a working web address and it will scan the whole site, checking the link types you have set in the configuration options window: start Xenu, type the URL to scan, then click on "More options". There you can select whichever of Xenu's options you prefer.
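
Scanning a whole site from one starting address can be pictured as a breadth-first walk. The sketch below is only an illustration of that idea, not a description of Xenu's internals. The function crawl and its get_links parameter are hypothetical: get_links stands for any function that takes a URL and returns the absolute links found on that page.

```python
from collections import deque
from urllib.parse import urlsplit


def crawl(start_url, get_links, limit=1000):
    """Breadth-first walk over same-host pages, visiting each URL once."""
    host = urlsplit(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    order = []
    while queue and len(order) < limit:
        url = queue.popleft()
        order.append(url)
        for link in get_links(url):
            # Stay on the starting host and never queue the same URL twice.
            if link not in seen and urlsplit(link).netloc == host:
                seen.add(link)
                queue.append(link)
    return order
```

The limit argument is an assumed safety cap, so that a very large site does not keep the walk running forever.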

First you can set the number of parallel threads Xenu runs, up to 100. The right value depends on the speed of your connection and the bandwidth you can use. I've found that 30 simultaneous threads are optimal, but this is not true for every site. Some sites, on receiving so many connections from a single machine, may take it for a DoS attempt and block you. If this happens, reduce the number and retry.
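
The "reduce the number and retry" advice can be sketched in Python with a thread pool. This is an assumption-laden illustration, not how Xenu works: check stands for any function that tests one URL, and looks_blocked is a hypothetical callback that decides from the results whether the site seems to have blocked us (for instance, because most requests suddenly fail).

```python
from concurrent.futures import ThreadPoolExecutor


def check_all(urls, check, threads=30):
    """Run check(url) for every URL with at most `threads` parallel workers."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return dict(zip(urls, pool.map(check, urls)))


def check_with_backoff(urls, check, threads=30, looks_blocked=None):
    """Halve the thread count and retry while the site seems to block us."""
    while True:
        results = check_all(urls, check, threads)
        if threads == 1 or looks_blocked is None or not looks_blocked(results):
            return results, threads
        threads = max(1, threads // 2)
```

The function returns both the results and the thread count that finally worked, so the same value can be reused on the next visit. A real run would probably also wait a while before retrying a site that has just blocked it.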

Then you can choose which results go into the report Xenu can create at the end of the check. The available options are:

Broken links, ordered by links
Broken links, ordered by page
Broken local links
Redirected URLs
FTP and Gopher URLs
Valid text URLs
Site Map
Local orphan files
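
The difference between the first two options is only one of grouping. As a rough illustration, assume each check result is a (page, link, ok) tuple. This record format and both function names are hypothetical and do not come from Xenu.

```python
from collections import defaultdict


def broken_by_link(records):
    """Map each broken link to the pages that contain it."""
    grouped = defaultdict(list)
    for page, link, ok in records:
        if not ok:
            grouped[link].append(page)
    return dict(grouped)


def broken_by_page(records):
    """Map each page to the broken links found on it."""
    grouped = defaultdict(list)
    for page, link, ok in records:
        if not ok:
            grouped[page].append(link)
    return dict(grouped)
```

Ordering by link answers "where is this dead address used?", while ordering by page answers "what is wrong on this page?".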

Because the report is not a text file but an HTML page, it contains clickable links to the pages and files hosted on that site, so you can click the links in your local report without having to follow them on the site. This way you also avoid the popups, the redirections and the tracking. You can store the report on your disk too, to use it again later.
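
To show why a local HTML report is so convenient, here is a minimal Python sketch that turns a list of URLs into a page of plain, direct links. The function name build_report and the layout of the page are assumptions made for illustration; Xenu's own report looks different.

```python
from html import escape


def build_report(title, urls):
    """Return a small HTML page listing every URL as a direct link."""
    items = "\n".join(
        f'<li><a href="{escape(url, quote=True)}">{escape(url)}</a></li>'
        for url in urls
    )
    return (
        f"<html><head><title>{escape(title)}</title></head>\n"
        f"<body><h1>{escape(title)}</h1>\n<ul>\n{items}\n</ul></body></html>\n"
    )
```

Writing the returned string to a file on disk gives a page with no scripts at all, so each link leads straight to its target instead of passing through the site's own popups or tracking redirects.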

I suggest starting with small sites, just to get familiar with Xenu's options, because if you enable all of them the time required to scan a site can grow considerably.

Or you can start by checking just for valid text URLs, because if you use Xenu to peruse a site you surely don't want the broken links :) Later, if needed, increase the number of results to return.

If you prefer, in the main window you can also widen or narrow the checks by setting the program to scan additional external URLs beginning with something- or to exclude URLs beginning with someother-.
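
Prefix rules of this kind are easy to picture in code. The following Python sketch is only an illustration: the function should_check and its parameters are hypothetical, and letting an exclusion win over an inclusion is an assumption, not documented Xenu behaviour.

```python
def should_check(url, root, also_check=(), never_check=()):
    """Decide whether a URL belongs to the scan, using simple prefix rules."""
    # Assumption: an exclusion prefix always wins over everything else.
    if any(url.startswith(prefix) for prefix in never_check):
        return False
    # URLs under the starting address are always part of the scan.
    if url.startswith(root):
        return True
    # External URLs are scanned only if they match an extra prefix.
    return any(url.startswith(prefix) for prefix in also_check)
```

For example, with root set to a site's address, an also_check entry could pull in a mirror hosted elsewhere, while a never_check entry could keep the scan out of a directory full of banners.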

As I wrote, exploring the underground is not the use the author of Xenu had in mind; this is just a different way to use a web tool for searching purposes.

August 2002.

(c) 1952-2032: [fravia+], all rights reserved