HTTrack test page

Last modified: 2005.12.12 1520 AEST
NEL = No External Links
Last tested with: HTTrack 3.40-BETA-3 (+swf)

Redirect

With NEL=ON this file (redirect.php) should redirect, and the hyperlink here should change to "external.html?link=http://www.httrack.com/"
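
As a rough illustration of the rewrite this test expects, the Python sketch below maps an off-site link to the "external.html?link=..." form. The function name rewrite_external and the site_host parameter are hypothetical, and this is only a guess at the behaviour, not HTTrack's actual code.

```python
from urllib.parse import urlparse

def rewrite_external(href, site_host):
    """Return href unchanged if it stays on site_host, else a placeholder link."""
    host = urlparse(href).netloc
    if host and host.lower() != site_host.lower():
        # Assumption: the original URL is appended unescaped, as in the example above.
        return "external.html?link=" + href
    # Relative links (empty host) and same-host links are left alone.
    return href
```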

Form

Testing NEL on form action, plus email address in hidden field.

<!-- Comments -->

Hyperlink and image should be ignored in this section

404 error on image


HTTPS - secure HTTP link


Various protocol test links


Badly-coded relative links


Links with ampersands

Turn off "Hide query strings" option for this test.

File extensions

Nothing, mp3, mid, avi, pdf, doc, xls, xml, xsl, swf, php, cfm, asp, gif, jpg, png, bmp, zip, rar, shtml, xhtm, htm, html

Mis-matching file extensions and MIME-types

If a file has no extension, or its extension does not match the MIME-type sent, how will it be handled?
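
One possible answer to the question above, sketched in Python: trust the MIME type over the URL's extension when the two disagree. The MIME_TO_EXT table and the local_name function are hypothetical and deliberately tiny; a real mirroring tool may well apply a different policy.

```python
import os

# Hypothetical MIME-to-extension table, for illustration only.
MIME_TO_EXT = {
    "text/html": ".html",
    "image/gif": ".gif",
    "image/jpeg": ".jpg",
    "application/pdf": ".pdf",
}

def local_name(url_path, mime_type):
    """Pick a local file name, preferring the MIME type over the URL extension."""
    base, ext = os.path.splitext(url_path)
    wanted = MIME_TO_EXT.get(mime_type)
    if wanted is None or ext.lower() == wanted:
        return url_path            # unknown type, or extension already matches
    if ext == "":
        return url_path + wanted   # no extension: append one
    return base + wanted           # mismatch: replace the extension
```

Under these assumptions, a file served as "download" with type application/pdf would be saved as "download.pdf", and "picture.php" served as image/gif would be saved as "picture.gif".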

Dodgy Microsoft HTML

When a Word document containing images is saved as a web page, the resulting HTML can be riddled with dodgy code. If HTTrack is unable to find the image in <v:imagedata> then Internet Explorer will fail to display an image. The problem is that the "image" is hidden within comment tags.
<!--[if gte vml 100]>
<v:imagedata src="images/dot_red_dk.gif" o:title="sport strip 2"/>
<![endif]-->
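
A minimal Python sketch of how a parser could still find this image: scan the raw text for <v:imagedata> tags without first discarding comments. The names IMAGEDATA_SRC and vml_image_sources are hypothetical, and nothing here is claimed to match HTTrack's own parser.

```python
import re

# Matches src="..." inside a <v:imagedata> tag, wherever it appears in the text,
# including inside conditional comments.
IMAGEDATA_SRC = re.compile(
    r'<v:imagedata\b[^>]*?\bsrc\s*=\s*"([^"]*)"', re.IGNORECASE
)

def vml_image_sources(html):
    """Return every src value found on a <v:imagedata> tag in the raw HTML."""
    return IMAGEDATA_SRC.findall(html)
```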

"Clever parsing": assorted Javascript and CSS tests

There are many tricky code situations where HTTrack may need to try to find "hidden" links within CSS or Javascript.

Xavier's function foo()

Tests the parsing of javascript to gather images.

Javascript: function argument

The following javascript code is embedded in the page here:
<script language="javascript">
	function imageURL(action) {
	}
</script>

(Empty Reference!)

This stupid Adobe GoLive message can appear in numerous places of a webpage. Should HTTrack attempt to get a background URL that looks like this?
<blockquote background="(Empty Reference!)"></blockquote>

CSS: Background applied to a DIV's style attribute

Test in this DIV to see if HTTrack detects the image.
<div style="background-image: url(../../misc/httrack/images/sto_bg.gif); background: url(../../misc/httrack/images/sto_bg.gif);">
  • v3.30-rc13 does not find the image in this <div> but will find it in the CSS block in the <head>.
  • v3.30-rc15 finds all occurrences.
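
For illustration, url(...) references in a style attribute can be pulled out with a regular expression, as in this Python sketch. CSS_URL and css_urls are hypothetical names, and the pattern only covers simple quoted or unquoted forms, not every case the CSS grammar allows.

```python
import re

# url( ... ) with optional quotes and optional surrounding whitespace.
CSS_URL = re.compile(r'url\(\s*[\'"]?([^\'")]+?)[\'"]?\s*\)', re.IGNORECASE)

def css_urls(style):
    """Return the distinct url(...) targets in a CSS string, in order of appearance."""
    seen = []
    for target in CSS_URL.findall(style):
        if target not in seen:
            seen.append(target)
    return seen
```

Applied to the style attribute of the DIV above, this yields the single path ../../misc/httrack/images/sto_bg.gif, since both declarations point at the same file.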

Javascript: Link graphic with rollover image change

This test is "fake", only containing example rollover code in the A tag, to see if HTTrack detects the images.

Javascript: Preloaded images in <body> tag

It's very common to see code similar to the following on a website. A fake example is included in this page's <body> tag, to see if HTTrack detects the images.
onLoad="MM_preloadImages('images/dot_darkonwhite.gif')"
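
A hedged Python sketch of one way to detect such images: treat any quoted string in the script that ends in a known image extension as a candidate path. QUOTED_IMAGE and quoted_image_paths are hypothetical names, and the extension list is an assumption chosen for this example.

```python
import re

# A quoted string literal ending in a common image extension.
QUOTED_IMAGE = re.compile(
    r"""['"]([^'"]+\.(?:gif|jpe?g|png|bmp))['"]""", re.IGNORECASE
)

def quoted_image_paths(script):
    """Return quoted literals in the script that look like image file paths."""
    return QUOTED_IMAGE.findall(script)
```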

Javascript: found a slash, thinking it's part of a URL

HTTrack has been known to find code like:
document.open("text/html");
and rewrite to:
document.open("../html.html");
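
One way to avoid this kind of false positive, sketched in Python: refuse to treat a string literal as a link when it is a well-known MIME type. The KNOWN_MIME_TYPES set and the is_probable_url function are hypothetical, and the list is intentionally short.

```python
# Assumption: a handful of common MIME types is enough for this illustration.
KNOWN_MIME_TYPES = {
    "text/html", "text/plain", "text/css", "text/javascript",
    "text/xml", "application/xml", "application/xhtml+xml",
}

def is_probable_url(literal):
    """Return False for literals with no slash or that are known MIME types."""
    return "/" in literal and literal.lower() not in KNOWN_MIME_TYPES
```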

Javascript: variable concatenation

Objects may be used in javascript... will HTTrack look "too hard" and find things like ".bgImageUp" thinking it is a file extension? The following code is in javascript in the <head> section.
"url(" + a.Menu.bgImageUp +")"
'style="background:url('+menu.bgImageUp+');'
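
A small Python sketch of a guard against this: accept a candidate only if its extension is on a known list, so a property name such as ".bgImageUp" is rejected. KNOWN_EXTENSIONS and has_known_extension are hypothetical, and the list is a subset picked for this example.

```python
import os

# Assumption: a short list of extensions a mirroring tool might care about.
KNOWN_EXTENSIONS = {".gif", ".jpg", ".png", ".bmp", ".swf", ".htm", ".html", ".css", ".js"}

def has_known_extension(candidate):
    """Return True only if the text after the last dot is a known file extension."""
    return os.path.splitext(candidate)[1].lower() in KNOWN_EXTENSIONS
```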

CSS: @import

This is a two-level CSS test. This page contains @import url(cssimage.css); and that file itself contains @import url(cssimage_advanced.css); A file called "cssimage.jpg" should be downloaded and appear top-right of the page. HTTrack should also find "toy_gold_32.gif".
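
The two-level lookup can be sketched in Python as a recursive walk over @import rules. Here the stylesheets are held in an in-memory dict rather than fetched, and the names IMPORT_URL, collect_imports and the "page" key are hypothetical; this is an illustration, not a description of how HTTrack does it.

```python
import re

# @import url(...) with optional quotes around the target.
IMPORT_URL = re.compile(
    r'@import\s+url\(\s*[\'"]?([^\'")]+?)[\'"]?\s*\)', re.IGNORECASE
)

def collect_imports(name, sheets, seen=None):
    """Return stylesheet names reachable from `name` via @import, in discovery order."""
    if seen is None:
        seen = []
    for target in IMPORT_URL.findall(sheets.get(name, "")):
        if target not in seen:       # also prevents looping on circular imports
            seen.append(target)
            collect_imports(target, sheets, seen)
    return seen

# Stand-in contents mirroring the two @import rules quoted above.
sheets = {
    "page": "@import url(cssimage.css);",
    "cssimage.css": "@import url(cssimage_advanced.css);",
    "cssimage_advanced.css": "",
}
found = collect_imports("page", sheets)
```

With these stand-in contents, found lists cssimage.css first and cssimage_advanced.css second.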