LinkCo


 
 
Houdini Tech on the Internet (HTI):
Friends see one thing - foes another

By: James Cook, David Israel-Rosen, and Oded Maimon
LinkCo, Inc.; Glencoe, Illinois 60022, USA

 





Abstract:  In this paper we present research results showing that the Internet is being used, through what we call HTI, to present one type of information to one audience, defined as friends, and another type (confused, partial, misleading, misdirected) to another audience, called foes. We show examples employed by one foreign company to deceive specific non-national audiences (especially in the USA), and we go on to expose the methods being used. The methods presented here can be used by any organization, such as a terrorist group and the like.

 
  Introduction:

We have come across intentional measures that manage different views for different communities of viewers. The measures themselves are neutral; they can be used for good or for bad purposes. For example, adjusting what viewers from different cultures see is a good purpose. However, as we found out, these schemes, termed here HTI (Houdini Technology for Internet), are also used for bad or criminal purposes, and in an organized and systematic way. Examples include hiding messages, and keeping information (such as evidence or messages) from appearing in one jurisdiction while letting it appear in another (such as hiding patent usage, where you sell in the friendly home market while hiding in the foe market, e.g., the market the patent comes from). Organizations can use HTI methods to manage the flow of information in such a way that they thwart detection of fraud and, potentially, of command and control messages.

In this paper we show some of the methods by which internet pages can be made covert to one audience and overt to another audience. We start with a brief survey of five types of the HTI measures that are used.

 
 
Survey of Measures:

Here we present the methods themselves, which comprise the HTI. We only present methods that we found and deduced from actual field examples. It is not a comprehensive list yet, and may never be.

1.

Color concealment (background and text in the same color):  When the foe views the web page (after having submitted it to a translation engine) both the text and its background are the same color, thus making text invisible. One way that the designer can effect this is to install special instructions (probably in the style sheet) to foul the translation presentation, so as to change the color of parts of the text (e.g., white text on white background).

Immediately below is an actual example taken from the internet (at the time of this research it was at: http://glovia.fujitsu.com/jp/event/kansai/02sf1113.html). The "Japanese View" is easily visible text in the middle of the frame. The translated "English View" has that text translated, but it is presented in the identical color (white) as the background, and hence is invisible. If you highlight the "white text" area (using Ctrl+A in Internet Explorer), you can see the text we have presented in the "Exposed View".

[Figure: white-on-white text exposed]

 



In this case, the perpetrator testified that no promotion of a subject had occurred, yet, on the internet, in Japanese, they had delivered essential information which the translation engine translated as: "The July this year of severe heat, at Tokyo international forum it received favorable comment ..." in direct contradiction to their testimony. However, English readers could not see this translated statement since the text's background was the same color as the text.

There are three ways to thwart this ploy: a) read the page in Japanese, b) have the translation engine "anonymously" fetch the page, and c) search the page source for "font ... color=same", where "same" is the same color as the background, in this case white.
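Option (c) can be automated in a rough way. The following is a minimal sketch, assuming the page sets its background with a `bgcolor` attribute on `<body>` and colors text with `<font color=...>` tags; pages that use style sheets would need a CSS-aware check instead. The names `normalize`, `HiddenTextFinder`, and `find_hidden_text` are hypothetical and chosen only for illustration.

```python
from html.parser import HTMLParser

# Tiny illustrative table of color names (an assumption; the full list is much longer).
NAMED = {"white": "#ffffff", "black": "#000000"}


def normalize(color):
    """Return a lowercase #rrggbb string for a few common color spellings."""
    c = color.strip().lower()
    c = NAMED.get(c, c)
    if len(c) == 4 and c.startswith("#"):  # expand #fff to #ffffff
        c = "#" + "".join(ch * 2 for ch in c[1:])
    return c


class HiddenTextFinder(HTMLParser):
    """Collects text inside <font color=X> where X equals the <body bgcolor>."""

    def __init__(self, default_bg="#ffffff"):
        super().__init__()
        self.bg = normalize(default_bg)  # assumed default when no bgcolor is given
        self.font_colors = []            # stack of colors for open <font> tags
        self.hidden = []                 # text fragments judged invisible

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "body" and attr_map.get("bgcolor"):
            self.bg = normalize(attr_map["bgcolor"])
        elif tag == "font":
            color = attr_map.get("color")
            if color:
                self.font_colors.append(normalize(color))
            else:
                # A <font> without a color inherits the enclosing one, if any.
                inherited = self.font_colors[-1] if self.font_colors else None
                self.font_colors.append(inherited)

    def handle_endtag(self, tag):
        if tag == "font" and self.font_colors:
            self.font_colors.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self.font_colors and self.font_colors[-1] == self.bg:
            self.hidden.append(text)


def find_hidden_text(html):
    """Return the text fragments whose font color matches the page background."""
    finder = HiddenTextFinder()
    finder.feed(html)
    finder.close()
    return finder.hidden
```

Run against the translated page source, any fragments returned are candidates for manual review; a match does not by itself prove intent.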

2.

Using image files to hide text:  This is a way to put textual content in view without search engines being able to see the content. Therefore a foe cannot find the file, and a translation engine cannot translate the file. (A translation engine can translate encoded text, but not images.)

If you examine the figure below, you can make out five occurrences of the word "EXCEL". By putting these occurrences into a graphics figure, the perpetrator prevents their detection by all search engines. In fact, five detected occurrences would generally cause the search engine to score the page much higher, thus increasing the likelihood of the page being accessed. Therefore, this is a simple and effective method for keeping a low profile.

[Figure: hiding "EXCEL" in an image]

 



Using this device in a friendly domain, you could be given information or instructions detrimental to foes. The information could be the "keys" to a web location or the instructions to perform an operation. This is a simple, well known, and well understood device, which is, nonetheless, very general purpose and effective for going undetected (by eluding search engines and defying translation engines).

There is no known effective way to harvest all these occurrences. Character recognition programs will work in the simple instances (such as the upper right panel above), but not where there is a distracting background (such as the lower left panel above). There is a trick, though: search on the content of the URL, as some will foolishly put clues to the hidden information there, such as "excel_seminar.htm" or "seminar_date.jpg".
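The URL trick can be sketched as a small scan of a page's image and link targets for telltale words. This is only an illustration: the keyword list is something the investigator must supply, and the names `TargetCollector` and `filename_clues` are hypothetical.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse


class TargetCollector(HTMLParser):
    """Collects the src of <img> tags and the href of <a> tags."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "img" and attr_map.get("src"):
            self.targets.append(attr_map["src"])
        elif tag == "a" and attr_map.get("href"):
            self.targets.append(attr_map["href"])


def filename_clues(html, keywords):
    """Return (url, matched_keywords) pairs for URLs whose path contains a keyword."""
    collector = TargetCollector()
    collector.feed(html)
    collector.close()
    wanted = {k.lower() for k in keywords}
    hits = []
    for url in collector.targets:
        path = urlparse(url).path.lower()
        # Split the path on anything that is not a letter or digit.
        tokens = set(re.split(r"[^a-z0-9]+", path))
        matched = sorted(wanted & tokens)
        if matched:
            hits.append((url, matched))
    return hits
```

For example, `filename_clues(page_source, ["excel", "seminar", "date"])` would surface a file named `seminar_date.jpg` even though the text inside the image itself stays unreadable to the tool.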

3.

Hiding text in cursors:  This device puts messages into the cursor tag, where they are hidden from: a) translation engines, b) search engines, c) casual viewers, and d) those who don't place their cursor in the specific place. One can also see how this device, used together with the text hiding (#1, above), can effectively thwart the unintended viewer, that is, the "foe".

Below, there is a long message in the original "friend view" which is scrambled by the translation engine, effectively hiding the message from the viewer. By going into the actual, original HTML code, we extracted the Japanese message and then had just that message translated. That's how we were able to expose the contents of the hidden cursor text.

[Figure: hiding conferences in cursor text]

 



To thwart this device, it is necessary to view the page in the original language, upgrade translation engines to translate cursor tags (which has not been done to date), or examine the HTML code for the cursor tags and extract them for separate examination.
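The third option, pulling hover text out of the HTML for separate examination, can be sketched as follows. The paper does not say which attribute carried the cursor text, so this sketch assumes it lives in `title` or `alt` attributes; `HOVER_ATTRS`, `HoverTextExtractor`, and `extract_hover_text` are hypothetical names.

```python
from html.parser import HTMLParser

# Assumption: hover ("cursor") text is stored in these attributes.
HOVER_ATTRS = ("title", "alt")


class HoverTextExtractor(HTMLParser):
    """Records every non-empty title/alt attribute together with its tag."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in HOVER_ATTRS and value and value.strip():
                self.found.append((tag, name, value.strip()))


def extract_hover_text(html):
    """Return (tag, attribute, text) triples for all hover text in the page."""
    extractor = HoverTextExtractor()
    extractor.feed(html)
    extractor.close()
    return extractor.found
```

The extracted strings can then be submitted to a translation engine on their own, which is the manual procedure described above.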

4.

Blocked or Dead Links:  This device is simple and crude, but effective. What is done is to give the appearance of having a link, but when you click on it, nothing happens. This is most often an innocent file management error, but, as shown below, can seem quite intentional. Usually, we can't show dead links (simply, there's nothing to show, it's just that the link doesn't work).

[Figure: hiding data behind dead links]

 



Thwarting and detecting this device automatically is easy in one instance and impossible (or seemingly so) in another. In the case where the link is there but it doesn't work (usually because the target URL is not at its specified location), web crawlers regularly report these as "errors". In the instance shown above, where the link is "suspiciously" absent, there is no known automated method that draws attention to the link's deviant property.
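One narrow case can still be screened mechanically: an anchor that remains in the markup and carries visible text but has no working target. The sketch below, with the hypothetical names `INERT_TARGETS`, `InertAnchorFinder`, and `find_inert_anchors`, flags such anchors. It cannot notice a link that has been removed from the page altogether, which is the harder case described above.

```python
from html.parser import HTMLParser

# Assumption: these href values mean "clicking does nothing".
INERT_TARGETS = {"", "#"}


class InertAnchorFinder(HTMLParser):
    """Flags <a> elements that show text but have no usable href."""

    def __init__(self):
        super().__init__()
        self.suspects = []
        self._current = None  # [href, text parts] while inside a suspect anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = (dict(attrs).get("href") or "").strip()
            lowered = href.lower()
            if lowered in INERT_TARGETS or lowered.startswith("javascript:"):
                self._current = [href, []]

    def handle_data(self, data):
        if self._current is not None:
            self._current[1].append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            href, parts = self._current
            text = "".join(parts).strip()
            if text:  # skip empty named anchors such as <a name="top"></a>
                self.suspects.append((href, text))
            self._current = None


def find_inert_anchors(html):
    """Return (href, link text) pairs for anchors that lead nowhere."""
    finder = InertAnchorFinder()
    finder.feed(html)
    finder.close()
    return finder.suspects
```

Most hits will be the innocent file management errors mentioned earlier, so the output is a list for human review, not a verdict.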

5.

Miscellaneous Devices:

  1. Using aliases, misspellings, and/or abbreviations in order to prevent search engines from finding the real meaning and reference. Instead of "disclosure" we found a phonetic substitution used, "disc rose". This confuses the translation machine, but does not confuse the person who reads it phonetically (e.g., interchanging "r" and "l" phonetically is prevalent in Japan).

  2. Outputting different pages to different viewers (perhaps based on the URL of the calling page) is a device we have experienced. In this instance, a Japanese page existed in Japan, but was unavailable in the USA.

  3. Interchanging character encodings and languages to confound translation engines and search engines. For example, when different but individually legible encodings (UTF, Shift JIS, and EUC) are interchanged, translation machines become confused and output either strange characters or bands of question marks where legible text in the original language existed.

  4. Gaming Search Engines is a commonplace marketing strategy with the aim of getting a higher ranking, so as to appear earlier in the list of "hits." However, when hiding, you want the opposite: a lower ranking, so that the "hit" will appear later in the list and probably go unnoticed. This is the goal of inserting repetitions (i.e., salting) into web documents that may be ranked by search engines. Here is an example:

    [Figure: salting pages for lower rankings]

    You can see from the translation below that the salting of the page with meaningless "truth" adds no value and is only intended to bury the page at the bottom of any list of inquiry "hits." If the "foe" ever got to the page, below is what the "foe" would see.

    [Figure: translation of the salting]

  5. Gaming Translation Engines.  You may find translation engines asking if you want to make a suggestion to improve a translation. In this instance, we found an accurate translation in November of 2006 which is as follows:

    [Figure: original translation]

    Now, this erroneous translation [of the market size, that which follows the ":" in both the Japanese and the English translation] is off by two orders of magnitude, not to mention that the "several" modifier, 数, has been ignored as well. The first number is 100, i.e., 百, followed by, not a million, but a hundred million, 億. The result is that these three characters read "several hundred hundred-million", or several tens of billions, not the hundred million erroneously reported by Google translation in April. Today, Google, at our prodding, has corrected this, and you can test it for yourself. We make no suggestion about who gamed Google's translation engine, but we do note that this is an avenue for hacking discovery that those who want one view for friends and another for foes could exploit.
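The size of the translation error in the last item can be checked with a few lines of arithmetic. This is a sketch under the assumption that the three characters described there are 数 ("several"), 百 (hundred), and 億 (hundred million); the variable names are illustrative only.

```python
# Numeric values of the two characters that carry a fixed magnitude.
HUNDRED = 100                  # 百
HUNDRED_MILLION = 100_000_000  # 億

# "Hundred hundred-million" is the product of the two values.
correct_unit = HUNDRED * HUNDRED_MILLION

# The erroneous translation reported only "hundred million".
reported_unit = HUNDRED_MILLION

# Factor by which the erroneous figure understates the correct one.
understatement = correct_unit // reported_unit

print(correct_unit)    # ten billion
print(understatement)  # a factor of one hundred, i.e. two orders of magnitude
```

The "several" modifier has no fixed numeric value, so it only widens the gap: the correct reading is several tens of billions, against a reported hundred million.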

 




Conclusion:

As the internet becomes the information utility of our age, it becomes increasingly important for discovery in litigation and law enforcement. As we have demonstrated above, it is possible for hackers, organized (corporations or terrorists) or individuals, to exploit deficiencies in internet search and translation tools to present different views for different audiences, friends and foes. We bring these to your attention with the hope that countermeasures will be devised, even if manual as in our case, to protect you from being deceived as foes might want.

 

  © 2008 by LinkCo, Inc.