NTIC
Lecture 2 : Finding information on the Net
Review of last lecture : a brief history of Internet and the Web
- After the invention in the 40's of modern computers, the idea of connecting them via telephone lines was inescapable. It was successfully achieved for the first time in 1965.
- Then Internet became inescapable as well.
- Two other inventions played a key role :
- Packet switching, which makes for very robust transmissions (hard for the
enemy to cut)
(At first AT&T engineers, when they heard of packet switching, declared : "This idea is as stupid
as, say, transporting petrol from the oil fields to the final consumer in tea cups
via various routes." It turned out they were wrong. But their image is a very good
one !)
- Micro-computers, which from the very beginning around 1978 (Apple II) put information technology in the hands of the general public. The fast development of micro-computers was helped by "killer
apps" such as VisiCalc.
- Then it only remained to generalize the client-server architecture for Internet to take off and
undergo the spectacular growth of the late 90's and of today
- That's not the end of the story, one may guess,...
- E-commerce is one of these new phenomena made possible by Internet. No
doubt others are to come.
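The packet-switching idea recalled in this review (the "tea cups via various routes" image) can be illustrated with a minimal sketch. It is only a toy model, assuming a fixed packet size and a simple sequence number ; the function names `to_packets` and `reassemble` are made up for this illustration and do not describe any real network protocol.

```python
import random

def to_packets(message, size=8):
    """Cut a message into numbered packets of at most `size` characters."""
    return [(seq, message[i:i + size])
            for seq, i in enumerate(range(0, len(message), size))]

def reassemble(packets):
    """Put packets back in order using their sequence numbers."""
    return "".join(chunk for _, chunk in sorted(packets))

message = "Transporting petrol in tea cups via various routes"
packets = to_packets(message)

# Simulate independent routes: packets arrive in an arbitrary order.
random.seed(0)
random.shuffle(packets)

received = reassemble(packets)
print(received == message)
```

Because every packet carries its own number, the order of arrival does not matter, which is what makes the transmission robust when one route is cut.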
So Internet was created in the late 60's early 70's :
- It is a huge worldwide library, containing the best as well as the worst
- It is a worldwide network ("network of networks" would be more
correct) connecting information servers and end-users. The information
delivered is anything that can be represented electronically.
- And now Internet is an E-commerce support.
While the Web was only invented in the early 90's (twenty years after Internet) :
- The Web (accessible via https://...) is only a part of Internet.
The rest, accessible via ftp://ftp..., is much larger and not nearly as well charted.
The techniques to find information :
A search that, as recently as 1990, would have taken several weeks and
could have required trips to different parts of the world can now be
completed in only a few hours.
Searching libraries, newspapers, trade publications, specialized databases,
and contacting specialists, all this can be done from one micro-computer
connected to the Net.
The technology is different, but the principles are the same :
- First, know what you are looking for : market data, information on products,
competitors, technologies, production methods, financial data, patents, key
events, new players, etc.
- Organize your questions logically and formulate them in natural language
- Transform them into keywords
- Find the "good sites" bearing information relevant to your topic
- Using Newsgroups and Mailing lists, find experts in the field you are
interested in ; also find the names of the people who have discussions with
those experts (they are often experts too)
- Outlook Express is a convenient program for reading the Newsgroups.
You may prefer another one (for instance Free Agent).
- On Newsgroups respect the Netiquette :
- The notion of "thread"
- The notion of "flame" (for instance on Google Groups with the search : +astronomy +planets +images ;
the third thread presents a flame.)
- Use the search engines to find out more about the experts you have
identified : where they work, what they are specialized in, what they published, etc.
- On any given specialized topic it takes only a few hours to know everything
the Web has to offer
- Finally, after having properly used the Newsgroups, you can directly contact the experts you wish
to have a discussion with, by e-mail or even by telephone.
- You'll be amazed how willing and happy people are to talk about what they know well
- Be prepared not only to ask questions, and get information, but also to give information
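The step "transform them into keywords" in the list above can be sketched as follows. This is a minimal illustration, assuming a tiny hand-made stop-word list ; `STOP_WORDS`, `to_keywords` and `to_query` are hypothetical names, and a real search would still need human judgment to pick the best terms.

```python
import re

# A tiny illustrative stop-word list (an assumption, far from complete).
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "for", "to", "and", "or",
              "is", "are", "what", "which", "who", "how", "do", "does"}

def to_keywords(question):
    """Keep the meaningful words of a question, in order, without duplicates."""
    words = re.findall(r"[a-z0-9]+", question.lower())
    keywords = []
    for word in words:
        if word not in STOP_WORDS and word not in keywords:
            keywords.append(word)
    return keywords

def to_query(keywords):
    """Build a query in which every keyword is required (+keyword)."""
    return " ".join("+" + k for k in keywords)

question = "Who are the competitors in the market for solar panels?"
print(to_query(to_keywords(question)))
# prints: +competitors +market +solar +panels
```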
The Web_tools page on lapasserelle
Directories and engines
- At first : directories = human classification ; engines = robots.
- The difference is fading : Yahoo offers engine services too, and Google
has a directory (see the page google services)
- Altavista : Digital, Louis Monier… ; for several years Altavista was
"the" reference ; then it was superseded by Google.
- Google (created in 1998, launched in 1999) : more intelligent ranking of
search results (it uses refinements of the link measurement method)
- Work with Altavista and Google with various keywords and study the results
and the differences between the two engines
- Syntax
- +cancer +astrology
- +cancer -astrology
- Truncation (as of October 2002 Google still does not accept it)
- +astron* +planet* +image*
- Page translation service by Altavista and by Google
- The very useful "cache" function of Google : when a page has
disappeared Google still has a copy in cache.
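The meaning of the syntax above (+term, -term and truncation with *) can be made concrete with a small sketch that filters a few made-up pages. This is only an illustration of the query semantics, assuming whole-word matching ; it is not how Altavista or Google are implemented, and the function name `matches` and the sample pages are invented.

```python
import re

def matches(query, text):
    """Return True if `text` satisfies a query such as '+cancer -astrology'.

    +term  : the term must appear
    -term  : the term must not appear
    term*  : truncation, matches any word starting with `term`
    """
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    for token in query.lower().split():
        required = not token.startswith("-")
        term = token.lstrip("+-")
        if term.endswith("*"):
            prefix = term[:-1]
            found = any(w.startswith(prefix) for w in words)
        else:
            found = term in words
        if found != required:
            return False
    return True

# Made-up page contents, for illustration only.
pages = {
    "page1": "Cancer research and new treatments",
    "page2": "Astrology: the sign of Cancer",
    "page3": "Astronomers publish images of distant planets",
}

for query in ["+cancer +astrology", "+cancer -astrology", "+astron* +planet* +image*"]:
    hits = [name for name, text in pages.items() if matches(query, text)]
    print(query, "->", hits)
```

With these sample pages, the first query keeps only page2, the second only page1, and the truncated query only page3.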
Visit these sites (Sources : Patricia Seybold, Journal du Net)
- Dell
- National Semiconductor
- Hertz
- Amazon
- National Science Foundation
- Wells Fargo
- Boeing
- Dow Jones
- General Motors
- Cisco
- American Airlines
- Paypal
- Fnac
- Rue du Commerce
- Alapage
- Top Achat
- Surcouf
- Eveil et Jeux
(find the exact addresses using your favorite search engine...)
Site typology : look and feel, objectives, etc.
Differences between French and American sites
Other useful tools
Meta engines : Ixquick, Metacrawler and Copernic (the first two are
online engines ; the last one installs a program on your machine). The syntax is
rudimentary, so that it is compatible with all the engines consulted
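What a meta engine does with the answers of the engines it consults can be sketched as a merge of several ranked lists. The scoring rule below (1/rank, summed over engines) is an assumption chosen for simplicity, not the method used by Ixquick, Metacrawler or Copernic ; the engine names and URLs are made up.

```python
def merge_results(results_by_engine):
    """Merge ranked URL lists from several engines into one ranked list.

    Each URL scores 1/rank in every engine that returns it; URLs found by
    several engines therefore rise to the top, and duplicates are removed.
    """
    scores = {}
    for urls in results_by_engine.values():
        for rank, url in enumerate(urls, start=1):
            scores[url] = scores.get(url, 0.0) + 1.0 / rank
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda url: (-scores[url], url))

# Made-up result lists for two hypothetical engines.
results = {
    "engine_a": ["http://a.example/", "http://b.example/", "http://c.example/"],
    "engine_b": ["http://b.example/", "http://d.example/"],
}
print(merge_results(results))
```

Here http://b.example/ comes first because both engines return it.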
Newsgroups (mostly Deja/Google - see above), Mailing lists (mostly
Topica/Liszt - less useful than Newsgroups or at least not nearly as
"interactive" for a search...). Francopholistes, GetZeNews, Yahoo
Groups, etc.
Intelligent agents : they go one step further than engines : they
download interesting pages (related to the keywords you gave) onto your
machine, thereby enabling you to do local work, filtering, analyzing, etc.
Webseeker (price : about $30) : the reference among intelligent agents
Umap : a very ambitious and sophisticated tool. Starting from your keywords,
or even a sentence in plain English, it finds and downloads a large number of pages, then creates a thesaurus of the words found in the downloaded pages,
measures the "closeness" of words in the pages, and finally constructs a
chart with "islands" of related words or concepts. Advantages : it gives
plenty of addresses and, even more importantly, more ideas and more research
themes. Drawbacks : delicate to interpret and not particularly user-friendly.
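The principle described for Umap (word list, "closeness", "islands") can be sketched in a few lines. This is a simplified guess at the general idea, not Umap's actual algorithm : closeness is assumed here to be the number of pages in which two words appear together, and an island is a group of words linked by a closeness above a threshold. The function names and sample pages are invented.

```python
import re
from collections import Counter
from itertools import combinations

def closeness(pages):
    """Count, for each pair of words, the number of pages containing both."""
    pairs = Counter()
    for text in pages:
        words = sorted(set(re.findall(r"[a-z]+", text.lower())))
        pairs.update(combinations(words, 2))
    return pairs

def islands(pairs, threshold=2):
    """Group words linked by a closeness of at least `threshold`."""
    neighbours = {}
    for (a, b), count in pairs.items():
        if count >= threshold:
            neighbours.setdefault(a, set()).add(b)
            neighbours.setdefault(b, set()).add(a)
    seen, groups = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Walk through all words reachable from `start`.
        group, stack = set(), [start]
        while stack:
            word = stack.pop()
            if word not in group:
                group.add(word)
                stack.extend(neighbours[word] - group)
        seen |= group
        groups.append(sorted(group))
    return groups

# Made-up "downloaded pages", reduced to a few words each.
pages = [
    "planets orbit stars",
    "stars planets telescope",
    "stocks bonds markets",
    "bonds stocks dividends",
]
print(islands(closeness(pages)))
# prints: [['bonds', 'stocks'], ['planets', 'stars']]
```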
Teleport Pro (belongs to the "Pull" technology) : copies
onto your machine a more or less complete site that you want to look at at leisure.
It was useful mostly when Internet connections were slow and costly. It can also
monitor page changes. Teleport Pro does not perform well with active pages.
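Monitoring page changes, as mentioned for Teleport Pro, can be approximated by storing a fingerprint of each page and comparing it on the next visit. This is a minimal sketch of the general technique, assuming the page contents have already been downloaded ; it does not describe how Teleport Pro itself works, and the URLs and function names are made up.

```python
import hashlib

def fingerprint(content):
    """Return a short, stable fingerprint of a page's content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()

def changed_pages(old_fingerprints, current_pages):
    """List the URLs whose content is new or differs from the stored fingerprint."""
    return [url for url, content in current_pages.items()
            if old_fingerprints.get(url) != fingerprint(content)]

# First visit: remember a fingerprint for each downloaded page.
first_visit = {"http://site.example/": "<h1>Prices</h1>",
               "http://site.example/news": "<p>No news</p>"}
stored = {url: fingerprint(html) for url, html in first_visit.items()}

# Later visit: one page has been edited.
second_visit = dict(first_visit)
second_visit["http://site.example/news"] = "<p>New product launched</p>"
print(changed_pages(stored, second_visit))
# prints: ['http://site.example/news']
```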
The Push : Tracerlock, NetMind, Moreover News, etc.
FTP (difference with HTTP ; it's twenty years older…). With FTP
you are (to some extent) inside somebody else's computer. Well-known
browsers now have FTP download capabilities, but not yet FTP upload.
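The feeling of being "within somebody else's computer" comes from the fact that FTP lets you list a remote directory, fetch files from it and send files to it. A minimal sketch with Python's standard `ftplib` module is given below ; the helper names `list_and_fetch` and `send`, the host name and the file name are placeholders chosen for this illustration.

```python
import io
from ftplib import FTP

def list_and_fetch(ftp: FTP, filename: str):
    """List the current remote directory, then download one file into memory."""
    names = ftp.nlst()                                  # names in the remote directory
    buffer = io.BytesIO()
    ftp.retrbinary("RETR " + filename, buffer.write)    # download ("downward")
    return names, buffer.getvalue()

def send(ftp: FTP, filename: str, data: bytes):
    """Upload bytes as `filename` into the current remote directory ("upward")."""
    ftp.storbinary("STOR " + filename, io.BytesIO(data))

# Usage against a real server (host name and file name are placeholders):
# with FTP("ftp.example.org") as ftp:
#     ftp.login()                      # anonymous login
#     names, data = list_and_fetch(ftp, "README")
```

Uploading normally requires an account with write permission on the remote machine, which is why anonymous access is usually limited to downloads.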