Path: senator-bedfellow.mit.edu!bloom-beacon.mit.edu!newsfeed.sgi.net!news-xfer.netaxs.com!nntp.giganews.com!nntp.primenet.com!news1.mpcs.com!news.iinet.net.au!not-for-mail
From: david@cn.net.au (David Novak)
Newsgroups: sci.research,comp.infosystems.www.announce,comp.answers,sci.answers,news.answers
Subject: Information Research FAQ v.2.5 (Part 7/9)
Followup-To: poster
Date: 17 Apr 1998 00:00:00 GMT
Organization: iiNet Technologies
Lines: 433
Approved: news-answers-request@MIT.EDU
Message-ID: <6hfv1q$fvf$7@news.iinet.net.au>
NNTP-Posting-Host: gothic21.nv.iinet.net.au
Summary: Information Research FAQ: Resources, Tools & Training
Xref: senator-bedfellow.mit.edu sci.research:17452 comp.infosystems.www.announce:21918 comp.answers:30980 sci.answers:8160 news.answers:128177

Archive-name: internet/info-research-faq/part7
Posting-Frequency: monthly
Last-modified: Apr 17 1998
URL: http://cn.net.au
Copyright: (c) 1998 David Novak
Maintainer: David Novak <david@cn.net.au> 

                        Information Research FAQ     (Part 7/9)

    This FAQ now continues to highlight other aspects of information
    research.

    This part of the FAQ is not duplicated in the website or Infokey
    shareware. This part is relatively concise, more of a discussion, an
    informative arm-chair read about the field and process of information
    research. Note also the disclaimer statement in Part 1 of this FAQ.

                                Contents 

    		----- Part 7 -----
    31. More on the Internet as a research resource
    32. More on the Commercial Information Sphere
    33. More on the Information Service Industry
          33.1 judging information value
    34. Emerging Trends in the information sphere
    35. Education and Training in Professional Research
          35.1 Facts               35.3 Guidance
          35.2 Practice
    36. Question and Answer Section
          36.1 How do I find information on the Internet?
    37. Acknowledgments
    ___________________________________________________

 31. More on the Internet as a research resource

    Let's agree the Internet is a great resource for surfing, but less
    valuable when you have a specific question to answer. To find answers, we
    need to begin by understanding how the information is arranged on the
    Internet. Contrary to myth, information is not disorganized but rather
    organized very carefully along clear patterns. Each pattern differs
    between the various forms the information may take. Further, awareness
    of information moves through several systems. Your understanding of the
    strengths and weaknesses of each pattern, each format, each system, will
    guide your search for information. I will share two insights here then
    invite you to the website for more.

    Insight One: Information tends to clump on the Internet, as with most
    resources, either by design or by simple habit. The web is not the only
    source of information and often not the resource where the best
    information groups. If you routinely browse different Internet systems,
    you will find certain information is found primarily in certain systems.
    While much information is drifting to the web, this trend is far from
    complete. Which source dominates for a given kind of information
    (websites, ftp-archives, online databases, software, telnet-databases,
    newsgroups, mailing lists and so on) can usually be explained
    historically.

    Insight Two: Information moves from the producers of information to the
    people who are seeking such information, and the way the information
    moves defines the resource. This is far more general, and applicable to
    any information format. Let us use books as an example.

    Books are created by authors who have something to write. Books are
    printed and marketed by publishers to the bookstores, which then pass
    them on to the readers. Each facet of this process defines the resource.
    Books have quality, editorial vetting, sales value and a potentially
    lengthy preparation time.

    Now let's look at FAQs. The best resource in the world on copyright law
    is the musings of a group of copyright lawyers who form the copyright
    mailing list. The copyright FAQ supported by this group is a logical
    document which summarizes much of the discussion of this mailing list.
    FAQs are vetted by the news.answers team, automatically mirrored around
    the world, and read by millions. From its origins, the FAQ is a
    peer-reviewed document, often full of links to further resources,
    topical, knowledgeable, factual and few in number.

    Again, how the information is generated, organized and transmitted
    deeply affects the information.

    As you search and surf the Internet, carefully note the address. This is
    the key. Certain qualities of information reside at commercial websites,
    government websites, or personal websites. Each tool (ftp, gopher, web)
    has certain identifiable qualities. Each system (faqs, mailing lists,
    newsgroups, bureaucratic websites) has certain identifiable qualities.
    All this is delicately coded into the Internet address.

    Can you easily identify personal webpages from the address?
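
    As a rough illustration of reading an address, the sketch below pulls
    three clues out of a URL: the tool (from the scheme), the kind of
    publisher (from the last labels of the host name), and whether the
    path looks like a personal page. The lookup tables, the function name
    describe_address and the example.* hosts are assumptions made for this
    example only. The '~username' test reflects a common convention on
    Unix web servers, not a rule, so treat the output as a hint.

```python
from urllib.parse import urlparse

# Hypothetical rules of thumb; real addresses are messier than this.
TOOL_BY_SCHEME = {
    "http": "web",
    "https": "web",
    "ftp": "ftp archive",
    "gopher": "gopher",
    "telnet": "telnet database",
}

PUBLISHER_BY_LABEL = {
    "com": "commercial",
    "co": "commercial",
    "gov": "government",
    "edu": "educational",
    "ac": "educational",
    "org": "organization",
    "net": "network provider",
}


def describe_address(url):
    """Guess the tool, publisher type and personal-page status of a URL."""
    parts = urlparse(url)
    tool = TOOL_BY_SCHEME.get(parts.scheme, "unknown")

    # Look only at the last two host labels, so that both "example.gov"
    # and "example.com.au" style names are handled.
    labels = (parts.hostname or "").split(".")
    publisher = "unknown"
    for label in reversed(labels[-2:]):
        if label in PUBLISHER_BY_LABEL:
            publisher = PUBLISHER_BY_LABEL[label]
            break

    # Many Unix web servers publish user home pages under /~username/.
    personal = "/~" in parts.path
    return {"tool": tool, "publisher": publisher, "personal": personal}


if __name__ == "__main__":
    for address in (
        "http://www.example.com.au/~jsmith/index.html",
        "ftp://ftp.example.gov/pub/faq.txt",
    ):
        print(address, describe_address(address))
```

    A personal page hosted on a commercial provider shows up here as
    both "commercial" and "personal", which is itself a useful reminder
    that the address describes the host, not the author.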

    Your understanding of the relative qualities of information affects both
    the search process and your analysis of its value. This framework is
    very valuable when interacting with the Internet and cuts through much
    of the chaos which is the Internet. As I mentioned, 
    http://cn.net.au/training/  discusses this further.
    ___________________________________________________

 32. More on the Commercial Information Sphere

    The commercial information sphere existed in the 1970s and earlier. It
    is far more developed, far better organized, far better funded, and
    almost always far more valuable and more expensive than almost every
    other research resource.

    Commercial information is arranged reasonably uniformly in large
    databases of full-text or bibliographic information. Some databases are
    small, single source documents, while others are huge unfocused
    collections of resources.

    Most directories and journals can be made into a database, but
    single-source databases do not enjoy much financial success (except in
    a local market, as with newspapers). To overcome this difficulty, single
    sources are grouped together into larger collections of databases on a
    particular topic. These larger database groups become the primary tool
    for commercial research.

    Developing these databases requires a range of skills and expertise.
    Sometimes this means abstracting, interpreting, and, as with some
    Lexis-Nexis databases, even expert legal interpretation.
    Sometimes this is accomplished by large database developers with a range
    of databases in their portfolio. Sometimes this is accomplished
    privately.

    The marketing and consumer billing of such databases is then provided
    by a relatively small collection of very large database marketers. As an
    indication of the size of this market, Knight-Ridder sold Dialog &
    Datastar for a figure approaching half a billion dollars!

    Thus, we have an industry consisting of a wide collection of players,
    each improving and developing the information from individual
    periodicals, journals, news items, etc... All very confusing for the end
    user, of course.

    This is elegantly illustrated by the database descriptions for
    Lexis-Nexis databases (They prefer the term libraries. See
    http://www.lexis-nexis.com/lncc/sources/libcont/aust.html as an
    example).

    Luckily, there are actually very few large databases in existence. Many
    single sources exist in different commercial databases. The combinations
    are not endless, but they most certainly are difficult to understand.
    Further, different databases sometimes include different information
    from the same single-source. One database may include just abstracts,
    another may have fulltext, chemical indexing and more.

    Most researchers are unfamiliar with what exactly is being searched.

    This state of affairs is not unproductive. Searching a 'database about
    Australia', is uncomplicated. You receive information about whatever in
    Australia. It is simple, informative and incomplete.

    This system gives rise to great customer loyalty to database marketers,
    brought on by ignorance and obfuscation in the quest for simplicity.
    Unfortunately, I am hard pressed to compare prices, let alone describe
    the differences between information products. Community Networking
    currently toils at this issue, and hopefully we will have something more
    next month. Our database of Commercial Database Descriptions may help -
    see http://cn.net.au

    This system has distributed information for several decades. It is both
    sophisticated and quite difficult. You will need to become experienced
    with inverted indexes, search techniques (Boolean, truncation,
    proximity, field limits ...), and properly phrasing the question in a
    way which will be answered by a database search. I have always found the
    value of a database search directly proportional to the length of the
    query.
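
    To make these terms concrete, here is a minimal sketch of an inverted
    index supporting Boolean combination, truncation and proximity
    searching. It is a toy written for this explanation, assuming naive
    whitespace tokenizing and three made-up records; it is not how any
    particular commercial database is implemented, and the function names
    (build_index, lookup, near) are hypothetical. Field limits are left
    out to keep it short.

```python
from collections import defaultdict


def build_index(docs):
    """Inverted index: word -> {doc_id: [word positions]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        # Naive tokenizing: lowercase and split on whitespace only.
        for pos, word in enumerate(text.lower().split()):
            index[word].setdefault(doc_id, []).append(pos)
    return index


def lookup(index, term):
    """Return the set of doc ids for a term; a trailing '*' truncates."""
    term = term.lower()
    if term.endswith("*"):
        stem = term[:-1]
        hits = set()
        for word, postings in index.items():
            if word.startswith(stem):
                hits.update(postings)  # adds the doc ids (dict keys)
        return hits
    return set(index.get(term, {}))


def near(index, a, b, distance):
    """Doc ids where words a and b occur within `distance` words."""
    pos_a = index.get(a.lower(), {})
    pos_b = index.get(b.lower(), {})
    hits = set()
    for doc_id in pos_a.keys() & pos_b.keys():
        if any(abs(i - j) <= distance
               for i in pos_a[doc_id] for j in pos_b[doc_id]):
            hits.add(doc_id)
    return hits


if __name__ == "__main__":
    # Made-up records, for illustration only.
    docs = {
        1: "copyright law in Australia",
        2: "Australian book prices and copyright",
        3: "book publishing law",
    }
    index = build_index(docs)
    print(lookup(index, "copyright") & lookup(index, "law"))  # AND
    print(lookup(index, "copyright") | lookup(index, "law"))  # OR
    print(lookup(index, "book") - lookup(index, "copyright"))  # NOT
    print(lookup(index, "austral*"))                           # truncation
    print(near(index, "copyright", "law", 1))                  # proximity
```

    Boolean AND, OR and NOT map onto set intersection, union and
    difference, which is why each extra term in a query narrows or widens
    the result so predictably.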

    If you are not fully skilled at research, you will take longer, pay
    more, and either locate far more information than necessary or unwisely
    discard more than you should.

    These skills are very different from searching Altavista and Webcrawler.

    Doing your own research offers an opportunity to more closely influence
    the research process. Sometimes only you understand the topic and
    sometimes you can more quickly discard unimportant details. Certainly it
    is becoming simpler to undertake some of this work.

    Many of the commercial databases are also available in a CD format.
    There are substantial subscription costs which limit their availability
    to large research institutions and libraries, though individual
    databases can be found in bookstores (I believe world books in print
    costs AU$5000+). Provided you can find casual access, it will cost you
    far less. Keep an eye on the age, though. Sometimes (and only sometimes)
    online information is more recent.
 
    The decision between undertaking research on your own or seeking
    external help is really a decision based on your research expertise,
    your budget, your access to information, your time, and the importance
    of finding all the information available. It also depends on your access
    to some decent research assistance. That is your decision.

    What I do know is that a newcomer to the commercial information sphere
    will seriously underestimate the difficulty involved in searching, and
    underestimate both the cost of research and the cost of research
    assistance. Keep in mind this same system serves the needs of large
    commercial conglomerates, professional legal research, and well financed
    government studies. The commercial information sphere contains far more
    valuable information than you need. Often the Internet is just an
    interesting sneeze in comparison.

  #  Article: The Gale Directory of Databases (bi-annual in two volumes)
    includes a factual article as a foreword, which follows the development
    of this industry.
  #  Full text databases  - by Carol Tenopir and Jung Soon Ro   
    ___________________________________________________

 33. More on the Information Service Industry

    Private Detectives, Professional Database Researchers, Library
    Researchers, Legal Researchers, Commercial Database Producers,
    Commercial Database Marketers, Magazines, News Organizations, Libraries:
    this is a big industry. Information Research is just a process which
    links together those seeking information with those who provide it.

 __ 33.1 judging information value

    Information has value. It also has other qualities which will assist you
    to judge the value of information you may consider buying.

    Accuracy: the factual nature of the information presented. If the
    statistics purport to show a particular trend - how large is the margin
    of error? How large is the sample size? How likely is it that factual
    errors crept in during their development? The measurement of statistical
    error is now a refined science in some fields. A statistical result can
    be inaccurate when the sample size is too small, when the margin of
    error is too large, when the sample collection procedure is incorrect,
    or in a number of other situations.
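
    For a feel of how sample size drives the margin of error, the sketch
    below uses the standard textbook approximation for a surveyed
    proportion, z * sqrt(p * (1 - p) / n). It assumes a simple random
    sample and roughly 95% confidence (z = 1.96); real surveys with
    clustered or weighted samples need more care. The function name and
    the sample sizes are chosen only for illustration.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a proportion p from n responses.

    Assumes a simple random sample and the normal approximation;
    z = 1.96 corresponds to roughly 95% confidence.
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    return z * math.sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    # p = 0.5 is the worst case (widest margin) for any sample size.
    for n in (100, 1000, 10000):
        moe = margin_of_error(0.5, n)
        print(f"n = {n:>5}: +/- {moe * 100:.1f} percentage points")
```

    With these assumptions, a 50% result from 100 respondents carries a
    margin of roughly ten percentage points either way, while 1000
    respondents bring it down to about three. A "trend" smaller than the
    margin may be nothing at all.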

    Reliability: the support for trusting the solutions, both from
    additional resources and from being able to duplicate the conclusions.
    This includes the reputation of the researchers. No matter how
    inaccurate and biased you may believe certain facts to be, successful
    independent support of a suggested fact does improve its value. If facts
    cannot be duplicated, as with cold fusion, they are of less value.

    Bias: conscious or subconscious influences which affect information.
    Bias can occur in collection, preparation and presentation of
    information. Most information you find will be tainted. Secondary
    information is deeply affected. Statistics are not necessarily less
    biased.

    We counter bias in several ways. Firstly, we try to be aware of bias.
    Where is bias likely? In which direction would the bias push the
    information? Secondly, we try to collect information which has different
    bias. This is why research based solely on government research, no
    matter how accurate and reliable, is less valuable. Often information
    from different countries can counter bias. Thirdly, we need to accept
    bias is likely to exist. This is why primary sources are often more
    valuable than secondary sources. This is why tertiary sources, like
    experts, are likely to be very biased.

    Age: The date information was created or compiled will feature
    prominently in the value of information. A given date sometimes means
    the date the information was created, and sometimes the date it was
    compiled. How old is a book compiled in 1995, which took the author 10
    years to finish? I find statistical forecasts often prominently display
    recent compilation dates but still use old census data or the like to
    draw their conclusions. Information on the Internet typically
    has no date.

    Purpose: purpose merits further discussion. When you are uncertain about
    potential bias, you can look for reasons to distrust the information
    instead. Suspicion is not equivalent to bias, but it can be thought
    provoking. Privately, I have heard repeated rumours that important
    national statistics have been fudged in different countries. A
    government research report investigating the price of books in Australia
    would have a political purpose, a purpose which provides the climate for
    some potentially significant bias. A tell-all book by industry experts
    often includes a tremendous quality of insider experience difficult to
    find elsewhere. While there may be a purpose of self-aggrandizement, the
    purpose is less a climate for significant bias. Medical research has
    perhaps the greatest climate for significant bias, and this suggests the
    greatest standard of proof and external, reliable support.

    This explanation of accuracy, reliability, bias, age and purpose is very
    important in research. This is what leads us to an appraisal of value.
    For years, the tobacco industry funded 'independent' research finding
    smoking minimally harmful to health. It now seems likely there were
    errors of both accuracy and bias. Certainly, purpose was in doubt. As
    other studies showed smoking is harmful, we can also say this
    research lacked reliability. In business and the Internet, research is
    perpetually suspect because it also ages so very quickly.

    Once you are acclimatized to these elements, you begin to see potential
    for error in a whole range of information. Real-estate association
    figures, expert opinions, toothpaste advertisements and national GDP
    figures all occasionally display some degree of warping and
    manipulation, clouding the truth. The solution is awareness, comparison
    and careful analysis. As a personal aside, this is part of the reason
    for my dislike of market research: it is often taken far more
    seriously than warranted and means far less than is suggested.
    ___________________________________________________

 34. Emerging Trends in the information sphere

    I will outline three emerging trends whose impact is not fully
    understood. Firstly, for the past few years, individual database
    owners/maintainers have been flirting with the idea of making paid
    access available through the Internet, rather than the existing system
    of allowing database marketing firms to promote and market their
    databases. I have heard rumours most database producers earn up to 30%
    of retail price when delivered through database marketing firms. The
    Internet is not a commercially viable alternative...yet, but some have
    emerged with alternative funding despite this (Library of Congress,
    ERIC, see section 6.2). Others are creeping in around the edges by
    offering subscribers access at a much reduced flat annual fee (Computer
    Select at one time). I expect to see much more of this once a meaningful
    way to charge by the page emerges - which despite the hype appears to be
    some time away.

    A second trend is Internet publishing itself. Gradually, the information
    is getting easier to locate (don't laugh please - it's undignified). We
    are also getting better at using the Internet as a tool to disseminate
    information. Emerging from these efforts are the very visible, if
    perhaps short-lived, search engines. Other efforts, like archives of
    FAQs, archives of guidebooks, applying the Dewey Decimal system to the
    Internet, specialist directories, specialist search engines and more,
    ensure this will be a lively field for several years to come. As it
    gets easier to locate the good information, perhaps the
    lines between commercial quality and Internet quality will begin to
    merge. I have seen some promising plans for raising the quality of
    Internet information.

    Thirdly, there is this very interesting prospect of paying for
    information by the page through the Internet - and viewing the results
    in a web page immediately. There are still many technical hurdles, but
    certain elements are already appearing, including ventures like
    DialogWeb. Much more lies in the future. This step may prove
    profitable for ATM vendors and owners of Internet cafes, pubs and
    kiosks. It may also herald a dramatic drop in the cost of information.
    ___________________________________________________

 35. Education and Training in Professional Research

    Practice, Guidance and Facts are required to become better at research.
    None of these is particularly hard to get; it just takes time and
    effort to get better. Like art, professional research is a lifetime
    study, made more complicated by a moving target.

 __ 35.1 Facts

    Facts on professional research are relatively easy to find. Making some
    coherent sense of them takes practice. You will want to learn of each
    tool in your field and its relative strengths, weaknesses and
    qualities. You will also want to learn about the technology supporting
    this industry, secondary experience on the skills you will need to
    learn, and some understanding of clear-thinking and statistical
    comprehension. Research as a business may also interest you.

    Technology:
  #  Full text databases  - by Carol Tenopir and Jung Soon Ro
    Research Skills:
  #  http://cn.net.au
    Research as a Business:
  #  _The Information Broker's Handbook_
     by Sue Rugge and Alfred Glossbrenner
     (updated occasionally, so seek the latest edition)
  #  _Find It Fast_
     by Robert I. Berkman

 __ 35.2 Practice

    Almost all University Libraries make an assortment of Research CD-Roms
    available to their patrons. Most University Libraries, for a small sum,
    will issue members of the public with a library card and permission to
    use these databases. Practice on these as they are free, relatively
    current, and provide instant feedback. There is also no time pressure
    (things change when you are being charged $3 a minute + download
    costs). The commercial databases which have migrated to the Internet
    (LOCIS & ERIC) are atypical or overly simplified. Beyond this, subscribe
    to a commercial database provider. They will also begin to send you
    ample resources to further educate you in the art of database research.

    Interviewing both primary resources (those involved) and secondary
    resources can become an elegant and quick way to learn something, and
    also a rapid way to get advice on further uncommon resources. There is a
    definite skill to learn here which you will not get out of a book -
    though there are books that will help you learn this skill. I will see
    if I can find some useful resources on this topic.

 __ 35.3 Guidance

    If you read and practice without guidance, you will become proficient,
    but incomplete as a professional researcher. For starters, you will have
    picked up inefficient habits along the way. More importantly, you will
    be set apart from other researchers with few ways of learning of new
    resources/new techniques/new concepts. This is a particular problem
    among professional researchers who are not fortunate enough to be
    librarians or work closely with other researchers - there are few
    opportunities to share and discuss professional research with your
    peers. Besides seeking these opportunities, you may wish to consider:
  #  InfoPro - a private mailing list devoted to discussing professional 
     research and detective work.
  #  There is a professional research periodical printed in Texas.
  #  Professional Associations in the World.
  #  Periodically read books by other authors on Research Techniques.
    ___________________________________________________

 36. Question and Answer Section

 __ 36.1 How do I find information on the Internet?

    Basically, you need to remember that a search for information on the
    Internet is not different from a standard information search in
    process. You still need to start by outlining carefully just what you
    are hoping to locate. Secondly, you need to be aware of the
    peculiarities of the Internet as a researchable resource (or rather a
    collection of resources). If you expect instant delivery of exactly what
    you require, free, then you need a reality check (and I am sure you will
    get one as soon as you log in!). Sadly, the printed media tends to forget
    this, but that is another story.

    As with all resources, the more familiar you are with a given resource,
    the more efficiently you will work. Get to know the Internet for a time
    first. Understand how it works. Then re-adjust your expectations and
    file it as just another few resources which may be preferable in certain
    circumstances.
    ___________________________________________________

 37. Acknowledgments

    I would like to thank my past clients, the Western Australians I have
    trained and all you internauts who will shortly inundate me with endless
    snippets of wisdom to be included here. Your help is greatly
    appreciated.
    ___________________________________________________
    Copyright (c) 1998 by David Novak, all rights reserved.
    This FAQ may be posted to any USENET newsgroup, on-line service,
    website, or BBS as long as it is posted unaltered in its entirety
    including this copyright statement. This FAQ may not be included in
    commercial collections or compilations without express permission from
    the author. Please post permission requests to david@cn.net.au
    -----------------------------------
    David Novak - david@cn.net.au
