HKM- Human Knowledge Maps
Some zoom views of its anatomy
Classified 001, 11 June 2002
Author: Juan Chamero, CEO Intag
HKM- Clones Network and Coopbots tasks
We may imagine the Webspace
hosting a network of HKM (Human Knowledge Map) clones in different
evolutionary states, each described by [Lg-abc, t, URL, U(type, M)], where:
- Language/s: for instance En
(English) or En/Sp (English – Spanish)
- abc: a given sequential
number of the clone of an original HKM Version
- t: traffic measured in some
adequate unit, for example, giga visits
- URL: the Website locator of
the site that hosts one associated HKM clone
- U(type, M): the user market,
defined by market type and by mass M, measured in mega users
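The state descriptor above can be read as a small record. The following is only a minimal sketch of one possible representation: the class and field names, the three-digit formatting of "abc", and the example values are assumptions made for illustration and do not come from the HKM design.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class UserMarket:
    """U(type, M): market type and mass in mega users."""
    market_type: str          # hypothetical label, e.g. "academic"
    mass_mega_users: float    # M, in mega users


@dataclass(frozen=True)
class CloneState:
    """One clone state [Lg-abc, t, URL, U(type, M)]."""
    languages: Tuple[str, ...]    # Lg, e.g. ("En",) or ("En", "Sp")
    clone_number: int             # abc, sequential number of the clone
    traffic_giga_visits: float    # t, in giga visits
    url: str                      # site hosting the clone
    market: UserMarket

    def label(self) -> str:
        # Three-digit padding is an assumption suggested by "abc".
        lg = "/".join(self.languages)
        return (
            f"[{lg}-{self.clone_number:03d}, {self.traffic_giga_visits}, "
            f"{self.url}, U({self.market.market_type}, "
            f"{self.market.mass_mega_users})]"
        )


example = CloneState(
    languages=("En", "Sp"),
    clone_number=7,
    traffic_giga_visits=1.5,
    url="http://example.org/hkm",   # placeholder URL
    market=UserMarket("academic", 2.0),
)
print(example.label())
```

A frozen record fits the idea that each descriptor is a snapshot of one evolutionary state; a later state would be a new record rather than a mutation of the old one.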
We may also imagine a set of
coopbots exchanging strategic information about the clones’ evolution,
namely: matchmaking efficiency, dropouts, database growth rates,
search profiles, etc.
Anatomy of a HKM Clone
Let’s take a look at one of the
clone hosts. The clone works in matchmaking mode with users. It consists of
nearly 500,000 human-made, agent-aided briefs pointing to an equal number of
websites, considered either Authorities or Hubs: a necessary and sufficient
basic approximation of Human Knowledge at a given moment. At the “core” of
each clone we have:
- Small Green Oval: A set of
nearly 250 trees, one for each Major Subject of Human Knowledge. A tree
is the logical tree of a classical Major Subject program, for instance
Microbiology;
- Large Green Oval: A set of
Manuals, one for each Major Subject (MS). Each will have from 60 to 120
pages of text, images and essential hyperlinks, totaling a Virtual
Encyclopedia of nearly 20,000 pages, equivalent to a 40-volume collection;
- Blue Oval: The Thesaurus,
the whole set of keywords for the entire collection, from 400,000 to
600,000 units. That means a full and evolving Thesaurus for each human
language, enriched over time in proportion to user traffic;
- Heavy Green Small Oval: The
whole set of threads for the whole collection. A thread is a string
of keywords that has a special meaning within an MS without being a subject:
a set of connected concepts that may become a subject if supported by high
and/or consistent traffic.
After browsing the clone
internally for a given “triad” [k, s, th] (keyword, subject, thread; the
first mandatory, the second and third optional), the user may decide whether
or not to visit a given Website (yellow oval).
The blue crown is built via
user-clone interactions, for instance from orthographically correct keywords
not present in the clone’s Thesaurus. For each of these “actually
non-existent” keywords the system records statistics and accordingly sends
robots to the Web space to look for information and global popularity
statistics, in order to suggest keyword updates and/or new Authorities-Hubs
for the next evolutionary step. For more details, see how the Expert System
works.
The aquamarine external crown consists of
complementary URLs, with briefs shown as the search engines present them.
These URLs act as bait for users. If they are browsed and/or selected often
enough, they are considered potential candidates to become part of the clone.
Robots guided by the clone’s administrators choose these complementary URLs.
The algorithm that guides the robots in this intelligent pre-selection task
must take into account popularity, age, and Website structural
parameters.
Major Subject’s Anatomy
We depict below in more detail
how a Major Subject is organized within the host. Each MS comprises on
average 2,500 briefs – I-URLs, 1,000 threads and 2,500 words (the matching
figure of 2,500 is a mere coincidence). The program sector of the database
holds the subject’s tree and the Manual. We also show the suggested I-URLs
(in fact we start by suggesting only raw URLs instead of I-URLs) used to bait
users, and the suggested keywords sector, enriched via user interactions.
Tree Evolution
The pieces of knowledge are
logically presented as trees, complemented by a set of Thesauri and a
set of Threads. However, the tree is the pillar of the knowledge at a given
evolutionary state. We started the construction of the first version of the
map with the tree, once approved by the academic staff for each MS. The
initial trees must be considered only the best approximations for a given set
of MSs. Each user market is able to make the tree evolve to fit its own
knowledge and information needs.
We depict in the figure a thread
within a given MS, for instance DNA sequencing in Biology. The user-clone
interactions, matches and mismatches, teach us many things: for instance,
regular use of the first branch, none of the second, heavy use of the third
and rather weak use of the fourth. One singularity could occur at the third
level (red) of the fourth branch, a sub-sub-subject that perhaps deserves to
be considered a higher-level subject in its own right.
All these statistics and
singularities are factors to make the trees and Thesaurus evolve, as we will
see when studying the Expert System that makes the clones evolve.
How the HKM interface looks to users
The user queries the Expert
System that governs the clone via a triad [k, s, th]. The first variable is
mandatory, while the other two are either optional or set by default.
Once the query is issued, the Expert System (XS) extracts from its Briefs
– I-URLs database the briefs – I-URLs that match best: far fewer than a
classical search engine query returns, for instance from 10 to 40 instead of
thousands to millions. The list is sorted by popularity – presence.
Popularity is the number of real clicks, that is, accesses to the URL
described in the brief, while presence measures how many times the brief
was presented under any circumstances.
The outcome of the algorithm is highly
influenced by brief downloads, that is, by how many times
the brief was loaded into a user’s cart.
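The text does not give the ranking formula, so the following is only an illustrative sketch of how popularity, presence and cart loads might be combined. It assumes, without support from the source, that "popularity – presence" can be read as a clicks-per-presentation ratio and that cart loads count as weighted extra clicks. The names BriefStats, score, rank and the weight of 2.0 are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BriefStats:
    url: str
    clicks: int          # popularity: real accesses to the URL
    presentations: int   # presence: times the brief was shown
    cart_loads: int      # times the brief was loaded into a cart


def score(brief: BriefStats, cart_weight: float = 2.0) -> float:
    """Assumed score: weighted interactions per presentation."""
    if brief.presentations == 0:
        return 0.0  # never shown, so nothing to measure yet
    weighted = brief.clicks + cart_weight * brief.cart_loads
    return weighted / brief.presentations


def rank(briefs: List[BriefStats], limit: int = 40) -> List[BriefStats]:
    """Return at most `limit` briefs, best score first."""
    return sorted(briefs, key=score, reverse=True)[:limit]


sample = [
    BriefStats("http://example.org/a", clicks=50, presentations=100, cart_loads=0),
    BriefStats("http://example.org/b", clicks=10, presentations=100, cart_loads=30),
    BriefStats("http://example.org/c", clicks=0, presentations=0, cart_loads=0),
]
for brief in rank(sample):
    print(brief.url, round(score(brief), 2))
```

The default limit of 40 mirrors the upper end of the 10 to 40 results mentioned above; any other combination of the three counters would fit the description equally well.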
The brief – I-URL consists of
three parts, namely:
- The body or Human Edited Brief: a summary written following a strict
procedure that requires exhaustive global research of the Website reviewed,
sometimes aided by a set of utilities and Intelligent Agents;
- The Certification Data Tags, which account for the website architecture
and anatomy as described in the I-URL white paper;
- The statistics and living Brief – I-URL history: the ranking assigned by
the above-mentioned ranking algorithm, the traffic to date, the number of
clicks made to retrieve the Website located at the URL, the search depth,
that is, how many pages on average are browsed by each user (this computation
strongly depends on the type of access permitted: free user, captive user,
registered user, etc.), and declared satisfaction (this feature requires that
some users be invited to evaluate or make a choice in a poll).
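The three parts of a brief – I-URL can be pictured as one record with a nested history. This is a minimal sketch only: all class and field names are assumptions, the actual Certification Data Tags are defined in the I-URL white paper and are represented here as a plain dictionary, and "traffic" is assumed to mean a count of user visits.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class BriefHistory:
    """Statistics and living history of one brief (assumed fields)."""
    rank: int = 0                # position assigned by the ranking step
    traffic: int = 0             # user visits to date (assumed meaning)
    clicks: int = 0              # clicks made to retrieve the Website
    pages_browsed: int = 0       # total pages browsed across all visits
    satisfied_votes: int = 0     # positive answers in satisfaction polls
    poll_answers: int = 0        # all answers in satisfaction polls

    def search_depth(self) -> float:
        """Average pages browsed per user visit."""
        return self.pages_browsed / self.traffic if self.traffic else 0.0

    def declared_satisfaction(self) -> float:
        """Share of polled users who declared satisfaction."""
        if self.poll_answers == 0:
            return 0.0
        return self.satisfied_votes / self.poll_answers


@dataclass
class BriefIURL:
    url: str
    body: str                                    # Human Edited Brief
    certification_tags: Dict[str, str] = field(default_factory=dict)
    history: BriefHistory = field(default_factory=BriefHistory)


brief = BriefIURL(
    url="http://example.org/microbiology",       # placeholder URL
    body="Summary written by a human editor.",
    certification_tags={"structure": "hub"},     # hypothetical tag
    history=BriefHistory(traffic=4, pages_browsed=10,
                         satisfied_votes=3, poll_answers=4),
)
print(brief.history.search_depth(), brief.history.declared_satisfaction())
```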

The K – Process
By that we mean the keyword
process. One user run is identified by one IP and the time of the run. During
that time period the user queries the XS with a string of k’s, via a triad,
selecting a subject and/or thread or leaving them at their defaults.
For each k, the XS checks
existence and errors. If the keyword exists, the XS runs the query and the
user decides whether to browse the list totally or partially, whether or not
to save some briefs, whether or not to make some clicks and, afterwards,
whether or not to declare satisfaction upon request.
If the keyword does not exist,
the XS checks for different types of grammatical, syntactic and orthographic
errors. If it finds any, it classifies the error(s) and suggests new
keyword(s). If not, it initiates a complex process of searching the Web and
queues a demand for the potentially needed keyword. This process feeds the
blue region of our second figure.
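The branching described above can be sketched as follows. This is a simplified illustration, not the Expert System itself: only orthographic near-misses are modeled, using approximate string matching from Python's standard difflib module, while the grammatical and syntactic checks and the Web search are left out. The function name, the 0.8 similarity cutoff and the Counter used as demand queue are assumptions.

```python
import difflib
from collections import Counter
from typing import List, Set, Tuple


def process_keyword(
    k: str,
    thesaurus: Set[str],
    demand_queue: Counter,
    max_suggestions: int = 3,
) -> Tuple[str, List[str]]:
    """Classify one keyword as 'query', 'suggest' or 'queued'."""
    key = k.strip().lower()

    # Case 1: the keyword exists, so the query can be run.
    if key in thesaurus:
        return ("query", [key])

    # Case 2: likely spelling error, so suggest close Thesaurus entries.
    suggestions = difflib.get_close_matches(
        key, sorted(thesaurus), n=max_suggestions, cutoff=0.8
    )
    if suggestions:
        return ("suggest", suggestions)

    # Case 3: well-formed but unknown, so record demand for later review.
    demand_queue[key] += 1
    return ("queued", [])


thesaurus = {"microbiology", "genome", "sequencing"}
demand = Counter()
print(process_keyword("Genome", thesaurus, demand))
print(process_keyword("microbiolgy", thesaurus, demand))
print(process_keyword("proteomics", thesaurus, demand))
print(demand)
```

Counting repeated demand per keyword, rather than storing each request, gives the statistics that feed the blue region: keywords with the highest counts are the first candidates for robots to investigate.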
The XS tries its best to
discover coherent clusters of demand in order to learn as much as possible
about the users’ market behavior and, beyond that, to suggest new threads and
even new subjects to the HKM clone’s administrators. Our intuition is that
users think in clusters, keywords and sets of commonplace clusters, rather
than in trees. If this intuition is confirmed, much could be gained in the
matchmaking efficiency of the HKMs and of their clones.