The iBrarian project is housed at www.iBrarian.net
Welcome to the iBrarian project page. Here you will find links to
information on the iBrarian project:
- What iBrarian is.
- What iBrarian does.
- What makes iBrarian special.
- Published papers and other information on iBrarian.
- How to contribute to the project.
What is iBrarian:
iBrarian is an automated system that learns by reading documents.
Without any of the conventional training or "priming" that goes into
many other systems, iBrarian learns concepts from the documents it
reads, along with how those concepts relate to one another.
The resulting body of knowledge can be visualized, queried and
navigated, and a huge library of information is presented to the
user in this manner. For any concept, iBrarian can present the user
with appropriately related concepts, or with documents and articles
pertaining to that concept. Given any document, iBrarian uses its
understanding of that document to find related documents and
articles, or to present to the user the concepts that define that
document. Despite the size of its collection, iBrarian determines
this information in real time from the current state of its
ever-increasing body of knowledge and collection of information.
From a technical perspective, iBrarian consists of:
- A back end that
  - Fetches articles, papers and other information
  - Uses specialized parsers to analyze the information
  - Uses specialized learning mechanisms to integrate the new
    information into its body of knowledge
- A front end that
  - Produces data visualizations of iBrarian's knowledge
  - Provides visualized navigation through the body of knowledge
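As a rough illustration of the back-end and front-end split described above, here is a minimal sketch of an incremental pipeline. It is not iBrarian's actual implementation, which this page does not describe. Everything in it is an assumption made for illustration: the names `parse`, `KnowledgeBase`, `add_document`, `related_concepts` and `documents_for` are hypothetical, "concepts" are approximated by plain word tokens, and simple co-occurrence counting stands in for the learning mechanism.

```python
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical stopword list; a real parser would be far more specialized.
STOPWORDS = {"and", "the", "for", "with", "that", "this", "from"}


def parse(text):
    """Stand-in parser: treat lowercase word tokens as 'concepts'."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if len(w) > 2 and w not in STOPWORDS}


class KnowledgeBase:
    """Toy body of knowledge built from concept co-occurrence counts."""

    def __init__(self):
        # concept -> neighbouring concept -> number of shared documents
        self.cooccur = defaultdict(lambda: defaultdict(int))
        # concept -> ids of the documents that mention it
        self.docs_by_concept = defaultdict(set)

    def add_document(self, doc_id, text):
        """Integrate one document incrementally; no retraining pass is needed."""
        concepts = parse(text)
        for concept in concepts:
            self.docs_by_concept[concept].add(doc_id)
        for a, b in combinations(sorted(concepts), 2):
            self.cooccur[a][b] += 1
            self.cooccur[b][a] += 1

    def related_concepts(self, concept, k=3):
        """Return up to k concepts that most often co-occur with `concept`."""
        neighbours = self.cooccur.get(concept, {})
        # Sort by descending count, breaking ties alphabetically.
        return sorted(neighbours, key=lambda c: (-neighbours[c], c))[:k]

    def documents_for(self, concept):
        """Return the ids of documents pertaining to `concept`."""
        return sorted(self.docs_by_concept.get(concept, set()))


kb = KnowledgeBase()
kb.add_document("d1", "Neural networks learn representations")
kb.add_document("d2", "Neural networks and graph search")
kb.add_document("d3", "Graph search in libraries")
print(kb.related_concepts("neural", k=1))
print(kb.documents_for("graph"))
```

In this sketch each call to `add_document` updates the counts in place, so queries always reflect the current state of the collection. A front end could draw the `cooccur` structure as a graph for visual navigation.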
What does iBrarian do:
iBrarian, through its proactive knowledge presentation and data
visualization, helps you find the ideas you want and thereby the
information you want.
Ever eager to help, iBrarian is a tireless yet friendly reference
librarian with the entire card catalog committed to memory and a
thorough understanding of its content.
Published papers:
Several papers on iBrarian's much earlier predecessor systems, RUgle
and Navon, can be found here.
With major advances in all parts of the system, particularly in the
back-end mechanism, the system bears only a superficial resemblance to
these earlier prototypes. The algorithms and methods presented in
these papers do not represent the current structure, capabilities or
workings of iBrarian.
What makes iBrarian special:
- Running on a minimal set of desktop computers, iBrarian scales
well enough to compute on-the-fly relationships on a body of knowledge
that includes over 10,000,000 documents, thousands of news articles,
and other Web postings. iBrarian's document collection alone is
growing at a rate of 50,000 per day with no noticeable decline in
performance.
- iBrarian's fully automated system, using no human
initialization or intervention in the learning process, has been shown
to have at least a 50 percent direct mapping onto a collection of
purely human-determined concepts from the Library of Congress. The
mapping is probably much higher than this; we just haven't proven it
yet.
- iBrarian's methods are generic in nature and its learning
is entirely cross-domain.
- iBrarian's methods are instantaneous and incremental.
Each document read and added to iBrarian instantly makes it that much
more intelligent.
- iBrarian does not rely on PageRank or other crawler-based
algorithms.
Contributing or collaborating:
- As a student
  - iBrarian is a commercial-scale piece of software with
    numerous components and complex methods. Projects related to
    iBrarian require a high level of sophistication in the underlying
    science and the ability to join a team working on an ongoing
    project of this scale. Any student wishing to work on projects
    related to iBrarian should discuss their qualifications and
    expectations with me. All student projects are for-credit
    research. It is not possible to volunteer.
- As a research partner or sponsor from another research
  institution or government body
- As a corporate partner or sponsor