This chapter presents the background of the research and the rationale for
undertaking it. It then states the problem to be addressed, the objectives, the
methodology, the significance of the study, and the scope and limitations of
the thesis. Finally, it outlines the organization of the thesis.
Since the Internet was introduced in 1969, it has grown rapidly and is expected
to continue to do so. The World Wide Web (WWW) is a very popular and
interactive information resource that acts as a storehouse for images, text,
audio, video, and metadata. The amount of information available on the Web is
huge, and much of it is noisy. The noise comes from two major sources. First, a
typical Web page contains many pieces of information, e.g., the main content of
the page, advertisements, copyright notices, navigation links, and privacy
policies. Second, the Web has no quality control of information, i.e., anyone
can write anything they like. The popularity of the WWW depends largely on search
engines. A web search engine is a software system designed to search for
information on the World Wide Web. With search engines, anyone can now quickly
find helpful directions and tips, personal information, recipes, job vacancies,
pictures, organizational websites, and more. Search engines consist of four
discrete software components, three of which are described here. Crawler or
spider: the part of the search engine that combs through the pages on the
Internet and gathers information for the search engine. Indexer: a blender-like
program that dissects the web pages downloaded by the spiders. Database: the
store that holds all of the information a web crawler retrieves. Every time you
use a search engine, it is this database you are searching, not the live
Internet.
Figure 1: Architecture of a standard web search engine.
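
The division of work between the crawler, the indexer, and the database can be
illustrated with a minimal sketch. It assumes the crawler has already fetched
the pages into a dictionary. The function names `build_index` and `search`,
the sample pages, and the whitespace tokenizer are hypothetical and chosen only
for illustration; they are not taken from any real search engine.

```python
from collections import defaultdict


def build_index(pages):
    """Indexer: map each lowercase word to the set of page URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Look up the query in the stored index (the database), not the live pages."""
    words = query.lower().split()
    if not words:
        return set()
    # Keep only the pages that contain every word of the query.
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical pages, standing in for what a crawler would have downloaded.
pages = {
    "a.example/recipes": "injera recipes and cooking tips",
    "b.example/jobs": "vacancy listings and career tips",
}
index = build_index(pages)
```

Calling `search(index, "cooking tips")` consults only the prebuilt index, which
mirrors the point above that a query is answered from the database rather than
from the live Internet.
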
The vast amount of data makes searching with traditional search engines more
and more difficult, because they return a huge volume of results for a given
query, consisting of relevant as well as irrelevant data [23]. This not only
wastes the user's time but also leads to a data overload problem. As a result,
users are not satisfied when searching for information with traditional search
engines, and the problem of re-ranking search results has become one of the
main problems in the information retrieval (IR) field. Exactly what information
the user wants is unpredictable, so web page ranking algorithms are designed to
anticipate user requirements from various static (e.g., number of hyperlinks,
textual content) and dynamic (e.g., popularity) features.
Ranking algorithms that have been developed include PageRank, Weighted
PageRank, Page Content Rank, HITS, SALSA, Subspace HITS, and SimRank. Most of
these algorithms are based on either web structure mining or web content
mining. Web content mining extracts useful information from the content of web
documents, whereas web structure mining examines the links between references
and referents on the web [4]. In this thesis, Enhanced Weighted PageRank (EWPR)
is proposed for search engines; it builds on the Weighted PageRank algorithm
and takes a weight factor (WF) into account. The main purpose of the proposed
algorithm is to find more relevant information for the user's queries.
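
To make the idea of link-based ranking concrete, the following is a minimal
sketch of the standard PageRank iteration, the simplest of the structure-based
algorithms named above. It is not the proposed EWPR: the weight factor is not
defined in this section, so no attempt is made to reproduce it. The function
name `pagerank`, the damping value 0.85, the iteration count, and the
three-page graph are assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Standard PageRank by power iteration.

    links: dict mapping each page to the list of pages it links to.
    Returns a dict mapping each page to its rank; the ranks sum to 1.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without out-links is shared among all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        # Each page passes its rank on in equal shares to the pages it links to.
        for page, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank


# Hypothetical graph: A links to B and C, B links to C, C links to A.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(graph)
```

In this graph, page C receives links from both A and B and therefore ends up
with a higher rank than B, which is linked only from A. Weighted variants
replace the equal shares with shares that depend on the in-links and out-links
of the target pages.
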
There is one group of users, however, that has been largely ignored in the rush
to use the WWW: blind or visually impaired users, who have particular problems
in accessing web material. For sighted users, accessibility issues such as
response time, getting lost in cyberspace, or poor search engine retrieval
performance may be inconvenient, but they are not challenging problems. The
considerations are not the same for a blind or visually impaired user trying to
access the same information. The accessibility problems of a visually impaired
person cover all those of sighted users, plus a number that are unique to this
group. These include screen design, font size, color, patterns in screen
backgrounds that make the text difficult to read, and an excess of graphics.
These features, designed to appeal to the sighted user, may make Internet pages
inaccessible to a visually impaired user.
Statement of the Problem
According to the Central Statistical Agency (CSA) of Ethiopia, there are about
800,000 blind and more than 1.4 low-vision learners who are deprived of the
Internet. Visually impaired learners are left to learn through the conventional
“talk and Braille” method used by the teacher. They cannot use functionalities
such as the linking of web pages through the available browsers in their local
language to acquire information and knowledge from the web. Unlike their
sighted peers, visually impaired learners are also deprived of Internet
services such as sending and receiving e-mail.
Most importantly, the rapid development of the Internet and the exponential
growth in the amount of information have made it difficult for the visually
impaired to find relevant information through search engines. It has become
difficult for users in general to extract and filter the information that is
most relevant. Visually impaired users access the Internet through a voice
recognition mechanism, and search results are delivered to them by converting
the contents of the pages to speech and reading the results aloud. For this
reason, results from search engines should be re-ranked to give priority to
pages that are relevant and not full of graphic content.
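
The re-ranking idea stated above, giving priority to relevant pages that are
not dominated by graphics, can be sketched as follows. The scoring formula, the
`graphic_ratio` field, and the `penalty` weight are assumptions made purely for
illustration; they are not the enhanced algorithm developed in this thesis.

```python
from dataclasses import dataclass


@dataclass
class Result:
    url: str
    relevance: float      # hypothetical content-relevance score in [0, 1]
    graphic_ratio: float  # hypothetical share of the page that is graphics, in [0, 1]


def rerank(results, penalty=0.5):
    """Order results by relevance, discounted for graphic-heavy pages."""
    def score(result):
        return result.relevance * (1.0 - penalty * result.graphic_ratio)

    return sorted(results, key=score, reverse=True)


# Hypothetical search results in their original search engine order.
results = [
    Result("a.example", relevance=0.9, graphic_ratio=0.8),
    Result("b.example", relevance=0.8, graphic_ratio=0.1),
]
reranked = rerank(results)
```

With the default penalty, the slightly less relevant but mostly textual page
`b.example` moves ahead of the graphic-heavy `a.example`, because a page that
is mostly text is easier to read aloud. Setting `penalty=0.0` restores the
ordering by relevance alone.
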
Despite the capabilities that the World Wide Web (WWW) brings to everyone’s
daily activities, especially in online education, people with disabilities, and
visually impaired learners in particular, still face limitations in accessing
information. To avoid the social exclusion of people with disabilities, web
accessibility is a requirement for websites. Today, a large number of websites
fail to meet web accessibility requirements. Therefore, this research focuses
on accessibility through the development of an Amharic voice browser plug-in
that helps the visually impaired seek information via the Internet.
The Objectives of the Study
The objective of this thesis is to study usability and accessibility in order
to develop an Amharic voice-based browser plug-in as an assistive tool that
helps the visually impaired seek information via the Internet. More
importantly, it aims to enable visually impaired users to access relevant web
pages by enhancing the existing content-based relevancy algorithm, so that web
pages that are relevant and can easily be synthesized to voice are ranked
first. Developing this interactive Amharic voice recognizer browser plug-in and
integrating the enhanced algorithm with it will help visually impaired learners
access information through the Internet as part of accessing the virtual
learning environment. Specifically, the study aims:
✓ To explore the challenges faced by visually impaired learners in accessing
the virtual learning environment.
✓ To design and develop a browser plug-in that enables them to browse the
Internet by voice in Amharic and English.
✓ To enhance the existing content-based web page re-ranking algorithm so that
it re-ranks search engine results to favor pages that are accessible and
relevant to visually impaired users.
The browser plug-in is an accessible tool that allows them to navigate the
Internet with less complexity by using speech as an alternative medium for
input and output.
Specific Research Questions
The proposed research implicitly addresses the following exploratory questions:
✓ Whether Amharic voice-based web searching can replace the conventional
method of learning through “talk and Braille”.
✓ The advantage of using the Amharic voice-based web searching system in
translating documents, in terms of accuracy and precision, compared with the
existing manual Amharic-English dictionary.
The research questions for the proposed research are:
(i) How can the performance of the proposed enhanced algorithm be evaluated in
terms of page re-ranking results?
(ii) Does the enhanced content-based web page re-ranking algorithm using
relevancy provide relevant pages to visually impaired users?
(iii) Is Amharic voice-based web searching useful in closing the gap between
visually impaired and sighted web users?
Theoretical / Conceptual Framework
The theoretical framework for the study is built on a large body of experience
reports on web page re-ranking algorithms. The literature review found a
mismatch between the page ranking provided by search engines and the interests
of visually impaired users. Search engine companies rely on general page
ranking algorithms to rank web pages on the World Wide Web, and these rankings
are aimed at manually entered user queries rather than the interests of
specific users. First, most existing search engines focus on query entry via
the keyboard, and the results are not relevant for visually impaired users,
since these users receive the results as voice output. Second, even where
search engines offer voice search, the results and their ranking are the same
as for keyboard search, which is intended for sighted users.
Scope and Limitation of the Study
The scope of this research is to enhance the existing content-based relevancy
web page re-ranking algorithm and to develop an interactive Amharic voice-based
browser plug-in, integrated with the algorithm, as an assistive tool that helps
the visually impaired seek information via the Internet. The plug-in recognizes
speech in Amharic, processes the recognized tokens or commands, searches the
Google search engine, re-ranks the search results with the re-ranking
application, handles questions and answers through artificial intelligence, and
finally reads the search results aloud to visually impaired users.
Because Amharic is by nature a complicated language to spell, the accuracy of
translation is considered a limitation of the study. In addition, the plug-in
will be developed only for the Google Chrome browser. This research will
contribute a new way of re-ranking web pages for visually impaired Internet
users who search by voice in Amharic.
Significance of the Study
A page ranking algorithm has the potential to extract useful pages or documents
from the huge collection of data on the web and so fulfil the needs of web
users. The beneficiaries of this research are visually impaired and low-vision
Internet users. This research enables visually impaired users to access the
Internet by voice and to get relevant information from it using the proposed
page re-ranking algorithm. By providing relevant information through a new way
of re-ranking search engine results, it will narrow the gap between visually
impaired and sighted Internet users in the digital learning environment. Local
visually impaired Amharic speakers will be able to surf the Internet through
the interactive Amharic voice recognizer plug-in.
The proposed web page re-ranking algorithm can also be used by other
researchers for further research in other languages and for multilingual
voice-based searching.
Organization of the Thesis
The rest of the study is organized as follows. Chapter two reviews the related
literature on the basic concepts of page ranking: the different types of page
ranking algorithms, a comparison of existing algorithms using different
metrics, the advantages and disadvantages of existing algorithms, issues of web
page accessibility, existing speech recognition technologies, and related work
on page ranking algorithms. Chapter three describes the research methodology
and strategy, which aim to identify the problems visually impaired users face
in accessing the Internet. It begins by describing exploratory research
approaches, then describes primary data collection and analysis techniques as
the information acquisition method, and identifies the validity and reliability
requirements. Chapter four presents the results of the interviews and the
techniques and technologies required to design the prototype. Chapter five
addresses the architectural design of the proposed algorithm, the
implementation of the Amharic voice-enabled plug-in, and the integration of the
algorithm with the plug-in. Finally, chapter six presents the conclusions of
the research and suggestions for future research directions.