Tuesday, December 20, 2011

Web Browser Behavior Exploitation

Web browsing behavior data is an important source of customer feedback. This feedback can be divided into two groups, explicit and implicit. Some characteristics of each group are listed below.
Explicit feedback
  • specified keyword
  • selected & marked documents
  • rated items
Implicit feedback
  • natural interaction with the system
  • no extra cost
  • less accurate
  • can be combined with explicit feedback
  • Potential observable behavior
  • “modeling information content using observable behavior” (ASIST, 2003).
This feedback has been studied year after year. Within implicit feedback, web browsing behavior on the WWW has received wide attention. Researchers have studied web browsing behavior since 1995 (the WWW itself started in 1991).

How to collect data
Data can be collected through the WWW system via link hits (URL, content, history, etc.), page operations, page views per frame, keywords, mouse activity, button use, eye tracking (movement, gaze point), talk-aloud sessions recorded by video taping, diaries and interviews.

For articles, reading time can be analyzed with the following techniques:
  • information filtering based on user behavior analysis and best text match
  • a correlation between article and reading time
  • article vs reading time in graph data
  • Using tools such as gnus/emacs block
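The correlation between an article and its reading time can be checked with a plain Pearson coefficient. The sketch below is only an illustration: the function name `pearson` and the sample numbers (seconds spent reading, and a 1 to 5 rating given by the reader) are made up for this example and are not measurements from any study.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)

# Hypothetical data: reading time per article (seconds) and the
# explicit rating (1-5) the same reader gave to that article.
reading_seconds = [5, 12, 40, 65, 90]
ratings = [1, 2, 3, 4, 5]

r = pearson(reading_seconds, ratings)
print(f"correlation between reading time and rating: {r:.2f}")
```

A value close to 1 would suggest that reading time (implicit feedback) can stand in for a rating (explicit feedback); a value near 0 would suggest it cannot.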
As stated previously, data can be collected using the WWW system (server logs, client). The following factors should be considered for each collection point.
  • server: easy to collect
  • client: with special software installed, everything can be recorded
  • proxy: enabled by ticking the proxy option in the web browser (IE, Mozilla, Safari, Chrome, etc.); it can speed up access and log usage; data is collected transparently and no special software is needed; if the number of users grows large, the system may stall
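As a small illustration of the server side, the sketch below counts link hits per URL from access log lines. It assumes the log is in Common Log Format, which many web servers can write; the sample lines, the pattern `LOG_PATTERN` and the function `count_hits` are made up for this example.

```python
import re
from collections import Counter

# Common Log Format: host ident user [time] "request" status bytes
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) \S+'
)

def count_hits(lines):
    """Count successful (status 200) requests per URL."""
    hits = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match and match.group("status") == "200":
            hits[match.group("url")] += 1
    return hits

# Hypothetical log lines (addresses are from a documentation range).
sample_log = [
    '192.0.2.1 - - [20/Dec/2011:10:00:00 +0900] "GET /index.html HTTP/1.1" 200 1043',
    '192.0.2.2 - - [20/Dec/2011:10:00:05 +0900] "GET /news.html HTTP/1.1" 200 2210',
    '192.0.2.1 - - [20/Dec/2011:10:00:09 +0900] "GET /index.html HTTP/1.1" 200 1043',
    '192.0.2.3 - - [20/Dec/2011:10:00:12 +0900] "GET /missing.html HTTP/1.1" 404 0',
]

hits = count_hits(sample_log)
for url, count in hits.most_common():
    print(url, count)
```

Server logs like this only show what was requested. Mouse activity or reading time inside a page would need client-side recording.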
Besides the WWW system, data can also be obtained using other devices such as:
  • Eye tracking device → sensor/module to capture eye movement
  • mouse → user interaction capture module
  • display → screen capture module
How to use data
After the data are collected, the next step is to use them to achieve goals such as evaluating and analyzing the characteristics of users. Machine learning and data mining algorithms are used to exploit the data. These algorithms include, but are not limited to:
  • collaborative filtering
  • clustering
  • support vector machine
  • Bayesian network
  • neural network
  • association rule mining

These algorithms are the most commonly used by the community. For example, collaborative filtering can be described simply as in Fig 1.

Fig 1. Collaborative filtering: 5 should be recommended to A

As shown in Fig 1, if one group (group A) is highly similar to another group (group B), then what group B has done should be recommended to group A. With this method, it is possible to improve the service to group A and also gain benefits.
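The idea in Fig 1 can be sketched in a few lines. This is a minimal user-based version under my own assumptions: each user (or group) is represented by the set of items accessed, similarity is measured with the Jaccard index, and the names `jaccard`, `recommend` and the sample histories are hypothetical. Real systems usually use ratings and more neighbors.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(target, histories):
    """Recommend items seen by the most similar user but not by the target."""
    best_user, best_sim = None, 0.0
    for user, items in histories.items():
        if user == target:
            continue
        sim = jaccard(histories[target], items)
        if sim > best_sim:
            best_user, best_sim = user, sim
    if best_user is None:  # nobody shares any item with the target
        return set()
    return histories[best_user] - histories[target]

# Hypothetical access histories: A and B overlap on items 1-4.
histories = {
    "A": {1, 2, 3, 4},
    "B": {1, 2, 3, 4, 5},
    "C": {6, 7},
}

print(recommend("A", histories))  # item 5, taken from B's history
```

Here B is the most similar to A, so the one item B has that A lacks (item 5) is what gets recommended to A.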

Researchers have carried out concrete studies. Examples of previous, current and future studies are as follows:
  • There are many studies using web browsing data
  • To retrieve information, some filtering should be proposed
  • web page recommendation
  • mining navigation history for recommendation
  • histories of accessed pages are collected
For web search studies, the following techniques and factors are important:
  • implicit user modeling for personalized search (CIKM, 2005)
  • a tourist and a programmer may use “java” to search for different information
  • basic user actions are considered, for example:
  • submitting a keyword query
  • gaze position from mouse movement
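The “java” example can be illustrated with a toy re-ranking step. This is not the model from the CIKM paper; it is only a sketch assuming that each user has a profile of terms gathered from implicit feedback, and that results sharing more words with the profile should come first. The function `rerank`, the result titles and the profiles are all made up.

```python
def rerank(results, profile_terms):
    """Sort result titles by how many words they share with the user profile."""
    def score(result):
        words = set(result.lower().split())
        return len(words & profile_terms)
    return sorted(results, key=score, reverse=True)

# Hypothetical results for the query "java".
results = [
    "Java island travel guide and hotels",
    "Java programming language tutorial",
]

# Hypothetical profiles built from each user's browsing history.
programmer_profile = {"programming", "compiler", "tutorial"}
tourist_profile = {"travel", "hotels", "beach"}

print(rerank(results, programmer_profile)[0])
print(rerank(results, tourist_profile)[0])
```

The same query gives a different top result for each profile, which is the point of personalized search.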

Examples of those techniques are given in the following screenshots.
For picture data, tag recommendation is used, based on the following observations:
  • a large number of photos are taken and shared
  • on Flickr, users can upload photos and apply tags
  • selecting appropriate words
A picture or image should be accompanied by recommended keyword tags. Examples of keyword tag recommendation are shown in the last figures. The first figure is a football stadium called diamond stadium, so the tags are: diamond stadium, sport game, world cup, football. The second picture shows Tokyo Tower at night, so the tags should be: city, tokyo, trip, tower, night, Japan.
Tag:  diamond stadium, sport game, world cup, football

Tag: city, tokyo, trip, tower, night, Japan
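One simple way to recommend tags is to count which tags often appear together with the tags the user has already typed. The sketch below assumes a small collection of already tagged photos; the function `recommend_tags` and the sample tag sets are hypothetical and only loosely follow the two pictures above.

```python
from collections import Counter

def recommend_tags(seed_tags, tagged_photos, top_n=3):
    """Suggest tags that co-occur most often with the seed tags."""
    seeds = set(seed_tags)
    counts = Counter()
    for tags in tagged_photos:
        tags = set(tags)
        if seeds & tags:                 # photo shares at least one seed tag
            counts.update(tags - seeds)  # count its other tags
    return [tag for tag, _ in counts.most_common(top_n)]

# Hypothetical tagged photo collection.
tagged_photos = [
    {"tokyo", "tower", "night", "japan"},
    {"tokyo", "city", "trip", "japan"},
    {"tokyo", "tower", "japan"},
    {"football", "stadium", "world cup"},
]

suggested = recommend_tags(["tokyo"], tagged_photos, top_n=1)
print(suggested)
```

With these sample photos, "japan" appears with "tokyo" most often, so it is suggested first.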

By knowing the interests of users, a better service can be provided. A better service means better benefits. To achieve this goal, web browsing behavior is observed, analyzed and exploited using several algorithms.

Future studies can also be considered. Examples are:
  • What data should be collected
  • Many issues are open to be investigated

    by Bagus Tris Atmaja - 114D9818
    Written as homework for the Current Science and Technology class
    Graduate School of Science and Technology, Kumamoto University
    Japan 2011