Human life has become increasingly dependent on the internet, which is now used in almost everything. The number of websites and users is increasing steadily. With this noteworthy increase of data on the Internet, and because of its fast and disordered growth, web searching has become a tricky procedure for the majority of users, who feel confused and at times lost in overloaded data that keeps growing. E-business and web marketing are developing quickly, and the need to predict their customers' requirements is particularly evident. As a result, guessing users' interests has become important for improving usability.
What is Web mining?
Web mining is the application of data mining techniques to discover patterns from the World-Wide-Web. It uses automated methods to extract both structured and unstructured data from web pages, server logs, and link structures.
What are the applications of Web Mining?
The common forms of web content data are HTML pages, images, audio, and video. HTML is the main format, since it is the most popular form of web content data. Though rendering may differ from browser to browser, the basic layout and structure are the same everywhere. XML and dynamic server pages such as JSP and PHP are also forms of web content data.
On a web page, content is arranged according to HTML tags. Web pages usually have hyperlinks that connect the main page to its sub-pages. This is called inter-page structure information. In short, the relationships and links that describe the connections between web pages are web structure data.
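As a rough illustration of web structure data, the sketch below collects the hyperlinks from one page and stores them as edges of a small link graph. It uses only Python's standard library. The names `LinkExtractor` and `extract_links`, the sample HTML, and the example.com URLs are assumptions made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the targets of <a href="..."> tags found in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    """Return the absolute URLs that the given page links to."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical page content, used only for illustration.
SAMPLE_URL = "http://example.com/index.html"
SAMPLE_HTML = """<html><body>
<a href="/products.html">Products</a>
<a href="about.html">About</a>
<a name="top">An anchor without a link</a>
</body></html>"""

# Web structure data as an adjacency list: page -> pages it links to.
link_graph = {SAMPLE_URL: extract_links(SAMPLE_URL, SAMPLE_HTML)}
print(link_graph)
```

Repeating this for every page that is discovered would give a graph of the whole site, which is the kind of input that structure-based analysis works on.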
The main sources of web usage data are the web server and the application server. The data consists of logs collected by these two sources. Log files are created when a user or customer interacts with a web page.
Web usage mining, a subset of data mining, is the extraction of interesting data that is readily available and accessible in the huge ocean of pages on the Internet, formally known as the World Wide Web (WWW). As one application of data mining techniques, it helps analyze user activities on different web pages and track them over a period of time. Web usage mining can be divided into two major subcategories based on the web usage data.
The user logs are collected by the Web server. Typical data includes IP address, page reference, and access time.
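To make these fields concrete, here is a minimal sketch that pulls the IP address, page reference, and access time out of one log line and counts page visits. It assumes the server writes logs in the widely used Common Log Format, which the text above does not specify. The sample lines, `LOG_PATTERN`, and `parse_log_line` are made up for illustration.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed layout (Common Log Format):
# host ident user [time] "method page protocol" status size
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<page>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_log_line(line):
    """Return the IP address, page, and access time, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    return {
        "ip": match.group("ip"),
        "page": match.group("page"),
        "time": datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z"),
    }


# Hypothetical log lines, used only for illustration.
SAMPLE_LINES = [
    '192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.11 - - [10/Oct/2023:13:56:02 +0000] "GET /products.html HTTP/1.0" 200 512',
    '192.0.2.10 - - [10/Oct/2023:13:57:40 +0000] "GET /products.html HTTP/1.0" 200 512',
]

entries = [entry for entry in map(parse_log_line, SAMPLE_LINES) if entry is not None]

# A first usage statistic: how often each page was requested.
page_counts = Counter(entry["page"] for entry in entries)
print(page_counts.most_common())
```

Grouping the parsed entries by IP address and ordering them by access time would be a natural next step toward tracking one visitor's activity over a period of time.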
Commercial application servers have significant features to enable E-commerce applications to be built on top of them with little effort. A key feature is the ability to track various kinds of business events and log them in application server logs.
New kinds of events can also be defined in an application, and logging can be turned on for them, generating histories of these specially defined events. Many end applications require a combination of one or more of the techniques applied in the categories above.
The Future of web mining
To conclude, the web and its usage continue to grow. The past five years have seen the emergence of web mining as a rapidly growing area, due to the efforts of the research community as well as the various organizations that are practicing it.