August 10, 2017

# Overview of web mining

The Internet is a large source of data and information, and the data on the web is frequently accessed and changed. Extracting important, useful information from the World Wide Web is an application of data mining techniques.

Figure 1: Categories of web mining

Web mining is the use of data mining algorithms to explore web data and recover significant patterns in it. Information on the web can be obtained directly, from page contents and links, or indirectly, from access logs and other kinds of log files. Depending on which mining algorithms or techniques are applied, web mining can be divided into three main categories:

1. Web content mining: often described as text mining, this is generally the second step in web data mining. Content mining scans and mines the text, pictures, and graphs of a web page to determine the significance of its content.
2. Web structure mining: this technique identifies the relationships between web pages that are connected by shared information or by direct links. This organization of data can be discovered by applying database techniques to the web structure schema of the pages. These relationships let a search engine pull data relevant to a search query directly to the linking web page from the website on which the content resides.
3. Web usage mining: this technique collects access information for web pages. The usage data shows the paths that lead to the accessed pages, and the web server often gathers it automatically into access logs.
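To make the three categories concrete, here is a minimal sketch in Python using only the standard library. It is an illustration under stated assumptions, not a reference implementation: the pages in `PAGES`, the log lines in `LOG_LINES`, and the helper names `PageParser`, `mine_pages`, and `mine_usage` are all hypothetical, and the log lines are assumed to follow the Common Log Format. Term counts stand in for content mining, in-link counts for structure mining, and per-visitor page paths for usage mining.

```python
import re
from collections import Counter, defaultdict
from html.parser import HTMLParser


class PageParser(HTMLParser):
    """Collects words (content mining) and outgoing links (structure mining)."""

    def __init__(self):
        super().__init__()
        self.words = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Record the target of every <a href="..."> tag.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        # Split the text between tags into lowercase words.
        self.words.extend(re.findall(r"[a-z]+", data.lower()))


# Hypothetical site: page path -> HTML source.
PAGES = {
    "/home": '<h1>Web mining</h1><a href="/content">content</a> <a href="/usage">usage</a>',
    "/content": '<p>Text mining of web content</p><a href="/home">home</a>',
    "/usage": '<p>Access logs record web usage</p><a href="/home">home</a>',
}

# Hypothetical access-log lines, assumed to be in Common Log Format.
LOG_LINES = [
    '10.0.0.1 - - [10/Aug/2017:10:00:00 +0000] "GET /home HTTP/1.1" 200 512',
    '10.0.0.1 - - [10/Aug/2017:10:00:05 +0000] "GET /usage HTTP/1.1" 200 256',
    '10.0.0.2 - - [10/Aug/2017:10:01:00 +0000] "GET /home HTTP/1.1" 200 512',
]

# Captures client address, requested path, and status code.
LOG_PATTERN = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(?:GET|POST) (\S+) [^"]*" (\d{3})')


def mine_pages(pages):
    """Return per-page term counts and the number of links pointing at each page."""
    term_counts = {}
    in_links = Counter()
    for path, html in pages.items():
        parser = PageParser()
        parser.feed(html)
        term_counts[path] = Counter(parser.words)
        in_links.update(parser.links)
    return term_counts, in_links


def mine_usage(lines):
    """Return the ordered list of successfully requested paths per client address."""
    paths = defaultdict(list)
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match and match.group(3) == "200":
            paths[match.group(1)].append(match.group(2))
    return dict(paths)


if __name__ == "__main__":
    term_counts, in_links = mine_pages(PAGES)
    print(term_counts["/content"].most_common(3))
    print(in_links.most_common())
    print(mine_usage(LOG_LINES))
```

Real systems would replace the raw counts with richer models, such as weighted term vectors for content or link-analysis scores for structure, but the inputs stay the same: page text, hyperlinks, and server logs.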

## Applications of web mining

Electronic commerce: A significant application area of web mining is recommendation system design on e-commerce platforms. The challenge is to understand the needs of visitors and customers, which can help improve the quality of service for consumers.
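As a rough illustration of how usage data can feed a recommendation system, the sketch below counts which products are viewed together in the same session and suggests the most frequent companions. This is one simple technique (item co-occurrence) chosen for illustration; the session data and the function names `build_cooccurrence` and `recommend` are hypothetical.

```python
from collections import Counter
from itertools import permutations

# Hypothetical browsing sessions: the set of products each visitor viewed.
SESSIONS = [
    {"laptop", "mouse", "keyboard"},
    {"laptop", "mouse"},
    {"laptop", "monitor"},
    {"mouse", "keyboard"},
]


def build_cooccurrence(sessions):
    """Count how often each ordered pair of items appears in the same session."""
    counts = {}
    for session in sessions:
        for a, b in permutations(session, 2):
            counts.setdefault(a, Counter())[b] += 1
    return counts


def recommend(item, counts, k=2):
    """Return up to k items most often seen together with `item`.

    Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(counts.get(item, {}).items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:k]]


if __name__ == "__main__":
    counts = build_cooccurrence(SESSIONS)
    print(recommend("laptop", counts))
```

A production recommender would normalize these counts and combine them with other signals, but the principle of mining visitor behavior to infer needs is the same.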

E-learning: Web mining can be used to enhance e-learning environments. Applications of web mining to e-learning are usually based on web content mining and web usage mining. Machine learning techniques that build on recommendation system design can improve the experience of learners in web-based learning environments.

Cyber security: Web mining techniques are also used to protect users' confidential information against cyber crimes such as internet fraud, phishing websites, viruses, pornographic content, and cyber terrorism. Web content mining techniques can help reveal the identities of cyber criminals.

Digital marketing: Web mining techniques can help a web-enabled electronic business improve its marketing, customer support, and sales operations.

Digital libraries: Digital library services distribute valuable information around the world, eliminating the need to be physically present at different libraries in different parts of the world.

## Challenges in web mining

The following factors make effective data warehousing and data mining on the web difficult:

• The massive size of the web.
• The lack of a consistent structure for web documents.
• The dynamic nature of the information source.
• The diversity of usage and of the user community.

