Retrieval Using WWW Search Engines

Source: Shangpin China | Type: website encyclopedia | Time: July 7, 2014

The WWW, also known as the Web or the World Wide Web, is a hypertext-based information network. It was developed and named in 1989 by Tim Berners-Lee, a British scientist then working at CERN, the European particle physics laboratory, and it opened a new chapter for the Internet. Berners-Lee is therefore known as the father of the World Wide Web, and he won the world's first Millennium Technology Prize. The WWW and the Internet are not the same thing: the WWW is one of the services provided over the Internet. The WWW frees network users from having to work with tedious machine instructions. The exponentially growing body of text, images, and multimedia information on the network can be accessed intuitively and conveniently through browsers and hyperlinks. In addition, WWW search engines are an indispensable tool for collecting the information that users are interested in on the Internet.

A search engine is a platform that provides information search services on the Internet, and it is the most widely used network service tool. The search engines in common use today essentially all run on the WWW, so they can also be called WWW search engines. As network information reaches further into the lives of ordinary people, search engine technology has become a popular and key Internet technology, and competition in its research and development has never stopped. We can reach hundreds of millions of web pages with a click because thousands of search engines on the Internet find, fetch, store, and index network information and provide retrieval services. Search engines are moving toward specialization, localization, and an orientation to everyday life. According to how they operate, WWW search engines can be divided into three types: directory websites, full-text search engines, and meta search engines.

1. Directory Websites

The directory website is an early WWW information search tool. It works by collecting and sorting network information manually and presenting it for browsing under classified topics. Its labor cost is high and its technical content is relatively low. In essence it is not a true search engine, so it has received little attention in recent years. Almost all directory websites have since developed their own independent, new-generation search engines and have evolved into the common keyword-search form; examples include Sina, Sohu, and Yahoo China. Today it is hard to find traces of the original directory-browsing style, and only a few sites still retain the characteristics of classified website search. The best-known website directory is Yahoo's Chinese website directory, followed in order of appearance by Sohu, NetEase, Sina, and others; directories abroad include LookSmart and About. Directory websites have the following characteristics.

① Browsing network information through a tree directory is simple and easy to use. Information organized in a tree directory structure is strictly systematic and easy to extend. The directory incorporates human judgment, hides the complexity of the underlying network system from users, and can improve the accuracy of information and the quality of navigation.

② The resource classification directory is not detailed enough. Network information resources are so complex that it is difficult to define a comprehensive category system that covers all of them and can serve as the basis of the topic tree. To keep the topics usable and the structure clear, the category system cannot have too many categories. As a result, some specialized categories have no place in the tree, and a large number of Web pages are ignored because they are not included in the directory. As the Web grows, this problem will become more serious. Clustering and other automatic classification methods (including natural language processing and relevant-topic extraction) are still unsatisfactory, and the results of automatic classification can differ from those of manual classification.

③ Because of manual intervention, the maintenance workload is large, the amount of information is relatively small, and updates are not timely. To give users more information, a directory website therefore often forwards queries to other search engines that search the entire Web.

Today directory websites and full-text search engines have merged, and users can hardly tell them apart. For example, Yahoo used Google's search engine to provide page search, while Google used the Open Directory to provide classified queries, and the two search interfaces were almost the same.
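The tree-directory browsing described in the first characteristic can be illustrated with a small sketch. This is not how any particular directory site was implemented; the `Category` class, its method names, and the example category paths are assumptions made for illustration only.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Category:
    """One node of a hypothetical, manually curated topic tree."""

    name: str
    sites: list[str] = field(default_factory=list)
    children: dict[str, Category] = field(default_factory=dict)

    def add(self, path: list[str], url: str) -> None:
        """File a URL under the given category path, creating nodes as needed."""
        node = self
        for part in path:
            node = node.children.setdefault(part, Category(part))
        node.sites.append(url)

    def browse(self, path: list[str]) -> Category | None:
        """Walk down the tree; return None if the category does not exist."""
        node = self
        for part in path:
            node = node.children.get(part)
            if node is None:
                return None
        return node
```

A page filed under `["Computers", "Internet"]` is found by browsing that path, while browsing a path that no editor has created returns `None`, which mirrors the second characteristic: content that fits no existing category is simply not reachable through the directory.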

2. Full-Text Search Engines

A full-text search engine is what is called a true search engine. It differs from a website directory in that it no longer relies on manual collection and classification of information, but uses software to collect, index, and retrieve network information. The structure of a full-text search engine consists of four parts, three of which are described below.

(1) Collector. The collector, also called a searcher or web robot, is software that searches the network automatically; it is usually known as a "spider", crawler, or robot. The spider's only job is to roam the Web, finding and collecting information. It can crawl about 10 million pages a day and tries to collect new information of all types as quickly as possible. Because information on the Web changes very quickly, information that has already been collected must also be refreshed regularly to avoid dead and invalid links. There are two strategies for collecting information. First, start from a set of seed URLs (Uniform Resource Locators), follow the hyperlinks in the pages at those URLs, and recursively fetch information from the Web in breadth-first or depth-first order. The seed URLs are usually very popular sites containing many links, such as Yahoo's category pages. Second, provide an "Add URL" form that lets authors of Web content submit their addresses to the search engine. This method is often flooded with spam pages, and almost 95% of the addresses submitted through such a form are rejected. Search engines differ in their collection strategies, such as crawl frequency and crawl targets, and these differences lead to differences in the results and quality of each search engine.
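The first collection strategy, following hyperlinks outward from a set of seed URLs, can be sketched as a breadth-first traversal. To keep the example self-contained it walks a small in-memory link graph instead of fetching real pages; the `LINKS` table, the `fetch_links` callback, and the `max_pages` limit are all assumptions introduced for illustration.

```python
from collections import deque

# Hypothetical link graph standing in for the Web: page URL -> outgoing links.
LINKS = {
    "http://example.com/a": ["http://example.com/b", "http://example.com/c"],
    "http://example.com/b": ["http://example.com/d"],
    "http://example.com/c": ["http://example.com/d", "http://example.com/a"],
    "http://example.com/d": [],
}


def crawl_breadth_first(seeds, fetch_links, max_pages=100):
    """Visit pages level by level from the seeds, each page at most once."""
    queue = deque(seeds)
    seen = set(seeds)  # URLs already queued, so cycles do not loop forever
    order = []
    while queue and len(order) < max_pages:
        url = queue.popleft()  # oldest URL first gives breadth-first order
        order.append(url)
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order


visited = crawl_breadth_first(
    ["http://example.com/a"], lambda url: LINKS.get(url, [])
)
```

Replacing `queue.popleft()` with `queue.pop()` turns the queue into a stack and yields a depth-first traversal instead. A real collector would also need politeness delays, robots exclusion handling, and periodic re-fetching, none of which is shown here.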

(2) Indexer. The indexer analyzes the information gathered by the collector, indexes it automatically, converts each document into a form that is convenient for retrieval, and stores it in the index database; in other words, it builds an inverted file. Each index entry in the inverted file holds a set of pointers to the pages in which the entry appears. So that users can be given information about the retrieved documents, the index also contains a short description of each page, such as its creation date, size, title, subtitle, and summary.
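A minimal sketch of the inverted file may help here. It assumes whitespace-separated English text and integer page identifiers, and the function name `build_inverted_index` is hypothetical; real indexers also handle word segmentation, stemming, and the per-page descriptions mentioned above.

```python
from collections import defaultdict


def build_inverted_index(pages):
    """Map each term to the sorted list of page ids in which it appears."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for term in text.lower().split():
            index[term].add(page_id)
    # Sorted lists play the role of the "set of pointers" for each entry.
    return {term: sorted(ids) for term, ids in index.items()}


# Hypothetical sample pages, keyed by page id.
PAGES = {
    1: "web search engine",
    2: "web directory",
    3: "meta search engine",
}
INDEX = build_inverted_index(PAGES)
```

Looking up `INDEX["search"]` then returns the ids of the pages that contain the term without scanning any page text at query time, which is the point of building the inverted file in advance.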

(3) Retriever. The retriever, or retrieval software, quickly finds relevant documents in the index database according to the user's query, evaluates the relevance of each document to the query, sorts the results for output, and implements a user relevance feedback mechanism (that is, it can keep revising the retrieval strategy). The retriever is regarded as the most complex part of a search engine, because it must deal with the important problem of ranking search results. Researchers have found that users do not have the patience to browse tens of thousands of results and pay attention only to the first few pages. Simple ranking methods based on click-through rate and word frequency are clearly flawed.
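To see why ranking by word frequency alone is flawed, consider the following sketch of that naive method. The function name, the sample pages, and the tie-breaking rule are assumptions for illustration; this is the simple approach the text criticizes, not a recommended ranking algorithm.

```python
from collections import Counter


def rank_by_term_frequency(query, pages, top_k=10):
    """Score each page by how often the query terms occur in it, highest first."""
    terms = query.lower().split()
    scored = []
    for page_id, text in pages.items():
        counts = Counter(text.lower().split())
        score = sum(counts[term] for term in terms)
        if score > 0:
            scored.append((score, page_id))
    # Descending score, then ascending page id so ties have a fixed order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [page_id for _, page_id in scored[:top_k]]


# Hypothetical pages: p2 simply repeats the keyword many times.
SAMPLE_PAGES = {
    "p1": "search engine basics",
    "p2": "search search search cheap search",
    "p3": "directory of sites",
}
RANKED = rank_by_term_frequency("search engine", SAMPLE_PAGES)
```

In this sample the keyword-stuffed page `p2` scores 4 and outranks the genuinely relevant page `p1`, which scores 2. Because users look only at the first few pages of results, a ranking that is this easy to manipulate pushes useful pages out of view.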

3. Meta Search Engines

Meta search engines are also called multi-search engines. They do not maintain massive databases of their own. Instead, they submit the user's query to several search engines, sort the returned results, and then return them to the user. According to their search mechanism, they can be divided into parallel and serial types. A parallel meta search engine sends the query to every independent search engine at the same time and then presents the results to the user in a specific order. A serial meta search engine sends the query to one independent search engine first and sends it to the next search engine only after the first has returned its results.
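The difference between the parallel and serial mechanisms can be sketched as follows. The two engine functions are stand-ins that return fixed, made-up URLs rather than calling any real search service, and the merge rule (keep engine order, drop duplicates) is just one simple assumption about how results might be combined.

```python
from concurrent.futures import ThreadPoolExecutor


def engine_a(query):
    """Hypothetical search engine returning made-up result URLs."""
    return [f"a.example/{query}/1", f"shared.example/{query}"]


def engine_b(query):
    """A second hypothetical engine that overlaps with the first."""
    return [f"shared.example/{query}", f"b.example/{query}/1"]


def merge(result_lists):
    """Concatenate result lists in order, dropping duplicate URLs."""
    merged, seen = [], set()
    for results in result_lists:
        for url in results:
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


def parallel_meta_search(query, engines):
    """Send the query to every engine at the same time, then merge."""
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        # pool.map returns results in the order of `engines`.
        result_lists = list(pool.map(lambda engine: engine(query), engines))
    return merge(result_lists)


def serial_meta_search(query, engines):
    """Query one engine, wait for its results, then move on to the next."""
    return merge([engine(query) for engine in engines])
```

Both functions return the same merged list for the same engines. What differs is the waiting time: the parallel version takes roughly as long as the slowest engine, while the serial version takes the sum of all engine response times.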

Source Statement: This article is original or edited by Shangpin China's editors. If it needs to be reproduced, please indicate that it is from Shangpin China. The above contents (including pictures and words) are from the Internet. If there is any infringement, please contact us in time (010-60259772).

Copyright © 2024 Hengjiu Shangpin. All rights reserved.
