Where is the "big" of big data mining?

Source: Zeng Jianping | Type: website encyclopedia | Time: December 11, 2017
We used to talk about data mining; in the era of big data, we talk about big data mining. What exactly is "big" about big data mining? This article summarizes several answers, in the hope of offering some ways to think about the question.
Comments on any shortcomings are welcome.
 
1. Large data volume
How much data counts as "big"? Many people ask this when they start mining big data.
In practice, if the amount of data processed each day reaches the terabyte or petabyte level, it is worth considering a big data processing platform such as Hadoop or Spark. The advantages of these platforms only become apparent once the data volume is large enough.
When the data volume is small, reading and moving the data takes up a disproportionate share of the time, so the advantages of a big data platform do not show. Many applications adopt big data for its own sake and deploy Hadoop for only a few hundred megabytes. Equating big data with platforms such as Hadoop and Spark is therefore a limited view.
Of course, other factors may also matter when deciding whether to use a big data platform, such as the need to pool many low-performance machines, portability across heterogeneous software and hardware platforms, and the need to process large amounts of unstructured data.
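The volume rule of thumb in this section can be written down as a small decision helper. This is only a sketch: the function name `suggest_platform`, the 1 TB per day cutoff, the `other_factors` flag, and the returned labels are all assumptions made for illustration, not thresholds given by the article.

```python
TB = 1024 ** 4  # bytes in one terabyte (binary convention, an assumption)

def suggest_platform(daily_bytes: int, other_factors: bool = False) -> str:
    """Return a coarse suggestion based on daily data volume.

    The 1 TB/day cutoff is an illustrative assumption. `other_factors`
    stands for the non-volume considerations mentioned in the text, such as
    pooling many low-performance machines or heavy unstructured data.
    """
    if daily_bytes >= TB:
        return "distributed"       # TB or PB per day: Hadoop/Spark may pay off
    if other_factors:
        return "evaluate further"  # small volume, but other reasons may apply
    return "single machine"        # a few hundred MB rarely justifies a cluster

print(suggest_platform(5 * TB))
print(suggest_platform(300 * 1024 ** 2))
```

A real decision would weigh cost, team experience, and growth forecasts as well; the helper only makes the "volume first, then other factors" order explicit.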
 
2. Diverse data types
In the era of data mining, we mainly mined relational data. In the era of big data, applications produce many kinds of data, so big data mining usually involves several data types at once. "Data type" here does not mean the primitive types of a programming language; it refers to application-level data representations, typically time series data, trajectory data, graph data, text data, and so on.
Daily sales records and prices are ordinary data, but once they are ordered along the time dimension they form time series data, which can reveal how prices change and carries richer meaning.
A person's location is just an ordinary (x, y) pair, but connecting locations in order of movement produces that person's activity trajectory, which reflects their life and habits. This hidden information is what big data mining should pay attention to.
Each user of a microblog or forum exists independently and is likewise ordinary data. But if users are connected through relationships such as followers and following, they form a large graph, that is, graph data. Communities and outliers in the graph, along with higher-level attributes such as group preferences and group movement, are the focus of big data mining.
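A minimal sketch of how ordinary records become a time series: sort them by date, then look at the change between consecutive days. The sample records and the helper names `to_time_series` and `price_changes` are hypothetical and exist only for this example.

```python
from datetime import date

# Hypothetical daily price records: (day, unit price), deliberately unordered.
records = [
    (date(2017, 12, 3), 10.5),
    (date(2017, 12, 1), 10.0),
    (date(2017, 12, 2), 11.0),
]

def to_time_series(rows):
    """Order (day, price) records chronologically."""
    return sorted(rows, key=lambda row: row[0])

def price_changes(series):
    """Day-over-day price differences for an already ordered series."""
    return [round(later[1] - earlier[1], 2)
            for earlier, later in zip(series, series[1:])]

series = to_time_series(records)
changes = price_changes(series)
print(changes)  # → [1.0, -0.5]
```

The individual rows say nothing about trends; the differences only exist once the rows are placed in time order.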
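The same idea applies to locations. The sketch below orders hypothetical location fixes by timestamp to form a trajectory and then derives one simple property, the total distance travelled. The data, the planar (x, y) coordinates, and the function names are assumptions for illustration.

```python
import math

# Hypothetical location fixes: (timestamp in seconds, x, y), unordered.
fixes = [(20, 3.0, 4.0), (0, 0.0, 0.0), (50, 3.0, 8.0)]

def build_trajectory(points):
    """Connect (x, y) locations in order of their timestamps."""
    return [(x, y) for _, x, y in sorted(points)]

def path_length(trajectory):
    """Sum of straight-line distances between consecutive points."""
    return sum(math.dist(a, b) for a, b in zip(trajectory, trajectory[1:]))

trajectory = build_trajectory(fixes)
total = path_length(trajectory)
print(trajectory)  # → [(0.0, 0.0), (3.0, 4.0), (3.0, 8.0)]
print(total)       # → 9.0
```

Real trajectory mining would use geographic coordinates and look for stops, routines, and frequently visited places; path length is just the simplest quantity that exists only at the trajectory level.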
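To make the graph view concrete, the sketch below links hypothetical users through follow relationships and then finds connected groups and isolated users. Connected components are used here as a deliberately crude stand-in for groups and outliers; they are not the method the article prescribes, and real community detection relies on more elaborate algorithms. All names and data are invented for the example.

```python
# Hypothetical users and (follower, followed) pairs.
users = ["ann", "bob", "cat", "dan", "eve"]
follows = [("ann", "bob"), ("bob", "cat"), ("dan", "ann")]

def build_graph(nodes, edges):
    """Undirected adjacency sets: a follow in either direction links two users."""
    graph = {node: set() for node in nodes}
    for follower, followed in edges:
        graph[follower].add(followed)
        graph[followed].add(follower)
    return graph

def connected_components(graph):
    """Group users that are reachable from one another (depth-first search)."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(component)
    return components

graph = build_graph(users, follows)
components = connected_components(graph)
isolated = [group for group in components if len(group) == 1]
print(isolated)  # → [{'eve'}]
```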
 
3. Noise in the data
In the era of data mining, data came from relational databases; it was business-related and of high quality, and could be mined as soon as it was retrieved. Big data mining is different. Big data thinking requires us to account for varying data quality across sources and for mixed data structures, and to make data processing robust to both. For example, in enterprise-level customer analysis, different branches may use different customer management systems. Some systems record customers' education as undergraduate/master/doctoral, while others use only undergraduate/graduate. Such differences call for data consistency processing. Data format, data integrity, and similar issues must also be considered in big data mining.
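The degree example can be sketched as a small normalization step. The mapping table, field names, and sample records below are hypothetical. The coarser undergraduate/graduate scheme is chosen as the shared target, on the assumption that the finer labels cannot be recovered from the coarser ones.

```python
# Hypothetical mapping from each branch system's labels to a shared scheme.
CANONICAL = {
    "undergraduate": "undergraduate",
    "master": "graduate",
    "doctoral": "graduate",
    "graduate": "graduate",
}

def normalize_degree(label):
    """Map a raw degree label to the shared scheme; None if missing or unknown."""
    if label is None:
        return None
    return CANONICAL.get(label.strip().lower())

# Records from two hypothetical branch systems with different conventions.
records_a = [{"id": 1, "degree": "Master"}, {"id": 2, "degree": "Doctoral "}]
records_b = [{"id": 3, "degree": "graduate"}, {"id": 4, "degree": None}]

merged = [{**row, "degree": normalize_degree(row["degree"])}
          for row in records_a + records_b]
print([row["degree"] for row in merged])
# → ['graduate', 'graduate', 'graduate', None]
```

Missing and unrecognized values are kept as `None` rather than guessed, so the integrity problem stays visible to later analysis steps.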
 
4. Diverse mining tasks
Traditional data mining generally focused on a single analysis task, whereas big data mining may involve several tasks at once, such as classification, prediction, association analysis, and clustering. Although the business requirements differ, these tasks may share the same underlying model. It is therefore important, when mining big data, to separate models, algorithms, and services into layers; this is the so-called big data processing hierarchy.
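One possible reading of this layering is sketched below. Two different services, a classifier and a cluster assigner, reuse the same search algorithm and the same distance model. The three-layer split, the function names, and the sample vectors are assumptions made for illustration, not an architecture specified by the article.

```python
import math

# Model layer: how customers are represented and compared
# (feature vectors with Euclidean distance, an assumption).
def distance(a, b):
    return math.dist(a, b)

# Algorithm layer: a generic nearest-item search built on the model.
def nearest(query, candidates):
    """Return the key of the candidate vector closest to `query`."""
    return min(candidates, key=lambda key: distance(query, candidates[key]))

# Service layer: two business tasks that share the layers above.
LABELLED = {"low_value": (1.0, 1.0), "high_value": (9.0, 9.0)}  # hypothetical examples
CENTROIDS = {0: (0.0, 5.0), 1: (10.0, 5.0)}                     # hypothetical cluster centres

def classify(customer):
    """Classification: take the label of the nearest labelled example."""
    return nearest(customer, LABELLED)

def assign_cluster(customer):
    """Clustering (assignment step): pick the nearest cluster centre."""
    return nearest(customer, CENTROIDS)

print(classify((8.0, 7.0)))        # → high_value
print(assign_cluster((8.0, 7.0)))  # → 1
```

Swapping the distance function in the model layer changes both services at once, which is the practical benefit of keeping the layers apart.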
Source statement: This article was written or edited by Shangpin China's editors. If you reproduce it, please credit Shangpin China. The above content (including pictures and text) comes from the Internet. If there is any infringement, please contact us promptly (010-60259772).

Disclaimer

Thank you very much for visiting our website. Please read all the terms of this statement carefully before you use this website.

1. Part of the content of this site comes from the network, and the copyright of some articles and pictures involved belongs to the original author. The reprint of this site is for everyone to learn and exchange, and should not be used for any commercial activities.

2. This website does not assume any form of loss or injury caused by users to themselves and others due to the use of these resources.

3. For issues not covered in this statement, please refer to relevant national laws and regulations. In case of conflict between this statement and national laws and regulations, the national laws and regulations shall prevail.

4. If it infringes your legitimate rights and interests, please contact us in time, and we will delete the relevant content at the first time!

Contact: 010-60259772
E-mail: [email protected]
