All rights reserved. Infringing the author's rights by copying or printing exposes the offender to legal liability.

Chapter Three
Preprocessing the Data Before Mining

Data must be of sufficient quality to satisfy the requirements of the intended use.

I. Data Quality

Data quality comprises several factors, which are presented in this section.

1- Accuracy: Whether the data are correct. There are many possible reasons for inaccurate data (i.e., having incorrect attribute values). The data collection instruments used may be faulty. Human or computer errors may have occurred at data entry.

2- Completeness: Whether all the needed data are present. Incomplete data can occur for several reasons. Attributes of interest may not always be available, such as customer information for sales transaction data. Other data may not be included simply because they were not considered important at the time of entry. Relevant data may not be recorded due to a misunderstanding or because of equipment malfunctions.

3- Consistency: Whether the data agree with other recorded data. Data that were inconsistent with other recorded data may have been deleted. Furthermore, the recording of the data history or of modifications may have been overlooked. Missing data, particularly for tuples with missing values for some attributes, may need to be inferred.

4- Timeliness: Whether the data are up to date when they are needed. Ensuring timely processing requires the ability to collect, transfer, process, and present stream data in real time, so the data must be updated promptly.

5- Believability: Reflects how much the data are trusted by users.

6- Interpretability: Reflects how easily the data are understood.
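The first four factors can be checked mechanically, at least in a rough way. The following is a minimal Python sketch, not a method given in this chapter. The sales records, the attribute names (customer, age, price, qty, total, updated), the valid age range, and the 30-day freshness limit are all hypothetical values chosen for illustration.

```python
from datetime import date

# Hypothetical sales records; None marks a missing attribute value.
records = [
    {"id": 1, "customer": "Ali", "age": 34, "price": 20.0, "qty": 2,
     "total": 40.0, "updated": date(2024, 1, 10)},
    {"id": 2, "customer": None, "age": 27, "price": 15.0, "qty": 1,
     "total": 15.0, "updated": date(2024, 1, 12)},
    {"id": 3, "customer": "Sara", "age": -5, "price": 10.0, "qty": 3,
     "total": 30.0, "updated": date(2023, 6, 1)},
    {"id": 4, "customer": "Omar", "age": 41, "price": 8.0, "qty": 5,
     "total": 45.0, "updated": date(2024, 1, 14)},
]


def completeness(rows, attribute):
    """Completeness: fraction of rows that have a value for `attribute`."""
    present = sum(1 for r in rows if r.get(attribute) is not None)
    return present / len(rows)


def inaccurate_ids(rows, attribute, low, high):
    """Accuracy: ids of rows whose value lies outside an assumed valid range."""
    return [r["id"] for r in rows
            if r[attribute] is not None and not (low <= r[attribute] <= high)]


def inconsistent_ids(rows):
    """Consistency: ids of rows where total disagrees with price * qty."""
    return [r["id"] for r in rows
            if abs(r["price"] * r["qty"] - r["total"]) > 1e-9]


def stale_ids(rows, today, max_age_days):
    """Timeliness: ids of rows last updated more than max_age_days ago."""
    return [r["id"] for r in rows
            if (today - r["updated"]).days > max_age_days]


if __name__ == "__main__":
    print("completeness of customer:", completeness(records, "customer"))
    print("inaccurate age:", inaccurate_ids(records, "age", 0, 120))
    print("inconsistent total:", inconsistent_ids(records))
    print("stale records:", stale_ids(records, date(2024, 1, 15), 30))
```

Believability and interpretability depend on how users judge the data, so they are not covered by simple rule checks like these.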