New M.E. Thesis Submitted by CSE Student

PREDICTING HUMAN PROTEIN FUNCTION BASED ON DISTRIBUTED PROCESSING USING DECISION TREE
By Simarjit Kaur, CSE

Abstract
Drug discovery is today’s frontier of science. The pharmaceutical industry is changing, not only in how it is perceived but also in the rapid expansion of biomedical research and drug discovery. The task of discovering safe and effective drugs becomes more promising as our knowledge of disease increases. Before any potential new medicine can be discovered, scientists work to understand the disease to be treated as well as possible and to unravel the underlying cause of the condition. They try to understand how the genes are altered, how that affects the proteins they encode, how those proteins interact with each other in living cells, how the affected cells change the specific tissue they are in, and finally how the disease affects the entire patient. This knowledge is the basis for treating the problem. In the present work, a CART (Classification and Regression Trees) classifier is proposed for classifying a large data set with the help of distributed processing. Decision trees are effective classification algorithms. Each attribute of the data is examined in turn and ranked according to its ability to partition the remaining data. The data are propagated along the branches of the tree until sufficient attributes have been chosen to classify them correctly. Each leaf of the tree represents a subset of the data that lies wholly in one class. Decision trees have a tendency to over-fit the training data. The classifier is executed in a distributed environment. The prediction accuracy comes out to be 80%, which is better than that of the previous techniques. Timing diagrams are also shown with respect to the CPU time taken.
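
The attribute-ranking step described in the abstract can be illustrated with a minimal Python sketch. It assumes Gini impurity as the ranking criterion, which is the usual choice for CART classification; the abstract does not state which criterion the thesis uses. The function names, the toy feature values, and the class labels are hypothetical and are not taken from the thesis data.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_threshold(values, labels):
    """Return (weighted impurity, threshold) of the best binary split on one attribute."""
    best = (gini(labels), None)  # baseline: no split at all
    distinct = sorted(set(values))
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2  # candidate threshold between two neighbouring values
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[0]:
            best = (score, t)
    return best

def rank_attributes(rows, labels):
    """Rank attribute indices by the impurity of their best split, lowest first."""
    scored = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        score, threshold = best_threshold(column, labels)
        scored.append((score, j, threshold))
    return sorted(scored)

# Hypothetical toy data: two numeric attributes per protein, two classes.
rows = [(1.0, 5.0), (2.0, 1.0), (3.0, 4.0), (4.0, 2.0)]
labels = ["enzyme", "enzyme", "receptor", "receptor"]
ranking = rank_attributes(rows, labels)
```

Here `ranking[0]` identifies the attribute whose best split leaves the purest subsets. A full tree would apply the same ranking recursively to each subset until the leaves are pure, which is also why an unpruned tree tends to over-fit.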



