In recent years artificial intelligence has developed rapidly because it can be applied readily to areas such as medical diagnosis, engineering and economics. In this study we devise a soft expert system (SES), based on fuzzy sets and soft sets, that predicts prostate cancer risk from a patient’s prostate specific antigen (PSA), prostate volume (PV) and age, and we use it to calculate the patients’ prostate cancer risk. Our data set was provided by the Department of Urology, Meram Medical Faculty, Necmettin Erbakan University, Konya, Turkey.
Keywords: fuzzy set; soft set; prostate cancer; soft expert system
In recent years vague concepts have been used in areas such as medical applications, pharmacology, economics and engineering, since classical mathematical methods are inadequate for many complex problems in these areas. Traditionally, mathematics uses crisp (well-defined) properties $P(x)$, i.e., properties that are either true or false. Each property defines a set: $\{x : x \text{ has the property } P\}$.
The most successful theoretical approach to vagueness is undoubtedly fuzzy set theory, introduced by Zadeh. The theory is commonly used in engineering, medicine and economics, among other areas. Fuzzy set theory is based on the fuzzy membership function $\mu : U \to [0,1]$, which gives the membership grade of an element with respect to a set; a fuzzy set $F$ is described by its membership function $\mu_F$. Although fuzzy set theory has become very popular, a difficulty remains: how to set the membership function in each particular case. The reason for this difficulty is, possibly, the inadequacy of the parametrization tool of the theory.
Soft set theory was initiated by Molodtsov as a new method for dealing with vagueness. Molodtsov showed in his paper that the theory can be applied successfully to several areas, for example, the smoothness of functions, game theory, Riemann integration and Perron integration. He also showed that soft set theory is free from the parametrization inadequacy of other theories developed for vagueness. A soft set can be represented by a Boolean-valued information system, and so it can be used to represent a data set. Hybrid models of vague sets have also attracted the attention of researchers. Maji et al. defined a hybrid model called fuzzy soft sets. This model combines fuzzy sets and soft sets and generalizes soft sets. Irfan Ali and Shabir developed the theory further. To address decision-making problems based on fuzzy soft sets, Feng et al. introduced the concept of level soft sets of fuzzy soft sets and initiated an adjustable decision-making scheme using fuzzy soft sets. Feng et al. were the first to consider the combination of soft sets, fuzzy sets and rough sets. Using soft sets as the granulation structures, Feng et al. defined soft approximation spaces, soft rough approximations and soft rough sets, which are generalizations of Pawlak’s rough set model based on soft sets. It has been proven that in some cases Feng’s soft rough set model can provide better approximations than classical rough sets. Simsekler (Dizman) and Yuksel contributed to fuzzy soft topological structures.
Prostate cancer is the second most common cause of cancer death among men in most industrialized countries, and it depends on various factors such as family cancer history, age, ethnic background and the level of prostate specific antigen (PSA) in the blood. The PSA level in the blood is a very important indicator for the initial diagnosis of patients [10-12]. However, the PSA level can also be raised by inflammation of the prostate and by benign prostate hyperplasia (BPH). For this reason, it is difficult to differentiate prostate cancer from BPH using the PSA level. A definitive diagnosis of prostate cancer is possible with a prostate biopsy. The results of the PSA test, rectal examination and transrectal findings help the doctor decide whether a biopsy is necessary [1,13,14]. However, patients with low cancer risk should avoid this procedure because of its possible complications and high cost. It is therefore useful to identify the patients with low cancer risk before deciding on a biopsy. There are several research works on the prognosis or diagnosis of prostate cancer. One of them is FES, a rule-based fuzzy expert system that uses the laboratory data PSA, PV and the age of the patient and aims to help an expert doctor determine the necessity of a biopsy and the risk factor. Benecchi developed a neuro-fuzzy system that uses both serum data (total prostate specific antigen, tPSA, and free prostate specific antigen) and clinical data (age of patients) to enhance the performance of tPSA in distinguishing prostate cancer. Keles et al. built a neuro-fuzzy classifier for the diagnosis of prostate cancer and BPH. Since the symptoms of these two illnesses are very close to each other, differentiating between them is an important problem. Saritas et al. devised an artificial neural network that provides a prognostic result indicating whether patients have cancer or not by using their free prostate specific antigen, total prostate specific antigen and age data.
In this study we discuss how soft set theory can be used to develop a knowledge-based system in medicine. We devise a prediction system, named the soft expert system (SES), that uses the PSA, PV and age data of patients, based on fuzzy sets and soft sets, to calculate the patients’ prostate cancer risk. It is a rule-based system: the risk of prostate cancer is determined according to the rules. Our aim is to help the doctor determine whether the patient needs a biopsy or not.
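A fuzzy set $A$ in $U$ is a set of ordered pairs:
$$A = \{(x, \mu_A(x)) : x \in U\},$$
where $\mu_A : U \to [0,1]$ is the membership function of $A$.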
A fuzzy set can be related to a family of crisp sets through the notion of an α-level set. The α-level set of a fuzzy set $F$ is defined by
$$F(\alpha) = \{x \in U : \mu_F(x) \ge \alpha\}, \quad \alpha \in [0,1].$$
Let $A \subseteq E$. A pair $(F, A)$ is called a soft set over $U$, where $F$ is a mapping given by $F : A \to P(U)$, $E$ is the set of parameters and $P(U)$ is the power set of $U$. In other words, the soft set is a parametrized family of subsets of $U$. Every set $F(e)$, $e \in A$, from this family may be considered as the set of $e$-elements of the soft set $(F, A)$, or the set of $e$-approximate elements of the soft set.
Example 2.1 Mr. X and Miss Y are going to marry and they want to rent a wedding room. The soft set $(F, E)$ describes the ‘capacity of the wedding room’. Let $U$ be the set of wedding rooms under consideration, and let $E$ be the parameter set.
The tabular presentation of $(F, E)$ is shown in Table 1.
Table 1. Tabular presentation of the soft set
Every fuzzy set can be considered as a soft set.
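One way to read this statement is through the α-level sets defined above: take the levels α as parameters and map each α to its level set. The following is a minimal sketch of that reading, assuming a finite universe; the function names and the membership grades in `mu_example` are hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, Iterable


def alpha_level_set(mu: Dict[str, float], alpha: float) -> FrozenSet[str]:
    """Return the alpha-level set {x in U : mu(x) >= alpha} of a fuzzy set."""
    return frozenset(x for x, grade in mu.items() if grade >= alpha)


def fuzzy_to_soft(mu: Dict[str, float],
                  alphas: Iterable[float]) -> Dict[float, FrozenSet[str]]:
    """Build a soft set whose parameters are levels and F(alpha) is the level set."""
    return {alpha: alpha_level_set(mu, alpha) for alpha in alphas}


# Hypothetical membership grades for three objects (illustrative only).
mu_example = {"u1": 0.2, "u2": 0.6, "u3": 1.0}
soft_example = fuzzy_to_soft(mu_example, [0.2, 0.6, 1.0])

for alpha, members in soft_example.items():
    print(alpha, sorted(members))
```

The level sets shrink as α grows, so the resulting family of subsets is nested.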
An information system is a 4-tuple $S = (U, A, V, f)$, where $U$ is a non-empty finite set of objects, $A$ is a non-empty finite set of attributes, $V = \bigcup_{a \in A} V_a$, $V_a$ is the domain of attribute $a$, and $f : U \times A \to V$ is a function such that $f(u, a) \in V_a$ for every $(u, a) \in U \times A$, called the information (knowledge) function. An information system can be expressed in terms of an information table (see Table 2). In an information system $S = (U, A, V, f)$, if $V_a = \{0, 1\}$ for every $a \in A$, then $S$ is called a Boolean-valued information system.
Table 2. An information system
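The link between soft sets and Boolean-valued information systems can be sketched in a few lines: an entry is 1 exactly when the object belongs to the subset assigned to the parameter. The object and parameter names below are hypothetical, and the nested-dictionary layout is just one convenient assumption.

```python
def soft_set_to_boolean_table(universe, soft_set):
    """Return {object: {parameter: 0 or 1}}, with 1 iff the object is in F(parameter)."""
    return {
        u: {e: int(u in members) for e, members in soft_set.items()}
        for u in universe
    }


# Hypothetical universe and soft set (illustrative only).
universe = ["u1", "u2", "u3"]
soft_set = {"e1": {"u1", "u2"}, "e2": {"u3"}, "e3": set()}

table = soft_set_to_boolean_table(universe, soft_set)
for u, row in table.items():
    print(u, row)
```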
The reduction of parameters of soft sets has attracted the attention of several researchers. Kong et al. gave an algorithm for the normal parameter reduction of soft sets in 2008. In 2011 Ma et al. gave a new algorithm for the normal parameter reduction of soft sets and compared it with Kong’s method. The two algorithms compute the same reduction, but Kong’s method is more difficult and complex; Ma’s algorithm is easier to understand and avoids the difficulty of Kong’s algorithm.
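To make the notion concrete, here is a brute-force sketch. It is neither Kong’s nor Ma’s algorithm; it only illustrates the idea, as we understand it, that a set of parameters can be dropped when its entries sum to the same value for every object, so that the ranking of objects by row total is unchanged. The table and all names are hypothetical.

```python
from itertools import combinations


def normal_parameter_reduction(table):
    """Exhaustive search over a {object: {parameter: 0/1}} table.

    Looks for a largest proper subset of parameters whose row sums are
    identical for all objects and returns the parameters that remain.
    """
    objects = list(table)
    params = list(next(iter(table.values())))
    # Try removable subsets from largest to smallest, keeping at least one parameter.
    for size in range(len(params) - 1, 0, -1):
        for subset in combinations(params, size):
            sums = {sum(table[u][e] for e in subset) for u in objects}
            if len(sums) == 1:  # same partial sum for every object
                return [e for e in params if e not in subset]
    return params  # nothing can be removed


# Hypothetical Boolean table: e1 and e2 always sum to 1, so they are removable.
example_table = {
    "u1": {"e1": 1, "e2": 0, "e3": 1, "e4": 1},
    "u2": {"e1": 0, "e2": 1, "e3": 1, "e4": 0},
    "u3": {"e1": 1, "e2": 0, "e3": 0, "e4": 0},
}
reduced = normal_parameter_reduction(example_table)
print(reduced)
```

The search is exponential in the number of parameters, so it is only suitable for small illustrative tables.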
3 Soft expert system
The prostate data set was provided by the Department of Urology, Meram Medical Faculty, Necmettin Erbakan University, Konya, Turkey. This real data set contains the PSA, PV and age data of 78 patients (see Table 3). In the design, PSA, age and PV were used as input values and prostate cancer risk was used as the output.
Table 3. The input values of several patients
The steps of the designed system are shown in Figure 1.
Figure 1. Steps for soft expert system.
3.1 First step: fuzzification of the data set
The data set used in this work consists of 78 patients who presented to the urology department of Meram Medical Faculty with prostate complaints. The raw data are not suitable for direct use with soft sets (see Table 3). For this reason, we first fuzzify the data set. For the fuzzification of the factors, the linguistic variables are: for PSA, very low (VL), low (L), middle (M), high (H) and very high (VH); for PV, very small (VS), small (S), middle (M), big (B) and very big (VB); for age, young (Y), middle (M) and old (O). The factors are fuzzified by the membership functions (1), (2) and (3). These formulas were determined from the expert doctor’s knowledge and the literature.
The membership functions of the input variables obtained from formulas (1), (2) and (3) are shown in Figure 2.
Figure 2. The membership functions of PSA, PV and age.
We fuzzified all patient data by using these membership functions. The membership values of some patients are shown in Table 4.
Table 4. The fuzzy membership values of factors
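The fuzzification step can be sketched as follows. The sketch assumes triangular membership functions, which the text does not confirm, and the breakpoints in `AGE_TERMS` are hypothetical placeholders rather than the values of formulas (1), (2) and (3). Only the age labels Y, M and O come from the text.

```python
def triangular(x, a, b, c):
    """Triangular membership with a < b < c: 0 outside (a, c), 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# HYPOTHETICAL breakpoints (a, b, c) for the age terms; not the paper's values.
AGE_TERMS = {"Y": (0, 25, 50), "M": (35, 50, 65), "O": (50, 75, 100)}


def fuzzify(value, terms):
    """Return the membership grade of `value` in every linguistic term."""
    return {label: triangular(value, *abc) for label, abc in terms.items()}


age_grades = fuzzify(55, AGE_TERMS)
print(age_grades)
```

The same `fuzzify` helper could be applied to PSA and PV once their term sets and breakpoints are known.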
3.2 Second step: transforming the fuzzy sets to soft sets
We know that every fuzzy set can be considered as a soft set. First we choose the parameter set by using the membership functions. Hence we have numerical values for a parameter set. Some of the soft sets obtained by the relation with fuzzy sets are as follows:
3.3 Third step: parameter reduction of soft sets
In Step 2 we obtained the soft set corresponding to each fuzzy set. We then apply the parameter reduction of soft sets given by Ma et al. and so obtain new soft sets. Some of them are shown in the following:
3.4 Fourth step: obtaining soft rules
We get the soft rules by applying the ‘AND’ operation to the soft sets obtained in the second step, and we observe which patients satisfy which rule. Some of the rules we obtained are as follows:
In this way, we obtain 400 rules. Then we eliminate some rules that have the same output (the same patient set), and hence we get 285 rules.
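The rule-building step can be sketched as below, using the usual ‘AND’ of soft sets: each combination of one parameter per soft set is mapped to the intersection of the corresponding patient sets. Rules with a repeated patient set are then dropped, keeping the first one, which is an assumption about how the elimination is done. Parameter labels and patient identifiers are hypothetical.

```python
from itertools import product


def soft_and(*soft_sets):
    """AND of soft sets: H(e1, ..., ek) is the intersection of F1(e1), ..., Fk(ek)."""
    rules = {}
    for combo in product(*(s.items() for s in soft_sets)):
        params = tuple(p for p, _ in combo)
        members = frozenset.intersection(*(frozenset(m) for _, m in combo))
        rules[params] = members
    return rules


def drop_duplicate_outputs(rules):
    """Keep only the first rule for each distinct patient set."""
    seen = set()
    unique = {}
    for params, members in rules.items():
        if members not in seen:
            seen.add(members)
            unique[params] = members
    return unique


# Hypothetical soft sets for PSA, PV and age over four patients.
psa = {"PSA_H": {"p1", "p2", "p3"}, "PSA_L": {"p4"}}
pv = {"PV_S": {"p1", "p2"}, "PV_B": {"p3", "p4"}}
age = {"AGE_O": {"p1", "p2", "p3", "p4"}, "AGE_M": {"p1", "p2"}}

all_rules = soft_and(psa, pv, age)
unique_rules = drop_duplicate_outputs(all_rules)
print(len(all_rules), len(unique_rules))
```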
3.5 Fifth step: analysis of soft rules
In this step we analyze the soft rules and calculate the prostate cancer risk percentage. The patient set for each rule was obtained in the fourth step. For each set we count how many of its patients have prostate cancer and take the ratio of that count to the total number of patients in the set. This gives the prostate cancer risk percentage for each rule. If a patient’s data satisfy more than one rule, and so have more than one risk percentage, we accept the highest one.
Now we calculate the risk percentage of the first rule:
There are 23 patients who have the properties stated in Rule 1. Prostate cancer was found in eight of these patients. Hence, the risk percentage for the first rule is $\frac{8}{23} \times 100 \approx 34.8$, i.e., 34% when truncated to a whole number. Patients whose PSA, PV and age values satisfy the first rule therefore have a cancer risk of about 34%. As a second example, consider a patient whose values satisfy Rule 3, Rule 4 and Rule 8. When we look at the risk percentages of these rules, we see that Rule 8 has the highest one. Hence the risk percentage of this patient is 100% (the percentage of Rule 8).
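The calculation above can be sketched as follows: the risk of a rule is the share of its patients with cancer, and the risk of a patient is the highest risk among the rules the patient satisfies. The patient identifiers and the membership of the second rule are hypothetical; only the counts 23 and 8 for Rule 1 come from the text. Returning 0.0 for an empty rule or an unmatched patient is an assumption.

```python
def rule_risk(patients_in_rule, cancer_patients):
    """Percentage of the patients satisfying a rule who have cancer."""
    if not patients_in_rule:
        return 0.0  # assumption: an empty rule carries no risk information
    hits = len(set(patients_in_rule) & set(cancer_patients))
    return 100.0 * hits / len(patients_in_rule)


def patient_risk(patient, rules, cancer_patients):
    """Highest risk among the rules the patient satisfies (0.0 if none)."""
    risks = [
        rule_risk(members, cancer_patients)
        for members in rules.values()
        if patient in members
    ]
    return max(risks, default=0.0)


# Rule 1 as in the text: 23 patients, 8 of them with cancer (IDs are hypothetical).
rule_1 = {f"p{i}" for i in range(1, 24)}
cancer = {f"p{i}" for i in range(1, 9)}
# A second, hypothetical rule whose two patients both have cancer.
rules = {"Rule 1": rule_1, "Rule 8": {"p1", "p2"}}

print(round(rule_risk(rule_1, cancer), 1))
print(patient_risk("p1", rules, cancer))
```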
The risk percentage for some rules is as follows:
Finally, we write the soft expert system which calculates the prostate cancer risk by input variables PSA, PV and age.
3.6 Calculation of prostate cancer risk
We used Microsoft Visual Studio 2008 and the C# programming language to implement all the steps of the soft expert system. Figure 3 shows two results from the calculation system.
Figure 3. Calculator.
In this work we designed an expert system, SES, by using soft sets; it is a pioneering application of soft sets to medical diagnosis. We also used fuzzy membership functions and an algorithm to reduce the parameter set of soft sets. By calculating the percentage of prostate cancer risk in the soft expert system, the expert doctor can reduce unnecessary biopsies in patients undergoing evaluation for prostate cancer. According to our system, a biopsy is necessary if the risk percentage is greater than 50%. Our data set contains 78 patients. These patients have high values of PSA, PV and age and are potential prostate cancer patients. A biopsy was therefore performed on all of them; however, the biopsies showed that only 44 of them had cancer. When we calculated the risk percentage of these 78 patients in the soft expert system, we saw that 51 patients needed a biopsy, while the 27 patients with low cancer risk could have avoided it. Our aim is to help the doctor decide whether the patient needs a biopsy or not.
The authors declare that they have no competing interests.
All authors contributed equally and significantly in writing this paper. All authors read and approved the final manuscript.
Dedicated to Professor Hari M Srivastava.
Zadeh, LA: Fuzzy sets. Inf. Control 8, 338–353 (1965)
Molodtsov, D: Soft set theory-first results. Comput. Math. Appl. 37(4-5), 19–31 (1999)
Feng, F, Jun, YB, Liu, XY, Li, LF: An adjustable approach to fuzzy soft set based decision making. J. Comput. Appl. Math. 234, 10–20 (2010)
Feng, F, Li, C, Davvaz, B, Ali, MI: Soft sets combined with fuzzy sets and rough sets. Soft Comput. 14, 899–911 (2010)
Feng, F, Liu, XY, Leoreanu-Fotea, V, Jun, YB: Soft sets and soft rough sets. Inf. Sci. 181, 1125–1137 (2011)
Catalona, WJ, Partin, AW, Slawin, KM, Brawer, MK, Flanigan, RC, Patel, A: Use of the percentage of free prostate-specific antigen to enhance differentiation of prostate cancer from benign prostatic disease: a prospective multicenter clinical trial. JAMA J. Am. Med. Assoc. 279, 1542–1547 (1998)
Egawa, S, Soh, S, Ohori, M, Uchida, T, Gohji, K, Fujii, A: The ratio of free to total serum prostate specific antigen and its use in differential diagnosis of prostate carcinoma in Japan. Cancer 79, 90–98 (1997)
Van Cangh, PJ, De Nayer, P, De Vischer, L, Sauvage, P, Tombal, B, Lorge, F: Free to total prostate-specific antigen (PSA) ratio is superior to total PSA in differentiating benign prostate hypertrophy from prostate cancer. Prostate 29, 30–34 (1996)
Mettlin, C, Lee, F, Drago, J: The American Cancer Society National Prostate Cancer Detection Project. Findings on the detection of early prostate cancer in 2425 men. Cancer 67, 2949–2958 (1991)
Seker, H, Odetayo, M, Petrovic, D, Naguib, RNG: A fuzzy logic based method for prognostic decision making in breast and prostate cancers. IEEE Trans. Inf. Technol. Biomed. 7, 114–122 (2003)
Saritas, I, Allahverdi, N, Sert, U: A fuzzy expert system design for diagnosis of prostate cancer. International Conference on Computer Systems and Technologies - CompSysTech’2003 (2003)
Saritas, I, Ozkan, IA, Sert, U: Prognosis of prostate cancer by artificial neural networks. Expert Syst. Appl. 37, 6646–6650 (2010)
Maji, PK, Biswas, R, Roy, AR: Soft set theory. Comput. Math. Appl. 45, 555–562 (2003)
Ali, MI, Feng, F, Liu, X, Min, WK, Shabir, M: On some new operations in soft set theory. Comput. Math. Appl. 57, 1547–1553 (2009)
Herawan, T, Deris, MM: A soft set approach for association rules mining. Knowl.-Based Syst. 24, 186–195 (2011)
Kong, Z, Gao, L, Wang, L, Li, S: The normal parameter reduction of soft sets and its algorithm. Comput. Math. Appl. 56(12), 3029–3037 (2008)
Ma, X, Sulaiman, N, Qin, H, Herawan, T, Zain, JM: A new efficient normal parameter reduction algorithm of soft set. Comput. Math. Appl. 62, 588–598 (2011)