Vol.1 No.2


(IJCNS) International Journal of Computer and Network Security, Vol. 1, No. 2, November 2009

The Stone Cipher-192 (SC-192): A Metamorphic Cipher

Magdy Saeb
Computer Engineering Department, Arab Academy for Science, Tech. & Maritime Transport, Alexandria, EGYPT
(On-Leave), Malaysian Institute of Microelectronic Systems (MIMOS), Kuala Lumpur, MALAYSIA
mail@magdysaeb.net

Abstract: The Stone Cipher-192 is a metamorphic cipher that utilizes a variable word size and a variable-size user key. In the preprocessing stage, the user key is extended into a larger table or bit-level S-box using a specially developed one-way function. However, for added security, the user key is first encrypted using the cipher encryption function with agreed-upon initial values. The generated table is used in a special configuration to considerably increase the substitution addressing space. Accordingly, we call this table the S-orb. Four bit-balanced operations are pseudo-randomly selected to generate the sequence of operations constituting the cipher. These operations are XOR, INV, ROR and NOP, for bitwise xor, invert, rotate right and no operation, respectively. The resulting key stream is used to generate the bits required to select these operations. We show that the proposed cipher furnishes the concept of a key-dependent pseudo random sequence of operations that even the cipher designer cannot predict in advance. In this approach, the sub-keys act as program instructions, not merely as a data source. Moreover, the parameters used to generate the different S-orb words are likewise key-dependent. We establish that the proposed self-modifying cipher, based on the aforementioned key-dependencies, provides algorithm metamorphism and adequate security with a simple parallelizable structure. The ideas incorporated in the development of this cipher may pave the way for key-driven encryption rather than merely using the key for sub-key generation.
The cipher is adaptable to both hardware and software implementations. Potential applications include voice and image encryption.

Keywords: metamorphic, polymorphic, cipher, cryptography, filters, hash.

1. Introduction

A metamorphic reaction takes place in a rock when various minerals go from amphibolite facies to some colored schist facies. Some of the minerals, such as quartz, may not take part in this reaction. The process in its essence follows certain rules; however, the end result is a pseudo random distribution of the minerals in the rock or stone. The natural metamorphic process results in thousands or even millions of different shapes of the rock or stone. This process has inspired us to design and implement a new metamorphic cipher that we call "Stone Cipher-192". The internal sub-keys are generated using a combination of the encryption function itself and a specially designed 192-bit one-way function. The idea of this cipher is to use four low-level operations, all of them bit-balanced, to encrypt the plaintext bit stream based on the expanded stream of the user key. The key stream is used to select the operation, thus providing a random yet recoverable sequence of such operations. A bit-balanced operation provides an output that has the same number of ones and zeroes. These operations are XOR, INV, ROR and NOP. Respectively, these are: xoring a key bit with a plaintext bit, inverting a plaintext bit, exchanging one plaintext bit with another one in a given plaintext word using a rotate-right operation, and producing the plaintext bit without any change. In fact, these four operations are the only bit-balanced logic operations.
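As a small illustration of the bit-balance property described above, the following C sketch counts the ones produced by an operation over all four combinations of a plaintext bit and a key bit. An operation is treated as balanced when exactly half of its outputs are one. The function names and the AND counter-example are assumptions introduced for illustration only; ROR is not modelled here because, at the bit level, it simply forwards another plaintext bit.

```c
#include <assert.h>

/* A two-input bit operation: p is the plaintext bit, k is the key bit. */
typedef int (*bitop)(int p, int k);

int op_xor(int p, int k) { return p ^ k; }
int op_inv(int p, int k) { (void)k; return p ^ 1; }
int op_nop(int p, int k) { (void)k; return p; }
int op_and(int p, int k) { return p & k; }   /* counter-example, not used by the cipher */

/* Count the ones in an operation's output over all four (p, k) inputs. */
int ones_over_truth_table(bitop f)
{
    int ones = 0;
    for (int p = 0; p <= 1; p++) {
        for (int k = 0; k <= 1; k++) {
            int out = f(p, k);
            assert(out == 0 || out == 1);   /* outputs must be single bits */
            ones += out;
        }
    }
    return ones;
}

/* Balanced means two ones and two zeroes over the four input combinations. */
int is_bit_balanced(bitop f)
{
    return ones_over_truth_table(f) == 2;
}
```

Under this definition XOR, INV and NOP are balanced, whereas AND yields a single one out of four outputs and therefore biases the output towards zero.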
In the next few sections, we discuss the design rationale, the structure of the cipher, the one-way function employed to generate the sub-keys, the software and hardware implementations of the cipher, a comparison with a polymorphic cipher, and its security against known and some probable cryptanalysis attacks. Finally, we provide a summary of results and our conclusions.

2. Design Rationale

It is well known that all ciphers, including block and stream ciphers, emulate a one-time pad (OTP). However, for provable security, each key bit has to be used only once, for a single encrypted plaintext bit. Obviously, with present-day technology this is not a practical solution. Alternatively, one resorts to computational-complexity security. In this case, the key bits will be used more than once. Unfortunately, this provides the cryptanalyst with the means to launch feasible statistical attacks. To overcome these known attacks, we propose an improvement in the nonlinearity-associated filtering of the plaintext bits. This can be achieved in various ways as shown in [1]; however, the process can be further simplified and become appreciably faster and more secure if we parallelize all operations employed. We will establish that the proposed configuration can be further parallelized to greatly improve its security and throughput. One can imagine the algorithm as a pseudo random sequence of operations that are totally key-dependent. Accordingly, we expect that most known attacks will be very difficult to launch since there are no statistical clues left to the attacker. The algorithm utilized is randomly selected. Even the cipher designer has no clear idea what the sequence of bitwise operations will be. The encryption low-level operations are selected to be bit-balanced.
That is, they do not introduce any bias in the number of zeroes or ones in the output cipher. The result of such an approach is the creation of an immense number of wrong messages that conceal the only correct one. Therefore, the cryptanalyst is left with the sole option of attacking the key itself. However, if the sub-keys are generated based on a cascade of the same encryption function and a one-way hash, then we believe that these attacks will be infeasible to launch. We are producing a novel key-dependent encryption algorithm. In this case, the least costly secret to keep is the key. The proposed system remains flexible and resilient even if it is unintentionally disclosed. This approach does not contradict Kerckhoffs' principle [2] or Shannon's maxim, since the "enemy knows the system". However, it provides a degree of security against statistical attacks [3] that, we believe, cannot be attained with conventional ciphers [4], [5], [6], [7], [8], [9].

3. The Structure of the Cipher

The conceptual block diagram of the proposed cipher is shown in Figure 1. It is constructed of two basic functions: the encryption function and the sub-key generation one-way hash function. The pseudo random number generator is built using the same encryption function and the one-way hash function in cascade. Two large numbers (a, b) are used to iteratively generate the sub-keys. The details of the substitution box, or what we call the S-orb, can be found in [1]. The user key is first encrypted, then the encrypted key is used to generate the sub-keys.

Figure 1. The structure of the cipher

The encryption function, or cipher engine, is built using four low-level operations. These are XOR, INV, ROR and NOP. Table 1 gives the details of each of these operations.

Table 1: The basic cipher engine (encryption function) operations

Mnemonic   Operation       Select operation code (S1 S0)
XOR        Ci = Ki ⊕ Pi    00
INV        Ci = ¬(Pi)      01
ROR        Pi ← Pj         10
NOP        Ci = Pi         11

The basic crypto logic unit (CLU) is shown in Figure 2.
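The select codes of Table 1 can be sketched in C as a switch over the two selection bits. This is only an illustrative sketch: the names clu_op and clu_bit are assumptions, every argument is taken to be a single bit, and pj stands for the bit delivered by the rotation stage, whose selection is outside the sketch.

```c
#include <assert.h>

/* Select codes (S1 S0) as listed in Table 1. */
enum clu_op { OP_XOR = 0, OP_INV = 1, OP_ROR = 2, OP_NOP = 3 };

/* One CLU step on single bits.
 * pi: plaintext bit, ki: key bit, pj: bit supplied by the rotation stage,
 * s1, s0: operation selection bits. Returns the cipher bit Ci. */
int clu_bit(int pi, int ki, int pj, int s1, int s0)
{
    /* Every argument must be 0 or 1. */
    assert(((pi | ki | pj | s1 | s0) & ~1) == 0);

    switch ((s1 << 1) | s0) {
    case OP_XOR: return pi ^ ki;   /* Ci = Ki xor Pi */
    case OP_INV: return pi ^ 1;    /* Ci = not Pi    */
    case OP_ROR: return pj;        /* Ci = Pj        */
    default:     return pi;        /* OP_NOP: Ci = Pi */
    }
}
```

For example, clu_bit(0, 1, 0, 0, 0) selects XOR and returns 1, while clu_bit(1, 0, 0, 1, 0) selects ROR and returns the rotated-in bit 0.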
All operations are at the bit level. The unit is repeated a number of times depending on the required word or block size. The rotation operation, denoted by the circular arrow, is performed using multiplexers as shown in Figure 3. In the software version these multiplexers are replaced by a "case" or "switch" statement. The same CLU is used as the encryptor or the decryptor. This can be easily verified by inspecting the truth table shown in Appendix A. In this table, if we change the output cipher bit to become an input plaintext bit, the new output will be the same as the old plaintext bit. Obviously, this is a feature of the applied functions, namely XOR, INV and NOP. The only exception is the case of ROR, where the decryptor uses ROL.

Figure 2. The basic crypto logic unit (inputs Pi, Ki, S0, S1; output Ci)

Figure 3. The rotation operation (ROTR) implementation using multiplexers

The operation selection bits (S1 S0) can be chosen from any two consecutive sub-key bits, as shown in Figure 4. The same applies to the rotation selection bits (S'1 S'0).

Figure 4. The proposed key format where the location of the selection bits is shown

4. The One-way Hash Function

Cryptographic one-way functions, or message digests, have numerous applications in data security. The recent cryptanalysis attacks on existing hash functions have provided the motivation for improving the structure of such functions. The design of the proposed hash is based on the principles provided by Merkle's work [10], Rivest's MD5 [11], SHA-1 and RIPEMD [12]. However, a large number of modifications and improvements are implemented to enable this hash to resist present and some probable future cryptanalysis attacks.
The procedure, shown in Figure 5, provides a 192-bit long hash [13] that utilizes six variables for the round function.

Figure 5. Operation of MDP-192 one-way function [13]

A 1024-bit block size, with cascaded xor operations and deliberate asymmetry in the design structure, is used to provide higher security with negligible increase in execution time. The design of new hashes should follow, we believe, an evolutionary rather than a revolutionary paradigm. Consequently, changes to the original structure are kept to a minimum to utilize the confidence previously gained with SHA-1 and its predecessors MD4 [14] and MD5. However, the main improvements included in MDP-192 [13] are:

The increased size of the hash; that is, 192 bits compared to 128 and 160 bits for the MD5 and SHA-1 schemes. The security bits have been increased from 64 and 80 to 96 bits.

The message block size is increased to 1024 bits, providing faster execution times.

The message words in the different rounds are not only permuted but computed by xor and addition with the previous message words. This renders it harder for local changes to be confined to a few bits. In other words, individual message bits influence the computations at a large number of places. This, in turn, provides faster avalanche effect and added security.

Moreover, adding two nonlinear functions and one of the variables to compute another variable not only eliminates the possibility of certain attacks but also provides faster data diffusion.

The fifth improvement is based on processing the message blocks employing six variables rather than four or five variables. This contributes to better security and faster avalanche effect.

We have introduced a deliberate asymmetry in the procedure structure to impede potential and some future attacks.

The xor and addition operations do not cause appreciable execution delays for today's processors.
Nevertheless, the number of rotation operations in each branch has been optimized to provide fast avalanche with minimum overall execution delays. To verify the security of this hash function, we discuss the following simple theorem [13]:

Theorem 5.1: Let h be an m-bit to n-bit hash function, where $m \ge n$, and let $k_1$, $k_2$ be input keys to h. Then $h(k_1) = h(k_2)$ with probability equal to:

$$2^{-m} + 2^{-n} - 2^{-m-n}$$

Proof: If $k_1 = k_2$, then $h(k_1) = h(k_2)$. However, if $k_1 \neq k_2$, then $h(k_1) = h(k_2)$ with probability $2^{-n}$. Now, $k_1 = k_2$ with probability $2^{-m}$ and $k_1 \neq k_2$ with probability $1 - 2^{-m}$. Then the probability that $h(k_1) = h(k_2)$ is given by:

$$\Pr\{h(k_1) = h(k_2)\} = 2^{-m} + (1 - 2^{-m}) \cdot 2^{-n}$$

As an example, assume two 192-bit keys $x_1$, $x_2$ and a 192-bit hash (m = n = 192); then

$$\Pr\{h(x_1) = h(x_2)\} = 2 \cdot 2^{-192} - 2^{-384} = 2^{-191}\,(1 - 2^{-193}) \approx 3.186 \times 10^{-58}$$

This is a negligible probability of collision of two different keys.

5. The Pseudo Random Number Generator (PRG)

The combination of the encryption function and the one-way hash function is used to generate the sub-keys. The cipher designer has to select which one should precede the other. Maurer and Massey [15] proved that a cascade of ciphers is at least as strong as its first cipher; therefore, we have decided to start with the encryption function. The one-way hash function is then used recursively to generate the sub-keys based on two large numbers that are derived from the user key. In this case, the encryption function requires some agreed-upon initial vector value (IV) [16], [17], [18] to complete the encryption process. This IV can be regarded as a long-term key or even a group key that can be changed on a regular basis or when a member leaves the group. The combination of the encryption function and the one-way function is used as the required pseudo random number generator (PRG). It is worth pointing out that the design of the cipher intentionally allows the one-way hash to be replaced if it is successfully attacked.

6. The Algorithm

The algorithm can be formally described as shown in the next few lines.

Algorithm: STONEMETAMORPHIC
INPUT: Plain text message P, User Key K, Block Size B
OUTPUT: Cipher Text C
Algorithm body:
Begin
Begin key schedule
1. Read user key;
2. Encrypt user key by calling encrypt function and using the initial agreed-upon values as the random input to this function;
3. Read the values of the large numbers a and b from the encrypted key;
4. Generate a sub-key by calling the hash one-way function and using the constants a, b;
5. Store the generated value of the sub-key;
6. Repeat steps 4 and 5 to generate the required number of sub-keys;
End key schedule;
Begin Encryption
7. Read a block B of the message P into the message cache;
8. Use the next generated 192-bit key to bit-wise encrypt the plain text bits by calling the encrypt function;
9. If message cache is not empty, Goto step 8;
10. Else if message cache is empty:
If message not finished
10.1 Load next block into message cache;
10.2 Goto 8;
Else if message is finished then halt;
End Encryption;
End Algorithm.

Function ENCRYPT
Begin
1. Read next message bit;
2. Read next key bit from sub-key;
3. Read selection bits from sub-key;
4. Read rotation selection bits from sub-key;
5. Use selection & rotation bits to select and perform operation: XOR, INV, ROR, NOP;
6. Perform the encryption operation using plaintext bit and sub-key bit to get a cipher bit;
7. Store the resulting cipher bit;
End;

As seen from the above formal description, the algorithm simply consists of a series of pseudo random calls of the encryption function. However, each call will trigger a different bitwise operation. The simplicity of this algorithm readily lends itself to parallelism.
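The encryption loop and the ROR/ROL asymmetry between encryptor and decryptor can be sketched in C as follows. This is a simplified sketch under stated assumptions, not the cipher's actual data path: the operation is selected once per 8-bit word rather than per bit, the sub-key bytes are reused cyclically, and each word consumes one key byte and one hypothetical control byte whose two low bits act as S1 S0 and whose next two bits act as the rotation amount. The names ror8, rol8, stone_word and stone_buffer are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate an 8-bit word right or left by r positions (r taken modulo 8). */
uint8_t ror8(uint8_t w, unsigned r)
{
    r &= 7u;
    return (uint8_t)((w >> r) | (w << ((8u - r) & 7u)));
}

uint8_t rol8(uint8_t w, unsigned r)
{
    r &= 7u;
    return (uint8_t)((w << r) | (w >> ((8u - r) & 7u)));
}

/* Apply one selected operation to a word.
 * op: select code from Table 1 (0 XOR, 1 INV, 2 ROR, 3 NOP).
 * decrypt != 0 swaps ROR for ROL; the other operations are self-inverse. */
uint8_t stone_word(uint8_t p, uint8_t k, unsigned op, unsigned rot, int decrypt)
{
    switch (op & 3u) {
    case 0:  return (uint8_t)(p ^ k);
    case 1:  return (uint8_t)~p;
    case 2:  return decrypt ? rol8(p, rot) : ror8(p, rot);
    default: return p;
    }
}

/* Process a buffer in place. The byte layout of the sub-key used here
 * (key byte followed by control byte, reused cyclically) is an assumption. */
void stone_buffer(uint8_t *buf, size_t n, const uint8_t *subkey, size_t klen, int decrypt)
{
    assert(klen > 0);
    for (size_t i = 0; i < n; i++) {
        uint8_t k    = subkey[(2 * i) % klen];      /* key bits for XOR */
        uint8_t ctl  = subkey[(2 * i + 1) % klen];  /* hypothetical control byte */
        unsigned op  = ctl & 3u;                    /* plays the role of S1 S0 */
        unsigned rot = (ctl >> 2) & 3u;             /* plays the role of S'1 S'0 */
        buf[i] = stone_word(buf[i], k, op, rot, decrypt);
    }
}
```

Calling stone_buffer twice on the same data with the same sub-key, first with decrypt set to 0 and then to 1, restores the original bytes, because XOR, INV and NOP undo themselves and ROL undoes ROR.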
The parallelism of the algorithm can be achieved using today's superscalar multi-threading capabilities or multiple data paths on specialized hardware such as FPGAs with their contemporary vast gate counts.

7. Software Implementation

The C function [19] that represents the CLU truth table (Appendix A) is given by:

    /* All arguments and the return value are single bits (0 or 1). */
    int encrypt(int plain_text_bit, int key_bit,
                int selection_bit0, int selection_bit1, int rot_bit)
    {
        int a1 = plain_text_bit ^ key_bit;                          /* XOR result */
        int e1 = a1 & !selection_bit0 & !selection_bit1;            /* S1 S0 = 00: XOR */
        int b1 = !plain_text_bit;                                   /* inverted bit */
        int f1 = b1 & selection_bit0 & !selection_bit1;             /* S1 S0 = 01: INV */
        int g1 = rot_bit & !selection_bit0 & selection_bit1;        /* S1 S0 = 10: ROR */
        int h1 = plain_text_bit & selection_bit0 & selection_bit1;  /* S1 S0 = 11: NOP */
        int cipher_bit = e1 | f1 | g1 | h1;
        return cipher_bit;
    }

8. Hardware Implementation

The hardware version of the CLU, previously shown in Figure 2, is FPGA-implemented. We have used Altera Quartus II 6.1 Web Edition [20]. The average delay was found to be 4.33 cycles per byte. Consequently, if we use four CLUs in parallel, this delay will be approximately equal to one cycle per byte (4.33/4 ≈ 1.08). This proposed parallel configuration is shown in Figure 6.

Figure 6. The proposed parallel configuration

A representative listing of the Verilog file used to implement the CLU on the FPGA is given by:

    module metamorph (p1, k1, s0, s1, p2, c1);
      input  p1, k1, s0, s1, p2;
      output c1;
      wire   a1, b1, e1, f1, g1, h1;

      xor (a1, p1, k1);
      and (e1, a1, ~s0, ~s1);
      assign b1 = ~p1;
      and (f1, b1, s0, ~s1);
      and (g1, p2, ~s0, s1);
      and (h1, p1, s0, s1);
      or  (c1, e1, f1, g1, h1);
    endmodule

9. Comparison with Chameleon Polymorphic Cipher

As seen from the given analysis and results, one can summarize the various characteristics of this cipher, when compared to the Chameleon Polymorphic Cipher [1], as follows:

Table 2: A comparison between Stone Metamorphic Cipher and Chameleon Polymorphic Cipher

Cipher Characteristic | Chameleon-192 Polymorphic Cipher | Stone-192 Metamorphic Cipher
User key size | Variable | Variable
Sub-keys | 192-bit K, S(K) | 192-bit K, S(K), S'(K)
Estimated maximum delay per byte | 10 cycles/byte | 6 cycles/byte
Estimated average delay per byte | 9.1 cycles/byte | 4.3 cycles/byte
PRG (Sub-key Generation) | One-way Function | One-way cascaded with the Encryption Function
Structural | Sequential: Sel-1, ROT, Sel-0 | Concurrent: XOR, ROT, INV, NOP
Number of rounds | Variable (key-dependent with minimum equal to 5 rounds) | Variable (key-dependent with minimum equal to 8 rounds)
Algorithm Template | Yes (key changes operation parameters) | No (key selects operations)
Parallelizable | Yes (some sequential operations) | Yes (operations are selected concurrently)
Security | Secure | Improved Security (pseudo random sequence of operations and more secure PRG)

10. Security Analysis

We claim that differential cryptanalysis, linear cryptanalysis, interpolation attacks, partial key guessing attacks, and side-channel attacks barely apply to this metamorphic cipher. The pseudo random selection of operations provides the metamorphic nature of the cipher. This, in turn, hides most statistical traces that can be utilized to launch these attacks. Each key has its own unique "weaknesses" that will affect the new form of the algorithm utilized. Thus, different keys will produce completely different forms (meta-forms) of the cipher. Even the cipher designer cannot predict in advance what these forms are. It can be easily shown that the probability of guessing the correct sequence of operations is of the order of [expression lost in extraction], where w is the word size and N is the number of rounds.
That is, for a word size of, say, 8 bits, the probability of guessing this word only is [value lost in extraction]. For a block size of 64 bits, this probability is [value lost in extraction]. Consequently, statistical analysis is not adequate to link the plain text to the cipher text. With different user keys, we end up with a different "morph" of the cipher; therefore, it is infeasible to launch attacks by varying keys or parts of the key. The only option left to the cryptanalyst is to attack the key itself. To thwart this type of attack, we have used the encryption function as the first stage in a cascade of the encryption function and the one-way function. Regarding the key collision probability, it was shown in Section 4 that this probability is negligible when a 192-bit hash is applied. Moreover, the cryptanalyst has a negligible probability of guessing the correct form of the algorithm utilized.

As was previously discussed, the simple structure of the proposed cipher provides a foundation for efficient software-based and hardware-based implementations. Depending on the word or block size required, it is relatively easy to parallelize the data path, either using multi-threading on a superscalar processor or by cloning this path on the FPGA fabric. Undeniably, using the same encryption process and sub-keys for each block is a disadvantage from a security point of view. Still, this is exactly the same issue as with block ciphers in general. The advantage obtained from such a configuration, as with block ciphers, is the saving of memory and communication bandwidth at the chip and channel levels. The pseudo random selection of operations and the key-dependent number of rotations provide a barrier against pattern leakage and block replay attacks. These attacks are quite frequent in multimedia applications. When images are encrypted with conventional ciphers in ECB mode, a great deal of the structure of the original image is preserved [3]. This contributes to the problem of block replay.
However, the selective operations allow the cipher to encrypt images with no traces of the original image. This is a major advantage of the Stone Metamorphic Cipher bit-level operations when applied to multimedia files.

11. Summary & Conclusions

We have presented a metamorphic cipher that is altogether key-dependent. The four bit-balanced operations are pseudo-randomly selected. Known statistical attacks are barely applicable to the cryptanalysis of this type of cipher. The proposed simple structure, based on the crypto logic unit (CLU), can be easily parallelized using multi-threading superscalar processors or FPGA-based hardware implementations. The presented CLU can be viewed as a nonlinearity-associated filtering of the data and key streams. The PRG, constructed from a cascade of the encryption function and the one-way hash function, provides the required security against known key attacks. On the other hand, it easily allows the replacement of the hash function if it is successfully attacked. The cipher is well-adapted for use in multimedia applications. We trust that this approach will pave the way for key-driven encryption rather than simply using the key for sub-key generation.
Appendix A: The truth table of the CLU

Pj is the plaintext bit selected by the rotation selection bits (S'1 S'0).

Pi  Ki  Pj  S1  S0  OP   Ci
0   0   0   0   0   XOR  0
0   0   0   0   1   INV  1
0   0   0   1   0   ROR  0
0   0   0   1   1   NOP  0
0   0   1   0   0   XOR  0
0   0   1   0   1   INV  1
0   0   1   1   0   ROR  1
0   0   1   1   1   NOP  0
0   1   0   0   0   XOR  1
0   1   0   0   1   INV  1
0   1   0   1   0   ROR  0
0   1   0   1   1   NOP  0
0   1   1   0   0   XOR  1
0   1   1   0   1   INV  1
0   1   1   1   0   ROR  1
0   1   1   1   1   NOP  0
1   0   0   0   0   XOR  1
1   0   0   0   1   INV  0
1   0   0   1   0   ROR  0
1   0   0   1   1   NOP  1
1   0   1   0   0   XOR  1
1   0   1   0   1   INV  0
1   0   1   1   0   ROR  1
1   0   1   1   1   NOP  1
1   1   0   0   0   XOR  0
1   1   0   0   1   INV  0
1   1   0   1   0   ROR  0
1   1   0   1   1   NOP  1
1   1   1   0   0   XOR  0
1   1   1   0   1   INV  0
1   1   1   1   0   ROR  1
1   1   1   1   1   NOP  1

References

[1] Magdy Saeb, "The Chameleon Cipher-192: A Polymorphic Cipher," SECRYPT 2009, International Conference on Security & Cryptography, Milan, Italy, 7-10 July 2009.
[2] Auguste Kerckhoffs, "La cryptographie militaire," Journal des sciences militaires, vol. IX, pp. 5-83, Jan. 1883, pp. 161-191, Feb. 1883.
[3] Swenson, C., Modern Cryptanalysis: Techniques for Advanced Code Breaking, Wiley Pub. Inc., 2008.
[4] Merkle, R.C., "Fast Software Encryption Functions," Advances in Cryptology-CRYPTO '90 Proceedings, pp. 476-501, Springer-Verlag, 1991.
[5] Massey, J.L., "On Probabilistic Encipherment," IEEE Information Theory Workshop, Bellagio, Italy, 1987.
[6] Massey, J.L., "Some Applications of Source Coding in Cryptography," European Transactions on Telecommunications, vol. 5, no. 4, pp. 421-429, 1994.
[7] Rogaway, P., Coppersmith, D., "A Software-oriented Encryption Algorithm," Fast Software Encryption, Cambridge Security Workshop Proceedings, Springer-Verlag, pp. 56-63, 1994.
[8] Bruce Schneier, "Description of a New Variable-Length Key, 64-bit Block Cipher (Blowfish)," Fast Software Encryption, Cambridge Security Workshop Proceedings, Springer-Verlag, pp. 191-204, 1994.
[9] Bruce Schneier, John Kelsey, Doug Whiting, David Wagner, Chris Hall, Niels Ferguson, "Twofish: A 128-bit Block Cipher," First AES Conference, California, US, 1998.
[10] Ralph C. Merkle, Secrecy, Authentication and Public Key Systems, Ph.D. Dissertation, Stanford University, June 1979.
[11] Rivest, R.L., "The MD5 Message Digest Algorithm," RFC 1321, 1992.
[12] Hans Dobbertin, Antoon Bosselaers, Bart Preneel, "RIPEMD-160: A Strengthened Version of RIPEMD," Fast Software Encryption, LNCS 1039, Springer-Verlag, pp. 71-82, 1996.
[13] Magdy Saeb, "Design & Implementation of the Message Digest Procedures MDP-192 and MDP-384," ICCCIS 2009, International Conference on Cryptography, Coding & Information Security, Paris, June 24-26, 2009.
[14] Rivest, R.L., "The MD4 Message Digest Algorithm," RFC 1186, 1990.
[15] Ueli Maurer, James Massey, "Cascade Ciphers: The Importance of Being First," Journal of Cryptology, vol. 6, no. 1, pp. 55-61, 1993.
[16] Discussions by Terry Ritter, et al., accessed 2007. http://www.ciphersbyritter.com/LEARNING.HTM.
[17] Erik Zenner, On Cryptographic Properties of LFSR-based Pseudorandom Generators, Ph.D. Dissertation, University of Mannheim, Germany, 2004.
[18] Erik Zenner, "Why IV Setup for Stream Ciphers is Difficult," Dagstuhl Seminar Proceedings 07021, Symmetric Cryptography, March 14, 2007.
[19] Michael Welschenbach, Cryptography in C and C++, Apress, 2005.
[20] S. Brown, Z. Vranesic, Fundamentals of Digital Logic with Verilog Design, McGraw-Hill International Edition, 2008.

Author Profile

Magdy Saeb received the BSEE from the School of Engineering, Cairo University, in 1974, and the MSEE and Ph.D. in Electrical & Computer Engineering from the University of California, Irvine, in 1981 and 1985, respectively. He was with Kaiser Aerospace and Electronics, Irvine, California, and The Atomic Energy Establishment, Anshas, Egypt.
Currently, he is a professor in the Department of Computer Engineering, Arab Academy for Science, Technology & Maritime Transport, Alexandria, Egypt, on leave to the Malaysian Institute of Microelectronic Systems (MIMOS), Kuala Lumpur, Malaysia. His current research interests include Cryptography, FPGA Implementations of Cryptography and Steganography Data Security Techniques, Encryption Processors, Computer Network Reliability, and Mobile Agent Security. www.magdysaeb.net.

Personal Authentication based on Keystroke Dynamics using Ant Colony Optimization and Back Propagation Neural Network

Marcus Karnan (1) and M. Akila (2)
(1) Department of Computer Science and Engineering, Tamilnadu College of Engineering, Coimbatore, India. karnanme@yahoo.com
(2) Department of Information Technology, Vivekanandha College of Engineering for Women, Tiruchengode, India. akila@nvgroup.in

Abstract: The need to secure sensitive data and computer systems from intruders, while allowing ease of access for authenticated users, is one of the main problems in computer security. Traditionally, passwords have been the usual method for controlling access to computer systems, but this approach has many inherent flaws. Keystroke dynamics is a relatively new method of biometric identification and provides a comparatively inexpensive and low-profile method of hardening the normal login and password process. Here, Ant Colony Optimization is used to reduce the redundant feature values and minimize the search space. This paper reports the results of applying the Ant Colony Optimization technique to keystroke duration, latency and digraph values for feature subset selection. A Back Propagation Neural Network is used for classification and the performance is tested. The optimum feature subset is obtained using keystroke digraph values when compared with the other two feature values.
Keywords: Ant Colony Optimization Algorithm (ACO), Backpropagation Algorithm, False Acceptance Rate, Feature Extraction, Feature Subset Selection, False Rejection Rate.

1. Introduction

Access to computer systems is usually controlled by user accounts with usernames and passwords. Such a scheme offers little security [1][2] if the information falls into the wrong hands. Key cards or biometric systems [3][4][5][6], for example fingerprints [7], are used nowadays to improve security. Biometric methods measure biological and physiological characteristics to uniquely identify individuals. The main drawback of most biometric methods is that they are expensive to implement, because most of them require specialized additional hardware. On the other hand, keystroke dynamics [8] has several advantages: (i) it can be used without any additional hardware; (ii) it is inexpensive; (iii) it hardens security. Keystroke dynamics include several different measurements [9][10][11] such as (i) duration of a keystroke or key hold time, (ii) latency of keystrokes or inter-keystroke times, (iii) typing errors, (iv) force of keystrokes, etc.

Keystroke analysis [12] is of two kinds: static and dynamic. Static keystroke analysis essentially means that the analysis is performed on typing samples produced using the same predetermined text for all the individuals under observation. Dynamic keystroke analysis implies a continuous or periodic monitoring of issued keystrokes and is intended to be performed after the log-in session as well. There are two phases, namely the extraction phase and the verification phase. During the feature extraction phase [4][12][13][14], the user's keystroke features from his or her password are captured, processed and stored in a reference file as prototypes for future use by the system in subsequent authentication operations.
During the verification phase [15][16], the user's keystroke features are captured and processed in order to render an authentication decision based on the outcome of a classification process that compares the newly presented features with the pre-stored prototypes (reference templates) [17][18]. The user needs to type his or her name or password a number of times in order for the system to be able to extract the relevant features that uniquely represent the user. However, typing one's name or password over and over in the feature extraction phase is both tiring and tedious, which could lead users to alter their normal typing pattern. Thus, most systems based on biometrics are required to work with a summarized set of information from which to extract knowledge. In order to reduce this problem, we could eliminate some features of the original dataset, selecting only the best ones in terms of class cohesion.

Feature subset selection [9][19][20] is applied to high-dimensional data prior to classification. Feature subset selection is essentially an optimization problem, which involves searching the space of possible feature subsets to identify one that is optimal or near-optimal with respect to certain performance measures, since the aim is to obtain any subset that minimizes a particular measure (classification error [21], for instance). In order to reduce the complexity and to increase the performance of the classifier, the redundant and irrelevant features are removed from the original feature set. Many feature subset selection approaches [22][23] have been proposed in previous studies. Relevant research results of past decades are detailed in this section.

Yu and Cho [24] propose a GA-SVM based wrapper approach for feature subset selection in which GA is employed to implement a randomized search and SVM, an excellent novelty detector with fast learning speed, is employed as a base learner.
The degree of diversity and quality are guaranteed, and thus they result in improved model performance and stability. Ki-seok Sung and Sungzoon Cho [25] propose a one-step approach similar to that of Genetic Ensemble Feature Selection (GEFS), yet with a more direct diversity term in the fitness function and SVM as base classifier, similar to that of Yu and Cho [24]. In particular, a so-called "uniqueness" term is used in the fitness function, measuring how unique each classifier is from the others in terms of the features used. To adapt SVM, the authors use a Gaussian kernel. GA was used to filter the data and to carry out a selection of characteristics. They report an average FAR of 15.78% with a minimum FAR of 5.3% and a maximum FAR of 20.38% for raw data with noise.

Gabriel et al. [26] designed a hybrid system based on Support Vector Machines (SVM) and Stochastic Optimization Techniques. A standard Genetic Algorithm (GA) and a Particle Swarm Optimization (PSO) variation were used and produced good results for the tasks of feature selection and personal identification, with an FAR of 0.81% and an IPR of 0.76%. Glaucya et al. [4] used a weighted probability measure, selecting the N features of the feature vector with the smallest standard deviations and eliminating the less significant features. They obtained their optimum result using 90% of the features, with 3.83% FRR and 0% FAR.

In Section 2, the feature extraction phase is discussed. Section 3 explains the feature subset selection method. Section 4 discusses the classification techniques and Section 5 presents the conclusion.

2. Feature Extraction

To capture the keystroke features, users are required to type their password 100 times. The 100 samples were collected over a period of one week.
The system captures these features using three measures of time (in milliseconds): the time that a user keeps a key pressed (duration), the time elapsed between releasing one key and pressing the next (latency), and the combination of the two (digraph). The data was collected from 27 participants with different passwords. The mean and standard deviation values were measured. Keystroke recording application software was developed in C# to measure the duration, latency and digraph of each user sample in the raw data file. Upon pressing submit, a raw-data text file is generated. During the creation of the raw data file, the mean (µ) and standard deviation (σ) [27][28][29] of each feature of the pattern set x are calculated over N samples according to the following equations, where x[i] is the value of the feature in the i-th sample:

$\mu = \frac{1}{N}\sum_{i=1}^{N} x[i]$ (1)

$\sigma = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(x[i]-\mu\right)^{2}}$ (2)

For instance, for the password “ANT” the duration timings for user x are [205, 250, 235] ms. Fig 1 shows the measurement of duration, latency and digraph of the keystrokes of the password “ANT” for user x.

3. Feature Subset Selection
Each user types 100 samples during feature extraction. During the verification phase, verifying against all 100 feature values takes considerable time. To reduce this time complexity we use feature subset selection, which extracts an optimized subset from the 100 feature values. It is essentially an optimization problem, which involves searching the space of possible features to identify one that is optimum. Various ways to perform feature subset selection have been studied earlier [4][24][25][26] for various applications. Here, we propose Ant Colony Optimization to select the feature subset.
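The timing features of Section 2 and equations (1) and (2) can be illustrated with a minimal Python sketch. The press and release timestamps below are hypothetical, chosen only so that the durations reproduce the “ANT” example [205, 250, 235] ms; the function name `keystroke_features` is likewise illustrative, and treating the digraph as duration plus latency is an assumption based on the description of the digraph as the combination of the two.

```python
from statistics import mean, stdev

def keystroke_features(events):
    """events: list of (key, press_ms, release_ms) tuples in typing order."""
    # Duration: how long each key is held down.
    durations = [release - press for _, press, release in events]
    # Latency: time from releasing one key to pressing the next.
    latencies = [nxt[1] - cur[2] for cur, nxt in zip(events, events[1:])]
    # Digraph (assumed here): duration of a key plus the latency that follows it.
    digraphs = [d + l for d, l in zip(durations, latencies)]
    return durations, latencies, digraphs

# Hypothetical timestamps for the password "ANT".
events = [("A", 0, 205), ("N", 300, 550), ("T", 640, 875)]
durations, latencies, digraphs = keystroke_features(events)

mu = mean(durations)      # equation (1)
sigma = stdev(durations)  # equation (2), sample form with N - 1
print(durations, latencies, digraphs, mu, round(sigma, 4))
```

`statistics.stdev` uses the N - 1 denominator, which matches equation (2); `statistics.pstdev` would be the population form.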
3.1 Ant Colony Optimization
Ant algorithms [30][31][32] were first proposed by Dorigo and colleagues as a multi-agent approach to difficult combinatorial optimization problems such as the Traveling Salesman Problem (TSP) and the Quadratic Assignment Problem (QAP). There is currently considerable activity in the scientific community to extend and apply ant-based algorithms to many different discrete optimization problems. The ACO heuristic [31][33][34] was inspired by observation of the foraging behavior of real ant colonies, in which ants can often find the shortest path between a food source and their nest. Individual ants transmit information through a volatile chemical substance, known as “pheromone”, which they leave along the path they travel, and in this way the colony finds the best route to food sources. An ant encountering a previously laid trail can detect the density of the pheromone trail. It decides with high probability to follow the shortest path and reinforces that trail with its own pheromone. The larger the amount of pheromone on a particular path, the larger the probability that an ant selects that path, and the denser that path's pheromone trail becomes.

The Ant Colony Optimization algorithm is as follows:
Step 1. Get the feature values a[x] from the duration/latency/digraph of keystrokes.
Step 2. Calculate the fitness function f[x] for every a[x] by the following equation:
$f[x] = \frac{1}{1 + a[x]}$ (3)
Initialize the following:
a. NI = 100 (number of iterations)
b. NA = 20 (number of ants)
c. T0 = 0.001 (initial pheromone value for every a[x])
d. ρ = 0.9 (rate of pheromone evaporation parameter for every a[x])
Step 3. Store the fitness function values in S, where S = {f[x], T0, flag} and the flag column records whether the feature has been selected by an ant or not.
Step 4. Repeat the following NI times:
i. A random feature value g[x] in a[x] is selected for each ant, with the criterion that the particular feature value has not been selected previously.
ii. The pheromone value of the selected feature value is updated by
$T_{new} = (1-\rho)\,T_{old} + \rho\,T_{0}$, for g[x]
where Tnew and Told are the new and old pheromone values of the feature value.
iii. Obtain Lmin = min(g[x]), where Lmin is the local minimum.
iv. If Lmin <= Gmin then assign Gmin = Lmin; otherwise leave Gmin unchanged, where Gmin is the global minimum.
v. Select the best feature g[y], whose solution is equal to the global minimum value at the end of the last iteration.
vi. The pheromone value of the selected g[y] is globally updated by
$T_{new} = (1-\alpha)\,T_{old} + \alpha\,\Delta\,T_{old}$, for g[y]
where α is the rate of pheromone evaporation parameter and Δ = 1/Gmin. The pheromone of the remaining ants is updated as
$T_{new} = (1-\alpha)\,T_{old}$
vii. Finally, the Gmin value is stored as the optimum value.

Figure 1. Measurement of duration, latency and digraph

At last, the ant colony collectively marks the shortest path, which has the largest pheromone amount. This simple indirect communication among ants embodies a kind of collective learning mechanism, which is used in our experiment.

4. Classification using Back Propagation Neural Network
Neural networks are simplified computational models of the biological nervous system that perform computation in a manner similar to the human brain. A neural network [35] has a parallel distributed architecture with a large number of nodes and connections. Each connection points from one node to another and is associated with a weight. The backpropagation neural network is a network of simple processing elements working together to produce a complex output. The back propagation paradigm [36] has been tested in various applications such as bond rating, mortgage application evaluation, protein structure determination, signal processing and handwritten digit recognition [37][38][39][40].
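The ACO steps listed in Section 3.1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `aco_select` and the fixed seed are hypothetical, the local minimum is taken over the fitness values (as in the worked example of Section 5.1), the global update is applied once per iteration, and the global step uses the factor 1/Gmin as in that worked example.

```python
import random

def aco_select(a, n_iter=100, n_ants=20, t0=0.001, rho=0.9, seed=0):
    """Return (index, fitness, pheromone list) of the feature value ACO settles on."""
    rng = random.Random(seed)                      # fixed seed: assumption, for repeatability
    fitness = [1.0 / (1.0 + v) for v in a]         # equation (3)
    pheromone = [t0] * len(a)
    g_min, g_idx = float("inf"), None
    for _ in range(n_iter):
        # Step 4.i: each ant picks a feature value not yet picked in this iteration.
        picked = rng.sample(range(len(a)), min(n_ants, len(a)))
        # Step 4.ii: local pheromone update for every picked value.
        for i in picked:
            pheromone[i] = (1 - rho) * pheromone[i] + rho * t0
        # Steps 4.iii and 4.iv: local minimum, then global minimum.
        l_idx = min(picked, key=lambda i: fitness[i])
        if fitness[l_idx] <= g_min:
            g_min, g_idx = fitness[l_idx], l_idx
        # Step 4.vi: reinforce the global best, evaporate the rest.
        for i in picked:
            if i == g_idx:
                pheromone[i] = (1 - rho) * pheromone[i] + rho * (1.0 / g_min) * pheromone[i]
            else:
                pheromone[i] = (1 - rho) * pheromone[i]
    return g_idx, g_min, pheromone

# Mean duration values x(µ) taken from Table 1.
values = [1.5700, 1.3987, 1.3566, 1.3493, 1.3476, 1.3220]
best_idx, best_fit, tau = aco_select(values)
print(best_idx, round(best_fit, 4))
```

Because equation (3) decreases as a[x] grows, the smallest fitness corresponds to the largest timing value, so with these inputs the sketch settles on the first entry.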
It can learn difficult patterns such as those found in typing style, and can recognize these patterns even if they are variations of the ones it initially learned. The backpropagation neural network uses a training set composed of input vectors and a desired output (here the desired output is usually a vector instead of a single value). The processing elements, or nodes, are arranged into layers: input, hidden, and output. The output from a backpropagation neural network is computed using a procedure known as the forward pass [41]:
1) The input layer propagates each component of a particular input vector to each node in the hidden layer.
2) The hidden layer computes output values, which become inputs to the output layer.
3) The output layer computes the network output for the particular input vector.
The forward pass produces an output vector for a given input vector based on the current state of the network weights. Since the network weights are initialized to random values, it is unlikely that reasonable outputs will result before training. The weights are adjusted to reduce the error by propagating the output error backwards through the network. This process, from which the backpropagation neural network gets its name, is known as the backward pass:
1) Compute error values for the output layer. This can be done because the desired output is known.
2) Compute the error for the hidden layer nodes. This is done by attributing a portion of the error at each output layer node to the hidden layer nodes which feed that output node. The amount of error due to each hidden layer node depends on the size of the weight assigned to the connection between the two nodes.
3) Adjust the weight values to improve network performance.
4) Compute the overall error to test network performance.
The training set is repeatedly presented to the network and the weight values are adjusted until the overall error is below a predetermined tolerance.
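The forward and backward passes described above can be sketched in Python for a network with one hidden layer and a single output node. This is the standard textbook form of backpropagation with sigmoid units, offered as an illustration rather than as the authors' code; the function names are hypothetical, and the inputs, starting weights, target and learning rate are borrowed from the worked example in Section 5 purely as sample values.

```python
import math

def sigmoid(x, lam=1.0):
    return 1.0 / (1.0 + math.exp(-lam * x))

def forward(inputs, w_ih, w_ho):
    # Hidden layer: one weight row per hidden node.
    hidden = [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in w_ih]
    # Output layer: a single node fed by every hidden node.
    output = sigmoid(sum(w * h for w, h in zip(w_ho, hidden)))
    return hidden, output

def train_step(inputs, target, w_ih, w_ho, eta=0.6):
    hidden, output = forward(inputs, w_ih, w_ho)
    # Backward pass 1: error term at the output node.
    delta_o = (target - output) * output * (1.0 - output)
    # Backward pass 2: share of that error attributed to each hidden node,
    # proportional to the connecting weight (computed before weights change).
    delta_h = [w * delta_o * h * (1.0 - h) for w, h in zip(w_ho, hidden)]
    # Backward pass 3: adjust the weights in place.
    for j, h in enumerate(hidden):
        w_ho[j] += eta * delta_o * h
    for j, row in enumerate(w_ih):
        for i, x in enumerate(inputs):
            row[i] += eta * delta_h[j] * x
    # Backward pass 4: squared error before this update.
    return (target - output) ** 2

inputs = (0.488341, 0.969959)
w_ih = [[-0.7, 0.4], [-0.7, 0.6]]
w_ho = [0.6, -0.5]
errors = [train_step(inputs, 0.1, w_ih, w_ho) for _ in range(1000)]
print(errors[0], errors[-1])
```

Presenting the same vector repeatedly, as here, is the simplest case; with a full training set the loop would cycle over all timing vectors in each epoch.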
The Back Propagation algorithm [42][43] can be implemented in two different modes: online mode and batch mode. In the online mode the error function is calculated after the presentation of each input timing vector, and the error signal is propagated back through the network, modifying the weights before the presentation of the next timing vector. This error function is usually the Mean Square Error (MSE) of the difference between the desired and the actual responses of the network over all the output units. The new weights then remain fixed while a new timing vector is presented to the network, and this process continues until all the timing vectors have been presented. The presentation of all the timing vectors is usually called one epoch or a single iteration. In practice many epochs are needed before the error becomes acceptably small. In the batch mode the error signal is calculated for each input timing vector, but the weights are not modified as each vector is presented. Instead, the error function is calculated as the sum of the individual MSE values for each timing vector, and the weights are modified accordingly (all in a single step for all the timing vectors) before the next iteration. In the forward pass, outputs are computed, and in the backward pass weights are updated or corrected based on the errors. The development of the Back Propagation algorithm is a landmark in neural networks in that it provides a computationally efficient method for the training of the multi-layer perceptron.

The general procedure of the back propagation algorithm is as follows. Initially, the inputs and outputs of the feature subset selection algorithms are normalized with respect to their maximum values.
Step 1: The values produced by the ACO feature subset selection algorithm are taken as input.
Step 2: These feature values are normalized between 0 and 1 and assigned to the input neurons.
Step 3: Wih and Who represent the weights on the input-to-hidden and hidden-to-output connections, respectively. Initial weights are assigned randomly between -0.5 and 0.5.
Step 4: The input to each hidden neuron is obtained by multiplying the input values (Ii) by the weights wih.
Step 5: The output from each hidden neuron (Oh) is calculated using the sigmoid function
$S_1 = \frac{1}{1 + e^{-\lambda x}}$ (4)
where λ = 1 and $x = \sum_i w_{ih} I_i$; wih is the weight assigned between the input and hidden layers and Ii is the input value to the input neurons.
Step 6: The input to the output layer (Io) is obtained by multiplying the hidden outputs Oh by the weights who.
Step 7: The output from the output layer (Oo) is calculated using the sigmoid function
$S_2 = \frac{1}{1 + e^{-\lambda x}}$ (5)
where λ = 1 and $x = \sum_h w_{ho} O_h$; who is the weight assigned between the hidden and output layers and Oh is the output value from the hidden neurons.
Step 8: The error (e) is found by subtracting S2 from the desired output. Using the error (e) value, the weight change is calculated as
$\delta = e \, S_2 \, (1 - S_2)$ (6)
Step 9: Weights are updated using the delta value:
$W_{ho} = W_{ho} + \eta\,\delta\,S_1, \qquad W_{ih} = W_{ih} + \eta\,\delta\,I_i$ (7)
where η is the learning rate and Ii is the input value.
Step 10: Repeat steps (5) to (9) with the updated weights until the calculated output matches the desired output, checking the error (e) value and updating the weights at each pass. The iteration stops when the difference between the calculated output and the desired output is less than the threshold value.

5. Results and Discussion
The mean and standard deviation of duration, latency and digraph are measured for each sample. The Ant Colony algorithm is used to select the optimum features for each participant, and the selected features are used for classification.

5.1 Results of ACO
From 100 samples, the fifty best-fitting values were selected to reproduce the best new fit population. Partial experimental results of ACO are shown in Table 1.
For instance, the mean and standard deviation of the timing of the password “COMPUTER" are first computed in the feature extraction phase. The feature subset is then computed by ACO in the feature subset selection phase as follows:
Step 1: Calculation of the fitness value for duration:
Mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} x(i) = 1.349375 = x$
Fitness value: $f = \frac{1}{1 + x} = \frac{1}{1 + 1.349375} = 0.425645$
Step 2: Calculation of the local minimum for duration: Initially the fitness value of the first feature value, f[1], is assigned directly as the local minimum (Lmin). The next fitness value, f[2], is then compared with it, and the smaller of the two becomes the new local minimum.
For the mean, let f[1] = 0.425645 and assign Lmin = f[1] = 0.425645. Let the next value be f[2] = 0.416898. Check whether f[1] is less than or equal to f[2]. If the condition is true, Lmin = f[1]; otherwise, Lmin = f[2]. In this sample the condition (0.425645 <= 0.416898) is false, so Lmin = 0.416898.
Step 3: Calculation of the local pheromone update for duration:
$T_{new} = (1-\rho)\,T_{old} + \rho\,T_{0}$
where Tnew is the new pheromone rate, Told is the old pheromone rate and T0 is the initial pheromone value. Initially, Told = 0.001 and T0 = 0.001. For the mean, the first local pheromone update is
Tnew = (1 - 0.9) x 0.001 + 0.9 x 0.001 = 0.0001 + 0.0009 = 0.001
Note: Told then takes the value of the previous Tnew, i.e. Told = 0.001.
Step 4: Calculation of the global minimum for duration: The global minimum (Gmin) is initially assigned the value of Lmin (i.e. Gmin = Lmin). Thereafter the value in Gmin is compared with each new Lmin to find the smaller of the two. For the mean, Lmin = 0.416898; initially Gmin = Lmin, therefore Gmin = 0.416898. For the next feature value, Gmin is retained if the condition (Gmin <= Lmin) is satisfied.
So, Gmin = 0.416898.
Step 5: Calculation of the global pheromone update for duration: The pheromone value of the selected Gmin is updated as follows:
$T_{new} = (1-\rho)\,T_{old} + \rho\,\Delta\,T_{old}$
where ρ is the rate of pheromone evaporation parameter and Δ = 1/Gmin. For the mean, the global pheromone is updated as
Tnew = (1 - 0.9) x 0.001 + 0.9 x (1/0.416898) x 0.001 = 0.0001 + (0.9 x 2.39866 x 0.001) = 0.002259

Table 1: Feature subset selection using ACO
                 Mean (Gmin)                                  Standard Deviation (Gmin)
Duration         Latency          Digraph          Duration         Latency          Digraph
x(µ)    F(µ)     x(µ)    F(µ)     x(µ)    F(µ)     x(σ)    F(σ)     x(σ)    F(σ)     x(σ)    F(σ)
1.5700  0.3891   1.6187  0.3819   2.9250  0.2548   0.3482  0.7417   0.3832  0.7229   0.7238  0.5801
1.3987  0.4169   1.5970  0.3850   2.7730  0.2650   0.3445  0.7437   0.3603  0.7351   0.7094  0.5850
1.3566  0.4243   1.5832  0.3871   2.5623  0.2807   0.3363  0.7483   0.3227  0.7560   0.6754  0.5968
1.3493  0.4257   1.3828  0.4197   2.5319  0.2831   0.3297  0.7520   0.2950  0.7722   0.6544  0.6044
1.3476  0.4260   1.2736  0.4398   2.5260  0.2836   0.3482  0.7417   0.2927  0.7736   0.5683  0.6376
1.3220  0.4307   1.2662  0.4412   2.5026  0.2855   0.3162  0.7598   0.2846  0.7785   0.5909  0.6286

Step 6: Pheromone update of the remaining ants for duration:
$T_{new} = (1-\rho)\,T_{old}$
For the mean, the pheromone of the remaining ants is updated as
Tnew = (1 - 0.9) x 0.001 = 0.1 x 0.001 = 0.0001
The values for latency and digraph are calculated in the same way.

5.2 Results of Back Propagation Neural Network (BPNN)
The Back Propagation Neural Network is well suited as a pattern classifier because it can solve nonlinear problems, can classify patterns and generalizes well. Let Ii be the input of the input layer, Oi the output of the input layer, Ih the input of the hidden layer, Oh the output of the hidden layer, Io the input of the output layer and Oo the output of the output layer. After applying BPNN learning, the following calculations are done. The partial results are displayed in Table 2.
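The ACO arithmetic worked through in Section 5.1 (fitness value, local update, global update and evaporation for the duration mean) can be checked with a short Python sketch. The function names are illustrative only, and the global update multiplies by 1/Gmin as the numeric example does.

```python
def fitness(x):
    # Equation (3): f = 1 / (1 + x)
    return 1.0 / (1.0 + x)

def local_update(t_old, t0=0.001, rho=0.9):
    return (1 - rho) * t_old + rho * t0

def global_update(t_old, g_min, rho=0.9):
    # Reinforcement of the best value, scaled by 1 / Gmin.
    return (1 - rho) * t_old + rho * (1.0 / g_min) * t_old

def evaporate(t_old, rho=0.9):
    # Update applied to the remaining ants.
    return (1 - rho) * t_old

f1 = fitness(1.349375)       # Step 1
l_min = min(f1, 0.416898)    # Step 2: compare f[1] with f[2]
g_min = l_min                # Step 4: first global minimum
t_local = local_update(0.001)           # Step 3
t_global = global_update(0.001, g_min)  # Step 5
t_rest = evaporate(0.001)               # Step 6
print(round(f1, 6), l_min, round(t_local, 6), round(t_global, 6), round(t_rest, 6))
```

The same `fitness` function reproduces the F columns of Table 1 from the x columns, for example `fitness(1.5700)` for the first duration entry.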
Table 2 displays the initial input, the random weights between the input and hidden layers, the output of the hidden layer using the sigmoid function, the random weights between the hidden and output layers, and the output of the output layer using the sigmoid function. This value is compared with the target output 0.1 and the error value is displayed. The adjusted weights between the input and hidden layers and between the hidden and output layers are also displayed. After completing the 30th iteration using the duration, latency and digraph, the threshold value is obtained from the maximum and minimum outputs within the 30 iterations.

Computation of error values in Forward Pass
Step 1: Input of input layer: The mean and standard deviation fitness values f(i) are both the input and the output of the input layer. For instance, let the input f(i) = (0.488341, 0.969959).
Step 2: Weights between input and hidden layers: Assign weights randomly between the input and hidden layers, say Wih = (-0.7, 0.4, -0.7, 0.6), assuming two weights for each input node. Multiply each output of the input layer by its randomly assigned weight.
Step 3: Input of hidden layer:
Ih = Oi x Wih
Ih1 = 0.488341 x (-0.7) = -0.3418387
Step 4: Output of hidden layer: Compute the sigmoid function $S_1 = \frac{1}{1 + e^{-\lambda x}}$, where λ = 1 and $x = \sum_i w_{ih} O_i$:
$S_1 = \frac{1}{1 + e^{-(-0.3418387 + (-0.1953364))}} = 0.3688$
Step 5: Weights between hidden and output layers: Assign the weights randomly between the hidden and output layers as Who = (0.6, -0.5).
Step 6: Input of output layer: Multiply the weight between the hidden and output layers (Who) by the output of the hidden layer:
Io = S1 x Who = 0.3688 x 0.6 = 0.22128
Step 7: Output of output layer: The sigmoid function of the output layer is calculated as $O_o = \frac{1}{1 + e^{-\lambda x}}$, where λ = 1 and $x = \sum_h w_{ho} O_h$:
$O_o = \frac{1}{1 + e^{-(0.350259 + (-0.32626))}} = 0.5060$
Step 8: Error signal: Compute the error signal using Error = (To - Oo)^2, where To is the target output, assigned 0.1, and Oo is the output of the output layer, here 0.5060.
Error = (To - Oo)^2 = (0.1 - 0.5060)^2 = 0.164836

Computation of updated weights in Backward Pass
Weights are adjusted to achieve the target output and reduce the error value.
D = (To - Oo1)(Oo1)(1 - Oo1) = (0.1 - 0.5060)(0.5060)(1 - 0.5060)
D = -0.10148
Step 9: Output to hidden weights:
Y = S1 x D
Y = [-0.3418387 0.1953364] x (-0.10148)
Y = [0.03468 -0.0198]
[w]1 = [w]0 + η[y] (assume η = 0.6)
[w]0 = 0
[w]1 = 0.6 x [0.03468 -0.0198]
[w]1 = [0.436924 0.918542]
Step 10: Hidden to input weights: The adjusted weight between the hidden and input layers is
[e] = (w)(D) = (-0.7)(-0.10148) = 0.071036
Similarly the remaining four weights are multiplied by the error difference value (D).
[D*] = [e][OH][1 - OH] = [0.071036][0.3688][1 - 0.3688] = 0.01653
[x] = [S1][D*] = [-0.3418387 0.1953364] x 0.01653
[v]1 = α[v]0 + η[x]
[v]1 = [-0.704411 0.392331 0.591240 0.584767]

Table 2: Intermediate result of BPNN
Columns: Input, Ii, Wih, Ih, Output of hidden (Hidden), Who, Io, Sigmoid (Output) Oo, Target, Difference (Error rate), Adjusted Weight (Woh), Adjusted Weight (Whi)
Duration: -0.7 -0.2723 -0.703482 Mean 0.3891 0.4 0.15564 0.46877 0.6 0.281262 0.56118 0.402675 0.6 0.44502 0.593362 SD 0.7417 0.6 0.44502 0.70889 -0.5 -0.354445 0.50073 0.1 0.160586 -0.5388 0.605099
Latency: -0.7 -0.29183 0.703727 Mean 0.4169 0.4 0.16676 0.468773 0.6 0.2812638 0.561112 0.402848 0.6 0.44622 0.593351 SD 0.7437 0.6 0.44622 0.709393 -0.5 -0.3546965 0.499701 0.1 0.159761 -0.538888 0.605081
Digraph: -0.7 -0.2970 -0.703791 Mean 0.4243 0.4 0.16972 0.468222 0.6 0.2809332 0.561059 0.402892 0.6 0.44898 0.593313 SD 0.7483 0.6 0.44898 0.710530 -0.5 -0.355265 0.499448 0.1 0.159558 -0.538941 0.605101

Similarly, twenty-five weights are calculated and the old weights of the input-to-hidden layer are replaced. After training on the user typing pattern, the threshold value for each trained user is fixed. The users are then asked to verify by giving the user name and password.
After the verification of the user name and password, the typing pattern is verified by comparing the desired output with the fixed threshold value. If the error value is less than 0.001 the user is considered a valid user, otherwise an invalid user. The success of this approach in identifying computer users can be defined mainly in terms of the False Rejection Rate (FRR) and the False Acceptance Rate (FAR). The False Rejection Rate of a verification system gives an indication of how often an authorized individual will not be properly recognized. The False Acceptance Rate of a verification system gives an indication of how often an unauthorized individual will be mistakenly recognized and accepted by the system. False Rejection Rate is generally more indicative of the level of a mechanism. FAR and FRR are calculated using the following equations:
FAR = FA / (N x number of users)
where FAR is the False Acceptance Rate, FA is the number of incidences of false acceptance and N is the total number of samples.
FRR = FR / (N x number of users)
where FRR is the False Rejection Rate, FR is the number of incidences of false rejection and N is the total number of samples.
These results suggest that digraph may in general provide a better characterization of typing skills than latency and duration.

5.3 Receiver Operating Characteristics (ROC)
ROC analysis provides tools to select possibly optimal models and to discard suboptimal ones independently from (and prior to specifying) the cost context or the class distribution. ROC analysis is related in a direct and natural way to cost/benefit analysis of diagnostic decision making. Fig 2 shows the ROC curves comparing the classification performance of the mean and standard deviation (duration, latency and digraph). The error rate is reduced when the sample size is increased.

Figure 2. Classification Error Rate using Mean

6. Conclusion
To conclude, we have shown that keystroke dynamics are rich with individual mannerisms and traits, and that they can be used to extract features that identify a computer user. We have demonstrated that using duration, latency and digraph timings as classification features is a very successful approach. Features are extracted from 27 users with 100 samples each. Using the samples, the mean and standard deviation are calculated for duration, latency and digraph. Subsets of features are selected using the Ant Colony Optimization (ACO) algorithm. ACO using the digraph mean provides the best performance compared with duration and latency. The features are classified and tested using the Backpropagation algorithm. Finally, it was found that using the digraph values with the back-propagation neural network algorithm gives excellent verification accuracy. The classification error is reduced when the number of samples is increased. A classification error of 0.059% and an accuracy of 92.8% are reported.

References
[1] Hu. J, Gingrich. D and Sentosa. A, “A k-Nearest neighbor approach for user authentication through biometric keystroke dynamics”, In Proceedings of the IEEE International Conference on Communications, pp. 1556 – 1560, 2008.
[2] Pavaday N and Soyjaudah. K.M.S, “Investigating performance of neural networks in authentication using keystroke dynamics”, In Proceedings of the IEEE AFRICON Conference, pp. 1 – 8, 2007.
[3] Adrian Kapczynski, Pawel Kasprowki and Piotr Kuzniacki, “Modern access control based on eye movement analysis and keystroke dynamics”, In Proceedings of the International Multiconference on Computer Science and Information Technology, pp. 477-483, 2006.
[4] Gláucya C. Boechat, Jeneffer C. Ferreira and Edson C. B. Carvalho Filho, “Authentication Personal”, In Proceedings of the International Conference on Intelligent and Advanced Systems pp. 254-256, 2007.
[5] Anil Jain, Ling Hong, and Sharath Pankanti, “Biometrics Identification”, Signal Processing, Communications of the ACM, Vol. 83, Issue 12, pp. 2539-2557, 2003. [6] Duane Blackburn, Chris Miles, Brad wing, Kim Shepard, ”Biometrics Overview”, National Science and Technology Council Sub-Committee on Biometrics, 2007. [7] Lin Hong and Anil Jain, “Integrating Faces and Fingerprints for Personal Identification”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 20, No. 12, pp.1295 – 1307, 1998. [8] Fabian Monrose, Aviel D. Rubin, “Keystroke dynamics as a biometric for authentication”, Future Generation Computer Systems, Vol. 16, Issue 4, pp. 351-359, 2000. [9] Gabriel. L. F. B. G. Azevedo, George D. C. Cavalcanti and E. C. B. Carvalho Filho, “Hybrid Solution for the Feature Selection in Personal Identification Problems through Keystroke Dynamics”, In Proceedings of the International Joint Conference on. Neural Networks, pp.1947-1952, 2007. [10] Pin Shen Teh, Andrew Beng Jin Teoh, Thian Song Ong, Han Foon Neo, “Statistical Fusion Approach on Keystroke Dynamics”, In Proceedings of the Third International IEEE Conference on Signal-Image Technologies and Internet-Based System, Shanghai, pp. 918-923, 2007. [11] Shepherd S J, “Continuous Authentication by Analysis of keystroke typing characteristics”, European Convention on Security and Detection’, pp.111-114, 1995. [12] Christopher S. Leberknight, George R. Widmeyer, and Michael L. Recce, ”An Investigation into the efficacy of Keystroke Analysis for Perimeter Defense and Facility Access”, In Proceedings of the IEEE Conference on Technologies for Homeland Security, pp. 345-350, 2008. [13] Gaines .R, Lisowski .W, Press .S, and Shpiro.N, “Authentication by keystroke timing: Some preliminary results”, Rand Report R-256-NSF. Rand Corporation, 1980. [14] Young .J.R and Hammon .R.W, “Method and apparatus for verifying an individual’s identity”, US Patent 6862610, U.S. Patent and Trade-mark Office, 1989. 
[15] Bleha .S.A and Obaidat .M.S, “Dimensionality reduction and feature-Extraction Applications in Identifying Computer users”, IEEE Transactions on Systems Man and Cybernetics, Vol. 21, No. 2, pp. 452-456, 1991. [16] Daw-Tung Lin, “Computer-Access Authentication with Neural Network Based Keystroke Identity Verification”, In Proceedings of the International Conference on Neural Networks, Vol. 1, Issue 9-12, pp.174 – 178, 1997. [17] Sylvain Hocquet, Jean-Y Ves Ramel and Hubert Cardot, “Fusion of methods for keystroke Dynamics Authentication”, Fourth IEEE Workshop on Automatic Identification Advanced Technologies, pp. 224 – 229, 2005. [18] Enzhe Yu and Sungzoon Cho, “Keystroke dynamics identity verification–its problems and practical solutions”, Computers & Security, Vol. 23, pp. 428– 440, 2004. [19] Yang .J and Honavar .V, “Feature subset selection using a Genetic algorithm”, IEEE Intelligent Systems and their Applications, Vol 13, Issue 2, pp.44-49, 1998. [20] John G. H., Kohavi .R and Pfleger .K, “Irrelevant features and the subset selection problem”, In Proceedings of the Eleventh International Conference on Machine Learning, pp. 121-129, 1994. [21] Shiv Subramaniam .K.N, S. Raj Bharath and S. Ravinder, “Improved Authentication Mechanism using Keystroke Analysis”, In Proceedings of the International Conference on Information and Communication Technology, Vol. 7-.9, pp. 258-261, 2007. [22] Surendra K. Singhi and Huan Liu, “Feature Subset Selection Bias for Classification Learning”, In Proceedings of the 23rd International Conference on Machine Learning, pp. 849-856, 2006. [23] Karnan M, Thangavel K, Sivakumar R and Geetha K, “Ant Colony Optimization for Feature selection and Classification of Microcalcifications in Digital Mammograms”, In Proceedings of the International Conference on Advanced Computing and Communications, pp.298-303, 2006.
[24] Enzhe Yu and Sungzoon Cho, “GA-SVM Wrapper Approach for Feature Subset Selection in Keystroke Dynamics Identity Verification”, In Proceedings of the International Joint Conference on Neural Networks, Vol. 3, pp. 2253-2257, 2003. [25] Ki-seok Sung and Sungzoon Cho, “GA SVM Wrapper Ensemble for Keystroke Dynamics Authentication”, In Proceedings of the International Conference on Biometrics, Hong Kong, China, Vol. 3832, pp. 654-660, 2006. [26] Gabriel L. F. B. G. Azevedo, George D. C. Cavalcanti and E.C.B. Carvalho Filho, “An Approach to Feature Extraction for Keystroke Dynamics Systems based on PSO and Feature Weighting”, In Proceedings of the IEEE Congress on Evolutionary Computation, pp. 3577–3584, 2007. [27] Fabian Monrose, Michael K. Reiter and Susanne Wetzel, “Password Hardening Based on Keystroke Dynamics”, In Proceedings of the 6th ACM conference on Computer and communications security, pp. 73-82, 1999. [28] Francesco Bergadana, Daniele Gunetti and Claudia Picardi, “User authentication through Keystroke Dynamics”, ACM Transaction of Information and System Security, Vol. 5, pp. 367-397, 2002. [29] Magalhaes, Paulo Sergio and Henrique Dinis dos, “An improved Statistical Keystroke Dynamics Algorithm”, In Proceedings of the IADIS Virtual Multi Conference on Computer Science and Information Systems, pp. 256-262, 2005. [30] David Martens, Manu De Backer, Raf Haesen, Jan Vanthienen, Monique Snoeck, and Bart Baesens, “Classification with Ant Colony Optimization”, In Proceedings of the IEEE Transactions on Evolutionary Computation, Vol. 11, pp. 651-665, 2007. [31] Haibin Duan and Xiufen Yu, “Hybrid Ant Colony Optimization Using Memetic Algorithm for Traveling Salesman Problem”, In Proceedings of the IEEE International Symposium Approximate Dynamic Programming and Reinforcement Learning (ADPRL) pp. 92-95, 2007. [32] Dorigo .M and Gambardella .L.M., “Ant colonies for the traveling salesman problem”, Bio Systems, 1997. 
[33] Dorigo .M, Maniezzo .V and Colorni .A, ”Positive feed back as a search strategy”, Technical Report Politecnico di Milano, Italy, 1991. [34] Youmei Li and Zongben Xu, “An Ant Colony Optimization Heuristic for solving Maximum Independent Set Problems”, In Proceedings of the Fifth International Conference on Computational Intelligence and Multimedia Applications, pp.206– 211, 2003. [35] Obaidat M. S., and Macchairolo D. T., “A Multilayer Neural Network System for Computer Access Security”, IEEE Transactions on Systems, Man and Cybernetics, Vol.24, pp. 806-813, 1994. [36] Rumelhart .D, Hinton .G and Williams .R, “Learning Internal Representations by Error Propagation”, Parallel distributed processing: explorations in the Microstructure of Cognition, MIT Press, Vol.1, pp. 318–362, 1986. [37] Hecht Nielsen .R, “Neuro Computing,”, Springer-Verlag New York, Inc. pp. 445-453, 1989. [38] Kohonen .T, “The Neural Phonetic Typewriter”, IEEE Computer, Vol. 21, pp. 11-22, 1988. [39] Obaidat .M.S and Walk .J.V, “An Evaluation Study of Traditional and Neural Network Techniques for Image Processing Applications”, In Proceedings of the IEEE 34th Midwest Symposium on Circuits and Systems, Vol.14, pp. 72-75, 1991. [40] Marcus Brown and Samuel J. Rogers, “A Practical Approach to User Authentication”, In Proceedings of the 10th Annual Computer Security Applications Conference, pp. 108-116, 1994. [41] Sajjad Haider Ahmed Abbas K. Zaidi, “A Multi-Technique Approach for User Identification through Keystroke Dynamics”, IEEE Transactions on Systems Man, and Cybernetics, Vol.2, pp.1336–1341, 2000. [42] Brown M, Rogers S.J., “User identification via keystroke characteristics of typed names using neural networks”, International Journal of Man-Machine Studies, Vol. 39, pp. 999-1014, 1993. [43] Nadler M and Smith E P, “Pattern Recognition Engineering”, New York: Wiley-Inter Science, 1993. 
Authors Profile
Marcus Karnan received the BE degree in Electrical and Electronics Engineering from Government College of Technology, Bharathiar University, India. He received the ME degree in Computer Science and Engineering from Government College of Engineering, Manonmaniam Sundaranar University, in 2000, and the PhD degree in Computer Science and Engineering from Gandhigram Rural University, India, in 2007. He is currently working as Professor in the Department of Computer Science & Engineering, Tamilnadu College of Engineering, India. He has been teaching since 1998 and has more than eleven years of industrial and research experience. His areas of interest include medical image processing, artificial intelligence, neural networks, genetic algorithms, pattern recognition and fuzzy logic.

M. Akila received the Bachelor of Computer Science and Engineering from Thiagaraja College of Engineeering, Madurai Kamaraj University, in 1991. She received the Master of Computer Science and Engineering from National Engineering College, Manonmaniam Sundaranar University, in 2003. She is now a research scholar at Anna University, Coimbatore, and works as Assistant Professor at Vivekanandha College of Engineering for Women, Tiruchengode, Tamilnadu, India. Her areas of interest include image processing, pattern recognition and artificial intelligence.

A Rough Set Model for Sequential Pattern Mining with Constraints
Jigyasa Bisaria1, Namita Srivastava2 and Kamal Raj Pardasani3
1,2,3 Department of Mathematics, Maulana Azad National Institute of Technology (A Deemed University), Bhopal M.P.
jigyasab@gmail.com, sri.namita@gmail.com, kamalrajp@hotmail.com

Abstract: Data mining and knowledge discovery methods serve many decision support and engineering application needs of various organisations. Most real world data has a time component inherent in it.
Sequential patterns are inter-event patterns ordered in time and associated with the various objects under study. The analysis and discovery of frequent sequential patterns under user-defined constraints are interesting data mining results. These patterns can serve a variety of enterprise applications concerning analytic and decision support needs. Imposition of various constraints further enhances the quality of mining results and restricts the results to only the relevant patterns. In this paper, we propose a rough set perspective on the problem of constraint-driven mining of sequential patterns. We use the indiscernibility relation from the theory of rough sets to partition the search space of sequential patterns, and propose a novel algorithm that allows pre-visualization of patterns and imposition of various types of constraints in the mining task. The algorithm C-Rough Set Partitioning is at least ten times faster than the naïve algorithm SPRINT, which is based on various types of regular expression constraints.

Keywords: Rough sets, sequential patterns, constraints, indiscernibility, partitioning

1. Introduction

Sequential pattern mining is studied extensively in the data mining literature because it applies to a wide variety of problems. It is used in many real-world decision support applications, such as finding root causes of banking customer churn [8], analysis of web logs [10], fault diagnosis and prediction in telecom networks [9], and the study of adverse drug reactions as temporal association rules [11]. The enormous search space and the huge number of patterns are inherent challenges in the sequence mining task. Conventional studies of sequential pattern mining give various computational methodologies to enumerate the frequent sequence space [1]-[6]. These methods mine all sequential patterns in the support-confidence framework. The computational methodologies in [1]-[5] are bottom-up candidate generate-and-test approaches.
The method PrefixSpan [6] works on the concept of iteratively projecting the database on the basis of the prefix. This method does not generate any candidates and is strictly based on the events present in the database. New-generation mining methods require the retrieval of patterns under user-defined constraints. Imposition of constraints not only condenses the mining results to the most useful ones but also reduces the search space and improves performance.

A constraint C can be regarded as a Boolean function on all sequences. The problem of constraint-based mining of sequential patterns is about finding all those patterns which satisfy C. Under the classical framework, constraints can be classified as monotonic, anti-monotonic and succinct [14]. A constraint C is anti-monotonic if its satisfaction by any sequence a implies its satisfaction by all subsequences of a. A constraint C is monotonic if a sequence a satisfying C implies that every super-sequence of a also satisfies C. Succinct constraints are pre-counting pushable constraints such that, for any sequence a, the satisfaction of the constraint implies its satisfaction by all the elements of sequence a. A succinct constraint is specified using a precise “formula”. According to the “formula”, one can generate all the patterns satisfying a succinct constraint, so there is no need to iteratively check the constraint in the mining process.

Early work on constraint imposition in the sequential pattern mining task is the algorithm GSP [3]. Its authors proposed the concepts of time interval constraint, maximum gap constraint and minimum gap constraint, and built them into the Apriori algorithm framework. Another work in the framework of time interval constraints is given by Mannila et al. [2]. They defined “an episode as a collection of events that occur relatively close to each other in a given partial order.” They considered the importance of the time frame of patterns and gave the concepts of event window and sliding event window.
They defined patterns as directed acyclic graphs in which a vertex is a single event and an edge means “Event A occurs before event B”. Their method of finding frequent episodes is a “bottom-up candidate generate-and-test approach”, which is similar to AprioriAll proposed by Agrawal and Srikant [1]. Masseglia et al. [15] have also proposed time constraint imposition in the mining of sequential patterns. They presented a graph-theoretic mining algorithm to reduce the search space of time-constrained sequential patterns. Garofalakis et al. [16] have given the framework for imposing regular expression constraints in sequential pattern mining. A regular expression R is built from expressions using operators such as disjunction and Kleene closure [18]. R specifies a language of strings over a regular family of sequential patterns that are of interest to the user. They noted that regular expression constraints have the same expressive power as deterministic finite automata [18]. The algorithm SPRINT is a multiple-database-scan candidate generate-and-test strategy based on GSP [3]. The candidate generation strategy works by imposing a relaxed constraint. The method first generates candidates, checks the validity of patterns that satisfy the given regular expression constraint, and then finds the occurrence frequency of the length-1 sequences that cross the minimum support threshold. This becomes the seed set for further iterations. The candidate length-2 sequences are formed by joining the elements of the seed set. The database is then scanned again to search for these candidates, and their counts are accumulated after checking the relaxed constraint. In subsequent iterations, candidate k-length sequences are formed by joining frequent (k-1)-sequences that have the same contiguous subsequences.
Suppose a sequence $S_a = \langle e_1, e_2, \ldots, e_n \rangle$. Another sequence $s_b$ is a contiguous subsequence of $S_a$ if (i) $s_b$ is derived from $S_a$ by dropping an item from either $e_1$ or $e_n$; (ii) $s_b$ is derived from $S_a$ by dropping an item from an element $e_j$ that has at least 2 items; or (iii) $s_b$ is a contiguous subsequence of $s_d$ and $s_d$ is a contiguous subsequence of $S_a$. The process continues until all frequent sequences present in the database that satisfy the relaxed constraint are found.

Given an anti-monotonic constraint, the constraint is imposed first and candidates which do not satisfy it are pruned. Like the support constraint, such a constraint is anti-monotonic: if the constraint is not satisfied by a sub-pattern, it will not be satisfied by its super-pattern either. If the constraint is monotone, an appropriate choice of relaxed constraint is used to generate valid results. The SPRINT family of methods suffers from the drawback of huge query overhead due to multiple scans, and from weak constraint-based candidate generation followed by frequent pattern discovery from amongst the candidate set.

Han et al. [17] have confirmed the value of imposing various user-defined constraints for efficient mining of patterns. They proposed an architecture for mining multidimensional association rules in the framework of online analytical mining. They proposed constraint imposition at the level of the transaction database using the PLSQL query language, the result of which is then subjected to multidimensional association pattern discovery. Pei et al. [14] have studied the process of constraint imposition in the framework of PrefixSpan [6]. They presented constraint imposition in both the classical and the application-centric framework. Their work presents a detailed study of how conventional monotone, anti-monotone and succinct constraints can be studied as a prefix constraint while recursively projecting the database with the same.
Their study confirmed that while the method PrefixSpan is efficient for sequential pattern mining, it is not suitable for constraint-driven mining. They presented a systematic study of the imposition of regular expression and aggregate constraints, with various application-oriented examples of tough but interesting constraints. They defined seven categories of constraints from the application perspective: item, super pattern, time interval, gap between subsequent transactions, regular expression, length of sequence, and various aggregate constraints. Though these are not the complete set of possible constraints, they are comprehensive enough to address most decision-centric constraint imposition tasks.

In this paper, we explain all seven types of constraint and their treatment in the rough set based framework. Here we restrict our discussion to length-1 sequences. These correspond to many real-world sequential patterns, for example web access patterns and faults in telecom landline networks. (i) We propose a user-friendly interface that generates a pre-visualization of a sample of emerging sequential patterns and allows flexible imposition of time, length and gap constraints prior to the mining task, and (ii) we present a novel algorithm based on the indiscernibility relation from the theory of rough sets to address the computational aspect of the expensive problem of mining frequent sequential patterns satisfying item, super pattern and regular expression constraints. Experimental evaluations show that our algorithm is at least 10 times faster than the algorithm SPRINT [16].

2.
Problem Formulation

From the theory of rough sets, an information system is given as $S = \{U, A_t, V, f\}$, where $U = \{x_1, x_2, \ldots, x_n\}$ is a finite set of objects and $A_t$ is a finite set of attributes. $A_t$ is further classified into two disjoint subsets, conditional attributes C and decision attributes D, so that $A_t = C \cup D$. $V = \bigcup_{p \in A_t} V_p$, where $V_p$ is the domain of attribute p, and $f : U \times A_t \to V$ is a total function such that $f(x_i, q) \in V_q$ for every $q \in A_t$ and $x_i \in U$.

Consider an example transaction database as in TABLE I.

Table 1: Example transaction database

Here $A_t = (T, I)$, where T is the set of transaction times and I is the set of itemsets associated with $x_i$. Examples of transaction databases are a database of customer purchase patterns in a retail store, web access details, etc. There are multiple instances of the same customer ($x_i$) in the information system U. An alternate representation of the transaction database, termed a sequence database, is formed by grouping the transactions corresponding to the same $x_i$. The alternate information system is $S' = (U, E)$, where $U = \{x_1, x_2, \ldots, x_n\}$ and $E = \{e_1, e_2, e_3, \ldots, e_m\}$. A sequence or serial episode is defined as a set of events that occur within a predetermined event interval and are associated with the object under study. Given the set of itemsets $I = \{i_1, i_2, i_3, \ldots, i_n\}$, the set of sequences $E \subset A_t$ is formed by combining the itemsets associated with the same object, ordered by time: $\forall e_i \in E$, $e_i = \{i_1, i_2, \ldots, i_l\}$. The length of a sequence is the number of items it contains. A k-sequence contains k items, $k = \sum_j |e_j|$. The absolute support of a sequence $e_i$ is defined as the number of transactions that contain it, and the relative support is defined as sup($e_i$) = absolute support / number of objects in the dataset.
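The construction of the sequence database and the support definitions above can be sketched in Python. This is a minimal illustration, not the authors' implementation: the row layout (object, time, item), the function names and the toy data are assumptions, each element is taken to be a single item, and support is counted over the sequences of the objects.

```python
from collections import defaultdict

def build_sequence_db(transactions):
    """Group (object_id, time, item) rows into one time-ordered sequence per object."""
    grouped = defaultdict(list)
    for obj, time, item in transactions:
        grouped[obj].append((time, item))
    return {obj: [item for _, item in sorted(rows, key=lambda r: r[0])]
            for obj, rows in grouped.items()}

def is_subsequence(pattern, sequence):
    """True if the items of pattern occur in sequence in the same order (gaps allowed)."""
    remaining = iter(sequence)
    return all(item in remaining for item in pattern)

def support(pattern, sequence_db):
    """Return (absolute support, relative support) of pattern over the sequence database."""
    absolute = sum(is_subsequence(pattern, seq) for seq in sequence_db.values())
    return absolute, absolute / len(sequence_db)

# Hypothetical toy transaction database: (object, day, event label).
transactions = [
    ("x1", 1, "a"), ("x1", 5, "b"),
    ("x2", 2, "a"), ("x2", 3, "c"), ("x2", 9, "b"),
    ("x3", 4, "b"), ("x3", 6, "a"),
]
sequence_db = build_sequence_db(transactions)
absolute, relative = support(["a", "b"], sequence_db)
```

With a minimum support threshold of 0.5, the pattern (a, b) would be frequent in this toy data, since it occurs in the sequences of two of the three objects.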
A pattern is frequent if it crosses a user-specified frequency threshold called the minimum support threshold [15]. Given sequences, , represents a disjunction operator which indicates the selection of either of the event patterns. Here, is the ith element of the sequence. is a regular expression constraint. represents the Kleene closure operator, which signifies zero or more occurrences of element .

The problem of constraint-driven mining of sequential patterns is concerned with the discovery of frequent patterns that also satisfy user-specified constraints. Commonly imposed constraints can be classified in the following categories.

Constraint type 1: (Item constraint) An item constraint specifies a subset of items that should or should not be present in the patterns. Considering the case of n size length-1 sequential patterns V also corresponds to subsequence relation. (1) Where V is the subset of items. If then the item constraint is both anti-monotone and succinct under operation. If then the item constraint is both monotone and succinct under operation. An example of a type 1 constraint is the discovery of a specific web usage pattern of customers characterized by one type of site, for example online gift stores. Another example comes from fault diagnosis in telecom landline networks: a constraint of type 1 can be characterized by all sequential patterns in which the fault signal “dead phone” is present or absent. If T is the set of gift stores on the web then, (2) Given the domain of all unique sequential patterns, all transactions that follow the type 1 constraint are the members of the indiscernibility relation formed by the equivalence class of patterns indiscernible with respect to the concept of pattern existence. (3)

Constraint type 2: (Super pattern constraint) A super pattern constraint finds those patterns which encapsulate a user-specified sequence.
(4) For example, in the web browsing patterns of customers, a pattern of type 2 can be a web access pattern which encapsulates the subsequence (online advertisement, product site). The super pattern constraint is monotone and succinct.

Constraint type 3: (Time interval constraint) A transaction database has time stamp information against event labels. The time interval or duration constraint selects the set of sequences with the property that the time interval between the first and last transaction is less than or greater than a specific value. (5) Where and is a given integer. The length of the sequential pattern depends on the choice of the time interval under study. Let $T \subset A_t$, and let $t_s$ be the start time and $t_e$ the end time for the study of transaction patterns. Then the event/time interval for the study of patterns is given by $t_e - t_s$ for the given information system S. If we group the transaction information $I \subset A_t$ corresponding to the same $x_i$, we derive an alternate representation of the information system S. If we impose a time interval restriction, we derive the sequence database in the constrained time interval. The maximum length can be controlled by an appropriate choice of the time interval constraint. Consider the transaction database in TABLE I. If the time interval under consideration is 20 days, then the sequence database is as given in TABLE II, and if the time interval under consideration is 25 days, then the derived sequence database is given by TABLE III. Both length and time interval constraints are anti-monotone under operation and they are monotone and succinct under the operation.

Constraint type 4: (Length constraint) In the case of length-1 sequences, this type of constraint restricts the size of the sequence under consideration. It can be a restriction of the maximal pattern length. (6) Consider the example in TABLES I, II and III: the maximum length of a sequential pattern in TABLE II is 5, while in TABLE III it is 3.
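The time interval and length constraints (types 3 and 4) act on how the sequence database is derived from the transaction database. A minimal Python sketch of that derivation follows. The row layout, the function names, the toy data and the reading of the time interval as a fixed study window [t_start, t_end] are assumptions made for illustration, not the paper's implementation.

```python
from collections import defaultdict

def sequences_in_window(transactions, t_start, t_end):
    """Build per-object sequences from rows whose time lies in [t_start, t_end]."""
    grouped = defaultdict(list)
    for obj, time, item in transactions:
        if t_start <= time <= t_end:
            grouped[obj].append((time, item))
    return {obj: [item for _, item in sorted(rows, key=lambda r: r[0])]
            for obj, rows in grouped.items()}

def restrict_length(sequence_db, max_len):
    """Keep only the sequences with at most max_len items (length constraint)."""
    return {obj: seq for obj, seq in sequence_db.items() if len(seq) <= max_len}

# Hypothetical rows: (object, day number, event label).
rows = [
    ("x1", 1, "a"), ("x1", 10, "b"), ("x1", 30, "c"),
    ("x2", 5, "a"), ("x2", 22, "b"),
]
db_20 = sequences_in_window(rows, 0, 20)   # study window of 20 days
db_25 = sequences_in_window(rows, 0, 25)   # study window of 25 days
short_only = restrict_length(sequences_in_window(rows, 0, 30), max_len=2)
```

Changing the window changes which events enter each sequence, which is why the derived sequence database, and with it the maximum pattern length, depends on the chosen interval.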
Constraint type 5: An aggregate constraint is a constraint on an aggregate of items in a pattern, where the aggregate function can be sum, avg, max, min, standard deviation, etc. For example, in market basket analysis a retail store might be interested in those items for which the sum of the bill was more than Rs. 2000. Some aggregate functions, like sum and average over both positive and negative values, are neither monotone, anti-monotone nor succinct.

Constraint type 6: (Regular expression constraint) Regular expression constraints are specified as a regular expression over the set of items using regular expression operators like disjunction or Kleene closure. A sequential pattern satisfies a regular expression constraint if and only if the pattern is accepted by the equivalent finite automaton. Like aggregate constraints, regular expression constraints are neither monotone, anti-monotone nor succinct.

Constraint type 7: (Gap constraint in adjacent transactions) In many cases, transaction events have to be equispaced in time; that is, the time gap between subsequent transactions has to be either greater or smaller than a prespecified gap. (7) Where and is a given integer. The gap constraint has the anti-monotone property.

We categorize constraints in two categories. Those which influence the sequence database, like length of pattern, time interval and gap between subsequent patterns, are named CAT1. The others are constraints which mine specific patterns in the sequence database under study. Examples of this second category are regular expression, item, super pattern and aggregate constraints, named CAT2.

3. Proposed Model and Method

The proposed algorithm C-RSP is a break and search strategy. C-RSP proposes a complete mining system that allows imposition of all types of constraint.
The input to the problem of mining sequential patterns under user-defined constraints is the transaction database of the objects under study. A sample database is given in TABLE I. The resultant sequence database is governed by the user's choice of time interval and maximum length constraint. The algorithm first presents a user interface that allows flexible and adjustable imposition of CAT1 types of constraints. Once the user has derived the relevant sequence database by imposition of CAT1 constraints, the sequence database becomes the input to the mining of patterns under CAT2 constraints. This is done by presenting a user interface that gives a view of the sequence database on choosing an appropriate time interval. Figure 1 gives the user interface that allows pre-visualization of sequences formed by transactions indiscernible with respect to the maximal time interval of patterns. Figure 2 gives the user interface for pre-visualization of the maximum length of patterns as a result of the user's choice of time interval.

Figure 1. Patterns with constraint
Figure 2. Patterns with constraint

Figure 3 and Figure 4 give PLSQL code snippets for finding the appropriate sequence database under the above-mentioned constraints.

    P Top k LocationId from Table1
        where transaction_date >= Tstart AND transaction_date <= Tend
    -- P is the project operator of relational algebra, which implies Select Distinct
    -- k is the number of records the user wishes to visualize
    FOR each customer id in rec_inner_test LOOP
        return_str := '';
        FOR i IN 1..rec_inner_test.COUNT LOOP
            -- append each signal to the sequence string, separated by ':'
            return_str := return_str || rec_inner_test(i).signal || ':';
        END LOOP;
        Update the Sequence_table with the sequence return_str against each LocationId
    END LOOP;

Figure 3.
Algorithm pseudocode to derive sequences from the transaction database in a user-specified time interval

    P Top k LocationId from Table1
        where Lengthofsequence <= n
    -- P is the project operator of relational algebra, which implies Select Distinct
    -- k is the number of records the user wishes to visualize
    FOR each customer id in rec_inner_test LOOP
        return_str := '';
        FOR i IN 1..rec_inner_test.COUNT LOOP
            -- append each signal to the sequence string, separated by ':'
            return_str := return_str || rec_inner_test(i).signal || ':';
        END LOOP;
        Update the Sequence_table with the sequence return_str against each LocationId
    END LOOP;

Figure 4. Algorithm pseudocode to derive sequences from the transaction database with a user-specified maximum length

Suppose the sequence database is as in TABLE II. The task now is to enumerate the frequent sequence space under the user-defined constraints of category CAT2. The method C-Rough Set Partitioning is a divide and conquer strategy. We scan the database once and store all the data in the attribute set of events into two data structures. One is the domain of set E, containing all unique sequences and itemsets in S.

Step 1: To find frequent items, we query all unique itemsets and store them in a set $\hat{I}$. We partition the set V in such a way that all sequences and subsequences with the same prefix are stored in one equivalence class. Thus each element in $\hat{I}$ has a corresponding equivalence class partition in V. Considering the sequence database in TABLE II, the partitions in V are given in Figure 5.

Figure 5. Partitions in V on the basis of prefix indiscernibility

Lemma 1: All equivalence classes formed by patterns with the same prefix form a partition of the database under study.

Proof: An equivalence class [12] is formed by elements which can be treated as equivalent in some way. An equivalence relation on a set forms a natural partitioning into groups of like objects.
From the theory of rough sets [13], given a knowledge base U, a concept is a relation which forms a partition of a certain universe into families. A family $C = \{y_1, y_2, \ldots, y_n\}$ with $y_i \subseteq U$ is a partition of U if the following conditions are satisfied:
Condition 1: $y_i \neq \emptyset$
Condition 2: $y_i \cap y_j = \emptyset$ for $i \neq j$, $i, j = 1, 2, \ldots, n$
Condition 3: $\bigcup_{i=1}^{n} y_i = U$
Given the domain $V = \bigcup_{s \in S} V_s$ of all sequences present in the database under study, V can be partitioned on the basis of equivalence classes $y_i$ such that each $y_i$ contains the patterns with the same prefix. Condition 1 is satisfied since each class $y_i$ is formed from at least one element of V. Condition 2 is satisfied since each element of V has exactly one prefix, so no element of V can belong to two different equivalence classes. Since every member of V, whatever its prefix, is in some equivalence class, the union of all equivalence classes results in V:
$\bigcup_{i=1}^{n} y_i = V$
Now the database is in good form for imposition of the various constraints of CAT2: item constraints, super pattern constraints, regular expression constraints and other complex constraints.

Case 1: Suppose the user wants to find all frequent sequences that have pattern in them. The algorithm finds the patterns which are indiscernible on the basis of pattern existence. (7)

Step 2: We maintain an array of frequencies which is of the size of the set V. The following steps explain the support counting process in the indiscernibility mapping:
Step 2.1: For all tuples in the sequence database S,
Step 2.2: Deduce the subsequences and check whether each subsequence is a superset of pattern b.
Step 2.3: Each subsequence found accounts for an increment in the element frequency at the appropriate index in the partition and one increment to its subset. The process continues till all elements of S are considered.
The process of mapping an item constraint is the same as that of a super pattern constraint.

Case 2: Suppose we desire to impose the regular expression constraint characterized by a disjunction operator.
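Step 1 and Steps 2.1 to 2.3 of Case 1 can be sketched together in Python. The sketch below is an illustration under stated assumptions rather than the C-RSP implementation: sequences are tuples of single items, the prefix of a pattern is taken to be its first item, subsequences are enumerated exhaustively (exponential in the sequence length, so suitable only for small examples), and all names and data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def subsequences(seq):
    """All distinct non-empty order-preserving subsequences of seq."""
    found = set()
    for k in range(1, len(seq) + 1):
        for positions in combinations(range(len(seq)), k):
            found.add(tuple(seq[i] for i in positions))
    return found

def contains(pattern, seq):
    """True if pattern occurs in seq as an order-preserving subsequence."""
    remaining = iter(seq)
    return all(item in remaining for item in pattern)

def count_super_patterns(sequence_db, required, min_count):
    """One pass over the database: count every subsequence that encapsulates
    `required`, storing the counts in equivalence classes keyed by prefix."""
    partitions = defaultdict(lambda: defaultdict(int))
    for seq in sequence_db:
        for sub in subsequences(seq):
            if contains(required, sub):
                partitions[sub[0]][sub] += 1
    frequent = {}
    for prefix, patterns in partitions.items():
        kept = {p: c for p, c in patterns.items() if c >= min_count}
        if kept:
            frequent[prefix] = kept
    return frequent

# Hypothetical sequence database and super pattern constraint.
sequence_db = [("a", "b", "c"), ("a", "c"), ("b", "a", "c")]
result = count_super_patterns(sequence_db, required=("a", "c"), min_count=2)
```

In this toy run only the class with prefix a survives: the pattern (a, c) is counted once in each of the three sequences, while (a, b, c) and (b, a, c) each occur once and fall below the threshold.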
(8) This can be imposed by restricting the indiscernibility mapping to patterns (9)

Case 3: Consider an example of fault pattern mining in telecom landline access networks. Often the user wants to mine the support of a pattern within a pre-specified time interval with specific items of interest embedded in the sequence. These types of dirty constraints cannot be handled by PrefixSpan-based methods, and even the class of SPRINT-based methods is inadequate for handling such combinations of constraints. With C-RSP such constraints can easily be built into the sequence mining task. In the above example, the time constraint can be imposed at the level of the transaction database, and pattern existence and support counting are built onto the sequence database.

4. Results and discussion

We have compared the efficiency of C-RSP with the naïve SPRINT(N). C-RSP was found to be more than 10 times faster than SPRINT. Figures 6, 7 and 8 give the runtime comparison of C-RSP with SPRINT under imposition of time interval and length constraints respectively. Figure 6 gives the comparative efficiency on imposition of the time constraint on real data of network fault patterns in the telecom landline networks of Madhya Pradesh in India. The time period of the data was set by the knowledge worker to three months. The algorithm C-RSP is implemented in JDK1.3. The preprocessing step is a Java program which connects to the database as in TABLE I and invokes a PLSQL cursor which creates TABLE II. The entire process is undertaken using the Java Database Connectivity (JDBC) interface. It connects to the database in MSSQL Server 2005 as in TABLE II and fetches the data into data structures using JDBC. The machine used is an HP ProLiant DL580 G5 with an Intel Xeon 1.6 GHz processor and 8 GB RAM. The operating system is MS Windows Server 2003 R2. The data comprised 75833 records of voice-related gross faults collected over a time window of three months. There are 215 distinct elements in the sequences and the maximum length of a sequence is 14.
The algorithm SPRINT is also programmed on the same machine using JDK1.3. The time constraint imposition is done at the level of generating candidates. Only those candidates which satisfy the specified time constraints are considered in the support counting process in the subsequent scan of the data.

Figure 6. Runtime evaluation on real data of network faults on imposition of time constraint

Other experiments on efficiency are performed on data similar to the data generated by the synthetic data generation program at http://www.almaden.ibm.com/cs/quest. The parameters of the dataset are described below.
|D|: size of the database (number of customers)
|C|: average number of transactions per customer
|I|: average size of itemset in maximal potentially large sequence
|N|: number of items
Here we have imposed the constraint on the maximum length of the pattern. The maximum length of the pattern is restricted to 14 in the dataset under consideration.

Figure 7. Runtime evaluations of synthetic data on imposition of length constraint

It is clear from the above graphs that C-RSP outperforms the SPRINT family of methods by an order of magnitude. This is due to the partitioning of the search space, the imposition of constraints at the preprocessing level, and the avoidance of repeated validity checks. There is no candidate generation since we only fetch data into data structures and apply computation logic to them. The method C-RSP requires only one or two scans of the database, while SPRINT repeatedly scans the database and works on a candidate generate-and-test strategy. The constraint imposition strategies allow imposition of individual and composite constraints.

5.
Conclusion

The following are the benefits of the proposed model:
(i) Since support counting is usually the most costly step in sequential pattern mining, the proposed technique improves performance greatly by avoiding costly scanning. The algorithm is also strictly based on elements that exist in the database under study. The partitions, once constructed and stored, can be used to mine further data increments in the database.
(ii) The creation of equivalence classes by the indiscernibility relation greatly reduces the search space. Especially with imposition of CAT2 constraints, the search space is restricted to a specific equivalence class.
(iii) The dynamic frequency accumulation scheme in each partition saves computation time.
(iv) While other methods search the whole search space, our method partitions the problem into subproblems.
(v) The categorization of constraints enables a flexible and adjustable constraint imposition scheme on various data representations.
(vi) Based on the experimental results obtained and depicted in the graphs, we conclude that C-RSP is at least 10 times faster than SPRINT.

References

[1] R. Agrawal and R. Srikant, “Mining Sequential Patterns”, In Proceedings of the International Conference on Data Engineering, pp. 3-14, 1995.
[2] Mannila H., Toivonen H. and Verkamo A. I., “Discovering frequent episodes in sequences”, In Proceedings of the International Conference on Knowledge Discovery and Data Mining, IEEE Computer Society Press, pp. 210-215, 1995.
[3] R. Srikant and R. Agrawal, “Mining sequential patterns: Generalizations and performance improvements”, In Proc. 5th Int. Conf. Extending Database Technology (EDBT’96), pp. 3-17, Avignon, France, March 1996.
[4] Jay Ayres, Johannes Gehrke, Tomi Yiu and Jason Flannick, “Sequential Pattern Mining using A Bitmap Representation”, In Proc. of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Edmonton, Alberta, Canada, pp. 429-435, 2002.
[5] Zaki M. J., “SPADE: An efficient algorithm for mining frequent sequences”, Machine Learning 42(1/2), pp. 31-60, 2001.
[6] Jian Pei, Jiawei Han, Behzad Mortazavi-Asl, Jianyong Wang, Qiming Chen, Umeshwar Dayal, Mei-Chun Hsu, “Mining Sequential Patterns by Pattern-Growth: The PrefixSpan Approach”, IEEE Transactions on Knowledge and Data Engineering, Vol. 16, No. 11, pp. 1424-1440, 2004.
[7] Yen Liang Chen, Mei Ching Chiang, Ming-Tat Ko, “Discovering time-interval sequential patterns in sequence databases”, Expert Systems with Applications 25, pp. 343-354, 2003.
[8] Ding-An Chiang, Yi-Fan Wang, Shao-Lun Lee, Cheng-Jung Lin, “Goal-oriented sequential pattern for network banking churn analysis”, Expert Systems with Applications 25, pp. 293–302, 2003.
[9] Sasisekharan R., Seshadri V., Weiss S., “Data mining and forecasting in large-scale telecommunication networks”, IEEE Expert 11(1), pp. 37-43, 1995.
[10] J. Pei, J. Han, B. Mortazavi-Asl and H. Zhu, “Mining Access Patterns Efficiently from Web Logs”, In Proc. Pacific-Asia Conf. on Knowledge Discovery and Data Mining (PAKDD'00), Kyoto, Japan, pp. 396-407, 2000.
[11] Huidong Jin, Jie Chen, Hongxing He, Graham J. Williams, Chris Kelman and Christine M. O’Keefe, “Mining Unexpected Temporal Associations: Applications in Detecting Adverse Drug Reactions”, IEEE Transactions on Information Technology in Biomedicine, Vol. 12, Issue 4, pp. 488–500, July 2008.
[12] Jigyasa Bisaria, Namita Srivastava, K. R. Pardasani, “A Rough Sets Partitioning Model for Mining Sequential Patterns with Time Constraint”, International Journal of Computer Science and Information Security, Vol. 2, No. 1, pp. 178-189, June 2009.
[13] Z. Pawlak, “Rough Sets, Theoretical Aspects of Reasoning about Data”, Springer, 1991.
[14] Jian Pei, Jiawei Han, Wei Wang, “Constraint-based sequential pattern mining: the pattern-growth methods”, Journal of Intelligent Information Systems 28,
pp. 133–160, 2007.
[15] F. Masseglia, P. Poncelet, M. Teisseire, “Efficient mining of sequential patterns with time constraints: reducing the combinations”, Expert Systems with Applications, Elsevier, Vol. 40, N. 3, 29, pp. 2677-2690, 2008.
[16] Minos N. Garofalakis, Rajeev Rastogi, Kyuseok Shim, “SPRINT: Sequential Pattern Mining with Regular Expression Constraints”, Proceedings of the 25th VLDB Conference, Edinburgh, Scotland, 1999.
[17] Laksmanan, Han, Raymond T, “Constraint based multidimensional mining”, SIGKDD 2006.
[18] H. R. Lewis and C. Papadimitriou, “Elements of the Theory of Computation”, Prentice Hall, Inc., 1981.

Authors Profile

Jigyasa Bisaria is a faculty member and research fellow with the Department of Mathematics, Maulana Azad National Institute of Technology, Bhopal, India. Her research interests are predictive data mining and its applications to real-world problems.

Dr. Namita Srivastava is working as Assistant Professor with the Department of Mathematics, Maulana Azad National Institute of Technology. She obtained her PhD in Mathematics in 1992 in crack problems. Her current research interests are data mining and its applications.

Dr. Kamal Raj Pardasani is working as Professor and Head of the Department of Mathematics and Dean, Research and Development, Maulana Azad National Institute of Technology, Bhopal. He obtained his PhD in Applied Mathematics in 1988. His current research interests are computational biology, data warehousing and mining, bio-computing and finite element modeling.

Hybrid Content Location Failure Tolerant Protocol for Wireless Ad Hoc Networks

Maher BEN JEMAA
Research unit on Development and Control of Distributed Applications “ReDCAD”, Department of Computer Science and Applied Mathematics, National School of Engineers of Sfax, Tunisia
maher.benjemaa@enis.rnu.tn

Abstract: The current evolution of networks has permitted the emergence of new service providers offering services of different types and qualities, including not only simple services like printers but also complex services like real-time encoding. However, to benefit from a network service, the user needs to be provided with the address of this service. With this evolution, static configuration will no longer be practical when there is a high number of diversified services. Service location protocols try to overcome this drawback by providing the user with efficient and flexible access to services. Changes of location and the advertisement of services are of particular importance in mobile environments like ad hoc networks. Indeed, ad hoc networks are wireless self-organizing networks of mobile nodes that require no fixed infrastructure. The purpose of this paper is the implementation of a new service location protocol, HCLFTP (Hybrid Content Location Failure Tolerant Protocol), for ad hoc networks within the “Network Simulator” environment.

Keywords: services advertisement, service location, mobile ad hoc networks, hash table.

1. Introduction

For the deployment of an ad hoc network without infrastructure, we can consider a distributed solution that enables users to extend their communications beyond the range of their radio interface. Each user can relay messages to ensure that all users can be reached, regardless of distance, provided there are enough users on the path. The network is self-provided and supported by the collaboration of all participants. Figure 1 shows an example.
Here, computer A wants to communicate with computer C. As they are not in direct communication range, A sends its message to B (a phone), which in turn transmits the same message to C. While the relaying mechanism is easy to understand when there are few devices, what happens when they proliferate? How, for example, in the case presented in Figure 2, do we identify the router to use in order to transmit a message from one point of the network to another? Moreover, the changing needs of mobile users have led to the emergence of new challenges. Despite the constraints of ad hoc networks, the evolution of services no longer allows static service configurations on mobile devices. Hence, a protocol for locating and dynamically deploying services is needed to give the user flexible communication [1].

Figure 1. Example of an ad hoc network routing
Figure 2. A large scale ad hoc network

Mobile ad hoc networks are self-organized networks with no central control entity and a dynamic topology governed by the connection and disconnection of nodes. This evolution of networks towards dynamic, non-centralized architectures, together with the development of new types of services in addition to data exchange, has led to several problems concerning the detection of contents and services (see Figure 3). Furthermore, in order to access a service, the user must at least know the network address of the host that provides this service [5]. With wired communication, the solution is simple: it suffices to have servers that list the available services. As these servers are always available, a client just asks them for the address of the host providing the desired service. Providing a service is equally easy: the service is published on one of these servers, which makes it available to the rest of the network [10]. Figure 4 shows the principle of service location with a central server. In ad hoc networks, such centralization of data is inadequate.
Indeed, the servers that list the services may be inaccessible because of mobility. If we consider that the network is composed only of mobile entities, and therefore of small devices, it is unlikely that the network contains a node with sufficient memory, energy and bandwidth to store all the available services and answer the requests of all other network members. It is therefore appropriate to propose methods to distribute this information in the network [2].

Figure 3. Discovery of a printer
Figure 4. Central search

Once the service has been found, it will be used. A new problem arises when we move from wired networks to ad hoc networks. In wired communication, once the connection is established, it remains valid throughout the phase of service use. If this is not the case, the network is considered down. By contrast, in ad hoc networks this reliability no longer holds. Indeed, it is not uncommon for the mobility of nodes to separate the network into several disjoint components. While most routing protocols propose to change paths when they become invalid, once two nodes can no longer physically reach each other, it is often too late to try anything. It then becomes interesting to analyze the state of the network to try to predict when nodes will be physically disconnected. If the event is predicted well in advance, it is not too late to respond. The reaction may take different forms: duplicating the service, looking for another node providing an equivalent service, or strengthening the connection.

Mobile ad hoc networks are characterized by the following [3].
-Dynamic topology: The mobile units of the network move freely and arbitrarily. Hence the network topology may change quickly and randomly, at unpredictable moments. The links of the topology can be unidirectional or bidirectional.
-Limited bandwidth: One of the primary features of wireless communication networks is the use of a shared communication medium. Because of this sharing, the bandwidth available to each host is low.
-Energy constraints: The mobile hosts are powered by independent power sources such as batteries. The energy parameter must be taken into account in any control performed by the system.
-Limited physical security: Mobile ad hoc networks are more exposed to security threats than conventional wired networks. Indeed, physical constraints and limitations reduce the control that can be applied to the transferred data.
-Lack of infrastructure: Ad hoc networks differ from other mobile networks by the absence of any existing infrastructure and of any kind of centralized administration. The mobile hosts are responsible for establishing and maintaining network connectivity on an ongoing basis.

In an ad hoc network, meeting the requirements and demands of applications or users for many services and contents raises many challenges, brought about by the distributed nature of the network and the use of wireless communication interfaces. Obtaining, in a dynamic and decentralized environment, a working environment with a quality equivalent to that provided by a wired network is very difficult but not impossible [11].

We present in this paper the basic principles of HCLFTP (Hybrid Content Location Failure Tolerant Protocol), which meets the following objectives: (i) ensure fault tolerance and improve load distribution; (ii) improve the locating and routing of data by using a system based on hash functions for nodes and data; (iii) keep the required data structures small, so that a data item can be located quickly; and (iv) implement both replication techniques to ensure data persistence and caching mechanisms to improve data availability [8].

This paper is organized as follows. In section 2, we present the main methods of locating content.
Section 3 describes the HCLFTP protocol, dedicated to service discovery in ad hoc networks. In section 4, we present the simulation results of this approach in the “Network Simulator” (NS) simulation environment. Finally, we summarize our contributions and the future prospects of our research work.

2. A survey of content localization protocols
Earlier solutions for locating nodes and routing data do not scale. Indeed, changes in such systems are numerous and fast: a node may be present in a system for ten minutes and then disappear. New mechanisms for locating and routing are therefore required [4]. Content-based systems can be classified according to their techniques for locating and routing data.

Figure 5. Search Content

2.1 Location with centralized directory
In a centralized architecture, a single server directs all users. Note that a hierarchy appears, since not all computers have the same role. Each time a user submits a query, the central server creates a list of resources matching the query by checking in its database the resources belonging to users connected to the network. One of the main advantages of this model is the central index, which locates resources quickly and efficiently [7].

Figure 6. Centralized network
1: The node sends its query to the server.
2: If the server knows a node that can respond to the query, it sends that node's address; otherwise it queries the connected nodes.
3: The download takes place over a direct connection between the two nodes.

The advantage of this technique lies in the centralized index of all directories, files or resources shared by the nodes in the network. In general, this database is updated in real time, as soon as a new user connects.
All users are required to be connected to the server's network: a request therefore reaches all users, making the search more relevant. The main problem, however, is that this type of system has a single point of entry into the network and is not immune to a server failure, which may block the whole application.

2.2 Hybrid Location
The hybrid model involves super-nodes. A super-node in these networks is a node that meets several criteria, most often related to available bandwidth, CPU power and availability on the network. This model combines the advantages of both types of networks (centralized and decentralized). Indeed, its structure reduces the number of connections on each server and thus avoids bandwidth problems [6].

Figure 7. Hybrid Network

The operation of this model is similar to that of the centralized model. The user sends a query to the server. The search is carried out across all the super-nodes, which together hold all the users’ data. This solution requires each node to be identified. It returns a list of nodes hosting a service that answers the query. The user then simply connects directly to the corresponding node and starts accessing the resources. The ring structure of super-nodes allows load balancing and dilutes the risk of local failure and interruption of service. Indeed, if a super-node is not available, other servers carry out its tasks, transparently to the user (automatic reconnection to another server). By connecting all the nodes to these rings, we get the simplicity of a centralized system with the robustness of a decentralized system.

2.3 Location by flooding
As a first step, each node looks for other nodes on the network to announce its presence. Once integrated into the network, nodes query each other. These requests remain active until the entire network has been covered.
The operating principle is as follows (Figure 8): a node "A", running specific software (which acts as both client and server), connects to a node "B" also equipped with this software. Thus "A" announces to "B" that it is "alive". "B" relays this information to all its neighbor nodes: "C", "D", "E" and "F". These in turn relay the information to the nodes they are connected to, and so on across all the nodes of the network. Once "A" is recognized as "alive" by the other members of the network, it can search for content of interest in the directories or shared resources of the other members. The request is sent to all members of the network, starting with "B", then all other members. If a node has the resource, it forwards the information to "A". The latter can then open a direct connection to that node and use the service.

Figure 8. Decentralized network

2.4 Location with distributed hash table
If we exclude services based on centralized directories or on message flooding, we must consider that information about resource location is distributed so that each node has to maintain only a small amount of routing information. The search for a resource then proceeds incrementally, by transferring the corresponding request hop by hop to the nodes that are better "informed" to respond. The nodes of such a system must continually adapt to changes in their environment. The following five ideas describe a set of mechanisms for locating with a distributed hash table [9].
-Identification of nodes and resources: each host is assigned a numerical identifier computed by a hash function applied to its IP address. Each document or shared resource is also assigned a numerical identifier (based on a hash function applied to its content or its name).
-Distribution of node responsibilities: when a node is present in the system, it is assigned responsibility for several resources.
-Organization of routing information: for reasons of scalability, each node maintains only a partial view of the topology of the network in which it participates. Thus, each node knows a subset of the nodes in the system.
-Lookup solving: a host must be able to request access to a document or a shared resource in the system. To do this, it must know the value of the key corresponding to this resource. This query is called a lookup. The result of a lookup in the system, for a key k, is a reference to the node responsible for key k. To resolve a lookup, a host starts by searching, among the hosts it knows, for the one that best satisfies the relationship with the key corresponding to the searched resource. It forwards the request to this node, which performs the same operation. The search spreads from node to node and ends when it reaches the host that is actually responsible for the key. The node issuing the request is then informed of the identity of the node responsible for the key.
-Management of arrivals and departures of nodes: when nodes arrive or depart, the system adapts and reassigns the responsibilities of the nodes to remain in a coherent state. To enter the system, a node only needs to have access to a node already present.

3. Proposed protocol: HCLFTP
In this section, we present HCLFTP, designed specifically for ad hoc networks. Before describing the basic concepts and architecture of HCLFTP, we present the assumptions made about the ad hoc environment considered in the design of this protocol. The target environment of HCLFTP is a dense, large-scale mobile ad hoc network. The nodes can connect, disconnect, or move in the network at any time. All nodes know their own location, using GPS or relative coordinates.
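The lookup mechanism of Section 2.4 can be illustrated with a small sketch. It is not taken from the paper and rests on several assumptions: identifiers live on a ring of size 2^32, the "relationship with the key" is taken to be the circular distance between identifiers, and each node's partial view contains only its two ring neighbors. The names `Node`, `ident`, `distance` and `build_ring` are hypothetical.

```python
import hashlib

RING = 2 ** 32  # assumed size of the identifier space


def ident(name):
    """Hash a name (IP address, resource name) to a ring identifier."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def distance(a, b):
    """Circular distance between two identifiers (assumed proximity relation)."""
    d = abs(a - b) % RING
    return min(d, RING - d)


class Node:
    def __init__(self, address):
        self.address = address
        self.node_id = ident(address)
        self.known = []  # partial view of the network

    def lookup(self, key, hops=0):
        """Forward the request to the best-placed known node, hop by hop."""
        # Putting self first means a tie is resolved locally, so every
        # forwarding step strictly reduces the distance to the key.
        best = min([self] + self.known, key=lambda n: distance(n.node_id, key))
        if best is self:
            return self, hops  # this node is responsible for the key
        return best.lookup(key, hops + 1)


def build_ring(addresses):
    """Give every node a partial view: its predecessor and successor."""
    nodes = sorted((Node(a) for a in addresses), key=lambda n: n.node_id)
    for i, node in enumerate(nodes):
        node.known = [nodes[i - 1], nodes[(i + 1) % len(nodes)]]
    return nodes


if __name__ == "__main__":
    ring = build_ring(["10.0.0.%d" % i for i in range(1, 9)])
    owner, hops = ring[0].lookup(ident("printer-service"))
    print(owner.address, hops)
```

With only two neighbors per node the number of hops grows linearly with the network size; practical distributed hash tables enlarge the partial view to shorten the path.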
The objective of HCLFTP is to provide an effective mechanism for content localization in dense, large-scale ad hoc networks. To achieve this objective, several components have been implemented: a hash function that maps a content identifier to the corresponding area, a recursive function for splitting and merging the network, and a function for dissemination and localization of content based on geographical properties.

3.1 The hash function
The hash table technique is used both for the dissemination and for the localization of content. When a server wants to announce a content, it must first apply a hash function. This function allows the server to determine the set of nodes where the content is to be published. In HCLFTP, the content is published by a set of nodes located in a particular geographic area of the network. The entire ad hoc network is assumed to be divided into n zones; the hash function therefore returns a numeric identifier between 0 and n-1. A content whose hash value is equal to i is hosted by the nodes located in zone zi. The hash function also allows the localization of content. Indeed, the request of a user looking for a content whose hash value is equal to i is redirected to the zone zi, where it is resolved. The reason for disseminating content in an area rather than on a single node is mainly that maintaining a rigid, predefined structure between nodes in a mobile radio environment is quite costly in terms of energy and bandwidth. In addition, routing packets in ad hoc networks is far less efficient and less robust than in fixed networks, so adjusting the structure to account for mobility is even more expensive. This degrades system performance and limits scalability.

3.2 Function split of the network
The main objective of splitting the network is to distribute its load while maintaining the topology of zones.
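As an illustration of the zone mapping of Section 3.1, the sketch below maps a content identifier to a zone index between 0 and n-1 and uses the same mapping for publication and for localization. The paper does not specify the hash function; SHA-256 reduced modulo n is an assumption, and `zone_of` and `ZoneDirectory` are hypothetical names. Each zone is modeled as a plain dictionary rather than as a set of mobile nodes.

```python
import hashlib


def zone_of(content_id, n_zones):
    """Map a content identifier to a zone index in [0, n_zones - 1]."""
    if n_zones <= 0:
        raise ValueError("n_zones must be positive")
    digest = hashlib.sha256(content_id.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % n_zones


class ZoneDirectory:
    """Toy model: each zone stores the contents whose hash equals its index."""

    def __init__(self, n_zones):
        self.n_zones = n_zones
        self.zones = [dict() for _ in range(n_zones)]

    def publish(self, content_id, provider):
        """Advertise a content in the zone selected by the hash function."""
        i = zone_of(content_id, self.n_zones)
        self.zones[i][content_id] = provider
        return i

    def locate(self, content_id):
        """Redirect the request to the same zone and resolve it there."""
        i = zone_of(content_id, self.n_zones)
        return i, self.zones[i].get(content_id)


if __name__ == "__main__":
    directory = ZoneDirectory(8)
    directory.publish("printer/laser-2", "node-17")
    print(directory.locate("printer/laser-2"))
```

Because the provider and the client compute the same hash, no global index is needed: the request is routed to a zone, and any node of that zone holding the advertisement can answer.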
If the number of nodes and contents in an unstructured area with no central controlling entity exceeds a certain threshold, the cost of locating and publishing content becomes too high. Therefore, a zone can be divided recursively into sub-areas to ensure better performance and achieve a uniform distribution of the load. In HCLFTP, to measure the load in an area, we rely on the following test.

$n_h \cdot nc_h < Th$ (1)

where $n_h$ is the number of nodes located in zone h, $nc_h$ is the number of contents hosted in the same area, and $Th$ is the threshold below which a uniform distribution of the load is ensured. Indeed, $n_h \cdot nc_h$ represents the cost of disseminating $nc_h$ contents to $n_h$ nodes; this dissemination is based on a simple flooding method. The decomposition of the network into zones is applied whenever $n_h \cdot nc_h$ exceeds $Th$. This recursive decomposition stops once $n_h \cdot nc_h$ falls below the threshold $Th$.

The advantage of a recursive dissemination of contents is the uniform distribution of load in dense and non-uniform networks. Indeed, choosing an area of a network, which is potentially broken down into different areas, requires knowledge of local information concerning the density and the number of contents to host. Our goal is to use a protocol for content localization in which the decision is recursively delegated to appropriate nodes.

In the following, we present the mechanism of network decomposition in the simplest case, where the topology of the network is uniform and the hash result is evenly distributed between 0 and n-1. In a first step, if the inequality $nc_1 \cdot n_1 > Th$ holds ($nc_1$ and $n_1$ denote respectively the number of contents and the number of nodes in the network), the network is split into n equal areas. Each zone contains $n_1/n$ nodes that should host $nc_1/n$ contents.
In a second step, if $\frac{n_1 \cdot nc_1}{n^2} > Th$, then each area is divided into n equal areas. Another decomposition is applied if $\frac{n_1 \cdot nc_1}{n^4} > Th$, and so on. The number of decompositions needed to distribute the network load can easily be found by computing the minimum number i that satisfies the following relationship.

$\frac{n_1 \cdot nc_1}{n^{2i}} < Th$

The motivation for the decomposition of the network is to maintain a reasonable cost for content distribution within an unstructured area. However, this is possible only with a hierarchical structure in each zone (recursive division of the areas), which generates an additional cost for delivering messages to the next sub-area. For this reason, we propose to deploy a merging protocol if the query can be resolved directly in the zone itself. Since each node in the central region knows the number of nodes n and the number of contents nc within the zone, it can trigger the decomposition of the area only if $nc \cdot n > Th$. We also propose that the merging process be triggered only if $nc \cdot n < Th - H$ ($H > 0$), so that the splitting/merging of the network remains stable. In addition, each node in the central region maintains information regarding the cost of disseminating content within the zone. Thus, merging consists in delivering content to the next decomposed areas and disseminating the content in the current area. This intrinsic feature allows the passage from splitting to merging, locally, and vice versa without any additional cost.

3.5 Designation of an area
Each announcement/location message must ultimately be redirected towards the central region of the current area and must be resolved by one of the nodes located in this region. For this reason, we need a mechanism that designates the central region of each zone at a reasonable cost.
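The splitting rule and the split/merge stability margin described above can be sketched as follows. This is only an illustration under assumptions: the load of a zone is taken to be the product of its node count and content count, each split divides both counts by n, and merging is assumed to be intended when the load falls below Th - H (a hysteresis band). The function names `split_depth` and `next_action` are hypothetical.

```python
def split_depth(n1, nc1, n, th):
    """Smallest i such that n1 * nc1 / n**(2*i) < th.

    n1: nodes in the network, nc1: contents in the network,
    n: number of equal areas created by each split, th: load threshold.
    """
    if n < 2 or th <= 0:
        raise ValueError("need n >= 2 and th > 0")
    i = 0
    # Integer form of the inequality, to avoid floating-point rounding.
    while n1 * nc1 >= th * n ** (2 * i):
        i += 1
    return i


def next_action(n, nc, th, h):
    """Decision taken by a node of the central region for its zone.

    n: nodes in the zone, nc: contents in the zone,
    th: split threshold, h: stability margin (h > 0).
    """
    load = n * nc
    if load > th:
        return "split"
    if load < th - h:  # assumed merge condition
        return "merge"
    return "keep"      # inside the band: avoids oscillation


if __name__ == "__main__":
    print(split_depth(1000, 200, 4, 1000))
    print(next_action(10, 90, 1000, 200))
```

For example, with 1000 nodes, 200 contents, n = 4 and Th = 1000, the load per zone is 12500 after one split and about 781 after two, so two levels of decomposition are needed.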
-Election of “corner nodes” for a rectangular network: if prior knowledge of the positions of all nodes were possible, the area could be delimited simply by identifying all nodes that are on the perimeter of the network. But having full knowledge of the positions of all nodes in a mobile ad hoc environment is costly and even impossible. Therefore, a distributed algorithm for delimiting the area based on local geographical positions is developed in HCLFTP. The main idea is to use specific nodes called "corner nodes" to delimit the areas. We assume that all nodes are distributed in a rectangular region. For a rectangular network, if a node receives no messages from directions whose angle lies in the interval 90°±a, the node proclaims itself a "corner node". The advantage of this election algorithm is that it is based only on the positions of the node's direct neighbors. Since geographic routing relies on this information, the positions of direct neighbors are already available, and thus the cost of the messages generated for the election of "corner nodes" is very low. Following its election, a "corner node" informs the other "corner nodes" of the same area by sending them a "Corner Announcement" message. This message is sent in two geographic directions (east/south, west/south, east/north or west/north) according to the node's location. When a "corner node" receives a "Corner Announcement" message, it checks whether the message contains a new "corner node". If so, it updates its local list of "corner nodes" and sends to its neighboring "corner nodes" a new "Corner Announcement" message containing the list of all the "corner nodes" it knows, together with their positions.
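One possible reading of the local election test is sketched below. The paper does not spell out the geometry, so the following is an assumption: a node elects itself when all of its direct neighbors fit inside an angular sector of at most 90° + a, that is, when the largest empty angular gap around it is at least 270° - a. The names `largest_gap` and `is_corner_node` are hypothetical, and positions are plain (x, y) tuples.

```python
import math


def largest_gap(pos, neighbours):
    """Largest angular sector (in degrees) around pos that holds no neighbour."""
    bearings = sorted(
        math.degrees(math.atan2(y - pos[1], x - pos[0])) % 360.0
        for x, y in neighbours
    )
    if not bearings:
        return 360.0  # isolated node: nothing is heard from any direction
    gaps = [b2 - b1 for b1, b2 in zip(bearings, bearings[1:])]
    gaps.append(360.0 - bearings[-1] + bearings[0])  # wrap-around gap
    return max(gaps)


def is_corner_node(pos, neighbours, a_deg):
    """Assumed test: neighbours all lie within a sector of 90 + a degrees."""
    return largest_gap(pos, neighbours) >= 270.0 - a_deg


if __name__ == "__main__":
    corner = is_corner_node((0, 0), [(1, 0), (0, 1), (1, 1)], 40)
    edge = is_corner_node((5, 0), [(4, 0), (6, 0), (5, 1)], 40)
    print(corner, edge)
```

Under this reading, a larger tolerance a makes the test looser: with a = 100 the edge node above is elected as well, which matches the kind of confusion one would expect from a purely local test.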
After a stabilization period, all the “corner nodes” are identified, and their respective coordinates allow the position of the center of gravity of the area to be estimated. The election of "corner nodes" is periodic because of node mobility. Similarly, "Corner Announcement" messages are sent periodically to the neighboring "corner nodes". In HCLFTP, the central region of a zone is defined by its center of gravity. Each "corner node" has a local list of all the other "corner nodes". Thus, the position of the center of gravity PCG can be calculated without any further exchange of messages. This position is then propagated to all nodes in the area, as follows. Each "corner node" sends a "Gravity Announcement" message containing the coordinates of the center of gravity along the perimeter and along the diagonals. At each hop, the neighbor node is the only intended recipient of this message, but all nodes that overhear it are also informed of the coordinates of the center of gravity. However, the neighbor node is solely responsible for forwarding the announcement. Nodes located within a distance d ≤ dh from the center of gravity are considered to belong to the central region of the area; consequently, they are responsible for routing requests and advertisements. Recall that advertisement/location messages are sent by the provider/client in one of four geographic directions and are intercepted by one of the nodes that know the coordinates of the center of gravity. The messages are forwarded towards the center of gravity and resolved by the first node on the routing path that belongs to the central region of the zone. That node decides whether to forward the message to another level in the area. In addition, nodes in the central region are also responsible for splitting the area and updating the current level of decomposition h.
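The computation of the center of gravity PCG and the membership test for the central region can be sketched in a few lines. The paper does not give the estimator; taking the arithmetic mean of the corner node coordinates is an assumption, as are the function names `centre_of_gravity` and `in_central_region`. The radius `dh` corresponds to the bound d ≤ dh.

```python
import math


def centre_of_gravity(corners):
    """Estimate PCG as the mean of the known corner node positions."""
    if not corners:
        raise ValueError("at least one corner node is required")
    xs, ys = zip(*corners)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def in_central_region(pos, pcg, dh):
    """A node belongs to the central region if its distance to PCG is <= dh."""
    return math.dist(pos, pcg) <= dh


if __name__ == "__main__":
    corners = [(0, 0), (1000, 0), (1000, 1000), (0, 1000)]
    pcg = centre_of_gravity(corners)
    print(pcg, in_central_region((520, 510), pcg, 50))
```

Since every corner node holds the same list of corners after stabilization, each of them obtains the same estimate locally, which is why no further message exchange is needed before the announcement is propagated.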
To support splitting, each node in the central region periodically sends a message that contains the number of content advertisements received. Although the number of contents within a zone can easily be found, estimating the number of nodes within the zone is not as simple. The decomposition decision can instead rely on an estimate of the cost of locating contents, which itself depends on the number of nodes.

4. Simulation of HCLFTP
We conducted a series of simulations in the Network Simulator “NS”, considering two metrics: the number of perimeter nodes elected and the number of confusions. By the number of confusions we mean the number of nodes elected as perimeter nodes that are not actually located on the perimeter. We assumed in the simulations that each content is 512 bytes in size. These metrics were measured for different values of the angle a. Each result presented is an average of the results obtained on three different topologies. Nodes are deployed randomly in an area of 1000 m x 1000 m. Each simulation lasts 10 s. Table 1 shows the results for different network sizes. We found that the number of nodes elected as perimeter nodes exceeds the number of nodes actually on the perimeter. The confusion increases with the angle a. Table 2 shows the change in the ratio between the actual perimeter nodes (real PN) and the elected perimeter nodes (elected PN) as a function of the angle a.

Table 1. Number of perimeter nodes according to a

Number of nodes     Number of perimeter nodes elected
in the network      a = 40°    a = 60°    a = 80°    a = 100°
100                 20         25         25         30
200                 25         30         35         60
300                 30         70         85         120
400                 60         90         95         150
500                 70         110        120        180
600                 100        125        140        220
700                 110        140        156        245
800                 130        160        180        300
900                 155        180        200        380
1000                -          -          260        425

Table 2. The ratio real PN/elected PN according to a

Number of nodes     real PN/elected PN
in the network      a = 40°    a = 60°    a = 80°    a = 100°
100                 0.90       0.80       0.71       0.60
200                 0.85       0.60       0.65       0.50
300                 0.65       0.55       0.45       0.36

Furthermore, we found that the answer to a query for the same content is not always provided by the same node but by one of the nodes in the central region. In addition, the mobility or failure of one of the nodes in the central region does not make the content inaccessible, because several nodes in the central region are able to fulfill the request. However, the response time for the same request is not constant: it depends on the current traffic in the network and on the location of the node responding to the request.

5. Conclusion and future work
The sources of information are now spread across networks. However, access to these sources poses challenges for the users and applications that need them. Although several solutions have been proposed for accessing these sources, they still lack support for the dynamism that is essential in mobile environments such as mobile ad hoc networks. Service discovery protocols should provide autonomous management of mobility, quality of service and fault tolerance. In this paper, we studied the features of some localization protocols in dynamic environments. The comparison of these protocols aimed to deduce the ideal mechanism for the discovery of data. From these different approaches, we presented a new solution for the discovery of services deployed on mobile sources. The HCLFTP protocol locates services in a central region rather than at a single node in mobile networks. It is based on a dynamic hash table. The prospects of this work are the evaluation of this protocol on a real platform and the exploration of the various proposals currently available for describing data sources.
References
[1] A. Oram, “Gnutella”, chapter 8 in Peer-to-Peer: Harnessing the Power of Disruptive Technologies, pages 94–122, O’Reilly, May 2001.
[2] I. Clarke, O. Sandberg, B. Wiley, T.W. Hong, “Freenet: a distributed anonymous information storage and retrieval system”, Designing Privacy Enhancing Technologies, International Workshop on Design Issues in Anonymity and Unobservability, Lecture Notes in Computer Science, pp 46-66, Berkeley, USA, Springer, July 2000.
[3] G. Zussman, A. Segall, “Energy efficient routing in ad hoc disaster recovery networks”, in Proceedings of IEEE INFOCOM, San Francisco, USA, 2003.
[4] C. Bettstetter, C. Renner, “A comparison of service discovery protocols and implementation of the Service Location Protocol”, in Proceedings of EUNICE 2000, Twente, Netherlands, September 2000.
[5] S. Cheshire, “DNS-based Service Discovery”, Internet-Draft, December 2002.
[6] J. Govea, M. Barbeau, “Results of comparing bandwidth usage and latency: Service Location Protocol and Jini”, Workshop on Ad Hoc Communications, Bonn, Germany, September 2001.
[7] A. Rao, C. Papadimitriou, S. Shenker, I. Stoica, “Geographic routing without location information”, in Proceedings of the 9th Annual International Conference on Mobile Computing and Networking, ACM Press, pp 96-108, 2003.
[8] E. Cohen, S. Shenker, “Replication strategies in unstructured peer-to-peer networks”, in ACM SIGCOMM Conference, August 2002.
[9] T. Hara, Y. Loh, S. Nishio, “Data replication methods based on the stability of radio links in ad hoc networks”, in 14th International Workshop on Database and Expert Systems Applications (DEXA’03), September 2003.
[10] T. Hara, “Effective replica allocation in ad hoc networks for improving data accessibility”, in Proceedings of IEEE INFOCOM 2001, pp 1568-1576, April 2001.
[11] A. Datta, M.
Hauswirth, K. Aberer, “Updates in highly unreliable, replicated peer-to-peer systems”, in 23rd International Conference on Distributed Computing Systems (ICDCS), May 2003.

Author Profile
Maher Ben Jemaa received the Engineering degree from the National School of Computer Science in Tunisia in 1989, the DEA degree in Computer Science from the University of Nice, France, in 1989 and the PhD in Computer Science from INSA of Rennes, France, in 1993. He is currently an Associate Professor in Computer Science at the National School of Engineers of Sfax. He carries out his research in the ReDCAD research unit (www.redcad.org). His research topics concern mobile communication, routing and fault tolerance in wireless networks.

(IJCNS) International Journal of Computer and Network Security, Vol. 1, No. 2, November 2009

Wavelet based Watermarking Technique using Simple Preprocessing Methods

S.MaruthuPerumal1, B.Vijaya Kumar2, L.Sumalatha3 and Dr. V.Vijaya Kumar4
1 Research Scholar, Dr MGR University, Chennai, T.N. India. Associate Professor & Head, Department of IT, Godavari Institute of Engineering & Technology, Rajahmundry, A.P. India. maruthumail@gmail.com
2 Professor & Head, Department of CSE, Lords Institute of Engineering & Technology, Hyderabad, A.P. India. vijaysree.b@gmail.com
3 Associate Professor & Head, Department of CSE, University college of Engineering, JNTU Kakinada, A.P. India. sumapriyatham@yahoo.com
4 Dean & Professor, Department of CSE&IT, Godavari Institute of Engineering & Technology, Rajahmundry, A.P. India. vakulabharanam@hotmail.com

Abstract: The sudden increase in internet applications has led people into the digital world. Digital watermarking facilitates efficient distribution, reproduction and manipulation over networked information systems for images, audio clips, and videos. To address this, the present paper proposes a digital image watermarking technique based on various preprocessing methods.
The watermark is inserted at selected pixels, chosen by preprocessing methods applied to an L-level wavelet transformed image. The level L is chosen based on the size of the watermark and of the window. To test the robustness of the proposed method, various peak signal-to-noise ratios are evaluated. The experimental results indicate the imperceptibility, security, unambiguity and robustness of the present method.

Keywords: Wavelet Transformation, Preprocessing, Peak Signal Noise Ratio.

1. Introduction
The great advances in the field of the Internet have facilitated the transmission, wide distribution, and access of multimedia data in an effortless manner. The use of digitally formatted image and video information is rapidly increasing along with the development of multimedia broadcasting, network databases and electronic publishing [3, 4, 5, 6, 19]. All these developments come with a serious drawback: if the media data is copyrighted, unlimited copying may cause considerable financial loss. The protection of intellectual property rights has therefore become an important issue in the network-centric world. One effective solution to the unauthorized distribution problem is the embedding of digital watermarks into multimedia data [10]. New progress in digital technologies, such as compression techniques, has brought new challenges to watermarking. Various watermarking schemes employing different techniques have been proposed over the last few years [1, 7, 9, 10, 13-19]. To be effective, a watermark must be imperceptible within its host, easily extracted by the owner, and robust to intentional and unintentional distortions [2]. In particular, the DWT has wide applications in the area of image authentication, because it has many properties that can make the watermarking process robust. In recent times, wavelet based digital watermarking has become a very active research area.
Watermarking approaches are classified into two categories: spatial domain and transform domain methods. Transform domain watermarking techniques are more robust than spatial domain methods. Among the transform domain techniques, DWT based watermarking techniques are gaining popularity because of their superior modeling of the Human Visual System [2]. To achieve copyright protection, a watermarking scheme for digital images must have the following properties:

(1) Imperceptibility or low degree of obtrusiveness: it should be extremely difficult to distinguish between the host image and the watermarked image. The quality of the image should not be compromised.
(2) Security: a watermark should be statistically undetectable. The watermarking algorithm must be public, with security depending only on keeping the key secret [11, 12, 15]. Only the owner of the host image should be able to extract or remove the embedded watermark.
(3) Fast embedding/retrieval: the speed of a watermark embedding algorithm is important for applications where documents are marked ‘on the fly’.
(4) No reference to original document: for some applications, it is necessary to recover the watermark without requiring the original, unmarked document (which would otherwise be stored in a secure archive).
(5) Multiple watermarks: it may also be desirable to embed multiple watermarks in a document. For example, an image might be marked with a unique watermark each time it is downloaded [8].
(6) Robustness: when the quality of the host image is degraded by attacks such as blurring, sharpening, scaling, cropping, noising, or JPEG compression, it should still be possible to retrieve and identify the embedded watermark. The watermark must be retrievable if common image processing or geometric distortions are performed.
(7) Unambiguity: the retrieved watermark should clearly verify the copyright owner of the image.
In addition, ideal watermarking schemes should also be able to solve the problem of multiple claims of ownership.

The rest of this paper is organized as follows. In Section 2, the wavelet transformation of images is discussed in detail. The proposed method, along with various preprocessing methods, is explained in Section 3. In Section 4, the performance of the proposed method is analyzed. Finally, Section 5 presents the conclusions.

2. Wavelet Transformation of Images

The wavelet transformation is a mathematical tool for decomposition. The wavelet transform is identical to a hierarchical sub band system, where the sub bands are logarithmically spaced in frequency. The basic idea of the DWT for a two-dimensional image is as follows. An image is first decomposed into four parts based on frequency sub bands, by critically sub sampling the horizontal and vertical channels using sub band filters. The four parts are named the Low-Low (LL), Low-High (LH), High-Low (HL), and High-High (HH) sub bands, as shown in figure 1. To obtain the next coarser scale of wavelet coefficients, the sub band LL is further decomposed and critically sub sampled. This process is repeated several times, the number of repetitions being determined by the application at hand. Each level carries the information of the low-low, low-high, high-low, and high-high frequency bands. Furthermore, the original image can be reconstructed from these DWT coefficients. This reconstruction process is called the inverse DWT (IDWT). If C[m,n] represents an image, the DWT and IDWT for C[m,n] can be defined by applying the one-dimensional DWT and IDWT on each dimension separately.

Figure 1. Representation of L-Levels of DW Transformation

3. Methodology

To carry out the proposed method, the compressed image is divided into non overlapping blocks of size m x m (indices 0…m-1 x 0…m-1), where m is an integer.
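The decomposition of Section 2 and the block division above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the paper does not name the wavelet, so the Haar filter is assumed here; the function names (haar_dwt2, haar_idwt2, decompose, split_blocks) are hypothetical; and the LH/HL naming follows one common convention (horizontal pass first).

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_dwt2(image):
    """One level of a 2-D Haar DWT (assumed wavelet). Returns (LL, LH, HL, HH)."""
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image dimensions must be even")
    # Horizontal pass: low-pass (sum) and high-pass (difference) of column pairs.
    low = [[(r[c] + r[c + 1]) / SQRT2 for c in range(0, cols, 2)] for r in image]
    high = [[(r[c] - r[c + 1]) / SQRT2 for c in range(0, cols, 2)] for r in image]

    def vertical(band):
        # Vertical pass: the same filters applied to row pairs.
        width = len(band[0])
        s = [[(band[r][c] + band[r + 1][c]) / SQRT2 for c in range(width)]
             for r in range(0, rows, 2)]
        d = [[(band[r][c] - band[r + 1][c]) / SQRT2 for c in range(width)]
             for r in range(0, rows, 2)]
        return s, d

    ll, lh = vertical(low)
    hl, hh = vertical(high)
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2 (the IDWT for one level)."""
    def unvertical(s, d):
        out = []
        for s_row, d_row in zip(s, d):
            out.append([(a + b) / SQRT2 for a, b in zip(s_row, d_row)])
            out.append([(a - b) / SQRT2 for a, b in zip(s_row, d_row)])
        return out

    low, high = unvertical(ll, lh), unvertical(hl, hh)
    image = []
    for lo_row, hi_row in zip(low, high):
        row = []
        for a, b in zip(lo_row, hi_row):
            row.extend([(a + b) / SQRT2, (a - b) / SQRT2])
        image.append(row)
    return image

def decompose(image, levels):
    """Repeatedly decompose the LL sub band; returns the final LL and the details."""
    ll, details = image, []
    for _ in range(levels):
        ll, lh, hl, hh = haar_dwt2(ll)
        details.append((lh, hl, hh))
    return ll, details

def split_blocks(band, m):
    """Divide a sub band into non overlapping m x m blocks, in raster order."""
    rows, cols = len(band), len(band[0])
    return [[row[c:c + m] for row in band[r:r + m]]
            for r in range(0, rows - m + 1, m)
            for c in range(0, cols - m + 1, m)]
```

For a 64 x 64 cover image, decompose(image, 3) leaves an 8 x 8 LL sub band, which split_blocks(ll, 2) divides into sixteen 2 x 2 windows.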
A preprocessing method is applied on the selected window, and the hit pixel is decided from its result. The hit pixel is the pixel where the watermark will be inserted. This process is applied on the L-level wavelet transformed image. The level L is chosen by the principle that the wavelet image should contain at least double the number of pixels required by the watermark text. The entire process is explained with the help of the flowchart given in figure 2. Based on the flowchart, a block diagram for the lena image is given in figure 3. The block diagram of figure 3 shows the process of inserting the watermark text in the lena image after three levels of wavelet transform on the LL sub image. The watermark can be inserted on any of the LL, LH, HL or HH sub bands. The same process can be applied with any wavelet transform.

Figure 2. Flowchart for the proposed scheme. The flowchart boxes, in order, are:
1. START: Cover Image (C); Watermark (X), X characters.
2. Total number of bits of watermark W = 2*X*b; C = N*M*b; COUNT = 1.
3. IF C > W && COUNT <= 4: on Y, compress the image by using DWT and ++COUNT; on N, continue.
4. Divide the compressed image into non overlapping equal blocks.
5. Select the hit pixel in which the watermark is to be inserted, based on preprocessing methods.
6. Insert watermark. STOP.

Figure 3. Block diagram of watermark insertion on wavelet

3.1 Preprocessing Methods

For the selection of the hit pixel, various preprocessing methods are applied. The applied preprocessing methods are useful for smoothing and for reducing noise, contrast and intensity. The preprocessing methods depend on image characteristics in a predefined region about each pixel in the image.
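As a rough illustration of how a window statistic could drive hit-pixel selection, the sketch below computes summary statistics over one window and picks a pixel. Several points are assumptions rather than the authors' stated method: the names window_statistics and hit_pixel are hypothetical, the sample grey levels are illustrative, ties for the mode are resolved by taking the smallest value, and the rule "hit pixel = first pixel whose value is closest to the statistic" is a guess, since the paper does not spell out the selection rule.

```python
import math
from collections import Counter

def window_statistics(window):
    """Mean, median, mode, variance and standard deviation of one window."""
    pixels = [p for row in window for p in row]
    z = len(pixels)                       # number of pixels in the window
    mean = sum(pixels) / z
    ordered = sorted(pixels)              # ascending order
    median = ordered[z // 2]              # middle value (upper middle if z is even)
    counts = Counter(pixels)
    top = max(counts.values())
    mode = min(v for v, n in counts.items() if n == top)   # assumed tie rule
    # Population variance: mean of squares minus square of the mean.
    variance = sum(p * p for p in pixels) / z - mean ** 2
    return {
        "mean": int(mean),
        "median": median,
        "mode": mode,
        "variance": int(variance),
        "sd": math.sqrt(variance),
    }

def hit_pixel(window, target):
    """Assumed rule: position (i, j) of the first pixel closest to the target value."""
    best = None
    for i, row in enumerate(window):
        for j, p in enumerate(row):
            distance = abs(p - target)
            if best is None or distance < best[0]:
                best = (distance, i, j)
    return best[1], best[2]

# Illustrative 3 x 3 window of grey levels.
window = [[79, 86, 74],
          [74, 75, 82],
          [76, 75, 79]]
stats = window_statistics(window)
position = hit_pixel(window, stats["mean"])
```

Truncating with int mirrors the integer-valued mean and variance used for selection; the standard deviation is left as a real number.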
The preprocessing methods used in the present paper are the mean, median, mode, variance and standard deviation (SD), as shown in equations 1 to 5 respectively:

$$\mathrm{Mean} = \operatorname{int}\left( \frac{\sum_{i=0}^{z-1} \sum_{j=0}^{z-1} P(i,j)}{z} \right) \qquad (1)$$

$$\mathrm{Median} = \operatorname{middlevalue}\left( \operatorname{ASC}_{i=0,\; j=0}^{z-1} P(i,j) \right) \qquad (2)$$

$$\mathrm{Mode} = \operatorname{modevalue}_{i=0,\; j=0}^{z-1} P(i,j) \qquad (3)$$

$$\mathrm{Variance} = \operatorname{int}\left( \frac{\sum_{i=0}^{z-1} \sum_{j=0}^{z-1} \left( P(i,j) \right)^{2}}{z} - \left( \frac{\sum_{i=0}^{z-1} \sum_{j=0}^{z-1} P(i,j)}{z} \right)^{2} \right) \qquad (4)$$

$$\mathrm{SD} = \left( \frac{\sum_{i=0}^{z-1} \sum_{j=0}^{z-1} \left( P(i,j) \right)^{2}}{z} - \left( \frac{\sum_{i=0}^{z-1} \sum_{j=0}^{z-1} P(i,j)}{z} \right)^{2} \right)^{1/2} \qquad (5)$$

where P(i,j) represents the gray level value at location (i,j) of the window, ASC denotes sorting in ascending order, and z is the number of pixels in the block. Figure 4 shows a grey level image of size 6 x 6, and figure 5 shows the hit pixels of figure 4, which are marked with circles based on the mean preprocessing method.

79 86 74 96 81 76
74 75 82 86 84 82
76 75 79 84 82 79
76 79 81 83 80 76
78 77 74 72 70 74
82 80 76 79 78 80

Figure 4. Grey level values of an image of 6 x 6

Figure 5. Hit pixels of the original image of figure 4

4. Experimental Result and Analysis

For the experimental analysis, different images of size 64x64 are selected and the proposed method is applied. The cover images considered in the present paper are the brain image, lena image, barbara image, camera man image, and baboon image, which are shown in figures 6(a) to 6(e) respectively. Figures 7(a) to 7(e) show the 3-level wavelet compressed images. Figures 8(a) to 8(e) show the wavelet decomposed images with the watermark text “MGRU” embedded. Figures 9(a) to 9(e) show the reconstructed watermarked images. To measure the quality of the watermarked images, the peak signal-to-noise ratio (PSNR) is used.
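The PSNR named above is commonly computed as sketched below. This is a minimal illustration, assuming 8-bit grey levels (peak value 255) and images stored as equally sized nested lists; the function name psnr and the choice to return infinity for identical images are assumptions, not part of the paper.

```python
import math

def psnr(cover, watermarked, peak=255.0):
    """PSNR in dB between a cover image and its watermarked version."""
    rows, cols = len(cover), len(cover[0])
    # Sum of squared pixel differences over the whole image.
    squared_error = sum(
        (cover[i][j] - watermarked[i][j]) ** 2
        for i in range(rows)
        for j in range(cols)
    )
    if squared_error == 0:
        return math.inf          # identical images: no distortion at all
    return 10.0 * math.log10(peak ** 2 * rows * cols / squared_error)

# Illustrative use: every pixel changed by one grey level.
cover = [[100] * 4 for _ in range(4)]
marked = [[101] * 4 for _ in range(4)]
value = psnr(cover, marked)
```

A uniform change of one grey level gives a mean squared error of 1, so the value is 10 log10(255^2), roughly 48.13 dB; smaller embedding changes push the PSNR higher.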
The PSNR is given in equation (6):

$$\mathrm{PSNR}(C, W') = 10 \log_{10} \left[ \frac{255^{2} \times M \times N}{\sum_{i=1}^{M} \sum_{j=1}^{N} \left( f(x_i, y_j) - f'(x_i, y_j) \right)^{2}} \right] \qquad (6)$$

where C is the cover image and W' is the watermarked image, both with dimensions N x M, and f and f' are their respective pixel values. The PSNR is computed between the cover images of figures 6(a) to 6(e) and the watermarked images of figures 9(a) to 9(e), and the results are tabulated in table 1. Table 1 lists the PSNR values for all the proposed preprocessing methods. From table 1 it is evident that all the proposed preprocessing methods give values above 50 dB, which indicates high robustness.

Table 1: Five different reconstructed images expressed in PSNR (dB) for different methods

Preprocessing method   Brain Image   Lena Image   Barbara Image   Camera Man Image   Baboon Image
Mean                   54.15         53.88        56.19           53.4               52.76
Median                 53.18         53.40        55.05           53.88              52.97
Mode                   53.88         53.40        54.15           53.88              55.05
Variance               55.40         54.43        53.88           53.40              55.40
Standard Deviation     53.63         54.43        53.88           53.88              54.73

Figure 6. The cover images a) Brain Image b) Lena Image c) Barbara Image d) Cameraman Image e) Baboon Image

Figure 7. Compressed cover image a) Brain Image b) Lena Image c) Barbara Image d) Cameraman Image e) Baboon Image

Figure 8. Compressed watermarked image a) Brain Image b) Lena Image c) Barbara Image d) Cameraman Image e) Baboon Image

Figure 9. Reconstructed watermarked image a) Brain Image b) Lena Image c) Barbara Image d) Cameraman Image e) Baboon Image

The reconstructed watermarked images of figures 9(a) to 9(e) clearly indicate the clarity, imperceptibility and robustness of the image when compared to figures 6(a) to 6(e).

5. Conclusion

The PSNR values clearly indicate the high robustness of the proposed method. The proposed preprocessing techniques can be extended to any window size, and the watermark content may also be increased from a minimum of two characters to any length, depending on the size of the image. The advantage of preprocessing methods for selecting the hit pixel over other methods on the wavelet image is that they maintain the important characteristics of the image without any loss of image content or of the information in the selected region.

Acknowledgement

The authors would like to express their gratitude to Sri K.V.V. Satyanarayana Raju, Chairman, and Sri K. Sasi Kiran Varma, Managing Director, Chaitanya group of Institutions, for providing the necessary infrastructure. The authors would like to thank Dr MGR University, Chennai, for the suggestions and guidelines given, and the anonymous reviewers for their valuable comments.

References

[1] Aboofazeli. M, G. Thomas and Z. Moussavi, “A wavelet transform based digital image watermarking scheme,” in Proc. IEEE CCECE, vol. 2, pp. 823–826, May 2004.
[2] Adhipathi Reddy A, B.N. Chatterji, “A new wavelet based logo-watermarking scheme”, Pattern Recognition Letters 26 (2005) 1019–1027.
[3] Andreja Samcovic, Jan Turan, “Attacks on Digital Image Wavelet Image Watermarks”, Journal of Electrical Engineering, Vol 59, No. 3, 2008, pp. 131–138.
[4] Bojkovic Z et al., “Multimedia Contents Security: Watermarking Diversity and Secure Protocols”, Serbia and Montenegro, Oct 1-3, 2003.
[5] Christine I Podilchuk, Edward J Delp, “Digital Watermarking Algorithms and Applications”, IEEE Signal Processing Magazine, 2001.
[6] Gwo-Chin Tai and Long Wen Chang, “A Novel Public Digital Watermarking for Still Images based on Encryption Algorithm”, IEEE, 2003.
[7] Guzman V. H, M. N. Miyatake, and H. M. P.
Meana, “Analysis of a wavelet-based watermarking algorithm,” in Proc. IEEE CONIELECOMP, pp. 283-287, 2004.
[8] Jonathan K Su, Frank Hartung and Bernd Girod, “Digital Watermarking of Text, Image and Video Documents”, Computers & Graphics, Vol. 22, No. 6, pp. 687-695, 1998.
[9] Kundur. D and Hatzinakos. D, “Digital watermarking using multiresolution wavelet decomposition,” in Proc. IEEE ICASSP, vol. 5, pp. 2969–2972, May 1998.
[10] Liang-Hua Chen and Jyh-Jiun Lin, “Mean quantization based image watermarking”, Image and Vision Computing 21 (2003) 717–727.
[11] Juan R. Hernandez Martin, Lysis SA et al., “Information Retrieval in Digital Watermarking”, IEEE Communication Magazine, 2001.
[12] LIU Tong, QIU Zheng-ding, “The Survey of Digital Watermarking based Image Authentication Techniques”, ICSP 2002 Proceedings.
[13] Meerwald. P and A. Uhl, “A survey of wavelet-domain watermarking algorithms,” in Proc. SPIE, vol. 4314, pp. 505-516, 2001.
[14] Mong-Shu. L, “Image compression and watermarking by wavelet localization,” Intern. J. Computer Math., vol. 80(4), pp. 401-412, 2003.
[15] Wang S-H. and Lin Y-P, “Wavelet tree quantization for copyright protection watermarking,” IEEE Transactions on Image Processing, vol. 13, pp. 154–165, Feb. 2004.
[16] Wang. Y, J. F. Doherty, and R. E. Van Dyck, “A wavelet-based watermarking algorithm for ownership verification of digital images,” IEEE Trans. Image Processing, vol. 11, pp. 77-88, 2002.
[17] Xia-mu Niu and Sheng-he Sun, “Adaptive Gray-Level Digital Watermark”, Proceedings of ICSP 2000.
[18] Yuk Ying CHUNG and Man To WONG, “Implementation of Digital Watermarking System”, IEEE, 2003.
[19] Zhe Ming Lu, Chun He Liu, et al., “Image Retrieval and content Integrity Verification Based on Multipurpose Image Watermarking Scheme”, International Journal of Innovative Computing, Information and Control, Vol. 3, Number 3, June 2007.

Authors Profile

S.MaruthuPerumal received his M.E.
in Computer Science and Engineering from Sathyabama University, Chennai, in 2005. He has eleven years of teaching experience. At present he is working as an Associate Professor and Head of the Department of IT, Godavari Institute of Engineering and Technology, Rajahmundry. He is pursuing his Ph.D. at Dr MGR University, Chennai, under the guidance of Dr V. Vijaya Kumar. His research interests include image processing, digital watermarking, steganography and security. He is a life member of ISCA and IAENG.

B Vijaya Kumar completed his M.S. in CSE from DPI, Donetsk, USSR in 1993. He worked as a Software Engineer at Serveen Software Systems Pvt. Ltd., Secunderabad, India, for four years (1993-1997). After that he worked as Sr. Assistant Professor at JBIET, Hyderabad, for three years, and later joined the Royal Institute of Technology & Science, Hyderabad, as Associate Professor, where he worked for four years. Presently he is working as Professor & Head of the CSE Department at Lords Institute of Engineering & Technology, Hyderabad, India. He is pursuing his Ph.D. in Computer Science under the guidance of Dr Vakulabharanam Vijaya Kumar. He is a life member of CSI, ISTE, NESA and ISCA. He has published more than 10 research publications in various national and international conferences, proceedings and journals.

L. Sumalatha completed her B.Tech from Acharya Nagarjuna University and her M.Tech in CSE from JNT University, Hyderabad. She is working as Head of the Department of CSE, College of Engineering, JNT University Kakinada. She has nine years of teaching experience. She is pursuing her Ph.D. at JNT University Kakinada. Her research areas include network security, digital imaging and digital watermarking.

Vakulabharanam Vijaya Kumar received the integrated M.S. Engg. degree from Tashkent Polytechnic Institute (USSR) in 1989. He received his Ph.D. degree in Computer Science from Jawaharlal Nehru Technological University (JNTU) in 1998.
He served JNT University for 13 years as Assistant Professor and Associate Professor and taught courses for M.Tech students. He has been Dean of the Department of CSE and IT at Godavari Institute of Engineering and Technology since April 2007. His research interests include image processing, pattern recognition, network security, steganography, digital watermarking, and image retrieval. He is a life member of CSI, ISTE, IE, IRS, ACS and CS. He has published more than 120 research publications in various national and international conferences, proceedings and journals.

An Assimilated Approach for Statistical Genome Streak Assay between Matriclinous Datasets

Hassan Mathkour1, Muneer Ahmad2 and Hassan Mehmood khan3

King Saud University, Department of Computer Science, College of Computer and Information Sciences, P.O. Box 51178, Riyadh 11543, Saudi Arabia
1 binmathkour@yahoo.com
2 muneerahmadmalik@yahoo.com
3 hasmkh@gmail.com

Abstract: Genome Streak Assay for matriclinous datasets by using ORF (Open Reading Frame) artistries has recently become a titillating area of inquest for bioinformatics inquisitors. There is a strong inquest focus on metaphorical assay between matriclinous behaviors and the multeity of peculiar species. Antagonistic to choate genome streak assay, scientists are now trying to contemplate peculiarly ensconced assay to get a better peculiarly of pertinency among matriclinous datasets. This marvel will help to better understand species. We adduce an ORF statistical assay for matriclinous datasets of the species Chimaera Monstrosa and Poly Odontidae. For completion of this assay, we use a mongrel approach that combines a generic contrivance for statistical assay with a specific approach designed for out performance. At the first exemplification, matriclinous datasets are rarefied for better usage at the next level.
These sets are then passed through ensconces of filters that perform DNA to protein translation. Statistical correlation is performed during this translation. This ensconced architecture helps in better understanding the tenor of affinity and aberrations in genomic streaks.

Keywords: Open Reading Frame, codon count, amino acid, preprocessing filter, Nucleotide

1. Introduction

Because of the existing and continuously growing bulk of biological data coming from genome projects and experiments nowadays, protein structure prediction and its systematic translation need an efficient and effective way to streak, analyze and compare coded biological DNA streak information. The genome streak assay is directly related to streak correlation and alignment. Streak affinity is a way to predict the functional affinity among genes and has been used as a tool for functional prediction. Assay and correlation of DNA streaks and genes are useful for finding how these genes are organized and what their similarities and aberrations are [1]. These fundamental problems are NP hard [14, 17] and need optimal solutions, which can be achieved by improving algorithms and computing architecture [2]. Little work has been done on mongrel statistical assay of genomic data against an exponentially increasing problem size. The usage of computer aided artistries alone is not the solution. There is a need to work on computational molecular biological experiments by means of DNA streak assay. Finding a unique streak in the entire target genome is one of the most important problems in molecular biology [3]. The overall goal of this paper is to adduce an assimilated approach that performs metaphorical assay between species of the same class, revealing that peptide translation in both has a tenor of aberrations. This task is accomplished by using ORF with statistical assay. The method used for this purpose is a composite artistry that consists of a series of filters from the preprocessing level to the final assay.
The human genome project has built rich databases, which have attracted inquest titillates from biologists and computer scientists to explore and mine these precious datasets. Computer aided applications can now reveal the hidden information in the complex helix DNA structure. They have also made it possible to perform fast and accurate assay. This has been made effective by the availability of cost effective and handy assay tools. Scientists have developed novel ideas and resolved complex situations in computational biology whose direct feasible solution was not possible, yielding optimal solutions in some cases for streak assay, an NP hard problem [5, 9, 14, 17].

This paper is organized as follows. Section 2 highlights some related work. Section 3 describes the proposed artistry (elaborated in subsections). Section 4 contains fundamental concluding remarks for this metaphorical assay. Section 5 re-adduces an acknowledgement and section 6 contains the references.

2. Literature review

Rajita Kumar [17] gives an approach for a distributed bioinformatics computing system. It was designed for disease detection, criminal forensics and protein assay. It is a combination of peculiar distributed algorithms that are used to search for and identify a triplet repeat pattern in a DNA streak. It consists of a search algorithm that computes the number of occurrences of a given pattern in a matriclinous streak. The distributed sub-streak identification algorithm detects repeating patterns, with sequential and distributed implementations of the algorithms relevant to peculiar triplet repeat search patterns and matriclinous streaks. The results of this system show that as the complexity of the algorithm increases, the response time also increases. There is room to improve this work for more DNA streaks of various lengths.
Ken-ichi Kurata [9] adduces an artistry to find unique genome streaks from distributed environment databases. Ken-ichi implemented the method upon the European Data Grid and showed its results. The author worked on the unique streaks of E. coli O157 (12 genome). The genome is divided into smaller pieces that are processed individually. In an example quoted by the author, the total file size is 256 MB when it is hashed to 7. It is possible to divide the genomic files into at most 4^7 = 16384 pieces of 15 KB each. This method results in memory consumption and increases the file size. This data grid method is not useful for parallelizing biologically important data.

Ao Li [16] proposes a genome streak learning method that simplifies a Bayesian network. The nodes in the Bayesian network are selected as features. A feature selection algorithm is used for structure learning. This algorithm is based on a matriclinous algorithm. The researcher used a dataset of 570 vertebrate streaks, including 2079 true donor sites. This approach is limited to donor site prediction and also confirms that the nucleotides closer to the donor site are the key elements in gene expression. There is a need to improve the structure learning method, the valuable features and the assay.

DNA chips [7] have a main role in disease diagnosis, drug discovery and gene identification. Elaine Garbarine [7] used an approach to detect unique gene regions of particular species. This artistry, named the information theoretic method, exploits genome vocabularies to distinguish between pathogens. This approach is useful only for finding the gene streaks and the most distinguished similarities between two organisms. Oligo probes were used to distinguish between two genes. Experiments were conducted on data from the Sanger Institute. Currently 32 out of 92 bacterial pathogen sequencing projects are completed. The author selected a pair of genomes to test the algorithm.
Results were shown for a 12-mer and a 25-mer Oligo pathogen probe set and confirmed that the Elaine Garbarine method is less likely to cross-mongrelize.

José Lousadop [12] developed a software application for large-scale assay of codon-triplet associations to shed new light on this problem. The algorithm describes codon-triplet context biases, codon-triplet assay and the identification of alterations to the standard matriclinous code. The method adduces an evolutionary understanding of codons within open reading frames (ORF). Gene-Split [8] is an application that shows codon triplet patterns in genomes and complete sets of ORFs. In general, this application gives the opportunity to study the characteristics of codon and amino acid triplets in any genome for the extraction of hidden patterns.

Hua Zheng et al. [13] adduce an artistry that assimilates the low pass filter and the wavelet de-noising method. Conventional artistries use the low pass filter with cheap hardware, resulting in degraded de-noising quality. By properly choosing the cut-off frequency and the wavelet de-noising frequency, the signal to noise ratio can be enhanced so that the processed signals meet the requirement of single base pair resolution in DNA sequencing, and the vector of the target signal can be decomposed into an orthogonal matrix of wavelet functions. This is an iterative method with n levels, and the signal can be conventionally reconstructed by the inverse DWT.

Binwei Weng et al. [14] apply the wavelet transform to extract features from the original measurements. They divide the data into successive partitions by a hierarchical clustering method. The terahertz spectroscopy of peculiar DNA samples shows that wavelet domain assay aids the clustering process. The authors clustered six DNA samples into two groups. The data was cleansed before processing, and the wavelet function used was the Haar wavelet. The signal trend is separated from the original records.
The size of a cluster may be calculated as the maximum distance between two points within the cluster. Another preprocessing step is balancing the data, which can achieve normalization of the data.

Bilu et al. [15] propose an alignment algorithm for the NP hard alignment problem of streaks. The authors outperform an alignment procedure by sufficing optimal alignment of predefined streak segments. They contemplate the choate streak rather than letters and estimate the running time by restricting the search space of the dynamic programming algorithm. The authors take aid from the observation that the encoding streaks used in NP hard problems are not necessarily depictions of protein and DNA streaks. Time expedition is calculated by taking advantage of the biological nature of streaks, antagonistic to traditional approaches that offer good computation leading to optimal alignment; more stress is given to the structure of the input streaks.

Tuqan and Rushdi [6] propose an approach for finding the complete periodicity in DNA streaks. The approach is spliced into three channels: firstly, they explain the underlying contrivance for period 3 components; secondly, they directly relate the identification of these components to finding nucleotide bias in the codon spectrum; thirdly, they completely characterize the DNA spectrum by a set of numerical streaks. The authors relate the signal processing problem to the genomic one through their proposed multirate DSP model. The model identifies the essential components involved in the codon bias, maintaining the dual nature of the problem. This marvel can further help in understanding the biological significance of codon bias. The period 3 component detection works for one kind of genes and may not be suitable for all matriclinous datasets.

Ma Chan et al. [4] have shown the functionality of popular clustering algorithms for the assay of microarray data and concluded that the performance of these algorithms can be further increased.
The authors also propose an evolutionary algorithm for microarray data assay in which there is no need to calculate the number of clusters in advance. The algorithm was tested with simulated and peculiar datasets. Noise and missing values are a big issue in this regard. The marvel is depicted by encoding the entire cluster grouping in a chromosome, so that each gene encodes one cluster and each cluster contains the labels of the data used in it. Cross over and mutations are performed suitably. The proposed algorithm has been observed to be slow compared to other prevailing algorithms.

3. THE ADDUCED TECHNIQUE

The titillate mainly lies in finding the genome regions that are responsible for protein translation. We have developed an ensconced architecture, shown in Figure 1, for this assay, which runs from the preprocessing of raw data to the final translation assay. For this purpose we have used the matriclinous datasets of Chimaera Monstrosa (rabbit fish, NC_003136) and Poly Odontidae (paddle fish, NC_004419) [18]. At the preprocessing stage, raw datasets are passed through a filter that outputs a more rarefied form of the data, which can then be used for the actual metaphorical assay between species. It is evident from Figure 2 that the dataset contains characters other than pure nucleotide bases. These illegal characters are removed by applying a cleansing filter. At the first exemplification, it is worth noting that the assay should be made with the original data values; any garbage collection may lead to detrition of the results. Figure 3 depicts that the preprocessed data contains only pure nucleotide base pairs without any anomalies. This rarefied data is later fed into the next ensconce for the actual assay. First we display the ORF in a nucleotide streak and find the start and stop codons.
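The cleansing filter described above can be sketched as follows. This is a minimal illustration, not the authors' filter: the function name clean_streak is hypothetical, and the rule of keeping only the four bases A, C, G and T (after upper-casing) is an assumption, since the text does not list which characters count as illegal. Header lines beginning with ">" are dropped on the further assumption that the raw file may be in FASTA form.

```python
NUCLEOTIDES = frozenset("ACGT")

def clean_streak(raw_text):
    """Return the nucleotide streak (sequence) with every other character removed."""
    kept = []
    for line in raw_text.splitlines():
        if line.startswith(">"):
            continue                      # assumed FASTA header line: skip entirely
        for ch in line.upper():
            if ch in NUCLEOTIDES:         # drops digits, spaces, N and other symbols
                kept.append(ch)
    return "".join(kept)

# Illustrative raw record with position numbers, spaces and an ambiguous base.
raw = "1 acgtn acg\n61 ttga"
streak = clean_streak(raw)
```

For the illustrative record above the filter returns "ACGTACGTTGA"; the length of the cleaned streak is what the dataset sizes in the following subsection would be measured on.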
By using the streak indices for start and stop, we can extract the sub-streaks and determine the codon distribution effectively. The most informative and titillating marvel is that the choate process is broken into steps, and each step fully performs the metaphorical assay relevant to DNA to protein translation.

A. SIZE OF DATASETS

1. Chimaera Monstrosa contains 18580 nucleotides of Adenine, Guanine, Thymine and Cytosine. The cumulative size of the data is 37160 bytes, arranged in the form of a uni-vector.
2. Poly Odontidae contains 16512 nucleotides of Adenine, Guanine, Thymine and Cytosine. The cumulative size of the data is 33024 bytes, arranged in the form of a uni-vector.

B. ORF IN NUCLEOTIDE STREAKS

It is worth noting that the metaphorical assay between both species is being done at the translation level, so this level is vital in the assay. We split this ensconce into three more ensconces to get a better benefit from this ensconced assay. In each phase, our titillate lies in determining the accurate start and stop positions of the codons that perform the relative assay.

C. ORF PRIMARY FRAMES

At the ORF primary frame level, Figure 4 shows that the start position for the first frame is at 7156 and for the second at 8761. These start positions re-adduce the major translation regions in the entire frames. These regions are pure depictions of tri-nucleotide molecules. This process leads towards the extraction of sub-chains that will later be shifted to peptide regions.

Figure 1. Ensconced Architecture

Figure 2. Dataset before filter application

Figure 3. Pre-processed Dataset

Figure 4. ORF of Chimaera Monstrosa in Frame 1

Figure 5. ORF of Poly Odontidae in Frame 1

Likewise, we get the ORF in the second dataset, of Poly Odontidae, shown in Figure 5. Then, by entering the start positions, we can get the stop codons. The ORF regions of the second dataset in Frame 1 run from 10798 to 11395 and from 14641 to 15559.
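The search for start and stop positions in each frame can be sketched as below. This is an illustration under stated assumptions, not the tool the authors used: the function names are hypothetical; the start codon ATG and the stop codons TAA, TAG and TGA of the standard code are assumed, because the paper does not say which code table was applied; only the forward strand is scanned; and positions are 0-based with an exclusive end, whereas the positions quoted in the text are presumably 1-based.

```python
START_CODON = "ATG"
STOP_CODONS = frozenset({"TAA", "TAG", "TGA"})   # standard code, assumed

def find_orfs(streak, frame):
    """Return (start, end) index pairs of ORFs in one reading frame (0, 1 or 2).

    Each ORF begins at the first start codon seen and ends just after the
    next in-frame stop codon; end is exclusive.
    """
    orfs = []
    start = None
    for pos in range(frame, len(streak) - 2, 3):
        codon = streak[pos:pos + 3]
        if start is None and codon == START_CODON:
            start = pos
        elif start is not None and codon in STOP_CODONS:
            orfs.append((start, pos + 3))
            start = None
    return orfs

def orfs_in_all_frames(streak):
    """Map each forward frame number (1, 2, 3) to its list of ORFs."""
    return {frame + 1: find_orfs(streak, frame) for frame in range(3)}

# Illustrative streak: one ORF, ATG AAA TAG, sits in the third frame.
example = "CCATGAAATAGGG"
found = orfs_in_all_frames(example)
sub_streak = example[2:11]   # sub-streak extracted from the start and stop indices
```

Extracting streak[start:end] for each pair gives the sub-streaks on which the codon distribution is then computed.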
It is clear that there is an evident aberration in the codon regions for both frames of these species. The corresponding translated regions are so entirely peculiar that we cannot even guess at any sub-channel affinity.

D. ORF SECONDARY FRAMES

At the second level, we intend to find the codon positions for Frame 2 of both species. Figure 6 describes that the major ORFs start from 2753, 5426 and 10325. This re-adduces that there are series of other regions between the first and second frames that do not contribute to the peptide translation regions. Similarly, Frame 2 of Poly Odontidae, shown in Figure 7, has codon positions from 11120 to 11465 and from 12464 to 12887. This shows a massive aberration in the datasets at this level. As we move with increasing nucleotide sub-streaks, we may get larger aberrations, but this does not seem to be true for all matriclinous datasets. This is the reason that this marvel was given importance in the selection of these particular sets.

E. ORF TERTIARY FRAMES

Discussing the last frame set in this streak, we first find the codon composition for these frames. For exemplification, consider Frame 3 of Chimaera Monstrosa. Figure 8 shows that the major ORFs start from 4019, 11948 and 14328. This massive aberration in codon compositions also provides evidence that the first translated region lies at some four thousand, while the second and third regions have jump gaps. This is the variation in translated regions between species. In Figure 9, the third frame for Poly Odontidae goes from 2796 to 3242, 6315 to 6722 and 12753 to 13217. Figure 8 shows that the first two codon positions are relatively similar, while the third position again describes a jump gap. Performing metaphorical assay at this level reveals that both matriclinous datasets show a kind of extremity in behavior, which makes them relevant at certain codon compositions and peculiar at others.

F. CODON COUNT

The codon count describes the tri-nucleotide behavior of streaks.
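A codon count of this kind can be sketched as follows. This is a minimal illustration with a hypothetical function name: it counts non-overlapping triplets starting at the chosen frame offset and ignores any trailing bases that do not fill a complete codon, which is an assumption about how the counts in the figures were obtained.

```python
from collections import Counter

def codon_count(streak, frame=0):
    """Count each tri-nucleotide (codon) read in the given frame (0, 1 or 2)."""
    counts = Counter()
    for pos in range(frame, len(streak) - 2, 3):
        counts[streak[pos:pos + 3]] += 1
    return counts

# Illustrative streak read in the first frame: ATG AAA TAG ATG.
counts = codon_count("ATGAAATAGATG")
most_common = counts.most_common(1)   # the strongest codon and its count
```

Comparing the two Counter objects obtained for the first ORF of each species, for example through the set of codons present in one but not the other, gives a simple numerical view of the aberrations discussed here.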
We need to find the tenor of pertinency in terms of the strengths of nucleotide bases. For exemplification, we have selected Frame 1 from the codon composition of both species and compare the strengths.

Figure 10. Codon count (Chimaera Monstrosa in Frame 1)

Figure 10 re-adduces the codon count for Chimaera Monstrosa. Our aim focuses on the metaphorical assay of codon strength at this stage. For this purpose, we need to calculate the codon count for Poly Odontidae. Figure 11 shows the codon count of the first ORF of Poly Odontidae.

Figure 11. Codon count (Poly Odontidae in Frame 1)

G. STRENGTH OF AMINO ACID IN THE PROTEIN STREAK

In the last phase of this metaphorical assay, we need to find the relevant strength of peptide pairs in the protein streaks (resulting from the translation from DNA to protein).

Figure 6. Frame 2 (Chimaera Monstrosa)
Figure 7. Frame 2 (Poly Odontidae)
Figure 8. Frame 3 (Chimaera Monstrosa)
Figure 9. Frame 3 (Poly Odontidae)
Figure 14. Strength of amino acid (Chimaera Monstrosa)

Figure 14 shows the strength of amino acid in Chimaera Monstrosa. We now determine the atomic decomposition and molecular weight of the protein:

C: 1220 H: 1886 N: 298 O: 341 S: 12
Molecular weight is 2.6569e+004

The strength of amino acid in the protein streak of Poly Odontidae is depicted in Fig. 15.

Figure 15. Strength of amino acid (Poly Odontidae)

Similarly, the atomic decomposition and molecular weight of the protein are:

C: 940 H: 1488 N: 276 O: 266 S: 14
Molecular weight is 2.1360e+004

Comparing the amino acid streaks of both species obtained from the primary codon translation, we obtain Table 1, and the corresponding molecular weights are given in Table 2.

Table 1: Amino acid streak correlation

Atom    Chim. Monstrosa    Poly Odontidae
C       1220               940
H       1886               1488
N       298                276
O       341                266
S       12                 14

Table 2: Molecular weight correlation

Chimaera Monstrosa    Poly Odontidae
2.6569e+004           2.1360e+004

These results clearly describe the marvel that both species, despite being from the same class, differ greatly in their ORF patterns.

4. Conclusion

An Open Reading Frame (ORF) begins with a start codon region. The subsequent region contains nucleotides whose total length is a multiple of 3 and ends with a stop codon. This paper describes the phase-wise metaphorical assay of two matriclinous datasets of the species Chimaera Monstrosa and Poly Odontidae. It re-adduces an assimilated approach composed of step-by-step processes to elaborate the results effectively. The process puts more stress on peptide translation using the Open Reading Frame concept and a data refining methodology. At the end, we look at all outcomes that make this effort optimal by performing a sensitive assay at the DNA-to-protein conversion. Variations were observed at each step even though the data classes remained the same.

Acknowledgements

This work was partially supported by the Research Center, College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia.

Author Profile

Hassan Mathkour is a professor in the Department of Computer Science. He serves in the College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia, as the Vice Dean for Quality, Assurance and Development. He completed his PhD at the University of Iowa, USA, in 1986. His research interests include Databases, Artificial Intelligence, Bio-informatics, NLP and Computational Sciences.

An Effective Localized Route Repair Algorithm for Use with Unicast Routing Protocols for Mobile Ad hoc Networks

Natarajan Meghanathan
Jackson State University, Department of Computer Science, P. O. Box 18839, Jackson, MS 39217
natarajan.meghanathan@jsums.edu

Abstract: We propose an efficient and effective Localized Route Repair (LRR) algorithm that would minimize the number of flooding-based route discoveries for on-demand unicast routing protocols in Mobile Ad hoc Networks (MANETs). The principle behind the LRR algorithm is that the downstream node of the broken link would not have moved far away and is highly likely to be in the 2-hop neighborhood of the upstream node of the broken link. Accordingly, upon the failure of a link on the path from a source node to a destination node, the upstream node of the broken link stitches the broken route by attempting to determine a 2-hop path to the downstream node of the broken link. If the underlying network is connected, the proposed LRR algorithm will help to stitch a broken route without going through network-wide flooding of the Route-Request (RREQ) messages or a time-consuming expanding ring route search process. LRR can be incorporated into the route management module of any MANET unicast routing protocol. In this paper, we implement LRR for the Dynamic Source Routing (DSR) protocol, referred to as DSR-LRR. Simulation results reveal that the number of broadcast node transmissions per session for the original DSR is 50%-70% more than that of DSR-LRR. The relative increase in the hop count for DSR-LRR routes is, however, within 25%.

Keywords: Mobile ad hoc networks, Localized route repair, Hop count, Flooding, Route discoveries

1. Introduction

A mobile ad hoc network (MANET) is a distributed dynamic system of autonomously moving wireless devices (nodes). The wireless nodes self-organize for a limited period of time depending on the application and the environment.
As the nodes are battery-charged and recharging is next to impossible, the transmission range (i.e., the communication range) of the nodes is often limited. As a result, it may not always be possible to have point-to-point direct communication between any two nodes. A wireless link is said to exist between two nodes only if the two nodes are within the transmission range of each other. Communication sessions in MANETs are often multi-hop in nature, involving intermediate peer nodes that cooperatively forward data packets from the source towards the destination. As the topology changes dynamically, routes between the source and destination nodes of a communication session have to be frequently reconfigured in order to continue the session.

MANET routing protocols are of two types: proactive and reactive. Proactive routing protocols tend to maintain routes between any pair of nodes all the time, while reactive routing protocols discover routes from the source to the destination only on demand (i.e., only when required). In a dynamically changing environment (like that of battlefields), reactive routing has been preferred over proactive routing, as the latter involves considerable route maintenance overhead [1][2]. In this paper, we restrict ourselves to exploring the reactive routing strategy.

On-demand route discovery in reactive routing protocols is often accomplished through a global flooding process in which each node is involved in forwarding (transmitting and receiving) the route discovery message from the source towards the destination. Frequent flooding-based route discoveries can quickly exhaust the battery charge at the nodes and also consume the network capacity (bandwidth). Several on-demand routing protocols have been published in the literature [1][2][3], each with a particular route selection metric. The most commonly used route selection metric is the hop count.
Routes with the minimum hop count are preferred because the data would go through the minimum number of intermediate forwarding nodes, resulting in lower end-to-end delay and reduced energy consumption per data packet transferred. However, it has been identified that minimum hop routes are not very stable (i.e., the routes do not exist for a long time) [4] and routes have to be frequently determined through the flooding-based route discovery procedure.

In this paper, we propose a Localized Route-Repair (LRR) algorithm that would minimize the number of flooding-based route discoveries for on-demand MANET routing protocols. With the incorporation of the proposed LRR algorithm in its route management module, a unicast routing protocol would have to opt for a flooding-based route discovery only when the MANET is temporarily disconnected and a new route has to be determined between the source and destination. Otherwise, if the MANET is connected, the proposed LRR algorithm will help to fix a broken route without going through network-wide flooding of the Route Request (RREQ) messages and without requiring the intermediate nodes to determine the route all the way to the destination.

The rest of the paper is organized as follows: In Section 2, we present a motivating example to illustrate the need for LRR. Section 3 describes the proposed LRR algorithm. Section 4 describes the simulation environment and demonstrates the effectiveness of LRR through simulation results. Section 5 discusses related work and Section 6 outlines some of the benefits of the proposed LRR algorithm. Section 7 concludes the paper.

2. Motivation

Let s–a–b–c–e–d be the route discovered by a routing protocol from source node s to destination node d through a regular flooding process. Once the route is discovered, data packets are sent continuously from s to d.
After a certain time, assume the intermediate nodes b and c move out of the transmission range of each other, leading to the failure of link b – c on the discovered route from s to d. At present, MANET routing protocols handle route failures in one of the following two principal ways:

(i) The upstream node b of the broken link b – c attempts to find a path all the way to the destination by initiating an expanding ring route search [5]. The expanding ring route search technique attempts to locate the destination node d in an iterative fashion by restricting the scope of the broadcast route-request messages to 1 hop, 2 hops and so on, up to a pre-determined hop count value configured for the routing protocol. If a route to the destination is determined within any of the route search attempts, the intermediate node continues to forward the data packets on the newly discovered route to the destination. The source node is completely unaware of the change in the route to the destination.

(ii) The upstream node b of the broken link b – c immediately notifies the source node s about the failure in the route and refrains from initiating any expanding ring route search. The source node launches a network-wide flooding of the RREQ messages.

The first strategy of expanding ring route search may be efficient if the destination can be located within the vicinity of the upstream node of the broken link. Otherwise, the route search has to be slowly expanded up to a certain hop count value, and this would incur a considerable amount of route-management overhead (in terms of the number of localized Route Request messages broadcast) as well as larger route acquisition latency. The second strategy of immediately notifying the source node about the route failure can trigger frequent network-wide flooding, which would also generate significant control overhead (number of RREQ messages broadcast).

3. Localized Route Repair (LRR) Algorithm

In this paper, we propose the Localized Route Repair (LRR) algorithm, which works as follows: Upon the failure of a link on the path from a source to a destination, the upstream node of the broken link stitches the broken route by attempting to determine a 2-hop path to the downstream node of the broken link. In this pursuit, the upstream node of the broken link initiates a Local-RREQ message broadcast process that is restricted to propagate only within its 2-hop neighborhood. The main idea behind the LRR algorithm is that the downstream node of the broken link would not have moved far away and is highly likely to be in the 2-hop neighborhood of the upstream node of the broken link. With LRR, the broken route can be stitched together rapidly without having to go through an expanding ring route search process to the destination node.

The Local-RREQ message includes the IDs of the original source node, the original destination node, the originating intermediate node (i.e., the upstream node of the broken link), the targeted intermediate node (i.e., the downstream node of the broken link) and the most recent path used from the originating intermediate node to the destination node. The most recent path information is useful for an intermediate node receiving the Local-RREQ message in deciding how to further process the message. In the one-hop neighborhood, if an intermediate node receives the Local-RREQ message for the first time and it is neither the destination node nor a downstream node on the path towards the destination, the intermediate node simply records its ID in the Local-RREQ message and broadcasts it to its neighbors. All duplicate Local-RREQ messages are dropped.
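The 2-hop restricted broadcast described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the ns-2 module used in this paper: the class `LocalRREQ`, the function `two_hop_search`, the adjacency-dictionary representation of the network and the extra node f are hypothetical names introduced here, and the sketch only determines which downstream nodes receive the message and along which path.

```python
from dataclasses import dataclass

@dataclass
class LocalRREQ:
    source: str        # original source node
    destination: str   # original destination node
    origin: str        # upstream node of the broken link
    target: str        # downstream node of the broken link
    recent_path: list  # most recent path from origin to destination

def two_hop_search(adj, rreq):
    """Simulate the 2-hop Local-RREQ broadcast.

    adj maps each node to the set of its current neighbors.
    Returns {node: path from origin} for every downstream node
    (including the destination) that receives the message.
    """
    downstream = set(rreq.recent_path[1:])  # nodes after the origin on the old path
    seen = {rreq.origin}
    reached = {}
    forwarders = []
    # Hop 1: the origin broadcasts to its current neighbors.
    for n in sorted(adj[rreq.origin]):
        seen.add(n)
        if n in downstream:
            reached[n] = [rreq.origin, n]
        else:
            forwarders.append(n)  # records its ID and rebroadcasts once
    # Hop 2: one-hop neighbors rebroadcast; duplicates are dropped.
    for f in forwarders:
        for n in sorted(adj[f]):
            if n in seen:
                continue
            seen.add(n)
            if n in downstream:
                reached[n] = [rreq.origin, f, n]
    return reached

# Example of Section 2: route s-a-b-c-e-d, link b-c broken, f is a new neighbor.
adj = {"s": {"a"}, "a": {"s", "b"}, "b": {"a", "f"}, "f": {"b", "c"},
       "c": {"f", "e"}, "e": {"c", "d"}, "d": {"e"}}
rreq = LocalRREQ("s", "d", "b", "c", ["b", "c", "e", "d"])
print(two_hop_search(adj, rreq))  # {'c': ['b', 'f', 'c']}
```

The `seen` set plays the role of duplicate suppression, and the fixed two loops enforce the 2-hop limit without a general hop counter.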
If the underlying network is connected, one or more of the following would be the outcome(s) of the 2-hop Local-RREQ broadcast search process:

(i) If the Local-RREQ message is received by the original destination node of the route (i.e., the destination has moved within the 2-hop neighborhood of the originating intermediate node of the Local-RREQ message), then the destination node sends back a Local Route-Reply (Local-RREP) message to the originating intermediate node of the Local-RREQ message either through a direct path or through a 2-hop path, whichever is appropriate. Referring back to our example in Section 2, if the most recent path used is s–a–b–c–e–d and b – c is the broken link, the new path would be either s–a–b–d or s–a–b–g–d (refer to Figure 1), depending on whether the destination node d is 1 hop or 2 hops away from the originating intermediate node, node b.

Figure 1. Possible Outcome of the LRR Algorithm [Destination Node d sends LRR-RREP]

(ii) If an intermediate node located further downstream on the recently used source-destination (s-d) path receives the Local-RREQ message, then that node sends a Local-RREP message back to the originating node of the Local-RREQ message. In our example, if the intermediate node e located further downstream on the recently used path s–a–b–c–e–d receives the Local-RREQ message directly from node b, then node e sends a Local-RREP message to node b. A new path s–a–b–e–d with a reduced hop count has thus been learnt from the source s to the destination d (refer to Figure 2).

(iii) If the Local-RREQ message is received by the targeted downstream node for which the 2-hop broadcast route search was primarily initiated, the targeted node responds with a Local-RREP message through an intermediate node from which the Local-RREQ message was received.
In our example, node c would respond through an intermediate node, say node f, and the new path learnt from the source to the destination would be s–a–b–f–c–e–d (refer to Figure 3). Note that this new path has a hop count that is one more than that of the previously used path s–a–b–c–e–d.

Figure 2. Possible Outcome of the LRR Algorithm [Intermediate Node e sends LRR-RREP]
Figure 3. Possible Outcome of the LRR Algorithm [Targeted Downstream Node c sends LRR-RREP]

It is possible that all three of the above scenarios co-exist and that the intermediate node generating the Local-RREQ message receives Local-RREP messages through each of the three means. In such a case, the Local-RREP message received from the original destination node of the path is the most preferred. Otherwise (i.e., no Local-RREP message is received from the destination node), if one or more intermediate nodes downstream on the path towards the destination send Local-RREP messages, the Local-RREP message received from the intermediate node that lies on the shortest path from the source (through the originating intermediate node of the Local-RREQ message) to the destination is preferred.

The LRR algorithm can be executed for every link failure on a route from the source node to the destination node. LRR can be applied even if more than one link fails on a path from the source to the destination (i.e., once for each link failure). If the underlying network is disconnected, then the originating intermediate node of the Local-RREQ message fails to stitch the broken route (i.e., it cannot find a path either to the destination or to any of the downstream nodes). In such a scenario, the intermediate node sends a Local-RERR (Local Route Error) message to the source node indicating the failure to stitch the broken route. The source node then initiates a network-wide flooding of the RREQ messages to discover a route from the source to the destination.

4. Simulation Environment and Results

In this paper, we incorporate LRR into the route management module of the Dynamic Source Routing (DSR) protocol, one of the widely used classic MANET routing protocols. The optimized version of DSR using the LRR algorithm is referred to as DSR-LRR. The performance and the potential benefits obtained using the proposed LRR algorithm have been evaluated through simulations conducted in the ns-2 simulator (version 2.28) [6]. We implemented the LRR module in ns-2. We used the implementation of DSR that comes with ns-2 and incorporated the developed LRR module into the DSR route management module to obtain the optimized DSR-LRR protocol. In order to explore the maximum possible performance gain obtainable with the LRR mechanism, we disabled promiscuous listening and the default route cache and maintenance mechanism in DSR-LRR.

Simulations have been conducted in a square network of dimensions 1000m x 1000m. The transmission range of each node is 250m. The density of the network is varied by conducting the simulations with 50 nodes (representing a low network density, with an average of 10 neighbors per node) and 100 nodes (representing a high network density, with an average of 20 neighbors per node). We did not face many limitations in our implementation, other than that the ns-2 simulator was not scalable enough to conduct simulations for more than 100 nodes.

The node mobility model used is the Random Waypoint model [7], commonly used in MANET simulation studies. According to this model, the nodes are initially uniform-randomly distributed throughout the network. Each node moves independently of the other nodes in the network. Each node selects a random target location and moves to the selected location with a velocity uniform-randomly selected from the interval [0,…, vmax].
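The movement rule of this mobility model can be sketched for a single node in Python as follows. This is an illustrative sketch only, not the mobility generator used in the simulations: the function name `random_waypoint`, the sampling interval `dt` and the `seed` parameter are assumptions introduced here, and the node is assumed to pick a fresh target and speed as soon as it arrives at its current target.

```python
import math
import random

def random_waypoint(steps, side=1000.0, vmax=10.0, dt=1.0, seed=0):
    """Return the positions of one node sampled every dt seconds."""
    rng = random.Random(seed)
    # Initial position: uniformly distributed over the square network.
    x, y = rng.uniform(0, side), rng.uniform(0, side)
    tx, ty, speed = x, y, 0.0
    trace = [(x, y)]
    for _ in range(steps):
        dist = math.hypot(tx - x, ty - y)
        if dist == 0.0:
            # Target reached: choose a new target and a new speed in [0, vmax].
            tx, ty = rng.uniform(0, side), rng.uniform(0, side)
            speed = rng.uniform(0, vmax)
            dist = math.hypot(tx - x, ty - y)
        step = speed * dt
        if step >= dist:
            x, y = tx, ty  # arrive exactly at the target
        else:
            x += (tx - x) * step / dist
            y += (ty - y) * step / dist
        trace.append((x, y))
    return trace

trace = random_waypoint(steps=500, vmax=10.0, seed=1)
print(trace[0], trace[-1])
```

Because a speed close to 0 can be drawn, a node may stay nearly stationary for a long time, which is a known property of this model when the lower bound of the interval is 0.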
After reaching the targeted location, the node again selects a new location to move to, with a new randomly chosen velocity from the interval [0,…, vmax]. The values of vmax used in our simulations are 10m/s, 30m/s and 50m/s, representing low, moderate and high node mobility scenarios respectively. The performance results illustrated in Figures 4 through 9 are average values obtained for 15 source-destination (s-d) pairs run against 5 different node mobility profiles for every combination of node mobility (vmax) and network density values. The packet sending rate for each s-d session is 4 data packets per second, over a simulation time period of 1000 seconds.

4.1 Performance Metrics

The two main performance metrics measured are the average hop count per path and the average number of broadcast node transmissions per s-d session. The average hop count per path is the time-averaged hop count of all the s-d paths used during the entire communication session. For example, if the communication session lasted for 10 seconds using a 3-hop path for the first 4 seconds, a 2-hop path for the next 2 seconds and a 4-hop path for the next 4 seconds, then the time-averaged hop count is (3*4 + 2*2 + 4*4) / (4 + 2 + 4) = 3.2. The average number of broadcast node transmissions per s-d session is a measure of the control message overhead incurred by the routing protocols and is computed here as the sum of all the RREQ and/or Local-RREQ messages broadcast by the nodes in the network for the entire simulation time period, averaged over all the s-d sessions. The performance results illustrate a tradeoff between the above two performance metrics.

Figure 4. Average Hop Count per Path (50 Nodes)

4.2 Average Hop Count per Path

As expected, the routes used by DSR-LRR have a larger hop count than those used by the DSR protocol.
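For reference, the time-averaged hop count defined in Section 4.1 can be computed as in the following minimal Python sketch. The function name `time_averaged_hop_count` and the representation of a session as a list of (hop count, duration) pairs are assumptions made for illustration.

```python
def time_averaged_hop_count(segments):
    """segments: list of (hop_count, duration_in_seconds) pairs for one session."""
    total_time = sum(duration for _, duration in segments)
    weighted = sum(hops * duration for hops, duration in segments)
    return weighted / total_time

# Example from Section 4.1: 3 hops for 4 s, 2 hops for 2 s, 4 hops for 4 s.
session = [(3, 4), (2, 2), (4, 4)]
print(time_averaged_hop_count(session))  # 3.2
```

Weighting by duration means that a long-lived path contributes more to the metric than a path that breaks quickly.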
However, the difference in the hop count is only within 15%-18% for low-density networks and 21%-23% for high-density networks. Both DSR and DSR-LRR incur a lower hop count in high-density networks compared to that incurred in low-density networks, especially at low and moderate node mobility. This could be attributed to the availability of an increased number of nodes to choose as the next hop node in high-density networks. At low and moderate node mobility conditions, the average hop count per path in low-density networks is 9%-15% and 7%-12% more than that obtained in high-density networks. At high node mobility conditions, there is no significant difference in the hop count values obtained for each of these routing protocols in low-density and high-density networks.

Figure 5. Average Hop Count per Path (100 Nodes)

4.3 Average Number of Broadcast Node Transmissions per Session

For a given node mobility, the average number of broadcast node transmissions per session for DSR is 50%-60% and 60%-70% more than that of DSR-LRR in low-density and high-density networks respectively. For a given network density, the difference between the number of broadcast node transmissions incurred for DSR and DSR-LRR increases as node mobility increases. Thus, the LRR algorithm is very effective in reducing the control message overhead in the network and helps to optimize the usage of critical resources like energy and bandwidth.

Figure 6. Number of Broadcast Node Transmissions per Session (50 Node Network)
Figure 7. Number of Broadcast Node Transmissions per Session (100 Node Network)

In high-density networks, for a given node mobility condition, both DSR and DSR-LRR incur a larger number of broadcast node transmissions per session.
This could be attributed to two factors: (i) As the size of the neighborhood is doubled, more nodes receive a RREQ/Local-RREQ message and broadcast the message to their neighbors; (ii) The minimum hop routes of DSR and DSR-LRR are more likely to be relatively less stable in high-density networks compared to those determined in low-density networks. Both routing protocols tend to minimize the number of intermediate nodes between the source and destination nodes of a route and, as a result, attempt to choose routes that have a larger physical distance between the upstream and downstream nodes of the constituent links of the routes. For both DSR and DSR-LRR, the number of broadcast node transmissions incurred in high-density networks is 100% (at low node mobility) to 150% (at high node mobility) more than that incurred in low-density networks. As we increase the maximum node velocity value from 10m/s to 30m/s (i.e., by a factor of 3), the average number of broadcast node transmissions increases by a factor of 2.5 and 2.3 in low-density and high-density networks respectively. Similarly, as we increase the maximum node velocity value from 10m/s to 50m/s (i.e., by a factor of 5), the average number of broadcast node transmissions increases by a factor of 3.3 and 4.2 in low-density and high-density networks respectively.

5. Related Work

A local route recovery mechanism based on the Expanding Ring Search was explored in [8] for improving TCP performance in multi-hop wireless ad hoc networks. Simulation results indicate that with the application of the local-recovery techniques, the end-to-end delay per data packet (a measure of the hop count per path) for DSR increased by as much as 60%. The Witness-Aided Routing (WAR) protocol [9] attempts to overcome link failures by locally broadcasting the data packets within predefined hop limits.
Even though there can be fast local recovery, this approach of broadcasting the data packets themselves as the recovery packets leads to significantly high control overhead.

The Associativity-based Routing (ABR) protocol [10] attempts to locally fix a broken route if the upstream node of the broken link is located in the second half of the route (i.e., the node is closer to the destination than to the source). The upstream node of the broken link broadcasts a local route request that will propagate with a hop limit equal to the remaining number of hops to the destination in the broken route. Only the destination can respond to the local route request. If the route to the destination cannot be determined, the host preceding the upstream node of the broken link is notified and the above local route discovery procedure is recursively repeated until the error message reaches a node that is in the first half of the broken route. At this time, the source node is notified about the route failure and it initiates a global flooding-based route discovery. This approach of recursive route repair will significantly consume the bandwidth and lead to longer delay.

A local route repair mechanism for the Ad hoc On-demand Distance Vector (AODV) [5] routing protocol has been proposed in [11]. Here, the upstream node of the broken link attempts to locally fix the broken route by trying to determine a route to the node that is the next hop node of the downstream node of the broken link. However, this approach requires nodes to keep track of the two-hop nodes (in addition to the next hop node) for the path towards every destination node. This can significantly increase the storage complexity of the routing tables at the nodes.

6. Benefits of the LRR Algorithm

The proposed LRR algorithm is not protocol-specific and it can be incorporated into the route management module of any unicast routing protocol.
The LRR algorithm will meet the energy, throughput and Quality of Service (QoS) requirements for communication in resource-constrained environments, typical of MANETs. The two-hop route repair technique of LRR can be very effective in speeding up the communication between the different wireless devices in a dynamically changing mobile environment. As a broken route is more likely to be stitched quickly using the proposed LRR algorithm, data is more likely to reach the destination nodes within a limited amount of time, thus providing users with a desired QoS. Because of the tendency to reduce the number of flooding-based route discoveries, the proposed LRR algorithm helps to conserve energy and bandwidth in the network. With a limited number of flooding-based route discoveries, the available network capacity is also enhanced. There can be more simultaneous communications in the network.

7. Conclusions and Future Work

The high-level contribution of this paper is the development of an effective and efficient Localized Route-Repair (LRR) algorithm for use with MANET unicast routing protocols. We illustrate the effectiveness of LRR by developing an optimized version of DSR (referred to as DSR-LRR) that has the LRR algorithm incorporated in its route management module. The simulation study illustrates a potential tradeoff between the control message overhead incurred (measured as the number of broadcast node transmissions) and the hop count per path. We observe that the number of broadcast node transmissions of the control messages incurred by DSR can be 50%-70% more than that incurred by DSR-LRR; on the other hand, the average hop count per path for DSR-LRR is at most 25% more than that of DSR. Hence, the tradeoff is not evenly balanced. Thus, DSR-LRR can be a good replacement for DSR in resource-constrained environments where minimum hop count is not a critical factor.
We expect to obtain similar results when LRR is incorporated into the route management modules of other existing MANET unicast routing protocols. As future work, we will apply the LRR technique to other unicast routing protocols such as the stability-based Flow-Oriented Routing Protocol (FORP) and the position-based Location-Aided Routing (LAR) protocol for MANETs. We will also study the performance of the LRR optimized routing protocols under different mobility models for ad hoc networks [12].

References

[1] D. Broch, D. A. Maltz, D. B. Johnson, Y-C. Hu and J. Jetcheva, “A Performance Comparison of Multi-hop Wireless Ad hoc Network Routing Protocols,” Proceedings of the 4th Annual ACM/IEEE International Conference on Mobile Computing and Networking, Dallas, TX, October 25 – 30, pp. 85 – 97, New York, NY, USA, 1998. [2] P. Johansson, T. Larsson, N. Hedman, B. Mielczarek and M. Degermark, “Scenario-based Performance Analysis of Routing Protocols for Mobile Ad hoc Networks,” Proceedings of the 5th Annual International Conference on Mobile Computing and Networking, Seattle, WA, USA, August 15 – 20, pp. 195 – 206, New York, NY, USA, 1999. [3] N. Meghanathan and A. Farago, UTDCS-40-04 (2004) Survey and Taxonomy of 55 Unicast Routing Protocols for Mobile Ad hoc Networks, The University of Texas at Dallas, Richardson, TX. [4] N. Meghanathan, “Exploring the Stability-Energy Consumption-Delay-Network Lifetime Tradeoff of Mobile Ad hoc Network Routing Protocols,” Academy Publisher Journal of Networks, Vol. 3, No. 2, pp. 17 – 28, 2008. [5] C. Perkins, E. B. Royer and S. Das, “Ad hoc On-Demand Distance Vector (AODV) Routing,” IETF RFC 3561, July 2003. [6] K. Fall and K. Varadhan, NS-2 Notes and Documentation, The VINT Project at LBL, Xerox PARC, UCB, and USC/ISI, http://www.isis.edu/nsnam/ns, August 2001. [7] C. Bettstetter, H. Hartenstein and X.
Perez-Costa, “Stochastic Properties of the Random-Way Point Mobility Model,” Wireless Networks, pp. 555-567, Vol. 10, No. 5, September 2004. [8] Z. Li and Y.-K. Kwok, “Local Route Recovery Algorithms for Improving Multi-hop TCP Performance in Ad hoc Wireless Networks,” Lecture Notes in Computer Science, vol. 3149, pp. 925-932, December 2004. [9] I. D. Aron, S. K. S. Gupta and E. K. S. Gupta, “A Witness-Aided Routing Protocol for Mobile Ad hoc Networks with Unidirectional Links,” Proceedings of the 1st International Conference on Mobile Data Access, pp. 24-33, Hong Kong, December 1999. [10] C-K. Toh, “Associativity-Based Routing for Ad hoc Mobile Networks,” IEEE Personal Communications, Vol. 4, No. 2, pp. 103 – 139, March 1997. [11] X. Bai-Long, G. Wei, L. Jun and Z. Si-Lu, “An Improvement for Local Route Repair in Mobile Ad hoc Networks,” Proceedings of the 6th International Conference on ITS Telecommunications, pp. 691 – 694, June 2006. [12] T. Camp, J. Boleng and V. Davies, “A Survey of Mobility Models for Ad Hoc Network Research,” Wireless Communication and Mobile Computing, Vol. 2, No. 5, pp. 483-502, September 2002. Author Profile Natarajan Meghanathan is currently working as Assistant Professor of Computer Science at Jackson State University, Mississippi, USA, since August 2005. Dr. Meghanathan received his MS and PhD in Computer Science from Auburn University, AL and The University of Texas at Dallas in August 2002 and May 2005 respectively. Dr. Meghanathan’s main area of research is ad hoc networks. He has more than 45 peer-reviewed publications in leading international journals and conferences in this area. Dr. Meghanathan has recently received grants from the Army Research Laboratory (ARL) and National Science Foundation (NSF). He serves as the editor of a number of international journals and also in the program committee and organization committees of several leading international conferences in the area of networks. Besides ad hoc networks, Dr. 
Meghanathan is currently conducting research in Bioinformatics, Network Security, Distributed Systems and Graph Theory.

Selection of Proper Activation Functions in Back-propagation Neural Networks Algorithm for Transformer Internal Fault Locations

A. Ngaopitakkul and A. Kunakorn
Faculty of Engineering, King Mongkut’s Institute of Technology Ladkrabang, Bangkok 10520, Thailand
knatthap@kmitl.ac.th

Abstract: This paper presents an analysis of the selection of an appropriate activation function for neural networks used to locate internal faults of a two-winding three-phase transformer. A decision algorithm based on a combination of Discrete Wavelet Transforms and neural networks is developed. Fault conditions of the transformer are simulated using ATP/EMTP in order to obtain current signals. The training process for the neural network and the fault diagnosis decision are implemented using toolboxes in MATLAB/Simulink. Various activation functions in the hidden layers and the output layer are compared in order to select the best activation function for locating the position of winding to ground faults along the transformer windings. It is found that the use of the Hyperbolic tangent function for the hidden layers and the Linear activation function for the output layer gives the most satisfactory accuracy in these particular case studies.

Keywords: Internal faults, Discrete Wavelet Transforms, Back-propagation neural network, Transformer windings

1. Introduction

In recent years, the development of fault diagnosis techniques for power transformers has progressed with the application of the wavelet transform and artificial neural networks [1,2,3,4,5]. Many studies have considered the effects of the magnetizing inrush current as well as the discrimination between magnetizing inrush current and internal faults [2,3,5].
It is very useful for electrical engineers if the fault positions along transformer windings can be detected. Therefore, a decision algorithm that locates the fault position along the winding is required in order to decrease the complexity and duration of maintenance. Neural networks have been employed in the development of such an algorithm, and have proved to be a powerful tool in fault detection as well as classification [2,3]. The activation function is a key factor in the artificial neural network structure. Back-propagation neural networks support a wide range of activation functions, such as the sigmoid function and the linear function. The choice of activation function can change the behavior of the back-propagation neural network considerably. There is no theoretical basis for selecting the proper activation function. Hence, the objective of this paper is to study an appropriate activation function for the algorithm used in the detection of internal fault locations along transformer windings. The activation functions in the hidden layers and the output layer are varied, and the results obtained from the decision algorithm are investigated. The decision algorithm is a part of a transformer protective scheme proposed in this paper. The structure of the protective scheme is shown in Figure 1. The simulations, analysis and diagnosis are performed using ATP/EMTP and MATLAB on a Pentium IV PC (2.4 GHz, 512 MB). It is noted that the discrete wavelet transform is employed in extracting the high frequency components contained in the internal fault currents of a transformer. The construction of the decision algorithm is detailed and implemented with various case studies based on Thailand electricity transmission and distribution systems.
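As a concrete reference for the comparison carried out in this paper, the three activation functions named above can be written out explicitly. The sketch below is in Python rather than MATLAB and is only illustrative: the function names are hypothetical, and the correspondence to the MATLAB toolbox transfer functions purelin, logsig and tansig is an assumption, since the paper does not name the toolbox functions it used.

```python
import math


def linear(x: float) -> float:
    """Linear (identity) activation: the output is unbounded."""
    return x


def logistic_sigmoid(x: float) -> float:
    """Logistic sigmoid 1 / (1 + exp(-x)); the output lies between 0 and 1."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # avoids overflow of exp(-x) for very negative x
    return e / (1.0 + e)


def tanh_sigmoid(x: float) -> float:
    """Hyperbolic tangent sigmoid; the output lies between -1 and 1."""
    return math.tanh(x)


if __name__ == "__main__":
    for x in (-2.0, 0.0, 2.0):
        print(x, linear(x), logistic_sigmoid(x), tanh_sigmoid(x))
```

The two sigmoid functions saturate for large inputs, whereas the linear function does not, which is one reason the choice made for the output layer can affect how well a network fits a continuous target such as a fault position.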
[Figure 1 block diagram; block labels: CT, CT, Y, Y, analogue input module, modal mixing unit, calculating differential current, WT filter, detail (scale 1-5), comparison coefficient, start detection, calculating by weight & bias of BP, decision logic unit, fault position, decision making unit, trip signal.]
Figure 1. The transformer protective scheme

2. Simulation

2.1 Transformer winding models

For a computer model of a two-winding three-phase transformer having primary and secondary windings in each phase, BCTRAN is a well-known subroutine in ATP/EMTP. To study internal faults of the transformer, Bastard et al. proposed a modification of the BCTRAN subroutine. Normally, BCTRAN uses a 6x6 matrix of inductances to represent a transformer, but for internal fault conditions the matrix is adjusted to a size of 7x7 for winding to ground faults and of 8x8 for interturn faults [6]. In the research work of Bastard et al. [6], the model was shown to be valid and accurate through a comparison with measurement results. However, the effects of high frequency components which may occur during the faults are not included in such a model. Islam and Ledwich [7] described the characteristics of high frequency responses of a transformer due to various faults. It has been shown that the fault types and fault locations have an influence on the frequency responses of the transformer [7]. In addition, it has been shown that transient based protection using high frequency components in fault currents is applicable to locating and classifying faults on transmission lines [8-9]. It is, therefore, useful to investigate the high frequency components superimposed on the fault current signals for the development of a transient based protection for a transformer.
As a result, in this paper the combination of the transformer model proposed by Bastard et al. [6], shown in Figure 2, with the high frequency model including the capacitances of the transformer recommended by the IEEE working group [10], shown in Figure 3, is used for simulations of internal faults at the transformer windings.

[Figure 2 diagram; labels: Primary, Secondary, Phase A, Phase B, Phase C, windings a, b, 2, 3, 4, 5, 6.]
Figure 2. The modification on ATP/EMTP model for a three-phase transformer with internal faults.

[Figure 3 diagram; labels: 115/23 kV, 50 MVA, Chg, Clg, Chl, Primary, Secondary.]
Figure 3. A two-winding transformer with the effects of stray capacitances.

The capacitances shown in Figure 3 are as follows:
Chg = stray capacitance between the high voltage winding and ground
Clg = stray capacitance between the low voltage winding and ground
Chl = stray capacitance between the high voltage winding and the low voltage winding.

The process for simulating winding to ground faults based on the BCTRAN routine of EMTP can be summarized as follows:

1st step: Compute matrices [R] and [L] of the power transformer from manufacturer test data [11] without considering the winding to ground faults [6].

\[ [R] = \begin{bmatrix} R_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & R_6 \end{bmatrix} \tag{1} \]

\[ [L] = \begin{bmatrix} L_1 & L_{12} & \cdots & L_{16} \\ L_{21} & L_2 & \cdots & L_{26} \\ \vdots & \vdots & \ddots & \vdots \\ L_{61} & L_{62} & \cdots & L_6 \end{bmatrix} \tag{2} \]

2nd step: Modify Equations 1 and 2 to obtain the new internal winding fault matrices [R]* and [L]* as illustrated in Equations 3-4 [6].
\[ [R]^{*} = \begin{bmatrix} R_a & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & R_b & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & R_2 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & R_3 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & R_4 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & R_5 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & R_6 \end{bmatrix} \tag{3} \]

\[ [L]^{*} = \begin{bmatrix} L_a & M_{ab} & M_{a2} & M_{a3} & M_{a4} & M_{a5} & M_{a6} \\ M_{ba} & L_b & M_{b2} & M_{b3} & M_{b4} & M_{b5} & M_{b6} \\ M_{2a} & M_{2b} & L_2 & M_{23} & M_{24} & M_{25} & M_{26} \\ M_{3a} & M_{3b} & M_{32} & L_3 & M_{34} & M_{35} & M_{36} \\ M_{4a} & M_{4b} & M_{42} & M_{43} & L_4 & M_{45} & M_{46} \\ M_{5a} & M_{5b} & M_{52} & M_{53} & M_{54} & L_5 & M_{56} \\ M_{6a} & M_{6b} & M_{62} & M_{63} & M_{64} & M_{65} & L_6 \end{bmatrix} \tag{4} \]

3rd step: The inter-winding capacitances and earth capacitances of the HV and LV windings can be simulated by adding lumped capacitances connected to the terminals of the transformer.

2.2 Power System simulation using EMTP

A 50 MVA, 115/23 kV two-winding three-phase transformer was employed in simulations with all parameters and configuration provided by a manufacturer [11]. The scheme under investigation is a part of the Thailand electricity transmission and distribution system as depicted in Figure 4. It can be seen that the transformer, as a step-down transformer, is connected between two subtransmission sections. To implement the transformer model, simulations were performed with various changes in system parameters as follows:
- The angles on the phase A voltage waveform for the instants of fault inception were 0°-330° (in steps of 30°).
- The internal fault type investigated at the transformer windings (both primary and secondary) was the winding to ground fault.
- The fault position, designated on any phase of the transformer windings (both primary and secondary), was varied at 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80% and 90% of the winding length measured from the line end of the windings.
- The fault resistance was 5 Ω.

Figure 4. The system used in simulation studies [12].
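The parameter variations listed above can be enumerated to see how many simulation cases they imply. The following sketch assumes that the parameters are combined as a full grid, which the paper does not state explicitly; the variable names and the dictionary layout are hypothetical.

```python
from itertools import product

# Parameter ranges taken from the list above.
INCEPTION_ANGLES_DEG = list(range(0, 331, 30))   # 0, 30, ..., 330
FAULT_POSITIONS_PCT = list(range(10, 91, 10))    # 10, 20, ..., 90
PHASES = ("A", "B", "C")
WINDINGS = ("HV", "LV")                          # primary and secondary
FAULT_RESISTANCE_OHM = 5.0


def build_cases():
    """Enumerate every combination (assumed full grid, not confirmed by the paper)."""
    return [
        {
            "winding": winding,
            "phase": phase,
            "position_pct": position,
            "inception_angle_deg": angle,
            "fault_resistance_ohm": FAULT_RESISTANCE_OHM,
        }
        for winding, phase, position, angle in product(
            WINDINGS, PHASES, FAULT_POSITIONS_PCT, INCEPTION_ANGLES_DEG
        )
    ]


cases = build_cases()
cases_per_winding = len(cases) // len(WINDINGS)
print(len(cases), cases_per_winding)
```

Under this assumption there are 324 cases per winding, which equals the 216 training sets plus 108 test sets reported in Section 4. The agreement is suggestive, but the paper does not confirm that the data sets were built this way.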
The primary and secondary current waveforms can then be simulated using ATP/EMTP, and these waveforms are interfaced to MATLAB/Simulink for the construction of the fault diagnosis process. Figure 5 illustrates an example of a phase A to ground fault at 10% of the length of the high voltage winding, while a phase A to ground fault at 10% of the length of the low voltage winding is shown in Figure 6.

Figure 5. Primary and secondary currents for a case of phase A to ground fault at 10% in length of the high voltage winding.

Figure 6. Primary and secondary currents for a case of phase A to ground fault at 10% in length of the low voltage winding.

With the fault signals obtained from the simulations, the differential currents, which are the difference between the primary current and the secondary current in all three phases as well as the zero sequence, are calculated, and the resultant current signals are decomposed using the Wavelet transform. The coefficients of the signals obtained from the Wavelet transform are squared for a more explicit comparison. Figure 7 illustrates an example of an extraction using the Wavelet transform for the differential currents and zero sequence current from scale 1 to scale 5 for a case of phase A to ground fault at 10% of the length of the high voltage winding, while the corresponding case for the low voltage winding is shown in Figure 8.

Figure 7. Wavelet transform of differential currents (Winding to ground fault at 10% in length of the high voltage winding)

Figure 8. Wavelet transform of differential currents (Winding to ground fault at 10% in length of the low voltage winding)

3. Fault Detection Algorithm

After applying the Wavelet transform to the differential currents, the comparison of the coefficients from each scale is considered. The Wavelet transform is applied to the quarter cycle of the current waveforms after the fault inception.
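The signal processing described above can be sketched in a few lines. The example is an illustration only: it assumes the primary and secondary currents have already been referred to a common base, it uses a single-level Haar wavelet because the mother wavelet is not named in this section, and it assumes a 200 kHz sampling rate (a 5 µs step) and a 50 Hz system frequency. All function and constant names are hypothetical.

```python
import math

SAMPLE_RATE_HZ = 200_000   # assumption: 5 microsecond sampling step
SYSTEM_FREQ_HZ = 50        # assumption: 50 Hz power system


def differential(primary, secondary):
    """Sample-by-sample difference between primary and secondary currents."""
    return [p - s for p, s in zip(primary, secondary)]


def haar_detail_scale1(signal):
    """Scale-1 detail coefficients of a single-level Haar DWT (assumed wavelet)."""
    usable = len(signal) - len(signal) % 2   # drop a trailing odd sample
    return [
        (signal[i] - signal[i + 1]) / math.sqrt(2.0)
        for i in range(0, usable, 2)
    ]


def squared(coefficients):
    """Square each coefficient, as done in the text for a clearer comparison."""
    return [c * c for c in coefficients]


# Scale 1 of a dyadic DWT covers the upper half of the band below Nyquist.
scale1_band_hz = (SAMPLE_RATE_HZ / 4, SAMPLE_RATE_HZ / 2)
# Number of samples in the quarter cycle analysed after fault inception.
quarter_cycle_samples = SAMPLE_RATE_HZ // (4 * SYSTEM_FREQ_HZ)

if __name__ == "__main__":
    i_primary = [1.0, 2.0, 3.0, 5.0]      # made-up sample values
    i_secondary = [0.5, 0.5, 0.5, 0.5]    # made-up sample values
    d = differential(i_primary, i_secondary)
    print(squared(haar_detail_scale1(d)))
    print(scale1_band_hz, quarter_cycle_samples)
```

With the assumed sampling rate, scale 1 spans 50 to 100 kHz, which agrees with the scale 1 band quoted later in this section, and a quarter cycle contains 1000 samples.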
After several trial and error processes, the decision algorithm was constructed as a computer program. The most appropriate decision algorithm, based on all results from the case studies of the system under investigation, is summarized in Figure 9,

where,
scale = indicator of the DWT scale considered for detecting the fault
$X^{\mathrm{diff}}_{(t+5\,\mu\mathrm{s})}$ = coefficient from the Wavelet transform for the differential current detected from phase X at the time t + 5 µs
$X^{\mathrm{diff}}_{\max(0 \to t)}$ = maximum coefficient from the Wavelet transform for the differential current detected from phase X over the time from t = 0 to t = t + 5 µs
$X^{\mathrm{diff}}_{\mathrm{chk}}$ = comparison indicator for a change in the coefficient from the Wavelet transform ($A^{\mathrm{diff}}_{\mathrm{check}}$, $B^{\mathrm{diff}}_{\mathrm{check}}$, $C^{\mathrm{diff}}_{\mathrm{check}}$), used for separation between normal conditions and faults
$t_1$ = 5 µsec (depending on the sampling time used in ATP/EMTP)

[Figure 9 flowchart; recoverable steps: Start; differential current signal (1. all three phases, 2. zero sequence); Wavelet scale 1-5; for scale = 1 (then scale = scale + 1) and for t = 5 µs : 100 ms (then t = t + 5 µs), test $X^{\mathrm{diff}}_{(t+5\,\mu\mathrm{s})} \ge 5 \cdot X^{\mathrm{diff}}_{\max(0 \to t)}$; Yes: $X^{\mathrm{diff}}_{\mathrm{chk}} = 1$ and record the time (t + 5 µs); No: $X^{\mathrm{diff}}_{\mathrm{chk}} = 0$ and find $X^{\mathrm{diff}}_{\max(0 \to t)}$; test t = 100 ms; sum of differential current in each phase; test $X^{\mathrm{diff}}_{\mathrm{chk}} \ge 2$; Yes: fault condition; No: normal condition.]
Figure 9. Flowchart for detecting the phase with a fault condition

By performing many simulations, it has been found that when applying the previously detailed algorithm for detecting internal faults at the transformer winding, the coefficient in scale 1 (50-100 kHz) from the DWT seems sufficient to indicate the internal fault inception of the transformer. As a result, it is unnecessary to use coefficients from higher scales in this algorithm, and the coefficients in scale 1 from the DWT are used in the training processes for the neural networks later.

4.
Neural Network Decision Algorithm

From the simulated signals, the DWT is applied to the quarter cycle of the differential current waveforms after the fault inception. The coefficients of scale 1 obtained using the wavelet transform are used for the training and test processes of the BPNN. In this paper, the structure of the back-propagation neural network consists of an input layer, two hidden layers and an output layer as shown in Figure 10. Each layer is connected with weights and biases. In addition, the activation function is a key factor in the artificial neural network structure. The choice of activation function can change the behavior of the back-propagation neural network considerably. Hence, the activation functions in the hidden layers and the output layer are varied as illustrated in Table 1 in order to select the best activation function for locating the positions of winding to ground faults along the windings of a two-winding transformer.

TABLE 1: Activation functions in all hidden layers and output layers for training neural networks

| Activation function in first hidden layer | Activation function in second hidden layer | Activation function in output layer |
|---|---|---|
| Hyperbolic tangent sigmoid function | Logistic sigmoid function | Linear function |
| Hyperbolic tangent sigmoid function | Logistic sigmoid function | Logistic sigmoid |
| Hyperbolic tangent sigmoid function | Logistic sigmoid function | Hyperbolic tangent sigmoid |
| Hyperbolic tangent sigmoid function | Hyperbolic tangent sigmoid function | Linear function |
| Hyperbolic tangent sigmoid function | Hyperbolic tangent sigmoid function | Logistic sigmoid |
| Hyperbolic tangent sigmoid function | Hyperbolic tangent sigmoid function | Hyperbolic tangent sigmoid |
| Logistic sigmoid function | Logistic sigmoid function | Linear function |
| Logistic sigmoid function | Logistic sigmoid function | Logistic sigmoid |
| Logistic sigmoid function | Logistic sigmoid function | Hyperbolic tangent sigmoid |
| Logistic sigmoid function | Hyperbolic tangent sigmoid function | Linear function |
| Logistic sigmoid function | Hyperbolic tangent sigmoid function | Logistic sigmoid |
| Logistic sigmoid function | Hyperbolic tangent sigmoid function | Hyperbolic tangent sigmoid |

A training process was performed using neural network toolboxes in MATLAB. It can be divided into three parts as follows [13]:

1. The feedforward input pattern, which propagates data from the input layer through the hidden layers and finally to the output layer in order to calculate the responses to the input patterns, as illustrated in Equations 5 and 6.
\[ a^{2} = f^{2}\left( lw^{2,1} * f^{1}\left( iw^{1,1} * p + b^{1} \right) + b^{2} \right) \tag{5} \]

\[ o/p_{ANN} = f^{3}\left( lw^{3,2} * a^{2} + b^{3} \right) \tag{6} \]

[Figure 10 diagram; layers from left to right: input layer (inputs p1 to pR), first hidden layer (weights iw1,1, biases b1, activation f1, outputs a1), second hidden layer (weights lw2,1, biases b2, activation f2, outputs a2), output layer (weights lw3,2, biases b3, activation f3, outputs a3).]
Figure 10. Back-propagation with two hidden layers

where,
p = input vector of BPNN
iw1,1 = weights between the input and the first hidden layer
lw2,1 = weights between the first and the second hidden layers
lw3,2 = weights between the second hidden layer and the output layer
b1, b2 = biases in the first and the second hidden layers respectively
b3 = bias in the output layer
f1, f2 = activation function (Hyperbolic tangent sigmoid function: tanh)
f3 = activation function (Linear function)

2. The back-propagation of the associated error between the outputs of the neural network and the target outputs. The error is fed to all neurons in the next lower layer, and is also used to adjust the weights and biases.

3. The adjustment of the weights and biases by Levenberg-Marquardt (trainlm). This process aims to match the calculated outputs to the target outputs.

The mean absolute percentage error (MAPE), used as an index of the efficiency of the back-propagation neural networks, is computed as in Equation 7.

\[ MAPE = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{o/p_{ANN_i} - o/p_{TARGET_i}}{o/p_{TARGET_i}} \right| \times 100\% \tag{7} \]

where, n = number of test sets

As a result, the structure of the back-propagation neural network consists of 4 input neurons, two hidden layers and 1 output neuron.
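Equations 5 to 7 can be made concrete with a small pure-Python sketch. It is not the MATLAB implementation used in the paper: the weights in the usage example are arbitrary placeholders rather than trained values, and the helper names are hypothetical. The sample outputs passed to the MAPE function are the four predictions listed in Table 4.

```python
import math


def layer(weights, bias, inputs, activation):
    """One fully connected layer: activation(W * inputs + b), row by row."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, bias)
    ]


def forward(p, iw11, b1, lw21, b2, lw32, b3):
    """Equations 5 and 6 with tanh hidden layers and a linear output layer."""
    a1 = layer(iw11, b1, p, math.tanh)         # first hidden layer
    a2 = layer(lw21, b2, a1, math.tanh)        # second hidden layer (Eq. 5)
    return layer(lw32, b3, a2, lambda n: n)    # linear output layer (Eq. 6)


def mape(outputs, targets):
    """Equation 7: mean absolute percentage error over n test sets."""
    n = len(targets)
    return 100.0 / n * sum(abs((o - t) / t) for o, t in zip(outputs, targets))


if __name__ == "__main__":
    # Placeholder weights: 4 inputs, 2 neurons per hidden layer, 1 output.
    iw11 = [[0.1, 0.2, 0.3, 0.4], [-0.4, -0.3, -0.2, -0.1]]
    lw21 = [[0.5, -0.5], [0.25, 0.75]]
    lw32 = [[1.0, -1.0]]
    print(forward([1.0, 0.5, 0.25, 0.0], iw11, [0.0, 0.0],
                  lw21, [0.0, 0.0], lw32, [0.5]))
    # Predicted and actual positions from Table 4.
    print(mape([0.2005, 0.4054, 0.6042, 0.8025], [0.2, 0.4, 0.6, 0.8]))
```

For the four Table 4 values the MAPE evaluates to about 0.65%.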
The input patterns are the maximum detail coefficients (cD1) in scale 1 at ¼ cycle of phases A, B, C and the zero sequence for the post-fault differential currents, as mentioned in the previous section. The output variable of the neural networks takes values in the range 0.1 to 0.9, which correspond to the position along the winding at which the fault occurs. In this training process, the number of neurons in both hidden layers was increased and the activation functions in all hidden layers and the output layer were varied in order to select the best performance. Input data sets are normalized and divided into 216 sets for training and 108 sets for tests. During the training process, the weights and biases were adjusted, and there were 20,000 iterations in order to compute the best value of MAPE. The number of neurons in both hidden layers was increased before repeating the cycle of the training process. The training procedure was stopped when the final number of neurons for the first hidden layer was reached or the MAPE of the test sets was less than 0.5%. The training process is summarized in Figure 11.

Figure 11. Flowchart for the training process.

Figure 12. Comparison of average error for fault position at various activation functions between each transformer windings.
Table 2: Average error of test sets for locating of fault

| Activation function in the first hidden layer | Activation function in the second hidden layer | Activation function in the output layer | HV winding: maximum error | HV winding: minimum error | HV winding: average error | LV winding: maximum error | LV winding: minimum error | LV winding: average error |
|---|---|---|---|---|---|---|---|---|
| Hyperbolic tangent sigmoid function | Logistic sigmoid function | Linear function | 0.0414 | 0.0000 | 0.0099 | 0.1693 | 0.0000 | 0.0309 |
| Hyperbolic tangent sigmoid function | Hyperbolic tangent sigmoid function | Linear function | 0.0322 | 0.0001 | 0.0089 | 0.0621 | 0.0001 | 0.0211 |
| Logistic sigmoid function | Logistic sigmoid function | Linear function | 0.0497 | 0.0000 | 0.0094 | 0.1759 | 0.0000 | 0.0307 |
| Logistic sigmoid function | Hyperbolic tangent sigmoid function | Linear function | 0.0483 | 0.0001 | 0.0098 | 0.1709 | 0.0006 | 0.0377 |

After the training process, the algorithm was employed in order to locate fault positions in the transformer windings. Case studies were varied so that the capability of the algorithm could be verified. Case studies were performed with various types of fault at each position in the transformer. The total number of case studies was 216. The results obtained with the various activation functions on the test sets for both the high voltage and low voltage windings are shown in Figure 12. From Figure 12, it can be seen that there are four combinations of activation functions with an average error of less than 5%, as follows:
- Hyperbolic tangent – Logistic – Linear.
- Hyperbolic tangent – Hyperbolic tangent – Linear.
- Logistic – Logistic – Linear.
- Logistic – Hyperbolic tangent – Linear.

When the training process was completed, the algorithm was implemented to locate fault positions due to winding to ground faults along the transformer windings. The results obtained from the algorithm proposed in this paper are shown in Table 2. It can be seen that the accuracy of the Hyperbolic tangent – Hyperbolic tangent – Linear combination is highly satisfactory. Figure 13 compares the average error at various lengths of the winding among the four combinations of activation functions.
It can be seen that the average error of fault locations for the high voltage winding is 2.5% while the average error for the low voltage winding is 6% at various lengths of the transformer winding.

(a) High Voltage (b) Low Voltage
Figure 13. Comparison of average errors for fault positions at various lengths of windings among various activation functions.

(a) High Voltage (b) Low Voltage
Figure 14. Comparison of average error for fault position at various lengths of the winding among the phases in which the fault occurs, with Hyperbolic tangent – Hyperbolic tangent – Linear as the activation functions in the layers.

From Figure 14, it can be seen that when the Hyperbolic tangent – Hyperbolic tangent – Linear activation functions are tested with various fault types on both the high voltage and low voltage windings of the three-phase transformer, the accuracy of the fault locations predicted by the decision algorithm is highly satisfactory, as also shown in Tables 3-6.

Table 3: Results of phase A to ground fault at high voltage winding with various inception angles. (Fault position at 10% in length of the winding)

| Fault type | Inception angle (degree) | Actual position (fraction of winding length) | Prediction: output | Prediction: \|Error\| |
|---|---|---|---|---|
| Phase A to ground fault (HV) | 90 | 0.1 | 0.1125 | 0.0125 |
| Phase A to ground fault (HV) | 150 | 0.1 | 0.1243 | 0.0243 |
| Phase A to ground fault (HV) | 240 | 0.1 | 0.1121 | 0.0121 |
| Phase A to ground fault (HV) | 300 | 0.1 | 0.1169 | 0.0169 |

Table 4: Results of phase A to ground fault at high voltage winding with various lengths of the winding. (Inception angle of 240°)

| Fault type | Inception angle (degree) | Actual position (fraction of winding length) | Prediction: output | Prediction: \|Error\| |
|---|---|---|---|---|
| Phase A to ground fault (HV) | 240 | 0.2 | 0.2005 | 0.0005 |
| Phase A to ground fault (HV) | 240 | 0.4 | 0.4054 | 0.0054 |
| Phase A to ground fault (HV) | 240 | 0.6 | 0.6042 | 0.0042 |
| Phase A to ground fault (HV) | 240 | 0.8 | 0.8025 | 0.0025 |

Table 5: Results of phase A to ground fault at low voltage winding with various inception angles.
(Fault position at 10% in length of the winding)

| Fault type | Inception angle (degree) | Actual position (fraction of winding length) | Prediction: output | Prediction: \|Error\| |
|---|---|---|---|---|
| Phase A to ground fault (LV) | 60 | 0.1 | 0.0886 | 0.0114 |
| Phase A to ground fault (LV) | 120 | 0.1 | 0.136 | 0.036 |
| Phase A to ground fault (LV) | 210 | 0.1 | 0.1386 | 0.0386 |
| Phase A to ground fault (LV) | 330 | 0.1 | 0.0568 | 0.0432 |

Table 6: Results of phase A to ground fault at low voltage winding with various lengths of the winding. (Inception angle of 210°)

| Fault type | Inception angle (degree) | Actual position (fraction of winding length) | Prediction: output | Prediction: \|Error\| |
|---|---|---|---|---|
| Phase A to ground fault (LV) | 210 | 0.2 | 0.1943 | 0.0057 |
| Phase A to ground fault (LV) | 210 | 0.4 | 0.4265 | 0.0265 |
| Phase A to ground fault (LV) | 210 | 0.6 | 0.5871 | 0.0129 |
| Phase A to ground fault (LV) | 210 | 0.8 | 0.7968 | 0.0032 |

5. Conclusion

In this paper, the selection of an appropriate activation function for the decision algorithm used in the detection of internal fault locations along transformer windings has been studied. The maximum coefficients from the first scale at ¼ cycle of phases A, B, and C of the post-fault differential current signals and the zero sequence current, obtained by the wavelet transform, have been used as the input for the training process of a back-propagation neural network in the decision algorithm. The activation functions in the hidden layers and the output layer have been varied, and the results obtained from the decision algorithm have been investigated with the variation of fault inception angles, fault types and fault locations. The results have illustrated that the use of the Hyperbolic tangent sigmoid function in the first and the second hidden layers with the Linear function in the output layer is the most appropriate scheme for the internal fault detection of the transformer windings, as summarized in Table 2. This technique should be useful in checking and repairing the transformer when winding to ground faults occur. Further work will be the improvement of the algorithm so that positions of interturn faults along the windings of the transformer can be identified.

References

[1] A.G. Phadke and J.S.
Thorp, A new computer-based flux restrained current-differential relay for power transformer protection, IEEE Trans. Power Appar. Syst. PAS-102 (5) (1983) 3624-3629. [2] T.S. Sidhu and M.S. Sachdev, On-line identification of magnetizing inrush current and internal faults in three-phase transformers, IEEE Trans. Power Delivery 7 (4) (1992) 1885-1891. [3] Y. Zhang, X. Ding, Y. Liu and P.J. Griffin, An artificial neural network approach to transformer fault diagnosis, IEEE Trans. Power Delivery 11 (4) (1996) 1836-1841. [4] M.G. Morante and D.W. Nocoletti, A wavelet-based differential transformer protection, IEEE Trans. Power Delivery 14 (4) (1999) 1352-1358. [5] O.A.S. Youssef, A wavelet-based technique for discrimination between faults and magnetizing inrush currents in transformers, IEEE Trans. Power Delivery 18 (1) (2003) 170-176. [6] P. Bastard, P. Bertrand and M. Meunier, A transformer model for winding fault studies, IEEE Trans. Power Delivery 9 (2) (1994) 690-699. [7] S. M. Islam and G. Ledwich, Locating transformer faults through sensitivity analysis of high frequency modeling using transfer function approach, IEEE International Symposium on Electrical Insulation, (1996) 38-41. [8] Z. Q. Bo, M. A. Redfern and G. C. Weller, Positional Protection of Transmission Line Using Fault Generated High Frequency Transient Signals, IEEE Trans. Power Delivery 15 (3) (2000) 888-894. [9] P. Makming, S. Bunjongjit, A. Kunakorn, S. Jiriwibhakorn and M. Kando, Fault diagnosis in transmission lines using wavelet transforms, in: Proceedings of IEEE Transmission and Distribution Conference, pp. 2246-2250, 2002. [10] IEEE working group 15.08.09, Modeling and analysis of system transients using digital programs, IEEE PES special publication. [11] ABB Thailand, Test report no. 56039. [12] “Switching and Transmission Line Diagram”, Electricity Generating Authority of Thailand (EGAT). [13] A.
Ngaopitakkul and A. Kunakorn, “Internal Fault Classification in Transformer Windings using Combination of Discrete Wavelet Transforms and Back-propagation Neural Networks,” International Journal of Control, Automation, and Systems (IJCAS), pp. 365-371, June 2006.

Authors Profile

Atthapol Ngaopitakkul graduated with B.Eng, M.Eng and D.Eng in electrical engineering from King Mongkut’s Institute of Technology Ladkrabang, Bangkok, Thailand in 2002, 2004 and 2007 respectively. His research interests are the applications of wavelet transforms and neural networks in power system analysis.

Anantawat Kunakorn graduated with B.Eng (Hons) in electrical engineering from King Mongkut’s Institute of Technology Ladkrabang, Bangkok, Thailand in 1992. He received his M.Sc in electrical power engineering from University of Manchester Institute of Science and Technology, UK in 1996, and Ph.D. in electrical engineering from Heriot-Watt University, Scotland, UK in 2000. He is currently an associate professor at the department of electrical engineering, King Mongkut’s Institute of Technology Ladkrabang, Bangkok, Thailand. He is a member of IEEE and IEE. His research interest is electromagnetic transients in power systems.

A Proposed SAFER Plus Security Algorithm using Fast Pseudo Hadamard Transform (FPHT) with Maximum Distance Separable Code for Bluetooth Technology

D. Sharmila1, R. Neelaveni2
1 (Research Scholar), Associate Professor, Bannari Amman Institute of Technology, Sathyamangalam, Tamil Nadu-638401. sharmiramesh@rediffmail.com
2 Asst. Prof., PSG College of Technology, Coimbatore, Tamil Nadu-638401. rnv@eee.psgtech.ac.in

Abstract: In this paper, various security algorithms, namely pipelined AES, triple DES, Elliptic Curve Diffie Hellman (ECDH), the Existing SAFER plus and the Proposed SAFER+ algorithm, are compared for Bluetooth security systems.
The performance of the above security algorithms is evaluated based on three parameters: data throughput, frequency and security level. The existing SAFER+ algorithm is modified to achieve higher data throughput, frequency and security level. On comparison, the Proposed SAFER+ algorithm proves to be better for implementation in Bluetooth devices than the existing algorithms.

Keywords: Secure And Fast Encryption Routine, Triple Data Encryption Standard, Pipelined Advanced Encryption Standard, Elliptic Curve Diffie Hellman, Pseudo Hadamard Transform, Encryption and Decryption.

1. Introduction

Wireless communication is one of the most vibrant areas in the communication field today. While it has been a topic of study since the 1960s, the past decade has seen a surge of research activity in the area. This is due to the confluence of several factors. First, there has been an explosive increase in the demand for tetherless connectivity, driven so far mainly by cellular telephony but expected to be soon eclipsed by wireless data applications. Second, the dramatic progress in VLSI technology has enabled small-area and low-power implementation of sophisticated signal processing algorithms and coding techniques. Third, the success of third-generation (3G) digital wireless standards, in particular the IS-95 Code Division Multiple Access (CDMA) standard, provides a concrete demonstration that good ideas from communication theory can have a significant impact in practice. Wireless communication technology has advanced at a very fast pace during the last years, creating new applications and opportunities. In addition, the number of computing and telecommunications devices is increasing. Special attention has to be given to connecting these devices efficiently. Electronic devices can communicate without wires. In the past, cable and infrared light connectivity methods were used. The cable solution is complicated since it requires special connectors, cables and space.
This produces many malfunctions and connectivity problems. The infrared solution requires line of sight. The drawbacks of cables are:
· A tangle of cables
· Varying standards of cables and connectors
· Unreliable galvanic connections
· Need to keep cables and connectors in store
· Awkward to move computerized units to different locations, as cables might not be long enough
· Need for manual switches when the number of physical ports is not sufficient for the need at hand

There are several security algorithms available to ensure security in wireless network devices. Some of the major methods are AES, DES, Triple DES, IDEA, BLOWFISH, SAFER+, ECDH etc. The SAFER+ algorithm is based on the existing SAFER family of ciphers. Although SAFER+ is the most widely used algorithm, it seems to have some vulnerabilities. The objective is to compare various security algorithms: pipelined AES, triple DES, Elliptic Curve Diffie Hellman (ECDH), the existing SAFER+ and the proposed SAFER+ algorithm.

The proposed SAFER+ algorithm has a rotation block between every round of the existing SAFER+, the PHT is replaced by the Fast Pseudo Hadamard Transform (FPHT), the first-round inputs are added to or XORed with the third-round inputs, and the fifth-round inputs are added to or XORed with the seventh-round inputs. As a result, the proposed SAFER+ algorithm achieves higher data throughput and frequency than the existing algorithms.

In this paper, Section 2 gives an overview of the Bluetooth security architecture. Section 3 deals with the existing SAFER+ algorithm. The proposed SAFER+ algorithm is explained in Section 4. Section 5 presents the results, and Section 6 gives the conclusion.

2. Bluetooth Architecture

Connection types define the possible ways Bluetooth devices can exchange data.
Bluetooth has three connection types: ACL (Asynchronous Connection-Less), SCO (Synchronous Connection-Oriented) and eSCO (extended Synchronous Connection-Oriented). ACL links are for symmetric (maximum of 1306.9 kb/s in both directions) or asymmetric (maximum of 2178.1 kb/s for send and 177.1 kb/s for receive) data transfer. Retransmission of packets is used to ensure the integrity of data. SCO links are symmetric (maximum of 64 kb/s in both directions) and are used for transferring real-time two-way voice. Retransmission of voice packets is not used. Therefore, when the channel BER is high, voice can be distorted. eSCO links are also symmetric (maximum of 864 kb/s in both directions) and are used for transferring real-time two-way voice. Retransmission of packets is used to ensure the integrity of data (voice). Because retransmission of packets is used, eSCO links can also carry data packets, but they are mainly used for real-time two-way voice. Only Bluetooth 1.2 or 2.0+EDR devices can use eSCO links, but SCO links must also be supported to provide backward compatibility. Bluetooth devices that communicate with each other form a piconet. [7] [8] [9].but only two of them actually provide confidentiality. The modes are as follows:

3. Description of SAFER Plus Algorithm

The SAFER+ (Secure And Fast Encryption Routine) algorithm is based on the existing SAFER family of ciphers, which comprises the ciphers SAFER K-64, SAFER K-128 and SAFER SK-128. All are byte-oriented block encryption algorithms characterized by the following two properties. First, they use an unorthodox linear transformation, called the Pseudo-Hadamard Transformation (PHT), for the desired diffusion. Second, they use additive constant factors (bias vectors) in the key scheduling to avoid weak keys [5]. The implementation consists of two main units: the encryption data path and the key-scheduling unit. The key-scheduling unit allows on-the-fly computation of the round keys.
To reduce the silicon area, we used eight loops of a key-scheduling single-round implementation. Round keys are applied in parallel in the encryption data path. The full SAFER+ algorithm execution requires eight loops of the single round. We chose the single-round hardware implementation because, with this minimum silicon area, we could achieve the required throughput. The encryption data path’s first component is an input register, which combines the plaintext and the feedback data produced in the previous round. The input register feeds the SAFER+ single round.

3.1 SAFER+ Single round

A SAFER+ single round has four subunits:
· The mixed XOR/addition subunit, which combines data with the appropriate round subkey $K_{2r-1}$.
· The non-linear layer (use of the non-linear functions e and l). The e function is implemented as $y = 45^x$ in GF(257), except that $45^{128} = 0$. The l function is implemented as $y = \log_{45}(x)$ in GF(257), except that $\log_{45}(0) = 128$.
· The mixed addition/XOR subunit, which combines data with the round subkey $K_{2r}$.
· The four linear Pseudo-Hadamard Transformation layers, connected through an “Armenian Shuffle” permutation.

Figure 1. SAFER Plus single round

The non-linear layer is implemented using a data-mapping component that produces the X1 and X2 bytes. These bytes are the inputs of the non-linear functions e and l. During one round, we execute e and l eight times. This design significantly reduces the required silicon area. Each function is implemented using 256 bytes of ROM. After the SAFER+ single round in the encryption data path comes a mixed XOR/addition (or key odd addition) component.

4. Proposed SAFER Plus Algorithm

Figure 2. Proposed SAFER+ for encryption

The existing SAFER+ algorithm is modified to provide higher data throughput and frequency.
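The non-linear functions e and l of Section 3.1 can be illustrated with a minimal sketch that builds the two 256-byte lookup tables in software, in the spirit of the ROM tables mentioned above. The function and variable names (`build_safer_tables`, `EXP`, `LOG`) are hypothetical and chosen only for this illustration; this is not the hardware design of the paper.

```python
def build_safer_tables():
    """Build 256-entry lookup tables for the SAFER+ functions e and l."""
    # e(x) = 45**x mod 257. The only result that does not fit in a byte
    # is 256, reached at x = 128, and it is stored as 0.
    exp_table = [pow(45, x, 257) % 256 for x in range(256)]
    # l is the inverse of e, so l(0) = 128 follows from e(128) = 0.
    log_table = [0] * 256
    for x, y in enumerate(exp_table):
        log_table[y] = x
    return exp_table, log_table


EXP, LOG = build_safer_tables()


def e(x):
    """Exponential box: table lookup on one byte."""
    return EXP[x & 0xFF]


def l(x):
    """Logarithm box: table lookup on one byte."""
    return LOG[x & 0xFF]


if __name__ == "__main__":
    print(e(128), l(0))                          # the two special cases
    print(all(l(e(x)) == x for x in range(256)))  # l undoes e on every byte
```

Because 45 generates the multiplicative group of GF(257), e is a permutation of the byte values, which is why l can be filled in simply by inverting the table.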
The modified SAFER+ algorithm differs from the existing one in three ways:
(i) A rotation block is introduced between every round. Rotation is towards the left for encryption and towards the right for decryption.
(ii) The input of round 1 and the output of round 2 are combined byte-by-byte (Xor/Add Modulo 16) to form the input of round 3. Similarly, the input of round 5 and the output of round 6 are combined byte-by-byte (Xor/Add Modulo 16) to form the input of round 7.
(iii) Instead of the PHT layer, the Fast Pseudo Hadamard Transform (FPHT) is used.

Figure 3. Proposed SAFER+ for Decryption

The encryption and decryption block diagrams are given in Figure 2 and Figure 3. The proposed work replaces the Pseudo Hadamard Transform with the Fast Pseudo Hadamard Transform.

Figure 4. Proposed SAFER+ single round

Figure 4 shows the proposed SAFER+ single round architecture. Here, fast algorithms for the Pseudo Hadamard Transform with MDS code are used instead of the PHT to implement pattern matching most efficiently. The FPHT has been analyzed with respect to its speed and security. The transform has a provably bounded branch value for any given dimension as well as a fast implementation which requires at most O(N log N) time to complete. It is possible to join the FPHT and MDS to create a fast transform that has a higher branch than the FPHT alone.

4.1 Fast Pseudo Hadamard Transform Algorithm

The FPHT has several efficient means of implementation, which make the design construct very flexible. The FPHT can be characterized by a recursive linear transform defined by the relationship (1). It is provably non-singular since the two vectors [2, 1] and [1, 1] are linearly independent.

An emerging block cipher and one-way hash function design construct is the Maximum Distance Separable (MDS) code.
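As an illustration of the recursive FPHT of Section 4.1, the following sketch applies a 2-point butterfly layer by layer. It rests on an assumption: the text of relationship (1) is not reproduced here, so the sketch takes the base transform to be the 2x2 matrix with rows [2, 1] and [1, 1] (the two vectors named above) and builds larger transforms by repeating it across log2(N) layers, with arithmetic modulo 256 on bytes. The function names `fpht` and `fpht_inverse` are hypothetical.

```python
def fpht(block, modulus=256):
    """Fast pseudo-Hadamard transform of a list whose length is a power of two.

    Assumed base step: (a, b) -> (2a + b, a + b), applied in log2(N) layers,
    for a total of O(N log N) operations.
    """
    out = list(block)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("block length must be a power of two")
    step = 1
    while step < n:
        for start in range(0, n, 2 * step):
            for i in range(start, start + step):
                a, b = out[i], out[i + step]
                out[i] = (2 * a + b) % modulus
                out[i + step] = (a + b) % modulus
        step *= 2
    return out


def fpht_inverse(block, modulus=256):
    """Undo fpht: each layer inverts (2a + b, a + b) back to (a, b)."""
    out = list(block)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("block length must be a power of two")
    step = 1
    while step < n:
        for start in range(0, n, 2 * step):
            for i in range(start, start + step):
                u, v = out[i], out[i + step]
                out[i] = (u - v) % modulus          # a = u - v
                out[i + step] = (2 * v - u) % modulus  # b = 2v - u
        step *= 2
    return out


if __name__ == "__main__":
    data = list(range(16))  # one 16-byte block
    mixed = fpht(data)
    print(mixed)
    print(fpht_inverse(mixed) == data)
```

The layered structure is what keeps the cost at O(N log N): each of the log2(N) layers touches every byte once, and only the 2-point step has to be implemented directly.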
The goal of the MDS code is to promote a high branch through the linear components of the design to ensure a correspondingly low differential and linear “prop-ratio”. An MDS code of dimension N×N requires $O(N^2)$ time to complete. MDS and FPHT codes can be combined to produce fast transforms with branch numbers much higher than those of an unmodified FPHT of comparable dimension. Any FPHT requires at most $O(n \log n)$ time to complete, which scales nicely compared to an equal-dimension MDS code, which requires $O(n^2)$ time. More specifically, with O(n) space an FPHT requires only O(log n) time to complete. In hardware designs the actual transform is very efficient, since only the H1 transform must be implemented directly, as shown in Figure 5. A trivial multiplication by p(x) = x is all that is required.

Figure 5. H3 as a three layer network

5. Results

Various existing algorithms are analyzed and compared with the proposed algorithm based on parameters such as encryption frequency, data throughput and security level, and the results are shown in the bar charts.

Figure 6. Encryption Time Vs Various Algorithms (bar chart; encryption time in ms: Triple DES 99, Pipelined AES 88.33, ECDH 78.08, Safer Plus 65.44, Safer Plus (FPHT with MDS) 58.33)

Based on the analysis, the modified Safer Plus algorithm requires the minimum encryption time and achieves the maximum encryption frequency compared to all the existing algorithms, due to the inclusion of the FPHT instead of the PHT layer, as shown in Figure 6 and Figure 7.

Figure 7. Encryption Frequency Vs Various Algorithms (bar chart; encryption frequency in kHz: Triple DES 10.1, Pipelined AES 12.56, ECDH 14.82, Safer Plus 16.63, Safer Plus (FPHT with MDS) 19.14)

The number of hits required to attack the various algorithms is also compared and shown in Figure 8.
The security level is much enhanced for the proposed algorithm, since the number of hits is found to be maximum due to the introduction of the rotation block between every round.

Figure 8. No. of Hit counts Vs Various algorithms (bar chart; hit counts for attack: Triple DES 943999, Pipelined AES 688787, ECDH 722519, Safer Plus 873736, Safer Plus (FPHT with MDS) 1125852)

Figure 9. Data throughput Vs various algorithms (bar chart; data throughput in bytes for Triple DES, Pipelined AES, ECDH, Safer Plus, Safer Plus (FPHT with MDS))

Figure 9 shows that the modified Safer Plus algorithm has comparatively higher data throughput, because the input of round 1 and the output of round 2 are combined by an Xor/Add operation to form the input of round 3.

6. Conclusion

In this paper, a modified SAFER Plus algorithm is proposed by replacing the PH transform with the FPH transform and MDS code and by introducing a rotation block for every round. The existing security algorithms are compared with the proposed SAFER Plus algorithm. The entire design is captured in J2ME. The efficiency of the algorithm is evaluated by analyzing parameters such as encryption time, encryption frequency, data throughput and security level. On comparison, the modified SAFER Plus algorithm proved to be better suited for implementation in Bluetooth devices than the existing algorithms.

References
[1] Paraskevas Kitsos, Nicolas Sklavos, Kyriakos Papadomanolakis and Odysseas Koufopavlou, University of Patras, Greece, “Hardware Implementation of Bluetooth Security,” IEEE CS and IEEE Communications Society, January to March 2003, pp. 21-29.
[2] Karen Scarfone and John Padgette, “Guide to Bluetooth Security,” National Institute of Standards and Technology Special Publication 800-121, U.S. Department of Commerce, 43 pages.
[3] Juha T. Vainio, “Bluetooth Security,” Helsinki University of Technology, 25 May 2000.
[4] Gyongsu Lee, “Bluetooth Security Implementation based on Software Oriented Hardware-Software Partition,” IEEE journal, 2005, pp. 2070-2074.
[5] James Kardach, “Bluetooth Architecture Overview,” Intel Technology Journal, 2000.
[6] Jyrki Oraskari, “Bluetooth versus WLAN IEEE 802.11x,” Helsinki University of Technology, October 2001.
[7] A. Laurie and B. Laurie, “Serious flaws in Bluetooth security lead to disclosure of personal data,” http://bluestumbler.com.
[8] Brent A. Miller and Chatschik Bisdikian, “Bluetooth Revealed,” low price edition.
[9] Wikipedia.org, “Bluetooth,” Wikipedia.org, 5 March 2005, http://en.wikipedia.org/wiki/Bluetooth (21 February 2005).
[10] Vrije Universiteit Brussel, “Bluetooth security,” PhD thesis, December 2004.
[11] J. L. Massey, “On the Optimality of SAFER+ Diffusion,” Second Advanced Encryption Standard Candidate Conference (AES2), Rome, Italy, March 22-23, available online at http://csrc.nist.gov/encryption/aes/round1/conf2/aes2conf.htm.

(IJCNS) International Journal of Computer and Network Security, Vol. 1, No. 2, November 2009

Mining Fuzzy Multidimensional Association Rules Using Fuzzy Decision Tree Induction Approach*

Rolly Intan1, Oviliani Yenty Yuliana2, Andreas Handojo3
Department of Informatics Engineering, Petra Christian University, Siwalankerto 121-131, Surabaya 60236, Indonesia
1rintan@petra.ac.id, 2ovi@petra.ac.id, 3handojo@petra.ac.id

Abstract: Mining fuzzy multidimensional association rules is one of the important processes in data mining applications. This paper extends the concept of Decision Tree Induction (DTI) to deal with fuzzy values in order to express human knowledge for mining fuzzy multidimensional association rules. DTI, one of the data mining classification methods, is used in this research for predictive problem solving in analyzing patient medical track records. Meaningful fuzzy labels (using fuzzy sets) can be defined for each data domain.
For example, the fuzzy labels poor disease, moderate disease, and severe disease are defined to describe a condition/type of disease. We extend and propose a concept of fuzzy information gain so that the highest information gain can be used for splitting a node. In the process of generating fuzzy multidimensional association rules, we propose some fuzzy measures to calculate their support, confidence and correlation. The designed application helps decision makers analyze and anticipate disease epidemics in a certain area.

Keywords: Data Mining, Classification, Decision Tree Induction, Fuzzy Set, Fuzzy Association Rules.

1. Introduction

Decision Tree Induction (DTI) has been used in machine learning and in data mining as a model for predicting a target value based on a given relational database. There are some commercial decision tree applications, such as the application for analyzing the repayment of a loan for owning or renting a house [16] and the application for software quality classification based on program module risk [17]. Both applications inspired this research to develop an application for analyzing patient medical track records. The application is able to present relations among (single or grouped) values of patient attributes in a decision tree diagram. In the developed application, some data domains need to be described by meaningful fuzzy labels. For example, the fuzzy labels poor disease, moderate disease, and severe disease describe a condition/type of disease; young, middle aged and old are used as the fuzzy labels of ages. Here, a fuzzy set is defined to express a meaningful fuzzy label. In order to utilize the meaningful fuzzy labels, we need to extend the concept of (crisp) DTI using a fuzzy approach. The extended concept is simply called Fuzzy Decision Tree (FDT). To generate an FDT from a normalized database that consists of several tables, there are several sequential processes, as shown in Figure 1.
First is the process of joining tables, known as Denormalization of Database, as discussed in [4]. The process of denormalization can be carried out based on the relations between tables as presented in the Entity Relationship Diagram (ERD) of a relational database. The result of this process is a general (denormalized) table. Second is the process of constructing the FDT from the denormalized table.

Figure 1. Process of mining association rules

In the process of constructing the FDT, we propose a method to calculate fuzzy information gain by extending the existing concept of (crisp) information gain, so that the highest information gain can be used for splitting a node. The last is the process of mining fuzzy association rules. In this process, fuzzy association rules are mined from the FDT, and we propose some fuzzy measures to calculate their support, confidence and correlation. Minimum support, confidence and correlation can be given to reduce the number of mined fuzzy association rules. The designed application helps decision makers analyze and anticipate disease epidemics in a certain area.

The structure of the paper is as follows. Section 2 discusses the denormalization process of data. Section 3 gives the basic concept of association rules. Definitions and formulations of measures such as support, correlation and confidence, used for determining the interestingness of association rules, are briefly recalled. Section 4, the main contribution of this paper, proposes the concept and algorithm for generating the FDT. Section 5 proposes some equations of fuzzy measures that play an important role in the process of mining fuzzy multidimensional association rules. Section 6 demonstrates the algorithm with simple illustrative results. Finally, a conclusion is given in Section 7.

2. Denormalization Data

In general, the process of mining data for discovering association rules has to start from a single table (relation) as a source of data representing relations among data items. Formally, a relational data table [13] R consists of a set of tuples, where $t_i$ represents the i-th tuple and, if there are n domain attributes D, then $t_i = (d_{i1}, d_{i2}, \ldots, d_{in})$. Here, $d_{ij}$ is an atomic value of tuple $t_i$ restricted to the domain $D_j$, where $d_{ij} \in D_j$. Formally, a relational data table R is defined as a subset of the cross product $D_1 \times D_2 \times \cdots \times D_n$, where $D = \{D_1, D_2, \ldots, D_n\}$. A tuple t (with respect to R) is an element of R. In general, R can be shown as in Table 1.

Table 1: A Schema of Relational Data Table
Tuples   $D_1$      $D_2$      ...   $D_n$
$t_1$    $d_{11}$   $d_{12}$   ...   $d_{1n}$
$t_2$    $d_{21}$   $d_{22}$   ...   $d_{2n}$
...      ...        ...        ...   ...
$t_r$    $d_{r1}$   $d_{r2}$   ...   $d_{rn}$

A normalized database is assumed to be the result of a process of data normalization in a certain data context. The database may consist of several relational data tables that are related to one another. Their relations may be represented by an Entity Relationship Diagram (ERD). Hence, if we need to process some data domains (columns) that belong to different relational data tables, all of the involved tables have to be combined (joined) together to provide a general data table. Since the process of joining tables is the opposite of data normalization, and the resulting general data table is not a normalized table, the process is simply called Denormalization, and the general table is then called a denormalized table. In the process of denormalization, it is not necessary to include all domains (fields) of all combined tables in the target table. Instead, the target denormalized table consists only of the data domains of interest that are needed in the process of mining rules. The process of denormalization can be performed based on two kinds of data relation, as follows.

2.1. Metadata of the Normalized Database

Information about relational tables can be stored in metadata. The metadata can simply be stored and represented by a table. Metadata can be constructed using the information on relational data given in the Entity Relationship Diagram (ERD). For instance, an arbitrary symbolic ERD physical design is shown in Figure 2.

Figure 2. Example of ERD Physical Design

From the example, it is clearly seen that there are four tables: A, B, C and D. Here, all tables are assumed to be independent, for they have their own primary keys. The cardinality of the relationship between Table A and C is assumed to be one to many. The same holds for the relationship between Table A and B as well as Table B and D. Table A consists of four domains/fields, D1, D2, D3 and D4; Table B also consists of four domains/fields, D1, D5, D6 and D7; Table C consists of three domains/fields, D1, D8 and D9; Table D consists of four domains/fields, D10, D11, D12 and D5. Therefore, there are 12 data domains in total, given by D={D1, D2, D3, …, D11, D12}. The relationship between A and B is established through domain D1. Table A and C are also connected through domain D1. On the other hand, the relationship between B and D is established through D5. The relations among A, B, C and D can also be represented by a graph, as shown in Figure 3.

Figure 3. Graph Relation of Entities

Metadata expressing the relations among the four tables of the example is shown in Table 2.

Table 2: Example of Metadata
Table-1   Table-2   Relations
Table A   Table B   {D1}
Table A   Table C   {D1}
Table B   Table D   {D5}

From the metadata given in the example, we may construct six possible kinds of denormalized table, as shown in Table 3.

Table 3: Possibilities of Denormalized Tables
No.   Denormalized Table
1     CA(D1,D2,D3,D4,D8,D9); CA(D1,D2,D8,D9); CA(D1,D3,D4,D9), etc.
2     CAB(D1,D2,D3,D4,D8,D9,D5,D6,D7), CAB(D1,D2,D4,D9,D5,D7), etc.
3     CABD(D1,D2,D3,D4,D5,D6,D7,D8,D9,D10,D11,D12), etc.
4     AB(D1,D2,D3,D4,D5,D6,D7), etc.
5     ABD(D1,D2,D3,D4,D5,D6,D7,D10,D11,D12), etc.
6     BD(D5,D6,D7,D10,D11,D12), etc.

CA(D1,D2,D3,D4,D8,D9) means that Table A and C are joined together, and all their domains participate in the result of the joining process. It is not necessary to include all domains from all joined tables in the result, e.g. CA(D1,D2,D8,D9), CAB(D1,D2,D4,D9,D5,D7) and so on. In this case, which domains are included in the result depends on which domains are needed in the process of mining rules. Since D1, D8 and D5 are the primary keys of Table A, C and B respectively, they must be included in the result, Table CAB.

2.2. Table and Function Relation

It is possible for the user to define a mathematical function (or table) relation connecting two or more domains from two different tables in order to establish a relationship between their entities. Generally, the data relationship function maps one or more domains of an entity to one or more domains of its partner entity. Hence, considering the number of domains involved in the mapping, there are four possible kinds of mapping. Let $A(A_1, A_2, \ldots, A_n)$ and $B(B_1, B_2, \ldots, B_m)$ be two different entities (tables). The four possibilities for a function f performing the mapping are given by:
o One to one relationship: $f: A_i \to B_k$
o One to many relationship: $f: A_i \to B_{p_1} \times B_{p_2} \times \cdots \times B_{p_k}$
o Many to one relationship: $f: A_{r_1} \times A_{r_2} \times \cdots \times A_{r_k} \to B_k$
o Many to many relationship: $f: A_{r_1} \times A_{r_2} \times \cdots \times A_{r_k} \to B_{p_1} \times B_{p_2} \times \cdots \times B_{p_k}$

Obviously, there is no requirement on the type and size of data between the domains in A and the domains in B. All connections, types and sizes of data depend entirely on the function f. The denormalized data is then constructed based on the defined function.

3. Fuzzy Multidimensional Association Rules

Association rule mining finds interesting association or correlation relationships among a large set of data items [1,10]. The discovery of interesting association rules can help in the decision making process. A rule that involves a single predicate is referred to as a single-dimensional or intradimension association rule, since it contains a single distinct predicate with multiple occurrences (the predicate occurs more than once within the rule). The terminology of single-dimensional or intradimension association rule is used in multidimensional databases by regarding each distinct predicate in the rule as a dimension [1]. Here, the method of market basket analysis can be extended and used for analyzing any kind of database. For instance, a database of patient medical track records is analyzed to find associations (correlations) among diseases, taken from the data on complications of several diseases suffered by patients at a certain time. For example, a Boolean association rule “Bronchitis ⇒ Lung Cancer” might be discovered, representing a relation between “Bronchitis” and “Lung Cancer”, which can also be written as a single-dimensional association rule as follows:

Rule-1: $Dis(X, \text{“Bronchitis”}) \Rightarrow Dis(X, \text{“Lung Cancer”})$,

where Dis is a given predicate and X is a variable representing a patient who has a kind of disease (i.e. “Bronchitis” and “Lung Cancer”). In general, “Lung Cancer” and “Bronchitis” are two different data values taken from a certain data attribute, called items. Apriori [1,10] is an influential algorithm for mining frequent itemsets for Boolean (single-dimensional) association rules. Additional related information regarding the identity of patients, such as age, occupation, sex, address, blood type, etc., may also have a correlation to the illness of patients.
Considering each data attribute as a predicate, it can therefore be interesting to mine association rules containing multiple predicates, such as:

Rule-2: $Age(X, \text{“60”}) \wedge Smk(X, \text{“yes”}) \Rightarrow Dis(X, \text{“Lung Cancer”})$,

where there are three predicates, namely Age, Smk (smoking) and Dis (disease). Association rules that involve two or more dimensions or predicates can be referred to as multidimensional association rules. Multidimensional association rules with no repeated predicate, as in Rule-2, are called interdimension association rules [1]. It may also be interesting to mine multidimensional association rules with repeated predicates. These rules are called hybrid-dimension association rules, e.g.:

Rule-3: $Age(X, \text{“60”}) \wedge Smk(X, \text{“yes”}) \wedge Dis(X, \text{“Bronchitis”}) \Rightarrow Dis(X, \text{“Lung Cancer”})$.

To provide a more meaningful association rule, it is necessary to utilize fuzzy sets over a given database attribute; the result is called a fuzzy association rule, as discussed in [4,5]. Formally, given a crisp domain D, any arbitrary fuzzy set (say, fuzzy set A) is defined by a membership function of the form [2,8]:

\[ A: D \to [0, 1]. \tag{1} \]

A fuzzy set may be represented by a meaningful fuzzy label. For example, “young”, “middle-aged” and “old” are fuzzy sets over age, defined on the interval [0, 100] and arbitrarily given by [2]:

\[ young(x) = \begin{cases} 1, & x \le 20 \\ (35 - x)/15, & 20 < x < 35 \\ 0, & x \ge 35 \end{cases} \]

\[ middle\_aged(x) = \begin{cases} 0, & x \le 20 \text{ or } x \ge 60 \\ (x - 20)/15, & 20 < x < 35 \\ (60 - x)/15, & 45 < x < 60 \\ 1, & 35 \le x \le 45 \end{cases} \]

\[ old(x) = \begin{cases} 0, & x \le 45 \\ (x - 45)/15, & 45 < x < 60 \\ 1, & x \ge 60 \end{cases} \]

Using the previous definition of fuzzy sets on age, an example of a multidimensional fuzzy association rule relating the predicates Age, Smk and Dis may then be represented by:

Rule-4: $Age(X, \text{“young”}) \wedge Smk(X, \text{“yes”}) \Rightarrow Dis(X, \text{“Bronchitis”})$

3.1. Support, Confidence and Correlation

Association rules are a kind of pattern representing correlations of attribute-value pairs (items) in a given set of data, provided by a data mining system. Generally, an association rule is a conditional statement (a kind of if-then rule). More formally [1], association rules are of the form $A \Rightarrow B$, that is, $a_1 \wedge \cdots \wedge a_m \Rightarrow b_1 \wedge \cdots \wedge b_n$, where $a_i$ (for $i \in \{1, \ldots, m\}$) and $b_j$ (for $j \in \{1, \ldots, n\}$) are items (attribute-value pairs). The association rule $A \Rightarrow B$ is interpreted as “database tuples that satisfy the conditions in A are also likely to satisfy the conditions in B”. $A = \{a_1, \ldots, a_m\}$ and $B = \{b_1, \ldots, b_n\}$ are two distinct itemsets. The performance or interestingness of an association rule is generally determined by three factors, namely confidence, support and correlation. Confidence is a measure of certainty used to assess the validity of the rule. Given a set of relevant data tuples (or transactions in a relational database), the confidence of “$A \Rightarrow B$” is defined by:

\[ confidence(A \Rightarrow B) = \frac{\#tuples(A \text{ and } B)}{\#tuples(A)}, \tag{2} \]

where #tuples(A and B) means the number of tuples containing A and B. For example, a confidence of 80% for an association rule (for example Rule-1) means that 80% of all patients who suffer from bronchitis are likely to suffer from lung cancer as well. The support of an association rule refers to the percentage of relevant data tuples (or transactions) for which the pattern of the rule is true. For the association rule “$A \Rightarrow B$”, where A and B are sets of items, the support of the rule can be defined by

\[ support(A \Rightarrow B) = support(A \cup B) = \frac{\#tuples(A \text{ and } B)}{\#tuples(all\_data)}, \tag{3} \]

where #tuples(all_data) is the number of all tuples in the relevant data tuples (or transactions). For example, a support of 30% for an association rule (e.g., Rule-1) means that 30% of all patients in the medical records suffer from both bronchitis and lung cancer. From (3), it follows that $support(A \Rightarrow B) = support(B \Rightarrow A)$. Also, (2) can be calculated by

\[ confidence(A \Rightarrow B) = \frac{support(A \cup B)}{support(A)}. \tag{4} \]

The correlation factor is another measure for evaluating the correlation between A and B. The correlation factor can be calculated by:

\[ correlation(A \Rightarrow B) = correlation(B \Rightarrow A) = \frac{support(A \cup B)}{support(A) \times support(B)}. \tag{5} \]

Itemsets A and B are dependent (positively correlated) iff $correlation(A \Rightarrow B) > 1$. If the correlation is equal to 1, then A and B are independent (no correlation). Otherwise, if the resulting value of the correlation is less than 1, A and B are negatively correlated.

A data mining system has the potential to generate a huge number of rules, not all of which are interesting. There are several objective measures of rule interestingness. Three of them are rule support, rule confidence and correlation. In general, each interestingness measure is associated with a threshold, which may be controlled by the user. For example, rules that do not satisfy a confidence threshold (minimum confidence) of, say, 50% can be considered uninteresting. Rules below the threshold (minimum support as well as minimum confidence) likely reflect noise, exceptions, or minority cases and are probably of less value. We may consider only rules that have a positive correlation between their itemsets.

As previously explained, association rules that involve two or more dimensions or predicates can be referred to as multidimensional association rules. Multidimensional rules with no repeated predicates are called interdimension association rules (e.g. Rule-2) [1]. On the other hand, multidimensional association rules with repeated predicates, which contain multiple occurrences of some predicates, are called hybrid-dimension association rules. These rules may also be considered a combination (hybridization) of intradimension association rules and interdimension association rules.
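The measures (3), (4) and (5) can be illustrated with a short sketch over a list of transactions. The five patient records and the function names are hypothetical, invented only for this illustration; they are not data from the paper.

```python
def support(transactions, items):
    """Equation (3): fraction of transactions that contain every item."""
    items = set(items)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(transactions, a, b):
    """Equation (4): support(A u B) / support(A)."""
    return support(transactions, set(a) | set(b)) / support(transactions, a)


def correlation(transactions, a, b):
    """Equation (5): support(A u B) / (support(A) * support(B))."""
    joint = support(transactions, set(a) | set(b))
    return joint / (support(transactions, a) * support(transactions, b))


# Hypothetical medical records: each set holds the diseases of one patient.
PATIENTS = [
    {"Bronchitis", "Lung Cancer"},
    {"Bronchitis", "Lung Cancer"},
    {"Bronchitis"},
    {"Lung Cancer"},
    {"Flu"},
]

if __name__ == "__main__":
    a, b = {"Bronchitis"}, {"Lung Cancer"}
    print("support    ", support(PATIENTS, a | b))
    print("confidence ", confidence(PATIENTS, a, b))
    print("correlation", correlation(PATIENTS, a, b))
```

With this toy data the correlation comes out slightly above 1, so the two itemsets would count as positively correlated under the criterion stated after (5).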
Rule-3 is an example of a hybrid-dimension association rule: the predicate Dis is repeated. Here, we may first be interested in mining multidimensional association rules with no repeated predicates, i.e. interdimension association rules. The interdimension association rules may be generated from a relational database or data warehouse with multiple attributes, where each attribute is associated with a predicate. To generate the multidimensional association rules, we introduce an alternative method for mining the rules by searching for the predicate sets. Conceptually, a multidimensional association rule $A \Rightarrow B$ consists of two datasets A and B, called premise and conclusion, respectively. Formally, A is a dataset consisting of several distinct data values, where each data value in A is taken from a distinct domain attribute in D, as given by

\[ A = \{a_j \mid a_j \in D_j, \text{ for some } j \in N_n\}, \]

where $D_A \subseteq D$ is the set of domain attributes from which all data values of A come. Similarly,

\[ B = \{b_j \mid b_j \in D_j, \text{ for some } j \in N_n\}, \]

where $D_B \subseteq D$ is the set of domain attributes from which all data values of B come. For example, from Rule-2, it can be found that A={60, yes}, B={Lung Cancer}, $D_A$={Age, Smk} and $D_B$={Dis}. Considering that $A \Rightarrow B$ is an interdimension association rule, it can be proved that $|D_A| = |A|$, $|D_B| = |B|$ and $D_A \cap D_B = \emptyset$. The support of A is then defined by:

\[ support(A) = \frac{|\{t_i \mid d_{ij} = a_j, \forall a_j \in A\}|}{r}, \tag{6} \]

where r is the number of records or tuples (see Table 1). Alternatively, r in (6) may be changed to $|QD(D_A)|$ by assuming that the records or tuples involved in the process of mining association rules are those in which the data values of a certain set of domain attributes, $D_A$, are not null.
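As a sketch of (6) and of the alternative denominator just described, the following example counts matching tuples in a small table in which `None` stands for a null value. The records, the attribute names taken from Rule-2, and the helper names `qualified` and `support` are assumptions made for illustration only.

```python
def qualified(records, attributes):
    """Records in which every attribute of the given set is not null."""
    return [t for t in records if all(t.get(d) is not None for d in attributes)]


def support(records, pattern, use_qualified=False):
    """Support of a dataset A, given as {attribute: value}.

    With use_qualified=False the denominator is r, the number of records,
    as in (6). With use_qualified=True only the records whose attributes
    in D_A are all non-null are counted in the denominator.
    """
    matches = sum(
        1 for t in records if all(t.get(d) == v for d, v in pattern.items())
    )
    denominator = len(qualified(records, pattern)) if use_qualified else len(records)
    return matches / denominator if denominator else 0.0


# Hypothetical denormalized table; None marks a missing value.
RECORDS = [
    {"Age": 60, "Smk": "yes", "Dis": "Lung Cancer"},
    {"Age": 60, "Smk": "yes", "Dis": "Bronchitis"},
    {"Age": 60, "Smk": None, "Dis": "Lung Cancer"},
    {"Age": 45, "Smk": "no", "Dis": None},
]

if __name__ == "__main__":
    premise = {"Age": 60, "Smk": "yes"}  # A = {60, yes}, D_A = {Age, Smk}
    print(support(RECORDS, premise))                      # denominator r = 4
    print(support(RECORDS, premise, use_qualified=True))  # denominator 3
```

The two calls differ only in the denominator: the third record has a null Smk value, so it is excluded when only qualified records are counted.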
Hence, (6) can also be defined by

$$\mathrm{support}(A) = \frac{|\{t_i \mid d_{ij} = a_j, \forall a_j \in A\}|}{|QD(D_A)|}, \tag{7}$$

where $QD(D_A)$, simply called the qualified data of $D_A$, is the set of record numbers ($t_i$) in which all data values of the domain attributes in $D_A$ are not null. Formally, $QD(D_A)$ is defined as follows:

$$QD(D_A) = \{t_i \mid t_i(D_j) \neq \text{null}, \forall D_j \in D_A\}. \tag{8}$$

Similarly,

$$\mathrm{support}(B) = \frac{|\{t_i \mid d_{ij} = b_j, \forall b_j \in B\}|}{|QD(D_B)|}. \tag{9}$$

As defined in (3), $\mathrm{support}(A \Rightarrow B)$ is given by

$$\mathrm{support}(A \Rightarrow B) = \mathrm{support}(A \cup B) = \frac{|\{t_i \mid d_{ij} = c_j, \forall c_j \in A \cup B\}|}{|QD(D_A \cup D_B)|}. \tag{10}$$

$\mathrm{confidence}(A \Rightarrow B)$, a measure of certainty for assessing the validity of $A \Rightarrow B$, is calculated by

$$\mathrm{confidence}(A \Rightarrow B) = \frac{|\{t_i \mid d_{ij} = c_j, \forall c_j \in A \cup B\}|}{|\{t_i \mid d_{ij} = a_j, \forall a_j \in A\}|}. \tag{11}$$

If support(A) is calculated by (6) and the denominator of (10) is changed to r, then (11) equals (10) divided by (6), which is the relation given by (4).

A and B in the previous discussion are datasets in which each element is an atomic crisp value. To provide generalized multidimensional association rules, instead of an atomic crisp value, we may consider each element of the datasets to be a dataset of a certain domain attribute. Hence, A and B are sets of sets of data values. For example, the rule may be represented by

Rule-5: Age(X, "20...60") ∧ Smk(X, "yes") ⇒ Dis(X, "bronchitis, lung cancer"),

where A={{20…29}, {yes}} and B={{bronchitis, lung cancer}}. Let A be a generalized dataset. Formally, A is given by

$$A = \{A_j \mid A_j \subseteq D_j, \text{ for some } j \in \mathbb{N}_n\}.$$

Corresponding to (7), the support of A is then defined by

$$\mathrm{support}(A) = \frac{|\{t_i \mid d_{ij} \in A_j, \forall A_j \in A\}|}{|QD(D_A)|}. \tag{12}$$

Similar to (10),

$$\mathrm{support}(A \Rightarrow B) = \mathrm{support}(A \cup B) = \frac{|\{t_i \mid d_{ij} \in C_j, \forall C_j \in A \cup B\}|}{|QD(D_A \cup D_B)|}. \tag{13}$$

Finally, $\mathrm{confidence}(A \Rightarrow B)$ is defined by

$$\mathrm{confidence}(A \Rightarrow B) = \frac{|\{t_i \mid d_{ij} \in C_j, \forall C_j \in A \cup B\}|}{|\{t_i \mid d_{ij} \in A_j, \forall A_j \in A\}|}. \tag{14}$$

To provide more generalized multidimensional association rules, we may consider A and B as sets of fuzzy labels, called fuzzy datasets. Rule-4 is an example of such a rule, where A={young, yes} and B={bronchitis}. A fuzzy dataset is a set of fuzzy data consisting of several distinct fuzzy labels, where each fuzzy label is represented by a fuzzy set on a certain domain attribute. Let A be a fuzzy dataset. Formally, A is given by

$$A = \{A_j \mid A_j \in \mathcal{F}(D_j), \text{ for some } j \in \mathbb{N}_n\},$$

where $\mathcal{F}(D_j)$ is the fuzzy power set of $D_j$; in other words, $A_j$ is a fuzzy set on $D_j$. Corresponding to (7), the support of A is then defined by

$$\mathrm{support}(A) = \frac{\sum_{i=1}^{r} \inf_{A_j \in A}\{A_j(d_{ij})\}}{|QD(D_A)|}. \tag{15}$$

Similar to (10),

$$\mathrm{support}(A \Rightarrow B) = \mathrm{support}(A \cup B) = \frac{\sum_{i=1}^{r} \inf_{C_j \in A \cup B}\{C_j(d_{ij})\}}{|QD(D_A \cup D_B)|}. \tag{16}$$

$\mathrm{confidence}(A \Rightarrow B)$ is defined by

$$\mathrm{confidence}(A \Rightarrow B) = \frac{\sum_{i=1}^{r} \inf_{C_j \in A \cup B}\{C_j(d_{ij})\}}{\sum_{i=1}^{r} \inf_{A_j \in A}\{A_j(d_{ij})\}}. \tag{17}$$

Finally, $\mathrm{correlation}(A \Rightarrow B)$ is defined by

$$\mathrm{correlation}(A \Rightarrow B) = \frac{\sum_{i=1}^{r} \inf_{C_j \in A \cup B}\{C_j(d_{ij})\}}{\sum_{i=1}^{r} \inf_{A_j \in A}\{A_j(d_{ij})\} \times \inf_{B_k \in B}\{B_k(d_{ik})\}}. \tag{18}$$

Similarly, if the denominators of (15) and (16) are changed to r (the number of tuples), (17) can also be shown to satisfy the relation given by (4). It can be shown that (16) and (17) are generalizations of (13) and (14), respectively, and that (13) and (14) are in turn generalizations of (10) and (11).

4. Fuzzy Decision Tree Induction (FDT)

Based on the type of data, we may classify DTI into two types, namely crisp and fuzzy DTI.
The two types of DTI are compared on generalization capability in [15]. The result shows that the Fuzzy Decision Tree (FDT) is better than the Crisp Decision Tree (CDT) at classifying numeric attributes. A Fuzzy Decision Tree formed by FID3, combined with fuzzy clustering (to form the membership functions) and cluster validation (to decide granularity), is also better than a Pruned Decision Tree, where the Pruned Decision Tree is considered a crisp enhancement [14]. Therefore, in our research work, the development of a disease track record analyzer application, we propose a kind of FDT using a fuzzy approach.

An information gain measure [1] is used in this research to select the test attribute at each node in the tree. Such a measure is referred to as an attribute selection measure or a measure of the goodness of split. The attribute with the highest information gain (or greatest entropy reduction) is chosen as the test attribute for the current node. This attribute minimizes the information needed to classify the samples in the resulting partitions and reflects the least randomness or impurity in these partitions. For crisp data, the information gain measure is defined in [1] as follows.

Let S be a set consisting of s data samples. Suppose the class label attribute has m distinct values defining m distinct classes, $C_i$ (for i=1,…, m). Let $s_i$ be the number of samples of S in class $C_i$. The expected information needed to classify a given sample is given by

$$I(s_1, s_2, \ldots, s_m) = -\sum_{i=1}^{m} p_i \log_2(p_i), \tag{19}$$

where $p_i$ is the probability that an arbitrary sample belongs to class $C_i$ and is estimated by $s_i/s$.

Let attribute A have v distinct values, {$a_1$, $a_2$, …, $a_v$}. Attribute A can be used to partition S into v subsets, {$S_1$, $S_2$, …, $S_v$}, where $S_j$ contains those samples in S that have value $a_j$ of A. If A were selected as the test attribute, these subsets would correspond to the branches grown from the node containing the set S.
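As a quick numerical check of (19), the short sketch below evaluates the expected information for a list of class counts. The function name `expected_information` and the sample counts are assumptions made for illustration only.

```python
import math

def expected_information(class_counts):
    """Equation (19): I(s1,...,sm) = -sum p_i * log2(p_i), with p_i = s_i / s.

    Classes with zero samples contribute nothing to the sum.
    """
    s = sum(class_counts)
    return -sum((c / s) * math.log2(c / s) for c in class_counts if c > 0)

# Hypothetical class distributions.
print(round(expected_information([9, 5]), 3))   # two unevenly sized classes
print(expected_information([7, 7]))             # two equally sized classes
```

A perfectly balanced two-class set needs one bit per sample, while a set containing a single class needs none, which matches the reading of (19) as a measure of impurity.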
Let $s_{ij}$ be the number of samples of class $C_i$ in a subset $S_j$. The entropy, or expected information based on the partitioning into subsets by A, is given by

$$E(A) = \sum_{j=1}^{v} \frac{s_{1j} + \cdots + s_{mj}}{s} I(s_{1j}, \ldots, s_{mj}). \tag{20}$$

The term $(s_{1j} + \cdots + s_{mj})/s$ acts as the weight of the jth subset and is the number of samples in the subset divided by the total number of samples in S. The smaller the entropy value, the greater the purity of the subset partitions. The encoding information that would be gained by branching on A is

$$\mathrm{Gain}(A) = I(s_1, s_2, \ldots, s_m) - E(A). \tag{21}$$

In other words, Gain(A) is the expected reduction in entropy caused by knowing the values of attribute A.

For fuzzy values, the concept of information gain defined in (19) to (21) is extended as follows. Let S be a set consisting of s data samples. Suppose the class label attribute has m distinct values, $v_i$ (for i=1,…, m), defining m distinct classes, $C_i$ (for i=1,…, m). Suppose also that there are n meaningful fuzzy labels, $F_j$ (for j=1,…, n), defined on the m distinct values $v_i$. $F_j(v_i)$ denotes the membership degree of $v_i$ in the fuzzy set $F_j$. Here, $F_j$ (for j=1,…, n) satisfies the following property:

$$\sum_{j}^{n} F_j(v_i) = 1, \quad \forall i \in \{1, \ldots, m\}.$$

Let $\beta_j$ be a weighted sample corresponding to $F_j$, given by

$$\beta_j = \sum_{i}^{m} \det(C_i) \times F_j(v_i),$$

where $\det(C_i)$ is the number of elements in $C_i$. The expected information needed to classify a given weighted sample is given by

$$I(\beta_1, \beta_2, \ldots, \beta_n) = -\sum_{j=1}^{n} p_j \log_2(p_j), \tag{22}$$

where $p_j$ is estimated by $\beta_j/s$.

Let attribute A have u distinct values, {$a_1$, $a_2$, …, $a_u$}, defining u distinct classes, $B_h$ (for h=1,…, u). Suppose there are r meaningful fuzzy labels, $T_k$ (for k=1,…, r), defined on A. Similarly, $T_k$ also satisfies the following property:

$$\sum_{k}^{r} T_k(a_h) = 1, \quad \forall h \in \{1, \ldots, u\}.$$

If A were selected as the test attribute, these fuzzy subsets would correspond to the branches grown from the node containing the set S. The entropy, or expected information based on the partitioning into subsets by A, is given by

$$E(A) = \sum_{k=1}^{r} \frac{\alpha_{1k} + \cdots + \alpha_{nk}}{s} I(\alpha_{1k}, \ldots, \alpha_{nk}), \tag{23}$$

where $\alpha_{jk}$ is the intersection between $F_j$ and $T_k$ defined on the data sample S as follows:

$$\alpha_{jk} = \sum_{h}^{u} \sum_{i}^{m} \min(F_j(v_i), T_k(a_h)) \times \det(C_i \cap B_h). \tag{24}$$

Similar to (22), $I(\alpha_{1k}, \ldots, \alpha_{nk})$ is defined as follows:

$$I(\alpha_{1k}, \ldots, \alpha_{nk}) = -\sum_{j=1}^{n} p_{jk} \log_2(p_{jk}), \tag{25}$$

where $p_{jk}$ is estimated by $\alpha_{jk}/s$. Finally, the encoding information that would be gained by branching on A is

$$\mathrm{Gain}(A) = I(\beta_1, \beta_2, \ldots, \beta_n) - E(A). \tag{26}$$

Since fuzzy sets are a generalization of crisp sets, it can be proved that equations (22) to (26) are also generalizations of equations (19) to (21).

5. Mining Fuzzy Association Rules from FDT

Association rules are patterns that represent correlations between attribute values (items) in a given set of data, as produced by a data mining system. Generally, an association rule is a conditional statement (an if-then rule). The performance or interestingness of an association rule is generally determined by three factors, namely confidence, support and correlation. Confidence is a measure of certainty for assessing the validity of the rule. The support of an association rule is the percentage of relevant data tuples (or transactions) for which the pattern of the rule is true. The correlation factor is another measure for evaluating the correlation between two entities. Using the concept of FDT discussed in Section 4, the fuzzy association rule $T_k \Rightarrow F_j$ can be generated from the FDT.
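The fuzzy information gain of Section 4, equations (22) to (26), can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data layout (`counts[i][h]` for $\det(C_i \cap B_h)$, `F[j][i]` for $F_j(v_i)$, `T[k][h]` for $T_k(a_h)$) and the function names are hypothetical. The text estimates $p_{jk}$ by $\alpha_{jk}/s$; the sketch instead normalizes within each fuzzy subset, as in the crisp case, so that it reduces to (19) to (21) when all memberships are 0 or 1. That normalization is an assumption of this example.

```python
import math

def info(weights, total):
    """-sum p * log2(p) with p = w / total; zero weights are skipped."""
    return -sum((w / total) * math.log2(w / total) for w in weights if w > 0)

def fuzzy_gain(counts, F, T):
    """Sketch of Gain(A) = I(beta_1..beta_n) - E(A), equations (22)-(26).

    counts[i][h] : det(C_i ∩ B_h), samples with class value v_i and attribute value a_h
    F[j][i]      : membership F_j(v_i) of class value v_i in fuzzy label F_j
    T[k][h]      : membership T_k(a_h) of attribute value a_h in fuzzy label T_k
    """
    m, u = len(counts), len(counts[0])
    s = sum(map(sum, counts))
    # beta_j = sum_i det(C_i) * F_j(v_i)
    beta = [sum(sum(counts[i]) * Fj[i] for i in range(m)) for Fj in F]
    e_a = 0.0
    for Tk in T:
        # alpha_jk = sum_h sum_i min(F_j(v_i), T_k(a_h)) * det(C_i ∩ B_h), equation (24)
        alpha = [
            sum(min(Fj[i], Tk[h]) * counts[i][h] for i in range(m) for h in range(u))
            for Fj in F
        ]
        weight = sum(alpha)
        if weight > 0:
            # Assumption: p_jk is normalized by the subset weight, not by s.
            e_a += (weight / s) * info(alpha, weight)
    return info(beta, s) - e_a

# Hypothetical example: two class values, two attribute values, fuzzy labels on both.
counts = [[3, 1], [1, 3]]
F = [[1.0, 0.3], [0.0, 0.7]]
T = [[0.8, 0.1], [0.2, 0.9]]
print(fuzzy_gain(counts, F, T))
```

With crisp (0/1) memberships the function returns the ordinary information gain, for example 1 bit for a perfectly separating attribute and 0 for an uninformative one.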
The confidence, support and correlation of $T_k \Rightarrow F_j$ are given by

$$\mathrm{confidence}(T_k \Rightarrow F_j) = \frac{\sum_{h}^{u} \sum_{i}^{m} \min(F_j(v_i), T_k(a_h)) \times \det(C_i \cap B_h)}{\sum_{h}^{u} T_k(a_h) \times \det(B_h)}, \tag{27}$$

$$\mathrm{support}(T_k \Rightarrow F_j) = \frac{\sum_{h}^{u} \sum_{i}^{m} \min(F_j(v_i), T_k(a_h)) \times \det(C_i \cap B_h)}{s}, \tag{28}$$

$$\mathrm{correlation}(T_k \Rightarrow F_j) = \frac{\sum_{h}^{u} \sum_{i}^{m} \min(F_j(v_i), T_k(a_h)) \times \det(C_i \cap B_h)}{\sum_{h}^{u} \sum_{i}^{m} F_j(v_i) \times T_k(a_h) \times \det(C_i \cap B_h)}. \tag{29}$$

To provide more generalized fuzzy multidimensional association rules, as proposed in [6], we start from a single table (relation) R as the source of data representing relations among data items. In general, R can be shown as in Table 1 (see Section 2). Now, we consider χ and ψ as subsets of fuzzy labels, called fuzzy datasets. A fuzzy dataset is a set of fuzzy data consisting of several distinct fuzzy labels, where each fuzzy label is represented by a fuzzy set on a certain domain attribute. Formally, χ and ψ are given by

$$\chi = \{F_j \mid F_j \in \Omega(D_j), \exists j \in \mathbb{N}_n\} \quad \text{and} \quad \psi = \{F_j \mid F_j \in \Omega(D_j), \exists j \in \mathbb{N}_n\},$$

where there are n domain data and $\Omega(D_j)$ is the fuzzy power set of $D_j$. In other words, $F_j$ is a fuzzy set on $D_j$. The support, confidence and correlation of χ ⇒ ψ are given by

$$\mathrm{support}(\chi \Rightarrow \psi) = \frac{\sum_{i=1}^{s} \inf_{F_j \in \chi \cup \psi}\{F_j(d_{ij})\}}{s}, \tag{30}$$

$$\mathrm{confidence}(\chi \Rightarrow \psi) = \frac{\sum_{i=1}^{s} \inf_{F_j \in \chi \cup \psi}\{F_j(d_{ij})\}}{\sum_{i=1}^{s} \inf_{F_j \in \chi}\{F_j(d_{ij})\}}, \tag{31}$$

$$\mathrm{correlation}(\chi \Rightarrow \psi) = \frac{\sum_{i=1}^{s} \inf_{F_j \in \chi \cup \psi}\{F_j(d_{ij})\}}{\sum_{i=1}^{s} \inf_{A_j \in \chi}\{A_j(d_{ij})\} \times \inf_{B_k \in \psi}\{B_k(d_{ik})\}}. \tag{32}$$

Here, (30), (31) and (32) correspond to (16), (17) and (18), respectively.

6. FDT Algorithms and Results

The research was conducted following the Software Development Life Cycle method. The conceptual framework of the application design is shown in Figure 1. The input to the developed application is a single table produced by a denormalization process from a relational database. The main algorithm of the association rule mining process, i.e. Decision Tree Induction, is shown in Figure 4.

For i=0 to the total level
  Check whether the level had already split
  If the level has not yet split Then
    Check whether the level can still be split
    If the level can still be split Then
      Call the procedure to calculate information gain
      Select a field with the highest information gain
      Get a distinct value of the selected field
      Check the total distinct value
      If the distinct value is equal to one Then
        Create a node with a label from the value name
      Else
        Check the total fields that are potential to become a current test attribute
        If no field can be a current test attribute Then
          Create a node with label from the majority value name
        Else
          Create a node with label from the selected value name
        End If
      End If
    End If
  End If
End for
Save the input create tree activity into database

Figure 4. The generating decision tree algorithm

Furthermore, the procedure for calculating information gain, which implements equations (22), (23), (24), (25) and (26), is shown in Figure 5. Based on the highest information gain, the application builds a decision tree that the user can display or print. Rules can then be generated from the decision tree. Equations (27), (28) and (29) are used to calculate the interestingness or performance of every rule. The number of rules can be reduced by comparing their degrees of support, confidence and correlation with the minimum values of support, confidence and correlation determined by the user.

Calculate gain for a field as a root
Count the number of distinct value field
For i=0 to the number of distinct value field
  Count the number of distinct value root field
  For j=0 to the number of distinct value root field
    Calculate the gain field using equation (4) and (8)
  End For
  Calculate entropy field using equation (5)
End For
Calculate information gain field

Figure 5.
The procedure to calculate information gain

Figure 6. The generated decision tree

In this research, we implement two data types as fuzzy sets, namely alphanumeric and numeric. An example of an alphanumeric data type is disease. We can define some meaningful fuzzy labels of disease, such as poor disease, moderate disease, and severe disease. Every fuzzy label is represented by a given fuzzy set. The age of patients is an example of a numeric data type. Age may have meaningful fuzzy labels such as young and old. Figure 6 shows an example result of FDT applied to three domains (attributes) of data, namely Death, Age and Disease.

7. Conclusion

The paper discussed and proposed a method to extend the concept of Decision Tree Induction using fuzzy values. Some generalized formulas for calculating information gain were introduced. For the process of mining fuzzy association rules, some equations were proposed to calculate the support, confidence and correlation of a given association rule. Finally, an algorithm was briefly given to show how the FDT is generated.

Acknowledgment

This research was supported by the research grants Hibah Kompetensi (25/SP2H/PP/DP2M/V/2009) and Penelitian Hibah Bersaing (110/SP2H/PP/DP2M/IV/2009) from the Indonesian Higher Education Directorate.

References

[1] J. Han, M. Kamber, Data Mining: Concepts and Techniques, The Morgan Kaufmann Series, 2001.
[2] G. J. Klir, B. Yuan, Fuzzy Sets and Fuzzy Logic: Theory and Applications, New Jersey: Prentice Hall, 1995.
[3] R. Intan, “An Algorithm for Generating Single Dimensional Association Rules,” Jurnal Informatika, Vol. 7, No. 1, May 2006.
[4] R. Intan, “A Proposal of Fuzzy Multidimensional Association Rules,” Jurnal Informatika, Vol. 7, No. 2, November 2006.
[5] R. Intan, “A Proposal of an Algorithm for Generating Fuzzy Association Rule Mining in Market Basket Analysis,” Proceedings of CIRAS (IEEE), Singapore, 2005.
[6] R. Intan, “Generating Multi Dimensional Association Rules Implying Fuzzy Values,” The International Multi-Conference of Engineers and Computer Scientists, Hong Kong, 2006.
[7] R. Intan, O. Y. Yuliana, “Fuzzy Decision Tree Approach for Mining Fuzzy Association Rules,” 16th International Conference on Neural Information Processing, to appear, 2009.
[8] O. P. Gunawan, Perancangan dan Pembuatan Aplikasi Data Mining dengan Konsep Fuzzy c-Covering untuk Membantu Analisis Market Basket pada Swalayan X (in Indonesian), Final Project, 2004.
[9] L. A. Zadeh, “Fuzzy Sets and Systems,” International Journal of General Systems, Vol. 17, pp. 129-138, 1990.
[10] R. Agrawal, T. Imielinski, A. N. Swami, “Mining Association Rules between Sets of Items in Large Databases,” Proceedings of ACM SIGMOD International Conference on Management of Data, ACM Press, pp. 207-216, 1993.
[11] R. Agrawal, R. Srikant, “Fast Algorithms for Mining Association Rules in Large Databases,” Proceedings of 20th International Conference on Very Large Databases, Morgan Kaufmann, pp. 487-499, 1994.
[12] H. V. Pesiwarissa, Perancangan dan Pembuatan Aplikasi Data Mining dalam Menganalisa Track Records Penyakit Pasien di DR.Haulussy Ambon Menggunakan Fuzzy Association Rule Mining (in Indonesian), Final Project, 2005.
[13] E. F. Codd, “A Relational Model of Data for Large Shared Data Banks,” Communications of the ACM, 13(6), pp. 377-387, 1970.
[14] H. Benbrahim, B. Amine, “A Comparative Study of Pruned Decision Trees and Fuzzy Decision Trees,” Proceedings of 19th International Conference of the North American, Atlanta, pp. 227-231, 2000.
[15] Y. D. So, J. Sun, X. Z. Wang, “An Initial Comparison of Generalization-Capability between Crisp and Fuzzy Decision Trees,” Proceedings of the First International Conference on Machine Learning and Cybernetics, pp. 1846-1851, 2002.
[16] ALICE d'ISoft v.6.0 demonstration [Online].
Available at: http://www.alice-soft.com/demo/al6demo.htm [Accessed: 31 October 2007].
[17] T. M. Khoshgoftaar, Y. Liu, N. Seliya, “Genetic Programming-Based Decision Trees for Software Quality Classification,” Proceedings of the 15th IEEE International Conference on Tools with Artificial Intelligence, California, pp. 374-383, 2003.

Authors Profile

Rolly Intan obtained his B.Eng. degree in computer engineering from Sepuluh Nopember Institute of Technology, Surabaya, Indonesia in 1991. He is now a professor in the Department of Informatics Engineering at Petra Christian University, Surabaya, Indonesia. He received his M.A. in information science from International Christian University, Tokyo, Japan in 2000, and his Doctor of Engineering in Computer Science from Meiji University, Tokyo, Japan in 2003. His primary research interests are data mining, intelligent information systems, fuzzy sets, rough sets and fuzzy measure theory.

Oviliani Yenty Yuliana is an associate professor at the Department of Informatics Engineering, Faculty of Industrial Technology, Petra Christian University, Surabaya, Indonesia. She received her B.Eng. in Computer Engineering from Institut Teknologi Sepuluh Nopember, Surabaya, Indonesia. She obtained her Master of Science in Computer Information Systems from Assumption University, Bangkok, Thailand. Her research interests are database systems and data mining.

Andreas Handojo obtained his B.Eng. degree in electronic engineering from Petra Christian University, Surabaya, Indonesia in 1999. He received his master's degree in Information Technology Management from Sepuluh Nopember Institute of Technology, Surabaya, Indonesia, in 2007. He is now a lecturer in the Department of Informatics Engineering at Petra Christian University. His primary research interests are data mining, business intelligence, strategic information system planning, and computer networks.

A Collaborative Framework for Human-Agent Systems

Moamin Ahmed1, Mohd Sharifuddin Ahmad2 and Mohd Zaliman Mohd Yusoff3

1College of Information Technology, Universiti Tenaga Nasional, Km 7 Jalan Kajang-Puchong, 43009 Kajang, Selangor, Malaysia, momen42@yahoo.com
2College of Information Technology, Universiti Tenaga Nasional, Km 7 Jalan Kajang-Puchong, 43009 Kajang, Selangor, Malaysia, sharif@uniten.edu.my
3College of Information Technology, Universiti Tenaga Nasional, Km 7 Jalan Kajang-Puchong, 43009 Kajang, Selangor, Malaysia, zaliman@uniten.edu.my

Abstract: In this paper, we demonstrate the use of software agents in assisting humans to comply with the deadlines of a collaborative work process. Software agents take over the communication between agents and the tasks of reminding and alerting humans to comply with scheduled tasks. We use the FIPA agent communication protocol to implement communication between agents. An interface for each agent provides the means for humans to communicate with their agents and to delegate mundane tasks to them.

Keywords: intelligent software agents, multiagent systems, workflow, collaboration.

1. Introduction

In a human-centric collaboration, adhering to deadlines is a major problem. The diversity of tasks imposed on humans and the procedures attached to them make it difficult to carry out scheduled tasks on time. One way of overcoming this problem is to use a scheduler or a time management system which keeps track of deadlines and provides reminders for time-critical tasks. Other researchers have developed agent-based solutions to resolve similar problems in workflow systems [18], [19], [20], [21]. However, such systems do not always provide the assistance needed to perform mundane follow-up tasks and resolve delays caused by humans.
In this paper, we demonstrate the development and application of software agents to implement the collaborative work of the Examination Paper Preparation and Moderation Process (EPMP) in our academic faculty. We use the FIPA agent communication language (ACL) to implement communication between agents [3], [4]. An interface for each agent provides a convenient means for humans to delegate mundane tasks to software agents. The use of such an interface and the subsequent communication performed by and between agents contribute to the achievement of a shared goal, i.e. the completion of the examination paper preparation and moderation process within the stipulated time. We use the FIPA ACL to demonstrate the usefulness of agents in taking over the timing and execution of communication from humans. However, the important tasks, i.e. the preparation and moderation tasks, are still performed by humans. The agents continuously urge human actors to complete the tasks by the deadline and execute communicative acts to other agents when the tasks are completed.

This paper reports an extension of our previous work in the same project [1]. Section 2 briefly discusses the issues and problems relating to the EPMP. Section 3 reviews related work. In Section 4, we develop and present our framework to resolve the problems of the EPMP. Section 5 discusses the development and testing of the system and Section 6 concludes the paper.

2. Issues and Problems in EPMP

The EPMP is the standard process of our faculty for examination paper preparation and moderation. The process starts when the Examination Committee (EC) sends out an instruction to start preparing examination papers. A Lecturer then prepares the examination paper, together with the solutions and the marking scheme (Set A). Upon completion, the Lecturer submits the set to be checked by an appointed Moderator. The Moderator checks the set and returns it to the Lecturer with a moderation report (Set B).
If there are no corrections, the Lecturer submits the set to the Examination Committee for further action. Otherwise, the Lecturer needs to correct the paper and resubmit the corrected paper to the Moderator for inspection. If the corrections have been made, the Moderator returns the set to the Lecturer. Finally, the Lecturer submits the set to the Committee for further processing. Figure 1 shows the process flow for the EPMP. The Lecturer and Moderator are given deadlines to complete the process, as shown in Table 1. The process continues over a period of four weeks in two preparation-moderation-correction cycles.

Figure 1. The EPMP Process Flow

Table 1: Typical Schedule for Examination Paper Preparation and Moderation

Tasks                                                      Deadlines
Set A should be submitted to the respective moderators     Week 10
1st moderation cycle                                       Week 10 & 11
2nd moderation cycle (Set B)                               Week 12 & 13
Set B should be submitted to EC                            Week 14

Lack of enforcement and the diverse tasks of lecturers and moderators cause the EPMP to suffer from delays in action by the academicians. Lecturers wait until the last few days of the second cycle to submit their examination papers, which leaves insufficient time for the moderators to scrutinize the papers properly. Because the process is manual, there is no mechanism that records adherence to deadlines or tracks the activities of defaulters. To resolve some of these problems, we use software agents to take over the communication tasks between agents and the reminding and alerting tasks directed at humans. We describe these functions in greater detail in Section 4.3.

3. Related Work

3.1 Agents and Agent Communication Language

The development of our system is based on the work of many researchers in agent-based systems.
For example, agent communication and its semantics have been established by research in speech act theory [14], [20], KQML [3], [14] and FIPA ACL [4], [5], [9]. We base our design of agent communication on the standard agent communication protocol of FIPA [4], [5] and its semantics [6]. FIPA ACL is consistent with the mentalistic notion of agents in that a message is intended to communicate attitudes about information such as beliefs, goals, etc. Belief, Desire, and Intention (BDI) is a mature and commonly adopted architecture for intelligent agents [12]. FIPA ACL messages use BDI to define their semantics [6]. Cohen and Perrault [18] view a conversation as a sequence of actions performed by the participants, intentionally affecting each other's model of the world, primarily their beliefs and goals.

While KQML and FIPA ACL epitomize agent communication, many researchers have developed other techniques of agent communication. Payne et al. [17] propose a shallow parsing mechanism that provides message templates for use in message construction. This approach relaxes the requirement for a common ACL between agents and supports communication between open multiagent systems. Chen and Su [2] develop Agent Gateway, which translates agent communication messages from one multiagent system into an XML-based intermediate message. This message is then translated into messages for other multiagent systems. Pasquier and Chaib-draa [16] apply cognitive coherence theory to the pragmatics of agent communication. The theory is proposed as a new layer above the classical cognitive agent architecture and supplies theoretical and practical elements for automating agent communication.

3.2 Workflow Systems

Software agents have also been applied in workflow systems to resolve some specific issues. Many business processes use workflow systems to exploit their known benefits, such as automation, co-ordination and collaboration between entities. Savarimuthu et al. [19] and Fluerke et al. [7] describe the advantages of their agent-based framework JBees, such as distribution, flexibility and the ability to dynamically incorporate a new process model. Research has also been done on the monitoring and controlling of workflow [19]. Wang and Wang [21], for example, propose agent-based monitoring in their workflow system.

Our framework extends the capabilities of these systems by employing a mechanism that enforces and motivates humans in the process loop to comply with the deadlines of scheduled tasks. We implement this mechanism by establishing a merit and demerit point system which rates humans' compliance with deadlines.

3.3 Ontology

The term ontology was first used to describe the philosophical study of the nature and organization of reality [11, 12]. In AI it is simply defined as “an explicit specification of a conceptualization” [10]. This definition provokes much controversy within the AI community, especially with regard to the meaning of conceptualization. An ontology associates vocabulary terms with entities identified in the conceptualization and provides definitions to constrain the interpretations of these terms. Most researchers concede that an ontology must include a vocabulary and corresponding definitions, but there is no consensus on a more detailed characterization [13]. Typically, the vocabulary includes terms for classes and relations, while the definitions of these terms may be informal text, or may be specified using a formal language such as predicate logic, as implemented in [8]. The FIPA ontology uses a specification of a representational vocabulary for a shared domain of discourse involving definitions of classes, relations, functions, and other objects [5].

4. The Collaborative Framework

We develop our framework based on the four-phased cycle shown in Figure 2.
The development process includes domain selection, domain analysis, tasks and message exchanges, and application.

Figure 2. The Four-Phased Development Cycle

4.1 Domain Selection

Our framework involves a working relationship between an agent and its human counterpart. Considering the nature of the tasks and the complexity of the work process, the EPMP is a suitable platform on which to develop a multiagent framework. The mundane tasks of document submission, deadline reminding and work progress tracking can be delegated to software agents. Consequently, we chose the EPMP as the platform for our framework, which contains both humans and agents. The goal of this collaborative process is to complete the preparation and moderation of examination papers.

4.2 Domain Analysis

Domain analysis consists of analyzing the process flow, identifying the entities and modeling the process. We have described and analyzed the process in Section 2 and will not repeat it here. For the purpose of our model, we create three agents that represent the Examination Committee (C), Moderator (M) and Lecturer (L). Figure 3 shows the architecture of our model. Humans communicate with their agents via an interface, and the corresponding agents monitor and update their environment in order to communicate with other agents, perform tasks that enable the progression of the workflow, and remind and alert their human counterparts to meet the deadlines. With this model, important human activities are recorded and tracked by the agents in their environment.

Figure 3. The Model’s Architecture

4.3 Tasks and Message Exchanges

An agent sends a message autonomously when some states of the environment are true. It performs the following actions to complete the message-sending task (see Figure 4):

Figure 4. Agent Actions

4.3.1 Check the state of the Environment

The agent always checks its environment, which consists of four parts:
· Status of uploaded files: The agent checks whether its user has uploaded Set A or Set B to a specified folder. If so, the agent proceeds to the next step.
· Status of deadlines: The agent checks the system’s date every day and compares it with the deadline.
· Status of subprograms: When an agent performs a task, it records the action in a subprogram, e.g. when the Committee agent sends the Prepare message, it records this event so that it can later send a remind message.
· Message signal: The agent opens a port and makes a connection when it senses a message coming from a remote agent.

4.3.2 Send messages

The agent decides to send a message when some states of the environment are true; otherwise, it informs all agents of any delay caused by its user and penalizes its user with demerit points. When sending a message, it performs the following actions:
· Open Port (Connect): When the agent decides to send a message, it opens a port and makes a connection.
· Send Message (online or offline): The message is received when the remote agent is online. Two problems may occur: (i) the remote agent is offline; (ii) the IP address of the remote agent is inadvertently changed. We resolve these problems by exploiting the Acknowledge performative. If the sending agent does not receive an Acknowledge message from a remote agent, it resends the message in offline mode. The same process is executed if the IP address is changed. We focus on these issues to ensure that the agents achieve the goal under any circumstances, because it concerns the completion of the students’ examination papers.
· Register action, date and merit/demerit point: When the agent has sent the message, it registers the action and the date in a text file.
It also evaluates the user by giving merit or demerit points based on the user’s adherence to the deadlines. The Head of Department can access these points to evaluate the staff’s commitment to the EPMP and take the necessary corrective action.
· Record in Subprograms: The agent records some actions as subprograms when it needs to execute those actions later.
· Close Port (Disconnect): The agent disconnects and closes the port when it has successfully sent the message.

4.4 Autonomous Collaborative Agents Application

We then apply the tasks and message exchanges to the EPMP domain. For readability, we represent the tasks and message exchanges for each agent as T#X and E#X respectively, where # is the task or message exchange number and X refers to the agents C, M, or L. A message from an agent is represented by m#SR, where # is the message number, S is the sender of the message m, and R is the receiver. S and R refer to the agents C, M, or L. For system tasks, CN#X refers to the task an agent performs to enable connection to a port and DCN#X indicates a disconnection task.

We extend the state of the environment to include system parameters that enable each agent to closely monitor the actions of its human counterpart. A side effect of this ability is improved autonomy for the agents to make correct decisions, as well as an improved ability to implement one-to-many and many-to-many message exchanges, e.g. the inform_all message.

Based on the analysis of Section 4.2, we create the interaction sequence between the agents. However, due to space limitations and the complexity of the ensuing interactions, we only show sample interactions between the Committee (C) and the Lecturer (L) agents:

1. Agent C
CN1C : Agent C opens a port and enables connection when a start date is satisfied.
E1C : C sends a message m1CL to L – PREPARE examination paper.
-Agent L sends an ACK message, m1LC.
-Agent C reads the ACK, checks the ontology and understands its meaning.
-If the ACK message is not received, Agent C sends an offline message.
T1C : Agent C registers the action and the date.
T2C : Agent C calculates the merit or demerit point and saves it for the Head of Department’s evaluation.
DCN1C : Agent C disables connection and closes the port.

When Agent C decides to send a remind message, it performs the following:
CN2C : Agent C connects to Agent L.
-Agent C makes this decision by checking its environment (the date, status of uploaded file and notice in subprograms).
E2C : C sends a REMIND message to Agent L.
-Agent L receives the message and displays it on its screen to alert its human counterpart.
DCN2C : Agent C disconnects and closes the port when it completes the task.

2. Agent L
CN1L : Agent L opens a port and enables connection when it receives the message from Agent C.
-Agent L makes this decision by checking its environment (message signal).
-Agent L reads the performative PREPARE, checks the ontology and understands its meaning.
E1L : Agent L replies with a message m1LC to C – ACK.
T1L : Agent L displays the message m1CL on the screen to alert its human counterpart.
T2L : Agent L opens and displays a new Word document on the screen.
-Agent L opens a new document to signal its human counterpart to start writing the examination paper.
T3L : Agent L opens and displays the Word document of the Lecturer form on the screen.
-Agent L opens the form, which contains the policy to follow.
DCN1L : Agent L disconnects and closes the port.

When the human Lecturer uploads a completed examination paper via an interface, Agent L checks its environment (status of uploaded file and the deadline). Agent L will decide to send a message:
E2L : Agent L sends a message m2LM to M – REVIEW examination paper.
-Agent M sends an ACK message, m1ML.
-Agent L checks the states of the environment.
T4L : Agent L registers the action and the date.
T5L : Agent L calculates and saves the merit or demerit points.

5. Systems Simulation and Testing

We simulate the EPMP using Win-Prolog and its extended module Chimera, which has the ability to handle multiagent systems [22]. We use Prolog for two reasons. Firstly, Prolog is well suited for expressing complex ideas because it focuses on the logic of the computation rather than its mechanics; the drudgery of memory allocation, stack pointers, etc., is left to the computational engine. Reduced drudgery and compact expression mean that one can concentrate on what should be represented and how. Secondly, since Prolog incorporates a logical inferencing mechanism, this property can be exploited to develop inference engines specific to a particular domain.

Chimera provides the module to implement peer-to-peer communication via TCP/IP. Each agent is identified by a port number and an IP address. Agents send and receive messages through such configurations.

We develop the collaborative process as a multiagent system of EPMP based on the above framework and test the simulation in a laboratory environment on a Local Area Network. Each of the agents C, M and L runs on a PC connected to the network. The simulation executes communication based on the tasks outlined in Section 4.4.

For message development, we use the parameters specified by the FIPA ACL Message Structure Specification [4]. We include the performative, which is the mandatory parameter, in all our ACL messages. We also define and use our own performatives in the message structure, which are Prepare, Check, Remind, Review, Complete, Modify, ACK, Advertise, and Inform_all. To complete the structure, we include the message, content and conversational control parameters as stipulated by the FIPA Specification. The communication between agents is based on the BDI semantics as defined by FIPA [6].
The BDI semantics gives each agent the ability to arrange the steps needed to achieve the goal:

· Belief: When the agent wants to send a message, it checks its belief about which agent can perform the required action.
· Desire: Achieving the goal completely is the desire of all agents. The agents never stop until they have achieved the goal. The agents’ goal is to complete the examination paper preparation and moderation, and they know that it has been achieved from the Committee agent’s final message.
· Intention: Intentions are courses of action an agent has committed to carry out. The agent’s intention results from its belief and a goal to achieve. Consequently, the agents take actions such as sending the Prepare message, the Remind message, etc.

We show four samples of performatives used in the framework (Prepare, ACK, Review, Remind). The communicative act definitions for each of these performatives are as follows:

· Prepare: The sender advises the receiver to start preparing the examination paper by performing some actions that enable its human counterpart to do so. The content of the message is a description of the action to be performed. The receiver understands the message and is capable of performing the action. The Prepare performative is time-dependent.

    prepare( ':sender', committee,
             ':receiver', lecturer,
             ':reply-with', task_completed,
             ':content', start_prepare_examination_paper,
             ':ontology', word_documents,
             ':language', prolog )

· ACK: The receiver acknowledges to the sender that it has received the message. We use the acknowledgement to track the message state. If the sender receives the acknowledgement, the receiver is online and has received the message; otherwise the receiver is offline and the sender resends the message in offline mode. The content of the message is a description of the action to be performed, which the receiver understands and is capable of performing. The ACK performative depends on the receiving message signal.

    ack( ':sender', committee,
         ':receiver', lecturer,
         ':in-reply-to', task_completed,
         ':content', acknowledge_message,
         ':ontology', message,
         ':language', prolog )

· Review: The sender advises the receiver to review the examination paper by performing some actions that enable its human counterpart to do so. The content of the message is a description of the action to be performed, which the receiver understands and is capable of performing. The Review performative depends on the deadline and the status of the uploaded file.

    review( ':sender', lecturer,
            ':receiver', moderator,
            ':reply-with', task_completed,
            ':content', review_examination_paper,
            ':ontology', word_documents,
            ':language', prolog )

· Remind: The sender advises the receiver to perform a very important task, e.g. to submit Set A or Set B. The Remind performative depends on the deadline, the status of the uploaded file and the notice in subprograms.

    remind( ':sender', committee,
            ':receiver', lecturer,
            ':reply-with', task_completed,
            ':content', remind_message,
            ':ontology', message,
            ':language', prolog )

We reproduce below the sample code that implements a communicative act to inform the Committee agent that the EPMP process has been completed:

    complete_lecturer_dialog_handler :-
        agent_link( Agent, Link ),
        Complete = complete( ':sender', lecturer,
                             ':receiver', committee,
                             ':reply-with', task_completed,
                             ':content', complete_prepare_examination_paper,
                             ':ontology', word_documents,
                             ':language', prolog ),
        nl,
        agent_post( Agent, Link, Complete ),
        nl,
        check_points,
        open('c:\4',append),
        tell('c:\4').

For ontology development, we implicitly encode our ontologies in the actual software implementation of the agents themselves; they are thus not formally published as an ontology service [5].
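The message parameters and the acknowledge-or-resend-offline rule described above can be illustrated outside Prolog. The following Python sketch is not the authors' implementation: the names `AclMessage`, `make_ack`, `send_with_ack`, `deliver` and `offline_queue` are hypothetical, and the assumption that an ACK swaps sender and receiver and echoes the `reply-with` value in `in-reply-to` is ours.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class AclMessage:
    """A message carrying the FIPA-style parameters used in the samples above."""
    performative: str
    sender: str
    receiver: str
    content: str
    ontology: str
    language: str = "prolog"
    reply_with: Optional[str] = None
    in_reply_to: Optional[str] = None


def make_ack(msg: AclMessage) -> AclMessage:
    """Build the ACK a receiver would return for `msg` (assumed field mapping)."""
    return AclMessage(
        performative="ack",
        sender=msg.receiver,
        receiver=msg.sender,
        content="acknowledge_message",
        ontology="message",
        in_reply_to=msg.reply_with,
    )


def send_with_ack(
    msg: AclMessage,
    deliver: Callable[[AclMessage], Optional[AclMessage]],
    offline_queue: List[AclMessage],
) -> bool:
    """Try online delivery; without a matching ACK, keep the message for offline delivery.

    `deliver` is a hypothetical transport callback that returns the remote
    agent's reply, or None when the agent is offline or unreachable.
    """
    reply = deliver(msg)
    if (
        reply is not None
        and reply.performative == "ack"
        and reply.in_reply_to == msg.reply_with
    ):
        return True
    offline_queue.append(msg)
    return False
```

A caller would build a Prepare message with the same parameter values as the sample above and pass a transport function as `deliver`; a return value of `False` means the message now waits in `offline_queue`.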
The sample code below shows the ontology implementation after the Committee agent (C) receives the Complete message from the Lecturer agent (L):

    committee_handler(Name,Link,complete(|Args) ):-
        committee_dialog_handler( (committee,1006), msg_button, _, _ ),
        repeat,
        wait( 0 ),
        ( complete_prepare_examination_paper ),
        fipa_member( ':sender', From, Args ),
        fipa_member( ':reply-with', ReplyWith, Args ),
        committee_reply( Name, From, ReplyWith, done, Reply ),
        agent_post( Name, Link, Reply ),
        timer_create( clock3, clock_hook3 ),
        timer_set( clock3,1000).

    % ontology call
    complete_prepare_examination_paper:-
        repeat,
        wait( 0 ),
        committee_acknowledge_remot_agent,
        examination_paper,
        committee_form,
        committee_message.

Due to space limitations, we only show the ontology for the examination paper.

    examination_paper:-
        absolute_file_name( system(ole), File ),
        ensure_loaded( File ),
        ole_initialize,
        ole_create( word, 'word.application' ),
        ole_get_property( word, documents, [], WordDocuments ),
        assert( my_object(word_documents,WordDocuments) ),
        ole_put_property( word, visible, -1 ),
        my_object( word_documents, WordDocuments ),
        absolute_file_name( ('C:\database\Examination Paper.docx'), FileName ),
        ole_function( WordDocuments, open, [FileName], SecondDocument ),
        assert( my_object(second_document,SecondDocument) ).

To test the collaborative system, we deploy human actors to perform the roles of Committee, Lecturer and Moderator. These people communicate with their corresponding agents to advance the workflow. An interface for each agent provides the communication between human actors and agents (see Figure 5).

Figure 5. A Lecturer Agent Interface

The test produces the following results. On the set date, the Committee agent sends the Prepare message to the Lecturer agent. The Lecturer agent acknowledges the receipt of the message and then shows the message on the screen for its human counterpart.
It then opens a new Word document for the examination paper and another document displaying the guidelines for preparing the examination paper. While the human lecturer simulates the preparation of the examination paper, the Committee agent sends a reminder to the Lecturer agent, which displays the reminder on the screen for the human lecturer. When the human lecturer uploads the completed examination paper through the user interface (see Figure 5), the Lecturer agent checks the date, calculates the merit/demerit points and sends the Review message to the Moderator agent.

The Moderator agent acknowledges the receipt of the message, displays the message on the screen for its human counterpart, and opens the examination paper and the moderation form. While the human moderator simulates the moderation of the examination paper, the Committee agent sends a reminder to the Moderator agent, which displays the reminder on the screen for the human moderator. When the human moderator uploads the completed moderation form and the moderated examination paper through the user interface, the Moderator agent checks the date, calculates the merit/demerit points and sends the Check message to the Lecturer agent.

The Lecturer agent acknowledges the message, displays the message on the screen for the human lecturer and opens the moderated examination paper and the completed moderation form. The human lecturer checks the moderation form to see whether any corrections are to be made. In this test, we do not simulate any corrections. The human lecturer then uploads the moderation form and the moderated examination paper. The Lecturer agent then checks the date, calculates the merit/demerit points and sends a Complete message to the Committee agent. The Committee agent acknowledges the message, displays the message on the screen for its human counterpart and opens the Committee form and the moderated examination paper.
The human committee then uploads the moderated examination paper to the EC Print File. The Committee agent then sends an inform-all message to all agents that the EPMP process is completed.

This simulation shows that, with the features and autonomous actions performed by the agents, the collaboration between the human Committee, Lecturer and Moderator improves. The agents register dated actions, remind humans about the deadlines, advertise to all agents if there is no submission when the deadline has expired, and award merit points to or impose demerit points on humans. The humans’ cognitive load is reduced because they no longer need to keep track of the deadlines of important tasks and the destinations of documents; the consistent alerting services provided by the agents ensure constant reminders of the deadlines. All these actions and events are recorded in the agent environment to keep track of the process flow, which enables the agents to resolve any impending problems. The ease of uploading the files and the subsequent communicative acts performed by agents and between agents contribute to the achievement of the shared goal, i.e. the completion of the examination paper preparation and moderation process.

As such, we believe that the agent-based system provides some evidence that the problems suffered by the manual system (lack of enforcement, lack of reminders of time-critical tasks and delays in response) have been addressed. Table 2 compares the features of the manual and the agent-based systems and highlights the improvements.

Table 2: Comparison between Manual and Automated (Agent-based) Systems

Features                Manual             Automated
Human cognitive load    High               Reduced
Process tracking        No                 Yes
Merit/demerit system    No                 Yes
Reminder/alerting       No                 Yes
Offline messaging       Not applicable     Yes
Housekeeping            Inconsistent       Consistent
Document submission     Human-dependent    Immediate
Feedback                Human-dependent    Immediate

6. Conclusions and Further Work

In this research, we developed and simulated a collaborative framework based on the communication between agents using the FIPA agent communication protocol. We demonstrated the usefulness of the system in taking over the timing and execution of scheduled tasks from humans to achieve a shared goal. The important tasks, i.e. the preparation and moderation tasks, are still performed by humans. The agents perform communicative acts to other agents when the tasks are completed. Such acts help reduce the cognitive load of humans in performing scheduled tasks and improve the collaborative process.

Our agents are collaborative and autonomous, but they are not learning agents. In our future work, we will explore and incorporate machine learning capabilities into our agents, so that the agents learn from previous experiences and enhance the EPMP process.

References

[1] Ahmed M., Ahmad M. S., Mohd Yusoff M. Z., A review and development of Agent Communication Language, Electronic Journal of Computer Science and Information Technology (eJCSIT), ISSN 1985-7721, Vol. 1, No. 1, pp. 7 – 12, May 2009.
[2] Chen J. J-Y., Su S-W., AgentGateway: A communication tool for multiagent systems, Information Sciences, Vol. 150, Issues 3-4, pp. 153 – 154, 2003.
[3] Finin T., Fritzson R., McKay D., McEntire R., KQML as an Agent Communication Language, Proceedings of the Third International Conference on Information and Knowledge Management (CIKM '94), 1994.
[4] FIPA ACL Message Structure Specification: SC00061G, Dec. 2002.
[5] FIPA Ontology Service Specification: XC00086D, Aug. 2001.
[6] FIPA Communicative Act Library Specification: SC00037J, 2002/12/03.
[7] Fleurke M., Ehrler L., Purvis M., JBees – An adaptive and distributed framework for workflow systems, Proc. IEEE/WIC International Conference on Intelligent Agent Technology, 2003, Halifax, Canada.
[8] Fox M. S., Gruninger M., On Ontologies and Enterprise Modelling, Enterprise Integration Laboratory, Dept.
of Mechanical & Industrial Engineering, University of Toronto.
[9] Genesereth M. R., Ketchpel S. P., Software agents, Communication of the ACM, Vol. 37, No. 7, July 1994.
[10] Gruber T. R., A Translation Approach to Portable Ontologies, Knowledge Acquisition, 5(2):199–220, 1993.
[11] Guarino N., Giaretta P., Ontologies and Knowledge Bases: Towards a Terminological Clarification. In N. Mars, Editor, Towards Very Large Knowledge Bases: Knowledge Building and Knowledge Sharing, pages 25–32. IOS Press, Amsterdam, 1995.
[12] Guerra-Hernandez A., El Fallah-Seghrouchni A., Soldano H., Learning in BDI Multi-agent Systems, Universite Paris, Institut Galilee.
[13] Heflin J. D., Towards The Semantic Web: Knowledge Representation in a Dynamic Distributed Environment, PhD Dissertation, 2001.
[14] Labrou Y., Finin T., Semantics for an Agent Communication Language, PhD Dissertation, University of Maryland, 1996.
[15] Muehlen M. Z., Rosemann M., Workflow-based process monitoring and controlling – technical and organizational issues. Proc. 33rd Hawaii International Conference on System Sciences, 2000, Wailea, IEEE Press.
[16] Pasquier P., Chaib-draa B., Agent communication pragmatics: The Cognitive Coherence Approach, Cognitive Systems Research, Vol. 6, Issue 4, pp. 364 – 395, 2005.
[17] Payne T. R., Paolucci M., Singh R., Sycara K., Communicating agents in open multiagent systems, First GSFC/JPL Workshop on Radical Agent Concepts (WRAC), 2002.
[18] Perrault C. R., Cohen P. R., Overview of planning speech Acts, Dept. of Computer Science, University of Toronto.
[19] Savarimuthu B. T. R., Purvis M., Fleurke M., Monitoring and controlling of a multiagent based workflow system. In Proc. Australasian Workshop on Data Mining and Web Intelligence (DMWI2004), Dunedin, New Zealand. CRPIT, 32. Purvis, M., Ed. ACS. 127-132.
[20] Searle J. R., Kiefer F., Bierwisch M. (Eds.): Speech act theory and pragmatics, Springer, 1980.
[21] Wang M., Wang H., Intelligent agent supported workflow monitoring system, CAISE 2002, LNCS 2348, 787-791.
[22] http://www.lpa.co.uk/chi.htm.

Authors Profile

Moamin A. Mahmoud received his B.Sc. in Mathematics from the College of Mathematics and Computer Science, University of Mosul, Iraq in 2008. Currently, he is enrolled in the Master of Information Technology program at the College of Graduate Studies, Universiti Tenaga Nasional (UNITEN), Malaysia. During his studentship at UNITEN, he conducted additional laboratory work for degree students at the College of Information Technology. His current research interests include software agents and multiagent systems.

Mohd S. Ahmad received his B.Sc. in Electrical and Electronic Engineering from Brighton Polytechnic, UK in 1980. He started his career as a power plant engineer specialising in Process Instrumentation and Control in 1980. After completing his MSc in Artificial Intelligence at Cranfield University, UK in 1995, he joined UNITEN as a Principal Lecturer and Head of the Dept. of Computer Science and Information Technology. He obtained his PhD from Imperial College, London, UK in 2005. He has been an associate professor at UNITEN since 2006. His research interests include applying constraints to develop collaborative frameworks in multi-agent systems, collaborative interactions in multi-agent systems and tacit knowledge management using AI techniques.

Mohd Z. M. Yusoff obtained his BSc and MSc in Computer Science from Universiti Kebangsaan Malaysia in 1996 and 1998 respectively. He started his career as a Lecturer at UNITEN in 1998 and has been a Principal Lecturer at UNITEN since 2008. He has produced and presented more than 40 papers for local and international conferences.
His research interests include modeling and applying emotions in various domains including educational systems and software agents, modeling trust in computer forensics and integrating agents in knowledge discovery systems.

Modified Feistel Cipher Involving Interlacing and Decomposition

K. Anup Kumar1 and V.U.K. Sastry2

1 Associate Professor, Department of Computer Science and Engineering, SNIST, Hyderabad, Andhra Pradesh, India. k_anupkumar@yahoo.com
2 Dean R & D, Department of Computer Science and Engineering, SNIST, Hyderabad, Andhra Pradesh, India. vuk_sastry@rediffmail.com

Abstract: In this paper, we discuss the generation of a large block cipher of 256 bits by using a modified Feistel structure involving the basic concepts of interlacing, decomposition and key-based random permutations. In each round, we perform decomposition before encryption and interlacing after encryption. The key-based random permutations and substitutions used in this process are similar to the ones published in our previous paper. The cryptanalysis carried out in this paper indicates that the cipher cannot be broken by the cryptanalytic attacks considered, owing to the nonlinearity induced by the interlacing, decomposition and key-based random permutations.

Keywords: Encryption, Decryption, Plaintext, Cipher text, Key, Interlacing, Decomposition.

1. Introduction

In the literature of cryptography, the Feistel structure has a predominant role in generating block ciphers of a required size. Here, the bits of the plaintext undergo a series of diffusion and confusion transformations involving permutations and substitutions. The classical Feistel structure involves a round function, and the number of rounds that provides good strength to the cipher is sixteen. In this paper, we have developed a block cipher of 256 bits, using 16 rounds of the classical Feistel structure.
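For orientation, the classical Feistel structure referred to above can be sketched as follows. This is a generic illustration under stated assumptions, not the cipher of this paper: `toy_round_function` is a hypothetical stand-in built on SHA-256, whereas the paper's round function F uses the key-based permutations and substitutions of reference [6].

```python
import hashlib
from typing import List


def toy_round_function(half: bytes, round_key: bytes) -> bytes:
    """Hypothetical round function: a hash of key and half-block, cut to size."""
    return hashlib.sha256(round_key + half).digest()[: len(half)]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def feistel_encrypt(block: bytes, round_keys: List[bytes]) -> bytes:
    """Run one Feistel round per key: (L, R) -> (R, L xor F(R, k))."""
    half = len(block) // 2
    left, right = block[:half], block[half:]
    for key in round_keys:
        left, right = right, xor_bytes(left, toy_round_function(right, key))
    # Swap the halves at the end so that decryption can reuse this routine.
    return right + left


def feistel_decrypt(block: bytes, round_keys: List[bytes]) -> bytes:
    """Decryption is the same network with the round keys in reverse order."""
    return feistel_encrypt(block, list(reversed(round_keys)))
```

The point of the structure is that decryption works for any round function, invertible or not, because each round only XORs the output of F into one half of the block.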
In the process of encryption and decryption, the function ‘F’ used in each round is the same as in our conventional Feistel structure with key-based random permutations and substitutions published in our previous paper, see reference [6]. To obtain proper mixing of bits between two consecutive rounds, to introduce nonlinearity and to counter cryptanalysis, we use the concepts of interlacing and decomposition. Our interest is to develop a block cipher using a Feistel network that resists cryptanalytic attack. In section 2 of this paper, we introduce the process of interlacing and decomposition in a Feistel network and illustrate it with figures. In section 3, we discuss the development of the cipher, and we present the algorithms for encryption, decryption, interlacing and decomposition in section 4. We illustrate the cipher in section 5 and investigate cryptanalytic attacks on the cipher in section 6. In section 6.3, we discuss the avalanche effect, which is followed by the conclusion in section 7 and the references in section 8.

2. Interlacing and Decomposition

Let us illustrate the process of decomposition first. Let ‘P’ be the plaintext of length 256 bits, and let $C^{0} = P$ be the initial plaintext. We divide this 256-bit block into four small blocks of 64 bits each. Thus we get $B^{0}_{1}, B^{0}_{2}, B^{0}_{3}, B^{0}_{4}$ as 64-bit blocks by placing the first 64 bits of $C^{0}$ in $B^{0}_{1}$, the next 64 bits of $C^{0}$ in $B^{0}_{2}$, and so on. Hence,

$C^{k} = \Sigma\, B^{k}_{i,j}$, with $i = 1$ to $4$, $j = 1$ to $64$ and $k = 0$ to $16$,

where $k = 0$ indicates the initial plaintext, $k = m$ indicates the cipher text after the m-th round, and $\Sigma$ indicates concatenation of bits. Let $C^{0} = \{ C^{0}_{1}, C^{0}_{2}, C^{0}_{3}, \ldots, C^{0}_{256} \}$. Then,

$B^{m}_{i} = \Sigma\, C^{m}_{j+k}$, where $i = 1$ to $4$, $j = 1$ to $64$ and $k = 64(i-1)$.
Therefore,

$B^{m}_{1} = \{ C^{m}_{1}, C^{m}_{2}, C^{m}_{3}, \ldots, C^{m}_{64} \}$ (2.1)
$B^{m}_{2} = \{ C^{m}_{65}, C^{m}_{66}, C^{m}_{67}, \ldots, C^{m}_{128} \}$ (2.2)
$B^{m}_{3} = \{ C^{m}_{129}, C^{m}_{130}, C^{m}_{131}, \ldots, C^{m}_{192} \}$ (2.3)
$B^{m}_{4} = \{ C^{m}_{193}, C^{m}_{194}, C^{m}_{195}, \ldots, C^{m}_{256} \}$ (2.4)

We perform decomposition before encryption, so that a large block of 256 bits is divided into small blocks of 64 bits. The encryption of these small blocks can then be done in parallel, and therefore faster. Moreover, decomposition allows us to introduce enough confusion in a large block cipher to maintain the desired avalanche effect; see section 6.3.

Now let us illustrate the process of interlacing. We perform interlacing after encryption is performed on the small blocks $B^{m}_{1}, B^{m}_{2}, B^{m}_{3}, B^{m}_{4}$. Let $c^{m+1}_{1}, c^{m+1}_{2}, c^{m+1}_{3}, c^{m+1}_{4}$ be the corresponding ciphers obtained after encryption. Let $C^{i}$ be the 256-bit cipher obtained after interlacing the ciphers $c^{m+1}_{1}, c^{m+1}_{2}, c^{m+1}_{3}, c^{m+1}_{4}$. Here ‘i’ indicates the round after which interlacing is performed, and $i = m+1$. In the process of interlacing, we take the first bit of $c^{m+1}_{1}$ and place it as the first bit of $C^{i}$; next we take the first bit of $c^{m+1}_{2}$ and place it as the second bit of $C^{i}$; similarly, the first bits of $c^{m+1}_{3}$ and $c^{m+1}_{4}$ are placed as the third and fourth bits of $C^{i}$. This process is continued until all the bits of $c^{m+1}_{1}, c^{m+1}_{2}, c^{m+1}_{3}, c^{m+1}_{4}$ are combined into $C^{i}$. Therefore,

$C^{i} = \{ c_{1,1}, c_{2,1}, c_{3,1}, c_{4,1}, c_{1,2}, c_{2,2}, c_{3,2}, c_{4,2}, \ldots, c_{1,64}, c_{2,64}, c_{3,64}, c_{4,64} \}$ (2.5)

where $c_{b,j}$ denotes bit j of the cipher block $c^{m+1}_{b}$. Thus, the process of interlacing allows us to mix the bits thoroughly before beginning the next round. Interlacing and decomposition enable us to perform variable permutations and substitutions on the bits in each round. The following figures explain how interlacing and decomposition are used.

[Figure: Decomposition. The 256 bits C1 … C256 are split into the 64-bit blocks B1, B2, B3, B4.]

[Figure: Interlacing. The bits c1 … c64 of the 64-bit blocks C1, C2, C3, C4 are merged as c1,1 c2,1 c3,1 c4,1 c1,2 c2,2 c3,2 c4,2 … c1,64 c2,64 c3,64 c4,64 into the 256-bit cipher text Ci.]

3. Development of Cipher

Let us consider a block of plaintext ‘P’ consisting of 32 characters. By using the EBCDIC code, each character can be represented in terms of 8 bits. The entire plaintext of 32 characters then yields a block containing 256 bits. Let this initial plaintext be represented as $C^{0}$. Let the key ‘K’ contain 16 integers; the 8-bit binary representation of these integers yields a block containing 128 bits. Let this block be denoted as ‘k’. Let the first 32 bits of ‘k’ be treated as k1 and the next 32 bits of ‘k’ as k2. Similarly, we get two more keys, k3 and k4. As we use four different blocks B1, B2, B3, B4 of 64 bits each for encryption, we derive one round key for each block by applying to k1, k2, k3 and k4 the transformations published in our previous paper, see reference [6]. The following is the process proposed for using interlacing and decomposition during encryption and decryption in the Feistel structure.

Note: The permutations, substitutions and key generation during encryption, and the reverse permutations, substitutions and key generation during decryption, are discussed in our earlier paper. See reference [6].

[Figure: Encryption involving interlacing and decomposition. The 256-bit plaintext C0 is decomposed, passed through the function F and interlaced in each of Rounds 1 to 16, giving the 256-bit cipher text C16.]

We generate the keys for the respective rounds, denoted as $kr^{m}_{1}, kr^{m}_{2}, kr^{m}_{3}, kr^{m}_{4}$, such that if $kr^{m}_{i}$ is the round key, then ‘i’ indicates the block and ‘m’ indicates the round. The initial plaintext of 256 bits is represented as $C^{0}$. Decompose $C^{0}$ into four blocks of 64 bits each.
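The decomposition of (2.1) to (2.4) and the interlacing of (2.5) can be sketched in Python as below. This is a minimal illustration operating on strings of '0' and '1' characters, not the authors' implementation. The helper names are our own, and `text_to_bits` assumes 8-bit ASCII codes, which is what the bit strings printed in Section 5 correspond to. `deinterlace` is an extra helper, not defined in the paper, that undoes `interlace`.

```python
from typing import List


def text_to_bits(text: str) -> str:
    """Encode text as a bit string, 8 bits per character (ASCII assumed)."""
    return "".join(f"{byte:08b}" for byte in text.encode("ascii"))


def decompose(bits: str, parts: int = 4) -> List[str]:
    """Split a block into `parts` consecutive equal-sized blocks, as in (2.1)-(2.4)."""
    size = len(bits) // parts
    return [bits[i * size:(i + 1) * size] for i in range(parts)]


def interlace(blocks: List[str]) -> str:
    """Merge equal-length blocks bit by bit in round-robin order, as in (2.5)."""
    return "".join("".join(column) for column in zip(*blocks))


def deinterlace(bits: str, parts: int = 4) -> List[str]:
    """Inverse of interlace: block b collects every `parts`-th bit starting at b."""
    return [bits[b::parts] for b in range(parts)]
```

Applied to the plaintext of Section 5, `decompose(text_to_bits(...))` returns the four blocks (5.3) to (5.6), and interlacing the round-1 cipher blocks (5.7) to (5.10) reproduces the leading bits of (5.11).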
This can be represented as $B^{0}_{1}, B^{0}_{2}, B^{0}_{3}$ and $B^{0}_{4}$. Therefore, $B^{m}_{i} = \, < C^{m} >$, where ‘m’ indicates the round after which decomposition is performed, ‘i’ indicates the block number, i = 1 to 4, and $< C^{m} >$ indicates decomposition.

In the first round, encryption is done in the following way. We perform the required transformations on k1, k2, k3 and k4 to get $kr^{n}_{1}, kr^{n}_{2}, kr^{n}_{3}, kr^{n}_{4}$. Then

$C^{n}_{i} = F_{kr^{n}_{i}} ( B^{m}_{i} )$,

where i = 1 to 4 indicates the i-th block, ‘F’ indicates encryption, $kr^{n}_{i}$ indicates the round key for the n-th round on the i-th block, and n = m+1. After encryption in the n-th round, we get the cipher text as four blocks $C^{n}_{1}, C^{n}_{2}, C^{n}_{3}, C^{n}_{4}$. Next we perform interlacing after encryption:

$C^{n} = \, > C^{n}_{i} <$,

where i = 1 to 4 indicates the cipher block, n = 1 to 16 indicates the round after which interlacing is performed, and $> C^{n}_{i} <$ represents interlacing. Similarly, during decryption, we proceed in the same way as discussed above, performing reverse transformations on the key. See reference [6] for the reverse transformations used.

4. Algorithms

In the algorithms below, C(i) denotes the 256-bit text after round i, B(i, j) and C(i, j) denote the j-th 64-bit block before and after encryption, and kr(i, j) denotes the round key of block j in round i.

4.1 Algorithm for Encryption

    BEGIN
    C(0) = P                                  // initialize 256-bit plaintext
    for i = 1 to 16
    {
        for j = 1 to 4
        {
            B(i-1, j) = < C(i-1) >            // Decompose
        }
        for j = 1 to 4
        {
            C(i, j) = F_kr(i, j) ( B(i-1, j) )   // Encryption
        }
        C(i) = > C(i, j) <                    // Interlace blocks j = 1 to 4
    }
    END

4.2 Algorithm for Decryption

[Figure: Decryption involving interlacing and decomposition. The 256-bit cipher text C16 is decomposed, passed through the function F and interlaced in each of Rounds 16 down to 1, giving the 256-bit plaintext.]

    BEGIN
    C(16) = cipher text                       // initialize 256-bit cipher text
    for i = 16 to 1
    {
        for j = 1 to 4
        {
            B(i, j) = < C(i) >                // Decompose
        }
        for j = 1 to 4
        {
            C(i-1, j) = F_kr(i, j) ( B(i, j) )   // Decryption with the reverse-transformed key, see [6]
        }
        C(i-1) = > C(i-1, j) <                // Interlace blocks j = 1 to 4
    }
    END

4.3 Algorithm for Decomposition

    BEGIN
    < C(i-1) >                                // during i-th round
    {
        for n = 1 to 256
        {
            j = ceiling( n / 64 )             // block number, 1 to 4
            B(i-1, j)[ n - 64*(j-1) ] = C(i-1)[ n ]
        }
    }
    END

4.4 Algorithm for Interlacing

    BEGIN
    > C(i, j) <                               // after encryption in i-th round
    {
        for n = 1 to 64
        {
            for j = 1 to 4
            {
                C(i)[ 4*(n-1) + j ] = C(i, j)[ n ]   // bit n of block j, as in (2.5)
            }
        }
    }
    END

5. Illustration of Cipher

Consider the plaintext P = { O Lord, Please save me from evil }. Let the key K = { 155, 23, 59, 3, 111, 26, 91, 36, 77, 148, 87, 59, 118, 2, 65, 181 }. The 8-bit binary representations of the plaintext P and the key K are as follows. The initial plaintext $C^{0} = P$ is

0100111100100000010011000110111101110010011001000010110000100000
0101000001101100011001010110000101110011011001010010000001110011
0110000101110110011001010010000001101101011001010010000001100110
0111001001101111011011010010000001100101011101100110100101101100 (5.1)

The initial key k is

1001101100010111001110110000001101101111000110100101101100100100
0100110110010100010101110011101101110110000000100100000110110101 (5.2)

Let the plaintext be decomposed into $B^{0}_{1}, B^{0}_{2}, B^{0}_{3}, B^{0}_{4}$. Then the respective 64-bit blocks after decomposition are as follows.

0100111100100000010011000110111101110010011001000010110000100000 (5.3)
0101000001101100011001010110000101110011011001010010000001110011 (5.4)
0110000101110110011001010010000001101101011001010010000001100110 (5.5)
0111001001101111011011010010000001100101011101100110100101101100 (5.6)

Permute the bits in key ‘k’ by using the random key-based permutations published in our previous paper. See reference [6]. Let this permuted key be divided into four equal-size blocks and used as the round keys $kr^{1}_{1}, kr^{1}_{2}, kr^{1}_{3}, kr^{1}_{4}$ for the blocks $B^{0}_{1}, B^{0}_{2}, B^{0}_{3}, B^{0}_{4}$ respectively.

Now we encrypt these four blocks with their respective round keys and with the help of the round function ‘F’ as described in our previous paper. See reference [6]. The corresponding cipher blocks $C^{1}_{1}, C^{1}_{2}, C^{1}_{3}, C^{1}_{4}$ obtained after encryption in the first round are as follows.

0110000101110110011001010010000001101101011001010010000001100110 (5.7)
0111001001101111011011010010000001100101011101100110100101101100 (5.8)
0110100001100100000000010100100101010111111100011011111101111001 (5.9)
0010001001111000111110101100100000100111111010011010100100100010 (5.10)

Next, we interlace these four blocks to get the block cipher $C^{1}$, so that enough confusion and nonlinearity are induced by mixing the bits of these small block ciphers. After applying interlacing, we get the following block cipher as $C^{1}$.

00001110111101000010000001011000000011111111
10010101111011000100000111011101000101011100
00011110000100111100000000110000000000100000
11101101001010001111001111110011111111110110
00011100010010100011010011110010011100100010
011100001110111100100110110010010010 (5.11)

Similarly, by using the respective round and sub keys, we continue the process up to 16 rounds and we get the following cipher.

10011111110011001000010110011010110000010111
01011000100011110111001000111110111101000101
00010001001110000001001000100110110000001001
01110100001000101100101010001111001001111100
11110111000001001010000000101001101011011000
011111000010000011000110011011101110 (5.12)

Since the process of decryption is the same as the process of encryption, we get the plaintext by following steps similar to those illustrated above, but with reverse permuted keys.

6. Cryptanalysis

Now, let us examine the brute force attack and the known plaintext attack on our cipher to assess its strength. We show that the brute force attack is infeasible, and that the known plaintext attack leads to a system of equations from which the unknown key cannot be determined.

6.1 Brute Force Attack

We use the 128-bit key k in each round: we divide k into four blocks, perform the required transformations and get the round sub keys $kr^{1}_{1}, kr^{1}_{2}, kr^{1}_{3}, kr^{1}_{4}$ for the plaintext blocks $B^{0}_{1}, B^{0}_{2}, B^{0}_{3}, B^{0}_{4}$ respectively. In a brute force attack, the key has to be guessed, which requires an exhaustive search of a key space of size

$2^{128} = (2^{10})^{12.8} \approx (10^{3})^{12.8} \approx 10^{38}$. (6.1.1)

Since testing every possible key in such a huge key space would take an impractical number of years, we say that the brute force attack is not feasible against our algorithm.

6.2 Known Plaintext Attack

In this case, the attacker has as many plaintext and cipher text pairs as required. In the present paper, it is the interlacing and decomposition concepts that handle the known plaintext attack. Let us first understand how the classical Feistel cipher is prone to the known plaintext attack and then discuss how our modified Feistel cipher tackles this problem.

In the classical Feistel cipher network, the problem lies with a particular set of bits that always undergoes the same transformations in every successive round. For example, the first six bits always go into the first substitution box. Therefore, given enough plaintext and cipher text pairs, one can guess the values used in a substitution box while ignoring the other substitution boxes. Similarly, one will be able to guess the key bits.

This problem does not exist in our modified algorithm, because we use four independent blocks of encryption in each round. It is ensured that the bits after a particular round do not enter the same substitution boxes and do not use the same permutations and key. This is due to the interlacing and decomposition concepts, which scatter the bits into four different blocks. Thus, interlacing and decomposition allow us to mix the bits properly and help us introduce high nonlinearity into the algorithm.

6.
3 Avalanche Effect Let the plaintext be “O Lord, Please save me from evil”. By following the process of encryption, we get the cipher. 10011111110011001000010110011010110000010111 01011000100011110111001000111110111101000101 00010001001110000001001000100110110000001001 01110100001000101100101010001111001001111100 11110111000001001010000000101001101011011000 011111000010000011000110011011101110 (6.3.1) Now let the plaintext be fixed, but change the key by one bit. This can be done by changing the number “155” to “156” in key ‘K’, since 155 and 156 differ by one bit. Now (IJCNS) International Journal of Computer and Network Security, Vol. 1, No. 2, November 2009 82 by using this new key ‘k’ we encrypt the same plaintext and we obtain the corresponding cipher as 00110010010101011010111001110010111111110110 01010011101110000101001000100010100001011101 01010101111010111100000111000001010001111010 00110101011110010110101101101010010101101010 01010101110010001011011100111100011001000100 100010011001010011010010101111010011 (6.3.2) Comparing (6.3.1) and (6.3.2), we notice that the two cipher blocks differ by 125 bits out of the total 256 bits. This shows that the algorithm exhibits strong avalanche effect. In the second case, let the key ‘K’ be fixed, But change the plaintext. So that, the new plaintext and the original one differ by exactly one bit. This can be accomplished by changing the first character of the plaintext from ‘O’ to ‘P’, because, ASCII values of ‘O’ and ‘P’ differ by one. We get the cipher text from this new plaintext as 11011100010001000100000100011000001000000101 00100100001110111010101111000001101100100110 11110010110010010000111001111001000111101000 00010001010011100100100000111000101001000101 00101010011111010011010110100010010010001100 001101101011001011100010001010101010 (6.3.3) On comparing (6.3.1) and (6.3.3), we notice that the two cipher blocks differ by 125 bits out of 256 bits. 
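The bit-level operations behind these comparisons can be sketched in a few lines. The following is a minimal illustration, not the authors' C implementation: the function names are hypothetical, bit strings are held as Python strings of '0' and '1', decomposition is taken to be a split into contiguous blocks as in (5.3) to (5.6), and interlacing is taken to be the bit-by-bit interleaving that is consistent with (5.7) to (5.11).

```python
from __future__ import annotations


def decompose(bits: str, parts: int = 4) -> list[str]:
    """Split a bit string into `parts` equal, contiguous blocks."""
    size, remainder = divmod(len(bits), parts)
    if remainder:
        raise ValueError("bit string length must be a multiple of `parts`")
    return [bits[i * size:(i + 1) * size] for i in range(parts)]


def interlace(blocks: list[str]) -> str:
    """Bit-interleave equal-length blocks: b1[0] b2[0] ... bk[0] b1[1] ..."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return "".join("".join(column) for column in zip(*blocks))


def bit_difference(a: str, b: str) -> int:
    """Count the positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("bit strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    # Leading 4 bits of the cipher blocks (5.7) to (5.10).
    heads = ["0110", "0111", "0110", "0010"]
    print(interlace(heads))                        # leading 16 bits of (5.11)
    print(bit_difference("10011011", "10011100"))  # 155 versus 156
    print(bit_difference("01001111", "01010000"))  # 'O' versus 'P'
```

Applying `bit_difference` to two full 256-bit ciphertexts gives the avalanche count directly, which makes the comparisons in this section easy to re-check.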
These comparisons show that the interlacing and decomposition introduced in our encryption algorithm produce a good avalanche effect.

7. Computational Results and Conclusion

In this paper, we have developed a block cipher of 256 bits. The plaintext consists of 32 characters, each represented by its 8-bit binary equivalent. The key contains 16 integers, each converted into its 8-bit binary equivalent. The algorithms used for encryption, decryption, decomposition, interlacing, and so on are all written in C. From the cryptanalysis presented, we conclude that a brute force attack is infeasible. Sufficient confusion and diffusion are introduced into the encryption algorithm through interlacing and decomposition; this is supported by the avalanche effect shown in Section 6.3. By using interlacing and decomposition, a 256-bit block is broken into four equal 64-bit blocks, so that the cipher bits obtained after each round scatter into different blocks in the next round. This makes cryptanalysis more difficult, as the final cipher text depends on different substitution boxes and different transformations.

References

[1] William Stallings, “Cryptography and Network Security: Principles & Practices”, Third edition, 2003, Chapters 2 and 3.
[2] H. Feistel, “Cryptography and Computer Privacy”, Scientific American, Vol. 228, No. 5, pp. 15-23, 1973.
[3] H. Feistel, W. Notz, and J. Smith, “Some Cryptographic Techniques for Machine to Machine Data Communications”, Proceedings of the IEEE, Vol. 63, No. 11, pp. 1545-1554, Nov 1975.
[4] H. Heys and S. Tavares, “Avalanche Characteristics of Substitution-Permutation Encryption Networks”, IEEE Transactions on Computers, 44(9): 1131-1139, 1995.
[5] Shakir M. Hussain and Naim M. Ajilouni, “Key based random permutation”, Journal of Computer Science 2(5): 419-421, 2006. ISSN 1549-3636.
[6] K. Anup Kumar and S. Udaya Kumar, “Block cipher using key based random permutations and key based random substitutions”, International Journal of Computer Science and Network Security, Seoul, South Korea. ISSN: 1738-7906. Vol. 08, No. 3, March 2008, pp. 267-277.

Authors Profile

K. Anup Kumar is working as an Associate Professor in the Department of Computer Science and Engineering, Sreenidhi Institute of Science and Technology. He is pursuing his PhD in the area of information security under the guidance of Prof. V.U.K. Sastry, from Jawaharlal Nehru Technological University, Hyderabad, India. He has published two papers in international journals. His research interests include cryptography, steganography, and parallel processing systems.

Prof. V.U.K. Sastry is working as the Director, School of Computer Science and Informatics, and as Dean R & D, CSE Department, at Sreenidhi Institute of Science and Technology, Hyderabad, India. He has successfully guided many PhD students, and his research interests are information security, image processing, and data warehousing and data mining. He is a reviewer for many international journals.

Acknowledgement

The authors are very thankful to Prof. Depanwita Roy Chaudhury, IIT Kharagpur, India, for her suggestions and valuable inputs while this paper was being written. The authors are also very thankful to the management of Sreenidhi Institute of Science and Technology for their support and encouragement during this research work.

A Comprehensive Analysis of Voice Activity Detection Algorithms for Robust Speech Recognition System under Different Noisy Environment

C. Ganesh Babu1, Dr. P. T. Vanathi2, R. Ramachandran3, M. Senthil Rajaa4, R. Vengatesh5
1 Research Scholar (PSGCT), Associate Professor/ECE, Bannari Amman Institute of Technology, Sathyamangalam, India. E-mail: bits_babu@yahoo.co.in
2 Assistant Professor/ECE, PSGCT, Coimbatore, India. E-mail: pt_vani@yahoo.com
3,4,5 UG Scholar, Bannari Amman Institute of Technology, Sathyamangalam, India.

Abstract: Speech signal processing has not been used much in the field of electronics and computers because of the complexity and variety of speech signals and sounds. However, modern processes, algorithms, and methods can process speech signals easily and also recognize the text. Demand for speech recognition technology is expected to rise dramatically over the next few years as people use their mobile phones as all-purpose lifestyle devices. In this paper, we implement a speech-to-text system using isolated word recognition with a vocabulary of ten words (digits 0 to 9) and statistical modeling (Hidden Markov Model, HMM) for machine speech recognition. In the training phase, the uttered digits are recorded using 8-bit Pulse Code Modulation (PCM) with a sampling rate of 8 kHz and saved as a wave file using sound recorder software. The system performs speech analysis using the Linear Predictive Coding (LPC) method. From the LPC coefficients, the weighted cepstral coefficients and cepstral time derivatives are derived, and from these the feature vector for a frame is obtained. Then, the system performs Vector Quantization (VQ) using a vector codebook; the resulting vectors form the observation sequence. For a given word in the vocabulary, the system builds an HMM and trains it during the training phase. The training steps, starting from Voice Activity Detection (VAD), are performed using PC-based Matlab programs. Our current framework uses a speech processing module that includes a Subband Order Statistics Filter based Voice Activity Detection with Hidden Markov Model (HMM)-based classification and noise language modeling to achieve effective noise knowledge estimation.

Keywords: Hidden Markov Model, Vector Quantization, Subband OSF based Voice Activity Detection.

1. INTRODUCTION

Currently, speech recognition systems face many technical barriers in modern applications. An important drawback affecting most of these applications is harmful environmental noise, which reduces system performance. Most noise compensation algorithms require a Voice Activity Detector (VAD) to estimate the presence or absence of the speech signal [1]. In this paper, we comprehensively compare the performance of the VAD algorithm for Automatic Speech Recognition (ASR) in the presence of different types of noise: airport, babble, train, car, street, exhibition, restaurant, and station. The proposed speech recognition system for noisy environments is shown in Figure 1.

Figure 1. Proposed Robust Speech Recognition System

1.1 Speech Characteristics

Speech signals are composed of a sequence of sounds. Sounds can be classified into three distinct classes according to their mode of excitation.
(i) Voiced sounds are produced by forcing air through the glottis with the tension of the vocal cords adjusted so that they vibrate in a relaxation oscillation, thereby producing a quasi-periodic pulse of air which excites the vocal tract.
(ii) Fricative or unvoiced sounds are generated by forming a constriction at some point in the vocal tract and forcing air through the constriction at a high enough velocity to produce turbulence.
(iii) Plosive sounds result from making a complete closure and abruptly releasing it.

1.2 Overview of Speech Recognition

The performance of a speech recognition system often degrades when there is a mismatch between the acoustic conditions of the training and application environments. This mismatch may come from various sources, such as additive noise, channel distortion, different speaker characteristics, and different speaking modes. Various robustness techniques have been proposed to reduce this mismatch and thus improve recognition performance [12].
In the last decades, many methods have been proposed to enable ASR systems to compensate for, or adapt to, mismatch due to inter-speaker differences, articulation effects, and microphone characteristics [14].

The paper is organized as follows. Section 2 reviews the theoretical background of VAD algorithms. Section 2.1 presents the principle of the VAD decision rule. Section 3 explains the subband OSF-based VAD implementation. Section 4 describes the HMM used together with the VAD. Results are discussed in Section 5. The paper is concluded in Section 6.

[Figure 1 block labels: Input Speech, Noise Estimation, VAD, Output]

2. Voice Activity Detection

Voice is differentiated into speech or silence based on speech characteristics. The signal is sliced into contiguous frames, and a real-valued non-negative parameter is associated with each frame. If this parameter exceeds a certain threshold, the frame is classified as speech; otherwise it is classified as non-speech. In other words, a VAD extracts measured features or quantities from the input signal and compares these values with thresholds. Voice activity (VAD = 1) is declared if the measured value exceeds the threshold; otherwise no speech activity (VAD = 0) is declared. In general, a VAD algorithm outputs a binary decision on a frame-by-frame basis, where a frame of the input signal is a short unit of time such as 20 to 40 ms.

2.1 VAD Decision Rule

Once the input speech has been de-noised, its spectrum magnitude Y(k, l) is processed by means of a (2N+1)-frame window. Spectral changes around an N-frame neighborhood of the current frame are computed using the N-order Long-Term Spectral Envelope (LTSE), defined in (1), where l is the current frame for which the VAD decision [12] is made and k = 0, 1, ..., NFFT-1 is the spectral band. The noise suppression block has to perform the noise reduction of the block of frames in (2) before the LTSE at the l-th frame can be computed.
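The long-term envelope and the threshold comparison can be sketched as follows. This is a minimal illustration under stated assumptions: it takes the LTSE to be the per-band maximum of Y(k, l+j) over j = -N, ..., +N, and it takes the divergence measure used by the decision rule (the LTSD) to be 10 log10 of the band-averaged ratio LTSE² / N², which is the form commonly given for this method in [1]. The function names, the list-of-lists layout `Y[frame][band]`, and the clipping of the window at the signal edges are choices made here for illustration only.

```python
import math


def ltse(Y, l, N):
    """Per-band maximum of the magnitudes over frames l-N .. l+N (edges clipped)."""
    lo, hi = max(0, l - N), min(len(Y) - 1, l + N)
    return [max(Y[j][k] for j in range(lo, hi + 1)) for k in range(len(Y[0]))]


def ltsd(envelope, noise):
    """Assumed LTSD: 10*log10 of the mean over bands of (LTSE / noise)^2."""
    ratio = sum((e * e) / (n * n) for e, n in zip(envelope, noise)) / len(envelope)
    return 10.0 * math.log10(ratio) if ratio > 0 else float("-inf")


def vad_decision(Y, l, N, noise, threshold_db):
    """Return 1 (speech) if the LTSD of frame l exceeds the threshold, else 0."""
    return 1 if ltsd(ltse(Y, l, N), noise) > threshold_db else 0


if __name__ == "__main__":
    # Five frames, two bands; frame 2 carries a burst of energy.
    Y = [[1.0, 1.0], [1.0, 1.0], [4.0, 2.0], [1.0, 1.0], [1.0, 1.0]]
    noise = [1.0, 1.0]  # residual noise magnitude per band (must be positive)
    print([vad_decision(Y, l, 1, noise, threshold_db=6.0) for l in range(len(Y))])
```

Because the envelope takes a maximum over neighboring frames, frames adjacent to a speech burst are also flagged, which is the intended effect of using long-term information.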
The noise reduction is carried out as follows. During initialization, the noise suppression algorithm is applied to the first 2N+1 frames and, in each iteration, the (l+N+1)-th frame is de-noised, so that Y(k, l+N+1) becomes available for the next iteration. The VAD decision rule is formulated in terms of the Long-Term Spectral Divergence (LTSD) [1], calculated as the deviation of the LTSE with respect to the residual noise spectrum N(k) and defined in (3). If the LTSD is greater than an adaptive threshold γ, the current frame is classified as speech; otherwise it is marked as non-speech. A hangover delays the speech to non-speech transition in order to prevent low-energy word endings from being misclassified as silence. On the other hand, if the LTSD reaches a given threshold LTSD0, the hangover algorithm is turned off to improve non-speech detection accuracy in low-noise environments. The VAD is made adaptive to time-varying noise environments by updating the noise spectrum during non-speech periods according to (4), where NK is the average spectrum magnitude over a K-frame neighbourhood, given in (5).

3. Subband OSF Based VAD

This section describes an improved voice activity detection algorithm employing long-term signal processing and maximum spectral component tracking. It improves speech/non-speech discriminability and speech recognition performance in noisy environments. Two issues are addressed: the first is the performance of the VAD at low SNR, and the second is its performance against noisy backgrounds [1].

Figure 2. Block Diagram of Subband Order Statistics Filter based VAD

The subband-based VAD uses two order statistics filters (OSFs) for the Multi-Band Quantile (MBQ) SNR estimation [3]. The implementation of both OSFs is based on a sequence of 2N+1 log-energy values {E(m − N, k), ..., E(m, k), ..., E(m + N, k)} around the frame to be analyzed [14]. The block diagram of the subband-based VAD is shown in Figure 2.
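An order statistics filter over such a window can be sketched as follows. This is a minimal illustration, assuming the filter output is a quantile of the sorted 2N+1 log-energies with linear interpolation between neighboring order statistics; the function names, the `E[frame][band]` layout, and the clamping of frame indices at the signal edges are hypothetical choices, not details taken from the paper.

```python
import math


def subband_window(E, m, N, k):
    """Collect the 2N+1 log-energies E(m-N, k) .. E(m+N, k), clamping at the edges."""
    last = len(E) - 1
    return [E[min(max(j, 0), last)][k] for j in range(m - N, m + N + 1)]


def osf_quantile(window, p):
    """Quantile p (0..1) of the window, interpolating between order statistics."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    ordered = sorted(window)
    position = p * (len(ordered) - 1)
    s = math.floor(position)
    f = position - s
    if s + 1 >= len(ordered):
        return ordered[-1]
    return (1.0 - f) * ordered[s] + f * ordered[s + 1]


if __name__ == "__main__":
    # Log-energies for 5 frames and a single subband (k = 0).
    E = [[1.0], [2.0], [9.0], [2.5], [1.5]]
    window = subband_window(E, m=2, N=2, k=0)
    print(osf_quantile(window, 0.5))  # median, a typical noise-level estimate
    print(osf_quantile(window, 0.9))  # high quantile, tracks the speech level
```

A low or median quantile follows the background level even when a few frames contain speech, while a high quantile follows the speech level, which is why two filters with different quantiles are useful for an SNR estimate.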
The subband OSF-based VAD operates on the subband log-energies. Noise reduction is performed first, and the VAD decision is formulated on the de-noised signal. The noisy speech signal is decomposed into 25 ms frames with a 10 ms window shift. Let X(m, l) be the spectrum magnitude for the mth band at frame l. The design of the noise reduction block is based on Wiener filter theory, whereby the attenuation is a function of the signal-to-noise ratio (SNR) of the input signal. The VAD decision is formulated in terms of the de-noised signal, with the subband log-energies processed by means of order statistics filters [2].

[Figure 2 block labels: FFT, Noise Reduction, VAD, Spectrum Smoothing, WF Design, Frequency Domain Filter, Noise Update]

The noise reduction block consists of four stages.
i) Spectrum smoothing: The power spectrum is averaged over two consecutive frames and two adjacent spectral bands.
ii) Noise estimation: The noise spectrum Ne(m, l) is updated by means of a first-order IIR filter on the smoothed spectrum Xs(m, l), as given in (6), where λ = 0.99 and m = 0, 1, ..., NFFT/2.
iii) Wiener filter design: First, the clean signal S(m, l) is estimated by combining smoothing and spectral subtraction, as given in (7), where γ = 0.98. Then, the Wiener filter H(m, l) is designed as in (8), with the terms defined in (9), where ηmin is selected so that the filter frequency response yields a 20 dB maximum attenuation. S’(m, l), the spectrum of the cleaned speech signal, is assumed to be zero at the beginning of the process and is used for designing the Wiener filter through Equations (7) to (9). It is given by (10). The filter H(m, l) is smoothed in order to eliminate rapid changes between neighboring frequencies that may often cause musical noise. Thus, the variance of the residual noise is reduced and, consequently, the robustness when detecting non-speech is enhanced.
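The noise update and the Wiener gain described so far can be sketched per frame as follows. This is a minimal illustration under stated assumptions, since the equations themselves are not reproduced here: the noise update is taken to be Ne(m, l) = λ Ne(m, l-1) + (1-λ) Xs(m, l), the clean-signal estimate to be γ S’(m, l-1) + (1-γ) max(Xs - Ne, 0), and the gain to be η / (1 + η) with η = max(S / Ne, ηmin). The function names and the value of `ETA_MIN_20DB` (chosen so that the minimum gain is 0.1, read here as 20 dB of amplitude attenuation) are assumptions for illustration.

```python
ETA_MIN_20DB = 0.1 / (1.0 - 0.1)  # assumed floor: eta / (1 + eta) = 0.1


def update_noise(noise_prev, smoothed, lam=0.99):
    """First-order IIR update of the noise spectrum, band by band."""
    return [lam * n + (1.0 - lam) * x for n, x in zip(noise_prev, smoothed)]


def estimate_clean(clean_prev, smoothed, noise, gamma=0.98):
    """Smoothed spectral subtraction: recursive average of max(Xs - Ne, 0)."""
    return [gamma * s + (1.0 - gamma) * max(x - n, 0.0)
            for s, x, n in zip(clean_prev, smoothed, noise)]


def wiener_gain(clean, noise, eta_min=ETA_MIN_20DB, eps=1e-12):
    """Wiener gain eta / (1 + eta) with the a priori SNR floored at eta_min."""
    gains = []
    for s, n in zip(clean, noise):
        eta = max(s / max(n, eps), eta_min)
        gains.append(eta / (1.0 + eta))
    return gains


if __name__ == "__main__":
    noise = [1.0, 1.0, 1.0]
    smoothed = [1.0, 5.0, 20.0]   # smoothed spectrum Xs for one frame
    clean_prev = [0.0, 0.0, 0.0]  # S' is assumed zero at the start
    noise = update_noise(noise, smoothed)
    clean = estimate_clean(clean_prev, smoothed, noise)
    print(wiener_gain(clean, noise))  # bands with little signal sit near the floor
```

With the floor in place, bands that contain only noise are attenuated by a bounded amount rather than being driven to zero, which limits the rapid gain fluctuations that are heard as musical noise.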
The filter smoothing is performed by truncating the impulse response of the corresponding causal FIR filter to 17 taps using a Hanning window. Owing to this time-domain operation, the frequency response of the Wiener filter is smoothed and the performance of the VAD is improved.
iv) Frequency domain filtering: The smoothed filter is applied in the frequency domain to obtain the de-noised spectrum, as given in (11).

4. Hidden Markov Model

The basic theoretical strength of the HMM is that it combines the modeling of stationary stochastic processes (for the short-time spectra) and the temporal relationship among the processes (via a Markov chain) in a well-defined probability space. This combination allows us to study these two separate aspects of modeling a dynamic process (like speech) using one consistent framework. Another attractive feature of HMMs is that it is relatively easy and straightforward to train a model from a given set of labeled training data (one or more sequences of observations).

As mentioned above, the technique used to implement speech recognition is the Hidden Markov Model (HMM) [4][13]. The HMM is used to represent the utterance of a word and to calculate the probability that the model generated the observed sequence of vectors. There are some challenges in designing an HMM for the analysis or recognition of a speech signal. The HMM-based system works in two phases: phase I is Linear Predictive Coding, and phase II consists of the Vector Quantization, training, and recognition phases. The present Hidden Markov Model is represented by Equation (12):

$\lambda = (A, B, \pi)$ (12)

where π is the initial state distribution vector, A is the state transition probability matrix, and B is the continuous observation probability density function matrix.
Given appropriate values of A, B, and π as in Equation (12), the HMM can be used as a generator to give an observation sequence

$O = O_1 O_2 \cdots O_T$ (13)

(where each observation Ot is one of the symbols from the observation symbol set V, and T is the number of observations in the sequence) as follows:
1) Choose an initial state q1 = Si according to the initial state distribution π.
2) Set t = 1.
3) Choose Ot = vk according to the symbol probability distribution in state Si.
4) Transit to a new state qt+1 = Sj according to the state transition probability distribution for state Si.
5) Set t = t + 1; return to step 3 if t < T; otherwise terminate the procedure.

The above procedure can be used both as a generator of observations and as a model for how a given observation sequence was generated by an appropriate HMM. After the parameters are re-estimated, the model is denoted as in (14). The model is saved to represent that specific observation sequence, i.e., an isolated word.

4.1 Linear Predictive Coding Analysis

One way to obtain observation vectors O from speech samples is to perform a front-end spectral analysis. The type of spectral analysis that is often used is linear predictive coding [5]-[9].
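An LPC front end of this kind can be sketched as follows; the individual steps are described in the list that follows. This is a minimal illustration, not the Matlab code used in the paper: the function names are hypothetical, the pre-emphasis coefficient 0.95 and the Hamming window formula are conventional choices assumed here, and NA and MA denote the frame length and frame shift in samples.

```python
import math


def preemphasize(s, a=0.95):
    """First-order pre-emphasis: y(n) = s(n) - a * s(n-1); `a` is an assumed value."""
    return [s[0]] + [s[n] - a * s[n - 1] for n in range(1, len(s))]


def frames(s, NA, MA):
    """Blocks of NA samples, with consecutive blocks spaced MA samples apart."""
    return [s[i:i + NA] for i in range(0, len(s) - NA + 1, MA)]


def hamming(NA):
    """Hamming window of NA samples (NA >= 2)."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (NA - 1)) for n in range(NA)]


def autocorr(x, p):
    """Autocorrelation values r(0) .. r(p) of one windowed frame."""
    return [sum(x[n] * x[n + m] for n in range(len(x) - m)) for m in range(p + 1)]


def levinson_durbin(r, p):
    """Solve for predictor coefficients a(1..p) and the final prediction error."""
    if r[0] <= 0.0:
        raise ValueError("frame has no energy")
    a = [0.0] * (p + 1)
    error = r[0]
    for i in range(1, p + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / error
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        error *= (1.0 - k * k)
    return a[1:], error


def lpc_per_frame(signal, NA, MA, p):
    """Run pre-emphasis, framing, windowing, autocorrelation and the recursion."""
    window = hamming(NA)
    result = []
    for frame in frames(preemphasize(signal), NA, MA):
        windowed = [x * w for x, w in zip(frame, window)]
        result.append(levinson_durbin(autocorr(windowed, p), p)[0])
    return result


if __name__ == "__main__":
    tone = [math.sin(0.3 * n) for n in range(200)]
    print(lpc_per_frame(tone, NA=80, MA=40, p=4)[0])
```

The cepstral conversion, weighting, and delta computation would follow the recursion in a full front end; they are left out here to keep the sketch short.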
The processing steps of the LPC front end are as follows:
i) Preemphasis: The digitized speech signal is processed by a first-order digital network in order to spectrally flatten the signal, as given in (15).
ii) Blocking into frames: Sections of NA consecutive speech samples are used as a single frame. Consecutive frames are spaced MA samples apart. The frame separation is given by (16).
iii) Frame windowing: Each frame is multiplied by an NA-sample window (Hamming window) so as to minimize the adverse effects of chopping an NA-sample section out of the running speech signal. The windowing is expressed in (17).
iv) Autocorrelation analysis: Each windowed set of speech samples is autocorrelated to give a set of (p+1) coefficients, where p is the order of the desired LPC analysis. The autocorrelation is given by

$r_l(m) = \sum_{n=0}^{N_A-1-m} \tilde{x}_l(n)\,\tilde{x}_l(n+m), \quad m = 0, 1, \ldots, p$ (18)

where $\tilde{x}_l(n)$ is the windowed lth frame.
v) LPC/Cepstral analysis: A vector of LPC coefficients is computed from the autocorrelation vector using a Levinson or a Durbin recursion method. An LPC-derived cepstral vector is then computed up to the Qth component. The cepstral analysis is given by (19).
vi) Cepstral weighting: The Q-coefficient cepstral vector cl(m) at time frame l is weighted by a window Wc(m) [5][6], as given in (20) and (21), to obtain (22).
vii) Delta cepstrum: The time derivative of the sequence of weighted cepstral vectors is approximated by a first-order orthogonal polynomial over a finite-length window of frames centered around the current vector [7][8], as given in (23), where G is a gain term chosen to make the variances of ĉl(m) and Δĉl(m) equal; see also (24) and (25).

4.2 Vector Quantization, Training and Recognition Phases

To use an HMM with a discrete observation symbol density, a Vector Quantizer (VQ) is required to map each continuous observation vector into a discrete codebook index. The major issue in VQ is the design of an appropriate codebook for quantization.
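The mapping from a continuous observation vector to a codebook index can be sketched as a nearest-neighbor search. This is a minimal illustration that assumes a squared Euclidean distortion measure and an already designed codebook; the function names and the toy two-dimensional codebook are hypothetical.

```python
def nearest_codeword(vector, codebook):
    """Return the index of the codebook vector closest to `vector`."""
    best_index, best_distance = -1, float("inf")
    for index, codeword in enumerate(codebook):
        distance = sum((v - c) ** 2 for v, c in zip(vector, codeword))
        if distance < best_distance:
            best_index, best_distance = index, distance
    return best_index


def quantize(vectors, codebook):
    """Map a sequence of feature vectors to a sequence of discrete symbols."""
    return [nearest_codeword(v, codebook) for v in vectors]


if __name__ == "__main__":
    codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]  # toy codebook with M = 3
    features = [[0.1, 0.2], [3.0, 3.5], [0.9, 1.2]]
    print(quantize(features, codebook))  # index sequence passed on to the HMM
```

The resulting index sequence plays the role of the observation sequence O for a discrete HMM, one symbol per frame.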
The codebook design procedure partitions the training vectors into M disjoint sets. The distortion steadily decreases as M increases. Hence, codebook sizes from M = 32 to 256 vectors have been used in speech recognition experiments using HMMs [9][10].

During the training phase, the system trains the HMM for each digit in the vocabulary. The weighted cepstrum matrices for the various samples and digits are compared with the codebook, and the corresponding nearest codebook vector indices are sent to the Baum-Welch algorithm to train a model for the input index sequence. After training, there are three models for each digit, corresponding to the three samples in our vocabulary set. The A, B, and π matrices are then averaged over the samples to generalize the models [11].

In the recognition phase, the input speech sample is preprocessed to extract the feature vector. Then, the nearest codebook vector index for each frame is sent to the digit models. The system chooses the model that has the maximum probability of a match.

5. Results and Discussion

Several experiments were conducted to evaluate the VAD algorithm. The analysis focused mainly on error probabilities. The proposed VAD was evaluated in terms of its ability to discriminate speech from non-speech at different SNR values. The results are shown in Tables 1 to 10.

Table 1: Performance of VAD for digit ‘0’ for various noise sources
Table 2: Performance of VAD for digit ‘1’ for various noise sources
Table 3: Performance of VAD for digit ‘2’ for various noise sources
Table 4: Performance of VAD for digit ‘3’ for various noise sources
Table 5: Performance of VAD for digit ‘4’ for various noise sources
Table 6: Performance of VAD for digit ‘5’ for various noise sources
Table 7: Performance of VAD for digit ‘6’ for various noise sources
Table 8: Performance of VAD for digit ‘7’ for various noise sources
Table 9: Performance of VAD for digit ‘8’ for various noise sources
Table 10: Performance of VAD for digit ‘9’ for various noise sources
[Each table has rows for the noise sources Airport, Exhibition, Train, Restaurant, Street, Babble, Station, and Car, and columns for SNRs of 0 dB, 5 dB, 10 dB, and 15 dB.]

6. Conclusion

The experimental results in Tables 1 to 10 indicate that the VAD algorithm produces better results for certain noises. The recognition system using the VAD gives better robustness than the other algorithms. For digits ‘0’ and ‘9’, the VAD provides better results for airport and street noises. For digits ‘1’ and ‘8’, it gives better performance for exhibition and station noises. For digit ‘2’, better recognition occurs for airport, street, and car noises. For digit ‘3’, the recognition is good for street and car noises.
For digit ‘4’, the VAD gives good recognition for street and babble noises. For digit ‘5’, it works well for exhibition and car noises. For digit ‘6’, the recognition works well in airport and exhibition environments. For digit ‘7’, the performance of the VAD is better for exhibition and babble noises. Thus the VAD works well for utterances of different digits and extracts the speech signal under different noisy environment conditions. Further research is in the direction of a Genetic Algorithm for robust speech recognition in a noisy environment.

Acknowledgement

Firstly, the author would like to express his thanks to his supervisor, Dr. P. T. Vanathi, Professor, Department of Electronics and Communication Engineering, PSG College of Technology, Coimbatore, India. The author would also like to express his thanks to the Management and Principal of Bannari Amman Institute of Technology, Sathyamangalam, India. The author thanks everyone who supported the preparation of this paper.

References

[1] J. Ramirez, J. C. Segura, C. Benitez, A. de la Torre, A. Rubio, “Voice activity detection with noise reduction and long-term spectral divergence estimation”, IEEE International Conference on Acoustics, Speech, and Signal Processing, Volume 2, pp. 1093-6, 17-21 May 2004.
[2] Sundarrajan Rangachari, Philipos C. Loizou, “A noise-estimation algorithm for highly non-stationary environments”, Speech Communication 48 (2006), pp. 220-231, August 2005.
[3] Javier Ramírez, José C. Segura, Carmen Benítez, Ángel de la Torre, and Antonio Rubio, “An Effective Subband OSF-Based VAD With Noise Reduction for Robust Speech Recognition”, IEEE Transactions on Speech and Audio Processing, Vol. 13, pp. 1119-1129, November 2005.
[4] Lawrence R. Rabiner, “A tutorial on Hidden Markov Models and selected applications in speech recognition”, Proceedings of the IEEE, vol. 77, no. 2, February 1989.
[5] J. Makhoul, “Linear Prediction: A Tutorial Review”, Proceedings of the IEEE, April 1975.
[6] J. D. Markel and A. H. Gray Jr., “Linear Prediction of Speech”, New York, NY: Springer-Verlag, 1976.
[7] Y. Tohkura, “A weighted cepstral distance measure for speech recognition”, IEEE Trans. Acoust. Speech Signal Processing, vol. ASSP-35, no. 10, pp. 1414-1422, October 1987.
[8] B. H. Juang, L. R. Rabiner, and J. G. Wilpon, “On the Use of Bandpass Liftering in Speech Recognition”, IEEE Trans. Acoust. Speech Signal Processing, vol. ASSP-35, no. 7, pp. 947-954, July 1987.
[9] J. Makhoul, S. Roucos, and H. Gish, “Vector Quantization in Speech Coding”, Proc. IEEE, vol. 73, no. 11, pp. 1551-1558, November 1985.
[10] L. R. Rabiner, S. E. Levinson, and M. M. Sondhi, “On the Application of Vector Quantization and Hidden Markov Models to Speaker-Independent Isolated Word Recognition”, Bell Syst. Tech. J., vol. 62, no. 4, pp. 1075-1105, April 1983.
[11] M. T. Balamuragan and M. Balaji, “SOPC-Based Speech to Text Conversion”, Embedded processor design contest, outstanding design, pp. 83-108, 2006.
[12] Alan Davis, Sven Nordholm, Roberto Togneri, “Statistical Voice Activity Detection Using Low-Variance Spectrum Estimation and an Adaptive Threshold”, IEEE Transactions on Audio, Speech and Language Processing, Vol. 14, No. 2, March 2006.
[13] Kaisheng Yao, Kuldip K. Paliwal, and Te-Won Lee, “Generative factor analyzed HMM for automatic speech recognition”, Speech Communication, vol. 45, pp. 435-454, January 2005.
[14] Kentaro Ishizuka and Tomohiro Nakatani, “A feature extraction method using subband based periodicity and aperiodicity decomposition with noise robust frontend processing for automatic speech recognition”, Speech Communication, vol. 48, pp. 1447-1457, July 2006.

Abstract: One of the challenges in the development of a proxy-based content adaptation architecture is the implementation of a proxy cache that aims at reducing the number of transcoding operations and the server traffic. In the absence of adaptation-aware caching at content proxies, system performance suffers because the same image versions are accessed repeatedly and frequently on the remote content server. An efficient mechanism for caching the most relevant images and their variations can considerably reduce these overheads. The efficiency of a content proxy can be improved by caching the image versions that are frequently accessed by client devices. Consequently, a considerable amount of transcoding-related computational load on content servers is removed. This paper addresses these cache management issues by proposing an adaptation-aware architecture for content adaptation proxies that bases its cache removal policy on transcoding cost. Compared with conventional cache management schemes, the proposed architecture improves efficiency by producing less traffic between the proxy and the remote server and by lowering the number of transcoding operations.

Keywords: Content adaptation architectures, cache management, multimedia.

1. Introduction

Adaptation of multimedia content can be done at three possible locations: at the client device, at the content server, or at a specialized multimedia proxy between the two. Client-side adaptation requires the adaptation operations to be performed on the client devices receiving these objects. Though this is not unknown, in the current scenario of widespread use of handhelds, the computational power required for these operations is hard to find. Even if it is present, the adaptation takes longer, so efficiency is lower than what could be obtained with the other schemes.
In client-side architectures, cache precedence could increase efficiency considerably, but this is a seldom used practice as memory is again a scarce resource. In case of server-side adaptation techniques, conventional web servers are imbued with content adaptation processes. Caching of the adapted content here involves storing multiple variations of the same content in order to match with client requirement specification. This reduces the number of transcoding operations required, minimizing the need for computational power but increasing pressure on storage requirements. The trade off is between available storage and computational power. Also, it is to be noted in this case that the entire client load is on the server. This in tandem with a large number of transformations could possibly bring down response time below accepted standards and at the worse, result in system failure. Proxy-based approach introduces a new layer between the client and the server, the content proxy, effectively forming a three tier architecture. The content proxy receives the requests from the client and responds to it from the data obtained from the server. The need for a cache is implicit here. The proxy can store the variations of objects to respond to the client requests. It is easy to see that by associating a server to more than one client (a one to many mapping) the load on the server is reduced. Also there is effective separation of computation and data storage sectors by exporting computational requirements to the proxy. Also the effective computational load is reduced as it is distributed, hence lowering individual system costs as well as increasing system reliability while paving way for graceful system degradation via load balancing. It can be seen that the advantages of a content proxy are widespread and myriad. This approach is indeed widely used and recommended. Proxies can exhibit passive caching as well as active caching. 
In passive caching, if the object is not cache it is retrieved from the server and older objects are removed in the event of cache overflow. Active caching aims at improving retrieval performance by increasing the likelihood that a requested object will be found in the cache. It can also work as a superset of passive caching based on the load on the proxy. In content adaptation proxies, the choice of using active caching should be based on an analysis of frequently adapted object versions. The architecture proposed in this paper aims at implementing active caching and thereby reduce repeated adaptation operations and traffic between content server and content proxies. 2. Background and Related Work Over the past decade, a number of proxy-based architectures An Efficient Adaptation Aware Caching Architecture for Multimedia Content Proxies B.L. Velammal1 and P. Anandha Kumar2 1 Lecturer, Department of CSE, Anna University, Chennai – 25, India velammalbl@cs.annauniv.edu 2 Assistant Professor, Department of IT, MIT Campus, Anna University, Chennai – 44, India anandh@annauniv.edu (IJCNS) International Journal of Computer and Network Security, Vol. 1, No. 2, November 2009 90 for content adaptation have been proposed. A proxy-based framework for dynamic adaptation in distributed multimedia applications [1] describes the functions of a proxy that can be configured to perform adaptations of multimedia. Another adaptation architecture called MARCH [2] enables the creation of a service-based overlay network. It provides for a dynamic proxy execution environment and mobile aware servers to decide which set of tasks have to be assigned to each proxy. Related research has also been carried out in Content Distribution Networks (CDNs) [3] that use presents topology aware overlays for efficient distributed multimedia application support. 
Self-adaptive CDNs [4] can adapt their behavior in order to cope with serving heterogeneous multimedia content under unpredictable network conditions. In addition, a distributed content adaptation framework (DCAF) [5] has been proposed for pervasive computing. A similar proposal on Automatic Adaptation of Streaming Multimedia Content [6] makes use of a server-side adaptation engine. The engine reacts to context changes and also facilitates multimedia adaptation in a distributed fashion along the delivery path.

The cache forms the backbone of the content proxy, so an architecture that makes its efficient and effective utilization possible is required to obtain the maximum benefit. Improved cache management ensures that the images accessed frequently by clients are stored in the cache. This decreases the number of adaptation operations involved in transcoding, thereby reducing the average response time. It also increases the cache hit ratio (i.e., the probability that a requested object is found in the cache). In addition, the number of times the server is asked for an image is reduced, which cuts costly IO time and eases the load on the network.

An important component in the design of any caching mechanism is the cache replacement policy. One of the existing cache replacement policies for transcoding proxies uses the Aggregate Effect [7] to determine the image to be removed from the cache. The transcoding graph is constructed from parameters such as the size of the object version, the reference rate of each version, and the delay in fetching the image version. However, it does not take into account the cache size restrictions or the optimal number of versions that should be kept in the cache, and only determines relationships among the existing image versions. Another caching scheme called PTC [8] describes the working of proxies that transcode and cache for the requirements of heterogeneous devices. A similar graph-based data structure is used that combines the earlier proposed Aggregate Effect with network parameters and learning capabilities. This scheme again does not take the cache restrictions and the indexing scheme into account.

In addition to the replacement policy, another important factor in the design of caching is the indexing structure to be used. More recent research in cache indexing has focused on making the tree structure more cache conscious, i.e. performing faster lookup with the minimum required memory space. A number of such tree-based indexing schemes, such as CSB-tree [9], CST-tree [10] and CSR-tree [11], have been formulated for cache management. However, their node structures and tree construction algorithms do not take the cache replacement policy into account. An efficient cache indexing structure for content proxies has to take into account the transcoding costs between image variations as part of determining the relevancy of any image object resident in the cache. Our work focuses on combining the cache replacement policy of the content proxy with an adaptation-aware cache indexing scheme in order to arrive at a more holistic method of cache management for content adaptation proxies.

2.1. Content Adaptation Architecture
The architecture we refer to is the one proposed by Jean-Marc Pierson [5] (see Fig. 1). It takes into consideration the client profile, network conditions, content profile (meta-data) and available adaptation services (third-party software elements) to arrive at an optimal dynamic adaptation decision.

Figure 1. Content Adaptation Architecture

2.2. Local Proxies: They intercept user requests and server responses and initiate the transfer of adapted content.

2.3. Content Proxies: They accept user requests forwarded by the local proxies and retrieve the images from either the local cache or remote content servers.
Adaptation is performed in the adaptation engine if the appropriate image version is not present, and the result is then transferred back to the local proxies to be delivered to the client.

2.4. ASPs: Adaptation Service Proxies are the web services that can be deployed on the content proxies to execute the required adaptation operations.

2.5. Profiles: The architecture provides for:
i. Device profiles for storing device capability and compatibility information.
ii. Client profiles for storing user preferences and settings.
iii. Adaptation Service Registry for storing service profiles and location.

As explained before, the presence of a cache at any point in this architecture increases the productivity of the system. Though caches make sense at both local and content proxies, the maximum performance gain is attained when the cache is placed at the independently deployed content proxies, as this provides all the aforementioned gains. A second driving force behind this decision is that local proxies are frequently absent in the current scenario of mobile computing. In the next section, we introduce our content proxy architecture, which incorporates an adaptation-aware cache management system.

3. Content Proxy Architecture
A content proxy that is supplied with a cache can be divided into distinct modules according to the functions they serve. The functions of the content proxy include: deserialisation of the incoming client request, management of the cache and its indexing, adaptation of images to fulfill the requests, and IO with the server for retrieving required images. In line with these functional requirements, our architecture is divided into four distinct modules, laid out in the architecture diagram (see Figure 2).

Figure 2. Content Proxy Architecture

3.1. Query Processor: The communication interface connects the client and the content proxy, and is depicted outside the proxy's boundaries to indicate that it is not strictly a proxy component. It could instead be merged with the query processor as a single module, and this combination is referred to here as the Query Processor. The function of the query processor is the deserialisation of incoming client requests in order to extract the user parameters for the requested objects. For example, in an image server this might include the picture's identity, resolution, format and other such characteristics. Additionally, this can include device-specific data such as supported data types and processing constraints. Optionally, these can be obtained from the device profile data structure.

3.2. Adaptation Engine: This module is in charge of carrying out all the required adaptation operations. This happens in case of a cache miss or a partial cache hit. In both cases, the proxy has to carry out adaptation operations to satisfy client requests. Existing adaptation functions can be reused in our architecture; the general framework presented here is independent of how adaptation is implemented.

3.3. Cache Management module: This includes both the indexing mechanism for the cache and the implementation of the cache removal policy. The indexing mechanism is required for fast and orderly access to the cache. The replacement policy is decisive in the cache architecture and determines cache utilization and efficiency.

3.4. Working of the Content Proxy
The Query Processor (QP) and Adaptation Engine (AE) perform the decision making regarding the adaptation operations that need to be performed on appropriate image objects.
On the other hand, the Cache Management module (CM) maintains the cache index and manages replacement operations based on the input parameters obtained from the previous two modules. The overall working of the proxy can be described as follows:
i. The Communication Interface gets the client request containing identifiers for the client, the image and the device type.
ii. QP deserializes the query and forwards the parameters to the Adaptation Engine.
iii. AE fetches the device profile based on the device type specified in the query and decides on the adaptation operations.
iv. AE determines the expected optimal adapted image parameters and sends them to QP.
v. QP forwards them to CM.
vi. CM checks whether the image version with the required parameters is present in the cache.
a. If the image is present, it is fetched and sent to QP and then to the client via the Communication Interface.
b. If it is not present, CM tries to fetch a “similar transcodable” image from the cache, which is then adapted to the appropriate format in AE and sent to the client.
c. Otherwise, CM sends a cache miss message to QP. QP then instructs AE to fetch the image from the remote server for adaptation.
vii. If a cache miss occurred for the request, the image retrieved from the server is added to the cache.
viii. Cache overflow is avoided by removing the least preferable image according to the replacement policy.

4. Adaptation Aware Caching Scheme
Cache efficiency is characterized by:
· Amount of “less important” cached images
· Size of allotted memory
· Time required for the transmission of data from the remote server to the content proxy

As the indexing scheme proposed in this paper focuses on being adaptation-aware, the study of the third factor is beyond the scope of this paper. For content adaptation proxies, similar images should be indexed close together. Moreover, the index should carry caching policy information in order to achieve faster image replacement in case of a cache miss.
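The three outcomes of step vi in Section 3.4 (exact hit, “similar transcodable” version, miss) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the cache key `(image_id, resolution)`, the function name `lookup`, and the `max_ratio` acceptance rule are hypothetical and are not taken from the paper, which leaves the threshold as a deployment-specific parameter.

```python
from typing import Dict, Optional, Tuple

# Hypothetical cache key: (image id, resolution of the cached version).
Key = Tuple[str, int]


def lookup(cache: Dict[Key, str], image_id: str, resolution: int,
           max_ratio: float = 4.0) -> Tuple[str, Optional[Key]]:
    """Classify a request as 'hit', 'partial_hit' or 'miss'.

    Assumption: a cached version is "similar transcodable" when it is the
    same image at a higher resolution, at most max_ratio times the requested
    resolution. The paper does not fix this rule.
    """
    exact = (image_id, resolution)
    if exact in cache:
        return "hit", exact

    # Higher-resolution versions of the same image within the threshold.
    candidates = [
        key for key in cache
        if key[0] == image_id and resolution < key[1] <= resolution * max_ratio
    ]
    if candidates:
        # Prefer the nearest version: least adaptation work.
        return "partial_hit", min(candidates, key=lambda key: key[1])

    return "miss", None


if __name__ == "__main__":
    cache = {("a", 800): "a@800", ("a", 1600): "a@1600", ("b", 400): "b@400"}
    print(lookup(cache, "a", 800))   # exact version is cached
    print(lookup(cache, "a", 600))   # can be adapted from a cached version
    print(lookup(cache, "a", 100))   # nothing within the threshold
```

On a partial hit the returned key identifies the version that would be handed to the Adaptation Engine; on a miss the caller would fetch from the remote server and then apply the replacement policy.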
By convention, the removal policy in the index should also favor more frequently adapted objects, which can lead to a reduction in computational overhead. Consequently, the response time can be improved.

A cache management algorithm can be likened to the classic Knapsack problem [12]. Consider D as the dataset (more specifically, the image set in our case):

\[ D = \{D_1, D_2, \dots, D_N\}, \qquad x_i \in \{0, 1\} \]

where $x_i = 1$ if $D_i$ is cached and $x_i = 0$ if $D_i$ is not cached. $w_i$ and $s_i$ are the “relevancy value” and the size of $D_i$ respectively. The aim of the caching mechanism is to:

a. Maximize the cache volume
\[ V = \sum_{i=1}^{N} x_i w_i \]
b. Ensure
\[ \sum_{i=1}^{N} x_i s_i \le S, \]
where S is the total cache size.

It is clear from the above expressions that the “relevancy value” w plays a pivotal role in the overall design and implementation of the cache. Keeping in mind the application of caching on content adaptation proxies, we define the relevancy value of the j-th version of image $D_i$ as

\[ w_{i,j} = F(s_{i,j}, r_{i,j}, d_{i,j}, f_{i,j}, t_{i,j}, n_i) \]

where
$s_{i,j}$ is the size of the image version,
$r_{i,j}$ is the resolution of the image version,
$d_{i,j}$ is the distance of the image version from the “nearest similar” image,
$f_{i,j}$ is the frequency of access of the image version,
$t_{i,j}$ is the last access time of the image version,
$n_i$ is the number of cached image versions of $D_i$.

From the above notation, a crucial parameter in determining the relevancy value $w_{i,j}$ in the content proxy cache is the distance of the image version from the “nearest similar” image. We define a nearest similar image as another version of the same image that has a higher resolution and from which the target image can be obtained through local adaptation. Arriving at a suitable value for this parameter can reduce the number of repeated requests to the remote server for the same image version.
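To make the knapsack analogy concrete, the following Python sketch chooses which image versions to keep so that the total relevancy is maximal within the cache size S. It assumes the relevancy values w are already computed (the paper gives no closed form for F) and that sizes are small integers; the function name `select_cached` and the sample numbers are hypothetical. It is textbook 0/1 knapsack dynamic programming, shown only to illustrate the formulation, and is not the authors' replacement policy.

```python
from typing import List, Tuple


def select_cached(weights: List[float], sizes: List[int],
                  capacity: int) -> Tuple[float, List[int]]:
    """Solve max sum(x_i * w_i) subject to sum(x_i * s_i) <= capacity.

    Returns the best total relevancy V and the indicator vector x,
    where x[i] == 1 means image i stays in the cache.
    """
    n = len(weights)
    # best[c]: highest total relevancy achievable with capacity c so far.
    best = [0.0] * (capacity + 1)
    # took[i][c]: True if item i was chosen when filling capacity c.
    took = [[False] * (capacity + 1) for _ in range(n)]

    for i in range(n):
        # Iterate capacities downwards so each item is used at most once.
        for c in range(capacity, sizes[i] - 1, -1):
            candidate = best[c - sizes[i]] + weights[i]
            if candidate > best[c]:
                best[c] = candidate
                took[i][c] = True

    # Walk back through the decisions to recover x.
    x = [0] * n
    c = capacity
    for i in range(n - 1, -1, -1):
        if took[i][c]:
            x[i] = 1
            c -= sizes[i]
    return best[capacity], x


if __name__ == "__main__":
    # Hypothetical relevancy values and sizes for three image versions.
    total, x = select_cached([6.0, 10.0, 12.0], [1, 2, 3], capacity=5)
    print(total, x)
```

An exact solver like this is only practical for small caches with coarse size units; a real proxy would more likely use an incremental, eviction-based heuristic driven by the same relevancy values.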
Some of the other terms used in the paper to describe the caching mechanism are defined below:

Cache Hit: When an exact match for the client's request is found in the cache, it is termed a cache hit. This depends on the availability of frequently accessed images in the cache, a direct consequence of the replacement policy.

Cache Miss: The requested image is not found in the cache; hence it must be fetched from the server. This involves IO time and hence an increased response time.

Partial cache hit: The requested image is not found, but an image that could be an acceptable replacement, or one from which the object could be obtained by adaptation, is present. Here the server need not be accessed, and in the absence of IO time the access time is considerably lower. Hence the presence of partial cache hits increases performance. This is unique to our architecture and the adaptation scenario.

Threshold: This defines when a partial cache hit occurs, i.e. the limits of acceptability of cached images and when cached images can be used for adaptation. We do not fix it, because this threshold could depend on the processing capability of the deployment system, the promised QoS and other factors that vary between individual instances. Hence we leave it as a variable parameter.

5. Cache Access Scenarios

5.1. Cache Hit
Figure 3. Cache Hit
· The request from the client arrives at the Query processor
· The Query processor deserializes the request and passes the parameters to the indexing component of cache management
· The indexing component of the cache management system searches for the pertaining image
· The image is found; the result is transferred to the query processor and then to the communication interface, from where it reaches the client

5.2. Partial Cache Hit
Figure 4. Partial Cache Hit
· The request from the client arrives at the Query processor
· The Query processor deserializes the request and passes the parameters to the indexing component of cache management
· The indexing component of the cache management system searches for the pertaining image
· The image is not found, but another version from which it could be transcoded is available in the cache. The existence of such an image is decided using a threshold value.
· The candidate image is sent to the adaptation engine along with the deserialised parameters
· The image is adapted and transferred to the communication module to be sent to the client

5.3. Cache Miss
Figure 5. Cache Miss
· The request from the client arrives at the Query processor
· The Query processor deserializes the request and passes the parameters to the indexing component of cache management
· The indexing component of the cache management system searches for the pertaining image
· No image that falls within the threshold limits is found in the cache
· The server is requested for the corresponding image
a) If the server responds with an exact match, it is transferred to the client via the communication module.
b) If the returned image is not an exact match, the received image is forwarded to the adaptation engine for transformation according to the client request and then sent to the communication interface.
· In the latter case (b) the cache must again be updated with the new adapted image. This is decided using further parameters based on whether an image or its variation is more frequently accessed.

6. Conclusion & Future Work
The proposed architecture for content proxies in adaptation networks incorporates adaptation-aware cache management features.
These features improve the performance of content proxies by reducing the number of requests to the remote content server and by reducing the computation spent on adaptation operations, since the most frequently adapted image versions are cached. Our future work focuses on the development of an efficient indexing data structure that additionally incorporates the removal policy. Intertwining the two makes the cache efficient, as there is less overhead. The data structure is also made adaptation aware, in contrast to existing indexing and replacement policies, making it an apt fit for content proxies.

References
[1] O. Layaida, D. Hagimont, "Dynamic Adaptation in Distributed Multimedia Applications", INRIA, Technical Report, August 2002.
[2] S. Ardon, P. Gunningberg, B. Landfelt, Y. Ismailov, M. Portmann, A. Seneviratne, "MARCH: A distributed content adaptation architecture", International Journal of Communication Systems, 16, 2003.
[3] Khalil El-Khatib, Gregor v. Bochmann, and Abdulmotaleb El Saddik, "A Distributed Content Adaptation Framework for Content Distribution Networks", School of Information Technology & Engineering, University of Ottawa.
[4] G. Jawaheer, J. McCann, "Building a self-adaptive content distribution network", Proceedings of the 15th International Workshop on Database and Expert Systems Applications, 30 Aug.-3 Sept. 2004.
[5] G. Berhe, L. Brunie, JM. Pierson, "Distributed Content Adaptation for Pervasive Systems", in Proceedings of the IEEE International Conference on Information Technology, ITCC 2005, April 4-6, 2005, Las Vegas, Nevada, USA, Vol. 2, pp. 234-241.
[6] A. Hutter, P. Amon, G. Panis, E. Delfosse, M. Ransburg, H. Hellwagner, "Automatic adaptation of streaming multimedia content in a dynamic and distributed environment", IEEE International Conference on Image Processing, ICIP 2005, Volume 3, 11-14 Sept. 2005.
[7] Cheng-Yue Chang, Ming-Syan Chen, "Exploring aggregate effect with weighted transcoding graphs for efficient cache replacement in transcoding proxies", Proceedings, 18th International Conference, 2002.
[8] A. Singh, A. Trivedi, K. Ramamritham, P. Shenoy, "PTC: proxies that transcode and cache in heterogeneous web client environments", Proceedings of the Third International Conference on Web Information Systems Engineering, 2002.
[9] J. Rao and K.A. Ross, "Making B+-tree cache conscious in main memory", in SIGMOD, pages 475-486, 2000.
[10] Ig-hoon Lee, Junho Shim, Sang-goo Lee and Jonghoon Chun, "CST-Trees: Cache Sensitive T-Trees", Advances in Databases: Concepts, Systems and Applications, 2007.
[11] Dong, 1 Yu, "CSR+-tree: Cache-conscious Indexing for High-dimensional Similarity Search", 19th International Conference on Scientific and Statistical Database Management (SSDBM), pp. 14-27, 2007.
[12] Asanobu Kitamoto, "Multiresolution Cache Management for Distributed Satellite Image Database Using NACSIS-Thai International Link", Proceedings of the 6th International Workshop on Academic Information Networks and Systems (WAINS), pp. 243-250, 2000.

Rough Set and BP Neural Network Optimized by GA Based Anomaly Detection
REN Xun-yi1, WANG Ru-chuan2, and ZHOU He-Jun3
1 Nanjing University of Posts and Telecommunications, College of Computer, xinmofan Road, Nanjing 66, China, renxy@njupt.edu.cn
2 Nanjing University of Posts and Telecommunications, College of Computer, xinmofan Road, Nanjing 66, China, wangrc@njupt.edu.cn
3 Nanjing University of Posts and Telecommunications, College of Computer, xinmofan Road, Nanjing 66, China, zhouhj@njupt.edu.cn

Abstract: To improve the speed and accuracy of detection, this paper proposes an anomaly detection method based on rough sets and a BP neural network optimized by GA.
Rough sets are used to reduce the 41 features of the Kddcup'99 data set to 14 features, and the global search capability of GA is used to optimize the BP neural network weights. Experiments based on Kddcup'99 intrusion data demonstrate that combining rough-set data reduction with a GA-optimized BP network not only achieves a higher detection accuracy but also greatly improves the generalization ability of the network.

Keywords: Anomaly detection; Back propagation neural network; Genetic Algorithm; Rough set.

1. Introduction
An intrusion detection system (IDS) inspects all inbound and outbound network activity and identifies suspicious patterns that may indicate a network or system attack from someone attempting to break into or compromise a system [1]. According to the detection technology used, IDS is divided into two categories: misuse detection and anomaly detection. Misuse detection defines the attack features of packets and checks whether the features of a specific attack appear in the collected data. Anomaly detection defines models of the normal state of the network or host, compares the current state of the network with these models, and looks for anomalies. The core of anomaly detection is how to define the so-called "normal" model. Misuse detection is based on known defects or intrusion features and can accurately detect attacks with known characteristics, but it is unable to detect unknown attacks. Anomaly detection can detect unknown attacks and is currently an active research topic.

Neural networks are capable of learning from historical data (also called training) to obtain a decision-making model for anomaly detection. K. Fox first applied neural networks to intrusion detection with the SOM algorithm [2]. David Endler adopted a neural network trained on audit data and obtained a decision-making model for host-based intrusion detection [3]. For specific procedures, Anup K. Ghosh used labeled input data to train on and study normal and abnormal behavior [4]. C. Jirapummin established a mixed detection model using a neural network [5]. Much of the literature now studies anomaly detection based on the BP neural network [6], because BP uses gradient descent to minimize the error and has many advantages such as parallel processing, nonlinear mapping, adaptability and non-parametric pattern recognition. However, the BP neural network still has practical problems, including the difficulty of determining the number of neurons, the low efficiency of adjusting the network weights, local minima, and so on. Applying a BP neural network directly to anomaly detection inherits these shortcomings.

To improve the speed and accuracy of detection, this paper proposes anomaly detection based on rough sets and a BP neural network optimized by GA. We first use rough sets to reduce the 41 features of the Kddcup'99 data set to 14, which simplifies the BP network structure and improves the convergence rate of the BP neural network when establishing the anomaly detection model. Secondly, we use the global search capability of GA to optimize the BP neural network weights, which reduces the time needed to find suitable weights and improves the generalization ability of the model so that anomalies are detected more accurately. The results of experiments based on Kddcup'99 intrusion data show that the proposed method can improve the performance of anomaly detection with a BP neural network.

2. Kddcup'99 data reduction based on Rough Set

2.1 Rough Set
Rough set theory [7], proposed by Z. Pawlak in 1982, is a data mining method for studying data integrity and knowledge uncertainty. The basic idea of rough sets is to discover decision rules based on the dependency between the sample attributes and the decision attributes of an information system. The importance of each attribute is determined by its impact on the decision rules; unimportant attributes are eliminated, so that the classification capability is maintained while the number of data features is reduced.
For an information system $I = \langle S, A, V, F \rangle$, S is a non-empty sample set, A is the attribute set, V is the attribute value domain, and F is a mapping that gives each sample of S a specific value in V for each attribute of A. The aim of reduction is to find $B \subseteq A$ such that the classification of the samples induced by B is exactly the same as the classification induced by A.

The training samples carry classification labels. For example, the samples of the Kddcup'99 intrusion data set have 42 dimensions, and the 42nd dimension is "normal" or "abnormal"; this label is called the decision attribute. The decision table is therefore defined as $DT = \langle S, A \cup D, V, F \rangle$, in which A is the set of condition attributes and D is the set of decision attributes.

For rough set reduction, we introduce the B-indiscernibility relation

\[ IND_I(B) = \{ (s, s') \in S^2 \mid \forall a \in B,\ F_a(s) = F_a(s') \}, \]

which means that two samples s and s' cannot be discerned using the attributes in B. As a result, according to the different attribute values, the sample set is divided into condition equivalence classes $[s]_B$ and decision equivalence classes $[s]_D$.

We then build a resolution (discernibility) matrix M, whose elements are

\[ M_{ij} = \begin{cases} \{ a \in A \mid F_a(s_i) \neq F_a(s_j) \}, & [s_i]_d \neq [s_j]_d \\ 0, & [s_i]_d = [s_j]_d \end{cases} \]

so that each element is composed of the attributes on which samples from different decision equivalence classes $[s_i]_d$ and $[s_j]_d$ differ. The entries $M_{ij}$ that contain only one element constitute a set called $Core(A)$. If an attribute set $B \subset A$ meets the condition

\[ B \cap Core(A) \neq \varnothing \]

we say that B is a reduction of A. In other words, this reduction is the smallest subset of the attributes that can distinguish all objects that can be distinguished with A.

2.2 Reduction of intrusion data
Kddcup'99 [8] is experimental data for data mining that simulates five major categories, including 23 kinds of attack, in a real network environment. The 10% data subset has 494,021 records, each of which has 41 features, containing continuous, discrete and text variables.
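The discernibility-matrix construction of Section 2.1 can be illustrated with a short Python sketch. It computes, for every pair of samples with different decision values, the set of condition attributes on which they differ, and collects the single-attribute entries as the core. The function name `core_attributes` and the toy decision table are hypothetical; this is a brute-force illustration of the definitions, not the algorithm used by the Rosetta tool.

```python
from itertools import combinations
from typing import Hashable, Sequence, Set


def core_attributes(samples: Sequence[Sequence[Hashable]],
                    decisions: Sequence[Hashable]) -> Set[int]:
    """Return the indices of the core condition attributes.

    samples[k] holds the condition attribute values of sample k and
    decisions[k] its decision attribute value.
    """
    core: Set[int] = set()
    pairs = combinations(zip(samples, decisions), 2)
    for (s_i, d_i), (s_j, d_j) in pairs:
        if d_i == d_j:
            # Same decision class: the matrix entry is empty.
            continue
        # Matrix entry M_ij: attributes whose values differ.
        m_ij = {a for a, (u, v) in enumerate(zip(s_i, s_j)) if u != v}
        if len(m_ij) == 1:
            # A singleton entry means this attribute is indispensable.
            core |= m_ij
    return core


if __name__ == "__main__":
    # Hypothetical toy table: three condition attributes, labels +1 / -1.
    samples = [(0, 0, 1), (0, 1, 1), (1, 1, 0)]
    decisions = [1, -1, -1]
    print(core_attributes(samples, decisions))
```

The pairwise loop is quadratic in the number of samples, so for the 30,000 records used later it would need sampling or a dedicated tool.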
An additional label on each record shows whether it is normal or abnormal. The data set is a typical heterogeneous data set with multiple protocols and multiple attacks. We selected 30,000 records from the data: 12,802 normal records and 17,198 abnormal records (DoS, 16,560 records; Probe, 442 records; R2L, 188 records; U2R, 8 records). Each data record contains 41 features from the TCP/IP connections, of which three features (protocol_type, service, flag) are text variables. All text values were first converted to numeric variables and, to put the features on an equal footing, all feature variables were normalized. Let n be the total number of records; for the value $x_{pi}$ of feature i in record p, the following formula normalizes to [-1, +1]:

\[ \tilde{x}_{pi} = 2 \cdot \frac{x_{pi} - \min(x_i)}{\max(x_i) - \min(x_i)} - 1, \qquad (i = 1, \dots, 41;\ p = 1, \dots, n) \]

where $\min(x_i)$ is the minimum of feature i and $\max(x_i)$ is the maximum of feature i. After normalizing the 30,000 records, a 42nd attribute is added, with normal records marked "+1" and attacks marked "-1"; this 42nd attribute is the decision attribute of the rough set. Using the Rosetta tool [9] to reduce the experimental data, the 41 features were reduced as shown in Table 1.

Table 1: TCP Data Reduction
NO   Reduction Feature
1    3,4,6,24,23,24,27,28,31,32,33,36,38
2    3,4,6,18,23,24,27,28,31,32,33,36,38
3    3,4,6,14,23,24,27,28,31,32,33,36,39
4    3,4,6,23,24,27,28,31,32,33,35,36,39
5    3,4,6,23,24,27,28,31,32,33,35,36,38
6    3,4,6,18,23,24,27,28,31,32,33,36,39
7    3,4,6,10,23,24,27,28,31,33,35,36,37,39
8    3,4,6,23,24,27,28,31,32,34,35,36,37,39
9    3,4,6,10,23,24,27,28,31,33,35,36,37,38
10   3,4,6,10,12,18,23,24,27,28,31,32,34,35,36,37,38
11   3,4,6,10,12,14,23,24,27,28,31,32,34,35,36,37,38

Compared with the reduction result in paper [10], our rough set reduction has fewer features and is easier to handle. The experimental results finally show that our reduction maintains a high accuracy while being faster. The number of features is reduced from 41 to 13, 14 or 17.
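The min-max normalization to [-1, +1] described in Section 2.2 can be sketched in plain Python as follows. The function names are hypothetical, and the handling of a constant feature (where the maximum equals the minimum) is an assumption added here, since the formula is undefined in that case and the paper does not discuss it.

```python
from typing import List


def normalize_feature(values: List[float]) -> List[float]:
    """Scale one feature column to [-1, +1] using 2*(x-min)/(max-min) - 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant feature carries no information, map it to 0.
        return [0.0 for _ in values]
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


def normalize_records(records: List[List[float]]) -> List[List[float]]:
    """Normalize every feature (column) of a list of numeric records."""
    columns = [list(col) for col in zip(*records)]
    scaled = [normalize_feature(col) for col in columns]
    # Transpose back so that each row is one record again.
    return [list(row) for row in zip(*scaled)]


if __name__ == "__main__":
    print(normalize_feature([0.0, 5.0, 10.0]))
    print(normalize_records([[0.0, 3.0], [10.0, 3.0]]))
```

Text features such as protocol_type would have to be mapped to numbers before this step, as the section notes; that mapping is not shown here.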
For the BP neural network, a network structure built on 41 features needs a weight hypothesis space of 41 * m * 1 (m is the number of hidden layer nodes), whereas now only 14 * m' * 1 is needed (m' is the number of hidden layer nodes). The number of hidden layer nodes depends on the number of input and output nodes: the more input and output nodes there are, the more hidden layer nodes are needed, so m '
