Vol. 35 No. 2

Methodology to Weight Evaluation Areas from Autism Spectrum Disorder ADOS-G Test with Artificial Neural Networks and Taguchi Method

M. Reyes ^*
P. Ponce ^*
D. Grammatikou ^*
A. Molina ^*
^* Tec de Monterrey, CCM.

ABSTRACT
Autism diagnosis requires validated diagnostic tools employed by mental health professionals with expertise in autism spectrum disorders. This conventionally requires lengthy information processing and technical understanding of each of the areas evaluated in the tools. Classifying the impact of these areas and proposing a system that can aid experts in the diagnosis is a complex task. This paper presents the methodology used to find the most significant items from the ADOS-G tool to detect Autism Spectrum Disorders through Feed-forward Artificial Neural Networks with back-propagation training. The number of cases for the network training data was determined by using the Taguchi method with Orthogonal Arrays reducing the sample size from 531,441 to only 27. The trained network provides an accuracy of 100% with 11 different cases used only for validation, which provides a specificity and sensitivity of 1. The network was used to classify the 12 items from the ADOS-G tool algorithm into three levels of impact for Autism diagnosis: High, Medium and Low. It was found that the items “Showing”, “Shared enjoyment in Interaction” and “Frequency of vocalization directed to others”, are the areas of highest impact for Autism diagnosis. The methodology here presented can be replicated to different Autism diagnosis tests to classify their impact areas as well.

Keywords: Autism Spectrum Disorder (ASD), diagnosis, screening, ADOS-G, Artificial Neural Networks, Feed-forward networks, Taguchi Method, Orthogonal Arrays, classify.

Correspondence:
Mayra Reyes
Calle del Puente # 222, Col. Ejidos de Huipulco, Tlalpan, C.P. 14380
E-mail: mayra.reyes@itesm.mx
Received: April 30, 2014
Accepted: August 29, 2014

SUMMARY
Diagnosing autism requires internationally validated diagnostic tools that are used by health professionals with expertise in autism spectrum disorders, which involves processing a large amount of information and a technical understanding of each of the areas they evaluate. Classifying the impact of each of these areas, and proposing a system that can help experts with the diagnosis, is a complex task. This article therefore presents a methodology used to find the most significant items of the ADOS-G autism diagnostic tool through artificial neural networks trained with error back-propagation. The number of cases for training the network was selected using the Taguchi method with orthogonal arrays, reducing the sample size from 531,441 to only 27 cases. The trained network has an accuracy of 100%, validated with 11 different cases of children evaluated for an autism spectrum disorder diagnosis, which gave a specificity and sensitivity of 1. The artificial neural network was used to classify the 12 items of the ADOS-G tool algorithm into three levels of impact: High, Medium and Low. The items “Showing”, “Shared enjoyment in interaction” and “Frequency of vocalization directed to others” were found to be the areas of highest impact for autism diagnosis. The methodology presented can also be replicated for other autism diagnostic tools to classify their areas of highest impact.

Keywords: Autism Spectrum Disorder (ASD), diagnosis, detection, ADOS-G, Artificial neural networks, Taguchi Method, orthogonal arrays, classification.

Introduction

Autism Spectrum Disorder (ASD) is a group of developmental disorders whose clinical profile includes a range of impairments in social interaction, communication and imagination, together with reduced and restricted behavior [1]. ASD is a worldwide health problem that was first described in 1943 by Kanner [2]. It usually begins during the first 24 months of life, a period that is crucial for the maturation of human neural circuits. As a result it affects, to varying degrees, the normal development of the brain in social and communication skills. According to the Centers for Disease Control and Prevention, 1 in 68 children has been diagnosed with ASD; this number increased by about 64% from 2006 to 2010 in the U.S. [3]. In Mexico there is no national study that provides the prevalence of Autism [4], but some nongovernmental associations estimate that 1 in 300 children has been diagnosed with Autism in Mexico [5]. The main characteristics of ASD are disorders in social communication and interaction, such as a lack of emotional reciprocity, of nonverbal communication, and of development and management of relationships [6].

Although the causes of ASD remain unknown, all recent clinical data of a neuroanatomical, biochemical, neurophysiologic, genetic and immunological character indicate that autism is a neurodevelopmental disorder with a clear neurobiological basis. Currently there is no biological test for the diagnosis of autism. Diagnosis is achieved by behavioral evaluations specifically designed to identify and measure the presence and severity of the disorder. The evaluations, made by trained and experienced health care professionals, are very important in order to assess the strengths and weaknesses of the child and any associated developmental impairments. The diagnostic criteria have been derived through consensus among specialists and the diagnostic cut-offs are hard to define. It is considered a spectrum because the core impairments in communication and social interaction vary greatly.

In 2012, the Ministry of Health of Mexico published the guide “Diagnosis and Treatment of Autism Spectrum Disorders” with recommendations oriented to early diagnosis and intervention algorithms, recognizing that timely care is a crucial factor for these children to achieve their maximum level of functioning and independence, and to facilitate educational planning, health care and family assistance. The manual includes the diagnostic procedure for the Autism Diagnostic Observation Schedule-Generic (ADOS-G) tool, among other tools [7].

Autism Spectrum Disorder (ASD) Detection and Diagnosis

According to the Clinical guide of Generalized Development disorders of the Infant Psychiatry Hospital “Dr. Juan Navarro” in Mexico City [8], which was based on the multidisciplinary Consensus Panel described by Filipek et al. in 1999 [9], ASD diagnosis can be separated into two levels. The first level corresponds to the detection of development disorders by parents or by health professionals in the first contact clinic. It includes red flags for activities that the child has not developed at specific ages, as well as screening tools such as questionnaires. The second level corresponds to the evaluation and diagnosis of ASD, which should be performed by health specialists in areas such as Psychiatry or Psychology. These specialists can carry out a clinical diagnosis based on the fifth edition of the Diagnostic and Statistical Manual, also known as the DSM-V [6], and the tenth revision of the International Classification of Diseases, also known as the ICD-10 [10], or they can use internationally validated screening and diagnostic tools. A summary of some of these tools is presented in Table 1.

The Autism Diagnostic Observation Schedule-Generic (ADOS-G) Instrument

The ADOS-G scale is a semi-structured instrument based on observation that consists of 4 modules, which are administered in accordance with the age and language skills of the child. Faced with the challenge of characterizing or measuring symptoms and locating a patient at a functioning level, the ADOS-G has the advantage, with its variety of tasks, of making a diagnosis on an observational basis. Through the development of the tasks it manages to represent the deficits and the level of impairment of the patient. It usually takes between 30 and 60 minutes to apply, and the test consists of activities performed by the child in interaction with the expert, who observes him and assigns a grade [18].

Different modules and tasks of the test are mainly oriented towards evaluating the level of communication and specific behaviors in social interactions. Module 1 is used for toddlers that do not use language to communicate. Module 2 is used for children that communicate with flexible phrases composed of 3 words. Module 3 is used for children with fluid language and Module 4 for adolescents and adults. The objective of the instrument is not to evaluate knowledge abilities in the subject but rather to evaluate if the subject wants to participate in a social exchange [19].

Table 1. ASD screening and diagnostic tools


Tool	Type of tool	Age range	Advantages	Disadvantages
Checklist for Autism in Toddlers (CHAT) [11]	Screening test	18 months	Quick application	Low detection capacity
Modified Checklist for Autism in Toddlers Revised with Follow-Up (MCHAT-R/F) [12]	Screening test	16 - 30 months	The predictive capacity increases when used with a clinician’s interview	Low positive predictive capacity. Large number of false positives.
Screening Tool for Autism in Two-Year-Olds (STAT) [13]	Screening test	12-23 months	High sensitivity	Sensitivity and Specificity based only on 12 cases.
Infant Toddler Checklist (ITC) [14]	Screening test	Less than 18 months	High sensitivity	Does not differentiate between ASD and any other developmental disorder.
Childhood Autism Rating Scale (CARS) [15]	Diagnostic test	Starting from 24 months	Quantitative tool that evaluates the severity of the symptoms. Also useful to control evolution of the patient after treatment.	Can misdiagnose ASD in children with intellectual disabilities.
Autism Diagnostic Observation Schedule - Generic (ADOS-G) [16]	Diagnostic test	It can be used for children over 2 years of mental age or in adults	Direct observation of the child interaction through specified activities.	Requires clinical training and practice to observe and evaluate. Takes around 30 minutes to perform the activities and then some more time to evaluate the algorithm.
Autism Diagnostic Interview-Revised (ADI-R) [17]	Diagnostic test	Starting from 18 months	Interview answered by parents that help distinguish ASD from other disorders.	Takes from 1 to 2 hours to apply because it has 93 questions with multiple options.

For this study only Module 1 was used. This module consists of 8 communication items, 12 social interaction items, 2 game quality items, 4 stereotyped behavior items and 3 items for other abnormal behaviors. Of these 29 items, the evaluation algorithm takes only 12 into account. Five items evaluate the child’s ability to communicate: how frequently the child directs vocalizations to others (A2), whether he/she uses words or phrases in a stereotyped way (A5), whether he/she uses another person’s body as a tool to communicate (A6), whether he/she can point to an object of interest (A7) and the emotional gestures he/she normally employs (A8). The algorithm also counts the following 7 items to evaluate the child’s social interaction: whether the child makes unusual eye contact (B1), whether his/her facial expressions directed to others attempt to communicate emotions (B3), whether he/she enjoys interaction with others (B5), whether he/she shows objects to others without asking for a specific need (B9), whether the child tries to draw an adult’s attention towards objects that no one is touching (B10), whether he/she responds to the adult’s attention towards a specific unreachable object (B11) and, finally, the quality of social interaction attempts (B12).

The possible ADOS-G scores are 0, 1, 2, 3, 7 and 8. A score of 0 means no evidence of abnormality related to autism, 1 means mildly abnormal, 2 means definitely abnormal/severity varies, 3 means markedly abnormal/interferes with interview, 7 means abnormal behavior not covered by scores 1 to 3, and 8 means that the behavior could not be evaluated [16].

The evaluation of the ADOS-G algorithm consists of three sums. For the autism cutoff, the sum of the five Communication items (Frequency of vocalization directed to others, Stereotyped use of words, Use of other’s body to communicate, Pointing and Gestures) must be greater than or equal to 4; the sum of the seven Social Interaction items (Unusual eye contact, Facial expression directed to others, Shared enjoyment in interaction, Showing, Spontaneous initiation of joint attention, Response to joint attention and Quality of social overtures) must be greater than or equal to 7; and the sum of all 12 items must be greater than or equal to 12. Only when all three sums reach their cutoff can the child be diagnosed with Autism. The problem with this evaluation is that all areas are weighted equally: as long as the sums reach the cutoffs, Autism is diagnosed. For example, for the ADOS-G algorithm it is the same to have a value of 2 (definitely abnormal) assigned to the item “Pointing” and a value of 2 assigned to the item “Gestures” as to have a value of 1 (mildly abnormal) assigned to each of the four items “Frequency of vocalization directed to others”, “Stereotyped use of words”, “Use of other’s body to communicate” and “Pointing”. Since both sums (“2+2” and “1+1+1+1”) are equal to 4, the Communication cutoff is met in either case. Yet “definitely abnormal” in two areas is not the same as “mildly abnormal” in four areas, since a mild abnormality could be easier to overcome than a definite one. An evaluation based on sums therefore does not focus on the main aspects that determine an Autism diagnosis: many aspects are believed to be relevant symptoms of Autism, but the real impact factors have not been ranked according to their severity or impact.
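The three-sum rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the official scoring software: the function name `meets_autism_cutoff` and the example score dictionaries are hypothetical, and only the item codes and the cutoffs (4, 7 and 12) come from the text.

```python
# Item codes of the ADOS-G Module 1 algorithm as listed in the text.
COMMUNICATION_ITEMS = ("A2", "A5", "A6", "A7", "A8")
SOCIAL_ITEMS = ("B1", "B3", "B5", "B9", "B10", "B11", "B12")


def meets_autism_cutoff(scores):
    """Return True when all three sums reach their cutoffs (4, 7 and 12)."""
    communication = sum(scores[code] for code in COMMUNICATION_ITEMS)
    social = sum(scores[code] for code in SOCIAL_ITEMS)
    total = communication + social
    return communication >= 4 and social >= 7 and total >= 12


# Hypothetical social interaction scores shared by both example cases (sum = 8).
social_scores = {"B1": 2, "B3": 1, "B5": 1, "B9": 1, "B10": 1, "B11": 1, "B12": 1}

# Case 1: "definitely abnormal" (2) in Pointing and Gestures only.
case_two_items = {"A2": 0, "A5": 0, "A6": 0, "A7": 2, "A8": 2, **social_scores}
# Case 2: "mildly abnormal" (1) in four communication items.
case_four_items = {"A2": 1, "A5": 1, "A6": 1, "A7": 1, "A8": 0, **social_scores}

print(meets_autism_cutoff(case_two_items), meets_autism_cutoff(case_four_items))
```

Both example cases reach the cutoffs, because the rule only sees the Communication sum of 4 and not how that sum is distributed over the items.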

Artificial Neural Networks

Artificial Neural Networks (ANN) are computational models based on a simplified version of biological neural networks, with which they share characteristics such as the ability to learn and adapt, generalization, data organization and parallel processing. An ANN is composed of layers: one input layer, one output layer and one or several hidden layers. Each layer contains several neurons, which are processing units that send information to each other through weighted signals; an activation function determines the output of each neuron, as shown in Figure 1. The weights have to be trained, and many neurons can perform their tasks at the same time (parallel processing) [20].

The input of a neuron is the weighted sum of all its input links plus a bias or offset, and the neuron is activated according to the value that the activation function assigns to this sum. The activation function is a sigmoid or “S”-shaped function because it is bounded and always has a continuous derivative. An ANN must be trained with examples, and ANN can be classified depending on their learning process as presented in Figure 2. Learning can be supervised, where both the inputs and the desired outputs are known and the ANN must infer the input-output relationship, or unsupervised, where only the inputs are known and the ANN organizes itself into clusters of patterns. The more examples the network is trained with, the higher the precision it should achieve on new cases.

Figure 1. Multi-layered Artificial Neural Network

Figure 2. ANN learning types [21], [22].

There are two main topologies for ANN (see Figure 3): feed-forward networks, in which information flows in one direction only, from the input towards the output through any number of layers, and recurrent networks, which contain feedback connections so that information may flow in both directions, from input to output and vice versa. In both cases the weights can be trained automatically to obtain an optimal behavior of the network by reducing the error between the desired output and the output obtained from the network.

Figure 3. Main Artificial Neural Network Topologies.

The back-propagation training method consists of minimizing the error with respect to the weights through gradient descent. The error is the difference between the desired output and the real output delivered by the ANN. It is minimized by sending the error back from the output layer to all the hidden layers involved; each neuron receives a proportional part of that error and adjusts the weight of each of its inputs during the next epoch [23]. The activation function is a differentiable function of the inputs given by

$$ y_k^p = F(s_k^p) \tag{1} $$

where $y_k^p$ is the output value for each output unit, $F$ is the activation function and $s_k^p$ is the input of the k unit in pattern p, given by

$$ s_k^p = \sum_j w_{jk}\, y_j^p + \theta_k \tag{2} $$

where $w_{jk}$ is the weighting factor for input j and output k, and $\theta_k$ is the threshold.

Where

j = layer(j)

(3)

The error is defined as the quadratic error (E^p) at the output units for pattern p between the desired output and the real output

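$$ E^p = \frac{1}{2} \sum_{o=1}^{N_o} \left( d_o^p - y_o^p \right)^2 \tag{4} $$

where $d_o^p$ is the desired output for unit o in pattern p. The summed squared error E is then given by

$$ E = \sum_p E^p \tag{5} $$

where $E^p$ is the error on pattern p. Applying the chain rule

$$ \frac{\partial E^p}{\partial w_{jk}} = \frac{\partial E^p}{\partial s_k^p} \frac{\partial s_k^p}{\partial w_{jk}} \tag{6} $$

where

$$ \frac{\partial s_k^p}{\partial w_{jk}} = y_j^p \tag{7} $$

Defining an update rule

$$ \Delta_p w_{jk} = \gamma\, \delta_k^p\, y_j^p \tag{8} $$

where

$$ \delta_k^p = -\frac{\partial E^p}{\partial s_k^p} \tag{9} $$

Applying the chain rule for the change in the error as a function of the output and the change in the output as a function of the changes in the input,

$$ \delta_k^p = -\frac{\partial E^p}{\partial s_k^p} = -\frac{\partial E^p}{\partial y_k^p} \frac{\partial y_k^p}{\partial s_k^p} \tag{10} $$

Using the chain rule, when k is a hidden unit it is called h

$$ \frac{\partial E^p}{\partial y_h^p} = \sum_{o=1}^{N_o} \frac{\partial E^p}{\partial s_o^p} \frac{\partial s_o^p}{\partial y_h^p} = -\sum_{o=1}^{N_o} \delta_o^p\, w_{ho} \tag{11} $$

This yields

$$ \delta_h^p = F'(s_h^p) \sum_{o=1}^{N_o} \delta_o^p\, w_{ho} \tag{12} $$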

Increasing the number of hidden neurons can help to avoid falling into a local minimum and can diminish the error, but it may lead to a long training process [23].
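The update rule (8) and the hidden-unit deltas of equation (12) can be sketched for a network with one hidden layer. This is a minimal pure-Python illustration under stated assumptions, not the authors' Matlab implementation: the function names, the choice of tanh as the activation function F, the learning rate `gamma` and the nested-list weight layout (`w_out[o][h]` is the weight from hidden unit h to output unit o) are all assumptions made for the example. The thresholds are kept fixed to keep the sketch short.

```python
import math


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: s_k = sum_j w_jk * y_j + theta_k and y_k = F(s_k), with F = tanh."""
    s_h = [sum(w * xi for w, xi in zip(row, x)) + b
           for row, b in zip(w_hidden, b_hidden)]
    y_h = [math.tanh(s) for s in s_h]
    s_o = [sum(w * yh for w, yh in zip(row, y_h)) + b
           for row, b in zip(w_out, b_out)]
    y_o = [math.tanh(s) for s in s_o]
    return y_h, y_o


def pattern_error(x, d, w_hidden, b_hidden, w_out, b_out):
    """Quadratic error for one pattern: E^p = 1/2 * sum_o (d_o - y_o)^2."""
    _, y_o = forward(x, w_hidden, b_hidden, w_out, b_out)
    return 0.5 * sum((do - yo) ** 2 for do, yo in zip(d, y_o))


def backprop_updates(x, d, w_hidden, b_hidden, w_out, b_out, gamma=0.1):
    """Return the weight changes gamma * delta_k * y_j for both layers."""
    y_h, y_o = forward(x, w_hidden, b_hidden, w_out, b_out)
    # Output deltas: delta_o = (d_o - y_o) * F'(s_o), using tanh'(s) = 1 - tanh(s)^2.
    delta_o = [(do - yo) * (1.0 - yo ** 2) for do, yo in zip(d, y_o)]
    # Hidden deltas: delta_h = F'(s_h) * sum_o delta_o * w_ho.
    delta_h = [(1.0 - yh ** 2) * sum(delta_o[o] * w_out[o][h]
                                     for o in range(len(delta_o)))
               for h, yh in enumerate(y_h)]
    dw_out = [[gamma * do * yh for yh in y_h] for do in delta_o]
    dw_hidden = [[gamma * dh * xi for xi in x] for dh in delta_h]
    return dw_hidden, dw_out
```

With `gamma=1.0` each returned change equals the negative partial derivative of the pattern error with respect to that weight, which can be checked against a finite-difference estimate computed with `pattern_error`.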

Artificial Neural Networks in Autism

Feed-forward Networks have been used for a great variety of medical applications such as diagnosis of appendicitis, dementia, myocardial infarction, pulmonary embolism, back pain and skin disorders among others [24]. Autism diagnosis should be early, exact, cost effective and easy to use for health specialists so that they can design the best intervention and offer the child more resources to integrate into society. Unfortunately this is not an easy task and requires plenty of knowledge and experience of the clinicians at first and second level of intervention. Artificial Neural Networks may be able to provide the approach needed to detect Autism Spectrum Disorders (ASD) by identifying the highest impact factors that could help detecting it at early stages of children’s development.

Cohen & Sudhalter [25] used Artificial Neural Networks (ANN) for pattern recognition in order to discriminate whether a patient had autism or mental retardation. After training the ANN with information obtained from interviewing 138 parents whose children had autism or mental retardation, the system could predict 97% of the testing cases for autism and 86% of the cases with mental retardation. The general accuracy was 85% with discriminant function analysis, increasing to 92% with the ANN. Veeraraghavan & Srinivasan [26] created a Knowledge Based Screener (KBS) and an intelligent trainer system that can detect different categories of developmental disorders using rule-based expert systems with factual and heuristic information through the internet. Arthi & Tamilarasi [27] created a Neuro-Fuzzy system to diagnose autism from a few questions focused on the patient’s communication and linguistic abilities. The questionnaire is answered by the children’s parents. The system had an overall performance of 85-90%. Wall, Kosmicki, DeLuca, Harstad & Fusaro [28] used an Alternating Decision Tree (ADT) to reduce the Autism Diagnostic Observation Schedule-Generic (ADOS-G) from 29 to 8 items in order to classify autism. The population used to train the system consisted of 612 individuals with autism and 15 individuals without autism; 446 cases were used to verify it, reaching an accuracy of 99.8%. Wall, Dally, Luyster, Jung & DeLuca [29] used an Alternating Decision Tree (ADT) to decrease the number of items of the Autism Diagnostic Interview-Revised tool (ADI-R) from 93 to 7 questions, which are asked to the child’s parents to detect autism. The sample size was 966 individuals and an accuracy of 99.9% was achieved.

Table 2 summarizes the current methods for Autism diagnosis using Artificial Intelligence (AI). It can be observed that most of these methods have used a large sample to train their models and none of them has tried to minimize the sample size. Although a large training sample usually provides better results for AI algorithms such as ANN, the quality of the training samples and the computational cost of training, in time and machine resources, must be taken into consideration too.

Design of Experiments with Taguchi Method

Design of experiments (DOE) is the methodology that defines several conditions for an experiment with multiple variables. The most common technique is the factorial design which considers all the possible combinations for the variables and their states or levels [30]. The full factorial design is given by

$$ N = L^m \tag{13} $$

Where m is the number of factors and L is the number of levels for each factor, that is, the possible values each factor can have. For example, the ADOS-G tool algorithm is composed of 12 items (A2, A5, A6, A7, A8, B1, B3, B5, B9, B10, B11, B12) with 3 possible states for each item (0, 1 and 2). A complete factorial design would need N = 3¹² = 531,441 cases to train the network, which would be costly and time consuming and might generate an overfitted network. The goal is to find a small number of cases that can be used to train an ANN to generate a reliable diagnosis of ASD [31].

The Taguchi method proposes a fractional factorial design based on orthogonal arrays (OA), which are tables of trials in which the levels of the factors are balanced: for any pair of columns, every combination of levels appears the same number of times. The factors can therefore be evaluated independently of each other with the least number of test combinations, whether the variables are linear or nonlinear, dependent or independent. The orthogonal array is used as the data selected to train the ANN. The OA is selected according to the number of parameters and the number of levels of each parameter. In this case, the 12 items from the ADOS-G tool are the 12 parameters and each has 3 possible states, so the corresponding OA is the L₂₇ orthogonal array presented in Table 3, which contains the most representative combinations for the 12 items at different levels.
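The reduction from the full factorial design to 27 runs can be illustrated with a short Python sketch. It builds a 27-run, 13-column, 3-level orthogonal array using a standard linear construction over the integers modulo 3 and checks the pairwise balance property. The function names are hypothetical, the levels are written as 0, 1 and 2, and the column order produced here is an assumption that need not match the published L₂₇ table.

```python
from collections import Counter
from itertools import combinations, product


def l27_orthogonal_array():
    """Build a 27-run, 13-column, 3-level orthogonal array (levels 0, 1, 2)."""
    # One generator per column: the nonzero vectors modulo 3 whose first
    # nonzero entry is 1 (there are 13 of them).
    generators = [g for g in product(range(3), repeat=3)
                  if any(g) and next(v for v in g if v) == 1]
    base_runs = list(product(range(3), repeat=3))  # 27 base combinations
    return [[sum(gi * ri for gi, ri in zip(g, run)) % 3 for g in generators]
            for run in base_runs]


def is_orthogonal(array, levels=3):
    """Check that every pair of columns contains each level pair equally often."""
    expected = len(array) // levels ** 2
    for a, b in combinations(range(len(array[0])), 2):
        counts = Counter((row[a], row[b]) for row in array)
        if len(counts) != levels ** 2 or set(counts.values()) != {expected}:
            return False
    return True


full_factorial = 3 ** 12                      # N = L^m for 12 items, 3 levels
oa = l27_orthogonal_array()
training_inputs = [row[:12] for row in oa]    # keep 12 columns, one per item

print(full_factorial, len(training_inputs), is_orthogonal(training_inputs))
```

Dropping the thirteenth column keeps the balance property for the remaining 12 columns, since the property is defined pair by pair.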

Table 2. State of the Art of Artificial Intelligence methods for Autism diagnosis.


Tool name	Reference	Tool description	Advantage	Disadvantage	Minimization of training sample
Artificial Neural Network (ANN) trained with Backpropagation of the error	Cohen & Sudhalter. (1993) A Neural Network Approach to the Classification of Autism [25]	ANN created to discriminate between Autism and Mental retardation based on the Autism Behavior Interview (ABI)	Increased accuracy from discriminant function analysis with 85% to 92% with ANN.	Based on the DSM-III which evaluates differently Autism compared to the most current version DSM-V. Selection of 11 from 28 questions of the ABI tool which is not validated as a gold standard for Autism diagnosis. 138 samples needed.	No
Knowledge Based Screener (KBS)	Veeraraghavan & Srinivasan (2007). Exploration of Autism using Expert Systems [26]	Rule-based expert system with factual and heuristic knowledge to analyze children development and identify developmental disorders.	Available through internet	Does not mention if the knowledge is obtained from a standardized tool or only from clinical experience. Does not mention if it was tested.	No
Neuro fuzzy system	Arthi & Tamilarasi (2008). Prediction of autistic disorder using neuro- fuzzy system by applying ANN technique [27]	Neurofuzzy system converts inputs from a parent answered questionnaire which is converted to fuzzy membership values. Those values are evaluated with if-then rules and the fuzzy output becomes the input for the artificial neural network trained with backpropagation method.	Helps diagnosing autism with an overall performance of 85-90%	Not based on a certified test, depends on the expertise of the clinicians that help to construct the system. Started with 40 samples and needed to increase to 194 to increase training performance.	No
Alternating Decision tree (ADT)	Wall, Kosmicki, DeLuca, Harstad & Fusaro (2012). Use of machine learning to shorten observation-based screening and diagnosis of autism [28]	ADTree classifier consisting of 8 questions from the ADOS Module1 tool.	Reduction from 29 to 8 items to classify autism with 99.8% accuracy with False positive rate of 0 and True positive rate of 1	Large sample needed for system training (623 individuals)	No
Alternating Decision tree (ADT)	Wall, Dally, Luyster, Jung, DeLuca (2012). Use of Artificial Intelligence to shorten the behavioral Diagnosis of Autism [29]	Decision tree classifier to detect autism rapidly through 7 questions from the ADI-R tool	Reduction from 93 to 7 questions to classify autism with 99.9% accuracy with False positive rate of .013 and True positive rate of 1	Large sample needed for system training (966 individuals)	No

Table 3. Orthogonal Array L₂₇

METHOD

This paper presents the methodology used to find the most significant items from the ADOS-G tool to detect Autism Spectrum Disorders through Feed-forward Artificial Neural Networks with back-propagation training. The number of cases for the network training data was minimized using the Taguchi method with Orthogonal Arrays.

The methodology starts by defining the Autism diagnosis tool; in this case, the ADOS-G was selected because it is an internationally validated tool considered one of the gold standards for Autism detection [16]. The algorithm for this tool evaluates 12 items with 3 possible states, which means that the complete factorial design would have 531,441 cases. The next step was to reduce the number of cases used to train the ANN; as mentioned above, the L₂₇ orthogonal array is the one that corresponds to this number of parameters and states. Since the OA shown in Table 3 uses the states 1, 2 and 3 and the ADOS-G algorithm uses the three states 0, 1 and 2, Table 4 was created as the combination of cases used to train the ANN, with the items expressed in the ADOS-G states. Since the information of column 13 is included in the other 12, only 12 columns were used. The 27 cases were evaluated with the ADOS-G algorithm: the sum of the first 5 items should be greater than or equal to 4, the sum of the next 7 items should be greater than or equal to 7 and the sum of all 12 items should be greater than or equal to 12. Only when these three conditions are met is the case diagnosed as Autism. This algorithm evaluation is shown in the last column of Table 4.

Table 4. L₂₇ Orthogonal array evaluated with ADOS-G test rules.

The next step was to train the ANN. Since both inputs and desired outputs are available, a supervised artificial neural network was created using Matlab software [32]. The ANN was trained using the back-propagation method and consists of 3 layers: the input layer has 40 neurons, the hidden layer has 60 and the output layer has 1 neuron (see Figure 4). The 12 inputs, which are the same 12 items that the ADOS-G algorithm evaluates, can have values of 0, 1 or 2. The activation function was a hyperbolic tangent sigmoid function (see Figure 5) and the output value is read on a scale from 0 to 1: output values greater than or equal to 0.5 are considered Autism spectrum disorders and values below 0.5 are considered non-Autism spectrum disorders.
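The architecture and the decision rule described above can be sketched as a forward pass in plain Python. This is only an illustration with random, untrained weights, not the authors' Matlab network: the function names, the weight initialization and the reading of the "input layer" as a first layer of 40 neurons fed by the 12 item scores are assumptions. Note that the hyperbolic tangent itself is bounded between -1 and 1, so the 0.5 threshold is applied to whatever value the output neuron produces.

```python
import math
import random

# 12 item scores followed by the 40-60-1 layers described in the text.
LAYER_SIZES = (12, 40, 60, 1)


def init_network(seed=0):
    """Create small random (untrained) weights and zero biases for each layer."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(LAYER_SIZES, LAYER_SIZES[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def network_output(layers, scores):
    """Propagate the 12 item scores through every layer with tanh activations."""
    activations = list(scores)
    for weights, biases in layers:
        activations = [math.tanh(sum(w * a for w, a in zip(row, activations)) + b)
                       for row, b in zip(weights, biases)]
    return activations[0]


def classify(output, threshold=0.5):
    """Apply the decision rule from the text: 0.5 or above is read as ASD."""
    return "ASD" if output >= threshold else "non-ASD"


layers = init_network()
print(classify(network_output(layers, [0] * 12)))
```

A trained network would replace the random weights with the values obtained by back-propagation on the 27 orthogonal-array cases.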

Once the ANN was created, validation of the network was performed. It is common practice in ANN training to use a cross-validation method to estimate the performance of the learning algorithm. K-fold cross-validation consists of dividing the total number of available cases into k parts, so that one part (1/k of the cases) is used only for validation while the remaining k-1 parts are used for training. The training is repeated until all k parts have been used for validation. One of the most common choices in machine learning is 10-fold cross-validation, which means that in each repetition 90% of the samples are used to train the network and the other 10% are used to test its accuracy. Another form of validation is hold-out validation, which avoids any overlap between training data and validation data: part of the available data is held out during training and used only for validation. The problem with this type of validation is that the results are highly dependent on the choice of the training data [33].
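The k-fold partition described above can be sketched as follows. This is a minimal illustration with a hypothetical function name; it splits the case indices in order, whereas in practice the cases are usually shuffled before splitting.

```python
def k_fold_splits(n_cases, k):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    indices = list(range(n_cases))
    # Fold sizes differ by at most one when n_cases is not a multiple of k.
    fold_sizes = [n_cases // k + (1 if i < n_cases % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


# With 100 cases and k = 10, each fold trains on 90 cases and validates on 10.
for train, validation in k_fold_splits(100, 10):
    print(len(train), len(validation))
```

Every case is used for validation exactly once across the k repetitions, which is what distinguishes this scheme from a single hold-out split.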

For the work presented here, the hold-out validation method was used. It makes no sense to divide the orthogonal array of 27 cases into two parts (training and validation), because the 27 cases are meant to be the most representative combinations in this method. Therefore the complete orthogonal array of 27 cases was taken as training data. In order to validate the network, 11 different cases were used. These 11 cases were obtained from real children evaluated with the ADOS-G tool by a Psychologist. This means that a total of 38 cases were used in this work, of which 71% were used for training (27 cases from the orthogonal array) and 29% were held out for validation (11 cases from real children).

After the ANN was validated, the following step was to classify the 12 items from the ADOS-G tool into impact degrees for Autism diagnosis. Tests and results from the ANN were observed to find the factors that consistently generate an Autism diagnosis. The 12 items were classified within 3 ranges of impact: low, medium and high. The complete methodology is represented as a flow diagram in Figure 6.

Figure 4. ANN network designed with OA L27

Figure 5. Hyperbolic tangent sigmoid function used for activation function.

RESULTS AND DISCUSSION

Validation of the ANN was performed with 11 real cases that had not been used for training. First, the 11 cases were diagnosed by a Psychologist based on clinical observation of the DSM-V parameters [6]; the psychologist diagnosed 6 cases as Autism Spectrum Disorder and 5 as no Autism Spectrum Disorder. The same 11 cases were then evaluated with the ADOS-G algorithm, which agreed with the psychologist on the 6 ASD cases and on the 5 non-ASD cases. Finally, the 11 cases were tested on the ANN. Recalling that ANN output values greater than or equal to 0.5 are considered ASD, 6 true positive cases were classified as ASD and 5 true negative cases were classified as non-ASD. These results yield a sensitivity of 1 and a specificity of 1.
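The sensitivity and specificity reported above follow from the usual definitions, sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). The sketch below applies them to the counts given in the text (6 ASD and 5 non-ASD cases, all classified correctly). The function name and the ordering of the cases in the lists are hypothetical, and the function assumes that both classes are present in the reference labels.

```python
def sensitivity_specificity(reference, predicted):
    """Compute (sensitivity, specificity) from two lists of booleans (True = ASD)."""
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r and p)
    fn = sum(1 for r, p in pairs if r and not p)
    tn = sum(1 for r, p in pairs if not r and not p)
    fp = sum(1 for r, p in pairs if not r and p)
    return tp / (tp + fn), tn / (tn + fp)


# Reference diagnosis: 6 ASD cases and 5 non-ASD cases (order is illustrative).
reference = [True] * 6 + [False] * 5
# The ANN classified every case in agreement with the reference diagnosis.
ann_prediction = list(reference)

print(sensitivity_specificity(reference, ann_prediction))
```

With no false negatives and no false positives, both ratios equal 1, matching the values reported for the 11 validation cases.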

Once the ANN was trained and validated, the following step was to classify the 12 factors by their impact on the diagnosis. The orthogonal array was first inspected to find the factors that consistently generate an ASD diagnosis. Then, using the ANN, several tests were performed to classify the 12 areas within 3 ranges of impact (low, medium and high), as shown in Table 6. The first row of Table 6 shows that when the factors classified as high (A2, B5 and B9) are assigned a value of 2 and the rest a value of zero, the output is 0.832, which is an Autism diagnosis. The combination of those 3 areas alone already provides an Autism diagnosis, which is why they are called high impact factors. Medium and low impact factors alone give a non-Autism diagnosis; see rows 2 and 3 of Table 6. When the high impact factors are set to 2 and the medium factors to 1, the output is 0.996, which is even higher than for the high impact factors alone; see row 6 of Table 6. It can also be observed that the low impact areas have minimal relevance for diagnosing Autism when combined with high or medium impact factors; see rows 6-9 of Table 6. Classifying the areas of the test in three ranges allows the user to focus more on the high and medium impact areas while still considering the low impact ones for specific cases.

Figure 6. Methodology flow diagram

In Table 7, the 12 items from the ADOS-G tool algorithm are classified by impact range as High, Medium or Low, according to the tests performed with the ANN. The Codes column gives the ADOS-G tool code of each item for easier identification. The order of the items within each impact range is not significant.

The results obtained here were compared with the work of Wall, Kosmicki, DeLuca, Harstad & Fusaro [28], who reduced the ADOS-G tool from 29 to 8 items in order to classify autism using an Alternating Decision Tree. Table 8 summarizes the 8 items that they found. B2, C1 and C2 are evaluated during the activities in the ADOS-G tool, but they are not included in the diagnosis algorithm. Notably, the 3 high impact factors A2, B5 and B9, one medium impact factor B1 and one low impact factor B10 are also among Wall's items.

Table 5. ANN validation

Table 6. Impact area tests

Table 7. Classification of ADOS-G items in three ranges of impact.
Range	Item	Codes
High	Showing	B9
	Shared enjoyment in interaction	B5
	Freq. of vocalization directed to others	A2
Medium	Stereotyped use of words or phrases	A5
	Unusual eye contact	B1
	Use of other’s body to communicate	A6
	Pointing	A7
	Facial expression directed to others	B3
	Response to joint attention	B11
Low	Gestures	A8
	Spontaneous initiation of joint attention	B10
	Quality of social overtures	B12

Table 8. Items used from ADOS-G tool to classify Autism by Wall [28]
Item	Code
Freq. of vocalization directed to others	A2
Unusual eye contact	B1
Responsive social smile	B2^*
Shared enjoyment in interaction	B5
Showing	B9
Spontaneous initiation of joint attention	B10
Functional play with objects	C1^*
Imagination/creativity	C2^*
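
The overlap between Table 7 and Table 8 can be checked mechanically. The Python sketch below simply intersects the two sets of item codes as listed in the tables; the variable names are illustrative.

```python
# Impact range of each ADOS-G algorithm item, as listed in Table 7.
impact_by_code = {
    "B9": "High", "B5": "High", "A2": "High",
    "A5": "Medium", "B1": "Medium", "A6": "Medium",
    "A7": "Medium", "B3": "Medium", "B11": "Medium",
    "A8": "Low", "B10": "Low", "B12": "Low",
}

# Item codes selected by Wall et al., as listed in Table 8.
wall_codes = ["A2", "B1", "B2", "B5", "B9", "B10", "C1", "C2"]

# Items present in both lists, with their impact range from Table 7.
shared = {code: impact_by_code[code] for code in wall_codes if code in impact_by_code}

# Items in Table 8 that are not part of the 12-item diagnosis algorithm.
outside_algorithm = [code for code in wall_codes if code not in impact_by_code]

print(shared)
print(outside_algorithm)
```

Five of the eight items are shared, including all three high impact items; the remaining three are the items marked with an asterisk in Table 8.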

Conclusions

Artificial Neural Networks can be used for Autism Spectrum Disorder detection. Because ANNs learn from examples, a feed-forward network was trained with the back-propagation method to approximate Autism diagnosis based on the ADOS-G tool algorithm. The training samples were selected as an orthogonal array using the Taguchi method, which picks the smallest number of combinations that still form a representative sample suitable for training. Design of Experiments through the Taguchi method considerably reduces the number of cases used to train the ANN, from 531,441 to 27, which in turn reduces training time and computing resources.
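
The reduction can be illustrated numerically: 12 items with 3 levels each give 3^12 = 531,441 full-factorial combinations, while a 27-run orthogonal array covers every pair of items in a balanced way. The Python sketch below builds one standard 27-run, three-level orthogonal array from linear combinations modulo 3 and checks its balance. This construction is an assumption chosen for illustration; the rows and column assignment of the array actually used for training may differ.

```python
from collections import Counter
from itertools import combinations, product


def l27_array(n_factors=12):
    """Build a 27-run, 3-level orthogonal array with up to 13 columns.

    Rows are all triples (a, b, c) over {0, 1, 2}. Each column is a linear
    combination of (a, b, c) modulo 3, one per direction vector whose first
    nonzero entry is 1 (there are 13 such vectors).
    """
    directions = [v for v in product(range(3), repeat=3)
                  if any(v) and next(x for x in v if x) == 1]
    rows = list(product(range(3), repeat=3))
    return [[sum(r * d for r, d in zip(row, direction)) % 3
             for direction in directions[:n_factors]]
            for row in rows]


def is_orthogonal(array, levels=3):
    """Check that every pair of columns contains each level pair equally often."""
    expected = len(array) // (levels * levels)
    for i, j in combinations(range(len(array[0])), 2):
        counts = Counter((row[i], row[j]) for row in array)
        if len(counts) != levels * levels or set(counts.values()) != {expected}:
            return False
    return True


full_factorial = 3 ** 12          # all combinations of 12 items at 3 levels
array = l27_array(12)

print(full_factorial, len(array), len(array[0]), is_orthogonal(array))
```

Every pair of columns contains each of the 9 level pairs exactly 3 times, which is the balance property that lets 27 cases stand in for the full factorial design.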

The network achieved an accuracy of 100% for Autism diagnosis on the 11 validation cases, with specificity and sensitivity of 1, validated against both the ADOS-G algorithm and a psychologist's evaluation based on the DSM-V.

A general advantage of ANNs is that they can approximate an unknown system when trained by examples. This same advantage becomes a disadvantage when an explicit model of the system is needed, for example to control or observe it. Like every tool, an ANN should be analyzed before being applied to each specific situation.

The designed ANN was used to classify the 12 items from the ADOS-G tool algorithm into three impact ranges: Low, Medium and High. Showing, Shared enjoyment in interaction and Frequency of vocalization directed to others are the three items of high impact for Autism detection. The medium impact items are Stereotyped use of words or phrases, Unusual eye contact, Use of other's body to communicate, Pointing, Facial expression directed to others and Response to joint attention. The items with the least influence are Gestures, Spontaneous initiation of joint attention and Quality of social overtures. Combining high impact with medium impact factors increases the output value obtained during diagnosis.

This classification was compared with the work of Wall et al. [28]. The main difference between the two works is that they used 623 individuals to train the Alternating Decision Tree (ADT), whereas the methodology presented here used only 27 cases, selected with the Taguchi method, as training data.

References

[1] L. Wing, "The autistic spectrum", The Lancet, vol. 350, no. 9093, pp. 1761-1766, 1997.
[2] L. Kanner, "Autistic disturbances of affective contact", Nervous Child, vol. 2, no. 3, pp. 217-250, 1943.
[3] Centers for Disease Control and Prevention, "Prevalence of autism spectrum disorders-Autism and Developmental Disabilities Monitoring Network, 14 sites, United States, 2008", MMWR, vol. 61, no. 3, pp. 1-2, 2012.
[4] M. Marquez-Caraveo and L. Albores-Gallo, "Autistic spectrum disorders: Diagnostic and therapeutic challenges in Mexico", Salud Mental, vol. 34, no. 5, pp. 435-441, 2011.
[5] C. Marcín (2013, February 6), "Prevalencia del Autismo en México" [Online]. Available: http://www.clima.org.mx/images/pdf/prevalencia.pdf.
[6] American Psychiatric Association [APA], "The Diagnostic and Statistical Manual of Mental Disorders: DSM 5", Arlington, 2013, pp. 50-59.
[7] R. Arias et al., "Diagnostico y Manejo de los Trastornos del Espectro Autista", IMSS, México, 2012.
[8] M. Marquez-Caraveo et al., "Guía Clínica: Trastornos Generalizados del Desarrollo", Guías Clínicas del Hospital Psiquiátrico Infantil Dr. Juan N Navarro, 2010.
[9] P. A. Filipek et al., "The Screening and Diagnosis of Autistic Spectrum Disorders", Journal of Autism and Developmental Disorders, vol. 29, no. 6, pp. 439-484, 1999.
[10] World Health Organization, "International Statistical Classification of Diseases and Related Health Problems", 10th Revision, 2nd Ed., 2004.
[11] S. Baron-Cohen et al., "Early identification of autism by the Checklist for Autism in Toddlers (CHAT)", Journal of the Royal Society of Medicine, vol. 93, no. 10, pp. 521-525, 2000.
[12] D. Robins et al. (2009), "Modified Checklist for Autism in Toddlers, Revised with Follow-Up (M-CHAT-R/F)" [Online]. Available: http://www.autismspeaks.org/sites/default/files/docs/sciencedocs/m-chat/m-chat-r_f.pdf?v=1.
[13] W. Stone et al., "Brief Report: Screening Tool for Autism in Two-Year-Olds (STAT): Development and Preliminary Data", Journal of Autism and Developmental Disorders, vol. 30, no. 6, pp. 607-612, 2000.
[14] A. Wetherby et al., "Validation of the Infant-Toddler Checklist as a Broadband Screener for Autism Spectrum Disorders from 9 to 24 Months of Age", Autism, vol. 12, no. 5, pp. 487-511, 2008.
[15] C. Chlebowski et al., "Using the Childhood Autism Rating Scale to Diagnose Autism Spectrum Disorders", J Autism Dev Disord, vol. 40, no. 7, pp. 787-799, 2010.
[16] C. Lord et al., "The autism diagnostic observation schedule-generic: a standard measure of social and communication deficits associated with the spectrum of autism", Autism Dev Disord, vol. 30, no. 3, pp. 205-223, 2000.
[17] C. Lord et al., "Autism Diagnostic Interview - Revised: A Revised Version of a Diagnostic Interview for Caregivers of Individuals with Possible Pervasive Developmental Disorders", Autism and Dev Disor, vol. 4, no. 5, pp. 659-685, 1994.
[18] K. Gotham, "Standardizing ADOS Scores for a measure of Severity in autism spectrum disorders", J Autism Dev Disord, vol. 39, no. 5, pp. 693-705.
[19] C. Lord et al., "ADOS, Escala de observación para el diagnóstico del autismo", TEA Ediciones, 2008.
[20] P. Ponce, "Inteligencia Artificial con Aplicaciones a la Ingeniería", 1a Ed., Cd. Mexico, Mexico, Alfaomega, 2010, ch. 3, pp. 193-234.
[21] G. Borgersen and L. Karlsson, "Supervised learning in artificial neural networks", in IRCSE, Västerås, Sweden, 2008, pp. 1-6.
[22] S. Gopal, "Artificial Neural Networks for Spatial Data Analysis", in NCGIA Core Curriculum in GIScience, Boston, MA, 1998.
[23] B. Krose and P. Van Der Smagt (1996), "An Introduction to Neural Network" [Online]. Available: http://www.ieee.org/documents/ieeecitationref.pdf.
[24] W. Baxt, "Application of artificial neural networks to clinical medicine", The Lancet, vol. 346, no. 8983, pp. 1135-1138, 1995.
[25] I. Cohen et al., "A neural network approach to the classification of autism", J Autism and Dev Dis, vol. 23, no. 3, pp. 443-66, 1993.
[26] K. Srinivasan and S. Veeraraghavan, "Exploration of Autism using Expert Systems", in ITNG'07, Las Vegas, NV, 2007, pp. 261-264.
[27] K. Arthi and A. Tamilarasi, "Prediction of autistic disorder using neuro fuzzy system by applying ANN technique", International Journal of Developmental Neuroscience, vol. 26, no. 7, pp. 699-704, 2008.
[28] D. Wall et al., "Use of machine learning to shorten observation-based screening and diagnosis of autism", Translational Psychiatry, vol. 2, no. 100, pp. 1-8, 2012.
[29] D. Wall et al., "Use of Artificial Intelligence to shorten the behavioral Diagnosis of Autism", Plos One, vol. 7, no. 8, pp. 1-8, 2012.
[30] R. Ranjit, "A Primer on the Taguchi Method", New York: Van Nostrand Reinhold, 1990, pp. 1-5.
[31] L. Sun et al., "A New Modeling Technique Based on the ANN and DOE for Interconnects", in ASIC, China, 2001.
[32] The MathWorks, Inc. (2014), "MATLAB Primer R2014a" [Online]. Available: http://www.mathworks.com/help/pdf_doc/matlab/getstart.pdf.
[33] P. Refaeilzadeh et al., "Cross Validation", Encyclopedia of Database Systems, pp. 532-538, 2009.

Revista Mexicana de Ingeniería Biomédica
