Hi all, I'm considering creating a questionnaire to help with self-diagnosis of position on the autism spectrum, similar to the Baron-Cohen questionnaire some of you may have seen but with some improvements.
Good questions. The technique I'm intending to use is a gradient-descent-based matrix factorisation. If you imagine a matrix (table) with the questions along the top and the respondents down the side, each square in the matrix represents one respondent's answer to one question, including the question "Where are you on the autism spectrum?". Any unanswered questions are represented by empty squares.
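By factorising the matrix we obtain two vectors, one with a value per respondent and one with a value per question. Multiplying a respondent's value by a question's value gives an estimate for that square, so together the two vectors give us an approximation of the whole matrix, including any missing values - e.g. where you are on the spectrum. We can evaluate how accurate this approximation is by comparing estimated values with actual answers.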
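To make the step above concrete, here is a minimal Python sketch of a single factorisation fitted by gradient descent on the answered squares only. The function name `factorise_rank_one`, the learning rate, the step count and the toy numbers are all my own assumptions for illustration, not a tuned implementation.

```python
import numpy as np

def factorise_rank_one(answers, steps=5000, lr=0.01, seed=0):
    """Fit answers[i, j] ~ u[i] * v[j] using only the answered (non-NaN) squares.

    Hypothetical helper: the name, lr and steps are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    observed = ~np.isnan(answers)
    filled = np.where(observed, answers, 0.0)  # placeholder zeros, masked out below
    u = rng.normal(scale=0.1, size=answers.shape[0])  # one value per respondent
    v = rng.normal(scale=0.1, size=answers.shape[1])  # one value per question
    for _ in range(steps):
        # Error on answered squares only; empty squares contribute nothing.
        err = np.where(observed, filled - np.outer(u, v), 0.0)
        # Simultaneous gradient step on the squared error.
        u, v = u + lr * (err @ v), v + lr * (err.T @ u)
    return u, v

# Toy data: 4 respondents x 3 questions, with one square left unanswered.
answers = np.outer([1.0, 2.0, 3.0, 4.0], [1.0, 0.5, 2.0])
answers[2, 1] = np.nan
u, v = factorise_rank_one(answers)
estimate = u[2] * v[1]  # prediction for the empty square
```

In this toy case the data really does come from one pair of vectors, so the estimate for the empty square should land close to the value that was removed. Real answers would be noisier.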
Further to this, we can subtract our approximated answers from the original matrix and perform another factorisation on what is left over. We can repeat this cycle until we are no longer improving our overall predictive ability. The number of vector pairs generated is an indication of how many independent factors underlie the answers matrix.
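The repeat-until-no-improvement cycle could look something like the sketch below. How "predictive ability" is measured is my assumption here: I hold back a random share of the answered squares and stop when the error on those stops dropping. The names `fit_rank_one`, `extract_factors`, `holdout` and `min_gain` are hypothetical.

```python
import numpy as np

def fit_rank_one(target, mask, steps=3000, lr=0.002, seed=0):
    """Gradient descent for target[i, j] ~ u[i] * v[j] over squares where mask is True."""
    rng = np.random.default_rng(seed)
    u = rng.normal(scale=0.1, size=target.shape[0])
    v = rng.normal(scale=0.1, size=target.shape[1])
    for _ in range(steps):
        err = np.where(mask, target - np.outer(u, v), 0.0)
        u, v = u + lr * (err @ v), v + lr * (err.T @ u)
    return u, v

def rmse(residual, mask):
    """Root mean squared residual over the squares selected by mask."""
    return float(np.sqrt(np.mean(residual[mask] ** 2)))

def extract_factors(answers, holdout=0.2, max_factors=10, min_gain=1e-3, seed=0):
    """Peel off vector pairs until held-back answers stop being predicted better.

    Assumes enough answered squares that the held-back set is not empty.
    """
    rng = np.random.default_rng(seed)
    observed = ~np.isnan(answers)
    validate = observed & (rng.random(answers.shape) < holdout)  # held-back squares
    train = observed & ~validate
    residual = np.where(observed, answers, 0.0)
    history = [rmse(residual, validate)]  # error before any factor is fitted
    factors = []
    for k in range(max_factors):
        u, v = fit_rank_one(residual, train, seed=seed + k)
        candidate = residual - np.outer(u, v)  # subtract the approximation
        score = rmse(candidate, validate)
        if history[-1] - score < min_gain:
            break  # the new pair no longer improves prediction
        residual = candidate
        factors.append((u, v))
        history.append(score)
    return factors, history
```

`len(factors)` is then the number of vector pairs found, and `history` records the held-back error after each accepted pair. The step size is fixed, so answers on a much larger numeric scale would need a smaller `lr`.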
We may only find one pair of vectors, and if so we might expect this to represent the position on the autism spectrum, so each question's vector value indicates its correlation with autism, and likewise for each respondent's value.
So in answer to question (2), the 'autisticness' of each question simply falls out of the algorithm. It is not predetermined in any way; it is the value that the algorithm has found gives the best predictions for the matrix as a whole.
(1) What questions do we pick? It doesn't matter; we can use whatever questions we like and as many as we like, even seemingly unrelated stuff like what brand of toothpaste you use. Any questions that aren't relevant to autism will have zero (or near zero) correlation with the autism question. There may be other non-autism-based correlations discovered, but if we ask sensible questions then those will be minimal and, if they do exist, enlightening.
(3) Sure, we can do exactly that: we can calculate an estimate for any question given a few actual answers. So someone might enter that they are far right on the spectrum and have gluten intolerance, and from this we might generate a high prediction for rocking motions, say.
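As a sketch of how that prediction might work once the question vectors exist: fit the new person's value(s) to the few answers they gave, then multiply back out to fill in the rest. Using a least-squares fit for this is my assumption, and the function name `predict_missing`, the question order and all the numbers below are made up purely for illustration.

```python
import numpy as np

def predict_missing(partial_answers, question_factors):
    """Estimate unanswered questions for one person from already-fitted question vectors.

    partial_answers: 1-D array with NaN where the person gave no answer.
    question_factors: array of shape (n_factors, n_questions).
    """
    question_factors = np.atleast_2d(question_factors)
    known = ~np.isnan(partial_answers)
    # Least-squares fit of this person's factor value(s) to their known answers.
    weights, *_ = np.linalg.lstsq(
        question_factors[:, known].T, partial_answers[known], rcond=None
    )
    estimate = weights @ question_factors
    # Keep the real answers, fill only the gaps.
    return np.where(known, partial_answers, estimate)

# Invented question vector, in the order:
# [spectrum position, gluten intolerance, toothpaste brand, rocking motions]
question_vector = np.array([[2.0, 1.0, 0.0, 1.5]])
person = np.array([4.0, 2.0, np.nan, np.nan])  # only the first two answered
filled = predict_missing(person, question_vector)
```

With these invented numbers the person's fitted value is 2, so the rocking motions estimate comes out high and the toothpaste estimate stays at zero, since its question value is zero.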
I realise that was a slightly involved answer. It's all based on the matrix factorisation technique that was widely used, discussed and modified for the Netflix Prize some time ago and was best explained in this post. So instead of answers to autism questions you have ratings of movies.