Visual Question Answering Using Stacked Attention Networks