Transductive few-shot image recognition with ranking-based multi-modal knowledge transfer