Multi-Encoder Context Aggregation Network for Structured and Unstructured Urban Street Scene Analysis