In-Sensor Visual Perception and Inference