
OCR with Azure Cognitive Services

We will use the Azure Computer Vision API to read text from images.
First, log in to portal.azure.com and create a new resource group, say ‘Translator’. In this resource group, add a Computer Vision resource.
[Screenshot: adding a Computer Vision resource in the Azure portal]
The free tier is good enough for experimentation.
Copy the service keys; you will use them from your app as shown below.

// Computer Vision OCR endpoint; use the region your resource was created in.
readonly static string uri = "https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/ocr?language=unk&detectOrientation=true";

var client = new HttpClient();
// Pass the subscription key of your Computer Vision resource in the standard header.
client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", (string)Application.Current.Resources["BingImageSearchKey"]);

// The raw image bytes form the request body.
var content = new ByteArrayContent(imageBytes);
content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");

HttpResponseMessage response = await client.PostAsync(uri, content);
string responseText = await response.Content.ReadAsStringAsync();

// Deserialize the JSON response and flatten regions -> lines -> words into plain text.
var jsonResponse = JsonResponse.FromJson(responseText);
var sb = new StringBuilder();
foreach (Region region in jsonResponse.Regions)
{
    foreach (OCRResponse.Line line in region.Lines)
    {
        foreach (Word word in line.Words)
        {
            sb.Append(word.Text);
            sb.Append(" ");
        }
        sb.AppendLine();
    }
}
string recognizedText = sb.ToString();

The image bytes are the body of the request above.
Below are the C# model classes used to deserialize the JSON response from the Azure Vision API.
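For reference, a response from the OCR endpoint has roughly this shape (illustrative values; each boundingBox is a comma-separated "left,top,width,height" string):

```json
{
  "language": "en",
  "orientation": "Up",
  "textAngle": 0.0,
  "regions": [
    {
      "boundingBox": "21,16,304,41",
      "lines": [
        {
          "boundingBox": "28,16,288,41",
          "words": [
            { "boundingBox": "28,16,130,41", "text": "Hello" },
            { "boundingBox": "165,16,151,41", "text": "World" }
          ]
        }
      ]
    }
  ]
}
```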

using System.Globalization;
using Newtonsoft.Json;
using Newtonsoft.Json.Converters;

namespace OCRResponse {
    public partial class JsonResponse {
        [JsonProperty("language")]
        public string Language { get; set; }

        [JsonProperty("orientation")]
        public string Orientation { get; set; }

        [JsonProperty("textAngle")]
        public double TextAngle { get; set; }

        [JsonProperty("regions")]
        public Region[] Regions { get; set; }
    }

    public partial class Region {
        [JsonProperty("boundingBox")]
        public string BoundingBox { get; set; }

        [JsonProperty("lines")]
        public Line[] Lines { get; set; }
    }

    public partial class Line {
        [JsonProperty("boundingBox")]
        public string BoundingBox { get; set; }

        [JsonProperty("words")]
        public Word[] Words { get; set; }
    }

    public partial class Word {
        [JsonProperty("boundingBox")]
        public string BoundingBox { get; set; }

        [JsonProperty("text")]
        public string Text { get; set; }
    }

    public partial class JsonResponse {
        public static JsonResponse FromJson(string json) =>
            JsonConvert.DeserializeObject<JsonResponse>(json, Converter.Settings);
    }

    public static class Serialize {
        public static string ToJson(this JsonResponse self) =>
            JsonConvert.SerializeObject(self, Converter.Settings);
    }

    internal static class Converter {
        public static readonly JsonSerializerSettings Settings = new JsonSerializerSettings {
            MetadataPropertyHandling = MetadataPropertyHandling.Ignore,
            DateParseHandling = DateParseHandling.None,
            Converters = {
                new IsoDateTimeConverter { DateTimeStyles = DateTimeStyles.AssumeUniversal }
            },
        };
    }
}

Pretty simple, right?
That is all you need to do to perform OCR with the Azure Vision API.
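To sanity-check the deserialization without calling the service, you can feed a canned response through the same regions/lines/words traversal. The sketch below uses trimmed-down copies of the model classes so it is self-contained; it assumes the Newtonsoft.Json package, as in the article, and the JSON values are illustrative.

```csharp
using System;
using System.Text;
using Newtonsoft.Json;

// Minimal copies of the model classes, enough to flatten the text.
public class JsonResponse
{
    [JsonProperty("regions")] public Region[] Regions { get; set; }
}
public class Region
{
    [JsonProperty("lines")] public Line[] Lines { get; set; }
}
public class Line
{
    [JsonProperty("words")] public Word[] Words { get; set; }
}
public class Word
{
    [JsonProperty("text")] public string Text { get; set; }
}

public static class Demo
{
    public static void Main()
    {
        // A canned OCR-style response: one region, one line, two words.
        string json = @"{ ""regions"": [ { ""lines"": [ { ""words"": [
            { ""text"": ""Hello"" }, { ""text"": ""World"" } ] } ] } ] }";

        var response = JsonConvert.DeserializeObject<JsonResponse>(json);

        // Same flattening loop as in the OCR snippet above.
        var sb = new StringBuilder();
        foreach (var region in response.Regions)
            foreach (var line in region.Lines)
            {
                foreach (var word in line.Words)
                {
                    sb.Append(word.Text);
                    sb.Append(" ");
                }
                sb.AppendLine();
            }

        Console.WriteLine(sb.ToString().Trim()); // prints "Hello World"
    }
}
```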

I am a curious mind, always building and experimenting, trying things to decode their inner design. Kind of a small scale inventor ;-) I like to help people realise their passion.
