Extraction of sensitive data out of downscaled images

Çağlar Arlı

Our system produces preview images of A4 documents that contain sensitive data about our users (email, phone numbers, social networks and addresses). We would like to store those preview images in our cloud so that we can later show them to the corresponding users as thumbnails in our dashboard, to improve UX.

However, one of our concerns is the security of our users' PII while those images are stored in our cloud.

In the following post, people have mentioned that blurring is a bad way of obfuscating image contents, since a blurred image can often be recovered by brute-forcing the blur operator with a cost function.

My question is: Is scaling images down to a small size (from A4 format to, let's say, 150x220 px) a safe way of making sure that sensitive data cannot be extracted from them?

PS: I have tried this out myself in GIMP. Images of that size basically preserve the main structure (design elements, formatting, alignment, etc.), but extracting the textual content was not really possible. However, I am not a GIMP or image-processing expert. Is it still somehow possible to extract plain-text data from those images with more advanced methods?
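
To put a rough number on the 150 px wide target mentioned above, here is a small sketch of how many pixels one em of text would keep in the thumbnail. The 10 pt font size is an assumption on my side (the real documents may use other sizes), and the function name is only illustrative. The result depends only on the thumbnail width and the physical page width, not on the resolution the page was originally rendered at. It is an estimate of glyph size, not a statement about whether the text is recoverable.

```python
# Rough estimate of how tall text is, in pixels, after an A4 page is
# scaled down to a thumbnail of a given width.
# Assumption (not from the documents themselves): body text is 10 pt.

MM_PER_INCH = 25.4
POINTS_PER_INCH = 72
A4_WIDTH_MM = 210.0  # A4 is 210 mm x 297 mm


def text_height_after_downscale(font_pt, target_width_px, page_width_mm=A4_WIDTH_MM):
    """Approximate height in pixels of one em of text in the thumbnail."""
    page_width_in = page_width_mm / MM_PER_INCH
    # Effective thumbnail resolution in pixels per inch of the original page.
    thumb_ppi = target_width_px / page_width_in
    # One em equals the font size; convert points to inches, then to pixels.
    return font_pt / POINTS_PER_INCH * thumb_ppi


em_px = text_height_after_downscale(font_pt=10, target_width_px=150)
print(f"10 pt text is about {em_px:.2f} px tall in a 150 px wide thumbnail")
```

Under that assumption, a full line of 10 pt text ends up only two to three pixels tall, and individual lowercase letters are smaller still. Larger elements such as headings or logos would keep proportionally more pixels.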