Distributed information encoding and decoding using self-organized spatial patterns

Overview

Dynamical systems describe the spatiotemporal evolution of a system's state. They often generate distinct outputs from different initial conditions, so the input that produced a given output can be inferred. This property captures the essence of information encoding and decoding.

While chaotic systems have been proposed for secure encoding and decoding [1], they are extremely sensitive to perturbation: small changes in initial conditions can lead to drastically different outputs (the avalanche effect). On the one hand, this property lays the foundation for information security, because it is difficult for attackers to decode a message without prior knowledge of the system. On the other hand, noise and error are often unavoidable in encoding, so this high sensitivity can make decoding challenging even for the intended recipient.

Figure 1: Bacteria form diverse self-organized colony patterns.

To provide both reliability and security, we propose using biological self-organized patterns, which are often more convergent. That is, for the same or similar input configurations and environmental conditions, the final patterns are globally similar despite local variations caused by random noise. This property is sometimes referred to as the “edge of chaos”.

To this end, we use a model pattern-formation system that mimics the branching patterns of Pseudomonas aeruginosa to establish distributed information encoding. Coupled with machine learning (ML)-mediated decoding, our system illustrates a scalable strategy for information encoding and decoding with quantifiable reliability and security.

Video 1: The paragraph on the left is encoded into this video.

How does it work?

Figure 2: Pattern-based encoding-decoding scheme.

To encode, we convert each character (e.g., a letter, digit, or punctuation mark) into a unique initial configuration. Then, we seed cells according to that configuration and let them grow into a visually complex pattern.
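The character-to-configuration step can be illustrated with a minimal sketch. The mapping below is an assumption made for illustration only, not the scheme used by the platform: it treats each configuration as a binary mask over a hypothetical 4×4 grid of seeding spots and writes the character's index into that mask in binary. Python's `string.printable` (which contains 100 characters) stands in for the character set; whether it matches the platform's list is also an assumption.

```python
import string
from typing import List

# Stand-in character set (assumption): string.printable has 100 characters.
ALPHABET = string.printable
# Hypothetical grid of candidate seeding spots: 4 x 4 = 16 spots.
GRID = 4


def char_to_config(ch: str, grid: int = GRID) -> List[List[int]]:
    """Map a character to a binary seeding mask (1 = seed cells at this spot)."""
    # Offset by 1 so that no character maps to an all-zero (empty) mask.
    index = ALPHABET.index(ch) + 1
    bits = [(index >> k) & 1 for k in range(grid * grid)]
    return [bits[r * grid:(r + 1) * grid] for r in range(grid)]


def config_to_char(config: List[List[int]]) -> str:
    """Invert char_to_config: read the mask back as a binary index."""
    bits = [b for row in config for b in row]
    index = sum(b << k for k, b in enumerate(bits))
    return ALPHABET[index - 1]
```

Each mask would then be handed to the pattern simulator as its initial condition. The simulator itself is not sketched here, since its dynamics are what make the final pattern hard to invert by inspection.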

To decode, we train a convolutional neural network (CNN) to distinguish the patterns via multi-class classification.
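As a rough illustration of the last stage of multi-class decoding, the sketch below turns raw class scores into characters. It assumes a trained classifier has already produced one list of scores (logits) per pattern image; the CNN itself is not shown, and the function names and the `alphabet` argument are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decode_frames(frame_scores: Sequence[Sequence[float]],
                  alphabet: str) -> Tuple[str, List[float]]:
    """Pick the most probable class for each frame and map it to a character.

    frame_scores: one list of class scores per pattern image, in message order.
    alphabet: class index i corresponds to alphabet[i] (assumed convention).
    """
    chars, confidences = [], []
    for scores in frame_scores:
        probs = softmax(scores)
        best = max(range(len(probs)), key=probs.__getitem__)
        chars.append(alphabet[best])
        confidences.append(probs[best])
    return "".join(chars), confidences
```

The returned per-frame probabilities give a simple indication of how confident the classifier was about each character.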

Why is it secure?

Information security can be compromised in several ways: 1) a malicious attacker gains access to the pattern generator and simulation parameters and can therefore construct a training dataset; 2) the trained decoder is leaked; 3) data leaks gradually over time, so that an attacker can eventually train a CNN to decode. The first two can only be prevented by properly protecting the code and data, but our data-driven decoding method offers a way to combat gradual data leakage. Decoding accuracy depends critically on the availability and amount of training data, both of which are tunable through system dynamics and encoding settings. We can therefore exploit the difference in data access between platform end-users and attackers: a user with sufficient data can decode reliably, while an attacker with scarce data cannot train an accurate decoder. Because heavy use of the platform could eventually leak enough data, we can periodically update the training data and the corresponding CNN, which the customizable nature of the encoding scheme makes possible.

Other important properties of self-organized patterns can also be used to further enhance the security. For example, factors that critically impact the patterning process can function as secret keys to encrypt patterns, and the biological noise fingerprint can be used to authenticate pattern integrity.

Scalability and generalizability

Our platform is scalable and generalizable for various applications. To accommodate complex information, one can tune the system dynamics and encoding settings to maximize the encoding capacity. Ensemble techniques, such as majority voting and stacking, can be implemented to improve decoding accuracy and estimate prediction uncertainty.
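Majority voting, one of the ensemble techniques mentioned above, can be sketched as follows. The sketch assumes that several independently trained decoders have each produced a decoded string of the same length; the fraction of decoders that agree on each character serves as a crude uncertainty estimate. The function names are hypothetical, and stacking is not shown.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def majority_vote(predictions: Sequence[str]) -> Tuple[str, float]:
    """Return the most common prediction and the fraction of votes it received."""
    counts = Counter(predictions)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(predictions)


def decode_with_ensemble(per_model_text: Sequence[str]) -> Tuple[str, List[float]]:
    """Combine the decoded strings of several models, character by character.

    per_model_text: one decoded string per model, all of equal length.
    Returns the consensus text and the per-character agreement (0 to 1).
    """
    chars, agreement = [], []
    for column in zip(*per_model_text):  # all models' guesses for one position
        label, fraction = majority_vote(column)
        chars.append(label)
        agreement.append(fraction)
    return "".join(chars), agreement
```

Positions with low agreement could be flagged for re-encoding or manual inspection.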

To encode English text, we constructed a collection of patterns using the 100 printable ASCII characters (download the complete list). These patterns in essence constitute a new, digitally generated coding scheme, which we call Emorfi. We were able to encode and decode a number of famous writings in Emorfi with very high accuracy (above 99%). We envision that the platform could be extended to other languages and applied to communicating science and protecting intellectual property by incorporating the Greek alphabet, mathematical symbols, nucleic acid bases, and so on.

Try it yourself!

Try encoding and decoding with Emorfi using this application. Enter a piece of English text, and its characters will be converted into patterns and assembled into a video. To decode, the app uses a trained CNN decoder to classify the image frames sequentially. Due to technical limitations, the app is only suitable for short text. For longer text or other types of messages, please use our original code on GitHub.

References

[1] Wolfram, S. in Conference on the Theory and Application of Cryptographic Techniques. 429-432 (Springer).

[2] Deng, P., de Vargas Roditi, L., Van Ditmarsch, D. & Xavier, J. B. The ecological basis of morphogenesis: branching patterns in swarming colonies of bacteria. New Journal of Physics 16, 015006 (2014).

Acknowledgements

This website was built by Yasa Baig, Minjun Kwak, Jia Lu, Nicole Moiseyev, Shari Tian, and Alison Zhang.

If you see mistakes or want to suggest changes, please contact jia.lu@duke.edu