Data archiving with molecules

Reading out synthetic polymers used for data storage more efficiently

30-Oct-2024

More and more data needs to be stored, often for the long term. Synthetic polymers are an alternative to conventional storage media because they retain stored information while requiring significantly less space and energy. However, reading out the data by mass spectrometry limits the length, and thus the storage capacity, of the individual polymer chains. In the journal Angewandte Chemie, a research team presents a new approach that overcomes this limitation and enables direct access to bits of interest without reading out the entire chain.

Wiley-VCH

Data is generated every day, whether in the context of business transactions, process monitoring, quality assurance or traceability of production batches. Archiving this data over decades requires a lot of space as well as energy. Macromolecules with a defined sequence, such as DNA and synthetic polymers, are an interesting alternative, particularly for the long-term archiving of large volumes of data that only need to be accessed infrequently.

Synthetic polymers offer advantages over DNA: simple synthesis, higher storage density and stability under harsh conditions. The disadvantage is that the information encoded in polymers is read out using mass spectrometry (MS) or tandem mass spectrometry sequencing (MS2). For this, the molecules must not become too large, which severely limits the storage capacity per chain. In addition, the entire chain is read out monomer unit by monomer unit; it is not possible to access the bits of interest directly, much as if you had to read a book from cover to cover instead of turning to the relevant page. Long DNA chains, on the other hand, can be broken down into fragments of random length, sequenced individually and computationally reassembled into the overall sequence.

Kyoung Taek Kim and his team from the Department of Chemistry at Seoul National University (Republic of Korea) developed a new approach that allows very long synthetic polymer chains, whose molecular weights significantly exceed the analytical limit of MS or MS2, to be read out efficiently. As an example, they encoded their university address in ASCII and translated it, together with an error detection code (CRC, a common method for checking data integrity), into binary code, i.e. a sequence of ones and zeros. They stored the resulting 512 bits of information in a polymer chain consisting of two different monomers: lactic acid encodes 1 and phenyllactic acid encodes 0. They also incorporated fragmentation codes containing mandelic acid at irregular positions. Upon chemical activation, the chain is cleaved at these positions, in the example into 18 fragments of different sizes, which can be decoded individually by MS2 sequencing.
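The encoding step described above can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the article does not say which CRC variant the team used, so the standard CRC-32 from Python's `zlib` module stands in for it, and the one-letter monomer labels `L` (lactic acid, bit 1) and `P` (phenyllactic acid, bit 0) as well as the function names are hypothetical. The mandelic acid fragmentation codes are left out.

```python
import zlib

# Assumed one-letter labels: L = lactic acid (bit 1), P = phenyllactic acid (bit 0).
MONOMER_FOR_BIT = {"1": "L", "0": "P"}
BIT_FOR_MONOMER = {m: b for b, m in MONOMER_FOR_BIT.items()}


def encode(text: str) -> str:
    """Turn ASCII text into a monomer sequence with a CRC appended."""
    payload = text.encode("ascii")
    data_bits = "".join(f"{byte:08b}" for byte in payload)
    # Assumption: CRC-32 as the error detection code (the article only says "CRC").
    crc_bits = f"{zlib.crc32(payload):032b}"
    return "".join(MONOMER_FOR_BIT[b] for b in data_bits + crc_bits)


def decode(chain: str) -> str:
    """Recover the text from a monomer sequence and verify the CRC."""
    bits = "".join(BIT_FOR_MONOMER[m] for m in chain)
    data_bits, crc_bits = bits[:-32], bits[-32:]
    payload = bytes(int(data_bits[i:i + 8], 2) for i in range(0, len(data_bits), 8))
    if zlib.crc32(payload) != int(crc_bits, 2):
        raise ValueError("CRC mismatch: the chain was read incorrectly")
    return payload.decode("ascii")


if __name__ == "__main__":
    chain = encode("Chemistry")
    print(len(chain), "monomers")  # 9 characters * 8 bits + 32 CRC bits
    print(decode(chain))
```

With this scheme, a single misread monomer changes either the data bits or the CRC bits, so `decode` raises an error instead of returning corrupted text.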

Specially developed software first identifies the fragments in the MS spectra on the basis of their mass and their end groups. During MS2, molecular ions that have already been measured are broken up further, and the resulting pieces are analyzed again. The fragments can then be sequenced on the basis of the mass differences between these pieces. Using the CRC error detection codes, the software reconstructs the sequence of the entire chain. This overcomes the length limitation for polymer chains.
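Sequencing by mass differences can be illustrated as follows. This is a simplified sketch, not the team's software: it assumes an idealized ladder of ion masses in which neighbouring peaks differ by exactly one monomer unit, and it uses approximate monoisotopic residue masses (lactic acid unit C3H4O2, about 72.021 Da; phenyllactic acid unit C9H8O2, about 148.052 Da). The function names, the tolerance and the starting mass are hypothetical.

```python
# Approximate monoisotopic masses of the repeating units in Da (assumed values).
RESIDUE_MASS = {
    "1": 72.021,   # lactic acid unit, C3H4O2
    "0": 148.052,  # phenyllactic acid unit, C9H8O2
}


def simulate_ladder(bits: str, start_mass: float) -> list[float]:
    """Build an idealized ladder of peak masses for a given bit string."""
    peaks = [start_mass]
    for bit in bits:
        peaks.append(peaks[-1] + RESIDUE_MASS[bit])
    return peaks


def read_ladder(peaks: list[float], tol: float = 0.02) -> str:
    """Read bits from the mass differences between neighbouring peaks."""
    ordered = sorted(peaks)
    bits = []
    for lighter, heavier in zip(ordered, ordered[1:]):
        diff = heavier - lighter
        for bit, mass in RESIDUE_MASS.items():
            if abs(diff - mass) <= tol:
                bits.append(bit)
                break
        else:
            # No monomer matches this gap within the tolerance.
            raise ValueError(f"unassignable mass difference: {diff:.3f} Da")
    return "".join(bits)


if __name__ == "__main__":
    ladder = simulate_ladder("1011", start_mass=100.0)  # hypothetical end-group mass
    print(read_ladder(ladder))
```

Real spectra contain noise, missing peaks and several ion series, so an actual implementation would need considerably more logic than this.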

The team also succeeded in reading out bits of interest without sequencing the entire polymer chain (random access), e.g. the word "Chemistry" from the code for the address. Taking into account that all parts of the address are separated by commas and arranged in a specific order (department, institution, city, zip code, country), it was possible to narrow down the location where the information sought is stored within the chain and sequence only the relevant fragments.
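The article does not spell out how the relevant fragments were selected, but the final step of such a random access can be pictured as a simple interval lookup. The sketch below assumes that the length of each fragment in bits is known in chain order and that the position of the sought field has already been narrowed down to a character range; the fragment lengths, the offset and the function names are hypothetical, and the fragmentation codes themselves are ignored.

```python
from itertools import accumulate


def char_span_to_bits(offset: int, length: int) -> tuple[int, int]:
    """Convert a character range into a half-open bit range (8 bits per ASCII character)."""
    return offset * 8, (offset + length) * 8


def fragments_covering(fragment_bits: list[int], start_bit: int, end_bit: int) -> list[int]:
    """Return the indices of fragments that overlap the bit range [start_bit, end_bit)."""
    ends = list(accumulate(fragment_bits))
    starts = [0] + ends[:-1]
    return [
        i
        for i, (start, end) in enumerate(zip(starts, ends))
        if start < end_bit and end > start_bit
    ]


if __name__ == "__main__":
    # Hypothetical fragment lengths in bits, in chain order.
    lengths = [40, 56, 32, 48, 64]
    # Hypothetical position: a 9-character word starting at character 14.
    start, end = char_span_to_bits(offset=14, length=9)
    print(fragments_covering(lengths, start, end))
```

Only the fragments returned by `fragments_covering` would then need to be sequenced, rather than the whole chain.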

