Data Engineering: Aspects of Data Modeling

Rahul S
13 min read · Dec 5, 2023

Data has a lifecycle. Like a person, it passes through distinct life stages, and it pays to know its childhood: where it was born, how it is generated, the environment it comes from, and its characteristics and quirks.

Data is an unorganized, context-less collection of facts and figures. It can come from many sources, both analog and digital. Examples of analog data include spoken speech, handwritten paper records from the pre-computer era, and the temperature measured by an IoT sensor. Digital data has even more sources: it is either created by converting analog data to digital form or is the native product of a digital system.
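To make "converting analog data to digital form" a little more concrete, here is a minimal sketch of quantization, the step where a continuous reading is mapped to one of a fixed number of integer codes. The function names, the -40 to 85 °C range, and the 8-bit resolution are all assumptions chosen for illustration; they do not describe any particular sensor.

```python
def quantize(value, v_min, v_max, bits):
    """Map a continuous reading in [v_min, v_max] to an integer code.

    Hypothetical helper: range and resolution are illustrative assumptions.
    """
    levels = 2 ** bits
    # Clamp so out-of-range readings still map to a valid code.
    clamped = min(max(value, v_min), v_max)
    fraction = (clamped - v_min) / (v_max - v_min)
    # The top of the range would land on `levels`, so cap at the last code.
    return min(int(fraction * levels), levels - 1)


def reconstruct(code, v_min, v_max, bits):
    """Return the midpoint of the interval that `code` represents."""
    step = (v_max - v_min) / (2 ** bits)
    return v_min + (code + 0.5) * step


# Assumed sensor range of -40 to 85 degrees Celsius, stored in 8 bits.
code = quantize(22.5, -40.0, 85.0, 8)
approx = reconstruct(code, -40.0, 85.0, 8)
print(code, approx)
```

The reconstructed value is close to the original reading but not identical to it. That small loss is one of the quirks a pipeline inherits from a source system that digitizes analog measurements.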

Understanding the source systems and how they generate data is a key step in creating efficient and robust data pipelines.

Files and Unstructured Data — A file is a series of bytes usually saved on a storage disk, and applications frequently use files to store various…
