🤖 Marketing Bytes: Digital Data Basics
A marketer's guide to how computers represent and store data
People struggle to manage digital data.
Don’t worry—you’re not alone.
But a strong understanding of fundamentals helps keep you afloat. Now that the world of marketing data has years of legacy decisions baked in, a good grasp of basics is important.
So I’m starting a new recurring series called 🤖 Marketing Bytes.
You can go back to these articles when you are unsure about specific marketing tech concepts or want a refresher.
Over time, I’ll cover more complex topics and explain them in plain English. I’ll include links to further reading if you’re interested in a specific topic.
Welcome to the first part of our three-part series on the modern data fundamentals that every business owner or marketer should know:
[This One] Data representations and storage: shows how a computer sees and remembers the world
Processing and transfer: shows how a computer can crunch numbers and share them with others
Architecture and security: shows how a computer can keep info safe from hackers and how more complex systems are structured
These concepts are the bedrock of the data world, and understanding them is crucial to making informed decisions around data projects.
In this series, we'll be breaking down each data area to explain where it comes from and why it matters. We'll use examples and stories to make sure you don’t get lost.
I’ll add 💡 to show memory tips and 🚨 to show common places where mistakes happen.
First up, how a computer sees and remembers the world.
Introduction to Data Representations & Storage
To start, let’s dive into why computers and humans are different.
In this post, you’re going to learn about how computers think and remember information.
I go into some depth to include common terms that you’ll see in logs or error messages. Don’t be scared if you don’t know what a term means!
Basics of Data Representation
Summary: Computers represent data in binary. Bits form bytes that can then be used for other basic data categories. Anything that you can see or hear[1] can be digitized and read into a computer.
Binary
Computers[2] aren’t very smart.
They can only manage information by flipping switches on or off.
Switches can be only ‘On’ or ‘Off’—1 or 0, respectively.
💡 This On/Off system is called Binary. The prefix ‘bi-’, as in bicycle, shows there are only two possible digits.[3]
Bits
Each switch is called a ‘bit’ in computer terms.
💡 One switch doesn’t tell you very much—it just tells you a bit of information.
Bytes
Every word, song, and video you see online is just a set of switches flipped on or off: an arrangement of binary data.
But a computer can’t jump from bits to a TikTok. We need to build up more context first.
Bits make up bytes. A ‘byte’ is 8 bits in a row.
💡 Take a byte of something good, not just a bit. Bytes are bigger than bits.
These 8 bits (2^8 = 256 possible combinations) give us enough room to assign one combination to each letter.
The early lettering system (ASCII) assigned a specific byte[4] to each Latin character.
For example, the row of switches below is one byte of information: the letter ‘U’.
💡 Binary math can be unintuitive. You don’t need to know exactly how this works, but if you’re curious, check out this post for a full guide.
🚨 Don’t forget you start at position 0! This off-by-one error is very common.
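If you’d like to see the switches for yourself, here is a minimal sketch in Python (my choice of language for illustration; any language would do). It looks up the number ASCII assigns to ‘U’, writes it out as 8 switches, and then rebuilds the number by counting positions from 0 on the right.

```python
# One byte = 8 switches (bits). Each switch is 1 (On) or 0 (Off).
letter = "U"

# ord() gives the number that ASCII assigns to the character.
number = ord(letter)

# format(..., "08b") writes that number as 8 binary digits.
switches = format(number, "08b")

# Positions are counted from 0, starting at the rightmost switch.
# A switch at position p is worth 2**p when it is On.
rebuilt = sum(2**p for p, bit in enumerate(reversed(switches)) if bit == "1")

# 8 switches with 2 states each give 2**8 possible patterns.
total_patterns = 2**8

print(letter, number, switches, rebuilt, total_patterns)
# U 85 01010101 85 256
```

Notice that the rightmost switch is position 0, not position 1. Counting from 1 instead is the off-by-one error mentioned above.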
Basic Data Categories
The 1’s and 0’s from these switches don’t stop at showing numbers. As you saw above, letters can be represented too. And it doesn’t stop there.
It’s important to know what format your data is in. Without that, you risk mixing data types, which is an unpleasant mess to clean up later.
A few common data categories[5] are:
Numbers
Bytes can be combined to represent larger numbers, both positive and negative. The computer chip does math on these numbers directly.
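As a rough illustration in Python, the sketch below glues two bytes together to hold a bigger number, then shows how the same switches can stand for a negative number. It assumes the "two's complement" rule for negatives, which is the convention most chips use; the variable names are just for this example.

```python
# Two bytes side by side can hold a bigger number than one byte alone.
two_bytes = bytes([0b00000001, 0b00000000])  # switches: 00000001 00000000

# Read as one positive-only number: 1 * 256 + 0 = 256.
as_unsigned = int.from_bytes(two_bytes, byteorder="big", signed=False)

# Negative numbers reuse the same switches with a different reading rule.
minus_one = (-1).to_bytes(2, byteorder="big", signed=True)

# The same 16 switches, read two different ways:
as_signed = int.from_bytes(minus_one, byteorder="big", signed=True)
as_positive = int.from_bytes(minus_one, byteorder="big", signed=False)

print(as_unsigned, as_signed, as_positive)
# 256 -1 65535
```

The switches never change between the last two readings. Only the rule for interpreting them does, which is why knowing the number type matters.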
Common Types Encountered:
Text
Letters are assigned numbers so that a computer knows what to show a user.
A single letter (or symbol) is called a char for ‘character’.
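Here is a small Python sketch of that letter-to-number assignment, using the ASCII values for a short word. The word itself is just an example.

```python
word = "Hi!"

# Each char maps to a number.
codes = [ord(ch) for ch in word]

# The mapping works in both directions.
back_to_text = "".join(chr(code) for code in codes)

# Stored as ASCII, each of these chars takes exactly one byte.
stored = word.encode("ascii")

print(codes, back_to_text, len(stored))
# [72, 105, 33] Hi! 3
```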
Common Types Encountered:
Sound
Video
Images are displayed rapidly to create the illusion of motion. Each pixel is represented by numbers that describe color and brightness. Video files often contain data to synchronize audio with the visuals.
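To get a feel for the numbers involved, here is a simplified Python sketch. It assumes each pixel is stored as three numbers (red, green, blue) of one byte each, which is one common scheme but not the only one, and it ignores compression and audio. The helper `raw_bytes_per_second` is hypothetical and exists only for this example.

```python
# Assumption: a pixel is three numbers (red, green, blue), each 0-255.
red_pixel = (255, 0, 0)
white_pixel = (255, 255, 255)

# A frame is a grid of pixels. This tiny image is 2 pixels wide, 2 tall.
frame = [
    [red_pixel, white_pixel],
    [white_pixel, red_pixel],
]

# A video is a sequence of frames shown quickly, one after another.
video = [frame, frame, frame]

def raw_bytes_per_second(width, height, frames_per_second, bytes_per_pixel=3):
    """Size of one second of uncompressed video (hypothetical helper)."""
    return width * height * bytes_per_pixel * frames_per_second

full_hd = raw_bytes_per_second(1920, 1080, 30)
print(full_hd)
# 186624000
```

That is roughly 187 MB for a single second of raw footage, which is why real video files are compressed before they are stored or streamed.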
Common File Types Encountered:
Summary:
All computer data is 1’s and 0’s. A computer needs to know what a user wants it to do with those values. By knowing what category data belongs to, a computer can correctly process it.
🚨 Mixing data categories unintentionally can be a tough issue to unwind.
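A quick Python sketch of what mixing categories looks like in practice. The values are made up, but the pattern is common when a spreadsheet column of numbers gets saved as text.

```python
# The same symbols behave differently as text and as numbers.
as_text = "10" + "5"      # text is glued together
as_number = 10 + 5        # numbers are added

# Sorting has the same problem: text sorts character by character.
sorted_as_text = sorted(["9", "10", "100"])
sorted_as_numbers = sorted([9, 10, 100])

# One byte, two readings: a number or a letter.
raw = b"\x55"
byte_as_number = raw[0]
byte_as_letter = raw.decode("ascii")

print(as_text, as_number)
# 105 15
print(sorted_as_text, sorted_as_numbers)
# ['10', '100', '9'] [9, 10, 100]
print(byte_as_number, byte_as_letter)
# 85 U
```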
Want to see how data connects to your customers? This post goes into more detail with specific examples.
Data Storage
Summary: Computers are incredible at managing huge sets of data. Think of them as a giant library and librarian. Different forms of storage have different uses.
If computers are just switches, then how are they so amazing?
What computers lack in intelligence, they make up for in memory[8] size and speed.
Imagine your computer has to organize lots of books (your data).
Instead of bookshelves, computers use digital storage to keep everything in order.
Hard Drives
Hard drives are like libraries with rows of physical books. Solid State Drives (SSDs) have largely replaced hard drives in new computers, but people still commonly refer to all onboard storage as ‘hard drive’ storage.
🚨 Onboard storage is only accessible from your computer.
Cloud Storage
**Note: We will go into more detail on cloud tech later**
What if you could use a Kindle instead? You pay to have someone else manage the storage of your books.
That's cloud storage.
You can get to your files from any device, anywhere, as long as you have internet.
💡 Cloud storage is great for managing very large datasets and for shared team data.
🚨 Double check your estimated cloud costs before jumping in!
USBs and External Drives
Sometimes you want to keep some information with you for easy reference.
USB flash drives and external hard drives let you do that. They're like packing a backpack with your favorite reads so you can work on them at a coffee shop.
💡 External drives are great for sharing files without using the internet.
Each type of storage has its own job.
Hard drives and SSDs are for keeping stuff in your computer. Cloud storage is for when you want to reach your files from anywhere. USBs and external drives are for carrying your files around.
Conclusion
Computers use these methods to make sure they can remember everything we tell them, from our favorite photos to important work documents.
As we close the book on our first lesson about digital data, let’s remember one thing: it’s all about keeping things simple and smart.
We’ve started our adventure into the 🤖 Marketing Bytes series, aiming to make the tricky parts of tech understandable for everyone.
Understanding the ABCs of data is like having a secret weapon.
Our journey will dive deeper, breaking down tech talk into easy chunks.
Subscribe to get the next posts in this series!
[1] And sometimes more: This Harvard professor is digitizing scent.
[2] We’ve gotten used to computers being electronic, but any complete system of switch-flipping is a computer. Here’s more information on Turing Completeness.
[3] Deep Dive: The binary system was chosen because of its high resistance to error during transmission or storage for silicon-based architecture.
Instead of having to build in logic to determine if a voltage for a switch was High, Medium, or Low, it just has to determine if there was a voltage over threshold. This helps reduce the effects of external electric fields from corrupting memory.
With silicon-based transistors, it’s much easier to measure and control On vs Off than to try to hit a medium third point on the steep part of the saturation curve. Granted, there is some nuance here depending on whether a transistor is a bipolar junction transistor or a field-effect transistor.
A general look into a non-binary alternative.
As more non-traditional architectures are researched (e.g. optical with polarization), we may see more ternary, or even higher order, operating systems for circuitry. Check out this recent (ish) paper for info on a Josephson junction-based circuit.
[4] ASCII characters are sometimes used in data pipelines. This is generally for older systems that cannot handle non-Latin characters.
🚨 If you send data that a pipeline isn’t set up to handle, you’ll have issues downstream.
Trivia: The original ASCII standard actually used only 7 bits per character, not a full 8-bit byte.
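If you suspect a pipeline is ASCII-only, one way to check your data before sending it is sketched below in Python. The function name `safe_for_ascii_pipeline` and the sample names are hypothetical; the idea is simply to try the ASCII conversion and see whether it fails.

```python
def safe_for_ascii_pipeline(value: str) -> bool:
    """Return True if every character fits in ASCII (hypothetical helper)."""
    try:
        value.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False

# 7 bits give 2**7 = 128 possible characters, numbered 0 through 127.
ascii_character_count = 2**7

plain_name = safe_for_ascii_pipeline("Zoe")
accented_name = safe_for_ascii_pipeline("Zoë")

print(ascii_character_count, plain_name, accented_name)
# 128 True False
```

A customer name with an accent fails the check, which is exactly the kind of record that causes trouble downstream in an older system.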
[5] I use the term ‘data categories’ instead of ‘data types’ because I’m taking a more general approach to discussing data. This is not a programming tutorial, so the differences between number types (float vs double) are not in scope.
Additionally, more complex data types (lists, arrays, dicts) are out of scope too.
For a deeper dive into data types, check this out.
[6] Kind of a number? Depends on how it’s represented.
[7] Sound wave processing is fascinating. For a math primer of the general techniques used, check out this video.
[8] I’ll be playing fast and loose with the terms ‘memory’ and ‘storage’ in this series. I’m not literally referring to RAM vs storage with each term. The difference doesn’t generally matter in marketing tech.