- Counting: 0, 1, 2, 3, ...
- Addition: 3 + 5 = 8
- Subtraction: 10 - 3 = 7
- Bits: 0 or 1 (two states)
- Byte: 8 bits grouped together
- AND: 1 AND 1 = 1, else 0
- OR: 1 OR 0 = 1, 0 OR 0 = 0
PROBLEM: How to represent numbers bigger than 1?
SOLUTION: Group 8 bits together
1 bit = 0 or 1 → can represent 2 values (0, 1)
2 bits = 00, 01, 10, 11 → can represent 4 values (0, 1, 2, 3)
8 bits = 00000000 to 11111111 → can represent 256 values (0 to 255)
CALCULATION: 2 × 2 × 2 × 2 × 2 × 2 × 2 × 2 = 2^8 = 256 values
∴ 1 byte = 8 bits = can store a number from 0 to 255
PROBLEM: Where to store bytes?
SOLUTION: RAM = array of bytes
RAM = giant array
RAM[0] = first byte
RAM[1] = second byte
RAM[2] = third byte
...
RAM[15999999999] = 16 billionth byte, the last one (if you have 16 GB RAM, counting 1 GB as 10^9 bytes)
YOUR MACHINE:
DEFINITION: Address = the index number to access a specific byte in RAM
PROBLEM: Writing long binary numbers is tedious.
SOLUTION: Group 4 bits, use symbols 0-9, A-F
4 bits = 0000 to 1111 = 16 values
Use: 0 1 2 3 4 5 6 7 8 9 A B C D E F
Binary → Hex:
0000 = 0
0001 = 1
...
1001 = 9
1010 = A (means 10)
1011 = B (means 11)
1100 = C (means 12)
1101 = D (means 13)
1110 = E (means 14)
1111 = F (means 15)
EXAMPLE:
PROBLEM: How to store letters like ‘H’, ‘E’, ‘L’, ‘L’, ‘O’?
SOLUTION: Assign a number to each letter (ASCII table)
'H' = 72 = 0x48
'E' = 69 = 0x45
'L' = 76 = 0x4C
'L' = 76 = 0x4C
'O' = 79 = 0x4F
CALCULATION (for ‘H’): 72 = 4 × 16 + 8 → hex digits 4 and 8 → 0x48
PROBLEM: How to store multiple characters together?
SOLUTION: Put characters in consecutive RAM locations
"HELLO" stored starting at RAM address 1000:
RAM[1000] = 0x48 ('H')
RAM[1001] = 0x45 ('E')
RAM[1002] = 0x4C ('L')
RAM[1003] = 0x4C ('L')
RAM[1004] = 0x4F ('O')
RAM[1005] = 0x00 (null terminator, marks end of string)
DEFINITION: String = consecutive bytes in RAM, ending with 0x00
PROBLEM: A CPU core can only run one program at a time. How to run multiple programs?
SOLUTION: Operating system switches between programs rapidly
Program A runs for 10 milliseconds
Program B runs for 10 milliseconds
Program A runs for 10 milliseconds
...
DEFINITION: Process = a running program with its own:
YOUR DATA:
PROBLEM: If two processes both want to use RAM address 1000, they conflict!
SOLUTION: Each process gets its OWN address space (virtual addresses)
Process A sees: VA 0x1000, VA 0x1001, VA 0x1002, ...
Process B sees: VA 0x1000, VA 0x1001, VA 0x1002, ...
The two VA 0x1000 addresses refer to DIFFERENT physical locations!
DEFINITION: Virtual Address (VA) = address that a process uses
DEFINITION: Physical Address (PA) = actual address in RAM hardware
Process A: VA 0x1000 → PA 0x50000 (mapped by kernel)
Process B: VA 0x1000 → PA 0x80000 (mapped to different PA!)
PROBLEM: Who manages CPU, RAM, processes, devices?
SOLUTION: Kernel = the core program that manages everything
USER SPACE: Your programs run here (sender.c, receiver.c)
─────────────────────────────────────────────────────────
KERNEL SPACE: Linux kernel runs here (allocates RAM, talks to NIC)
─────────────────────────────────────────────────────────
HARDWARE: CPU, RAM, NIC, Disk
DEFINITION: Kernel = program that controls hardware and manages processes
KEY FACT: User programs CANNOT directly access hardware. Must ask kernel.
PROBLEM: How does a user program ask the kernel to do something?
SOLUTION: System call (syscall) = special instruction to enter kernel
User program: "I want to send data to network"
↓
syscall instruction (CPU switches to kernel mode)
↓
Kernel: "OK, I'll copy your data and send it"
↓
sysret instruction (CPU switches back to user mode)
↓
User program: continues running
EXAMPLE SYSCALLS:
PROBLEM: How to identify a network connection?
SOLUTION: Socket = handle (number) representing a network endpoint
int fd = socket(AF_INET, SOCK_DGRAM, 0);
DEFINITION: Socket = kernel object for network communication, identified by fd
PROBLEM: How to identify a specific computer on the network?
SOLUTION: IPv4 address = 4 bytes that identify a machine on the network
127.0.0.1 = loopback (this same computer)
Each number is 1 byte (0-255):
127  .   0   .   0   .   1
 ↓       ↓       ↓       ↓
byte    byte    byte    byte
CALCULATION:
PROBLEM: One computer might have multiple programs wanting network data.
SOLUTION: Port = 2-byte number (0-65535) to identify specific program
IP 127.0.0.1 + Port 9999 = identifies THIS program on THIS computer
YOUR DATA:
PROBLEM: User program has data at its VA. Kernel needs the data.
sender.c has: char *msg = "HELLO_SEND_TRACE";
msg is at VA 0x6130204f0069 (in sender's address space)
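BUT: Kernel cannot just keep using VA 0x6130204f0069!
WHY: That VA is only meaningful in the sender's address space, and the sender may overwrite or free the buffer once the call returns. The kernel needs the bytes in its own memory.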
SOLUTION: Kernel COPIES data from user VA to kernel VA.
User VA 0x6130204f0069 → copy → Kernel VA 0xffff8cb08de63c00
THIS IS COPY #1.
PROBLEM: Kernel needs a structure to hold network packet data.
SOLUTION: sk_buff = kernel structure for network packets
struct sk_buff {
    unsigned char *data;   // pointer to packet bytes
    unsigned int len;      // length of packet
    // ... many other fields
};
DEFINITION: sk_buff = kernel object holding network packet
YOUR DATA:
PROBLEM: How to test networking without a real network?
SOLUTION: Loopback = virtual network device that delivers every packet back to the same machine
Sender → kernel → loopback device → kernel → Receiver
↑ ↑
Same computer, same RAM, no wire
IP address 127.0.0.1 = loopback address
KEY FACT: With loopback, the SAME skb is passed from TX to RX queue. No data copy needed between send and receive sides (for loopback only).
PROBLEM: Kernel received data in its skb. User program wants the data.
receiver.c calls: recv(fd, buf, 16, 0)
buf is at VA 0x7fff43abc810 (in receiver's address space)
BUT: Kernel’s skb->data is at VA 0xffff8882cbbe612c. Receiver process cannot access kernel VA!
SOLUTION: Kernel COPIES data from its skb to user’s buffer.
Kernel VA 0xffff8882cbbe612c → copy → User VA 0x7fff43abc810
THIS IS COPY #4.
COUNTING THE COPIES:
SEND PATH:
User buffer (sender) → COPY #1 → Kernel skb
RECEIVE PATH:
Kernel skb → COPY #4 → User buffer (receiver)
TOTAL: 2 copies (for loopback)
FOR REAL NIC (not loopback):
PROBLEM: How to observe what the kernel does without modifying kernel source?
SOLUTION: Kprobe = insert probe at kernel function, run your code when called
register_kprobe(&kp) → tell kernel: "when function X is called, run my handler"
DEFINITION: Kprobe = runtime kernel instrumentation
YOUR KPROBES:
PROBLEM: CPU needs fast storage for the current calculation.
SOLUTION: Registers = small, fast storage inside CPU
x86_64 calling convention:
Function call: foo(arg1, arg2, arg3)
arg1 → stored in register RDI (read in a kprobe handler as regs->di)
arg2 → stored in register RSI (read in a kprobe handler as regs->si)
arg3 → stored in register RDX (read in a kprobe handler as regs->dx)
YOUR DATA for _copy_to_iter(addr, bytes, iter):
WHAT WE MEASURED:
COPY #1 (send path):
dmesg: [COPY1] dest=ffff8cb08de63c00 len=16
sender.c: buffer at VA 0x6130204f0069
∴ memcpy(0xffff8cb08de63c00, 0x6130204f0069, 16) happened
COPY #4 (receive path):
dmesg: [COPY4] src=ffff8882cbbe612c len=16
receiver.c: buffer at VA 0x7fff43abc810, received "HELLO_SEND_TRACE"
∴ memcpy(0x7fff43abc810, 0xffff8882cbbe612c, 16) happened
THREE LOCATIONS IN RAM:
Location 1: 0x6130204f0069 (sender user VA) → "HELLO_SEND_TRACE"
Location 2: 0xffff8882cbbe612c (kernel skb VA) → "HELLO_SEND_TRACE"
Location 3: 0x7fff43abc810 (receiver user VA) → "HELLO_SEND_TRACE"
Same 16 bytes exist in 3 places = 48 bytes of RAM used for 16 bytes of data
Step  Action                         Location                   Data
────  ─────────────────────────────  ─────────────────────────  ──────────────────
1     sender writes string           VA 0x6130204f0069          "HELLO_SEND_TRACE"
2     sender calls sendto()
3     COPY #1: kernel copies in      VA 0xffff...c00 (skb)      "HELLO_SEND_TRACE"
4     loopback passes skb to RX
5     receiver calls recv()
6     COPY #4: kernel copies out     VA 0x7fff43abc810          "HELLO_SEND_TRACE"
7     receiver reads string                                     "HELLO_SEND_TRACE"
Every term derived from previous step: