Mani's Systems 1 Study Guide

Apr 27, 2022
11 minutes

This document assumes you've been doing labs and homeworks, otherwise you do not meet core assumptions and I can not save you.

Registers

Argument Register(s)

RDI, RSI, RDX, RCX, R8, R9

Return Register(s)

RAX

Stack Pointer Register(s)

RSP

Callee Saved Register(s)

rbx, rbp, rsp, r12-r15

Caller Saved Register(s)

rdi, rsi, rdx, rcx, rax, r8-r11

Integer Representation and Basic Operations

How would you summarize this to a peer?

Which bit is the sign bit in a signed type?

Left most bit

Decoding Scheme Acronyms

What is B2U

Binary to Unsigned (unsigned = simple binary)

What is B2T

Binary to Two's Complement (signed)

What is B20

Binary to ones'-complement

What is B2S

Binary to sign-magnitude

Convert -120 to signed binary (and back)

Without sign bit: 0001111
With sign bit: 10001111

Convert 100 both to signed binary and back and unsigned binary

0110 0100 for both

What is the binary value of z here:


                int x = 100;
                unsigned int z = x;

0110 0100

If one operand is signed and another is unsigned, C implicitly casts the ____ (blank) operand to be ____ (blank)

C implicitly casts the signed operand to unsigned (check the type conversion hierarchy)

What happens in an unsigned sign extension?

Fill to the left with 0s!

What happens in a signed sign extension?

Repeat sign bit, aka left-most bit, aka most significant bit (MSB)

What is the value of z (try to do without running):


                    char x = 10;
                    short z = (short) x;

0000 0000 0000 1010

What is the value of z (try to do without running):


                    char x = -10;
                    short z = (short) x;

1111 1111 1111 0110

Truncate 1054 to 8 bits

TODO: ANSWER

Unsigned overflow at width w: x+y > 2ʷ - 1 or x + y > 2ʷ

x + y > 2ʷ - 1

Signed overflow at width w (positive): x + y > 2^(w - 1)-1 or x + y > 2^w - 1

x + y > 2^(w-1) - 1

When is a signed overflow at width w (negative)

x + y < -2^(w-1)

Integer Multiplication & Division

Is shifting or multiplying faster?

shifting

If the most significant bit in op1 is in position op1n, and if operands and the result have w bit representations, how many bits can op1 be shifted left before overflow due to shifting occurs? (This is from the slides)

w - 1 - op1n
Example: w=4, op1n=2 (0100), then 4-1-2 = 1, we can only shift left once; any more than that causes overflow Conclusion: If op2n > w – 1 – op1n , overflow due to shifting will occur

I've kind of skimped short on these slides I may add more later but I feel like this stuff is a bit esoteric to be tested on

Performance

We can't talk about performance without talking about measures, if you don't measure you don't know, and if you don't profile you can't tell where.

Don't check performance on debug code

Idk that's all, caching is important too

IEEE Floating Point Part 1

V = x · 2ʸ

What is the range of a single precision IEEE floating point value

+/- 2⁻¹⁴⁹ to +/- 2¹²⁷

What is the range of a double precision IEEE floating point value

Single +/- 2⁻¹⁰⁷⁴ to +/- 2¹⁰²³

Overflow means that values have grown too large for the representation, much in the same way that you can have integer overflow.

Underflow (a value too small to be represented) is a less serious problem because it just denotes a loss of precision, which is guaranteed to be closely approximated by zero.

I'm like 50% sure this stuff will be on the final so let's really hammer it in.

What are the three IEEE 754 bit fields and their purpose?

S field - the sign field
E field - the exponent field (encodes the exponent)
F field - significand/mantissa field: encodes a number greater than or equal to 0, and less than 2, to which the exponent is raised to obtain the real value encoded by the entire IEEE 754 representation

(Slides 25)

The following do 32 bit single precision numbers, which is a bit tedious, you should probably do some of these easier 8-bit value ones first to get practice in, then do one 32-bit convert to IEEE and one 32-bit convert from IEEE to real decimal.

What is 64.2 in IEEE single precision representation? (This is an example from slide 25)

0 1000 0101 0000 0000 1100 1100 1100 110

What is 32.313 in IEEE single precision representation?

TODO: Answer

What is 2000.4 in IEEE single precision representation?

TODO: Answer

Given binary 1100 0100 1000 0000 0010 1100 0000 0000 what is the value in decimal (also from slide 25)

-1025.375

IEEE Pt 2

When/what are denormalized values used for?

extremely small values, very close to 0, including 0. (Largest denormalized value is approx +/-1.18 * 10^(-38)) (slides 26)

When/what are normalized values used for?

These are larger than the largest denormalized value but small enough to fit into the representation. (slides 26)

What are the special values?

+/- infinity (created by overflow), or NaN (not a number, caused by like division by 0)

More on denormalized at this Stack Overflow post

There is some other stuff about precision but I think the key takeaway is eventually you cannot account for fractional values and you start losing accuracy at higher numbers.

Assembly (finally, stuff in recent memory)

The term computer architecture covers three aspects of computer design, what are they?

The ISA, computer organization, and computer hardware

What is a ISA?

Instruction set architecture - ISA refers to the actual programmer visible machine interface such as instruction set, registers, memory organization, and exception (i.e. interrupt) handling (remember this!). (slides 27)

What does RISC mean?

Reduced Instruction Set Computer architecture - used in smart phones and tablets, power efficient, there is more info on slide 27 but I'd be surprised to see it on exam.

What does CISC mean?

Complex Instruction Set Computer architecture

What are the three general programmer-visible state categories (from slides 27)?

PC: Program Counter (RIP in x86-64)
Register file (registers)
Condition codes - store status about arithmetic or logical operations, used for conditional branching

What gcc flag gives you assembly output?

-S

Skipping over some register stuff already covered...

What are the four condition codes (RFLAGS/EFLAGS) that we care about?

ZF (zero), SF (negative), OF (overflow), CF (carry)

How large are condition code registers/flags

1 bit

What are the four general categories of statements in computer languages? (rly hope this aint on the final)

Data Movement
Arithmetic/Logical Operations
Control-Flow
Declerations

You should probably know we use AT&T syntax and Intel syntax also exists

What are the four suffix sizes (ie. q in movq is one) and what do they stand for

q - quad word (8 bytes)
l - long/double word
w - word
b - byte

What is an opcode? (not a trick question)

Opcode is the "name" of the instruction in assembly, for example, mov

Does assembly have typed pointers?

No.

What are the three operand types?

Format: type - example

Immediate - Constant, $100
Register - %rax
Memory - (%rax)

What are the assembly directives?

.file - Allows a name to be assigned to the assembly language source code file.
.section - This makes the specified section the current section.
.rodata (under section) - Specifies that the following data is to be placed in the read only memory portion of the executable
.string - Specifies that the characters enclosed in quotation marks are to be stored in memory, terminated by a null byte
.data - Changes or sets the current section to the data section
.text - Changes or sets the current section to the text (or code) section
.globl - A directive needed by the linker for symbol resolution: followed by name of function
.type - Needed by the linker to identify the label as one associated with a function, as opposed to data
.size - Needed by the linker to identify the size of the text for the program

What are the assembly data size directives?

.quad <value> - Places the given value, (0x prefix for hex, no prefix for decimal) in memory, encoded in 8 bytes
.long <value> - Places the given value, (0x prefix for hex, no prefix for decimal) in memory, encoded in 4 bytes
.word <value> - you get the idea...
.byte <value>

Immediate values preceded by $ just fyi again

"Effective address DISPLACEMENT(BASE), for example, movq $0x30, 8(%rbx)" - slide 28

What is the size of a memory address on stdlinux?

General form of memory addressing: Imm(Rb,Ri,S) Mem[Imm+ Reg[Rb]+S*Reg[Ri]].
Imm: Constant 'displacement'
Rb: base register, any of 16 int registers
Ri: index register, any except for %rsp
S: scale: 1, 2, 4, or 8

Try interpreting movq 24(%rax,%rcx,4), %rdx, movl 24(%rax,%rcx,4), %edx, movb 24(%rax,%rcx,4), %dl (all from slides 28). There are a few other goodies on slide 22ish of slides 28 you should definitely do.

leaq uhh exists and you definitely should know how to use it.

Write this in assembly int *p = &points[i].ycoord; (I have good reason to think this will be on the exam)

Check out this Stackoverflow post that was in the slides.

TODO: more LEA problems

Arithmetic vs logical shift right the following number: 1001, by one bit

Arithmetic: 1101 (I think) Logical: 0101

Can you do an arithmetic left shift?

No. (I think)

Stack top address is always in %rsp and it grows down lower (look at slide 35 of slides 28).

Three things to do to setup stack:

At the beginning of the function, set %rbp to point to the bottom of the current stack frame.
At the beginning of the function, set %rsp to point to the top of the stack (the same address as the stack bottom initially)
At the end of the function, put them back.

Example:

            
            # Beginning of function
            pushq %rbp # Save caller’s base pointer
            movq %rsp, %rbp # Set my base pointer
            ... # other stuff
            # end of function
            leave # set caller’s stack frame back up
            ret

What does the call instruction do?

Store the address of the instruction after the call instruction onto the stack, and address of destination is assigned to the PC (program counter, %rip)

How do we determine the return address for after a call?

PC (%rip) contains address of the current instruction, and we know call has a one-byte instruction + an 8-byte address...so, 9 bytes after the current value in the PC is the address of the next instruction. So PC+9 is what we wanna push (slide 28)

Skipping the parts of slides 28 that talk about register conventions as this document has already covered them

"When a function taking variable-arguments is called, %rax must be set to the total number of floating point parameters passed to the function in vector registers" - we did not use floating point so %rax should always be set to zero before calling a function that allows a variable argument list (like printf).

Let's test some jumps because the syntax is a bit off

cmpq b, a is like a - b

                cmpq $20, $0 # I don't know if this actually works it's 2:12am get off my case
                jg some_label

Will it jump?

No. (I'm 90% sure but if someone could check id appreciate that)

TODO: more cmp/test/jmp stuff

Initialize a local array of longs, iterate through it, and print out each element

TODO: Answer

What instruction that we went over doesn't implicitly set any condition codes?

leaq

Here's an easier one, what is je for? jg? jge?

Jump when equal, jump when greater, jump when greater or equal

Exercise: write a while, do-while, and if-else in assembly

TODO ANSWER

TODO: figure out a branchless code algorithm problem, good practice + might be the bonus

Maximize performance by minimizing ____ (blank) and minimizing ____ (blank) (slides 32)

Minimize data motion and memory accesses

Write out what the push instruction does by hand (as in, with other instructions, not using the push instruction)

TODO: Answer

Now do the same thing for the pop instruction, implement it yourself

TODO: Answer

BONUS: This is a bit extra, but implement ret and then call as well.

TODO: Answer

REMEMBER: The x86-64 ABI requires the stack address be 16 byte aligned prior to any call. But this is only necessary for programs which use SSE instructions and floats, and we won't use those so we can ignore this

When you have 7 args to pass to a procedure, does just the 7th arg go onto the stack or do all 7 args?

Just the 7th

Each procedure has its own stack ____ (blank)

Frame! :)

Without looking at your previous code, write a branchless recursive fibonacci function. (im not sure if this is possible but I suspect it is - maybe skip this one until I get to it)

TODO: Answer

Nothing from decoding raw hex seems like it'd be on the exam but maybe look at it anyway

There's a lot of stuff in slides 35 that I'm not sure how to quiz on/how relevant it'll be so maybe peruse yourself.

Structure align to the size of their ____ (blank) element

largest

If you have a structure that is an int, char, and long, what is the padding/alignment of this structure?

int - 4 bytes (4) char - 1 byte (5) padding - 3 bytes (8) long - 8 bytes (16)

Do some of the charts in slides 40-50 of slides #38

Bonus: Write a linked list implementation in asm! (if you have time)

CISC micro-architectures are designed for ____-level programming while RISC designs are often designed for ____-level

CISC - high RISC - low

RISC processors generally have ____-length instructions (bonus) because ____

Fixed-length, because they're pipeline friendly