performance/perf-cpp/cpp.md


CPP

learncpp

  • compilation process: preprocess -> compilation -> assembly -> linking
    • lexing for grammar validity -> ast type checking

    • files are compiled top to bottom, & individually (must include libraries multiple times)

    • e.g. below, the type checker sees printA returns void -> tries to pair it with std::cout's operator<< -> no such overload exists -> caught at compile time

#include <iostream>

void printA()
{
    std::cout << "A\n";
}

int main()
{
    std::cout << printA() << '\n';

    return 0;
}
  • initialization

    • why list initialization -> narrowing conversions disallowed
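A minimal sketch of the narrowing rule, assuming a hypothetical helper `truncated()`: copy initialization silently drops the fractional part, while the same initializer in braces is rejected at compile time.

```cpp
// Hypothetical helper: shows what copy initialization does with a narrowing value.
int truncated(double d) {
    int a = d;       // OK: implicit narrowing, the fractional part is dropped
    // int b{ d };   // error: narrowing conversion from double to int in list initialization
    return a;
}
```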
  • operator precedence

  • nested functions disallowed

    • many features of cpp don't nec. have rhyme or reason; it is a language with history for good or bad
  • cpp compiles sequentially (i.e. top to bottom)

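  • One Definition Rule: within a translation unit, each function, variable, type, or template has one definition per scope; within a program, each non-inline function or variable has one definition; types, templates, and inline functions/variables may be defined in several translation units as long as the definitions are identical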

    • ? why are header files usually omitted in c++ compilation processes?
    • ? why does the cpp - include header structure in projects work, espec. wrt ODR?
      • NOTE: compiling does not need impls (just needs declarations) -> linking does
      • NOTE: research about linking later, but just know linking has a symbol table to map symbols to literal code locations
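A small sketch of the declaration/definition split, with hypothetical functions `add` and `addTwice`. It is a single file here, but the definition of `add` could just as well sit in another .cpp file; the compiler only needs the declaration to type-check the call.

```cpp
// Forward declaration: enough for the compiler to type-check calls.
int add(int a, int b);

int addTwice(int a, int b) {
    return add(a, b) + add(a, b);  // compiles with only the declaration visible
}

// Definition: what the linker needs in order to resolve the symbol.
int add(int a, int b) {
    return a + b;
}
```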

namespaces

  • simple -> wrap simple name
    • "name mangling" includes the namespace prefix in compilation -> easy to identify

preprocessing

  • when is it done?

    • for each cpp file, copy-paste in the includes & compile into translation units separately
  • what's the output of pre-processed files called?

  • Lesson: use macros sparingly besides eliminating code compilation (i.e. in libraries) and compilation-specific processes

  • "" vs. <> in includes: look locally relative to the path, then system/only look in the system

  • Why not use relative includes (configure the compile-time include path instead)

  • Why even use header guards? What's the common pattern that necessitates them?

    • 1 file including 2 files that each include another file -> duplicate definition
    • ? why do header guards resolve this problem? the pre-processor handles each translation unit separately, and within one translation unit the guard makes every repeated #include of the same header expand to nothing.
      • long story short, every translation unit must contain at most one copy of each included file -> prevents ODR violations
  • ? why not define functions in header files? multiple cpp files can include the header, so each translation unit gets its own definition -> ODR violation at link time

    • only have ONE function implementation in source files -> compilation resolves with declarations & implementations linked easily
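A sketch of what the pre-processor sees when one translation unit pulls in the same guarded header twice. The header name `square.h` and its contents are hypothetical; the second copy is skipped because `SQUARE_H` is already defined.

```cpp
// --- first #include "square.h" (hypothetical header) ---
#ifndef SQUARE_H
#define SQUARE_H
struct Square { int side; };
inline int area(Square s) { return s.side * s.side; }
#endif

// --- second #include "square.h", e.g. pulled in through another header ---
#ifndef SQUARE_H   // already defined, so everything up to #endif is skipped
#define SQUARE_H
struct Square { int side; };
inline int area(Square s) { return s.side * s.side; }
#endif
```

Without the guard, the second copy would redefine `Square` inside the same translation unit and fail to compile.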

type conversion

  • implicit conversions convert types when acceptable
    • pre-defined set of compiler rules
  • why is using const in value parameters useless?

optimization

  • what's the official rule for how c++ compilers can change code? the "as-if" rule: anything goes when the "observable behavior" is identical - i.e. change anything if the output is the same
  • methods:
  1. constant folding
  2. dead code elimination

constants

  • "by default, expressions are evaluated at _"?
  • it's up to the compiler as to whether anything is evaluated at compile time
  • constexpr: declares value as a compile-time available constant - all args must be evaluatable at compile time (i.e. be constant expressions) - BUT... this is optional!
    • values must be KNOWN -> not nec. imply compilation
  • "constexprness" implies constness
  • std::string_view is a fragile container
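A minimal sketch of the constexpr points above, using a hypothetical `square` function. Initializing a constexpr variable forces compile-time evaluation; calling the same function with a runtime argument does not.

```cpp
constexpr int square(int x) { return x * x; }

// A constexpr variable requires a constant expression, so this call runs at compile time.
constexpr int nine = square(3);
static_assert(nine == 9);

// Same function, runtime argument: evaluated at runtime (or folded by the optimizer).
int squareAtRuntime(int x) { return square(x); }
```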

order of operations

  • why is this bad? according to the c++ spec, the order in which function arguments are evaluated is unspecified - generally speaking, args should not have side effects
auto x = 5;

f(x++, ++x);
  • what does this print, and why? what does it teach us about the ternary operator?
    • due to operator precedence, 0 is first printed; then the returned std::cout& is contextually converted to a boolean and the result of the ternary is discarded
#include <iostream>

int main() {
    int x { 2 };
    std::cout << (x < 0) ? "negative" : "non-negative";

    return 0;
}
  • NOTE: <=>, bitwise operators, and more have lower priority than <<
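A sketch of the fix for the ternary snippet: parenthesize the whole conditional so that << applies to its result. The helper `describe` and the use of `std::ostringstream` (to make the result easy to inspect) are assumptions for illustration.

```cpp
#include <sstream>
#include <string>

std::string describe(int x) {
    std::ostringstream out;
    // Parentheses around the whole conditional make << apply to its result.
    out << ((x < 0) ? "negative" : "non-negative");
    return out.str();
}
```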

  • I allocate N bits into a std::bitset. How many bits of memory are allocated and why? typically ceil(N / bits per machine word) * bits per word - common implementations store a bitset as a fixed-size, contiguous array of machine words for convenient accessing (the exact layout is implementation-defined)
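A way to check the rounding on your own toolchain, assuming a hypothetical helper `storageBits`. The standard does not fix the layout, so the sketch only reports what the implementation chose.

```cpp
#include <bitset>
#include <climits>
#include <cstddef>

// Number of bits of storage the implementation uses for a bitset of N bits.
template <std::size_t N>
constexpr std::size_t storageBits() {
    return sizeof(std::bitset<N>) * CHAR_BIT;
}
```

On common 64-bit standard libraries one would expect `storageBits<1>()` and `storageBits<65>()` to differ by one machine word, but that is an observation about those implementations, not a guarantee.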

scopes

  • explain lifetime management in c++

  • an identifier can have multiple scopes - T/F

  • how does the compiler search for scope resolution? ||it works from the inside out resolving symbols||

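  • how can i refer to the outer x from the inner block in the example? ||you can't - the inner x shadows it, and the scope resolution operator only reaches namespace-scope names (e.g. ::x for a global), not function locals; rename one of them||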

void f() {
  int x;
  {
    int x;
  }
}
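A small sketch of inside-out lookup with three variables all named `x` (the function name `lookup` is hypothetical). Inside the inner block, plain `x` finds the innermost declaration and `::x` reaches the namespace-scope one; the local declared one block out has no name that reaches it from there.

```cpp
int x = 1;              // namespace-scope x

int lookup() {
    int x = 2;          // shadows the global x
    int sum = x;        // innermost visible x -> 2
    {
        int x = 3;      // shadows the x declared one block out
        sum += x;       // 3
        sum += ::x;     // scope resolution reaches the global -> 1
    }
    return sum;
}
```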

linkage

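  • what is linkage? ||linkage determines whether declarations of the same identifier in other scopes or translation units refer to the same entity, i.e. its visibility to the linker||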

  • can the linker see variables with internal linkage? ||intuitively no - they'll never need to be linked because they are not accessible from other translation units||

  • functions default to {external, internal} linkage ||external||

    • non-constant global variables? ||external||
  • how to declare constant global to have external linkage? ||add the storage class specifier extern||

  • why can constexpr variables not be forward-declared with extern? ||the value must be evaluatable at compile-time - if another file holds the initializer, the compiler cannot see it (it compiles files in isolation)||

  • Whats the difference between a variables scope, duration, and linkage? What kind of scope, duration, and linkage do global variables have?

    • ||scope -> where defined/accessible in the code; duration -> creation/destruction bounds (can be different in more complex areas? give an example), linkage -> visibility beyond translation unit||

inlining

  • centrally concerned with linkage

    • external linkage can cause ODR violations -> resolve with inline
  • when to not inline? ||when the code-size growth from duplicating the function body outweighs the saved call overhead||

  • why can you not force the compiler to inline?

  • why is inline not to be used for inlining?

    • ||inlining is a per-function call concept, but the keyword makes it apply to all scenarios||
  • inline is about removing ODR violations across multiple different files.

    • how does this vary from header guards?
  • when should you use inline?

  • inline really means ||multiple definitions are permitted|| instead of inline the call

  • NOTE: this still honestly doesn't make much sense to me

    • Review linkage, compilation process, ODR, and inlining
  • We usually just want ONE definition - so we keep implementations in ||source|| files so that on compilation the ||linker|| can easily find the function definition (which it can do because it has ||external|| linkage by default)

    • When defining something like a function ||implementation|| in a ||header|| file, this pattern is broken. why? ||multiple function impls with external linkage will exist||
      • inline then allows separate ||translation units||, indeed compiled separately, to resolve the symbol fine
  • marking something as inline does/does not ||does not|| change the storage class specifier. the linker ||merges|| multiple definitions of variables marked inline

  • inline variables have ||external|| linkage by default - why? ||duh - so linker can resolve between TUs||
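A sketch of the "multiple definitions are permitted" meaning of inline. Imagine the block below living in a header that several .cpp files include; the names `g_calls` and `bump` are hypothetical. Each translation unit gets its own copy of the definitions and the linker merges them into one entity.

```cpp
// Imagine this whole block lives in a header included by several .cpp files.
inline int g_calls = 0;          // inline variable (C++17): the linker merges the copies

inline int bump() {              // inline function: identical definitions are permitted
    return ++g_calls;
}
```

Without `inline`, the second translation unit to include the header would produce a duplicate-symbol error at link time.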

static

  • has so many different meanings

  • What does static mean inside/outside a function? ||static storage duration/internal linkage|| - inside a function, the variable persists across calls until the end of the program

  • how do I give a function internal linkage? ||mark as static||
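A minimal sketch of both meanings, with hypothetical names `helper` and `nextId`: static at namespace scope gives internal linkage, static inside a function gives static storage duration.

```cpp
// static at namespace scope: internal linkage, invisible to other translation units.
static int helper(int x) { return x + 1; }

// static inside a function: one variable that survives between calls.
int nextId() {
    static int id = 0;   // initialized once, on the first call
    id = helper(id);
    return id;
}
```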

scoping and brackets

  • which does the else bind to? ||closest/nearest unmatched if -> the if (y) block||

  • explain loops without brackets in terms of grammar

    • ||basically, grammatically an if/for/etc. governs exactly one statement. an enclosing body with brackets is a compound statement, which counts as one statement. no brackets -> only the first statement that follows is grabbed||
      • why does this then work with chained elses and else ifs? note that else ifs are parsed as ifs inside of else blocks
if (x)
  if (y)
    a();
  else
    b();
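The same control flow with the braces the parser effectively inserts, wrapped in a hypothetical function `which` that returns the branch name so the binding is observable.

```cpp
#include <string>

// Same logic as the snippet above, with explicit braces.
std::string which(bool x, bool y) {
    if (x) {
        if (y) {
            return "a";
        } else {          // binds to if (y), not to if (x)
            return "b";
        }
    }
    return "neither";
}
```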

switch statements

  • compilers are smart, and may do the following for switch statements:
  1. direct access jump tables
  2. binary search when too big
  3. re-indexing (i.e. values start at large n)

loops

  • mostly get compiled to the same thing

function overloading implicit type conversion

  • why is it needed? ||internal representation of different types differs||

  • using keyword does a variety of things

  • what happens when multiple function candidates are available?

    • overload resolution -> choose ONE best candidate (if multiple, error)
  • What's the output (and why?)

    • ||f(double), because float -> double is a promotion, which beats the float -> int conversion||
#include <iostream>

void f(int)    { std::cout << "f(int)\n"; }
void f(double) { std::cout << "f(double)\n"; }

int main() {
    f(3.14f);
    int x = 3.14f;  // also conversion
}
  • in function overloading, acc. to the C++ spec, candidates are tried in this order:
  1. exact match
  2. promote arguments
  3. standard (numeric) conversions
  4. user defined conversions
  5. ellipses
  6. give up
  • ? what's the difference between compiler vs. linker resolving functions?

    • compiler resolves function calls via OVERLOADING - indeed, the compiler must know all function signatures. subsequently, each function is then mapped to in the linker phase.
  • What's the output (and why?)

    • ||compilation error -> CONVERSION of the same rank||
#include <iostream>

void f(long)   { std::cout << "f(long)\n"; }
void f(unsigned long) { std::cout << "f(unsigned long)\n"; }

int main() {
  f(3.14f);  // error: ambiguous, both overloads need a conversion of the same rank
  return 0;
}
  • NOTE: conversion is complex
    • promotion -> a special, value-preserving subset of conversions -> prioritized first
      • only covers small types (bool, char, short -> int; float -> double), not e.g. int -> long
    • conversion -> otherwise
    • compilers do the five following conversions, and in which order?

||

  1. exact (including qualifiers such as cv, array-to-pointer, etc.)
  2. promotion
  3. conversion (and any path, promotion + conversion, etc)
  4. user-defined
  5. variadic (i.e. smash into an ellipses) ||
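A sketch that exercises the ranking above. All names (`pick`, `take`, `Wrapper`, the `demo*` helpers) are hypothetical; each overload returns a label so the chosen candidate is visible.

```cpp
#include <string>

std::string pick(int)    { return "int"; }
std::string pick(double) { return "double"; }

struct Wrapper {
    Wrapper(int) {}                     // converting constructor: a user-defined conversion
};
std::string take(Wrapper) { return "user-defined"; }
std::string take(...)     { return "ellipsis"; }

std::string demoExact()      { return pick(1); }      // exact match
std::string demoPromoteInt() { return pick('a'); }    // char -> int promotion beats char -> double conversion
std::string demoPromoteFp()  { return pick(3.14f); }  // float -> double promotion beats float -> int conversion
std::string demoUser()       { return take(1); }      // user-defined conversion beats the ellipsis
```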
  • valid or not, and why?
    • ||invalid - return types are not used in function overloading for a) compatibility reasons with C and b) trouble distinguishing the desired return type (how can you indicate which function you want when there isn't always a straightforward way to do so?). e.g. auto X = x();, f(x());, etc. Generally, it's convenient to just look at the call context & know the return value after deducing the right function call.||
      • || in other words, a function's type signature is everything but the return type.||
int x();
double x();
  • exactly what does this do?
    • || halt compilation if the function is called. code is still compiled, but delete forbids the call||
template <typename T>
void printInt(T x) = delete;

default arguments

  • are default args part of function signature ||no||
  • how are default args instantiated
  • what's the output of the following code, and why?
    • ||behaves as expected: foo() runs once per call, so s ends at 2. the compiler substitutes the default args at the call site, so each call becomes g(foo())||
int foo() {
  static int s = 0;
  return ++s;
}

void g(int x = foo()) {
  // ...
}

int main() {
  g();
  g();
}
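A variant of the snippet above in which `g` returns its argument, so the per-call evaluation of the default is observable. The helper `callTwice` is hypothetical.

```cpp
int foo() {
    static int s = 0;
    return ++s;
}

// Returns its argument so the effect of the default is visible.
int g(int x = foo()) {
    return x;
}

int callTwice() {
    int first = g();    // compiled as g(foo()) -> 1
    int second = g();   // compiled as g(foo()) again -> 2
    return first * 10 + second;
}
```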
  • does this compile? y or no? ||no - default args not considered in function signature in overload resolution process - call is ambiguous||
void print()
{
    std::cout << "void\n";
}

void print(int x=0)
{
    std::cout << "int " << x << '\n';
}

int main() { print(); }

templates

  • how do templates work in the compiler?
    • ||an internal symbol table - one template blueprint itself does not generate machine code (instantiations do)||
  • say i have two functions - one regular, one templated. how can i prefer the templated function? do i even need to?
    • ||if you want to target the templated function, which is more generic and thus not prioritized by the compiler, add an explicit template argument list <> to the call||
template <typename T>
void f(T);
void f(bool);

// ??
f(true);        // non-template f(bool)
f<bool>(true);  // template; NOTE: bool is optional, f<>(true) also works
  • what's the output? || 1) 2) 1) - consider the instantiated code per the compiler - one static local variable per template specialization ||

#include <iostream>

template <typename T>
void printIDAndValue(T value)
{
    static int id{ 0 };
    std::cout << ++id << ") " << value << '\n';
}

int main() {
    printIDAndValue(12);
    printIDAndValue(13);

    printIDAndValue(14.5);

    return 0;
}
  • what about here? ||second function called. template can create an exact match which is preferred over promotion||

void f(int x) { ... }
void f(auto x) { ... }

// ??
f(short{3});
  • ? concepts and generics, best ways to link together?
  • what's partial ordering of templates?
    • ? what's the entire way compilers resolve function calls?
template <typename T>
auto add(T x, T y) {
    return x + y;
}

template <typename T, typename U>
auto add(T x, U y) {
    return x + y;
}
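A sketch of partial ordering with two function templates shaped like the `add` pair above. The name `kind` and the returned labels are hypothetical. When both templates can be deduced, the more specialized one (the one that accepts fewer argument combinations) wins.

```cpp
#include <string>

template <typename T>
std::string kind(T, T) { return "same type"; }

template <typename T, typename U>
std::string kind(T, U) { return "mixed types"; }

std::string demoSame()  { return kind(1, 2); }    // both viable; (T, T) is more specialized
std::string demoMixed() { return kind(1, 2.5); }  // only (T, U) can be deduced
```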
  • what's wrong with this? how do i fix it?
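    • ||add specializations for N = 0 and N = 1, or use if constexpr with an else branch. picture the compiler - it must instantiate every template the function body names, even in the branch a plain if would never take at runtime, so the recursion never terminates. in some sense, the compiler is literally executing the code.||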
template <int N>
constexpr long long fibonacci() {
    static_assert(N >= 0);
    if (N <= 1) { return N; }
    return fibonacci<N - 1>() + fibonacci<N - 2>();
}
  • NOTE: this is a sharp edge. skip it, honestly
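One possible fix for the fibonacci template, using `if constexpr` (C++17). The discarded branch is not instantiated, so the recursion stops at N <= 1.

```cpp
template <int N>
constexpr long long fibonacci() {
    static_assert(N >= 0);
    if constexpr (N <= 1) {
        return N;   // base case: the recursive calls below are never instantiated
    } else {
        return fibonacci<N - 1>() + fibonacci<N - 2>();
    }
}

static_assert(fibonacci<10>() == 55);
```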

  • functions instantiated from templates are implicitly ||inline||. Why? ||must be so that different, identical specializations in files do not violate the ODR.||

    • the odr is a compile and/xor link time concept ||and||

pointers and references

  • how do references work under the hood?
    • alias to variable only known to compiler - implicitly dereffed
    • NEVER reassignable
    • output?
int x = 5;
int y = 4;
int& z = x;
z = y;  // x holds value of 4 -> can never reassign alias
// conceptually: (*p) = y, where p is the hidden pointer the compiler keeps for z
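A small check that assigning through a reference writes to the referred-to object and never rebinds the reference. The function name `referenceStaysBound` is hypothetical.

```cpp
bool referenceStaysBound() {
    int x = 5;
    int y = 4;
    int& z = x;
    z = y;      // copies y's value into x; z still refers to x
    y = 99;     // does not affect x or z
    return x == 4 && z == 4 && &z == &x;
}
```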
  • what's the output?
    • ||A 5 A; reference calls always implicitly resolve to (*ref) by the compiler - "syntactic sugar". In other words, it's impossible to actually inspect the reference pointer itself.||
#include <iostream>

void printAddresses(int val, int& ref)
{
    std::cout << val << '\n';
    std::cout << &ref << '\n';
}

int main() {
    int x { 5 };
    std::cout << "The address of x is: " << &x << '\n';
    printAddresses(x, x);
}
  • will this compile? ||yes - it returns a dangling reference, but that doesn't mean it won't compile (it is valid c++, sigh)||
const std::string& getProgramName() {
    const std::string programName { "Calculator" };

    return programName;
}
// automatic storage duration of `programName` - destroyed
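Two possible ways to avoid the dangling reference, sketched with hypothetical function names. Which one fits depends on whether callers need their own copy.

```cpp
#include <string>

// Option 1: return by value; the caller owns its own copy.
std::string getProgramNameByValue() {
    const std::string programName { "Calculator" };
    return programName;
}

// Option 2: static storage duration, so the referenced object outlives the call.
const std::string& getProgramNameStatic() {
    static const std::string programName { "Calculator" };
    return programName;
}
```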
  • auto and what not - name the types, assuming getRef() returns T&, getConstRef() returns const T&, and getPointer() returns T*:

    • auto x = getRef() -> ||T (reference discarded)||
    • auto& x = getRef() -> ||T&. This is known as "reapplying the reference"; the same can be done with const||
    • auto x = getConstRef() -> ||T (const discarded)||
    • auto x = getPointer() -> ||T* (pointer kept - notion of converting/dropping is incorrect, it's sort of its own type)||
    • auto* x = getPointer() -> ||T* (pointer kept)||
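The deduction rules above can be checked with `static_assert`. This sketch assumes T = int and supplies hypothetical definitions for getRef, getConstRef, and getPointer.

```cpp
#include <type_traits>

int value = 0;
int&       getRef()      { return value; }
const int& getConstRef() { return value; }
int*       getPointer()  { return &value; }

bool checkDeduction() {
    auto        a = getRef();        // int: reference dropped
    auto&       b = getRef();        // int&: reference reapplied
    auto        c = getConstRef();   // int: reference and top-level const dropped
    const auto& d = getConstRef();   // const int&: both reapplied
    auto        e = getPointer();    // int*
    auto*       f = getPointer();    // int*
    static_assert(std::is_same_v<decltype(a), int>);
    static_assert(std::is_same_v<decltype(b), int&>);
    static_assert(std::is_same_v<decltype(c), int>);
    static_assert(std::is_same_v<decltype(d), const int&>);
    static_assert(std::is_same_v<decltype(e), int*>);
    static_assert(std::is_same_v<decltype(f), int*>);
    return a == c && &b == &d && e == f;
}
```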
  • low-level vs. top level const?

    • ||A top-level const applies to the object itself (e.g. const int x or int* const ptr)||
    • ||A low-level const applies to the object accessed through a reference or pointer (e.g. const int& ref, const int* ptr).||

enums

  • by default, enums are ||unscoped||, which means their enumerators ||leak into the scope they're defined in||. For this reason default to enums keyed with the word ||class||, or "||scoped||" enums; we just need to access them with the ||scope resolution|| operator.

  • ||bad, good - scoped enums do not implicitly convert to int; unscoped ones do||

enum class X { A };
enum Y { A };

X::A == 0;
Y::A == 0;
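A compilable sketch of the scoped/unscoped difference, using hypothetical enums `Color` and `Fruit` to avoid clashing with the names above.

```cpp
enum class Color { Red, Green };   // scoped
enum Fruit { Apple, Banana };      // unscoped: Apple and Banana leak into this scope

bool unscopedConverts() { return Banana == 1; }                         // implicit conversion to int
bool scopedNeedsCast()  { return static_cast<int>(Color::Green) == 1; } // explicit cast required
// bool bad() { return Color::Green == 1; }  // error: no implicit conversion for scoped enums
```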
  • ||yes, for scoped and unscoped alike (A is visible inside f); NEVER (A is local to f)||
void f() {
  // NOTE: even enum does not work here (both scoped inside)
  enum class A { B };

  // does this compile?
  A::B;
}

// this?
A::B;

structs/classes

  • static in structs means one variable shared by every instance, for the lifetime of the program

    • where do static class/struct vars live in memory? ||not in struct, prob in the data segment||
  • any function implementation in a struct is implicitly marked ||inline|| - why?

  • why declare a static struct variable inline? ||so you can define it once inside the struct (C++17+)||

    • NOTE: inline only makes sense for variables with static storage duration: static class members and namespace-scope variables defined in headers - shared, program-living variables. i.e. marking an object-local variable as inline is nonsensical
  • static-static interaction only

  • no this pointer -> must access via ClassName::{}

  • member functions are like/not like free functions in the sense that they can/cannot be declared in any order inside the class

  • specification default vs. class

  • valid? & why? ||yes, const is part of signature. useful for things like operator overload, const & non-const versions||

  • what is const mem func -> ||cannot modify the object's (non-mutable) members||

  • When can you not call non-const member functions? ||from const objects -> may mutate state||

    • pretty straightforward, which is why you can/cannot call non-const MF from non-const objects (||can||)
// ??
class A{
  void f() const;
  void f();
};
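A sketch showing that the const qualifier takes part in overload selection. The member functions return labels and the helper names are hypothetical.

```cpp
#include <string>

class A {
public:
    std::string f() const { return "const"; }
    std::string f()       { return "non-const"; }
};

std::string callOnMutable() { A a{};       return a.f(); }  // picks f()
std::string callOnConst()   { const A a{}; return a.f(); }  // only f() const is callable
```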

access specification

  • private/public/protected
  • given your definitions, why is the code below valid or invalid? ||valid - access control is per class, not per object: any member function (or friend) of A can touch the private members of any A object||
class A{
  int x;
  void f(A a) {
    // ??
    a.x;
  }
};
  • structs/class differ in that struct/class AS is ||public/private||

  • access specification is a ||compile||-time construct and occurs before/after (||after||) method resolution in the compilation process

  • good or bad (||bad - best match, then resolve specification||)

struct Gate {
private:
  void calibrate(int);
public:
  void calibrate(double);
};

void run(Gate& g) {
  // ??
  g.calibrate(7);
}

struct/class memory management

  • struct members can/cannot be reordered and can/cannot have padding inserted
  • in terms of the following memory model:
    • code: code itself
    • data: globals, statics, string literals
    • stack: stack frames, local vars, etc
    • heap: heap-allocated things
struct Widget {
    int id;                        // per-object data
    std::string name;              // per-object data
    static int total_count;        // one global variable
    static constexpr double pi = 3.14159; // goes in rodata
    void ping();                   // function
    virtual void vfunc();          // function pointer in vtable
};
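  • ... member functions are in (||code, so you don't have multiple per-instantiation||), regular variables ||wherever the object itself lives - stack, heap, or data||, static in ||data||, virtual stuff ||the functions are in code; the vtable is one per class (typically read-only data), and each object carries one hidden pointer to it||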
    • WHY does this memory management make sense?
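A rough way to observe the layout claims with `sizeof`. The structs are hypothetical, and the exact sizes are what mainstream ABIs produce rather than something the standard promises.

```cpp
struct Plain {
    int id;                          // per-object data
    static int total;                // one variable in the data segment, not in each object
    int get() const { return id; }   // code, shared by all instances
};
int Plain::total = 0;

struct WithVirtual {
    int id;
    virtual ~WithVirtual() = default;  // typically adds one hidden vtable pointer per object
};

// Assumption: no padding is added to a struct holding a single int (true on common ABIs).
bool layoutCheck() {
    return sizeof(Plain) == sizeof(int) && sizeof(WithVirtual) > sizeof(int);
}
```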

construction

  • do constructors create objects || no they initialize ||

  • why not const?

  • follow normal conversion rules

  • output: ||member initializer list always takes precedence||

    • NOTE: member IL != IL
class A {
public:
  A(int x) : x(x) {}
  int x{2};
};

// ??
A{5}.x;
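  • members are initialized in order of declaration in the class, not in the order written in the member initializer list (why? there must be one fixed order across all constructors so that destruction can run in exactly the reverse order)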
class Foo {
private:
    int m_x{};
    int m_y{};

public:
    Foo(int x, int y)
        // m_x is ALWAYS garbage
        : m_y { std::max(x, y) }, m_x { m_y }
    {}
};
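One way to repair the Foo example: declare m_y before m_x so the declaration order matches the dependency. The accessors are added only so the result can be inspected.

```cpp
#include <algorithm>

class Foo {
    int m_y{};   // declared first, so initialized first
    int m_x{};

public:
    Foo(int x, int y)
        : m_y{ std::max(x, y) }, m_x{ m_y }  // list order now matches declaration order
    {}
    int x() const { return m_x; }
    int y() const { return m_y; }
};
```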
  • delegating and chaining constructors can be useful for off-loading logic

    • CALLING CTORS of other classes and our own in CTORS IS PERFECTLY FINE!
  • more rules:

    • T/F: if I have constructors but no default, c++ creates an implicit one for me ||F||
    • T/F: if I have no constructors at all, c++ generates an implicit default constructor ||T||
    • Say you want to disable copying/default construction. I claim you can just make the methods private. Does this work? ||Yes - but members of the class/inheritors/etc. can still call it -> defeating the purpose - just use = delete||
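  • anything wrong? ||yes, ill-formed - a copy constructor must take its parameter by reference; taking A by value would itself need a copy -> infinite recursion||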

class A {
  A(A  a) {}
};
  • why should copy constructors not have side effects? ||if copy elision occurs, the side effects will not. copy elision is an explicit exception to the "as-if" rule||

  • what is (N)RVO and which c++ concept is behind it? ||copy elision & copy constructors||

  • does this compile? || no - only one user-defined conversion can be applied implicitly (trying to go from const char* -> std::string_view -> Employee)||

class Employee {
private:
    std::string m_name{};
public:
    Employee(std::string_view name)
        : m_name{ name } { }
    const std::string& getName() const { return m_name; }
};

void printEmployee(Employee e) {
    std::cout << e.getName();
}

int main() {
    printEmployee("Joe");

    return 0;
}
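Two possible ways to make the call compile, sketched with a hypothetical `nameOf` in place of `printEmployee` so the result can be checked. In each case one of the two user-defined conversions is spelled out explicitly, leaving at most one for the compiler.

```cpp
#include <string>
#include <string_view>

class Employee {
    std::string m_name{};
public:
    Employee(std::string_view name) : m_name{ name } {}
    const std::string& getName() const { return m_name; }
};

std::string nameOf(Employee e) { return e.getName(); }

// nameOf("Joe");  // error: would need const char* -> std::string_view -> Employee
std::string viaExplicitObject() { return nameOf(Employee{ "Joe" }); }          // second step done by hand
std::string viaStringView()     { return nameOf(std::string_view{ "Joe" }); }  // first step done by hand
```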
  • ? know each constructor syntax - which does this call?
Y y = Y(5);  // cpp isn't stupid -> calls Y::Y(int); the copy from the temporary is elided (guaranteed in c++17+) - not default + assigned
Y y = Y();  // default & copy, but copy elided (c++17+) -> just default
Y o;
Y y(o);  // textbook copy
Y y = Y(Y());  // one default - BOTH copies elided
  • explicit - the constructor cannot be used for implicit conversions or copy-initialization (its arguments may still convert)
    • why use explicit -> to block accidental implicit conversions
class A {
public:
  int a;
  A(int a) : a(a) {}
  void meth(A a) {}
};

A a = true; // this is valid only when A::A(int) is not explicit
a.meth(false); // so i can do this too
  • called by static_cast just fine, too
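A sketch that contrasts an implicit and an explicit constructor using standard type traits. The structs `Loose` and `Strict` and the helpers are hypothetical.

```cpp
#include <type_traits>

struct Loose {
    Loose(int) {}
};
struct Strict {
    explicit Strict(int) {}
};

static_assert(std::is_convertible_v<int, Loose>);      // Loose l = 5; compiles
static_assert(!std::is_convertible_v<int, Strict>);    // Strict s = 5; does not
static_assert(std::is_constructible_v<Strict, bool>);  // Strict s(true); fine: bool -> int happens on the argument

int acceptLoose(Loose) { return 1; }
int demo() { return acceptLoose(5); }   // implicit int -> Loose at the call site
```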

this pointer

  • "Every non-static member function has an implicit parameter called this."
  • how does this and implicit this work?
    • inserted implicitly
  • what's the type of this in non-const/const member functions? || TheClass* const (const pointer - you can still change underlying object)/const TheClass* const (you cannot change underlying object, so pointer to const)||
  • NOTE: references didn't exist when c++ was made, so even tho it should be a reference it's not
class MethodChain {
  MethodChain& mc() {
    // deref -> automatically aliased in return type
    return *this;
  }
};
  • ODR again - why does defining classes in headers "just work?" - because class member functions defined ||& implemented fully in the header|| are ||implicitly marked as inline||
    • I could just define an entire class in a header - why not? ||organization + increased compile time -> recompile header + all deps on change, rather than just the cpp source file||
    • why do class variables and other things not need to be marked inline ||member functions have external linkage by default. other stuff does not and is per-object local storage - it has NO linkage||

more static

  • static vars are/are not associated with the lifetime of a class ||are not - start & end of program - can access w/ 0 instances||
class A {
public:
  static int x;
};

A::x; // valid - no instance needed
  • say you want to shield static class vars - how? ||make private - access specifiers trump||

  • static member functions have/don't have a this pointer ||don't||

    • therefore, they can/cannot call non-static functions - why? ||cannot - non-static member functions need an implicit this object pointer to operate on, and statics do not have this||
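A minimal sketch tying the static member points together, using a hypothetical `Counter` class. The static variable is private and reachable only through the static member function, which needs no instance.

```cpp
class Counter {
    static inline int s_count = 0;   // one variable for the whole program (C++17 inline variable)
    int m_id;

public:
    Counter() : m_id{ ++s_count } {}
    int id() const { return m_id; }
    static int count() { return s_count; }   // no this pointer: can only touch static members
    // static int bad() { return m_id; }     // error: no object to read m_id from
};
```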

beginning c++ 23: beginner to pro