This is my own style guide, meant mainly for me. You don’t have to agree with or follow anything that’s written here.
The opening brace of a function should be placed on the same line as the function definition, separated from it by a space. The closing brace should be placed on the line following the last line of the function.
int main() {
	// do something
	return 0;
}
Braces should not be used for a function if you are only declaring the signature.
int main();
Indentation should be done using TABS.
Files should be encoded as UTF-8 with LF line endings, regardless of operating system. (This does not include binaries.)
Avoid using smart pointers unless custom management becomes too complex.
Use _alloca() or alloca() whenever you need to store basic types such as integers, but don’t know at compile time EXACTLY how many there are. Never allocate too much memory with _alloca(), or you’ll run into stack overflow errors. If you get a stack overflow error, always look at your _alloca() calls first: log the amount of memory each call allocates, and the running total.
Use malloc() (or new) whenever you can’t use _alloca() (long loops, large memory, types that have destructors).
Storing pointer locations as uintptr_t instead of char* for pointer arithmetic is recommended. Only convert to void* when returning from a function. This is mainly for readability.
All output binaries should be as position-independent as possible, regardless of use case. (Excluding kernel-level programming, or binaries where size is a priority.)
On Linux, position-independent binaries should be made with the use of -fPIC or -fPIE -pie flags for g++.
On Windows, position-independent binaries should be made with the use of -O2 -fno-common -Wl,--dynamicbase -Wl,--high-entropy-va -Wl,--nxcompat flags for g++. Because Windows uses PE instead of ELF, this is the closest you can get to truly position-independent Windows code.
Debug builds should be built with the following arguments:
On Linux: g++ -g -O1 -fPIE -pie or g++ -g -O1 -fPIC
On Windows: g++ -g -O1 -fno-common -Wl,--high-entropy-va
For builds making excessive use of _alloca(), consider giving them a bigger stack size.
Not all of the following optimization flags have to be used, but they are valid options.
-Ofast enables all optimizations from -O3 plus aggressive transformations that may violate strict IEEE or ISO compliance rules. It includes -ffast-math, allowing algebraic simplifications and less strict floating-point semantics.
Command:
g++ -Ofast your_code.cpp -o your_program
Additional Enhancements:
Compiler flags: -Ofast -fipa-pta -fipa-cp-clone
Linker flags (Linux): -Wl,--icf=all -Wl,--gc-sections -Wl,-O2
Linker flags (Windows): -Wl,--enable-icf -Wl,--gc-sections -Wl,-O2
Profile-guided optimization (PGO) uses runtime profiling data to guide the compiler’s optimization decisions, improving instruction layout, branch prediction, and cache locality.
Step 1 – Instrument the program
g++ -O3 -fprofile-generate your_code.cpp -o your_program_gen
Step 2 – Run with representative data
./your_program_gen <test_input>
Step 3 – Recompile using the profile data
g++ -O3 -fprofile-use -fprofile-correction your_code.cpp -o your_program_pgo
Alternative (Linux only):
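Use automatic feedback from perf sampling data with AutoFDO. GCC does not read the raw perf.data file; convert it first with AutoFDO’s create_gcov tool and pass the converted profile (named your_program.afdo here as a placeholder):
g++ -O3 -fauto-profile=your_program.afdo your_code.cpp -o your_program_autofdo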
Notes:
-fprofile-correction smooths out inconsistent profile counters, such as those produced by multi-threaded runs of the instrumented binary.
Link-time optimization (LTO) enables whole-program optimization across multiple translation units. It allows interprocedural inlining, devirtualization, and dead code elimination at link time.
Standard LTO:
g++ -O3 -flto your_code.cpp -o your_program
ThinLTO (faster linking, similar benefits; Clang only, since GCC does not accept -flto=thin):
clang++ -O3 -flto=thin your_code.cpp -o your_program
PGO + LTO Combined:
g++ -O3 -flto -fprofile-use -fipa-pta -fdevirtualize-at-ltrans your_code.cpp -o your_program
Advantages:
-fdevirtualize-at-ltrans and -fipa-pta improve pointer analysis and call devirtualization.
Linker flags (Linux): -Wl,--gc-sections -Wl,--icf=all -Wl,-O2
Linker flags (Windows): -Wl,--enable-icf -Wl,--gc-sections -Wl,-O2
Optimize code generation for the specific CPU architecture to leverage advanced instruction sets (SSE, AVX, AVX2, AVX-512, etc.).
Command:
g++ -O3 -march=native -mtune=native
Examples:
-march=skylake-avx512 or -march=haswell
-march=znver4
-mcpu=apple-m1
-march=armv8-a+simd
Additional Flags:
-fprefetch-loop-arrays – enable prefetching for loop data (Linux/macOS only).
-funroll-all-loops – unroll all loops aggressively (caution: larger binaries).
-frename-registers – improve register allocation in tight numeric loops.
Notes:
-march=native automatically detects and enables all supported features on the build host.
For maximum optimization potential across modern systems, combine all major optimizations, profile-guided feedback, LTO, and CPU targeting.
Command (Linux/macOS):
g++ -Ofast -ffast-math -funsafe-math-optimizations -fno-trapping-math -finline-functions -fno-math-errno -fivopts -faggressive-loop-optimizations -flto -fprofile-use -fprofile-correction -fipa-pta -fdevirtualize-at-ltrans -march=native -mtune=native -funroll-loops -fomit-frame-pointer -fprefetch-loop-arrays -Wl,--icf=all -Wl,--gc-sections -Wl,-O2 your_code.cpp -o your_program
Command (Windows – MinGW/MSYS2):
g++ -Ofast -ffast-math -funsafe-math-optimizations -fno-trapping-math -finline-functions -fno-math-errno -fivopts -faggressive-loop-optimizations -flto -fprofile-use -fprofile-correction -fipa-pta -fdevirtualize-at-ltrans -march=native -mtune=native -funroll-loops -fomit-frame-pointer -Wl,--enable-icf -Wl,--gc-sections -Wl,-O2 your_code.cpp -o your_program.exe
Notes:
Use -fauto-profile on Linux for automatic profile feedback instead of manual PGO.