November 8, 2025 by Alessandro Colucci
Optimizing code for embedded systems is a critical skill for engineers and developers working with constrained hardware. The ESP32, a powerful and versatile microcontroller, is widely used in IoT, robotics, and sensor applications. While it offers impressive capabilities, achieving peak performance requires a deep understanding of code optimization techniques, memory management, and timing considerations.
Using its resources efficiently makes your projects run faster, consume less power, and behave predictably.
In this article, we will explore practical and advanced strategies to optimize your ESP32 firmware, from compiler-level tricks to algorithmic improvements, memory handling, and RTOS-level optimizations. By the end, you will have actionable methods to improve both speed and efficiency in your embedded projects.
This guide is structured to help you move from basic tweaks to deeper firmware strategies, giving you tangible improvements in your ESP32 projects.
Embedded systems, unlike general-purpose computers, are constrained by limited CPU speed, memory, and power. Poorly optimized code can lead to slower execution, higher power consumption, and unpredictable timing.
Even minor inefficiencies can have a big impact in constrained environments.
By understanding these pitfalls, you can write code that not only works, but performs consistently under all conditions.
For instance, consider a sensor polling loop running on an ESP32:
void loop() {
    int sensorValue = analogRead(34);  // read the sensor on GPIO 34
    delay(10);                         // blocks the calling task for 10 ms
}
This simple loop seems fine, but in multitasking or high-frequency scenarios, it can introduce delays and jitter, affecting critical timing.
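One common way around the blocking delay is to check elapsed time instead of waiting. The sketch below is a minimal, host-testable illustration of that idea, not a drop-in driver: `PeriodicPoller` is a hypothetical helper name, and it assumes the caller passes in a millisecond counter such as the value returned by `millis()`.

```cpp
#include <cstdint>

// Decides when a periodic job is due without blocking the caller.
// The caller supplies the current time in milliseconds (assumption:
// on the ESP32 under Arduino this would be the value of millis()).
class PeriodicPoller {
public:
    explicit PeriodicPoller(uint32_t intervalMs) : intervalMs_(intervalMs) {}

    // Returns true when at least one interval has elapsed since the
    // previous deadline. Unsigned subtraction keeps the comparison
    // correct across the 32-bit wrap-around of the counter.
    bool due(uint32_t nowMs) {
        if (nowMs - lastRunMs_ < intervalMs_) {
            return false;
        }
        // Advance from the previous deadline rather than from nowMs,
        // so a late call does not shift every later deadline.
        lastRunMs_ += intervalMs_;
        return true;
    }

private:
    uint32_t intervalMs_;
    uint32_t lastRunMs_ = 0;
};
```

Inside `loop()`, this would be used as `if (poller.due(millis())) { sensorValue = analogRead(34); }`, leaving the rest of the loop free to run between samples. Note that a caller that falls several intervals behind will see `due()` return true on consecutive calls until it has caught up.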
Before optimizing, it’s essential to understand the ESP32 architecture:
Knowing how your microcontroller works internally helps you place code and data in the right memory, and schedule tasks efficiently.
Placing critical routines in IRAM and minimizing cache misses improves responsiveness, especially in real-time applications.

ESP32 Architecture Image
This block diagram illustrates the main components of the ESP32, helping visualize where optimization matters most.

ESP32 SRAM Allocation
Understanding SRAM allocation allows you to place time-critical data in fast memory and avoid performance bottlenecks.
ESP32 firmware is built with a GCC-based toolchain, whether you work with ESP-IDF or PlatformIO. Compiler optimizations are the first line of defense for code efficiency.
Even before touching your code logic, the compiler can help improve execution speed and reduce memory usage.
| Flag | Focus | When to use |
|---|---|---|
| -O0 | No optimization | Debugging |
| -O1 | Basic | Modest gains with short compile times |
| -O2 | Balanced | General optimization |
| -O3 | Maximum Speed | CPU-intensive loops |
| -Os | Optimized for Size | Memory-limited applications |
| -Ofast | Aggressive speed (enables -ffast-math) | Only when strict floating-point compliance is not required |
Choosing the right optimization flag balances speed, memory usage, and deterministic behavior. Profiling is key before changing flags.
In PlatformIO, you can set optimization flags in platformio.ini:
[env:esp32dev]
platform = espressif32
board = esp32dev
framework = arduino
build_flags = -O2 -flto
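This allows you to fine-tune performance without changing source code. If the framework's default optimization flag is still passed to the compiler, PlatformIO's build_unflags option can remove it.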
- inline: Suggests function inlining to reduce call overhead.
- const and restrict: Help the compiler optimize memory access. restrict is a C99 keyword; in C++ code, GCC accepts the __restrict extension instead.
Using these keywords helps the compiler produce faster, more efficient code.
inline int square(const int x) {
    return x * x;
}
Small changes like inlining frequently called functions can noticeably reduce execution time in critical loops.
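As a rough sketch of the pointer-aliasing hint, the function below promises the compiler that the input and output buffers never overlap, which can let it keep values in registers or unroll the loop more freely. `scaleAndOffset` is a hypothetical name chosen for illustration, and `__restrict` is a compiler extension (accepted by GCC and Clang) rather than standard C++.

```cpp
#include <cstddef>

// Computes out[i] = in[i] * gain + offset for n elements.
// __restrict tells the compiler that out and in do not alias, so a
// store to out[i] cannot change any in[j]. Passing overlapping
// buffers breaks that promise and is undefined behavior.
inline void scaleAndOffset(float* __restrict out,
                           const float* __restrict in,
                           float gain, float offset, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = in[i] * gain + offset;
    }
}
```

Whether this produces measurably faster code depends on the optimization level and the loop body, so it is worth timing both variants before keeping the annotation.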
Critical functions should be placed in IRAM for faster execution. ESP-IDF provides IRAM_ATTR:
void IRAM_ATTR onTimer() {
    // Time-critical ISR code
}
Placing interrupt service routines (ISRs) in IRAM reduces jitter and ensures deterministic timing.
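hw_timer_t *timer = NULL;
volatile uint32_t tickCount = 0;  // file scope so the main code can read it

void IRAM_ATTR onTimer() {
    tickCount = tickCount + 1;  // keep the ISR as short as possible
}

void setup() {
    // Arduino-ESP32 2.x timer API; the 3.x core changed these signatures
    timer = timerBegin(0, 80, true);              // prescaler 80: 80 MHz / 80 = 1 MHz
    timerAttachInterrupt(timer, &onTimer, true);
    timerAlarmWrite(timer, 1000, true);           // 1000 ticks at 1 MHz = 1 kHz, auto-reload
    timerAlarmEnable(timer);
}
This setup fires a 1 kHz timer interrupt whose handler runs from IRAM, giving you a baseline for measuring and minimizing ISR latency.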
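A pattern that often goes with short ISRs is to record the event in the handler and do the real work later in the main loop or a task. The sketch below simulates that hand-off on a host machine, so it is an illustration under assumptions rather than ESP32-specific code: `onTimerTick` and `takePendingTicks` are hypothetical names, and on the ESP32 the handler would additionally carry `IRAM_ATTR`.

```cpp
#include <atomic>
#include <cstdint>

// Incremented by the interrupt handler, drained by the main code.
static std::atomic<uint32_t> pendingTicks{0};

// Stand-in for the ISR body: record the event and return immediately.
void onTimerTick() {
    pendingTicks.fetch_add(1, std::memory_order_relaxed);
}

// Called from the main loop or a task. Atomically takes every tick
// recorded since the previous call, so none is lost or counted twice
// even if the handler fires in the middle of the read.
uint32_t takePendingTicks() {
    return pendingTicks.exchange(0, std::memory_order_relaxed);
}
```

The main code can then loop `takePendingTicks()` times over whatever processing each tick requires, keeping slow operations such as logging or floating-point math out of interrupt context.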
const uint8_t lookupTable[256] PROGMEM = { /* values */ };
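Efficiently placing data ensures fast access to frequently used values while keeping scarce RAM free for runtime operations. On the ESP32, const data is already stored in flash, so PROGMEM mainly serves source compatibility with other Arduino boards.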
Micro-optimizations are useful, but algorithmic efficiency is paramount.
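// Slow
float average = (float)(sum) / count;
// Optimized
int average = sum / count; // integer division; truncates the fractional part
Integer arithmetic is generally cheaper than floating-point on microcontrollers. The ESP32 has a hardware FPU for single-precision floats, but double-precision math is emulated in software, so this change matters most in performance-sensitive loops where a truncated result is acceptable.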
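If truncation loses too much accuracy, integer math can still round to the nearest value. The following is a small sketch assuming non-negative samples; `roundedAverage` is a hypothetical helper, and returning 0 for an empty sample set is an arbitrary choice made here for illustration.

```cpp
#include <cstdint>

// Average of non-negative samples using integer arithmetic only,
// rounded to the nearest integer instead of truncated.
// Assumes sum + count / 2 does not overflow 32 bits.
uint32_t roundedAverage(uint32_t sum, uint32_t count) {
    if (count == 0) {
        return 0;  // assumption: report 0 when there are no samples
    }
    // Adding half the divisor before dividing turns truncation
    // into round-half-up.
    return (sum + count / 2) / count;
}
```

For example, `roundedAverage(7, 2)` evaluates (7 + 1) / 2 and yields 4, where plain integer division would yield 3.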
// Example: sine wave lookup
const int16_t sineTable[360] = { /* precomputed values */ };
int getSine(int angle) {
    int index = angle % 360;
    if (index < 0) index += 360;  // % can be negative for negative angles
    return sineTable[index];
}
Precomputing values avoids costly runtime calculations.
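To make the lookup idea concrete, here is one possible way to fill and query such a table. It is a sketch under assumptions: the scale factor of 32767, the one-entry-per-degree resolution, and the names `buildSineTable` and `sineScaled` are all choices made for this example, not something the table above prescribes.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>

constexpr int kTableSize = 360;  // one entry per degree
constexpr double kPi = 3.14159265358979323846;

// Builds the table once; each entry is sin(angle) scaled to +/-32767.
std::array<int16_t, kTableSize> buildSineTable() {
    std::array<int16_t, kTableSize> table{};
    for (int deg = 0; deg < kTableSize; ++deg) {
        const double value = 32767.0 * std::sin(deg * kPi / 180.0);
        table[static_cast<std::size_t>(deg)] =
            static_cast<int16_t>(std::lround(value));
    }
    return table;
}

// Returns the scaled sine for any integer angle in degrees,
// including negative angles and angles beyond one full turn.
int16_t sineScaled(int angleDeg) {
    static const std::array<int16_t, kTableSize> table = buildSineTable();
    int index = angleDeg % kTableSize;
    if (index < 0) {
        index += kTableSize;  // wrap negative remainders into range
    }
    return table[static_cast<std::size_t>(index)];
}
```

Here the table is computed on first use and costs 720 bytes of RAM; generating the values offline and storing them as a const array, as in the snippet above, keeps them in flash instead.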
// Before
for (int i = 0; i < n; i++) {
    float val = sin(i * PI / 180.0); // multiply and divide on every iteration
}
// After
float step = PI / 180.0; // calculated only once
for (int i = 0; i < n; i++) {
    float val = sin(i * step);
}
Reduce repetitive computations to save cycles and improve loop efficiency.
ESP32 often runs FreeRTOS, where task scheduling impacts performance:
Proper task management ensures your critical code runs predictably and efficiently.
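void highPriorityTask(void *pvParameters) {
    while (1) {
        // critical loop
        vTaskDelay(1);  // yield for one tick so lower-priority tasks can run
    }
}
void setup() {
    // 2048-byte stack, priority 2, pinned to core 1
    xTaskCreatePinnedToCore(highPriorityTask, "HighTask", 2048, NULL, 2, NULL, 1);
}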
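char buffer[512]; // assumed size; FreeRTOS suggests roughly 40 bytes per task
vTaskGetRunTimeStats(buffer); // Execution time per task; requires configGENERATE_RUN_TIME_STATS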
Profiling lets you identify bottlenecks and adjust tasks to maximize CPU efficiency.
To optimize effectively, measure before you optimize:
- esp_timer_get_time(): microsecond timing.
- ESP-IDF logging and timer facilities (esp_log_level_set, esp_timer)
Profiling provides insight into where optimization efforts will have the greatest impact.
int64_t start = esp_timer_get_time(); // returns microseconds since boot as int64_t
myCriticalFunction();
int64_t end = esp_timer_get_time();
Serial.printf("Execution time: %lld us\n", (long long)(end - start));
This simple measurement allows you to compare optimizations and verify improvements.
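A single measurement can be skewed by interrupts or cache effects, so it often helps to repeat the run and look at the spread. The helper below is a host-side sketch of that approach: `TimingStats`, `nowMicros`, and `measure` are hypothetical names, and `nowMicros` uses `std::chrono` as a stand-in clock that you could replace with a call to `esp_timer_get_time()` on the ESP32.

```cpp
#include <chrono>
#include <cstdint>

struct TimingStats {
    int64_t minUs;
    int64_t maxUs;
    int64_t avgUs;
    int runs;
};

// Stand-in clock for host builds (assumption: on the ESP32 this
// function would simply return esp_timer_get_time()).
inline int64_t nowMicros() {
    using namespace std::chrono;
    return duration_cast<microseconds>(
               steady_clock::now().time_since_epoch()).count();
}

// Runs fn the requested number of times and collects the minimum,
// maximum, and average elapsed time in microseconds.
template <typename Fn>
TimingStats measure(Fn&& fn, int runs) {
    TimingStats stats{INT64_MAX, 0, 0, 0};
    int64_t total = 0;
    for (int i = 0; i < runs; ++i) {
        const int64_t start = nowMicros();
        fn();
        const int64_t elapsed = nowMicros() - start;
        if (elapsed < stats.minUs) stats.minUs = elapsed;
        if (elapsed > stats.maxUs) stats.maxUs = elapsed;
        total += elapsed;
        ++stats.runs;
    }
    if (stats.runs > 0) {
        stats.avgUs = total / stats.runs;
    } else {
        stats.minUs = 0;  // nothing was measured
    }
    return stats;
}
```

A large gap between the minimum and maximum usually points to interference from other tasks or interrupts rather than to the function itself, which is useful to know before changing any code.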
Performance and power often conflict. Consider:
esp_sleep_enable_timer_wakeup(1000000); // 1 second
esp_deep_sleep_start();
Reducing energy usage is critical for battery-powered projects without compromising essential operations.
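To judge whether a sleep strategy pays off, it can help to estimate the average current over one wake/sleep cycle: the time-weighted mean of the active and sleep currents. The function below is a sketch of that arithmetic. `averageCurrentUa` is a hypothetical helper, and every input is supplied by you from your own measurements; none of the values here come from an ESP32 datasheet.

```cpp
#include <cstdint>

// Average current over one wake/sleep cycle, in microamps:
// (activeUa * activeMs + sleepUa * sleepMs) / (activeMs + sleepMs).
// 64-bit intermediates avoid overflow in the multiplications.
uint32_t averageCurrentUa(uint32_t activeUa, uint32_t activeMs,
                          uint32_t sleepUa, uint32_t sleepMs) {
    const uint64_t periodMs = static_cast<uint64_t>(activeMs) + sleepMs;
    if (periodMs == 0) {
        return 0;  // assumption: an empty cycle draws nothing
    }
    const uint64_t chargeUaMs =
        static_cast<uint64_t>(activeUa) * activeMs +
        static_cast<uint64_t>(sleepUa) * sleepMs;
    return static_cast<uint32_t>(chargeUaMs / periodMs);
}
```

With purely illustrative inputs of 100000 uA for 100 ms awake and 10 uA for 900 ms asleep, the result is (10000000 + 9000) / 1000 = 10009 uA, which shows how strongly the active window dominates the budget.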
Iterating through this workflow ensures each optimization delivers measurable benefits.
Integrate this workflow into PlatformIO Tasks:
[env:esp32dev]
extra_scripts = pre:benchmark.py
Automating benchmarking reduces manual effort and keeps optimizations safe.
inline or volatile.
Being aware of these pitfalls prevents subtle bugs and performance issues.
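Optimization is iterative, measurable, and essential in embedded systems. By combining compiler-level settings, careful memory placement, algorithmic improvements, RTOS-level tuning, and profiling, you can achieve significant performance gains on the ESP32 and build reliable, efficient embedded systems.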